
Experimental Data Analysis

AIPOCH

Statistical analysis and reporting for experimental datasets; use when you need to interpret experimental results, test significance (t-tests/ANOVA), or generate reproducible reports.

FILES

experimental-data-analysis/
  skill.md
  scripts/
    analyze_experiment.py
    init_run.py
  references/
    reporting-template.md
    stats-method-selection.md
91 / 100 Total Score
Core Capability: 88 / 100
Functional Suitability: 11 / 12
Reliability: 10 / 12
Performance & Context: 8 / 8
Agent Usability: 14 / 16
Human Usability: 8 / 8
Security: 10 / 12
Maintainability: 10 / 12
Agent-Specific: 17 / 20
Medical Task: 20 / 20 (Passed)

Evaluation scenarios (scenario score; rubric points):

  • 98: You have experimental results in CSV form and need a reproducible end-to-end analysis workflow (clean → test → report) (4/4)
  • 94: You need to compare two conditions (independent or paired) and determine statistical significance with effect sizes (4/4)
  • 92: Reproducible, run-based execution that writes all artifacts into outputs/runs/<timestamp>/ (4/4)
  • 92: Data preparation guidance: missing values, outliers, and variable type identification (continuous/categorical; grouping factors) (4/4)
  • 92: End-to-end case for reproducible, run-based execution that writes all artifacts into outputs/runs/<timestamp>/ (4/4)

SKILL.md

When to Use

  • You have experimental results in CSV form and need a reproducible end-to-end analysis workflow (clean → test → report).
  • You need to compare two conditions (independent or paired) and determine statistical significance with effect sizes.
  • You need to compare 3+ groups (one-way) or multiple factors (multi-way) using ANOVA and post-hoc multiple comparisons.
  • You must validate assumptions (normality, homogeneity of variance) and document them in a report.
  • You need standardized run outputs (timestamped run directories) for traceability and auditing.

Key Features

  • Reproducible, run-based execution that writes all artifacts into outputs/runs/<timestamp>/.
  • Data preparation guidance: missing values, outliers, and variable type identification (continuous/categorical; grouping factors).
  • Descriptive statistics: means, standard deviations, confidence intervals, and grouped summary tables.
  • Inferential testing:
    • t-tests (independent/paired) and non-parametric alternatives when assumptions fail.
    • ANOVA (one-way and multi-way) with post-hoc testing (e.g., Tukey).
  • Reporting outputs: test statistics, p-values, effect sizes, tables, charts, and explicit assumption notes.
  • Reference materials for method selection and reporting templates:
    • references/stats-method-selection.md
    • references/reporting-template.md

Dependencies

  • Python 3.10+
  • pandas >= 2.0
  • numpy >= 1.24
  • scipy >= 1.10
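These minimums can be verified programmatically before starting a run. The `check_environment` helper below is illustrative only and is not part of the skill's scripts:

```python
import sys
from importlib.metadata import version, PackageNotFoundError

# Hypothetical pre-flight check: verify the runtime meets the minimum
# versions listed under Dependencies before initializing a run.
MINIMUMS = {"pandas": (2, 0), "numpy": (1, 24), "scipy": (1, 10)}

def parse(v: str) -> tuple:
    """Parse '2.1.3' -> (2, 1, 3); pre-release suffixes are ignored."""
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)

def check_environment() -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append(f"Python {sys.version_info.major}.{sys.version_info.minor} < 3.10")
    for pkg, minimum in MINIMUMS.items():
        try:
            if parse(version(pkg)) < minimum:
                problems.append(f"{pkg} {version(pkg)} is older than {minimum}")
        except PackageNotFoundError:
            problems.append(f"{pkg} is not installed")
    return problems

print(check_environment() or "environment OK")
```

Tuple comparison makes the version check simple: `(2, 1, 3) >= (2, 0)` holds element-wise, so minor and patch components need no special handling.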

Example Usage

The workflow is run-directory based. Initialize a new run, then analyze using the latest run by default.

# 1) Initialize a new run directory with sample inputs/config
python scripts/init_run.py

# 2) Run analysis (uses the latest outputs/runs/<timestamp>/ by default)
python scripts/analyze_experiment.py

Expected directory conventions:

  • A new run directory is created at: outputs/runs/<timestamp>/
  • Configuration file location: outputs/runs/<timestamp>/config.json
  • All intermediate and final artifacts (config, inputs, outputs, figures, tables) must be written inside the run directory.
  • Writing outside the run directory is prohibited.
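The contents of scripts/init_run.py are not reproduced here; a minimal sketch of the convention it implements (a timestamped run directory plus a config.json), with purely illustrative config keys, might look like:

```python
import json
from datetime import datetime
from pathlib import Path

# Hypothetical sketch of the run-directory convention; the real
# scripts/init_run.py may differ in naming and config contents.
def init_run(base: Path = Path("outputs/runs")) -> Path:
    run_dir = base / datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=False)  # each run gets a fresh directory
    config = {"input_csv": "inputs/data.csv", "alpha": 0.05, "seed": 42}
    (run_dir / "config.json").write_text(json.dumps(config, indent=2))
    return run_dir

run = init_run()
print(run / "config.json")
```

Because every artifact lands inside the timestamped directory, a run can be archived or audited as a single self-contained folder.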

Implementation Details

Reproducible Run Management

  • Before each analysis, run scripts/init_run.py to create outputs/runs/<timestamp>/ and populate the initial inputs and config.
  • Analysis scripts default to the latest run directory under outputs/runs/ unless explicitly overridden (where the script supports an override).
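Assuming run directories are named so that they sort lexicographically by timestamp (e.g. 20240101-120000), a hypothetical helper matching the "default to the latest run" behavior could be:

```python
from pathlib import Path

# Hypothetical helper, not taken from the skill's scripts: pick the most
# recent run directory, relying on lexicographic order of timestamped names.
def latest_run(base: Path = Path("outputs/runs")) -> Path:
    runs = sorted(p for p in base.iterdir() if p.is_dir())
    if not runs:
        raise FileNotFoundError(f"no run directories under {base}; run init_run.py first")
    return runs[-1]
```

Sorting by name rather than by modification time keeps the choice deterministic even if a directory is touched after its run completes.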

Analysis Pipeline

  1. Data Preparation

    • Handle missing values (e.g., drop, impute, or flag) according to the experimental design.
    • Detect and treat outliers (e.g., robust rules, domain thresholds), documenting any exclusions.
    • Identify variable roles:
      • Outcome variable(s): typically continuous measurements.
      • Grouping factors: categorical condition labels (treatment/control, timepoint, genotype, etc.).
  2. Descriptive Statistics

    • Compute summary metrics per group:
      • Mean, standard deviation, and confidence intervals (commonly 95% CI).
    • Produce grouped summary tables suitable for reporting.
  3. Inferential Statistics

    • Two-group comparisons
      • Use an independent t-test for separate groups.
      • Use a paired t-test for repeated measures / matched pairs.
      • If assumptions are violated, switch to an appropriate non-parametric alternative (e.g., Mann-Whitney U for independent groups, Wilcoxon signed-rank for paired data).
    • Multi-group / multi-factor comparisons
      • Use one-way ANOVA for a single factor with 3+ levels.
      • Use multi-way ANOVA when multiple factors are present.
    • Multiple comparisons
      • Apply post-hoc procedures (e.g., Tukey) after ANOVA when needed.
      • Define and document the multiple-comparison control strategy.
  4. Assumption Checks and Reporting Standards

    • Validate and report:
      • Normality (per group or model residuals, as appropriate).
      • Homogeneity of variance.
    • Report, at minimum:
      • Test statistic, degrees of freedom (if applicable), p-value.
      • Effect size(s) and confidence intervals where applicable.
    • Retain analysis code and random seeds to ensure reproducibility.
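
The four pipeline steps above can be sketched end to end with pandas and SciPy; the column names, thresholds, and synthetic data below are illustrative stand-ins for a real experiment CSV:

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed for reproducibility

# Synthetic stand-in for an experimental CSV: one continuous outcome
# ("response") and one grouping factor ("group").
df = pd.DataFrame({
    "group": ["control"] * 30 + ["treatment"] * 30,
    "response": np.concatenate([rng.normal(10.0, 2.0, 30),
                                rng.normal(11.5, 2.0, 30)]),
})

# 1) Data preparation: drop missing outcomes (document any exclusions).
df = df.dropna(subset=["response"])

# 2) Descriptive statistics per group: mean, SD, and approximate 95% CI.
desc = df.groupby("group")["response"].agg(["count", "mean", "std"])
desc["ci95"] = 1.96 * desc["std"] / np.sqrt(desc["count"])
print(desc)

# 3) Assumption checks: normality per group, homogeneity of variance.
a = df.loc[df.group == "control", "response"]
b = df.loc[df.group == "treatment", "response"]
normal = all(stats.shapiro(x).pvalue > 0.05 for x in (a, b))
equal_var = stats.levene(a, b).pvalue > 0.05

# 4) Inferential test: independent t-test (Welch variant if variances
#    differ), falling back to Mann-Whitney U when normality fails.
if normal:
    res = stats.ttest_ind(a, b, equal_var=equal_var)
    test_name = "t-test" if equal_var else "Welch t-test"
else:
    res = stats.mannwhitneyu(a, b)
    test_name = "Mann-Whitney U"

# Effect size: Cohen's d from the pooled SD (meaningful for the t-test case).
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd
print(f"{test_name}: statistic={res.statistic:.3f}, p={res.pvalue:.4f}, d={cohens_d:.2f}")
```

In the skill's workflow, the same logic would read its input CSV and alpha threshold from the run's config.json and write the summary table and test results back into the run directory rather than printing them.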