Experimental Data Analysis
AIPOCH
Statistical analysis and reporting for experimental datasets; use when you need to interpret experimental results, test significance (t-tests/ANOVA), or generate reproducible reports.
Total Score: 91 / 100

Core Capability: 88 / 100
- Functional Suitability: 11 / 12
- Reliability: 10 / 12
- Performance & Context: 8 / 8
- Agent Usability: 14 / 16
- Human Usability: 8 / 8
- Security: 10 / 12
- Maintainability: 10 / 12
- Agent-Specific: 17 / 20

Medical Task: 20 / 20 Passed
- 98 (4/4): You have experimental results in CSV form and need a reproducible end-to-end analysis workflow (clean → test → report)
- 94 (4/4): You need to compare two conditions (independent or paired) and determine statistical significance with effect sizes
- 92 (4/4): Reproducible, run-based execution that writes all artifacts into outputs/runs/<timestamp>/
- 92 (4/4): Data preparation guidance: missing values, outliers, and variable type identification (continuous/categorical; grouping factors)
- 92 (4/4): End-to-end case for reproducible, run-based execution that writes all artifacts into outputs/runs/<timestamp>/
SKILL.md
When to Use
- You have experimental results in CSV form and need a reproducible end-to-end analysis workflow (clean → test → report).
- You need to compare two conditions (independent or paired) and determine statistical significance with effect sizes.
- You need to compare 3+ groups (one-way) or multiple factors (multi-way) using ANOVA and post-hoc multiple comparisons.
- You must validate assumptions (normality, homogeneity of variance) and document them in a report.
- You need standardized run outputs (timestamped run directories) for traceability and auditing.
Key Features
- Reproducible, run-based execution that writes all artifacts into outputs/runs/<timestamp>/.
- Data preparation guidance: missing values, outliers, and variable type identification (continuous/categorical; grouping factors).
- Descriptive statistics: means, standard deviations, confidence intervals, and grouped summary tables.
- Inferential testing:
  - t-tests (independent/paired) and non-parametric alternatives when assumptions fail.
  - ANOVA (one-way and multi-way) with post-hoc testing (e.g., Tukey).
- Reporting outputs: test statistics, p-values, effect sizes, tables, charts, and explicit assumption notes.
- Reference materials for method selection and reporting templates:
  - references/stats-method-selection.md
  - references/reporting-template.md
Dependencies
- Python 3.10+
- pandas >= 2.0
- numpy >= 1.24
- scipy >= 1.10
Example Usage
The workflow is organized around run directories. Initialize a new run first, then run the analysis, which uses the latest run by default.
# 1) Initialize a new run directory with sample inputs/config
python scripts/init_run.py
# 2) Run analysis (uses the latest outputs/runs/<timestamp>/ by default)
python scripts/analyze_experiment.py
Expected directory conventions:
- A new run directory is created at: outputs/runs/<timestamp>/
- Configuration file location: outputs/runs/<timestamp>/config.json
- All intermediate and final artifacts (config, inputs, outputs, figures, tables) must be written inside the run directory.
- Writing outside the run directory is prohibited.
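One way to enforce the "no writes outside the run directory" rule is a small path guard. The sketch below is illustrative only: `resolve_in_run` is a hypothetical helper, not something the skill's scripts are known to provide, and it assumes artifact paths are given relative to the run directory.

```python
from pathlib import Path


def resolve_in_run(run_dir: Path, relative: str) -> Path:
    """Return the absolute path of `relative` inside `run_dir`.

    Hypothetical helper: raises ValueError if the resolved target
    would fall outside the run directory (e.g. via "..").
    """
    root = run_dir.resolve()
    target = (root / relative).resolve()
    if not target.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"{relative!r} resolves outside the run directory")
    return target
```

A caller would pass every output path through the guard before opening it, for example `resolve_in_run(run_dir, "tables/summary.csv")`.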
Implementation Details
Reproducible Run Management
- Before each execution, run scripts/init_run.py to create outputs/runs/<timestamp>/ and populate initial inputs/config.
- Analysis scripts default to the latest run directory under outputs/runs/ unless explicitly overridden (if supported by the script).
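The run-management behavior described above can be sketched roughly as follows. This is not the code of scripts/init_run.py; the function names and the timestamp format are assumptions chosen for illustration. A fixed-width UTC timestamp is used so that sorting directory names alphabetically also sorts them chronologically.

```python
from datetime import datetime, timezone
from pathlib import Path

RUNS_ROOT = Path("outputs/runs")  # location taken from the conventions above


def create_run_dir(root: Path = RUNS_ROOT) -> Path:
    """Create a new timestamped run directory (timestamp format is an assumption)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    run_dir = root / stamp
    run_dir.mkdir(parents=True, exist_ok=False)  # never reuse an existing run
    return run_dir


def latest_run_dir(root: Path = RUNS_ROOT) -> Path:
    """Return the most recent run directory, relying on sortable names."""
    runs = sorted(p for p in root.iterdir() if p.is_dir()) if root.is_dir() else []
    if not runs:
        raise FileNotFoundError(f"no run directories found under {root}")
    return runs[-1]
```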
Analysis Pipeline
- Data Preparation
  - Handle missing values (e.g., drop, impute, or flag) according to the experimental design.
  - Detect and treat outliers (e.g., robust rules, domain thresholds), documenting any exclusions.
  - Identify variable roles:
    - Outcome variable(s): typically continuous measurements.
    - Grouping factors: categorical condition labels (treatment/control, timepoint, genotype, etc.).
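As a rough illustration of the data preparation step, the sketch below drops rows with a missing outcome or group label and flags (rather than removes) outliers using the 1.5 × IQR rule within each group. The function name, the `is_outlier` column, and the choice of the IQR rule are assumptions made for this example; the appropriate handling depends on the experimental design.

```python
import pandas as pd


def prepare(data: pd.DataFrame, outcome: str, group: str) -> pd.DataFrame:
    """Drop incomplete rows, mark the grouping factor as categorical,
    and flag per-group IQR outliers in a new boolean column."""
    out = data.dropna(subset=[outcome, group]).copy()
    out[group] = out[group].astype("category")

    grouped = out.groupby(group, observed=True)[outcome]
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    # Flag only; any exclusion should be a documented, explicit decision.
    out["is_outlier"] = (out[outcome] < q1 - 1.5 * iqr) | (out[outcome] > q3 + 1.5 * iqr)
    return out
```

Flagging instead of deleting keeps the exclusion decision visible in the run outputs, which fits the requirement to document any exclusions.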
- Descriptive Statistics
  - Compute summary metrics per group:
    - Mean, standard deviation, and confidence intervals (commonly 95% CI).
  - Produce grouped summary tables suitable for reporting.
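A grouped summary table of this kind might be computed as in the sketch below. It assumes a t-based confidence interval for the mean, which is one common choice; the function and column names are illustrative and not taken from the skill's scripts.

```python
import numpy as np
import pandas as pd
from scipy import stats


def summarize(data: pd.DataFrame, outcome: str, group: str,
              confidence: float = 0.95) -> pd.DataFrame:
    """Per-group n, mean, sample SD, and t-based CI of the mean."""
    rows = []
    for level, values in data.groupby(group, observed=True)[outcome]:
        n = int(values.count())
        mean = values.mean()
        sd = values.std(ddof=1)  # sample standard deviation
        sem = sd / np.sqrt(n)
        # Two-sided critical value; groups with n < 2 yield NaN bounds.
        half_width = stats.t.ppf((1 + confidence) / 2, df=n - 1) * sem
        rows.append({group: level, "n": n, "mean": mean, "sd": sd,
                     "ci_low": mean - half_width, "ci_high": mean + half_width})
    return pd.DataFrame(rows)
```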
- Inferential Statistics
  - Two-group comparisons
    - Use an independent t-test for separate groups.
    - Use a paired t-test for repeated measures / matched pairs.
    - If assumptions are violated, switch to an appropriate non-parametric alternative.
  - Multi-group / multi-factor comparisons
    - Use one-way ANOVA for a single factor with 3+ levels.
    - Use multi-way ANOVA when multiple factors are present.
  - Multiple comparisons
    - Apply post-hoc procedures (e.g., Tukey) after ANOVA when needed.
    - Define and document the multiple-comparison control strategy.
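The test selection described above could look roughly like this with scipy. The helper names are hypothetical, and the non-parametric alternatives shown (Wilcoxon signed-rank for paired data, Mann-Whitney U for independent groups) are common choices assumed here rather than ones the skill is known to mandate. Cohen's d is computed with a pooled standard deviation, which applies to independent groups only. Multi-way ANOVA is not covered by this sketch.

```python
import numpy as np
from scipy import stats


def cohens_d(a, b) -> float:
    """Cohen's d for two independent samples, using the pooled SD."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) \
        / (len(a) + len(b) - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))


def compare_two(a, b, paired: bool = False, parametric: bool = True) -> dict:
    """Pick a two-group test from the design and the assumption checks."""
    if paired and parametric:
        name, res = "paired t-test", stats.ttest_rel(a, b)
    elif paired:
        name, res = "Wilcoxon signed-rank", stats.wilcoxon(a, b)
    elif parametric:
        name, res = "independent t-test", stats.ttest_ind(a, b)
    else:
        name, res = "Mann-Whitney U", stats.mannwhitneyu(a, b, alternative="two-sided")
    return {"test": name, "statistic": float(res.statistic), "p_value": float(res.pvalue)}


def compare_many(groups: dict) -> dict:
    """One-way ANOVA followed by Tukey HSD pairwise comparisons."""
    names, samples = list(groups), list(groups.values())
    anova = stats.f_oneway(*samples)
    tukey = stats.tukey_hsd(*samples)  # available in scipy >= 1.8
    pairwise = {(names[i], names[j]): float(tukey.pvalue[i, j])
                for i in range(len(names)) for j in range(i + 1, len(names))}
    return {"f_statistic": float(anova.statistic), "p_value": float(anova.pvalue),
            "pairwise_p": pairwise}
```

Whether `parametric` should be true is meant to come from the assumption checks, not from the data-dependent p-value of the test itself.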
- Assumption Checks and Reporting Standards
  - Validate and report:
    - Normality (per group or model residuals, as appropriate).
    - Homogeneity of variance.
  - Report, at minimum:
    - Test statistic, degrees of freedom (if applicable), p-value.
    - Effect size(s) and confidence intervals where applicable.
  - Retain analysis code and random seeds to ensure reproducibility.
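The assumption checks listed above can be sketched with scipy as follows. Shapiro-Wilk (per group) and Levene's test centered on the median are assumed here as reasonable defaults; the skill does not name specific tests, and `check_assumptions` is a hypothetical helper. Each group needs at least three observations for Shapiro-Wilk.

```python
from scipy import stats


def check_assumptions(groups: dict, alpha: float = 0.05) -> dict:
    """Per-group normality (Shapiro-Wilk) and homogeneity of variance (Levene)."""
    normality = {}
    for name, values in groups.items():
        w, p = stats.shapiro(values)
        normality[name] = {"W": float(w), "p_value": float(p), "ok": bool(p > alpha)}

    levene = stats.levene(*groups.values(), center="median")
    homogeneity = {"statistic": float(levene.statistic),
                   "p_value": float(levene.pvalue),
                   "ok": bool(levene.pvalue > alpha)}
    return {"normality": normality, "homogeneity": homogeneity}
```

The returned dictionary can be written into the run directory alongside the test results so that the report states which assumptions held and which did not.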