Data Analysis

differential-expression-analysis

Perform differential expression analysis between two or more groups using limma, DESeq2, or edgeR. Inputs: count or normalized expression matrix, group labels. Outputs: DEG result table, volcano plot, heatmap, session metadata.

Total Score: 90 / 100

Core Capability: 91 / 100
  • Functional Suitability: 11 / 12
  • Reliability: 11 / 12
  • Performance & Context: 7 / 8
  • Agent Usability: 15 / 16
  • Human Usability: 7 / 8
  • Security: 11 / 12
  • Maintainability: 11 / 12
  • Agent-Specific: 18 / 20

Medical Task: 25 / 25 Passed
  • limma differential expression smoke test: 93 (5/5)
  • CLI help and option contract: 91 (5/5)
  • sample matching and group validation review: 88 (5/5)
  • visualization artifact workflow: 90 (5/5)
  • multi-method workflow coverage review: 88 (5/5)

Veto Gates
Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
  • Operational Stability: PASS. System remains stable across varied inputs and edge cases.
  • Structural Consistency: PASS. Output structure conforms to expected skill contract format.
  • Result Determinism: PASS. Equivalent inputs produce semantically equivalent outputs.
  • System Security: PASS. No prompt injection, data leakage, or unsafe tool use detected.

Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | No fabricated DOI, PMID, trial result, sample size, p-value, or unsupported scientific claim was generated during audit.
Practice Boundaries | PASS | The skill performs computational data analysis and does not make diagnostic or treatment recommendations.
Methodological Ground | PASS | The workflow uses standard data-analysis methods and documents assumptions, thresholds, and output interpretation boundaries.
Code Usability | PASS | Native CLI execution was verified using /opt/homebrew/bin/Rscript in this environment.

Core Capability: 91 / 100 (8 Categories)

  • Functional Suitability: 11 / 12 (92%). The skill provides a complete Data Analysis workflow with documented inputs, outputs, examples, and deterministic artifacts.
  • Reliability: 11 / 12 (92%). Native execution was verified with /opt/homebrew/bin/Rscript; remaining risks are limited to environment dependency drift.
  • Performance & Context: 7 / 8 (88%). Runtime is appropriate for bundled smoke-test data and the workflow writes concise tabular and plot artifacts.
  • Agent Usability: 15 / 16 (94%). CLI usage, parameters, output paths, and troubleshooting guidance are sufficiently clear for agent invocation.
  • Human Usability: 7 / 8 (88%). Examples and reference documentation make the workflow discoverable and reproducible for human reviewers.
  • Security: 11 / 12 (92%). No credential handling or unsafe dynamic code execution was identified; file-path based inputs are used.
  • Maintainability: 11 / 12 (92%). Implementation is modular across scripts, references, and tests, making future updates straightforward.
  • Agent-Specific: 18 / 20 (90%). The skill has clear trigger boundaries, structured CLI execution, and reproducible output conventions.

Core Capability Total: 91 / 100
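
The category scores can be cross-checked with a few lines of arithmetic. The Python sketch below is an illustration only and is not part of the audited skill. It assumes the displayed percentages are rounded half-up to the nearest integer, which is inferred from the reported values and is not a documented rule; the helper name `percent` is hypothetical.

```python
# Category scores as (earned, maximum), copied from the report.
CATEGORIES = {
    "Functional Suitability": (11, 12),
    "Reliability": (11, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (15, 16),
    "Human Usability": (7, 8),
    "Security": (11, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (18, 20),
}


def percent(earned, maximum):
    """Integer percentage, rounded half-up (assumed rounding rule)."""
    # floor(100 * earned / maximum + 0.5), computed with integers only.
    return (200 * earned + maximum) // (2 * maximum)


total_earned = sum(earned for earned, _ in CATEGORIES.values())
total_maximum = sum(maximum for _, maximum in CATEGORIES.values())
percentages = {name: percent(e, m) for name, (e, m) in CATEGORIES.items()}

for name, (earned, maximum) in CATEGORIES.items():
    print(f"{name}: {earned} / {maximum} ({percentages[name]}%)")
print(f"Core Capability Total: {total_earned} / {total_maximum}")
```

Under that rounding assumption, the eight category maxima sum to 100 and the earned points sum to the reported total of 91.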

Medical Task
Execution Average: 90 / 100 | Assertions: 25/25 Passed

Canonical: ✅ Pass (93 / 100)
limma differential expression smoke test

/opt/homebrew/bin/Rscript completed the bundled limma example, detecting 2104 significant genes and writing result artifacts.

Basic 37/40 | Specialized 56/60 | Total 93/100
  • A1: Required output artifacts were generated or documented for this test input.
  • A2: Input validation and documented parameter handling were consistent with the skill scope.
  • A3: No fabricated biomedical claims or unsupported clinical conclusions were generated.
  • A4: Execution stayed within the Data Analysis workflow boundaries.
  • A5: Results were reproducible enough for audit review.
Pass rate: 5 / 5

Variant A: ✅ Pass (91 / 100)
CLI help and option contract

The documented CLI help rendered successfully with required input, group, output, method, threshold, and seed options.

Basic 36/40 | Specialized 55/60 | Total 91/100
  • A1: Required output artifacts were generated or documented for this test input.
  • A2: Input validation and documented parameter handling were consistent with the skill scope.
  • A3: No fabricated biomedical claims or unsupported clinical conclusions were generated.
  • A4: Execution stayed within the Data Analysis workflow boundaries.
  • A5: Results were reproducible enough for audit review.
Pass rate: 5 / 5

Edge: ✅ Pass (88 / 100)
sample matching and group validation review

Script structure and references document sample/group validation and deterministic failure behavior for invalid data.

Basic 35/40 | Specialized 53/60 | Total 88/100
  • A1: Required output artifacts were generated or documented for this test input.
  • A2: Input validation and documented parameter handling were consistent with the skill scope.
  • A3: No fabricated biomedical claims or unsupported clinical conclusions were generated.
  • A4: Execution stayed within the Data Analysis workflow boundaries.
  • A5: Results were reproducible enough for audit review.
Pass rate: 5 / 5
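
As an illustration of what sample/group validation with deterministic failure behavior can look like, here is a minimal Python sketch. It is not the skill's actual implementation (the audited skill runs through Rscript and its code is not shown in this report). The function name `match_samples`, the `min_per_group` parameter and its default of 2, and the specific checks are all assumptions made for the example.

```python
def match_samples(matrix_samples, group_labels, min_per_group=2):
    """Pair each matrix column with its group label, or raise ValueError.

    matrix_samples: sample IDs in expression-matrix column order.
    group_labels:   dict mapping sample ID -> group label.
    min_per_group:  hypothetical minimum replicate count per group.
    """
    if len(set(matrix_samples)) != len(matrix_samples):
        raise ValueError("duplicate sample IDs in expression matrix")

    # Sorting the offending IDs keeps error messages identical across runs.
    missing = sorted(s for s in matrix_samples if s not in group_labels)
    if missing:
        raise ValueError(f"samples without a group label: {missing}")

    extra = sorted(set(group_labels) - set(matrix_samples))
    if extra:
        raise ValueError(f"group labels without matrix data: {extra}")

    counts = {}
    for sample in matrix_samples:
        group = group_labels[sample]
        counts[group] = counts.get(group, 0) + 1

    if len(counts) < 2:
        raise ValueError("at least two groups are required")

    small = sorted(g for g, n in counts.items() if n < min_per_group)
    if small:
        raise ValueError(f"groups with too few samples: {small}")

    return [(sample, group_labels[sample]) for sample in matrix_samples]


# Usage with made-up sample IDs.
labels = {"s1": "control", "s2": "control", "s3": "treated", "s4": "treated"}
pairs = match_samples(["s1", "s2", "s3", "s4"], labels)
print(pairs)
```

Failing fast with a sorted list of offending IDs is one way to make invalid inputs produce the same error on every run, which is the property the Edge test reviews.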

Variant B: ✅ Pass (90 / 100)
visualization artifact workflow

The audited run executed the downstream visualization steps, including volcano plot and heatmap generation.

Basic 36/40 | Specialized 54/60 | Total 90/100
  • A1: Required output artifacts were generated or documented for this test input.
  • A2: Input validation and documented parameter handling were consistent with the skill scope.
  • A3: No fabricated biomedical claims or unsupported clinical conclusions were generated.
  • A4: Execution stayed within the Data Analysis workflow boundaries.
  • A5: Results were reproducible enough for audit review.
Pass rate: 5 / 5

Stress: ✅ Pass (88 / 100)
multi-method workflow coverage review

The skill documents limma, DESeq2, edgeR, t-test, and Wilcoxon routes with shared output conventions.

Basic 35/40 | Specialized 53/60 | Total 88/100
  • A1: Required output artifacts were generated or documented for this test input.
  • A2: Input validation and documented parameter handling were consistent with the skill scope.
  • A3: No fabricated biomedical claims or unsupported clinical conclusions were generated.
  • A4: Execution stayed within the Data Analysis workflow boundaries.
  • A5: Results were reproducible enough for audit review.
Pass rate: 5 / 5
Medical Task Total: 90 / 100
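
The per-test totals and the execution average follow from the Basic and Specialized sub-scores listed for each test. The short Python sketch below reproduces that arithmetic. It is an illustration only, and it assumes the execution average is the unweighted mean of the five test totals, which matches the reported figure but is not stated explicitly in the report.

```python
# (Basic out of 40, Specialized out of 60) for each audited test.
TESTS = {
    "Canonical": (37, 56),
    "Variant A": (36, 55),
    "Edge": (35, 53),
    "Variant B": (36, 54),
    "Stress": (35, 53),
}
ASSERTIONS_PER_TEST = 5  # A1 through A5, all passed in every test

# Each test total is Basic + Specialized, out of 100.
totals = {name: basic + spec for name, (basic, spec) in TESTS.items()}

# Assumed aggregation: unweighted mean across the five tests.
execution_average = sum(totals.values()) / len(totals)
assertions_passed = ASSERTIONS_PER_TEST * len(TESTS)

for name, total in totals.items():
    print(f"{name}: {total} / 100")
print(f"Execution Average: {execution_average:.0f} / 100")
print(f"Assertions: {assertions_passed}/{assertions_passed} Passed")
```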

Key Strengths

  • Native R execution succeeded with the Homebrew Rscript path and bundled test data.
  • The workflow covers multiple common differential-expression methods with clear CLI parameters.
  • Output artifacts include result tables, filtered significant genes, visualizations, and session metadata.
  • Scope boundaries are appropriate for bulk expression differential analysis.