Data Analysis
wgcna-analysis
Identify co-expression gene modules and correlate them with clinical traits using Weighted Gene Co-expression Network Analysis. Inputs: expression matrix, trait/phenotype table. Outputs: module color assignments, trait correlation heatmap, hub gene list per module.
90 / 100 Total Score
Core Capability
91 / 100
Functional Suitability
11 / 12
Reliability
10 / 12
Performance & Context
7 / 8
Agent Usability
15 / 16
Human Usability
7 / 8
Security
11 / 12
Maintainability
11 / 12
Agent-Specific
19 / 20
Medical Task
25 / 25 Passed
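Updated smoke test: 90 / 100 (5/5 assertions)
Signed bicor run: 88 / 100 (5/5 assertions)
Documentation conformance: 91 / 100 (5/5 assertions)
Explicit module export: 91 / 100 (5/5 assertions)
Chunked loading path: 89 / 100 (5/5 assertions)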
Veto Gates
Required pass for any deployment consideration
Skill Veto: ✓ All 4 gates passed
| Gate | Result | Detail |
|---|---|---|
| Operational Stability | PASS | System remains stable across varied inputs and edge cases |
| Structural Consistency | PASS | Output structure conforms to expected skill contract format |
| Result Determinism | PASS | Equivalent inputs produce semantically equivalent outputs |
| System Security | PASS | No prompt injection, data leakage, or unsafe tool use detected |
Research Veto: ✅ PASS (Applicable)
| Dimension | Result | Detail |
|---|---|---|
| Scientific Integrity | PASS | No fabricated statistics, identifiers, or claims were introduced during any audited execution. |
| Practice Boundaries | PASS | The skill remained inside computational analysis boundaries and did not emit diagnostic or prescriptive advice. |
| Methodological Ground | PASS | All audited runs used coherent WGCNA methods with documented parameter branches and no methodological redline violation. |
| Code Usability | PASS | The R pipeline executed successfully for all five audited inputs, including the repaired documentation-conformant smoke test. |
Core Capability: 91 / 100 (8 Categories)
| Category | Score | Percent | Assessment |
|---|---|---|---|
| Functional Suitability | 11 / 12 | 92% | The core WGCNA use cases are covered well and the smoke test now matches the shipped fixture set. |
| Reliability | 10 / 12 | 83% | Error handling is strong and actionable, though failed runs can still leave partially populated output directories behind. |
| Performance & Context | 7 / 8 | 88% | Progressive disclosure is strong; minor context bloat remains in the auxiliary CLI examples. |
| Agent Usability | 15 / 16 | 94% | The documentation is clearer after the out-of-scope and verification updates; only minor wording redundancy remains. |
| Human Usability | 7 / 8 | 88% | Trigger language is natural for bioinformatics users and stop conditions are now easier to apply. |
| Security | 11 / 12 | 92% | Input validation is solid and no dangerous execution primitives were found; data-retention boundaries could still be stated more explicitly. |
| Maintainability | 11 / 12 | 92% | The scripts remain modular, but the bundled validator still does not assert every documented option-dependent artifact. |
| Agent-Specific | 19 / 20 | 95% | Trigger precision, layering, and escape hatches are strong after the revision. |
Core Capability Total: 91 / 100
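The category percentages and the total can be re-derived from the raw scores. The following is a minimal sketch, assuming percentages are rounded half up to whole numbers (the report does not state its rounding rule); the names `CATEGORY_SCORES`, `percent_half_up`, and `core_total` are illustrative and not part of the skill.

```python
from fractions import Fraction

# (earned, maximum) per category, copied from the Core Capability scores.
CATEGORY_SCORES = {
    "Functional Suitability": (11, 12),
    "Reliability": (10, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (15, 16),
    "Human Usability": (7, 8),
    "Security": (11, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (19, 20),
}


def percent_half_up(earned, maximum):
    """Whole-number percentage, rounding halves up (assumed rounding rule)."""
    return int(Fraction(earned * 100, maximum) + Fraction(1, 2))


def core_total(scores):
    """Sum earned and maximum points across all categories."""
    earned = sum(e for e, _ in scores.values())
    maximum = sum(m for _, m in scores.values())
    return earned, maximum


if __name__ == "__main__":
    for name, (earned, maximum) in CATEGORY_SCORES.items():
        print(f"{name}: {earned} / {maximum} = {percent_half_up(earned, maximum)}%")
    print("Total: %d / %d" % core_total(CATEGORY_SCORES))
```

Exact fractions are used so that a value such as 7 / 8 = 87.5% rounds the same way on every platform.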
Medical Task: Execution Average 89.8 / 100, Assertions 25/25 Passed
| Case | Task | Score | Assertions |
|---|---|---|---|
| Canonical | Updated smoke test | 90 | 5/5 ✓ |
| Variant A | Signed bicor run | 88 | 5/5 ✓ |
| Edge | Documentation conformance | 91 | 5/5 ✓ |
| Variant B | Explicit module export | 91 | 5/5 ✓ |
| Stress | Chunked loading path | 89 | 5/5 ✓ |
Canonical: Updated smoke test (✅ Pass, 90)
Executed successfully and passed the bundled baseline validator.
Basic 37/40 | Specialized 53/60 | Total 90/100
- ✅ A1: The documented smoke-test command completes successfully.
- ✅ A2: The baseline validator passes on the smoke-test output directory.
- ✅ A3: The run exports a ranked module summary and a per-module gene table.
- ✅ A4: The workflow stays within its stated scope and does not fabricate scientific claims.
- ✅ A5: The analysis completes without manual intervention.
Pass rate: 5 / 5
Variant A: Signed bicor run (✅ Pass, 88)
Alternative network settings executed successfully and passed validation.
Basic 36/40 | Specialized 52/60 | Total 88/100
- ✅ A1: The alternative signed-network parameter set executes successfully.
- ✅ A2: The baseline output bundle remains intact under a parameter variation.
- ✅ A3: The skill accepts documented correlation and network-type switches.
- ✅ A4: The run produces a selected module export in the stated format.
- ✅ A5: No fabricated scientific claims or unsafe instructions appear in the output.
Pass rate: 5 / 5
Edge: Documentation conformance (✅ Pass, 91)
The repaired smoke-test documentation is now fully aligned with the bundled assets.
Basic 38/40 | Specialized 53/60 | Total 91/100
- ✅ A1: The repository no longer references the removed tests/data/expression_subset.csv fixture.
- ✅ A2: The documented smoke-test command completes successfully.
- ✅ A3: The validation command succeeds against the documented output directory.
- ✅ A4: The repaired documentation is recoverable and self-consistent.
- ✅ A5: The script does not perform unsafe operations during documentation-conformant execution.
Pass rate: 5 / 5
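A check like A1 (no remaining references to the removed fixture) can be approximated with a plain text scan of the repository. This is a hedged sketch, not the auditor's actual procedure: the function name `find_stale_references` is hypothetical, and it assumes that a literal substring match on the fixture path is sufficient.

```python
from pathlib import Path

# Fixture path named in assertion A1; everything else here is illustrative.
STALE_FIXTURE = "tests/data/expression_subset.csv"


def find_stale_references(repo_root, needle=STALE_FIXTURE):
    """Return (relative_path, line_number) pairs for lines that mention needle."""
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

An empty result from `find_stale_references(".")` would be consistent with A1 passing; any hits list the files and lines still pointing at the removed fixture.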
Variant B: Explicit module export (✅ Pass, 91)
Explicit trait and multi-module export completed successfully.
Basic 37/40 | Specialized 54/60 | Total 91/100
- ✅ A1: The skill accepts an explicit trait selection within the encoded trait matrix.
- ✅ A2: Two requested modules are exported when they are valid.
- ✅ A3: Module-specific scatter plots are produced for the requested exports.
- ✅ A4: The module-selection behavior matches the stated CLI contract.
- ✅ A5: The output remains within the intended WGCNA analysis scope.
Pass rate: 5 / 5
Stress: Chunked loading path (✅ Pass, 89)
Chunked loading completed successfully and repeated smoke-test outputs remained deterministic.
Basic 36/40 | Specialized 53/60 | Total 89/100
- ✅ A1: The chunked-loading execution path completes successfully.
- ✅ A2: The chunked workflow emits useful progress feedback.
- ✅ A3: The chunked run still produces the baseline required outputs.
- ✅ A4: Repeated runs with the same seed remain deterministic on key summary files.
- ✅ A5: The stress run remains within the intended WGCNA workflow.
Pass rate: 5 / 5
Medical Task Total: 89.8 / 100
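The Medical Task total is consistent with an unweighted mean of the five per-case totals, each being the sum of its Basic and Specialized parts. The sketch below reproduces that arithmetic; equal weighting is an assumption inferred from the numbers, and the names `TASK_SCORES`, `task_totals`, and `execution_average` are illustrative.

```python
from statistics import mean

# (basic out of 40, specialized out of 60) per audited case, from the report.
TASK_SCORES = {
    "Canonical": (37, 53),
    "Variant A": (36, 52),
    "Edge": (38, 53),
    "Variant B": (37, 54),
    "Stress": (36, 53),
}


def task_totals(scores):
    """Per-case total out of 100: basic plus specialized."""
    return {name: basic + specialized for name, (basic, specialized) in scores.items()}


def execution_average(scores):
    """Unweighted mean of the per-case totals, to one decimal place."""
    return round(mean(task_totals(scores).values()), 1)


if __name__ == "__main__":
    print(task_totals(TASK_SCORES))
    print(execution_average(TASK_SCORES))
```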
Key Strengths
- The documented smoke test now matches the shipped fixtures and executes successfully end to end.
- The workflow is reproducible: repeated smoke-test runs produced identical hashes for key summary outputs.
- Input validation and troubleshooting guidance are strong, with consistent SKILL_* error codes and actionable recovery advice.
- The implementation is modular and supports both standard and chunked loading paths without changing the canonical summary on bundled data.
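The reproducibility claim rests on comparing hashes of key summary outputs across repeated runs. The report does not say which hash function or which files were compared, so the following is only a sketch under stated assumptions: SHA-256 is assumed, the helper names are hypothetical, and the caller supplies the file names to compare.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_runs(run_a, run_b, filenames):
    """Return the file names that are missing or differ between two output dirs."""
    mismatched = []
    for name in filenames:
        file_a, file_b = Path(run_a) / name, Path(run_b) / name
        if not file_a.is_file() or not file_b.is_file():
            mismatched.append(name)  # a missing file counts as a mismatch
        elif sha256_of(file_a) != sha256_of(file_b):
            mismatched.append(name)
    return mismatched
```

An empty list from `compare_runs` for two runs with the same seed would match the determinism reported here. Byte-level hashing is strict: outputs that embed timestamps or unordered rows would need normalizing before comparison.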