Data Analysis
gokegg-analysis
Run Gene Ontology (GO) and KEGG pathway enrichment analysis on gene lists using clusterProfiler. Inputs: DEG or candidate gene list, organism background. Outputs: enrichment result tables, dot plots, bar plots, enrichment map.
86 / 100 Total Score
Core Capability
87 / 100
Functional Suitability
11 / 12
Reliability
10 / 12
Performance & Context
7 / 8
Agent Usability
14 / 16
Human Usability
7 / 8
Security
11 / 12
Maintainability
10 / 12
Agent-Specific
17 / 20
Medical Task
21 / 22 Passed
93 Human SYMBOL smoke test
4/4
88 Mouse ENSEMBL example
5/5
64 Unsupported species validation
3/4
91 Mixed-separator human input
4/4
93 Custom plotting parameter run
5/5
Veto Gates
Required pass for any deployment consideration
Skill Veto: ✓ All 4 gates passed
✓ Operational Stability
System remains stable across varied inputs and edge cases
PASS
✓ Structural Consistency
Output structure conforms to expected skill contract format
PASS
✓ Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
✓ System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
| Dimension | Result | Detail |
|---|---|---|
| Scientific Integrity | PASS | No fabricated statistics, citations, or invented biological claims were observed in any tested output. |
| Practice Boundaries | PASS | Outputs stayed within enrichment-analysis scope and did not provide diagnostic or prescriptive medical advice. |
| Methodological Ground | PASS | The ORA-based GO and KEGG workflow was appropriate for the tested gene-list inputs, although one summary line overstated empty KEGG results. |
| Code Usability | PASS | R scripts executed successfully on four valid scenarios and failed safely with a controlled validation error on the invalid scenario. |
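For readers unfamiliar with the ORA (over-representation analysis) method named in the table, the following is a minimal sketch of the test it relies on: an upper-tail hypergeometric probability for one term. It is an illustration in Python, not the skill's R code, and the function name `ora_p_value` and the toy counts are assumptions made for this example.

```python
from math import comb


def ora_p_value(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k) for one term.

    N: genes in the background universe
    K: background genes annotated to the term
    n: genes in the input list that fall inside the universe
    k: input genes annotated to the term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)  # all ways to draw n genes from the universe
    upper = min(n, K)   # the overlap cannot exceed either set size
    # Sum the probability of seeing k, k+1, ..., upper annotated genes.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


if __name__ == "__main__":
    # Toy counts (hypothetical): 10 background genes, 3 annotated,
    # 2 genes in the input list, 1 of them annotated.
    print(ora_p_value(k=1, n=2, K=3, N=10))
```

In practice the raw p-values for all tested terms are then adjusted for multiple testing before a term is reported as enriched; that step is omitted here.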
Core Capability: 87 / 100 (8 Categories)
Functional Suitability
Core enrichment, plotting, and reporting coverage is strong after the empty-result summary fix, with only minor recovery guidance gaps remaining.
11 / 12
92%
Reliability
Validation and regression coverage are solid, and empty-result handling now aligns warnings with final summary states.
10 / 12
83%
Performance & Context
Documentation is layered reasonably well and the workflow is concise, though the SKILL.md still carries a fair amount of inline detail.
7 / 8
88%
Agent Usability
Sectioning and expected outputs are clear, and the repaired CLI guide now provides consistent examples for agents to follow.
14 / 16
88%
Human Usability
Trigger language is natural and the fixed CLI examples improve forgiveness for users following the documentation literally.
7 / 8
88%
Security
No unsafe shell execution or credential handling issues were found, and user parameters are validated before analysis starts.
11 / 12
92%
Maintainability
The code is modular and testable, but duplicated documentation content increases drift risk across references and behavior.
10 / 12
83%
Agent-Specific
Trigger precision and scope boundaries are good, and partial-success reporting is now clearer for empty enrichment outputs.
17 / 20
85%
Core Capability Total: 87 / 100
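The category scores above can be cross-checked against the reported total and percentages with a short sketch. The scores are copied from this scorecard; the rounding rule (round half up to a whole percent) is an assumption, chosen because it reproduces the percentages shown.

```python
from math import floor

# (score, maximum) per category, copied from the scorecard above.
categories = {
    "Functional Suitability": (11, 12),
    "Reliability": (10, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (7, 8),
    "Security": (11, 12),
    "Maintainability": (10, 12),
    "Agent-Specific": (17, 20),
}


def percent(score: int, maximum: int) -> int:
    """Whole-number percentage, rounding halves up (assumed rule)."""
    return floor(100 * score / maximum + 0.5)


total = sum(score for score, _ in categories.values())
maximum = sum(limit for _, limit in categories.values())

if __name__ == "__main__":
    for name, (score, limit) in categories.items():
        print(f"{name}: {score} / {limit} ({percent(score, limit)}%)")
    print(f"Core Capability Total: {total} / {maximum}")
```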
Medical Task: Execution Average 85.8 / 100, Assertions 21/22 Passed
93
Canonical ✅ Pass
Human SYMBOL smoke test
Executed cleanly and produced all documented outputs.
Basic 38/40 | Specialized 55/60 | Total 93/100
✅ A1: Output completes GO, KEGG, plotting, and session-info generation
✅ A2: Output reports parsed gene count and final status fields
✅ A3: Output enumerates key files promised by the skill
✅ A4: Output stays within enrichment-analysis scope
Pass rate: 4 / 4
88
Variant A ✅ Pass
Mouse ENSEMBL example
Completed cleanly with a KEGG-empty warning and an accurate EMPTY summary state.
Basic 35/40 | Specialized 53/60 | Total 88/100
✅ A1: Output handles the documented mouse ENSEMBL input without crashing
✅ A2: Output warns when KEGG returns no enriched pathways
✅ A3: Summary does not overstate KEGG success when KEGG results are empty
✅ A4: Plot generation degrades gracefully when only GO results remain
✅ A5: Run still records reproducibility metadata
Pass rate: 5 / 5
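The behavior checked by A2 and A3 (warn on an empty KEGG result and keep the summary honest about it) could be sketched roughly as follows. This is a Python illustration, not the skill's R code: the `EMPTY` state comes from this report, while the `OK` and `PARTIAL` labels, the `summarize` function, and the row counts are hypothetical.

```python
def summarize(row_counts: dict[str, int]) -> dict:
    """Derive per-analysis and overall status from enriched-term counts."""
    # An analysis with zero enriched terms is EMPTY, never reported as OK.
    status = {name: ("OK" if rows > 0 else "EMPTY")
              for name, rows in row_counts.items()}
    warnings = [f"{name} returned no enriched terms"
                for name, state in status.items() if state == "EMPTY"]
    if all(state == "EMPTY" for state in status.values()):
        overall = "EMPTY"
    elif warnings:
        overall = "PARTIAL"  # hypothetical label for partial success
    else:
        overall = "OK"
    return {"status": status, "overall": overall, "warnings": warnings}


if __name__ == "__main__":
    # Hypothetical counts: GO found terms, KEGG found none.
    print(summarize({"GO": 12, "KEGG": 0}))
```

Deriving the warnings and the overall state from the same per-analysis status keeps the two from disagreeing, which is the property A3 tests.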
64
Edge ⚠️ Warning
Unsupported species validation
Failed safely with a clear SKILL_INVALID_PARAMETER error before analysis work began.
Basic 26/40 | Specialized 38/60 | Total 64/100
✅ A1: Unsupported species is rejected safely before downstream analysis
✅ A2: Error output includes an exact SKILL code
❌ A3: Error output gives a concrete next step for retry
✅ A4: Failure path avoids misleading success summaries
Pass rate: 3 / 4
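The failed assertion A3 asks for a concrete retry step in the error output. One way to satisfy it, sketched in Python rather than the skill's R, is to carry the next step inside the error itself. The `SKILL_INVALID_PARAMETER` code is taken from this report; the `SkillError` class, the `validate_species` function, and the supported-species tuple are hypothetical.

```python
# Hypothetical list; the real supported species are defined by the skill.
SUPPORTED_SPECIES = ("human", "mouse")


class SkillError(ValueError):
    """Validation error that carries a code and a retry hint."""

    def __init__(self, code: str, message: str, next_step: str):
        super().__init__(f"[{code}] {message} Next step: {next_step}")
        self.code = code
        self.next_step = next_step


def validate_species(species: str) -> str:
    """Return the normalized species name or raise before any analysis."""
    normalized = species.strip().lower()
    if normalized not in SUPPORTED_SPECIES:
        raise SkillError(
            "SKILL_INVALID_PARAMETER",
            f"Unsupported species '{species}'.",
            f"retry with one of: {', '.join(SUPPORTED_SPECIES)}.",
        )
    return normalized


if __name__ == "__main__":
    try:
        validate_species("zebrafish")
    except SkillError as err:
        print(err)
```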
91
Variant B ✅ Pass
Mixed-separator human input
Executed cleanly and confirmed separator normalization in the logs.
Basic 37/40 | Specialized 54/60 | Total 91/100
✅ A1: Mixed separators are normalized into the expected four genes
✅ A2: The documented SVG output path is produced
✅ A3: Summary reports the expected artifact set
✅ A4: Output remains within the skill's stated scope
Pass rate: 4 / 4
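Separator normalization of the kind tested in A1 can be sketched as a single split on a character class. This is a Python illustration, not the skill's R parser: the accepted separators (comma, semicolon, whitespace), the de-duplication step, the `parse_gene_list` name, and the example symbols are all assumptions.

```python
import re


def parse_gene_list(raw: str) -> list[str]:
    """Split a gene string on commas, semicolons, or whitespace."""
    tokens = re.split(r"[,;\s]+", raw.strip())
    genes: list[str] = []
    seen: set[str] = set()
    for token in tokens:
        # Skip empty tokens (e.g. from a trailing comma) and repeats,
        # keeping the first occurrence in input order.
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes


if __name__ == "__main__":
    # Hypothetical input mixing three separator styles.
    print(parse_gene_list("TP53, BRCA1;EGFR\nMYC"))
```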
93
Stress ✅ Pass
Custom plotting parameter run
Handled the multi-parameter plotting request cleanly and produced all expected artifacts.
Basic 38/40 | Specialized 55/60 | Total 93/100
✅ A1: Advanced plotting parameters execute successfully
✅ A2: Output keeps the documented summary structure
✅ A3: Stress input does not break reproducibility controls
✅ A4: Output does not drift outside the skill scope under multi-parameter input
✅ A5: Plot data artifacts are preserved for downstream reuse
Pass rate: 5 / 5
Medical Task Total: 85.8 / 100
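The 85.8 figure is consistent with an unweighted mean of the five scenario totals, and the 21/22 assertion count with a plain sum of the per-scenario pass rates. The sketch below recomputes both from the numbers in this report; treating the average as unweighted is an assumption that happens to match.

```python
# (scenario, basic, specialized, assertions passed, assertions run),
# copied from the scenario results above.
runs = [
    ("Canonical", 38, 55, 4, 4),
    ("Variant A", 35, 53, 5, 5),
    ("Edge", 26, 38, 3, 4),
    ("Variant B", 37, 54, 4, 4),
    ("Stress", 38, 55, 5, 5),
]

# Each scenario total is Basic (out of 40) plus Specialized (out of 60).
totals = [basic + specialized for _, basic, specialized, _, _ in runs]
average = sum(totals) / len(totals)
passed = sum(ok for _, _, _, ok, _ in runs)
asserted = sum(count for _, _, _, _, count in runs)

if __name__ == "__main__":
    print(f"Execution average: {average:.1f} / 100")
    print(f"Assertions: {passed}/{asserted} passed")
```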
Key Strengths
- The core R workflow is genuinely executable and passed four non-trivial runtime scenarios in this audit.
- Input validation and regression tests cover important boundary cases such as separator parsing, empty input, and malformed plotting parameters.
- Documentation clearly states supported species, gene ID types, expected artifacts, and out-of-scope analysis types, and the CLI guide examples are now complete and consistent.
- The implementation is modular, separating parsing, analysis, plotting, and utility logic into focused script files.