Evidence Insight

reactome-skill

92 / 100 Total Score
Core Capability
85 / 100
Functional Suitability
11 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
14 / 16
Human Usability
8 / 8
Security
9 / 12
Maintainability
9 / 12
Agent-Specific
17 / 20
Medical Task
20 / 20 Passed
100
You have a list of genes/proteins and want to run pathway overrepresentation (enrichment) analysis against Reactome
4/4
97
You need to retrieve curated pathway content (hierarchy, reactions, participants) by Reactome stable IDs (e.g., R-HSA-69278)
4/4
95
Pathway enrichment (overrepresentation) for identifier lists
4/4
94
Expression analysis by mapping expression data to Reactome pathways
4/4
94
End-to-end case for Pathway enrichment (overrepresentation) for identifier lists
4/4

Veto Gates
Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | Scientific content remained anchored to fetched metadata or source-linked evidence in the legacy review.
Practice Boundaries | PASS | Practice boundaries held because the package remained focused on source handling, lookup, or structured evidence use.
Methodological Ground | PASS | No methodological-grounding issue was recorded for reactome-skill in the archived evaluation.
Code Usability | PASS | Code usability passed because the search or lookup workflow still exposed a usable entrypoint and output expectation.

Core Capability: 85 / 100 (8 Categories)

Functional Suitability
A modest deduction remained in functional suitability for reactome-skill in the archived review.
11 / 12
92%
Reliability
The archived evaluation left some headroom for reactome-skill under reliability.
9 / 12
75%
Performance & Context
No point loss was recorded for performance and context in the legacy audit.
8 / 8
100%
Agent Usability
The archived evaluation left some headroom for reactome-skill under agent usability.
14 / 16
88%
Human Usability
No point loss was recorded for human usability in the legacy audit.
8 / 8
100%
Security
The archived evaluation left some headroom for reactome-skill under security.
9 / 12
75%
Maintainability
A modest deduction remained in maintainability for reactome-skill in the archived review.
9 / 12
75%
Agent-Specific
The archived evaluation left some headroom for reactome-skill under the agent-specific category.
17 / 20
85%
Core Capability Total: 85 / 100
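The category scores above can be cross-checked with simple arithmetic. The sketch below uses only the numbers listed in this report; the variable names are illustrative and are not part of the skill.

```python
# (earned, maximum) per category, copied from the Core Capability breakdown.
categories = {
    "Functional Suitability": (11, 12),
    "Reliability": (9, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (8, 8),
    "Security": (9, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (17, 20),
}

total_earned = sum(earned for earned, _ in categories.values())
total_max = sum(maximum for _, maximum in categories.values())

# Unrounded percentage per category; the report displays rounded values.
percentages = {name: 100 * e / m for name, (e, m) in categories.items()}

print(f"Core Capability Total: {total_earned} / {total_max}")
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```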

Medical Task: Execution Average 96 / 100, Assertions 20/20 Passed

100
Canonical
You have a list of genes/proteins and want to run pathway overrepresentation (enrichment) analysis against Reactome
4/4
97
Variant A
You need to retrieve curated pathway content (hierarchy, reactions, participants) by Reactome stable IDs (e.g., R-HSA-69278)
4/4
95
Edge
Pathway enrichment (overrepresentation) for identifier lists
4/4
94
Variant B
Expression analysis by mapping expression data to Reactome pathways
4/4
94
Stress
End-to-end case for Pathway enrichment (overrepresentation) for identifier lists
4/4
100
Canonical: ✅ Pass
You have a list of genes/proteins and want to run pathway overrepresentation (enrichment) analysis against Reactome

The "You have a list of genes/proteins and want to run pathway..." scenario completed within the documented skill boundary ("Query the Reactome REST API for pathway content and enrichment analyses").

Basic 38/40 | Specialized 60/60 | Total 100/100
A1: The reactome-skill output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
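For orientation, an overrepresentation request of the kind this scenario exercises might look like the sketch below. It is not taken from reactome_tool.py, whose contents are not shown in this report. It assumes the public Reactome Analysis Service endpoint `/identifiers/projection`, which accepts a plain-text body with one identifier per line. The gene symbols and the function names are purely illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: Reactome Analysis Service, identifiers projected to human pathways.
ANALYSIS_URL = "https://reactome.org/AnalysisService/identifiers/projection"


def build_enrichment_request(identifiers, page_size=20):
    """Build (but do not send) a POST request for overrepresentation analysis."""
    query = urllib.parse.urlencode(
        {"pageSize": page_size, "page": 1, "sortBy": "ENTITIES_PVALUE", "order": "ASC"}
    )
    body = "\n".join(identifiers).encode("utf-8")  # one identifier per line
    return urllib.request.Request(
        f"{ANALYSIS_URL}?{query}",
        data=body,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def run_enrichment(identifiers, timeout=30):
    """Send the request and return the decoded JSON result (needs network access)."""
    request = build_enrichment_request(identifiers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Illustrative gene symbols only; no request is sent here.
request = build_enrichment_request(["TP53", "BRCA1", "CDK2"])
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the sketch inspectable offline; calling `run_enrichment` would contact the live service.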
97
Variant A: ✅ Pass
You need to retrieve curated pathway content (hierarchy, reactions, participants) by Reactome stable IDs (e.g., R-HSA-69278)

The "You need to retrieve curated pathway content (hierarchy, reactions,..." scenario completed within the documented skill boundary ("Query the Reactome REST API for pathway content and enrichment analyses").

Basic 36/40 | Specialized 60/60 | Total 97/100
A1: The reactome-skill output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
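A lookup by stable ID, as in this scenario, could be sketched as follows. This is an assumption-laden illustration and not the skill's own code: it assumes the public Reactome Content Service paths `/data/query/{id}`, `/data/pathway/{id}/containedEvents`, and `/data/participants/{id}`. The ID pattern and the `content_urls` helper are hypothetical conveniences.

```python
import re
import urllib.parse

# Assumed base URL of the public Reactome Content Service.
CONTENT_URL = "https://reactome.org/ContentService"

# Assumed shape of a stable ID such as R-HSA-69278:
# "R", a three-letter species code, a number, and an optional ".version" suffix.
STABLE_ID = re.compile(r"R-[A-Z]{3}-\d+(\.\d+)?")


def content_urls(stable_id):
    """Return Content Service URLs for one stable ID, rejecting malformed IDs."""
    if not STABLE_ID.fullmatch(stable_id):
        raise ValueError(f"not a Reactome stable ID: {stable_id!r}")
    sid = urllib.parse.quote(stable_id)
    return {
        "entry": f"{CONTENT_URL}/data/query/{sid}",
        "contained_events": f"{CONTENT_URL}/data/pathway/{sid}/containedEvents",
        "participants": f"{CONTENT_URL}/data/participants/{sid}",
    }


urls = content_urls("R-HSA-69278")
for label, url in urls.items():
    print(label, url)
```

Validating the ID before building a URL gives an early, local error for malformed input instead of a failed HTTP call.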
95
Edge: ✅ Pass
Pathway enrichment (overrepresentation) for identifier lists

The archived evaluation treated "Pathway enrichment (overrepresentation) for identifier lists" as a clean in-scope run.

Basic 35/40 | Specialized 60/60 | Total 95/100
A1: The reactome-skill output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
94
Variant B: ✅ Pass
Expression analysis by mapping expression data to Reactome pathways

The archived evaluation treated "Expression analysis by mapping expression data to Reactome pathways" as a clean in-scope run.

Basic 34/40 | Specialized 60/60 | Total 94/100
A1: The reactome-skill output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
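Expression analysis differs from plain enrichment mainly in the payload. A minimal sketch of preparing such a payload follows, assuming the Reactome Analysis Service convention of a header line starting with `#` followed by tab-separated rows of an identifier and numeric values. The helper name, sample names, and values are hypothetical and are not drawn from the evaluated skill.

```python
def format_expression_payload(rows, sample_names):
    """Format {identifier: [values...]} as tab-separated text with a '#' header line."""
    header = "#" + "\t".join(["Identifier", *sample_names])
    lines = [header]
    for identifier, values in rows.items():
        if len(values) != len(sample_names):
            raise ValueError(f"{identifier}: expected {len(sample_names)} values")
        # float() rejects non-numeric entries before they reach the service.
        lines.append("\t".join([identifier, *(str(float(v)) for v in values)]))
    return "\n".join(lines)


# Illustrative identifiers and values only.
payload = format_expression_payload(
    {"TP53": [1.5, 2.0], "CDK2": [0.25, 3.0]},
    ["control", "treated"],
)
print(payload)
```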
94
Stress: ✅ Pass
End-to-end case for Pathway enrichment (overrepresentation) for identifier lists

The "End-to-end case for Pathway enrichment (overrepresentation) for..." scenario completed within the documented skill boundary ("Query the Reactome REST API for pathway content and enrichment analyses").

Basic 31/40 | Specialized 60/60 | Total 94/100
A1: The reactome-skill output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 96 / 100
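The Medical Task total is consistent with the mean of the five scenario totals listed above, as this short check shows. The dictionary is transcribed from this report, and the variable names are illustrative.

```python
from statistics import mean

# Scenario totals as listed in the Medical Task section.
scenario_totals = {
    "Canonical": 100,
    "Variant A": 97,
    "Edge": 95,
    "Variant B": 94,
    "Stress": 94,
}

execution_average = mean(scenario_totals.values())
# Each scenario reports 4 of 4 assertions passed.
assertions_passed = 4 * len(scenario_totals)

print(f"Execution average: {execution_average} / 100")
print(f"Assertions passed: {assertions_passed}/{assertions_passed}")
```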

Key Strengths

  • Primary routing is Evidence Insight with execution mode B
  • Static quality score is 85/100 and dynamic average is 83.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 1/1; adjustment=5. reactome_tool.py: OK