Data Analysis

singlecell-portal

89 / 100 Total Score
Core Capability
84 / 100
Functional Suitability
10 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
13 / 16
Human Usability
8 / 8
Security
10 / 12
Maintainability
10 / 12
Agent-Specific
16 / 20
Medical Task
20 / 20 Passed
96 | You need to discover relevant public single-cell studies by filtering on organism (e.g., human/mouse) and tissue (e.g., lung/brain) | 4/4
92 | You want to quickly retrieve study-level metadata (e.g., study name, accession, cell counts) for downstream curation or reporting | 4/4
90 | Direct REST access to the official Single Cell Portal API (/single_cell/api/v1/*) | 4/4
90 | No API key required (public endpoints) | 4/4
90 | End-to-end case for Direct REST access to the official Single Cell Portal API (/single_cell/api/v1/*) | 4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | No scientific-integrity problem was surfaced because the package did not claim more than the available records, article text, or script evidence supported.
Practice Boundaries | PASS | The archived review kept this package within its documented scope ("Programmatically query public single-cell study metadata from the Broad Institute Single..."), not freeform inference detached from source data.
Methodological Ground | PASS | Methodological grounding was preserved through the documented inputs, transformations, and expected artifacts.
Code Usability | PASS | Code usability passed because the package still exposed a reviewable execution surface for its documented workflow.

Core Capability: 84 / 100 (8 Categories)

Functional Suitability
Related legacy finding for singlecell-portal: 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.
10 / 12
83%
Reliability
The archived deduction in reliability traces back to the legacy finding 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.
9 / 12
75%
Performance & Context
No point loss was recorded for Performance & Context in the legacy audit.
8 / 8
100%
Agent Usability
Agent usability was strong, but the workflow could surface its entry conditions a little more directly.
13 / 16
81%
Human Usability
Human usability reached full score in the archived evaluation.
8 / 8
100%
Security
The packaged workflow stayed safe overall, with only a small remaining deduction around boundary signaling.
10 / 12
83%
Maintainability
The archived review treated the package as maintainable, while still preserving some room for cleanup.
10 / 12
83%
Agent-Specific
The Agent-Specific score was softened by the legacy issue 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.
16 / 20
80%
Core Capability Total: 84 / 100
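
The total above is the plain sum of the eight category scores, and each percentage is earned points over the category maximum, rounded to the nearest whole number. A minimal Python sketch that reproduces this arithmetic from the figures listed in this section (the names `CATEGORY_SCORES` and `summarize` are illustrative and not part of the evaluated package):

```python
# Category scores copied from the Core Capability section: (earned, maximum).
CATEGORY_SCORES = {
    "Functional Suitability": (10, 12),
    "Reliability": (9, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (13, 16),
    "Human Usability": (8, 8),
    "Security": (10, 12),
    "Maintainability": (10, 12),
    "Agent-Specific": (16, 20),
}


def summarize(scores):
    """Return total earned points, total maximum, and per-category percentages."""
    earned = sum(e for e, _ in scores.values())
    maximum = sum(m for _, m in scores.values())
    percentages = {name: round(100 * e / m) for name, (e, m) in scores.items()}
    return earned, maximum, percentages


earned, maximum, percentages = summarize(CATEGORY_SCORES)
print(f"Core Capability Total: {earned} / {maximum}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```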

Medical Task: Execution Average 91.6 / 100; Assertions 20/20 Passed

96
Canonical: ✅ Pass
You need to discover relevant public single-cell studies by filtering on organism (e.g., human/mouse) and tissue (e.g., lung/brain)

The archived run treated "You need to discover relevant public single-cell studies by..." as a bounded analysis workflow rather than a purely narrative instruction path.

Basic 36/40 | Specialized 60/60 | Total 96/100
A1: The singlecell-portal output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
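
The organism and tissue filtering in this case can be pictured as a simple match over study records. The sketch below filters records that have already been retrieved. It does not call the portal, and the record keys (`organism`, `tissue`, `accession`) and the sample values are placeholders assumed for illustration, since the report does not document the response schema.

```python
def filter_studies(studies, organism=None, tissue=None):
    """Keep studies whose organism and tissue match, ignoring case.

    The key names "organism" and "tissue" are assumptions for this sketch,
    not fields confirmed by the evaluated package.
    """

    def matches(value, wanted):
        if wanted is None:  # no filter requested for this field
            return True
        return (value or "").strip().lower() == wanted.strip().lower()

    return [
        study
        for study in studies
        if matches(study.get("organism"), organism)
        and matches(study.get("tissue"), tissue)
    ]


# Made-up placeholder records, not real portal data.
sample = [
    {"accession": "EXAMPLE_1", "organism": "human", "tissue": "lung"},
    {"accession": "EXAMPLE_2", "organism": "mouse", "tissue": "brain"},
    {"accession": "EXAMPLE_3", "organism": "Human", "tissue": "brain"},
]

human_brain = filter_studies(sample, organism="human", tissue="brain")
print([study["accession"] for study in human_brain])
```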
92
Variant A: ✅ Pass
You want to quickly retrieve study-level metadata (e.g., study name, accession, cell counts) for downstream curation or reporting

The archived run treated "You want to quickly retrieve study-level metadata (e.g., study..." as a bounded extraction workflow, keeping attention on source fields, fallback logic, and output shape.

Basic 34/40 | Specialized 58/60 | Total 92/100
A1: The singlecell-portal output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
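
To make "source fields, fallback logic, and output shape" concrete, here is a hedged sketch that flattens study records into CSV rows for curation or reporting. The field names (`accession`, `name`, `cell_count`) are assumed stand-ins for the study name, accession, and cell counts mentioned above, and the empty-string fallback for missing values is one possible choice rather than the package's documented behavior.

```python
import csv
import io

# Assumed field names for this sketch; the real response keys may differ.
FIELDS = ("accession", "name", "cell_count")


def to_rows(studies):
    """Project each record onto FIELDS, using "" when a value is missing or None."""
    rows = []
    for study in studies:
        row = {}
        for field in FIELDS:
            value = study.get(field)
            row[field] = "" if value is None else value
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up placeholder records, not real portal data.
sample = [
    {"accession": "EXAMPLE_1", "name": "Placeholder lung study", "cell_count": 1200, "extra": "ignored"},
    {"accession": "EXAMPLE_2", "name": "Placeholder brain study"},
]

csv_text = to_csv(to_rows(sample))
print(csv_text)
```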
90
Edge: ✅ Pass
Direct REST access to the official Single Cell Portal API (/single_cell/api/v1/*)

The archived run treated "Direct REST access to the official Single Cell Portal API..." as a bounded analysis workflow rather than a purely narrative instruction path.

Basic 33/40 | Specialized 57/60 | Total 90/100
A1: The singlecell-portal output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
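
As a rough illustration of what direct REST access under `/single_cell/api/v1/` might look like, the sketch below only builds a request URL and headers and performs no network call. The host, the `search` resource, and the `type` and `terms` query parameters are assumptions made for this example and are not confirmed by the report. The only details taken from the report are the path prefix and the absence of an API key.

```python
from urllib.parse import urlencode, urljoin

API_PREFIX = "/single_cell/api/v1/"  # path prefix named in the report
ASSUMED_HOST = "https://singlecell.broadinstitute.org"  # assumption, not from the report


def build_url(resource, params=None, host=ASSUMED_HOST):
    """Join host, API prefix, and resource, then append encoded query parameters."""
    base = urljoin(host, API_PREFIX)
    url = urljoin(base, resource.lstrip("/"))
    if params:
        url = f"{url}?{urlencode(params)}"
    return url


def build_headers():
    """Headers for a public endpoint: no Authorization header, since no API key is required."""
    return {"Accept": "application/json"}


# "search", "type", and "terms" are hypothetical names used only for illustration.
url = build_url("search", {"type": "study", "terms": "lung"})
headers = build_headers()
print(url)
```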
90
Variant B: ✅ Pass
No API key required (public endpoints)

The "No API key required (public endpoints)" case remained tied to the documented analysis contract, even though the preserved evidence centered on instructions instead of a full rerun.

Basic 32/40 | Specialized 58/60 | Total 90/100
A1: The singlecell-portal output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
90
Stress: ✅ Pass
End-to-end case for Direct REST access to the official Single Cell Portal API (/single_cell/api/v1/*)

The "End-to-end case for Direct REST access to the official Single Cell..." case remained tied to the documented analysis contract, even though the preserved evidence centered on instructions instead of a full rerun.

Basic 29/40 | Specialized 60/60 | Total 90/100
A1: The singlecell-portal output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 91.6 / 100
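
The Medical Task total is consistent with an unweighted mean of the five scenario totals, and the assertion count with four assertions per scenario. A short sketch of that calculation, using the scores reported above (the variable names are illustrative):

```python
from statistics import mean

# Scenario totals as reported in the Medical Task section.
SCENARIO_TOTALS = {
    "Canonical": 96,
    "Variant A": 92,
    "Edge": 90,
    "Variant B": 90,
    "Stress": 90,
}
ASSERTIONS_PER_SCENARIO = 4  # A1 through A4

execution_average = round(mean(list(SCENARIO_TOTALS.values())), 1)
total_assertions = ASSERTIONS_PER_SCENARIO * len(SCENARIO_TOTALS)

print(f"Execution Average: {execution_average} / 100")
print(f"Assertions: {total_assertions}/{total_assertions} Passed")
```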

Key Strengths

  • Primary routing is Data Analysis with execution mode B
  • Static quality score is 84/100 and dynamic average is 78.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 1/1; adjustment=5. query.py: OK