Data Analysis

neuropixels-analysis

86 / 100 Total Score
Core Capability
86 / 100
Functional Suitability
11 / 12
Reliability
10 / 12
Performance & Context
7 / 8
Agent Usability
14 / 16
Human Usability
8 / 8
Security
9 / 12
Maintainability
10 / 12
Agent-Specific
17 / 20
Medical Task
20 / 20 Passed
91 | End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB) including preprocessing, motion correction, Kilosort4 spike sorting, QC metrics, and Allen/IBL-style curation; use when processing Neuropixels recordings or when users mention Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, drift/motion correction, or unit curation | 4/4
87 | End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB) including preprocessing, motion correction, Kilosort4 spike sorting, QC metrics, and Allen/IBL-style curation; use when processing Neuropixels recordings or when users mention Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, drift/motion correction, or unit curation | 4/4
85 | Multi-format ingestion: SpikeGLX, Open Ephys, and NWB readers via SpikeInterface | 4/4
85 | Neuropixels-aware preprocessing: | 4/4
85 | End-to-end case for Multi-format ingestion: SpikeGLX, Open Ephys, and NWB readers via SpikeInterface | 4/4

Veto Gates: required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | The archived review kept this workflow anchored to supplied data fields and observable execution behavior, not fabricated results.
Practice Boundaries | PASS | The archived review kept this package within End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB)..., not freeform inference detached from source data.
Methodological Ground | PASS | Methodological grounding was preserved through the documented inputs, transformations, and expected artifacts.
Code Usability | PASS | The legacy audit did not record a code-usability failure in the packaged analysis path.

Core Capability: 86 / 100 (8 Categories)

Functional Suitability
Related legacy finding for neuropixels-analysis: 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.
11 / 12
92%
Reliability
Reliability was softened by the legacy issue 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.
10 / 12
83%
Performance & Context
Performance-context scoring suggests the package could handle larger or denser runs a little more gracefully.
7 / 8
88%
Agent Usability
Agent usability was strong, but the workflow could surface its entry conditions a little more directly.
14 / 16
88%
Human Usability
No point loss was recorded for human usability in the legacy audit.
8 / 8
100%
Security
A modest security gap remained in the archived evaluation despite otherwise controlled workflow behavior.
9 / 12
75%
Maintainability
Maintainability stayed solid, with only limited room to simplify scripts, dependencies, or packaging structure.
10 / 12
83%
Agent-Specific
Agent-specific scoring was softened by the legacy issue 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.
17 / 20
85%
Core Capability Total: 86 / 100
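The category scores above can be cross-checked with a short script. This is a minimal sketch, not the scoring tool itself: the numbers are copied from the eight categories in this section, and the half-up rounding rule is an assumption chosen because it reproduces the percentages shown (for example 7 / 8 = 87.5% reported as 88%). The names `CATEGORY_SCORES`, `percent_half_up`, and `core_capability_total` are hypothetical.

```python
# Category scores as (earned, maximum), copied from the Core Capability section.
CATEGORY_SCORES = {
    "Functional Suitability": (11, 12),
    "Reliability": (10, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (8, 8),
    "Security": (9, 12),
    "Maintainability": (10, 12),
    "Agent-Specific": (17, 20),
}


def percent_half_up(earned, maximum):
    """Integer percentage with halves rounded up (assumed rounding rule)."""
    # Integer arithmetic avoids float rounding surprises at exactly x.5.
    return (200 * earned + maximum) // (2 * maximum)


def core_capability_total(scores):
    """Return (earned, maximum) summed over all categories."""
    earned = sum(e for e, _ in scores.values())
    maximum = sum(m for _, m in scores.values())
    return earned, maximum


if __name__ == "__main__":
    for name, (earned, maximum) in CATEGORY_SCORES.items():
        print(f"{name}: {earned} / {maximum} ({percent_half_up(earned, maximum)}%)")
    total_earned, total_max = core_capability_total(CATEGORY_SCORES)
    print(f"Core Capability Total: {total_earned} / {total_max}")
```

Summing the eight categories gives 86 of a possible 100, which agrees with the reported total.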

Medical Task: Execution Average 86.6 / 100; Assertions 20/20 Passed

91
Canonical: ✅ Pass
End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB) including preprocessing, motion correction, Kilosort4 spike sorting, QC metrics, and Allen/IBL-style curation; use when processing Neuropixels recordings or when users mention Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, drift/motion correction, or unit curation

End-to-end Neuropixels extracellular electrophysiology analysis... remained tied to the documented analysis contract even when the preserved evidence centered on instructions instead of a full rerun.

Basic 36/40 | Specialized 55/60 | Total 91/100
A1: The neuropixels-analysis output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
87
Variant A: ✅ Pass
End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB) including preprocessing, motion correction, Kilosort4 spike sorting, QC metrics, and Allen/IBL-style curation; use when processing Neuropixels recordings or when users mention Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, drift/motion correction, or unit curation

The archived run treated End-to-end Neuropixels extracellular electrophysiology analysis... as a bounded analysis workflow rather than a purely narrative instruction path.

Basic 34/40 | Specialized 53/60 | Total 87/100
A1: The neuropixels-analysis output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
85
Edge: ✅ Pass
Multi-format ingestion: SpikeGLX, Open Ephys, and NWB readers via SpikeInterface

Multi-format ingestion: SpikeGLX, Open Ephys, and NWB readers via... remained tied to the documented analysis contract even when the preserved evidence centered on instructions instead of a full rerun.

Basic 33/40 | Specialized 52/60 | Total 85/100
A1: The neuropixels-analysis output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
85
Variant B: ✅ Pass
Neuropixels-aware preprocessing:

The archived run treated "Neuropixels-aware preprocessing" as a bounded analysis workflow rather than a purely narrative instruction path.

Basic 32/40 | Specialized 53/60 | Total 85/100
A1: The neuropixels-analysis output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
85
Stress: ✅ Pass
End-to-end case for Multi-format ingestion: SpikeGLX, Open Ephys, and NWB readers via SpikeInterface

This stress case stayed within the packaged analysis boundary and kept a reviewable task contract.

Basic 29/40 | Specialized 56/60 | Total 85/100
A1: The neuropixels-analysis output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 86.6 / 100
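The per-case totals and the execution average can be reproduced from the Basic and Specialized sub-scores listed for each case. The sketch below assumes that each case total is simply Basic plus Specialized and that the execution average is the unweighted mean of the five totals; both assumptions match the figures reported here, but the evaluator's actual aggregation is not documented in this report. `CASE_SCORES`, `case_totals`, and `execution_average` are hypothetical names.

```python
from statistics import mean

# (Basic out of 40, Specialized out of 60) per case, copied from the case results.
CASE_SCORES = {
    "Canonical": (36, 55),
    "Variant A": (34, 53),
    "Edge": (33, 52),
    "Variant B": (32, 53),
    "Stress": (29, 56),
}


def case_totals(cases):
    """Total per case, assuming Total = Basic + Specialized."""
    return {name: basic + specialized for name, (basic, specialized) in cases.items()}


def execution_average(cases):
    """Unweighted mean of the per-case totals (assumed aggregation)."""
    return mean(list(case_totals(cases).values()))


if __name__ == "__main__":
    for name, total in case_totals(CASE_SCORES).items():
        print(f"{name}: {total}/100")
    print(f"Execution average: {execution_average(CASE_SCORES):.1f} / 100")
```

Under these assumptions the totals are 91, 87, 85, 85, and 85, and their mean is 86.6.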

Key Strengths

  • Primary routing is Data Analysis with execution mode B
  • Static quality score is 86/100 and dynamic average is 78.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 0/6; adjustment=0. compute_metrics.py: rc=1; explore_recording.py: rc=1; export_to_phy.py: rc=1; neuropixels_pipeline.py: rc=1
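For human review, the execution verification summary can be split into per-script return codes. The following is a minimal sketch that parses the summary string exactly as it appears in this report; the function name `parse_verification` and the returned dictionary layout are hypothetical, and the assumption that a nonzero `rc` means the script failed follows the usual process exit-code convention, not anything the report states.

```python
import re

# Summary string copied verbatim from the report.
SUMMARY = (
    "Script verification 0/6; adjustment=0. "
    "compute_metrics.py: rc=1; explore_recording.py: rc=1; "
    "export_to_phy.py: rc=1; neuropixels_pipeline.py: rc=1"
)


def parse_verification(summary):
    """Extract the passed/total counts and the per-script return codes."""
    header = re.search(r"Script verification (\d+)/(\d+)", summary)
    if header is None:
        raise ValueError("no 'Script verification N/M' header found")
    passed, total = int(header.group(1)), int(header.group(2))

    # Each entry looks like "<script>.py: rc=<int>".
    return_codes = {
        name: int(rc)
        for name, rc in re.findall(r"([\w.-]+\.py): rc=(-?\d+)", summary)
    }
    # Assumption: nonzero return code means the script did not run cleanly.
    failed = sorted(name for name, rc in return_codes.items() if rc != 0)
    return {
        "passed": passed,
        "total": total,
        "return_codes": return_codes,
        "failed": failed,
        "unlisted": total - len(return_codes),  # scripts counted but not named
    }


if __name__ == "__main__":
    result = parse_verification(SUMMARY)
    print(f"{result['passed']}/{result['total']} scripts verified")
    for name in result["failed"]:
        print(f"  failed: {name} (rc={result['return_codes'][name]})")
    print(f"  scripts not named in the summary: {result['unlisted']}")
```

Parsed this way, the summary names four scripts, all with `rc=1`, while the header counts six, so two scripts are counted but not identified in the recorded line.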