diagnostic-study-quality-assessment-quadas-2
Analyzes clinical diagnostic accuracy studies for bias using the QUADAS-2 tool. Use when Claude needs to assess the quality, risk of bias, or applicability of diagnostic accuracy studies (e.g., "Assess this paper using QUADAS-2").
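For orientation, a QUADAS-2 assessment can be sketched as a small data structure. The four domains and the low/high/unclear judgments come from the published QUADAS-2 tool. The class names, the `overall_risk_of_bias` helper, and its roll-up rule (low overall only when every domain is low) are illustrative assumptions, not the packaged scripts' actual interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Judgment(Enum):
    LOW = "low"
    HIGH = "high"
    UNCLEAR = "unclear"


# The four QUADAS-2 domains. Applicability concerns are rated only for the
# first three; flow and timing has a risk-of-bias judgment only.
DOMAINS = ("patient_selection", "index_test", "reference_standard", "flow_and_timing")


@dataclass
class DomainRating:
    """Hypothetical container for one domain's judgments."""
    risk_of_bias: Judgment
    applicability: Optional[Judgment] = None  # left as None for flow_and_timing


def overall_risk_of_bias(ratings: Dict[str, DomainRating]) -> str:
    """Roll up domain judgments (assumed rule: any high/unclear means at risk)."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    if all(ratings[d].risk_of_bias is Judgment.LOW for d in DOMAINS):
        return "low risk of bias"
    return "at risk of bias"
```

Keeping each domain's judgment separate, rather than collapsing to a numeric score, matches how QUADAS-2 results are usually reported.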
Veto Gates
Required pass for any deployment consideration
| Dimension | Result | Detail |
|---|---|---|
| Scientific Integrity | PASS | Scientific integrity held because extraction and analysis outputs stayed tied to provided text, metadata, or runtime evidence rather than invented study findings. |
| Practice Boundaries | PASS | The archived review kept this package within its stated scope (analyzing clinical diagnostic accuracy studies for bias using the QUADAS-2 tool) and did not drift into freeform inference detached from source data. |
| Methodological Ground | PASS | Methodological grounding held because the package kept its judgments tied to explicit rubric logic. |
| Code Usability | PASS | The legacy audit did not flag code-usability issues for the packaged diagnostic-study-quality-assessment-quadas-2 workflow. |
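The veto-gate rule (every dimension must pass before deployment is considered) can be sketched as a simple all-or-nothing check. The function name `veto_gates_pass` and the dictionary layout are hypothetical; only the four dimensions and their results are taken from the table above.

```python
from typing import Dict


def veto_gates_pass(results: Dict[str, str]) -> bool:
    """Return True only if at least one gate exists and every gate is 'PASS'."""
    return bool(results) and all(result == "PASS" for result in results.values())


# Results as reported in the table above.
GATES = {
    "Scientific Integrity": "PASS",
    "Practice Boundaries": "PASS",
    "Methodological Ground": "PASS",
    "Code Usability": "PASS",
}

print(veto_gates_pass(GATES))  # True
```

Treating an empty result set as a failure is a design choice in this sketch, so that a missing audit cannot be mistaken for a clean one.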
Core Capability
81 / 100 (8 categories)
Medical Task
Execution average: 88.6 / 100; assertions: 20/20 passed
The archived run treated the skill description ("Analyzes clinical diagnostic accuracy studies for bias using the...") as a bounded analysis workflow rather than a purely narrative instruction path.
This variant case stayed within the packaged analysis boundary and kept a reviewable task contract.
This edge case stayed within the packaged analysis boundary and kept a reviewable task contract.
Packaged executable path(s): scripts/pdf_extractor.py plus 1... remained an analysis-style extraction path whose value came from structured data capture rather than a freeform narrative response.
The archived run treated the skill description ("Analyzes clinical diagnostic accuracy studies for bias using the QUADAS-2 tool") as a bounded analysis workflow rather than a purely narrative instruction path.
Key Strengths
- Primary routing is Data Analysis with execution mode B
- Static quality score is 81/100 and dynamic average is 77.6/100
- Assertions and command execution outcomes are recorded per input for human review
- Execution verification summary: Script verification 1/2; adjustment=3. pdf_extractor.py: rc=1; quadas_assessment.py: OK
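A minimal sketch of how a per-script summary like the one above could be tallied. It assumes `rc` is a process exit code with 0 meaning success. The `summarize_scripts` helper is hypothetical, and the `adjustment=3` field is left out because the report does not define it.

```python
from typing import Dict


def summarize_scripts(return_codes: Dict[str, int]) -> str:
    """Build a 'passed/total' line plus a per-script status (assumes rc 0 is OK)."""
    passed = sum(1 for rc in return_codes.values() if rc == 0)
    statuses = []
    for name, rc in return_codes.items():
        status = "OK" if rc == 0 else "rc=" + str(rc)
        statuses.append(f"{name}: {status}")
    return f"Script verification {passed}/{len(return_codes)}. " + "; ".join(statuses)


# Return codes as reported in the summary above.
print(summarize_scripts({"pdf_extractor.py": 1, "quadas_assessment.py": 0}))
```

Recording the raw return code for each failing script, rather than a bare pass/fail flag, keeps the summary useful for the human review mentioned in the list above.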