evidence-level-ranker
Ranks papers by evidence family, methodological quality tier, validation depth, and claim discipline; assigns anchor, high-value support, context-setting, mechanistic support, or caution citation roles. Polished: frontmatter normalized to canonical schema; reference module integration corrected to actual file names; p-value proxy check added to Step 3; Input Validation section added.
Veto Gates
Required pass for any deployment consideration
| Dimension | Result | Detail |
|---|---|---|
| Scientific Integrity | PASS | Hard Rules 11-14 prohibit fabricating references, PMIDs, DOIs, validation claims, sample sizes, and effect estimates; the Section J verification-notes requirement is enforced. |
| Practice Boundaries | PASS | Explicitly prohibits turning evidence ranking into clinical advice or treatment recommendations; citation-priority framing correctly scoped to manuscript-support use. |
| Methodological Ground | PASS | Four-dimension ranking framework (evidence family, methodological quality, validation depth, claim discipline) is methodologically sound; Hard Rule 1 correctly separates design label from true evidence value. |
| Code Usability | N/A | Mode A evidence appraisal skill; no code generated. |
Core Capability
84 / 100 — 8 Categories
Medical Task Execution
Average: 81.4 / 100 — Assertions: 32/35 Passed
- All four dimensions assessed separately; meta-analysis not auto-ranked #1; citation roles assigned with explicit reasoning; uncertainties section present.
- Non-comparable papers identified rather than forced into a single ladder; different evidence roles explained; clinical vs mechanistic separation maintained; journal prestige not used as a criterion.
- Poorly executed RCT correctly ranked below well-executed cohort; overclaim pattern identified; caution citation applied. One instance of statistical significance used as a partial proxy for methodological quality.
- Heterogeneity identified as a quality limitation; meta-analysis not auto-ranked above primary studies; claim discipline appropriately downgraded; no fabricated I² statistics.
- Evidence families correctly identified for all 7 papers; non-comparable roles handled; clinical vs mechanistic separation maintained. Minor: citation role assignments for the 7 papers show some grouping without adequate per-paper differentiation.
- Clinical treatment decision correctly identified as beyond citation-priority scope; evidence ranking provided for manuscript/research use; no treatment recommendation generated.
- Proceeds with available material per the input validation policy; limitations labeled. Minor: methodological quality claims drawn from titles alone are not consistently labeled as provisional throughout Sections C-E.
Key Strengths
- Four-dimension ranking framework (evidence family, methodological quality, validation depth, claim discipline) prevents design-label-to-rank conflation — a common appraisal error
- Five citation roles (anchor, high-value support, context-setting, mechanistic support, caution) give manuscript authors actionable guidance beyond a generic quality score
- 17 hard rules explicitly address the most common evidence appraisal errors, including prestige ranking, statistical significance conflation, and validation overclaiming
- Design-role separation rule (Hard Rules 15-16) prevents forcing non-comparable papers into a misleading single-number ranking
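To make the framework concrete: the four-dimension scoring and five citation roles described above could be modeled roughly as below. This is a hypothetical sketch for illustration only; the skill itself generates no code, and all names, scales, and thresholds here are assumptions, not the skill's actual logic.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """One appraised paper; all scales below are illustrative assumptions."""
    title: str
    evidence_family: str   # e.g. "RCT", "cohort", "mechanistic"
    quality_tier: int      # 1 (highest) .. 4 — methodological execution, not design label
    validation_depth: int  # 1 (deep) .. 3 (none)
    claim_discipline: int  # 1 (disciplined) .. 3 (overclaiming)


def assign_role(p: Paper) -> str:
    """Hypothetical citation-role assignment. Dimensions are weighed
    separately, so a well-executed cohort can outrank a poorly executed
    RCT, and mechanistic work stays off the clinical ladder."""
    if p.claim_discipline >= 3:
        return "caution"                  # overclaim pattern flagged
    if p.evidence_family == "mechanistic":
        return "mechanistic support"      # non-comparable role, not force-ranked
    if p.quality_tier == 1 and p.validation_depth == 1:
        return "anchor"
    if p.quality_tier <= 2:
        return "high-value support"
    return "context-setting"


weak_rct = Paper("Underpowered RCT with overclaimed effect", "RCT", 4, 3, 3)
good_cohort = Paper("Large externally validated cohort", "cohort", 2, 1, 1)
print(assign_role(weak_rct))     # "caution" — design label alone does not rank it high
print(assign_role(good_cohort))  # "high-value support"
```

The point of the sketch is the ordering of checks: claim discipline and evidence-family comparability are evaluated before any quality tier is consulted, mirroring the design-role separation the review credits to Hard Rules 15-16.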