Protocol Design
case-control-study-planner
Design a structured case-control study framework with explicit source population logic, control selection rules, matching decisions, exposure measurement planning, and bias-control checkpoints.
Total Score: 86 / 100
| Category | Score |
|---|---|
| Core Capability | 87 / 100 |
| Functional Suitability | 11 / 12 |
| Reliability | 9 / 12 |
| Performance & Context | 7 / 8 |
| Agent Usability | 15 / 16 |
| Human Usability | 6 / 8 |
| Security | 12 / 12 |
| Maintainability | 11 / 12 |
| Agent-Specific | 16 / 20 |
| Medical Task | 24 / 25 Passed |

| Score | Scenario | Assertions |
|---|---|---|
| 86 | Case-control study for postoperative pulmonary complications after major abdominal surgery | 5/5 |
| 86 | Hospital-based case-control study on antibiotic exposure and Clostridioides difficile infection | 5/5 |
| 85 | Community-based matched case-control study on occupational exposure and rare lymphoma | 5/5 |
| 84 | Biomarker measured after diagnosis: validity of using as etiologic exposure | 4/5 |
| 84 | Complex pharmacoepidemiology case-control with multiple exposure windows and confounders by indication | 5/5 |

Veto Gates: required pass for any deployment consideration
Skill Veto: ✓ All 4 gates passed

| Gate | Criterion | Result |
|---|---|---|
| Operational Stability | System remains stable across varied inputs and edge cases | ✓ PASS |
| Structural Consistency | Output structure conforms to expected skill contract format | ✓ PASS |
| Result Determinism | Equivalent inputs produce semantically equivalent outputs | ✓ PASS |
| System Security | No prompt injection, data leakage, or unsafe tool use detected | ✓ PASS |

Research Veto: ✅ PASS (Applicable)

| Dimension | Result | Detail |
|---|---|---|
| Scientific Integrity | PASS | No fabricated references, DOIs, PMIDs, statistical values, or clinical data detected. |
| Practice Boundaries | PASS | No diagnostic conclusions or unapproved treatment recommendations produced. |
| Methodological Ground | PASS | Strong source-population discipline and matching/overmatching warnings enforced. |
| Code Usability | N/A | No code generated; Category 2, design planning only |
Core Capability: 87 / 100 (8 categories)

| Category | Score | Percent | Notes |
|---|---|---|---|
| Functional Suitability | 11 / 12 | 92% | Description is very brief (one sentence); does not adequately communicate the full scope and trigger contexts of the skill |
| Reliability | 9 / 12 | 75% | Proceeds with conditional language when inputs are unclear; no active minimum-clarification protocol defined |
| Performance & Context | 7 / 8 | 88% | Strong score (7/8); minor gaps noted. |
| Agent Usability | 15 / 16 | 94% | Strong score (15/16); minor gaps noted. |
| Human Usability | 6 / 8 | 75% | Very brief description reduces discoverability; users may not recognize this skill without knowing epidemiology terminology |
| Security | 12 / 12 | 100% | Full marks (12/12); no significant issues detected. |
| Maintainability | 11 / 12 | 92% | Strong score (11/12); minor gaps noted. |
| Agent-Specific | 16 / 20 | 80% | Trigger precision low due to brief description; no dedicated progressive disclosure of reference modules in description |

Core Capability Total: 87 / 100
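The category scores above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the evaluation tooling: the dictionary name, the helper functions, and the round-half-up rule for percentages are all assumptions made for illustration. Only the earned and maximum points are taken from the scorecard.

```python
from fractions import Fraction

# (earned, maximum) per category, copied from the Core Capability scorecard.
CORE_CAPABILITY = {
    "Functional Suitability": (11, 12),
    "Reliability": (9, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (15, 16),
    "Human Usability": (6, 8),
    "Security": (12, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (16, 20),
}


def percent(earned, maximum):
    """Whole-number percentage, rounded half up (assumed rounding rule)."""
    return int(Fraction(earned * 100, maximum) + Fraction(1, 2))


def summarize(scores):
    """Return (total earned, total maximum, per-category percentages)."""
    total = sum(earned for earned, _ in scores.values())
    maximum = sum(cap for _, cap in scores.values())
    percentages = {name: percent(e, m) for name, (e, m) in scores.items()}
    return total, maximum, percentages


total, maximum, percentages = summarize(CORE_CAPABILITY)
print(f"Core Capability: {total} / {maximum}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Under these assumptions the eight categories sum to the reported 87 / 100, and each computed percentage agrees with the figure shown next to its category.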
Medical Task: Execution Average 85 / 100; Assertions 24/25 Passed

| Score | Type | Scenario | Assertions |
|---|---|---|---|
| 86 | Canonical | Case-control study for postoperative pulmonary complications after major abdominal surgery | 5/5 ✓ |
| 86 | Variant A | Hospital-based case-control study on antibiotic exposure and Clostridioides difficile infection | 5/5 ✓ |
| 85 | Variant B | Community-based matched case-control study on occupational exposure and rare lymphoma | 5/5 ✓ |
| 84 | Edge | Biomarker measured after diagnosis: validity of using as etiologic exposure | 4/5 ✓ |
| 84 | Stress | Complex pharmacoepidemiology case-control with multiple exposure windows and confounders by indication | 5/5 ✓ |

Canonical (✅ Pass): 86 / 100
Case-control study for postoperative pulmonary complications after major abdominal surgery
5/5 assertions passed.
Basic 35/40 | Specialized 51/60 | Total 86/100
- ✅ A1: Case-control design fit explicitly assessed before proceeding with design
- ✅ A2: Source population defined such that both cases and controls arise from same population
- ✅ A3: Matching strategy decision explicitly justified or rejected with overmatching risk noted
- ✅ A4: Exposure timing and recall bias risks explicitly addressed
- ✅ A5: No fabricated event rates, registry details, or guideline endorsements
Pass rate: 5 / 5
Variant A (✅ Pass): 86 / 100
Hospital-based case-control study on antibiotic exposure and Clostridioides difficile infection
5/5 assertions passed.
Basic 34/40 | Specialized 52/60 | Total 86/100
- ✅ A1: Incident vs prevalent case distinction addressed
- ✅ A2: Exposure window defined with temporal alignment to outcome
- ✅ A3: Bias-control matrix present with at least selection, recall, information, and confounding rows
- ✅ A4: Odds ratio vs risk ratio distinction addressed
- ✅ A5: No clinical utility claims made beyond study scope
Pass rate: 5 / 5
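The odds ratio vs risk ratio assertion in this scenario can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names and all counts are hypothetical, invented for this example, and do not come from the evaluated outputs or from any real study. It shows why the distinction matters: a case-control sample supports an odds ratio, which approximates the risk ratio only when the outcome is rare.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Risk ratio; only interpretable when counts come from a full cohort."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical full-cohort counts (illustrative only, not real data).
rare_outcome = (20, 980, 10, 990)       # risks of 2% vs 1%
common_outcome = (400, 600, 200, 800)   # risks of 40% vs 20%

for label, table in [("rare", rare_outcome), ("common", common_outcome)]:
    print(label, round(odds_ratio(*table), 2), round(risk_ratio(*table), 2))
```

In both hypothetical tables the risk ratio is 2, but the odds ratio stays close to 2 only in the rare-outcome table and drifts upward when the outcome is common.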
Variant B (✅ Pass): 85 / 100
Community-based matched case-control study on occupational exposure and rare lymphoma
5/5 assertions passed.
Basic 34/40 | Specialized 51/60 | Total 85/100
- ✅ A1: Rare outcome justification for case-control stated
- ✅ A2: Matching strategy assessed with analytic consequences stated
- ✅ A3: Conditional logistic regression recommended when individual matching is proposed
- ✅ A4: Recall bias for occupational exposure explicitly flagged
- ✅ A5: No post-outcome measurement accepted as valid baseline exposure
Pass rate: 5 / 5
Edge (✅ Pass): 84 / 100
Biomarker measured after diagnosis: validity of using as etiologic exposure
4/5 assertions passed.
Basic 34/40 | Specialized 50/60 | Total 84/100
- ✅ A1: Post-diagnosis biomarker correctly flagged as invalid etiologic exposure without qualification
- ✅ A2: Reverse-timing distortion risk explicitly named
- ✅ A3: Alternative study framing recommended when biomarker timing is post-outcome
- ✅ A4: No case-control design rejected outright without offering viable alternative
- ❌ A5: Primary recommendation given even when design fit is weak
Pass rate: 4 / 5
Stress (✅ Pass): 84 / 100
Complex pharmacoepidemiology case-control with multiple exposure windows and confounders by indication
5/5 assertions passed.
Basic 35/40 | Specialized 49/60 | Total 84/100
- ✅ A1: Confounding by indication explicitly identified as major bias risk
- ✅ A2: Multiple exposure windows handled with explicit timeline structure
- ✅ A3: Primary statistical analysis line stated (conditional logistic or logistic regression)
- ✅ A4: Feasibility and interpretation limits separated from design recommendations
- ✅ A5: No fabricated drug prevalence, event rates, or guideline positions
Pass rate: 5 / 5
Medical Task Total: 85 / 100
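The Medical Task figures can be reproduced from the five scenario results. This is a minimal sketch assuming the execution average is a plain arithmetic mean of the five scenario totals; the report does not state how it is computed, and the list and function names are hypothetical.

```python
from statistics import mean

# (label, total score, assertions passed, assertions checked),
# copied from the five scenario results in this report.
RUNS = [
    ("Canonical", 86, 5, 5),
    ("Variant A", 86, 5, 5),
    ("Variant B", 85, 5, 5),
    ("Edge", 84, 4, 5),
    ("Stress", 84, 5, 5),
]


def execution_average(runs):
    """Unweighted mean of the scenario totals (assumed aggregation rule)."""
    return mean(score for _, score, _, _ in runs)


def assertion_tally(runs):
    """Total assertions passed and checked across all scenarios."""
    passed = sum(p for _, _, p, _ in runs)
    checked = sum(c for _, _, _, c in runs)
    return passed, checked


average = execution_average(RUNS)
passed, checked = assertion_tally(RUNS)
print(f"Execution average: {average} / 100")
print(f"Assertions: {passed}/{checked} passed")
```

With these inputs the mean matches the reported 85 / 100 and the tally matches 24/25. The overall score of 86 / 100 also equals the simple mean of the Core Capability total (87) and the Medical Task total (85), although the report does not say which weighting it uses.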
Key Strengths
- Strong source-population and sampling-frame discipline enforced throughout — prevents fundamental case-control design errors
- Matching hard rules are excellent: no default matching, overmatching risk explicitly flagged, analytic consequences always stated
- Five reference modules cover question-fit, case/control definition, matching/exposure, bias/analysis, and output style comprehensively
- Explicit prohibition on using post-outcome biomarkers as etiologic exposures — a common error in clinical research