Protocol Design

confounder-and-bias-control-planner

Plans confounder control, variable adjustment logic, and bias mitigation strategies at the protocol stage for clinical, epidemiologic, translational, observational, and biomarker studies. Always use this skill when a user needs to identify major confounders, decide which variable

89 / 100 Total Score
Core Capability: 93 / 100
  Functional Suitability: 12 / 12
  Reliability: 10 / 12
  Performance & Context: 7 / 8
  Agent Usability: 16 / 16
  Human Usability: 7 / 8
  Security: 12 / 12
  Maintainability: 12 / 12
  Agent-Specific: 17 / 20
Medical Task: 25 / 25 Passed
  88 | Baseline CRP as predictor of sepsis mortality: what to adjust for | 5/5
  87 | Retrospective EHR cohort: identify bias and confounders before protocol finalization | 5/5
  87 | Case-control study of smoking and lupus: which variables to match on | 5/5
  87 | Mixed variable list including post-treatment response: role classification challenge | 5/5
  86 | Pressure-test a complex observational protocol with propensity score plan already drafted | 5/5

Veto Gates: required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability | PASS | System remains stable across varied inputs and edge cases
Structural Consistency | PASS | Output structure conforms to expected skill contract format
Result Determinism | PASS | Equivalent inputs produce semantically equivalent outputs
System Security | PASS | No prompt injection, data leakage, or unsafe tool use detected
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | No fabricated references, DOIs, PMIDs, statistical values, or clinical data detected.
Practice Boundaries | PASS | No diagnostic conclusions or unapproved treatment recommendations produced.
Methodological Ground | PASS | Collider bias, mediator misclassification, and immortal time bias all explicitly handled.
Code Usability | N/A | No code generated; Category 2, design planning only.

Core Capability: 93 / 100 (8 Categories)

Functional Suitability | 12 / 12 (100%) | Full marks; no significant issues detected.
Reliability | 10 / 12 (83%) | Strong handling of uncertain variable roles. Gap: no fallback when the user provides no variable list at all.
Performance & Context | 7 / 8 (88%) | Strong score; minor gaps noted.
Agent Usability | 16 / 16 (100%) | Full marks; no significant issues detected.
Human Usability | 7 / 8 (88%) | Strong score; minor gaps noted.
Security | 12 / 12 (100%) | Full marks; no significant issues detected.
Maintainability | 12 / 12 (100%) | Full marks; no significant issues detected.
Agent-Specific | 17 / 20 (85%) | Strong critical posture enforced. The description is well targeted but could be shorter and should include natural trigger phrases.
Core Capability Total: 93 / 100
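
The category scores above can be cross-checked against the reported total and percentages. The sketch below is only a consistency check on the numbers in this report: the names `CATEGORY_SCORES`, `percent`, and `tally` are made up for illustration, and the round-half-up rule is an assumption, since the report does not state how it rounds.

```python
# Category scores transcribed from the Core Capability breakdown: (earned, maximum).
CATEGORY_SCORES = {
    "Functional Suitability": (12, 12),
    "Reliability": (10, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (16, 16),
    "Human Usability": (7, 8),
    "Security": (12, 12),
    "Maintainability": (12, 12),
    "Agent-Specific": (17, 20),
}


def percent(earned: int, maximum: int) -> int:
    """Whole-number percentage, rounding halves up (assumed rule: 87.5 -> 88)."""
    return (200 * earned + maximum) // (2 * maximum)


def tally(scores: dict) -> tuple:
    """Return (total earned, total maximum) across all categories."""
    earned = sum(e for e, _ in scores.values())
    maximum = sum(m for _, m in scores.values())
    return earned, maximum


if __name__ == "__main__":
    for name, (earned, maximum) in CATEGORY_SCORES.items():
        print(f"{name}: {earned} / {maximum} ({percent(earned, maximum)}%)")
    print("Total: %d / %d" % tally(CATEGORY_SCORES))  # Total: 93 / 100
```

Under that rounding assumption the eight categories sum to 93 of 100 points and reproduce the listed percentages (83%, 88%, 85%, 100%).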

Medical Task: Execution Average 87 / 100; Assertions 25/25 Passed

88 | Canonical | ✅ Pass
Baseline CRP as predictor of sepsis mortality: what to adjust for

5/5 assertions passed.

Basic 35/40 | Specialized 53/60 | Total 88/100
A1: Variable role map produced with explicit confounder/mediator/collider classifications
A2: Time order established before any adjustment recommendation
A3: Post-baseline variables excluded from baseline adjustment set with warning
A4: Minimum sufficient control set defined with reasoning, not just variable list
A5: Residual confounding acknowledged as non-removable in Section J
Pass rate: 5 / 5
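
The workflow these assertions describe (classify roles, fix time order, then choose what to adjust for) can be sketched roughly as follows. This is an illustration only, not the skill's implementation: the `Variable` class, the `baseline_adjustment_set` function, and every variable name and role assignment in the example are hypothetical and are not taken from the evaluated outputs.

```python
from dataclasses import dataclass

ROLES = {"confounder", "mediator", "collider", "predictor", "uncertain"}


@dataclass(frozen=True)
class Variable:
    name: str
    role: str           # analyst-assigned causal role (hypothetical labels)
    pre_baseline: bool  # True if measured before exposure / time zero


def baseline_adjustment_set(variables):
    """Return (adjust, warnings): adjust only for pre-baseline confounders."""
    adjust, warnings = [], []
    for v in variables:
        if v.role not in ROLES:
            raise ValueError(f"unknown role for {v.name}: {v.role}")
        if v.role == "confounder" and v.pre_baseline:
            adjust.append(v.name)
        elif v.role == "confounder":
            warnings.append(f"{v.name}: measured post-baseline; excluded from baseline set")
        elif v.role in {"mediator", "collider"}:
            warnings.append(f"{v.name}: {v.role}; do not adjust")
        elif v.role == "uncertain":
            warnings.append(f"{v.name}: role uncertain; resolve before adjusting")
        # Pure predictors of the outcome are neither adjusted for nor flagged here.
    return adjust, warnings


if __name__ == "__main__":
    # Hypothetical role map, for illustration only.
    role_map = [
        Variable("age", "confounder", True),
        Variable("comorbidity_index", "confounder", True),
        Variable("post_baseline_treatment_response", "mediator", False),
        Variable("selection_into_cohort", "collider", False),
        Variable("lab_marker_x", "uncertain", True),
    ]
    adjust, warnings = baseline_adjustment_set(role_map)
    print("Adjust for:", adjust)
    for w in warnings:
        print("Warning:", w)
```

The design choice here mirrors assertions A2 to A4: the adjustment set follows from roles and timing rather than from whatever variables happen to be available.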

87 | Variant A | ✅ Pass
Retrospective EHR cohort: identify bias and confounders before protocol finalization

5/5 assertions passed.

Basic 35/40 | Specialized 52/60 | Total 87/100
A1: Immortal time bias risk assessed for EHR time-zero structure
A2: Selection bias and confounding by indication both identified
A3: Control strategy recommendation justified for the specific design context
A4: Critical weak points section present with specific protocol revision recommendation
A5: No fabricated variable availability or dataset fields assumed
Pass rate: 5 / 5
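
Immortal time bias, named in A1, arises when follow-up before treatment start is counted as exposed time. A minimal sketch of the difference, assuming follow-up is recorded as day offsets; the function names and the example subject are hypothetical:

```python
def person_time(entry_day, treatment_day, exit_day):
    """Split follow-up into (unexposed_days, exposed_days).

    treatment_day is None for subjects who are never treated.
    Time before treatment start is counted as unexposed.
    """
    if treatment_day is None or treatment_day >= exit_day:
        return exit_day - entry_day, 0
    start = max(treatment_day, entry_day)
    return start - entry_day, exit_day - start


def naive_person_time(entry_day, treatment_day, exit_day):
    """Biased variant: ever-treated subjects count as exposed from cohort entry."""
    if treatment_day is None or treatment_day >= exit_day:
        return exit_day - entry_day, 0
    return 0, exit_day - entry_day


if __name__ == "__main__":
    # Hypothetical subject: enters at day 0, starts treatment at day 30, exits at day 100.
    print(person_time(0, 30, 100))        # (30, 70)
    print(naive_person_time(0, 30, 100))  # (0, 100)
```

In the naive version the first 30 days, during which the subject had to survive in order to be treated at all, are credited to the exposed group, which inflates its apparent survival.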

87 | Variant B | ✅ Pass
Case-control study of smoking and lupus: which variables to match on

5/5 assertions passed.

Basic 35/40 | Specialized 52/60 | Total 87/100
A1: Variables on causal pathway explicitly excluded from matching recommendation
A2: Overmatching risk and analytic consequences stated for each matching candidate
A3: Alternative to matching (restriction or multivariable adjustment) presented as comparison
A4: Collider bias risk checked for proposed matching variables
A5: No recommendation to 'adjust for everything available'
Pass rate: 5 / 5
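
The collider check in A4 matters because conditioning on a common effect can create an association where none exists. The simulation below is a generic illustration with synthetic standard-normal data, not an analysis from the report; `pearson`, `simulate`, the seed, and the threshold are all arbitrary choices made for this sketch.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=20000, seed=0, threshold=1.0):
    """Return (correlation in everyone, correlation after selecting on the collider)."""
    rng = random.Random(seed)
    exposure = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [rng.gauss(0, 1) for _ in range(n)]  # independent of exposure by construction
    collider = [x + y for x, y in zip(exposure, outcome)]  # common effect of both
    keep = [i for i, c in enumerate(collider) if c > threshold]
    r_all = pearson(exposure, outcome)
    r_selected = pearson([exposure[i] for i in keep], [outcome[i] for i in keep])
    return r_all, r_selected


if __name__ == "__main__":
    r_all, r_selected = simulate()
    print(f"full sample r = {r_all:.3f}")
    print(f"selected on collider r = {r_selected:.3f}")
```

In the full sample the correlation is near zero, while restricting to high collider values yields a clearly negative correlation. Matching or adjusting on such a variable would introduce the same distortion.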

87 | Edge | ✅ Pass
Mixed variable list including post-treatment response: role classification challenge

5/5 assertions passed.

Basic 35/40 | Specialized 52/60 | Total 87/100
A1: Post-treatment variables correctly classified as post-baseline and excluded from baseline adjustment
A2: Role-uncertain variables labeled rather than forced into a false classification
A3: Mediator vs confounder boundary explicitly addressed for ambiguous variables
A4: Prediction variables correctly distinguished from confounders
A5: Caution against statistical complexity masking bias explicitly stated
Pass rate: 5 / 5

86 | Stress | ✅ Pass
Pressure-test a complex observational protocol with propensity score plan already drafted

5/5 assertions passed.

Basic 35/40 | Specialized 51/60 | Total 86/100
A1: Propensity score plan reviewed with explicit justification check (design, data quality, covariate set)
A2: Critical review of proposed propensity model identifies whether covariate set is appropriate
A3: Residual bias after propensity score method acknowledged
A4: Bias mitigation actions provided per major risk identified
A5: Practical next step section provides most useful immediate action
Pass rate: 5 / 5
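
One common way to check whether a propensity score plan has balanced its covariate set is the standardized mean difference (SMD) between groups. The report does not say the skill computes this. The sketch below is a generic illustration, with hypothetical function names and a 0.1 threshold that is a widely used convention rather than anything stated in the source.

```python
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD: difference in means divided by the pooled standard deviation."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    if pooled_sd == 0:
        raise ValueError("both groups are constant; SMD is undefined")
    return (mean(treated) - mean(control)) / pooled_sd


def flag_imbalanced(covariates, threshold=0.1):
    """Return names of covariates whose absolute SMD exceeds the threshold.

    covariates maps a name to (treated_values, control_values).
    """
    return [
        name
        for name, (treated, control) in covariates.items()
        if abs(standardized_mean_difference(treated, control)) > threshold
    ]


if __name__ == "__main__":
    # Toy values, for illustration only.
    data = {
        "age": ([1, 2, 3], [2, 3, 4]),
        "sex": ([0, 1, 0, 1], [1, 0, 1, 0]),
    }
    print(flag_imbalanced(data))  # ['age']
```

Balance on measured covariates says nothing about unmeasured ones, which is why A3 still requires residual bias to be acknowledged.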

Medical Task Total: 87 / 100

Key Strengths

  • Variable role classification before any adjustment recommendation is a methodologically sound and rare discipline in AI-assisted protocol review
  • Explicit collider bias detection and prohibition of mediator adjustment are strong differentiators from generic statistics advice
  • Hard Rules effectively prevent the most common critical errors: adjust-for-everything, post-baseline variables in baseline set, propensity score by default
  • Eight reference modules with precise step-level mapping provide comprehensive bias-sensing coverage