Protocol Design

confounder-and-bias-control-planner

Plans confounder control, variable adjustment logic, and bias mitigation strategies at the protocol stage for clinical, epidemiologic, translational, observational, and biomarker studies. Always use this skill when a user needs to identify major confounders, decide which variable

89 / 100 Total Score

Core Capability

93 / 100

Functional Suitability

12 / 12

Reliability

10 / 12

Performance & Context

7 / 8

Agent Usability

16 / 16

Human Usability

7 / 8

Security

12 / 12

Maintainability

12 / 12

Agent-Specific

17 / 20

Medical Task

25 / 25 Passed

88 / 100 | Baseline CRP as predictor of sepsis mortality: what to adjust for

5/5

87 / 100 | Retrospective EHR cohort: identify bias and confounders before protocol finalization

5/5

87 / 100 | Case-control study of smoking and lupus: which variables to match on

5/5

87 / 100 | Mixed variable list including post-treatment response: role classification challenge

5/5

86 / 100 | Pressure-test a complex observational protocol with propensity score plan already drafted

5/5

Veto Gates: Required pass for any deployment consideration

Skill Veto ✓ All 4 gates passed

Gate	Result	Detail
Operational Stability	PASS	System remains stable across varied inputs and edge cases
Structural Consistency	PASS	Output structure conforms to expected skill contract format
Result Determinism	PASS	Equivalent inputs produce semantically equivalent outputs
System Security	PASS	No prompt injection, data leakage, or unsafe tool use detected

Research Veto ✅ PASS (Applicable)

Dimension	Result	Detail
Scientific Integrity	PASS	No fabricated references, DOIs, PMIDs, statistical values, or clinical data detected.
Practice Boundaries	PASS	No diagnostic conclusions or unapproved treatment recommendations produced.
Methodological Ground	PASS	Collider bias, mediator misclassification, and immortal time bias are all explicitly handled.
Code Usability	N/A	No code generated; Category 2 (design planning only).

Core Capability: 93 / 100 across 8 Categories

Functional Suitability

Full marks (12/12); no significant issues detected.

12 / 12

100%

Reliability

Handles uncertain variable roles well. Gap: no fallback when the user provides no variable list at all.

10 / 12

83%

Performance & Context

Strong score (7/8); minor gaps noted.

7 / 8

88%

Agent Usability

Full marks (16/16); no significant issues detected.

16 / 16

100%

Human Usability

Strong score (7/8); minor gaps noted.

7 / 8

88%

Security

Full marks (12/12); no significant issues detected.

12 / 12

100%

Maintainability

Full marks (12/12); no significant issues detected.

12 / 12

100%

Agent-Specific

Strong critical posture enforced. The description is well targeted but could be shortened and given natural trigger phrases.

17 / 20

85%

Core Capability Total: 93 / 100
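The total can be checked against the eight category scores listed above. The snippet below is only an arithmetic check on figures copied from this report; the variable names are illustrative and are not part of the evaluation tooling.

```python
# Category scores as (earned, possible), copied from the report above.
core_scores = {
    "Functional Suitability": (12, 12),
    "Reliability": (10, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (16, 16),
    "Human Usability": (7, 8),
    "Security": (12, 12),
    "Maintainability": (12, 12),
    "Agent-Specific": (17, 20),
}

earned = sum(e for e, _ in core_scores.values())
possible = sum(p for _, p in core_scores.values())

# Per-category percentage, rounded to the nearest whole number.
percentages = {name: round(100 * e / p) for name, (e, p) in core_scores.items()}

print(f"Core Capability Total: {earned} / {possible}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The sums reproduce the reported 93 / 100, and the rounded percentages match the per-category figures (83%, 88%, 85%, and 100%).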

Medical Task: Execution Average 87 / 100, Assertions 25/25 Passed

Canonical

Baseline CRP as predictor of sepsis mortality — what to adjust for

5/5 ✓

Variant A

Retrospective EHR cohort — identify bias and confounders before protocol finalization

5/5 ✓

Variant B

Case-control study of smoking and lupus — which variables to match on

5/5 ✓

Edge

Mixed variable list including post-treatment response — role classification challenge

5/5 ✓

Stress

Pressure-test a complex observational protocol with propensity score plan already drafted

5/5 ✓

Canonical ✅ Pass

Baseline CRP as predictor of sepsis mortality — what to adjust for

5/5 assertions passed.

Basic 35/40 | Specialized 53/60 | Total 88/100

✅ A1: Variable role map produced with explicit confounder/mediator/collider classifications

✅ A2: Time order established before any adjustment recommendation

✅ A3: Post-baseline variables excluded from baseline adjustment set with warning

✅ A4: Minimum sufficient control set defined with reasoning, not just variable list

✅ A5: Residual confounding acknowledged as non-removable in Section J

Pass rate: 5 / 5
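The workflow checked by A1 to A3 (classify roles, establish time order, then exclude post-baseline variables with a warning) can be sketched in a few lines. This is a minimal illustration and not output of the skill, which generates no code. The `Variable` class, the role labels, and the example variable names are all hypothetical, and the roles assigned here are placeholders rather than clinical claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str
    role: str         # "confounder", "mediator", "collider", or "uncertain"
    measured_at: str  # "pre_baseline" or "post_baseline"


def baseline_adjustment_set(variables):
    """Return (adjust_for, warnings) for a list of classified variables.

    Only pre-baseline confounders are adjusted for; everything else is
    reported with a reason instead of being silently dropped.
    """
    adjust_for, warnings = [], []
    for v in variables:
        if v.measured_at == "post_baseline":
            warnings.append(f"{v.name}: post-baseline, excluded from baseline adjustment")
        elif v.role == "confounder":
            adjust_for.append(v.name)
        else:
            warnings.append(f"{v.name}: role '{v.role}', not adjusted for by default")
    return adjust_for, warnings


# Hypothetical candidate list for an exposure-outcome question.
candidates = [
    Variable("age", "confounder", "pre_baseline"),
    Variable("comorbidity_index", "confounder", "pre_baseline"),
    Variable("organ_support_day3", "mediator", "post_baseline"),
    Variable("prior_admission", "uncertain", "pre_baseline"),
]

adjust_for, warnings = baseline_adjustment_set(candidates)
print("Adjust for:", adjust_for)
for w in warnings:
    print("Warning:", w)
```

Keeping the warnings alongside the adjustment set mirrors A4: the control set is reported with its reasoning, not as a bare variable list.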

Variant A ✅ Pass

Retrospective EHR cohort — identify bias and confounders before protocol finalization

5/5 assertions passed.

Basic 35/40 | Specialized 52/60 | Total 87/100

✅ A1: Immortal time bias risk assessed for EHR time-zero structure

✅ A2: Selection bias and confounding by indication both identified

✅ A3: Control strategy recommendation justified for the specific design context

✅ A4: Critical weak points section present with specific protocol revision recommendation

✅ A5: No fabricated variable availability or dataset fields assumed

Pass rate: 5 / 5
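The immortal time bias named in A1 arises when follow-up between cohort entry and treatment start is counted as exposed, even though the patient had to survive that interval to be classified as treated. A minimal sketch of the time-zero bookkeeping follows; the function name, its parameters, and the day counts are hypothetical and do not come from the evaluated protocol.

```python
def split_person_time(entry_day, end_day, treatment_start_day=None):
    """Split one patient's follow-up into (unexposed_days, exposed_days).

    Time before treatment starts is counted as unexposed rather than
    being credited to the treated group.
    """
    if end_day < entry_day:
        raise ValueError("end_day must not precede entry_day")
    # Never treated, or treated only after follow-up ended: all unexposed.
    if treatment_start_day is None or treatment_start_day >= end_day:
        return end_day - entry_day, 0
    start = max(treatment_start_day, entry_day)
    return start - entry_day, end_day - start


# Hypothetical patient: enters on day 0, starts treatment on day 30,
# follow-up ends on day 100.
unexposed, exposed = split_person_time(0, 100, treatment_start_day=30)
print(unexposed, exposed)
```

A naive analysis would assign all 100 days to the treated group. The split shows that 30 of those days are immortal time that belongs to unexposed follow-up.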

Variant B ✅ Pass

Case-control study of smoking and lupus — which variables to match on

5/5 assertions passed.

Basic 35/40 | Specialized 52/60 | Total 87/100

✅ A1: Variables on causal pathway explicitly excluded from matching recommendation

✅ A2: Overmatching risk and analytic consequences stated for each matching candidate

✅ A3: Alternative to matching (restriction or multivariable adjustment) presented as comparison

✅ A4: Collider bias risk checked for proposed matching variables

✅ A5: No recommendation to 'adjust for everything available'

Pass rate: 5 / 5
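The checks in A1 and A4 amount to asking where each matching candidate sits in an assumed causal diagram. The sketch below screens candidates by reachability in a small directed acyclic graph. The graph, its node names, and the `classify` helper are illustrative assumptions only; the edges are placeholders and make no claim about the actual relationship between smoking and lupus.

```python
def descendants(graph, start, blocked=frozenset()):
    """Return all nodes reachable from `start` without passing through `blocked`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in graph.get(node, ()):
            if child not in seen and child not in blocked:
                seen.add(child)
                stack.append(child)
    return seen


def classify(graph, exposure, outcome, var):
    """Crude structural screen for one matching candidate."""
    if var in descendants(graph, exposure):
        if outcome in descendants(graph, var):
            return "on causal pathway"   # matching here risks overmatching
        if var in descendants(graph, outcome):
            return "common effect"       # matching here risks collider bias
        return "effect of exposure"
    causes_exposure = exposure in descendants(graph, var)
    # Require a path to the outcome that does not run through the exposure.
    causes_outcome = outcome in descendants(graph, var, blocked={exposure})
    if causes_exposure and causes_outcome:
        return "common cause"            # candidate confounder
    return "unclassified"


# Hypothetical DAG: parent -> list of children.
dag = {
    "smoking": ["intermediate_marker", "hospital_referral"],
    "intermediate_marker": ["lupus"],
    "lupus": ["hospital_referral"],
    "age": ["smoking", "lupus"],
}

for candidate in ("age", "intermediate_marker", "hospital_referral"):
    print(candidate, "->", classify(dag, "smoking", "lupus", candidate))
```

A screen like this is only as good as the diagram it is given, so any variable that comes back "unclassified" should be labeled uncertain rather than forced into a role.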

Edge ✅ Pass

Mixed variable list including post-treatment response — role classification challenge

5/5 assertions passed.

Basic 35/40 | Specialized 52/60 | Total 87/100

✅ A1: Post-treatment variables correctly classified as post-baseline and excluded from baseline adjustment

✅ A2: Role-uncertain variables labeled rather than forced into a false classification

✅ A3: Mediator vs confounder boundary explicitly addressed for ambiguous variables

✅ A4: Prediction variables correctly distinguished from confounders

✅ A5: Caution against statistical complexity masking bias explicitly stated

Pass rate: 5 / 5

Stress ✅ Pass

Pressure-test a complex observational protocol with propensity score plan already drafted

5/5 assertions passed.

Basic 35/40 | Specialized 51/60 | Total 86/100

✅ A1: Propensity score plan reviewed with explicit justification check (design, data quality, covariate set)

✅ A2: Critical review of proposed propensity model identifies whether covariate set is appropriate

✅ A3: Residual bias after propensity score method acknowledged

✅ A4: Bias mitigation actions provided per major risk identified

✅ A5: Practical next step section provides most useful immediate action


Pass rate: 5 / 5
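One common way to make the covariate-set review in A1 and A2 concrete is a balance diagnostic such as the standardized mean difference (SMD) between treated and control groups. The sketch below is not part of the skill, which produces no code. The function name and the sample values are hypothetical, and the 0.1 threshold mentioned in the comment is a widely used convention, not a rule taken from this report.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD for one continuous covariate, using the pooled sample SD."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical covariate values (for example, age in years) per group.
treated_age = [60, 65, 70]
control_age = [50, 55, 60]

smd = standardized_mean_difference(treated_age, control_age)
# An absolute SMD above about 0.1 is commonly read as meaningful imbalance.
print(f"SMD = {smd:.2f}")
```

Good balance on measured covariates says nothing about unmeasured ones, which is why A3 still requires residual bias to be acknowledged after any propensity score method.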

Medical Task Total: 87 / 100
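The reported figure is consistent with a simple mean of the five scenario totals, each of which is the sum of its Basic and Specialized scores. The snippet below checks that arithmetic using numbers copied from this report; that the evaluator computes the total as an unweighted mean is an assumption, and the variable names are illustrative.

```python
from statistics import mean

# (basic, specialized) scores per scenario, copied from the report above.
task_scores = {
    "Canonical": (35, 53),
    "Variant A": (35, 52),
    "Variant B": (35, 52),
    "Edge": (35, 52),
    "Stress": (35, 51),
}

totals = {name: basic + specialized for name, (basic, specialized) in task_scores.items()}
execution_average = mean(list(totals.values()))

print(totals)
print(f"Execution average: {execution_average} / 100")
```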

Key Strengths

Variable role classification before any adjustment recommendation is a methodologically sound and rare discipline in AI-assisted protocol review
Explicit collider bias detection and prohibition of mediator adjustment are strong differentiators from generic statistics advice
Hard Rules effectively prevent the most common critical errors: adjust-for-everything, post-baseline variables in baseline set, propensity score by default
Eight reference modules with precise step-level mapping provide comprehensive bias-sensing coverage