Protocol Design
real-world-evidence-study-designer
Designs a structured real-world evidence study using EHR, claims, or registry data, with explicit handling of time zero, eligibility windows, exposure definitions, outcome windows, censoring, confounding control, and target-trial-emulation logic. Use this skill when the user need
Total Score: 88 / 100
Core Capability: 90 / 100
- Functional Suitability: 12 / 12
- Reliability: 10 / 12
- Performance & Context: 7 / 8
- Agent Usability: 15 / 16
- Human Usability: 7 / 8
- Security: 12 / 12
- Maintainability: 11 / 12
- Agent-Specific: 16 / 20
Medical Task: 25 / 25 Passed
- Canonical input for real-world-evidence-study-designer: 90 / 100 (5/5 assertions)
- Variant A input for real-world-evidence-study-designer: 90 / 100 (5/5 assertions)
- Variant B input for real-world-evidence-study-designer: 87 / 100 (5/5 assertions)
- Edge input for real-world-evidence-study-designer: 85 / 100 (5/5 assertions)
- Stress input for real-world-evidence-study-designer: 85 / 100 (5/5 assertions)
Veto Gates: required pass for any deployment consideration
Skill Veto: ✓ All 4 gates passed
- Operational Stability: PASS. System remains stable across varied inputs and edge cases.
- Structural Consistency: PASS. Output structure conforms to expected skill contract format.
- Result Determinism: PASS. Equivalent inputs produce semantically equivalent outputs.
- System Security: PASS. No prompt injection, data leakage, or unsafe tool use detected.
Research Veto: ✅ PASS (Applicable)
| Dimension | Result | Detail |
|---|---|---|
| Scientific Integrity | PASS | No fabricated references, DOIs, PMIDs, statistical values, or clinical data detected. |
| Practice Boundaries | PASS | No diagnostic conclusions or unapproved treatment recommendations produced. |
| Methodological Ground | PASS | No methodological fallacies detected; ethical compliance requirements noted where applicable. |
| Code Usability | N/A | No code generated; Mode A skill |
Core Capability: 90 / 100 (8 categories)
| Category | Assessment | Score | Percent |
|---|---|---|---|
| Functional Suitability | Full marks (12/12); no significant issues detected. | 12 / 12 | 100% |
| Reliability | Target-trial-emulation logic is a strong methodological differentiator; compact and well-structured. | 10 / 12 | 83% |
| Performance & Context | Strong score (7/8); minor gaps noted. | 7 / 8 | 88% |
| Agent Usability | Strong score (15/16); minor gaps noted. | 15 / 16 | 94% |
| Human Usability | Strong score (7/8); minor gaps noted. | 7 / 8 | 88% |
| Security | Full marks (12/12); no significant issues detected. | 12 / 12 | 100% |
| Maintainability | Strong score (11/12); minor gaps noted. | 11 / 12 | 92% |
| Agent-Specific | RWE-specific design concerns (time zero, censoring, confounding by indication) all explicitly addressed. | 16 / 20 | 80% |
Core Capability Total: 90 / 100
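The category scores above can be cross-checked with a few lines of arithmetic. The sketch below assumes the Core Capability total is a plain sum of the eight category scores and that the percentages are rounded half up; the report does not state either rule, so treat both as assumptions.

```python
import math

# (category, points awarded, points available), copied from the scores above
categories = [
    ("Functional Suitability", 12, 12),
    ("Reliability", 10, 12),
    ("Performance & Context", 7, 8),
    ("Agent Usability", 15, 16),
    ("Human Usability", 7, 8),
    ("Security", 12, 12),
    ("Maintainability", 11, 12),
    ("Agent-Specific", 16, 20),
]

def percent_half_up(awarded: int, available: int) -> int:
    """Whole-number percentage, rounding halves up (assumed rounding rule)."""
    return math.floor(100 * awarded / available + 0.5)

# Assumption: the category total is an unweighted sum.
total_awarded = sum(awarded for _, awarded, _ in categories)
total_available = sum(available for _, _, available in categories)
percents = {name: percent_half_up(a, m) for name, a, m in categories}

print(f"Core Capability Total: {total_awarded} / {total_available}")
for name, pct in percents.items():
    print(f"{name}: {pct}%")
```

Under these assumptions the sum and the per-category percentages match the reported figures.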
Medical Task: Execution Average 87.4 / 100; Assertions 25/25 Passed
Canonical: 90 / 100 ✅ Pass
Canonical input for real-world-evidence-study-designer
5/5 assertions passed.
Basic 36/40 | Specialized 54/60 | Total 90/100
- ✅ A1: Core assertion 1 for canonical input
- ✅ A2: Core assertion 2 for canonical input
- ✅ A3: Core assertion 3 for canonical input
- ✅ A4: Core assertion 4 for canonical input
- ✅ A5: Core assertion 5 for canonical input
Pass rate: 5 / 5
Variant A: 90 / 100 ✅ Pass
Variant A input for real-world-evidence-study-designer
5/5 assertions passed.
Basic 36/40 | Specialized 54/60 | Total 90/100
- ✅ A1: Core assertion 1 for Variant A input
- ✅ A2: Core assertion 2 for Variant A input
- ✅ A3: Core assertion 3 for Variant A input
- ✅ A4: Core assertion 4 for Variant A input
- ✅ A5: Core assertion 5 for Variant A input
Pass rate: 5 / 5
Variant B: 87 / 100 ✅ Pass
Variant B input for real-world-evidence-study-designer
5/5 assertions passed.
Basic 35/40 | Specialized 52/60 | Total 87/100
- ✅ A1: Core assertion 1 for Variant B input
- ✅ A2: Core assertion 2 for Variant B input
- ✅ A3: Core assertion 3 for Variant B input
- ✅ A4: Core assertion 4 for Variant B input
- ✅ A5: Core assertion 5 for Variant B input
Pass rate: 5 / 5
Edge: 85 / 100 ✅ Pass
Edge input for real-world-evidence-study-designer
5/5 assertions passed.
Basic 34/40 | Specialized 51/60 | Total 85/100
- ✅ A1: Core assertion 1 for edge input
- ✅ A2: Core assertion 2 for edge input
- ✅ A3: Core assertion 3 for edge input
- ✅ A4: Core assertion 4 for edge input
- ✅ A5: Core assertion 5 for edge input
Pass rate: 5 / 5
Stress: 85 / 100 ✅ Pass
Stress input for real-world-evidence-study-designer
5/5 assertions passed.
Basic 34/40 | Specialized 51/60 | Total 85/100
- ✅ A1: Core assertion 1 for stress input
- ✅ A2: Core assertion 2 for stress input
- ✅ A3: Core assertion 3 for stress input
- ✅ A4: Core assertion 4 for stress input
- ✅ A5: Core assertion 5 for stress input
Pass rate: 5 / 5
Medical Task Total: 87.4 / 100
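The 87.4 figure is consistent with an unweighted mean of the five per-input totals. The sketch below assumes that aggregation rule, which the report does not state explicitly, and also tallies the assertion counts.

```python
from statistics import mean

# Per-input totals (out of 100) as reported in the Medical Task results.
totals = {
    "Canonical": 90,
    "Variant A": 90,
    "Variant B": 87,
    "Edge": 85,
    "Stress": 85,
}
# Each input passed 5 of its 5 assertions.
assertions = {name: (5, 5) for name in totals}

# Assumption: the execution average is an unweighted mean of the totals.
execution_average = round(mean(totals.values()), 1)
passed = sum(p for p, _ in assertions.values())
checked = sum(n for _, n in assertions.values())

print(f"Execution average: {execution_average} / 100")
print(f"Assertions: {passed}/{checked} passed")
```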
Key Strengths
- Target-trial-emulation framework application distinguishes this from generic observational study design
- Explicit prohibition on inventing database fields, follow-up completeness, or causal identifiability
- Confounding by indication explicitly required as a mandatory bias assessment item
- Time-zero definition with explicit censoring rules is rigorously enforced
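To make the last strength concrete, here is a minimal sketch of what an explicit time-zero and censoring rule can look like in code. Everything in it is hypothetical and for illustration only: the `Patient` fields, the `follow_up` helper, the 365-day outcome window, and the choice to treat death as a censoring event are assumptions, not taken from the skill or from any real database.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple

@dataclass
class Patient:
    # Hypothetical fields; a real protocol would map these to actual data elements.
    eligibility_met: date            # first date all eligibility criteria hold
    treatment_start: date            # date the exposure begins
    outcome: Optional[date] = None
    disenrollment: Optional[date] = None
    death: Optional[date] = None

def follow_up(p: Patient, study_end: date, window_days: int = 365) -> Tuple[date, date, str]:
    """Return (time_zero, end_of_follow_up, reason) under the assumed rules."""
    # Assumed rule: time zero is treatment start, and eligibility must already hold then.
    if p.treatment_start < p.eligibility_met:
        raise ValueError("treatment starts before eligibility is met")
    time_zero = p.treatment_start

    # Candidate end points; the outcome is listed first so it wins same-day ties.
    candidates = [
        (p.outcome, "outcome"),
        (p.disenrollment, "disenrollment"),
        (p.death, "death"),  # simplification: treated as censoring here
        (time_zero + timedelta(days=window_days), "end_of_outcome_window"),
        (study_end, "administrative_censoring"),
    ]
    # Keep only dated events on or after time zero, then take the earliest.
    valid = [(d, reason) for d, reason in candidates if d is not None and d >= time_zero]
    end, reason = min(valid, key=lambda c: c[0])
    return time_zero, end, reason

# Usage: the patient disenrolls before the outcome is observed, so follow-up is censored.
example = Patient(
    eligibility_met=date(2020, 1, 1),
    treatment_start=date(2020, 1, 10),
    outcome=date(2020, 6, 1),
    disenrollment=date(2020, 3, 1),
)
result = follow_up(example, study_end=date(2021, 12, 31))
print(result)
```

Writing the rule as an ordered list of end points keeps every censoring reason explicit and makes the tie-breaking choice visible rather than implicit.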