Protocol Design

clinic-research-design

Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs.
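The page states that the skill calculates sample size but does not show how. As a rough illustration only, the sketch below applies the standard normal-approximation formula for comparing two independent proportions, a common choice for efficacy studies. The function name, parameters, and defaults are assumptions made for this example and are not taken from the package's sample_size.py.

```python
import math
from statistics import NormalDist


def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test of two proportions.

    Hypothetical helper for illustration; not the packaged calculator.
    Uses n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1-p2)^2.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define an effect size")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)  # round up to whole participants


if __name__ == "__main__":
    print(two_proportion_sample_size(0.60, 0.40))
```

Diagnostic, etiology, and prognosis designs generally need different formulas, so this covers only one of the four study types named above.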

91 / 100 Total Score
Core Capability
83 / 100
Functional Suitability
11 / 12
Reliability
10 / 12
Performance & Context
8 / 8
Agent Usability
13 / 16
Human Usability
7 / 8
Security
9 / 12
Maintainability
9 / 12
Agent-Specific
16 / 20
Medical Task
20 / 20 Passed
100 | Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs. | 4/4
98 | Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs. | 4/4
96 | Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs. | 4/4
95 | Packaged executable path(s): scripts/calculators/sample_size.py plus 4 additional script(s) | 4/4
95 | End-to-end case for Scope-focused workflow aligned to: Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs. | 4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | Scientific integrity held because the archived workflow stayed at the level of study planning, hypothesis framing, and experiment design rather than claiming completed results.
Practice Boundaries | PASS | The package remained on the planning side of the boundary and did not cross into clinical or diagnostic advice.
Methodological Ground | PASS | Methodological grounding was preserved through the documented inputs, transformations, and expected artifacts.
Code Usability | N/A | This package is packaging-first and output-first, not code-first, so code usability is treated as not applicable.
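The gate logic above can be sketched as a simple all-must-pass check. This is a minimal illustration under stated assumptions: the dictionary names and the `veto_passed` function are hypothetical, and treating N/A as "excluded from the check" is inferred from the Research Veto passing despite one N/A row, not documented by the report.

```python
# Gate results transcribed from the report above.
SKILL_GATES = {
    "Operational Stability": "PASS",
    "Structural Consistency": "PASS",
    "Result Determinism": "PASS",
    "System Security": "PASS",
}

RESEARCH_GATES = {
    "Scientific Integrity": "PASS",
    "Practice Boundaries": "PASS",
    "Methodological Ground": "PASS",
    "Code Usability": "N/A",
}


def veto_passed(gates):
    """Return True when every applicable gate is PASS (hypothetical rule).

    Assumption: gates marked N/A are skipped rather than counted as failures.
    """
    applicable = [result for result in gates.values() if result != "N/A"]
    return bool(applicable) and all(result == "PASS" for result in applicable)


if __name__ == "__main__":
    print("Skill Veto:", veto_passed(SKILL_GATES))
    print("Research Veto:", veto_passed(RESEARCH_GATES))
```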

Core Capability: 83 / 100 (8 Categories)

Functional Suitability
The archived review left some room to tighten how the skill description ("Generates a structured prompt framework for clinical study protocols. Supports Diagnostic,...") maps onto a finished protocol-style deliverable.
11 / 12
92%
Reliability
The package stayed structured, but the archived score suggests more consistency would help under sparse or stress-case inputs.
10 / 12
83%
Performance & Context
No point loss was recorded for performance context in the legacy audit.
8 / 8
100%
Agent Usability
The planning path is understandable, but the archived score suggests a little more trigger clarity would help agents route into it faster.
13 / 16
81%
Human Usability
Human usability was softened by the legacy issue 'Minor polish before wide rollout'. No major defects were found.
7 / 8
88%
Security
Security scored well, though the archived review still left some room to make boundary language even more explicit.
9 / 12
75%
Maintainability
The package remains maintainable, though the archived review saw modest room to simplify or stabilize its planning logic.
9 / 12
75%
Agent-Specific
Agent-specific quality remained high, with a small gap around determinism or edge-case prompting behavior.
16 / 20
80%
Core Capability Total: 83 / 100
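The category scores above can be cross-checked against the reported total and percentages with a few lines of arithmetic. The variable and function names below are illustrative, and half-up rounding is an assumption that happens to reproduce the percentages shown in the report.

```python
# (score, maximum) per category, transcribed from the report above.
CORE_SCORES = {
    "Functional Suitability": (11, 12),
    "Reliability": (10, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (13, 16),
    "Human Usability": (7, 8),
    "Security": (9, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (16, 20),
}


def percent(score, maximum):
    """Whole-number percentage, rounding halves up (assumed rounding rule)."""
    return int(100 * score / maximum + 0.5)


total = sum(score for score, _ in CORE_SCORES.values())
max_total = sum(maximum for _, maximum in CORE_SCORES.values())

if __name__ == "__main__":
    for name, (score, maximum) in CORE_SCORES.items():
        print(f"{name}: {score} / {maximum} ({percent(score, maximum)}%)")
    print(f"Core Capability Total: {total} / {max_total}")
```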

Medical Task: Execution Average 96.8 / 100; Assertions 20/20 Passed

100
Canonical: ✅ Pass
Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs.

The archived run for "Generates a structured prompt framework for clinical study..." confirmed the helper entrypoint and left the workflow in a stable state.

Basic 38/40 | Specialized 60/60 | Total 100/100
A1: The clinic-research-design output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
98
Variant A: ✅ Pass
Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs.

The archived run for "Generates a structured prompt framework for clinical study..." confirmed the helper entrypoint and left the workflow in a stable state.

Basic 36/40 | Specialized 60/60 | Total 98/100
A1: The clinic-research-design output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
96
Edge: ✅ Pass
Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs.

The archived run for "Generates a structured prompt framework for clinical study..." confirmed the helper entrypoint and left the workflow in a stable state.

Basic 35/40 | Specialized 60/60 | Total 96/100
A1: The clinic-research-design output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
95
Variant B: ✅ Pass
Packaged executable path(s): scripts/calculators/sample_size.py plus 4 additional script(s)

The archived run for "Packaged executable path(s): scripts/calculators/sample_size.py..." confirmed the helper entrypoint and left the workflow in a stable state.

Basic 34/40 | Specialized 60/60 | Total 95/100
A1: The clinic-research-design output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
95
Stress: ✅ Pass
End-to-end case for Scope-focused workflow aligned to: Generates a structured prompt framework for clinical study protocols. Supports Diagnostic, Efficacy, Etiology, and Prognosis studies. Calculates sample size and provides logic guides for LLMs.

The "Generates a structured prompt framework for clinical study protocols. Supports Diagnostic,..." path verified the packaged helper command without exposing a deeper execution issue.

Basic 31/40 | Specialized 60/60 | Total 95/100
A1: The clinic-research-design output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 96.8 / 100
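The 96.8 figure is consistent with a plain mean of the five case totals, and the 20/20 assertion count with four assertions per case. The short check below illustrates that arithmetic; the names are illustrative, and the unweighted mean is an assumption about how the report aggregates, not a documented rule.

```python
# Case totals transcribed from the five runs above.
CASE_SCORES = {
    "Canonical": 100,
    "Variant A": 98,
    "Edge": 96,
    "Variant B": 95,
    "Stress": 95,
}
ASSERTIONS_PER_CASE = 4  # A1 through A4, all passed in every case

# Assumption: the execution average is an unweighted mean of case totals.
execution_average = sum(CASE_SCORES.values()) / len(CASE_SCORES)
assertions_passed = ASSERTIONS_PER_CASE * len(CASE_SCORES)

if __name__ == "__main__":
    print(f"Execution average: {execution_average:.1f} / 100")
    print(f"Assertions passed: {assertions_passed}/{assertions_passed}")
```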

Key Strengths

  • Primary routing is Protocol Design with execution mode B
  • Static quality score is 83/100 and dynamic average is 84.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 4/4; adjustment=5. main.py: OK; protocol_writer.py: OK; study_classifier.py: OK; validate_skill.py: OK