Protocol Design

meta-protocol-writer

Generates a PROSPERO-compliant Meta-analysis protocol based on Title and PICOS. Use when the user wants to write a protocol for a systematic review or meta-analysis.

87 / 100 Total Score
Core Capability
78 / 100
Functional Suitability
9 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
12 / 16
Human Usability
7 / 8
Security
9 / 12
Maintainability
9 / 12
Agent-Specific
15 / 20
Medical Task
20 / 20 Passed
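Score | Case | Task | Assertions
97 | Canonical | Generates a PROSPERO-compliant Meta-analysis protocol based on Title and PICOS | 4/4
93 | Variant A | Gather Inputs | 4/4
91 | Edge | Validate Title | 4/4
91 | Variant B | Generate Protocol Sections | 4/4
91 | Stress | Generate Protocol Sections | 4/4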

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | Scientific integrity held because the archived workflow stayed at the level of study planning, hypothesis framing, and experiment design rather than claiming completed results.
Practice Boundaries | PASS | Practice boundaries were preserved because the outputs stayed within research-design support rather than executed-study claims.
Methodological Ground | PASS | Methodological grounding was preserved through the documented inputs, transformations, and expected artifacts.
Code Usability | N/A | The package is evaluated primarily as a structured deliverable rather than an executable scientific code workflow.

Core Capability: 78 / 100 (8 Categories)

Functional Suitability
The archived deduction in Functional Suitability traces back to one recommendation, to improve stress-case output rigor: stress and boundary scenarios show weaker consistency.
9 / 12
75%
Reliability
The archived deduction in Reliability traces back to one recommendation, to improve stress-case output rigor: stress and boundary scenarios show weaker consistency.
9 / 12
75%
Performance & Context
No point loss was recorded for Performance & Context in the legacy audit.
8 / 8
100%
Agent Usability
The archived review left some headroom in how directly the workflow guides an agent through the planning sequence.
12 / 16
75%
Human Usability
The package is readable, but the archived review still saw a little room to simplify how users inspect the final plan.
7 / 8
88%
Security
Security scored well, though the archived review still left some room to make boundary language even more explicit.
9 / 12
75%
Maintainability
Maintainability held up, but a little more consolidation or clearer packaging would likely close the remaining gap.
9 / 12
75%
Agent-Specific
The archived deduction in Agent-Specific traces back to one recommendation, to improve stress-case output rigor: stress and boundary scenarios show weaker consistency.
15 / 20
75%
Core Capability Total: 78 / 100
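The category figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary and helper names are hypothetical, and it assumes the displayed percentages are rounded half up to a whole number, which the report does not state.

```python
# Category scores as (earned, maximum), copied from the report above.
core_capability = {
    "Functional Suitability": (9, 12),
    "Reliability": (9, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (12, 16),
    "Human Usability": (7, 8),
    "Security": (9, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (15, 20),
}


def percent(earned: int, maximum: int) -> int:
    """Whole-number percentage, rounded half up (an assumed rounding rule)."""
    return int(earned * 100 / maximum + 0.5)


# Sum the earned points and the available points across all categories.
total_earned = sum(earned for earned, _ in core_capability.values())
total_max = sum(maximum for _, maximum in core_capability.values())

for name, (earned, maximum) in core_capability.items():
    print(f"{name}: {earned} / {maximum} ({percent(earned, maximum)}%)")
print(f"Core Capability Total: {total_earned} / {total_max}")
```

Under that rounding assumption, the eight categories sum to the reported 78 out of 100 points, and 7 / 8 displays as 88%.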

Medical Task: Execution Average 92.6 / 100, Assertions 20/20 Passed

Canonical: ✅ Pass, 97 / 100
Generates a PROSPERO-compliant Meta-analysis protocol based on Title and PICOS

This canonical case remained a study-design support path, not a code-driven execution run.

Basic 35/40 | Specialized 60/60 | Total 97/100
A1: The meta-protocol-writer output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Variant A: ✅ Pass, 93 / 100
Gather Inputs

This Variant A case remained a study-design support path, not a code-driven execution run.

Basic 33/40 | Specialized 60/60 | Total 93/100
A1: The meta-protocol-writer output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Edge: ✅ Pass, 91 / 100
Validate Title

This edge case remained a study-design support path, not a code-driven execution run.

Basic 32/40 | Specialized 59/60 | Total 91/100
A1: The meta-protocol-writer output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Variant B: ✅ Pass, 91 / 100
Generate Protocol Sections

This Variant B case remained a study-design support path, not a code-driven execution run.

Basic 31/40 | Specialized 60/60 | Total 91/100
A1: The meta-protocol-writer output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Stress: ✅ Pass, 91 / 100
Generate Protocol Sections

This stress case remained a study-design support path, not a code-driven execution run.

Basic 28/40 | Specialized 60/60 | Total 91/100
A1: The meta-protocol-writer output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 92.6 / 100
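The Medical Task total appears to be the unweighted mean of the five per-case scores, with assertions summed across cases. The following is a minimal sketch under that assumption; the report does not document how the average is computed, and the variable names are hypothetical.

```python
# Per-case totals and assertion results (passed, checked), copied from the report above.
case_scores = {
    "Canonical": 97,
    "Variant A": 93,
    "Edge": 91,
    "Variant B": 91,
    "Stress": 91,
}
case_assertions = {name: (4, 4) for name in case_scores}

# Assumption: every case carries equal weight in the execution average.
execution_average = sum(case_scores.values()) / len(case_scores)

# Sum passed and checked assertions across all cases.
assertions_passed = sum(passed for passed, _ in case_assertions.values())
assertions_total = sum(checked for _, checked in case_assertions.values())

print(f"Execution Average: {execution_average:.1f} / 100")
print(f"Assertions: {assertions_passed}/{assertions_total} Passed")
```

With equal weights, the five scores reproduce the reported 92.6 average and the 20/20 assertion count.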

Key Strengths

  • Primary routing is Protocol Design with execution mode B
  • Static quality score is 78/100 and dynamic average is 79.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 1/1; adjustment=5. utils.py: OK