plan-generator
Veto Gates: Required pass for any deployment consideration
Core Capability: 83 / 100 (8 Categories)
Medical Task: Execution Average 94.6 / 100, Assertions 20/20 Passed
For the scenario "You need a final exam review plan across a specific start/end date range", the preserved evidence is lightweight but positive: the packaged validation command behaved as expected.
For the scenario "You need a lab experiment schedule that allocates tasks by duration...", the preserved evidence is lightweight but positive: the packaged validation command behaved as expected.
For the scenario "Supports two plan types:", the preserved evidence is lightweight but positive: the packaged validation command behaved as expected.
The "Review plan (course/exam-oriented)" path exercised the packaged helper command, which ran without exposing a deeper execution issue.
The end-to-end case for "Supports two plan types:" exercised the packaged helper command, which ran without exposing a deeper execution issue.
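To make the two scenarios above concrete, here is a minimal sketch of what a review plan over a start/end date range and a duration-based lab schedule might look like. It is not the packaged plan_generator.py, whose interface is not shown in this report; the function names `review_plan` and `lab_schedule`, their parameters, and the round-robin and back-to-back allocation rules are all assumptions made for illustration.

```python
from datetime import date, timedelta


def review_plan(start: date, end: date, topics: list[str]) -> list[tuple[date, str]]:
    """Hypothetical helper: assign one topic per day, cycling through topics."""
    if end < start:
        raise ValueError("end date must not precede start date")
    if not topics:
        raise ValueError("at least one topic is required")
    days = (end - start).days + 1  # inclusive date range
    return [(start + timedelta(days=i), topics[i % len(topics)]) for i in range(days)]


def lab_schedule(start: date, tasks: list[tuple[str, int]]) -> list[tuple[str, date, date]]:
    """Hypothetical helper: place tasks back to back by duration in whole days."""
    schedule = []
    cursor = start
    for name, duration in tasks:
        if duration < 1:
            raise ValueError(f"duration for {name!r} must be at least 1 day")
        finish = cursor + timedelta(days=duration - 1)
        schedule.append((name, cursor, finish))
        cursor = finish + timedelta(days=1)  # next task starts the following day
    return schedule
```

Calling `review_plan(date(2025, 6, 1), date(2025, 6, 4), ["A", "B"])` would alternate the two topics over four days, while `lab_schedule` returns a start and end date per task. A real generator would likely also handle weekends, buffers, and priorities, none of which this sketch attempts.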
Key Strengths
- Primary routing is Other with execution mode B
- Static quality score is 83/100 and dynamic average is 81.6/100
- Assertions and command execution outcomes are recorded per input for human review
- Execution verification summary: Script verification 2/2; adjustment=5. plan_generator.py: OK; validate_skill.py: OK