Other
resubmission-deadline-tracker
Track manuscript resubmission deadlines and automatically generate phase-appropriate task breakdowns for academic researchers based on remaining time.
87 / 100 Total Score
Core Capability
88 / 100
Functional Suitability
12 / 12
Reliability
11 / 12
Performance & Context
7 / 8
Agent Usability
15 / 16
Human Usability
7 / 8
Security
10 / 12
Maintainability
11 / 12
Agent-Specific
15 / 20
Medical Task
20 / 20 Passed
88: Add deadline for a Nature Medicine paper with 2 major and 8 minor issues
4/4
87: List all tracked deadlines and show urgency status
4/4
86: Emergency mode: deadline in 2 days with 5 major issues
4/4
85: Update progress to 60% on a tracked manuscript
4/4
86: Request to sync deadlines with journal submission portal (out-of-scope)
4/4
Veto Gates: required pass for any deployment consideration
Skill Veto: ✓ All 4 gates passed
✓ Operational Stability
System remains stable across varied inputs and edge cases
PASS
✓ Structural Consistency
Output structure conforms to expected skill contract format
PASS
✓ Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
✓ System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Core Capability: 88 / 100 (8 Categories)
Functional Suitability
Urgency level boundary gap fixed — 3-14 day range now correctly labeled Urgent with clarifying note. Timezone default documented in workflow step 2. All five operations covered.
12 / 12
100%
Reliability
Timezone validation step added. Urgency levels table corrected. Error handling documented. Minor: no explicit handling for past-deadline inputs.
11 / 12
92%
Performance & Context
References directory present with journal_deadlines.md, revision_checklist.md, and task_templates.json; good progressive disclosure.
7 / 8
88%
Agent Usability
Workflow clear with timezone validation step. Urgency table corrected. Stress-case rules and response template present.
15 / 16
94%
Human Usability
Description is highly discoverable for academic researchers; input handling is forgiving, since most fields are optional.
7 / 8
88%
Security
No credentials required; input validation present; no sensitive data exposure risk.
10 / 12
83%
Maintainability
Clean structure with references/ directory; task templates in separate JSON file enables easy updates. Urgency boundary now consistent.
11 / 12
92%
Agent-Specific
Trigger precision good; progressive disclosure via references/; composability moderate — no structured output schema for downstream tools.
15 / 20
75%
Core Capability Total: 88 / 100
Medical Task: Execution Average 86.4 / 100, Assertions 20/20 Passed
88
Canonical ✅ Pass
Add deadline for a Nature Medicine paper with 2 major and 8 minor issues
The timezone validation step now emits a note when --timezone is not provided. The urgency level is calculated correctly.
Basic 36/40 | Specialized 52/60 | Total 88/100
✅ A1: Output includes deadline summary with urgency level
✅ A2: Output generates a phase-appropriate task schedule
✅ A3: Output includes risk notes (timezone, buffer time)
✅ A4: Output does not fabricate deadline dates or journal policies
Pass rate: 4 / 4
87
Variant A ✅ Pass
List all tracked deadlines and show urgency status
Output completed successfully; the "list all tracked deadlines and show urgency status" case was handled within the expected scope.
Basic 36/40 | Specialized 51/60 | Total 87/100
✅ A1: Output lists all tracked manuscripts with their deadlines
✅ A2: Output correctly assigns urgency levels based on remaining time
✅ A3: Output includes daily targets and checkbox list
✅ A4: Output does not exceed scope by syncing with journal systems
Pass rate: 4 / 4
86
Edge ✅ Pass
Emergency mode: deadline in 2 days with 5 major issues
Output completed successfully; the "emergency mode: deadline in 2 days with 5 major issues" case was handled within the expected scope.
Basic 36/40 | Specialized 50/60 | Total 86/100
✅ A1: Output correctly triggers emergency mode for < 3 day deadline
✅ A2: Output recommends minimum viable changes and extension request
✅ A3: Output includes explicit risk notes about the tight timeline
✅ A4: Output does not fabricate task estimates or journal extension policies
Pass rate: 4 / 4
85
Variant B ✅ Pass
Update progress to 60% on a tracked manuscript
Output completed successfully; the "update progress to 60% on a tracked manuscript" case was handled within the expected scope.
Basic 35/40 | Specialized 50/60 | Total 85/100
✅ A1: Output confirms the progress update was applied
✅ A2: Output recalculates remaining tasks based on updated progress
✅ A3: Output does not modify the deadline date
✅ A4: Output stays within the defined scope of deadline tracking
Pass rate: 4 / 4
86
Stress ✅ Pass
Request to sync deadlines with journal submission portal (out-of-scope)
The skill correctly refuses the request. The Input Validation refusal message now includes a scope boundary explanation.
Basic 34/40 | Specialized 52/60 | Total 86/100
✅ A1: Skill refuses to sync with journal submission systems
✅ A2: Refusal message references the correct scope boundary
✅ A3: No fabricated journal system integration is produced
✅ A4: Output suggests an appropriate alternative action or resource
Pass rate: 4 / 4
Medical Task Total: 86.4 / 100
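The reported totals can be cross-checked with a few lines of arithmetic. The sketch below uses only the scores printed in this report. The variable names are illustrative, and the last step (averaging the two section scores to get the headline 87) is an assumption that happens to match the reported value, not a documented scoring rule.

```python
from statistics import mean

# (score, maximum) per Core Capability category, copied from the report.
core_categories = {
    "Functional Suitability": (12, 12),
    "Reliability": (11, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (15, 16),
    "Human Usability": (7, 8),
    "Security": (10, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (15, 20),
}

core_total = sum(score for score, _ in core_categories.values())
core_max = sum(maximum for _, maximum in core_categories.values())

# Per-case execution scores: Canonical, Variant A, Edge, Variant B, Stress.
task_scores = [88, 87, 86, 85, 86]
task_average = round(mean(task_scores), 1)

# ASSUMPTION: the headline score is the rounded mean of the two sections.
overall_guess = round((core_total + task_average) / 2)

print(f"Core Capability: {core_total} / {core_max}")
print(f"Medical Task average: {task_average}")
print(f"Overall (assumed rule): {overall_guess}")
```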
Key Strengths
- Urgency level boundary gap fixed — 3-14 day range now correctly labeled Urgent with clarifying note, eliminating the v1 inconsistency
- Timezone default now documented in workflow step 2 with explicit note and guidance to specify local timezone
- Three-tier urgency system (standard/urgent/emergency) with distinct task strategies is well-calibrated for real academic workflows
- References directory with task_templates.json and revision_checklist.md enables good progressive disclosure and easy maintenance
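
The three-tier urgency rule described above could be sketched as follows. This is a minimal illustration, not the skill's actual code: the function names are hypothetical, the "emergency below 3 days" and "urgent for 3 to 14 days" boundaries are taken from this report, and treating anything over 14 days as standard is an assumption. The report does not say which timezone the skill defaults to, so UTC is used here purely as a placeholder, and rejecting past deadlines is one possible way to close the gap noted under Reliability.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def days_until(deadline: datetime, now: datetime) -> float:
    """Return fractional days from `now` to `deadline`.

    ASSUMPTION: naive datetimes are interpreted as UTC; the skill's real
    default timezone is not stated in the report.
    """
    if deadline.tzinfo is None:
        deadline = deadline.replace(tzinfo=timezone.utc)
    if now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds() / SECONDS_PER_DAY


def classify_urgency(days_remaining: float) -> str:
    """Map remaining days to one of the three urgency tiers."""
    if days_remaining < 0:
        # Hypothetical handling for the past-deadline case.
        raise ValueError("deadline has already passed")
    if days_remaining < 3:
        return "emergency"
    if days_remaining <= 14:
        return "urgent"
    return "standard"  # assumed: anything beyond 14 days


if __name__ == "__main__":
    now = datetime(2025, 1, 8, tzinfo=timezone.utc)
    deadline = datetime(2025, 1, 10, tzinfo=timezone.utc)
    print(classify_urgency(days_until(deadline, now)))
```

Keeping the day calculation separate from the tier lookup makes the boundary values easy to test on their own, which is where the earlier 3-14 day inconsistency arose.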