Other

word-read-write

85 / 100 Total Score
Core Capability
78 / 100
Functional Suitability
10 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
12 / 16
Human Usability
7 / 8
Security
8 / 12
Maintainability
9 / 12
Agent-Specific
15 / 20
Medical Task
20 / 20 Passed
94
Generate Word reports (e.g., weekly status, audit summaries) from structured data in Node.js
4/4
90
Produce standardized memos/letters/templates with consistent page size, margins, headings, and tables
4/4
88
Create .docx documents using the docx (docx-js) library
4/4
88
Explicit page setup (US Letter vs A4), margins, and landscape handling
4/4
88
End-to-end case for Create .docx documents using the docx (docx-js) library
4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS

Core Capability: 78 / 100 (8 Categories)

Functional Suitability
Related legacy finding for word-read-write: "Improve stress-case output rigor". Stress and boundary scenarios show weaker consistency.
10 / 12
83%
Reliability
Reliability lost points to the legacy issue "Improve stress-case output rigor". Stress and boundary scenarios show weaker consistency.
9 / 12
75%
Performance & Context
No point loss was recorded for performance and context in the legacy audit.
8 / 8
100%
Agent Usability
A modest deduction remained in agent usability for word-read-write in the archived review.
12 / 16
75%
Human Usability
A modest deduction remained in human usability for word-read-write in the archived review.
7 / 8
88%
Security
A modest deduction remained in security for word-read-write in the archived review.
8 / 12
67%
Maintainability
The archived evaluation left some headroom for word-read-write under maintainability.
9 / 12
75%
Agent-Specific
Related legacy finding for word-read-write: "Improve stress-case output rigor". Stress and boundary scenarios show weaker consistency.
15 / 20
75%
Core Capability Total: 78 / 100
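
The percentages and total in this section follow directly from the earned and maximum points per category. The sketch below recomputes them as a consistency check. The names `CATEGORY_SCORES`, `percent`, and `summarize` are illustrative and not part of the audit tooling, and rounding halves up is an assumption inferred from 7 / 8 being shown as 88%.

```python
# Earned and maximum points per category, copied from the scorecard above.
CATEGORY_SCORES = {
    "Functional Suitability": (10, 12),
    "Reliability": (9, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (12, 16),
    "Human Usability": (7, 8),
    "Security": (8, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (15, 20),
}


def percent(earned: int, maximum: int) -> int:
    """Whole-number percentage, halves rounded up (assumed rounding rule)."""
    # Integer arithmetic avoids float rounding surprises at exactly .5.
    return (200 * earned + maximum) // (2 * maximum)


def summarize(scores: dict) -> tuple:
    """Return per-category percentages plus the earned and maximum totals."""
    percentages = {name: percent(e, m) for name, (e, m) in scores.items()}
    total_earned = sum(e for e, _ in scores.values())
    total_max = sum(m for _, m in scores.values())
    return percentages, total_earned, total_max


percentages, total_earned, total_max = summarize(CATEGORY_SCORES)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print(f"Core Capability Total: {total_earned} / {total_max}")
```

Under these assumptions the recomputed values agree with the percentages listed per category and with the 78 / 100 total.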

Medical Task: Execution Average: 89.6 / 100; Assertions: 20/20 Passed

94
Canonical
Generate Word reports (e.g., weekly status, audit summaries) from structured data in Node.js
4/4
90
Variant A
Produce standardized memos/letters/templates with consistent page size, margins, headings, and tables
4/4
88
Edge
Create .docx documents using the docx (docx-js) library
4/4
88
Variant B
Explicit page setup (US Letter vs A4), margins, and landscape handling
4/4
88
Stress
End-to-end case for Create .docx documents using the docx (docx-js) library
4/4
94
Canonical: ✅ Pass
Generate Word reports (e.g., weekly status, audit summaries) from structured data in Node.js

The archived run for Generate Word reports (e.g., weekly status, audit summaries) from... remained guidance-driven rather than command-driven.

Basic 35/40 | Specialized 59/60 | Total 94/100
A1: The word-read-write output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
90
Variant A: ✅ Pass
Produce standardized memos/letters/templates with consistent page size, margins, headings, and tables

Produce standardized memos/letters/templates with consistent page... was evaluated as a bounded documentation path, not as a runnable script workflow.

Basic 33/40 | Specialized 57/60 | Total 90/100
A1: The word-read-write output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
88
Edge: ✅ Pass
Create .docx documents using the docx (docx-js) library

The archived run for Create .docx documents using the docx (docx-js) library remained guidance-driven rather than command-driven.

Basic 32/40 | Specialized 56/60 | Total 88/100
A1: The word-read-write output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
88
Variant B: ✅ Pass
Explicit page setup (US Letter vs A4), margins, and landscape handling

Explicit page setup (US Letter vs A4), margins, and landscape handling was evaluated as a bounded documentation path, not as a runnable script workflow.

Basic 31/40 | Specialized 57/60 | Total 88/100
A1: The word-read-write output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
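
For readers unfamiliar with the page-setup case, the US Letter vs A4 distinction comes down to different page dimensions. WordprocessingML stores page size and margins in twentieths of a point (1440 per inch). The sketch below only performs that unit conversion. The helper names are hypothetical, and how the skill or the docx library consumes these values is not stated in this report.

```python
TWIPS_PER_INCH = 1440  # 72 points per inch x 20 twentieths per point
MM_PER_INCH = 25.4


def inches_to_twips(inches: float) -> int:
    """Convert inches to twentieths of a point, rounded to the nearest unit."""
    return round(inches * TWIPS_PER_INCH)


def mm_to_twips(mm: float) -> int:
    """Convert millimetres to twentieths of a point."""
    return round(mm / MM_PER_INCH * TWIPS_PER_INCH)


def page_size(width: int, height: int, landscape: bool = False) -> tuple:
    """Return (width, height) as laid out on the page; landscape swaps them."""
    return (height, width) if landscape else (width, height)


# Portrait dimensions of the two page sizes named in the case title.
US_LETTER = (inches_to_twips(8.5), inches_to_twips(11))  # 8.5 in x 11 in
A4 = (mm_to_twips(210), mm_to_twips(297))                # 210 mm x 297 mm
ONE_INCH_MARGIN = inches_to_twips(1)

print("US Letter:", page_size(*US_LETTER))
print("A4:", page_size(*A4))
print("A4 landscape:", page_size(*A4, landscape=True))
```

The two sizes differ in both dimensions, so a template built for one will reflow on the other unless the size is set explicitly.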
88
Stress: ✅ Pass
End-to-end case for Create .docx documents using the docx (docx-js) library

This stress case stayed inside the documented workflow and remained instruction-led.

Basic 28/40 | Specialized 60/60 | Total 88/100
A1: The word-read-write output structure matches the documented deliverable
A2: The instruction path remains actionable for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 89.6 / 100
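
The 89.6 figure is consistent with a plain arithmetic mean of the five case totals, each of which is the sum of its Basic and Specialized sub-scores. The sketch below reproduces that arithmetic. An unweighted mean is an assumption, since the report does not state how the average was derived, and `CASES` is an illustrative name.

```python
from statistics import mean

# (Basic, Specialized) sub-scores per case, copied from the breakdowns above.
CASES = {
    "Canonical": (35, 59),
    "Variant A": (33, 57),
    "Edge": (32, 56),
    "Variant B": (31, 57),
    "Stress": (28, 60),
}

# Each case total is Basic (out of 40) plus Specialized (out of 60).
totals = {name: basic + specialized for name, (basic, specialized) in CASES.items()}

# Unweighted mean across the five cases (assumed aggregation rule).
execution_average = round(mean(list(totals.values())), 1)

for name, total in totals.items():
    print(f"{name}: {total}/100")
print(f"Execution average: {execution_average} / 100")
```

The Stress case has the lowest Basic sub-score (28/40) despite a full Specialized score, which matches the legacy finding about stress-case output rigor.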

Key Strengths

  • Primary routing is Other with execution mode B
  • Static quality score is 78/100 and dynamic average is 76.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 2/2; adjustment=5. accept_changes.py: OK; comment.py: OK