Academic Writing

meta-results-forest-plot-analyzer

90 / 100 Total Score

Core Capability

83 / 100

Functional Suitability

11 / 12

Reliability

10 / 12

Performance & Context

8 / 8

Agent Usability

13 / 16

Human Usability

7 / 8

Security

9 / 12

Maintainability

9 / 12

Agent-Specific

16 / 20

Medical Task

20 / 20 Passed

Score	Task	Assertions
99	Analyzes forest plots for meta-analysis, generating detailed descriptions and formatting figure legends in Chinese or English	4/4
95	Output Formatting (Script)	4/4
93	Output Formatting (Script)	4/4
93	Output Formatting (Script)	4/4
93	Output Formatting (Script)	4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed

Gate	Result	Detail
Operational Stability	PASS	System remains stable across varied inputs and edge cases
Structural Consistency	PASS	Output structure conforms to expected skill contract format
Result Determinism	PASS	Equivalent inputs produce semantically equivalent outputs
System Security	PASS	No prompt injection, data leakage, or unsafe tool use detected
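The gate logic above can be sketched in a few lines. This is a minimal illustration, assuming the veto is all-or-nothing (a single non-PASS gate blocks deployment consideration); the dictionary name `skill_veto_gates` and the helper `veto_passed` are hypothetical and not part of the evaluated package.

```python
# Gate names and results copied from the Skill Veto section of this report.
skill_veto_gates = {
    "Operational Stability": "PASS",
    "Structural Consistency": "PASS",
    "Result Determinism": "PASS",
    "System Security": "PASS",
}


def veto_passed(gates):
    """Return True only if every gate is PASS (assumed rule: one failure vetoes)."""
    return all(result == "PASS" for result in gates.values())


# Count the individual passes so the summary line can be reproduced.
passed_count = sum(result == "PASS" for result in skill_veto_gates.values())
overall = "PASS" if veto_passed(skill_veto_gates) else "FAIL"
print(f"Skill Veto: {passed_count}/{len(skill_veto_gates)} gates passed, overall={overall}")
```

Under that assumption, the four recorded results reproduce the "All 4 gates passed" summary.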

Research Veto: ✅ PASS (Applicable)

Dimension	Result	Detail
Scientific Integrity	PASS	Scientific integrity remained intact because the package rewrote or structured material without fabricating findings.
Practice Boundaries	PASS	The evaluated outputs stayed inside the "Analyzes forest plots for meta-analysis, generating detailed descriptions and formatting..." workflow rather than drifting into unsupported scientific interpretation.
Methodological Ground	PASS	No methodological-grounding issue was recorded for meta-results-forest-plot-analyzer in the archived evaluation.
Code Usability	PASS	No code-usability failure was preserved for meta-results-forest-plot-analyzer in the legacy evaluation.

Core Capability: 83 / 100 (8 Categories)

Functional Suitability

The writing workflow lands well overall, with minor remaining headroom in the final deliverable contract.

11 / 12

92%

Reliability

A small reliability gap remained around the more demanding writing-to-format conversion paths.

10 / 12

83%

Performance & Context

No points were deducted for Performance & Context in the legacy audit.

8 / 8

100%

Agent Usability

The archived score suggests slightly clearer routing would help an agent choose the right dissemination path faster.

13 / 16

81%

Human Usability

Related legacy finding for meta-results-forest-plot-analyzer: minor polish is needed before wide rollout; no major defects were found.

7 / 8

88%

Security

The workflow stayed safe overall, with only a small remaining deduction around boundary signaling.

9 / 12

75%

Maintainability

The archived review treated the package as maintainable overall, while still leaving some cleanup headroom.

9 / 12

75%

Agent-Specific

The archived score suggests this workflow could make its orchestration cues a little more explicit for agents.

16 / 20

80%

Core Capability Total: 83 / 100
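As a quick arithmetic check, the category scores listed above can be summed and converted to percentages. This is only a sketch: the variable names are hypothetical, and rounding to the nearest whole percent is an assumption that happens to reproduce the percentages shown in this report.

```python
# (earned, maximum) points per category, copied from the Core Capability section.
categories = {
    "Functional Suitability": (11, 12),
    "Reliability": (10, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (13, 16),
    "Human Usability": (7, 8),
    "Security": (9, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (16, 20),
}

# Sum earned and maximum points across all eight categories.
total_earned = sum(earned for earned, _ in categories.values())
total_max = sum(maximum for _, maximum in categories.values())

# Whole-number percentage per category (assumed rounding rule).
percentages = {
    name: round(100 * earned / maximum)
    for name, (earned, maximum) in categories.items()
}

print(f"Core Capability Total: {total_earned} / {total_max}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The sum comes to 83 out of 100, matching the reported total.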

Medical Task: Execution Average 94.6 / 100; Assertions 20/20 Passed

Canonical

Analyzes forest plots for meta-analysis, generating detailed descriptions and formatting figure legends in Chinese or English

4/4 ✓

Variant A

Output Formatting (Script)

4/4 ✓

Edge

Output Formatting (Script)

4/4 ✓

Variant B

Output Formatting (Script)

4/4 ✓

Stress

Output Formatting (Script)

4/4 ✓

Canonical: ✅ Pass

Analyzes forest plots for meta-analysis, generating detailed descriptions and formatting figure legends in Chinese or English

The "Analyzes forest plots for meta-analysis, generating detailed..." path verified the packaged helper command without exposing a deeper execution issue.

Basic 38/40 | Specialized 60/60 | Total 99/100

✅ A1: The meta-results-forest-plot-analyzer output structure matches the documented deliverable

✅ A2: The script execution path completed successfully for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Variant A: ✅ Pass

Output Formatting (Script)

The archived run for Output Formatting (Script) confirmed the helper entrypoint and left the workflow in a stable state.

Basic 36/40 | Specialized 59/60 | Total 95/100

✅ A1: The meta-results-forest-plot-analyzer output structure matches the documented deliverable

✅ A2: The script execution path completed successfully for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Edge: ✅ Pass

Output Formatting (Script)

For Output Formatting (Script), the preserved evidence is lightweight but positive: the packaged validation command behaved as expected.

Basic 35/40 | Specialized 58/60 | Total 93/100

✅ A1: The meta-results-forest-plot-analyzer output structure matches the documented deliverable

✅ A2: The script execution path completed successfully for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Variant B: ✅ Pass

Output Formatting (Script)

For Output Formatting (Script), the preserved evidence is lightweight but positive: the packaged validation command behaved as expected.

Basic 34/40 | Specialized 59/60 | Total 93/100

✅ A1: The meta-results-forest-plot-analyzer output structure matches the documented deliverable

✅ A2: The script execution path completed successfully for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Stress: ✅ Pass

Output Formatting (Script)

For Output Formatting (Script), the preserved evidence is lightweight but positive: the packaged validation command behaved as expected.

Basic 31/40 | Specialized 60/60 | Total 93/100

✅ A1: The meta-results-forest-plot-analyzer output structure matches the documented deliverable

✅ A2: The script execution path completed successfully for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Medical Task Total: 94.6 / 100
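The execution average can be checked against the five per-case totals recorded above. The sketch below assumes the average is a plain unweighted mean, which reproduces the reported 94.6; the names `case_totals` and `assertions` are hypothetical.

```python
from statistics import mean

# "Total" score for each test case, copied from the per-case results above.
case_totals = {
    "Canonical": 99,
    "Variant A": 95,
    "Edge": 93,
    "Variant B": 93,
    "Stress": 93,
}

# (passed, total) assertions per case; every case reports a 4 / 4 pass rate.
assertions = {name: (4, 4) for name in case_totals}

# Unweighted mean of the five totals (assumed aggregation rule).
execution_average = mean(case_totals.values())

# Aggregate the assertion counts across all cases.
assertions_passed = sum(passed for passed, _ in assertions.values())
assertions_total = sum(total for _, total in assertions.values())

print(f"Execution Average: {execution_average:.1f} / 100")
print(f"Assertions: {assertions_passed}/{assertions_total} Passed")
```

The totals sum to 473, and 473 / 5 = 94.6, with 20 of 20 assertions passing across the five cases.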

Key Strengths

Primary routing is Academic Writing with execution mode B
Static quality score is 83/100 and dynamic average is 94.6/100
Assertions and command execution outcomes are recorded per input for human review
Execution verification summary: Script verification 1/2; adjustment=3. format_result.py: rc=1; validate_skill.py: OK
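The "Script verification 1/2" figure in the summary above can be reproduced from the two per-script statuses. This is a hedged sketch only: it assumes `rc` denotes a process return code, that `OK` counts as verified, and that any other status (such as `rc=1`) counts as not verified. The function `parse_script_status` is hypothetical and is not part of the evaluated package.

```python
# Per-script statuses copied from the execution verification summary.
summary = "format_result.py: rc=1; validate_skill.py: OK"


def parse_script_status(text):
    """Map each script name to True if its status is 'OK', otherwise False."""
    statuses = {}
    for entry in text.split(";"):
        # Split "name: status" on the first colon only.
        name, _, status = entry.strip().partition(":")
        statuses[name.strip()] = status.strip() == "OK"
    return statuses


statuses = parse_script_status(summary)
verified = sum(statuses.values())  # True counts as 1, False as 0
print(f"Script verification {verified}/{len(statuses)}")
```

Under these assumptions, only `validate_skill.py` counts as verified, which gives 1 of 2.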