Academic Writing

academic-highlight-generator

86 / 100 Total Score

Core Capability

77 / 100

Functional Suitability

9 / 12

Reliability

9 / 12

Performance & Context

8 / 8

Agent Usability

12 / 16

Human Usability

7 / 8

Security

8 / 12

Maintainability

9 / 12

Agent-Specific

15 / 20

Medical Task

20 / 20 Passed

96 / 100: Extracts and generates academic highlights from research papers (PDF/Doc) suitable for Elsevier/SCI journals, with auto-classification and self-correction. Use when users want to generate "Highlights" section for a paper

4/4

92 / 100: Extracts and generates academic highlights from research papers (PDF/Doc) suitable for Elsevier/SCI journals, with auto-classification and self-correction. Use when users want to generate "Highlights" section for a paper

4/4

90 / 100: Validate source sufficiency

4/4

90 / 100: Generate draft highlights

4/4

90 / 100: Self-critique and refine

4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed

✓

Operational Stability

System remains stable across varied inputs and edge cases

PASS

✓

Structural Consistency

Output structure conforms to expected skill contract format

PASS

✓

Result Determinism

Equivalent inputs produce semantically equivalent outputs

PASS

✓

System Security

No prompt injection, data leakage, or unsafe tool use detected

PASS

Research Veto: ✅ PASS (Applicable)

Dimension	Result	Detail
Scientific Integrity	PASS	Scientific integrity remained intact because the package rewrote or structured material without fabricating findings.
Practice Boundaries	PASS	Practice boundaries held because the package kept to its stated scope ("Generates submission-ready Elsevier/SCI Highlights from manuscript text or extracted...") instead of claiming new evidence.
Methodological Ground	PASS	The older review treated the package logic as methodologically aligned with its stated workflow.
Code Usability	N/A	The audited output is a narrative or formatting deliverable rather than a code-first scientific workflow.

Core Capability: 77 / 100 (8 Categories)

Functional Suitability

Functional suitability lost points to the legacy issue 'Improve stress-case output rigor': stress and boundary scenarios show weaker consistency.

9 / 12

75%

Reliability

Related legacy finding for academic-highlight-generator: 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.

9 / 12

75%

Performance & Context

The legacy audit gave this package full marks for performance and context.

8 / 8

100%

Agent Usability

The package guides agents reasonably well, while still leaving a little room for crisper trigger wording.

12 / 16

75%

Human Usability

The writing package is readable, though the archived score suggests slightly cleaner presentation would help.

7 / 8

88%

Security

Security scored 8 / 12, the lowest share of any category; the archived review left room to state source-faithful boundaries more explicitly.

8 / 12

67%

Maintainability

The archived review treated the package as maintainable overall, while still leaving some cleanup headroom.

9 / 12

75%

Agent-Specific

The archived deduction in Agent-Specific traces back to the legacy issue 'Improve stress-case output rigor': stress and boundary scenarios show weaker consistency.

15 / 20

75%

Core Capability Total: 77 / 100
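The total above is consistent with a plain sum of the eight category scores, and each percentage with earned points divided by the category maximum. The following is a minimal Python sketch of that arithmetic, not the audit's actual scoring code, which this report does not show. The names `category_scores` and `summarize` are hypothetical, and half-up rounding is an assumption chosen because it reproduces the 88% listed for Human Usability.

```python
# Category scores as (earned, maximum), copied from the scorecard above.
category_scores = {
    "Functional Suitability": (9, 12),
    "Reliability": (9, 12),
    "Performance & Context": (8, 8),
    "Agent Usability": (12, 16),
    "Human Usability": (7, 8),
    "Security": (8, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (15, 20),
}


def summarize(scores):
    """Return (earned total, maximum total, per-category percentages).

    Assumption: percentages are rounded half-up to whole numbers.
    """
    earned = sum(e for e, _ in scores.values())
    maximum = sum(m for _, m in scores.values())
    percentages = {
        name: int(100 * e / m + 0.5) for name, (e, m) in scores.items()
    }
    return earned, maximum, percentages


earned, maximum, percentages = summarize(category_scores)
print(f"Core Capability Total: {earned} / {maximum}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Running the sketch against the listed scores is a quick way to confirm that the category rows and the reported total agree.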

Medical Task: Execution Average 91.6 / 100; Assertions 20/20 Passed

Canonical

Extracts and generates academic highlights from research papers (PDF/Doc) suitable for Elsevier/SCI journals, with auto-classification and self-correction. Use when users want to generate "Highlights" section for a paper

4/4 ✓

Variant A

4/4 ✓

Edge

Validate source sufficiency

4/4 ✓

Variant B

Generate draft highlights

4/4 ✓

Stress

Self-critique and refine

4/4 ✓

Canonical: ✅ Pass

The "Extracts and generates academic highlights from research papers..." case remained a writing-first workflow and was evaluated without depending on a runnable helper script.

Basic 35/40 | Specialized 60/60 | Total 96/100

✅ A1: The academic-highlight-generator output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Variant A: ✅ Pass

This Variant A case was handled as a bounded writing workflow, not as an executable pipeline.

Basic 33/40 | Specialized 59/60 | Total 92/100

✅ A1: The academic-highlight-generator output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Edge: ✅ Pass

Validate source sufficiency

This edge case was handled as a bounded writing workflow, not as an executable pipeline.

Basic 32/40 | Specialized 58/60 | Total 90/100

✅ A1: The academic-highlight-generator output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Variant B: ✅ Pass

Generate draft highlights

The archived run for "Generate draft highlights" stayed on the narrative-deliverable path rather than a code path.

Basic 31/40 | Specialized 59/60 | Total 90/100

✅ A1: The academic-highlight-generator output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Stress: ✅ Pass

Self-critique and refine

The archived run for "Self-critique and refine" stayed on the narrative-deliverable path rather than a code path.

Basic 28/40 | Specialized 60/60 | Total 90/100

✅ A1: The academic-highlight-generator output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Medical Task Total: 91.6 / 100
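The 91.6 figure matches the unweighted mean of the five scenario totals reported above, and the 20/20 assertion count matches five scenarios at 4/4 each. The sketch below reproduces both numbers under that assumption; the report does not state the formula it used, and `scenario_totals`, `assertion_results`, `execution_average`, and `assertion_tally` are hypothetical names introduced for illustration.

```python
# Scenario totals (out of 100) and assertion results (passed, total),
# copied from the per-scenario entries in this report.
scenario_totals = {
    "Canonical": 96,
    "Variant A": 92,
    "Edge": 90,
    "Variant B": 90,
    "Stress": 90,
}
assertion_results = {name: (4, 4) for name in scenario_totals}


def execution_average(totals):
    """Unweighted mean of the scenario totals, rounded to one decimal.

    Assumption: every scenario carries equal weight.
    """
    return round(sum(totals.values()) / len(totals), 1)


def assertion_tally(results):
    """Sum passed and total assertion counts across all scenarios."""
    passed = sum(p for p, _ in results.values())
    total = sum(t for _, t in results.values())
    return passed, total


average = execution_average(scenario_totals)
passed, total = assertion_tally(assertion_results)
print(f"Execution Average: {average} / 100")
print(f"Assertions: {passed}/{total} Passed")
```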

Key Strengths

Primary routing is Academic Writing with execution mode B
Static quality score is 77/100 and dynamic average is 78.6/100
Assertions and command execution outcomes are recorded per input for human review
Execution verification summary: Script verification 1/1; adjustment=5. extract_text.py: OK