Evidence Insight

citation-network

92 / 100 Total Score
Core Capability
87 / 100
Functional Suitability
11 / 12
Reliability
10 / 12
Performance & Context
8 / 8
Agent Usability
14 / 16
Human Usability
8 / 8
Security
9 / 12
Maintainability
10 / 12
Agent-Specific
17 / 20
Medical Task
20 / 20 Passed
100
You have a citation relationship table (who cites whom) and want to quickly turn it into a directed network for analysis
4/4
97
You are conducting a literature review and need to identify influential papers (high in-degree / centrality) and core clusters
4/4
95
Builds a directed citation graph from a minimal CSV containing source and target
4/4
94
De-duplicates nodes by identifier (DOI recommended; otherwise unique titles)
4/4
94
End-to-end case for Builds a directed citation graph from a minimal CSV containing source and target
4/4

Veto Gates
Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | Scientific content remained anchored to fetched metadata or source-linked evidence in the legacy review.
Practice Boundaries | PASS | The legacy review kept this workflow on the evidence-access side of the boundary, not the advice-giving side.
Methodological Ground | PASS | The older review treated the package logic as methodologically aligned with its stated workflow.
Code Usability | PASS | The packaged retrieval surface remained understandable at the command and parameter level in the archived review.

Core Capability: 87 / 100 (8 Categories)

Functional Suitability
The legacy audit deducted one point from citation-network for functional suitability.
11 / 12
92%
Reliability
The archived evaluation left some headroom for citation-network under reliability.
10 / 12
83%
Performance & Context
The legacy audit gave this package full marks for performance and context.
8 / 8
100%
Agent Usability
The archived evaluation left some headroom for citation-network under agent usability.
14 / 16
88%
Human Usability
Human usability reached full score in the archived evaluation.
8 / 8
100%
Security
The archived evaluation left some headroom for citation-network under security.
9 / 12
75%
Maintainability
A modest deduction remained in maintainability for citation-network in the archived review.
10 / 12
83%
Agent-Specific
A modest deduction remained in the agent-specific category for citation-network in the archived review.
17 / 20
85%
Core Capability Total: 87 / 100

Medical Task
Execution Average: 96 / 100 | Assertions: 20/20 Passed

100
Canonical
You have a citation relationship table (who cites whom) and want to quickly turn it into a directed network for analysis
4/4
97
Variant A
You are conducting a literature review and need to identify influential papers (high in-degree / centrality) and core clusters
4/4
95
Edge
Builds a directed citation graph from a minimal CSV containing source and target
4/4
94
Variant B
De-duplicates nodes by identifier (DOI recommended; otherwise unique titles)
4/4
94
Stress
End-to-end case for Builds a directed citation graph from a minimal CSV containing source and target
4/4
100
Canonical ✅ Pass
You have a citation relationship table (who cites whom) and want to quickly turn it into a directed network for analysis

The scenario "You have a citation relationship table (who cites whom) and want to..." remained well aligned with the documented contract in the preserved audit.

Basic 38/40 | Specialized 60/60 | Total 100/100
A1: The citation-network output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
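
The canonical scenario above (turning a who-cites-whom table into a directed network) can be illustrated with a minimal sketch. This is not the package's build_citation_network.py; the function name `build_citation_graph` and the sample rows are hypothetical, and the only assumption taken from the report is a CSV with `source` and `target` columns.

```python
import csv
import io
from collections import defaultdict


def build_citation_graph(csv_text):
    """Return {paper: set of papers it cites} from CSV text.

    Hypothetical helper: assumes a header row with 'source' and 'target'
    columns, where each row means "source cites target".
    """
    graph = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row["source"].strip()
        target = row["target"].strip()
        if not source or not target:
            continue  # skip incomplete rows rather than guessing
        graph[source].add(target)        # directed edge: source -> target
        graph.setdefault(target, set())  # cited-only papers are nodes too
    return dict(graph)


# Illustrative data only, not taken from the evaluation.
sample = "source,target\nA,B\nA,C\nB,C\n"
graph = build_citation_graph(sample)
```

Using a set per node means repeated rows for the same citation collapse into a single edge, which is usually the intended reading of a citation table.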
97
Variant A ✅ Pass
You are conducting a literature review and need to identify influential papers (high in-degree / centrality) and core clusters

The archived evaluation treated "You are conducting a literature review and need to identify..." as a clean in-scope run.

Basic 36/40 | Specialized 60/60 | Total 97/100
A1: The citation-network output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
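
For the literature-review scenario above, "influential" is described in terms of high in-degree, that is, how many papers in the table cite a given paper. The following is a rough sketch under that reading only; `rank_by_in_degree` and the edge list are hypothetical, and the package may use other centrality measures that are not shown here.

```python
from collections import Counter


def rank_by_in_degree(edges):
    """Return [(paper, in_degree)] sorted by in-degree, highest first.

    Hypothetical helper: `edges` is an iterable of (source, target)
    pairs, each meaning "source cites target". Ties are broken
    alphabetically so the ordering is deterministic.
    """
    in_degree = Counter()
    nodes = set()
    for source, target in edges:
        nodes.update((source, target))
        in_degree[target] += 1  # one more paper cites `target`
    ordered = sorted(nodes, key=lambda node: (-in_degree[node], node))
    return [(node, in_degree[node]) for node in ordered]


# Illustrative edges only.
edges = [("A", "C"), ("B", "C"), ("A", "B"), ("D", "C")]
ranking = rank_by_in_degree(edges)
```

In this toy input, paper C is cited three times and ranks first, while A and D are never cited and appear last with an in-degree of zero.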
95
Edge ✅ Pass
Builds a directed citation graph from a minimal CSV containing source and target

The archived evaluation treated "Builds a directed citation graph from a minimal CSV containing..." as a clean in-scope run.

Basic 35/40 | Specialized 60/60 | Total 95/100
A1: The citation-network output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
94
Variant B ✅ Pass
De-duplicates nodes by identifier (DOI recommended; otherwise unique titles)

The "De-duplicates nodes by identifier (DOI recommended; otherwise..." scenario completed within the documented "Build and visualize a citation network from a source/target CSV to identify key papers,..." boundary.

Basic 34/40 | Specialized 60/60 | Total 94/100
A1: The citation-network output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
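
The de-duplication scenario above depends on two spellings of the same identifier mapping to one node. A minimal sketch of that idea for DOIs follows; `normalize_doi` and `dedupe_edges` are hypothetical names, the prefix list is an assumption about common DOI spellings, and the lowercasing step relies on DOIs being case-insensitive, so it should not be applied unchanged to title-based identifiers.

```python
# Assumed common ways a DOI is written; not taken from the package.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(identifier):
    """Reduce a DOI string to a bare lowercase form such as '10.1000/abc'."""
    value = identifier.strip().lower()
    for prefix in DOI_PREFIXES:
        if value.startswith(prefix):
            value = value[len(prefix):]
            break
    return value.strip()


def dedupe_edges(edges):
    """Normalize both endpoints and drop repeated edges, keeping first-seen order."""
    seen = set()
    unique = []
    for source, target in edges:
        edge = (normalize_doi(source), normalize_doi(target))
        if edge not in seen:
            seen.add(edge)
            unique.append(edge)
    return unique


# Illustrative identifiers only.
raw_edges = [
    ("https://doi.org/10.1000/ABC", "doi:10.1000/xyz"),
    ("10.1000/abc", "10.1000/XYZ"),
]
unique_edges = dedupe_edges(raw_edges)
```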
94
Stress ✅ Pass
End-to-end case for Builds a directed citation graph from a minimal CSV containing source and target

The "End-to-end case for Builds a directed citation graph from a minimal..." scenario completed within the documented "Build and visualize a citation network from a source/target CSV to identify key papers,..." boundary.

Basic 31/40 | Specialized 60/60 | Total 94/100
A1: The citation-network output structure matches the documented deliverable
A2: The script execution path completed successfully for the documented case
A3: The output stays fully within the documented skill boundary
A4: The response quality is acceptable for the documented path
Pass rate: 4 / 4
Medical Task Total: 96 / 100
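
The 96 / 100 total is consistent with an unweighted mean of the five scenario scores listed above. The report does not state how the total is computed, so the averaging rule in this short check is an assumption.

```python
# Scenario scores as listed in this report.
scenario_scores = {
    "Canonical": 100,
    "Variant A": 97,
    "Edge": 95,
    "Variant B": 94,
    "Stress": 94,
}

# Assumption: the total is a plain (unweighted) mean of the scenario scores.
average = sum(scenario_scores.values()) / len(scenario_scores)
print(f"Execution average: {average:.0f} / 100")
```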

Key Strengths

  • Primary routing is Evidence Insight with execution mode B
  • Static quality score is 87/100 and dynamic average is 83.6/100
  • Assertions and command execution outcomes are recorded per input for human review
  • Execution verification summary: Script verification 3/3; adjustment=5. build_citation_network.py: OK; export_gexf_html.py: OK; init_run.py: OK