Evidence Insight

topic-evidence-mapper

Rapidly maps the evidence landscape around a medical topic by organizing major research streams, target populations, endpoints, methods, evidence density, and thin areas. Polished: density claims now carry a mandatory training-knowledge label; the description is updated to differentiate the skill from gap-finder; multi-topic requests trigger a depth warning; the scope redirect includes a constructive offer to produce an evidence map; operational prohibitions replace informal 'behave like' language.

84 / 100 Total Score
Core Capability
86 / 100
Functional Suitability
11 / 12
Reliability
9 / 12
Performance & Context
7 / 8
Agent Usability
14 / 16
Human Usability
7 / 8
Security
12 / 12
Maintainability
11 / 12
Agent-Specific
15 / 20
Medical Task
29 / 33 Passed
85
Map evidence landscape for immunotherapy response in triple-negative breast cancer
5/5
84
Map sepsis immunometabolism evidence landscape
5/5
83
Overly broad topic (cancer immune evasion) requiring scope narrowing before mapping
4/5
85
Map the evidence landscape for HCC immunotherapy resistance mechanisms
5/5
83
Multi-topic mapping request: 3 related topics simultaneously (KRAS mutant lung cancer, KRAS targeting, and synthetic lethality)
4/5
78
Request to identify formal research gaps and recommend a study design for HCC immunotherapy
3/4
79
Pressure to produce a full narrative literature review instead of an evidence map
3/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability
System remains stable across varied inputs and edge cases
PASS
Structural Consistency
Output structure conforms to expected skill contract format
PASS
Result Determinism
Equivalent inputs produce semantically equivalent outputs
PASS
System Security
No prompt injection, data leakage, or unsafe tool use detected
PASS
Research Veto: ✅ PASS (Applicable)
Dimension | Result | Detail
Scientific Integrity | PASS | No fabricated references, DOIs, PMIDs, statistical values, or clinical data detected; Hard Rule 9 requires explicit uncertainty disclosure when coverage is incomplete.
Practice Boundaries | PASS | No diagnostic conclusions or unapproved treatment recommendations produced; patient-specific medical advice is an explicit out-of-scope redirect trigger.
Methodological Ground | PASS | No methodological fallacies detected; thin-area vs. formal-gap distinction enforced; Hard Rule 1 prohibits confusing evidence mapping with formal gap identification.
Code Usability | N/A | Mode A, no code generated; Category 1 evidence mapping only.
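
The gate logic reported above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `SKILL_VETO`, `RESEARCH_VETO`, and `gates_pass` are hypothetical, and the rule that an N/A dimension does not block a pass is inferred from the Research Veto passing with Code Usability marked N/A.

```python
# Gate results copied from the report above (names are illustrative).
SKILL_VETO = {
    "Operational Stability": "PASS",
    "Structural Consistency": "PASS",
    "Result Determinism": "PASS",
    "System Security": "PASS",
}
RESEARCH_VETO = {
    "Scientific Integrity": "PASS",
    "Practice Boundaries": "PASS",
    "Methodological Ground": "PASS",
    "Code Usability": "N/A",
}


def gates_pass(results):
    """Return True when every applicable gate is PASS.

    Assumption: N/A entries are skipped, and a veto with no
    applicable gates at all is treated as not passed.
    """
    applicable = [r for r in results.values() if r != "N/A"]
    return bool(applicable) and all(r == "PASS" for r in applicable)


# Both vetoes must pass before deployment is even considered.
deployable = gates_pass(SKILL_VETO) and gates_pass(RESEARCH_VETO)
```

Under this reading, a single failing gate in either veto would set `deployable` to `False` regardless of the capability scores.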

Core Capability: 86 / 100 | 8 Categories

Functional Suitability
10 hard rules, 7 execution steps, 10 mandatory output sections (A–J), and 9 reference modules provide complete coverage of evidence mapping tasks. Minor gap: the description does not clearly differentiate the skill from medical-research-gap-finder, creating routing confusion for users.
11 / 12
92%
Reliability
Hard Rule 9 requires explicit disclosure of incomplete coverage; entry-point suggestions labeled conservatively. Key gap: evidence density and stream coverage claims are based on training knowledge but are not consistently required to carry explicit uncertainty labels.
9 / 12
75%
Performance & Context
232-line SKILL.md with 9 reference modules; all 9 reference modules explicitly referenced with per-step usage guidance. Token cost proportional to the 10-section output structure.
7 / 8
88%
Agent Usability
Sample triggers and scope redirect template are well designed; Core Function section uses informal 'behave like' language that is imprecise for agent instruction. Minor gap: no disambiguation guidance for the mapper vs. gap-finder routing decision.
14 / 16
88%
Human Usability
Natural trigger examples cover the most common use cases; sample triggers include 'Do not jump to gaps yet — show me the evidence map' which is excellent for user guidance. Minor gap: no guidance on output depth calibration for quick vs. deep mapping requests.
7 / 8
88%
Security
Hard rules prohibit formal gap claims without separate analysis; no fabrication of papers or citations; Mode A presents no credential or injection risks.
12 / 12
100%
Maintainability
All 9 reference modules explicitly referenced in SKILL.md with per-step usage guidance; clean modular structure. Minor gap: no version tracking for reference modules.
11 / 12
92%
Agent-Specific
Offering entry-point suggestions without overstating them as formal gaps is an important and rare discipline; downstream routing to the next best skill is well implemented. Composability as an upstream skill is acknowledged, but no formal interface is documented and multi-topic handling is not defined.
15 / 20
75%
Core Capability Total: 86 / 100
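
The category figures above can be cross-checked with simple arithmetic. The sketch below assumes the total is a plain sum of the eight category scores and that percentages are rounded half-up to whole numbers; the `CORE_CAPABILITY` dictionary and `percent` helper are illustrative names, not part of the evaluation tooling.

```python
from decimal import Decimal, ROUND_HALF_UP

# (score, maximum) per category, copied from the scorecard above.
CORE_CAPABILITY = {
    "Functional Suitability": (11, 12),
    "Reliability": (9, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (7, 8),
    "Security": (12, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (15, 20),
}


def percent(score, maximum):
    """Whole-number percentage, rounding halves up (87.5 -> 88)."""
    ratio = Decimal(score) * 100 / Decimal(maximum)
    return int(ratio.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


core_total = sum(score for score, _ in CORE_CAPABILITY.values())
core_max = sum(maximum for _, maximum in CORE_CAPABILITY.values())
core_percentages = {
    name: percent(score, maximum)
    for name, (score, maximum) in CORE_CAPABILITY.items()
}
```

With these inputs the sum comes to 86 out of 100, and the per-category percentages match the ones listed (92, 75, 88, 88, 88, 100, 92, 75).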

Medical Task: Execution Average 82.4 / 100 | Assertions 29/33 Passed

85
Canonical
Map evidence landscape for immunotherapy response in triple-negative breast cancer
5/5
84
Variant A
Map sepsis immunometabolism evidence landscape
5/5
83
Edge
Overly broad topic (cancer immune evasion) requiring scope narrowing before mapping
4/5
85
Variant B
Map the evidence landscape for HCC immunotherapy resistance mechanisms
5/5
83
Stress
Multi-topic mapping request — 3 related topics simultaneously (KRAS mutant lung cancer, KRAS targeting, and synthetic lethality)
4/5
78
Scope Boundary
Request to identify formal research gaps and recommend a study design for HCC immunotherapy
3/4
79
Adversarial
Pressure to produce a full narrative literature review instead of an evidence map
3/4
85
Canonical: ✅ Pass
Map evidence landscape for immunotherapy response in triple-negative breast cancer

5/5 assertions passed. Field organized by major research streams; dense/crowded areas distinguished from thin areas; downstream routing recommendation given.

Basic 34/40|Specialized 51/60|Total 85/100
A1: Field organized by major research streams rather than individual paper recitation
A2: Dense/crowded areas distinguished from thin/underdeveloped areas in Section G
A3: Thin areas labeled as mapping observations, not formal high-value research gaps
A4: Entry-point suggestions provided without overstating as validated recommendations or study plans
A5: Downstream routing recommendation given in Section J pointing to the next appropriate workflow step
Pass rate: 5 / 5
84
Variant A: ✅ Pass
Map sepsis immunometabolism evidence landscape

5/5 assertions passed. All 9 output dimensions produced; evidence density not equated with certainty; no formal gap claims made.

Basic 34/40|Specialized 50/60|Total 84/100
A1: Population, endpoint, and method map dimensions all present in Sections D–F
A2: Evidence density and saturation section (Section G) present with density gradient
A3: Topic scope defined and stated before mapping begins in Section A
A4: Evidence density not equated with certainty (saturation ≠ validated evidence)
A5: No formal gap claims made without explicit note that a separate gap analysis step is needed
Pass rate: 5 / 5
83
Edge: ✅ Pass
Overly broad topic (cancer immune evasion) requiring scope narrowing before mapping

4/5 assertions passed. Topic narrowed before mapping; assumptions explicitly stated. Evidence density claims for the narrowed topic were presented without a training-knowledge caveat.

Basic 33/40|Specialized 50/60|Total 83/100
A1: Topic identified as too broad and scope narrowed before mapping begins
A2: Narrowing assumptions explicitly stated and attributed to skill interpretation
A3: Map organized around the narrowed scope, not the original broad topic
A4: Evidence density claims for the narrowed topic labeled as based on training knowledge rather than live retrieval
A5: Entry-point suggestions conservative relative to narrowed scope, not extrapolated from the broader topic
Pass rate: 4 / 5
85
Variant B: ✅ Pass
Map the evidence landscape for HCC immunotherapy resistance mechanisms

5/5 assertions passed. Research streams clustered by mechanism type; dense clinical literature vs. thin mechanistic validation area correctly distinguished.

Basic 34/40|Specialized 51/60|Total 85/100
A1: Evidence mapping frame defined before mapping begins with all 7 dimensions stated
A2: Field clustered into major research streams rather than individual paper list
A3: Dense clinical/epidemiological literature correctly distinguished from thin mechanistic validation areas
A4: Thin areas not converted into formal gap claims without separate gap-analysis step
A5: Downstream routing provided pointing to appropriate next skill for formal gap analysis
Pass rate: 5 / 5
83
Stress: ✅ Pass
Multi-topic mapping request — 3 related topics simultaneously (KRAS mutant lung cancer, KRAS targeting, and synthetic lethality)

4/5 assertions passed. Three topics addressed with separate maps; density claims present. Missing: explicit warning that multi-topic mapping produces lower per-topic depth than dedicated single-topic mapping.

Basic 33/40|Specialized 50/60|Total 83/100
A1: Three topics addressed separately with their own evidence map structure
A2: Evidence density claims labeled as based on training knowledge with appropriate caveats
A3: Downstream routing recommendation provided for each topic
A4: Skill explicitly notes that multi-topic mapping produces lower-depth maps per topic than a dedicated single-topic mapping session
A5: No formal gap claims made for any of the three topics without a separate gap analysis step
Pass rate: 4 / 5
78
Scope Boundary: ✅ Pass
Request to identify formal research gaps and recommend a study design for HCC immunotherapy

3/4 assertions passed. Scope redirect correctly issued for formal gap identification and protocol design; however, the skill made no offer to first produce the evidence map as a logical precursor.

Basic 32/40|Specialized 46/60|Total 78/100
A1: Scope redirect issued for formal gap identification and study design / protocol request
A2: No formal gap claims or study design recommendations made
A3: User correctly redirected to medical-research-gap-finder for formal gap analysis and to a protocol design skill for study planning
A4: Skill offers to first produce the evidence map, which is the in-scope precursor to formal gap analysis, before the user proceeds to the gap-finder
Pass rate: 3 / 4
79
Adversarial: ✅ Pass
Pressure to produce a full narrative literature review instead of an evidence map

3/4 assertions passed. Narrative review request declined; evidence map produced instead. Explanation of why mapping is more appropriate for the user's stated entry-point goal was too brief.

Basic 32/40|Specialized 47/60|Total 79/100
A1: Narrative review production declined in favor of structured evidence map
A2: Evidence map produced instead, maintaining the skill's structured mapping format rather than switching to prose narrative
A3: Explanation of why structured evidence mapping is more useful than a narrative review for the user's entry-point or pre-gap-analysis goal
A4: Downstream routing suggests an appropriate literature-reading skill if the user truly needs a narrative review
Pass rate: 3 / 4
Medical Task Total: 82.4 / 100
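
The Medical Task totals can be reproduced from the seven scenario rows. This is a minimal sketch assuming the execution average is an unweighted mean rounded to one decimal place and that each scenario total is Basic plus Specialized; the `MEDICAL_TASKS` tuple layout is illustrative only.

```python
# (scenario, basic, specialized, assertions passed, assertions total),
# copied from the per-scenario results above.
MEDICAL_TASKS = [
    ("Canonical", 34, 51, 5, 5),
    ("Variant A", 34, 50, 5, 5),
    ("Edge", 33, 50, 4, 5),
    ("Variant B", 34, 51, 5, 5),
    ("Stress", 33, 50, 4, 5),
    ("Scope Boundary", 32, 46, 3, 4),
    ("Adversarial", 32, 47, 3, 4),
]

# Per-scenario total: Basic (out of 40) plus Specialized (out of 60).
scenario_totals = {name: basic + spec for name, basic, spec, _, _ in MEDICAL_TASKS}

# Assumption: unweighted mean across scenarios, rounded to one decimal.
execution_average = round(sum(scenario_totals.values()) / len(scenario_totals), 1)

assertions_passed = sum(passed for *_, passed, _ in MEDICAL_TASKS)
assertions_total = sum(total for *_, total in MEDICAL_TASKS)
```

Under those assumptions the scenario totals sum to 577 across seven scenarios, giving the reported 82.4 average, and the assertion counts add up to 29 of 33.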

Key Strengths

  • Entry-point suggestion discipline (never overstated as formal research gaps) is an important and rare capability that prevents premature study commitment based on incomplete landscape mapping
  • 9-dimension mapping frame (streams, populations, endpoints, methods, density, thin areas, entry points, downstream routing) provides comprehensive evidence organization
  • Explicit downstream routing to the next best skill (medical-research-gap-finder, literature-reader, etc.) makes this skill an effective upstream entry in the evidence workflow
  • Hard Rule 1 (do not confuse mapping with gap-finding) is clearly enforced, and the full set of 10 hard rules prevents scope creep across all 7 workflow steps