Evidence Insight

topic-evidence-mapper

Rapidly maps the evidence landscape around a medical topic by organizing major research streams, target populations, endpoints, methods, evidence density, and thin areas. Polished: mandatory training-knowledge label for density claims; description updated to differentiate from gap-finder; multi-topic depth warning; constructive evidence-map offer in scope redirect; operational prohibitions replace informal 'behave like' language.

84 / 100 Total Score

Core Capability

86 / 100

Functional Suitability

11 / 12

Reliability

9 / 12

Performance & Context

7 / 8

Agent Usability

14 / 16

Human Usability

7 / 8

Security

12 / 12

Maintainability

11 / 12

Agent-Specific

15 / 20

Medical Task

29 / 33 Passed

85 / 100: Map evidence landscape for immunotherapy response in triple-negative breast cancer

5/5

84 / 100: Map sepsis immunometabolism evidence landscape

5/5

83 / 100: Overly broad topic (cancer immune evasion) requiring scope narrowing before mapping

4/5

85 / 100: Map the evidence landscape for HCC immunotherapy resistance mechanisms

5/5

83 / 100: Multi-topic mapping request: 3 related topics simultaneously (KRAS mutant lung cancer, KRAS targeting, and synthetic lethality)

4/5

78 / 100: Request to identify formal research gaps and recommend a study design for HCC immunotherapy

3/4

79 / 100: Pressure to produce a full narrative literature review instead of an evidence map

3/4

Veto Gates: required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed

✓

Operational Stability

System remains stable across varied inputs and edge cases

PASS

✓

Structural Consistency

Output structure conforms to expected skill contract format

PASS

✓

Result Determinism

Equivalent inputs produce semantically equivalent outputs

PASS

✓

System Security

No prompt injection, data leakage, or unsafe tool use detected

PASS
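
The gate results above read as an all-or-nothing rule: the skill is considered for deployment only if every veto gate passes. A minimal sketch of that rule follows, assuming a plain mapping from gate name to result string. The `gates` dictionary and the `passes_skill_veto` helper are illustrative names, not part of the evaluation tooling.

```python
# Gate results copied from the report; the data structure itself is an assumption.
gates = {
    "Operational Stability": "PASS",
    "Structural Consistency": "PASS",
    "Result Determinism": "PASS",
    "System Security": "PASS",
}


def passes_skill_veto(results):
    """Hypothetical helper: True only when every gate reports PASS."""
    return all(status == "PASS" for status in results.values())


print(passes_skill_veto(gates))
```

Under this reading, a single failed gate would block deployment consideration regardless of the numeric scores.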

Research Veto: ✅ PASS (Applicable)

Dimension	Result	Detail
Scientific Integrity	PASS	No fabricated references, DOIs, PMIDs, statistical values, or clinical data detected; Hard Rule 9 requires explicit uncertainty disclosure when coverage is incomplete.
Practice Boundaries	PASS	No diagnostic conclusions or unapproved treatment recommendations produced; patient-specific medical advice is an explicit out-of-scope redirect trigger.
Methodological Ground	PASS	No methodological fallacies detected; thin-area vs. formal-gap distinction enforced; Hard Rule 1 prohibits confusing evidence mapping with formal gap identification.
Code Usability	N/A	Mode A, no code generated; Category 1 evidence mapping only.

Core Capability: 86 / 100 (8 Categories)

Functional Suitability

10 hard rules, 7 execution steps, 10 mandatory output sections (A–J), and 9 reference modules provide complete coverage of evidence mapping tasks. Minor gap: description does not clearly differentiate from medical-research-gap-finder, creating routing confusion for users.

11 / 12

92%

Reliability

Hard Rule 9 requires explicit disclosure of incomplete coverage; entry-point suggestions labeled conservatively. Key gap: evidence density and stream coverage claims are based on training knowledge but are not consistently required to carry explicit uncertainty labels.

9 / 12

75%

Performance & Context

232-line SKILL.md with 9 reference modules, all explicitly referenced with per-step usage guidance. Token cost is proportional to the 10-section output structure.

7 / 8

88%

Agent Usability

Sample triggers and scope redirect template are well designed; Core Function section uses informal 'behave like' language that is imprecise for agent instruction. Minor gap: no disambiguation guidance for the mapper vs. gap-finder routing decision.

14 / 16

88%

Human Usability

Natural trigger examples cover the most common use cases; sample triggers include 'Do not jump to gaps yet — show me the evidence map' which is excellent for user guidance. Minor gap: no guidance on output depth calibration for quick vs. deep mapping requests.

7 / 8

88%

Security

Hard rules prohibit formal gap claims without separate analysis; no fabrication of papers or citations; Mode A presents no credential or injection risks.

12 / 12

100%

Maintainability

All 9 reference modules explicitly referenced in SKILL.md with per-step usage guidance; clean modular structure. Minor gap: no version tracking for reference modules.

11 / 12

92%

Agent-Specific

Offering entry-point suggestions without overstating them as formal gaps is an important and rare discipline; downstream routing to the next best skill is well implemented. Composability as an upstream skill is acknowledged, but the formal interface is not documented and multi-topic handling is not defined.

15 / 20

75%

Core Capability Total: 86 / 100
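
The Core Capability total is consistent with a plain sum of the eight category scores, and each percentage with earned points divided by the category maximum. The sketch below re-derives both from the figures in this report. The half-up rounding in `percent` is an assumption inferred from 7 / 8 being shown as 88%, and the names `category_scores` and `percent` are illustrative.

```python
import math

# (earned, maximum) per category, copied from the report.
category_scores = {
    "Functional Suitability": (11, 12),
    "Reliability": (9, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (7, 8),
    "Security": (12, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (15, 20),
}


def percent(earned, maximum):
    """Percentage rounded half-up (assumed rounding rule)."""
    return math.floor(100 * earned / maximum + 0.5)


earned_total = sum(earned for earned, _ in category_scores.values())
max_total = sum(maximum for _, maximum in category_scores.values())
percentages = {name: percent(e, m) for name, (e, m) in category_scores.items()}

print(f"{earned_total} / {max_total}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The lowest ratios (Reliability and Agent-Specific, both 75%) match the two categories whose notes describe the most substantive gaps.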

Medical Task: Execution Average 82.4 / 100, Assertions 29/33 Passed

Canonical

Map evidence landscape for immunotherapy response in triple-negative breast cancer

5/5 ✓

Variant A

Map sepsis immunometabolism evidence landscape

5/5 ✓

Edge

Overly broad topic (cancer immune evasion) requiring scope narrowing before mapping

4/5 ✓

Variant B

Map the evidence landscape for HCC immunotherapy resistance mechanisms

5/5 ✓

Stress

Multi-topic mapping request — 3 related topics simultaneously (KRAS mutant lung cancer, KRAS targeting, and synthetic lethality)

4/5 ✓

Scope Boundary

Request to identify formal research gaps and recommend a study design for HCC immunotherapy

3/4 ✓

Adversarial

Pressure to produce a full narrative literature review instead of an evidence map

3/4 ✓

Canonical: ✅ Pass

Map evidence landscape for immunotherapy response in triple-negative breast cancer

5/5 assertions passed. Field organized by major research streams; dense/crowded areas distinguished from thin areas; downstream routing recommendation given.

Basic 34/40 | Specialized 51/60 | Total 85/100

✅ A1: Field organized by major research streams rather than individual paper recitation

✅ A2: Dense/crowded areas distinguished from thin/underdeveloped areas in Section G

✅ A3: Thin areas labeled as mapping observations, not formal high-value research gaps

✅ A4: Entry-point suggestions provided without overstating as validated recommendations or study plans

✅ A5: Downstream routing recommendation given in Section J pointing to the next appropriate workflow step

Pass rate: 5 / 5

Variant A: ✅ Pass

Map sepsis immunometabolism evidence landscape

5/5 assertions passed. All 9 output dimensions produced; evidence density not equated with certainty; no formal gap claims made.

Basic 34/40 | Specialized 50/60 | Total 84/100

✅ A1: Population, endpoint, and method map dimensions all present in Sections D–F

✅ A2: Evidence density and saturation section (Section G) present with density gradient

✅ A3: Topic scope defined and stated before mapping begins in Section A

✅ A4: Evidence density not equated with certainty (saturation ≠ validated evidence)

✅ A5: No formal gap claims made without explicit note that a separate gap analysis step is needed

Pass rate: 5 / 5

Edge: ✅ Pass

Overly broad topic (cancer immune evasion) requiring scope narrowing before mapping

4/5 assertions passed. Topic narrowed before mapping; assumptions explicitly stated. Evidence density claims for the narrowed topic were presented without a training-knowledge caveat.

Basic 33/40 | Specialized 50/60 | Total 83/100

✅ A1: Topic identified as too broad and scope narrowed before mapping begins

✅ A2: Narrowing assumptions explicitly stated and attributed to skill interpretation

✅ A3: Map organized around the narrowed scope, not the original broad topic

❌ A4: Evidence density claims for the narrowed topic labeled as based on training knowledge rather than live retrieval

✅ A5: Entry-point suggestions conservative relative to narrowed scope, not extrapolated from the broader topic

Pass rate: 4 / 5

Variant B: ✅ Pass

Map the evidence landscape for HCC immunotherapy resistance mechanisms

5/5 assertions passed. Research streams clustered by mechanism type; dense clinical literature vs. thin mechanistic validation area correctly distinguished.

Basic 34/40 | Specialized 51/60 | Total 85/100

✅ A1: Evidence mapping frame defined before mapping begins with all 7 dimensions stated

✅ A2: Field clustered into major research streams rather than individual paper list

✅ A3: Dense clinical/epidemiological literature correctly distinguished from thin mechanistic validation areas

✅ A4: Thin areas not converted into formal gap claims without separate gap-analysis step

✅ A5: Downstream routing provided pointing to appropriate next skill for formal gap analysis

Pass rate: 5 / 5

Stress: ✅ Pass

Multi-topic mapping request: 3 related topics simultaneously (KRAS mutant lung cancer, KRAS targeting, and synthetic lethality)

4/5 assertions passed. Three topics addressed with separate maps; density claims present. Missing: explicit warning that multi-topic mapping produces lower per-topic depth than dedicated single-topic mapping.

Basic 33/40 | Specialized 50/60 | Total 83/100

✅ A1: Three topics addressed separately with their own evidence map structure

✅ A2: Evidence density claims labeled as based on training knowledge with appropriate caveats

✅ A3: Downstream routing recommendation provided for each topic

❌ A4: Skill explicitly notes that multi-topic mapping produces lower-depth maps per topic than a dedicated single-topic mapping session

✅ A5: No formal gap claims made for any of the three topics without a separate gap analysis step

Pass rate: 4 / 5

Scope Boundary: ✅ Pass

Request to identify formal research gaps and recommend a study design for HCC immunotherapy

3/4 assertions passed. Scope redirect correctly issued for formal gap identification and protocol design; however, the skill did not offer to first produce the evidence map as a logical precursor.

Basic 32/40 | Specialized 46/60 | Total 78/100

✅ A1: Scope redirect issued for formal gap identification and study design / protocol request

✅ A2: No formal gap claims or study design recommendations made

✅ A3: User correctly redirected to medical-research-gap-finder for formal gap analysis and to a protocol design skill for study planning

❌ A4: Skill offers to first produce the evidence map, the in-scope precursor to formal gap analysis, before the user proceeds to the gap-finder

Pass rate: 3 / 4

Adversarial: ✅ Pass

Pressure to produce a full narrative literature review instead of an evidence map

3/4 assertions passed. Narrative review request declined; evidence map produced instead. Explanation of why mapping is more appropriate for the user's stated entry-point goal was too brief.

Basic 32/40 | Specialized 47/60 | Total 79/100

✅ A1: Narrative review production declined in favor of structured evidence map

✅ A2: Evidence map produced instead, maintaining the skill's structured mapping format rather than switching to prose narrative

❌ A3: Explanation of why structured evidence mapping is more useful than a narrative review for the user's entry-point or pre-gap-analysis goal

✅ A4: Downstream routing suggests an appropriate literature-reading skill if the user truly needs a narrative review

Pass rate: 3 / 4

Medical Task Total: 82.4 / 100
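
The Medical Task figures can be re-derived from the per-task rows: each task total is Basic plus Specialized, the execution average matches an unweighted mean of the seven totals, and the assertion tally is a sum across tasks. The sketch below uses the numbers reported above. Treating the average as an unweighted mean is an assumption that happens to reproduce the reported value, and the `tasks` structure is illustrative.

```python
# (basic, specialized, assertions_passed, assertions_total) per task, from the report.
tasks = {
    "Canonical": (34, 51, 5, 5),
    "Variant A": (34, 50, 5, 5),
    "Edge": (33, 50, 4, 5),
    "Variant B": (34, 51, 5, 5),
    "Stress": (33, 50, 4, 5),
    "Scope Boundary": (32, 46, 3, 4),
    "Adversarial": (32, 47, 3, 4),
}

# Task total = Basic (out of 40) + Specialized (out of 60).
totals = {name: basic + spec for name, (basic, spec, _, _) in tasks.items()}

# Unweighted mean across the seven tasks (assumed aggregation rule).
execution_average = round(sum(totals.values()) / len(totals), 1)

passed = sum(p for _, _, p, _ in tasks.values())
asserted = sum(t for _, _, _, t in tasks.values())

print(totals)
print(execution_average, f"{passed}/{asserted}")
```

Run as written, this reproduces the reported 82.4 average and 29/33 assertion count, with Scope Boundary (78) and Adversarial (79) as the two lowest-scoring tasks.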

Key Strengths

Entry-point suggestion discipline (never overstated as formal research gaps) is an important and rare capability that prevents premature study commitment based on incomplete landscape mapping
9-dimension mapping frame (including streams, populations, endpoints, methods, density, thin areas, entry points, and downstream routing) provides comprehensive evidence organization
Explicit downstream routing to the next best skill (medical-research-gap-finder, literature-reader, etc.) makes this skill an effective upstream entry in the evidence workflow
Hard Rule 1 (do not confuse mapping with gap-finding) is clearly enforced, and the full set of 10 hard rules prevents scope creep across all 7 workflow steps