
PubMed Search Specialist

AIPOCH-AI

Construct complex Boolean query strings for precise retrieval of PubMed/MEDLINE records.

FILES
pubmed-search-specialist/
  skill.md
  scripts/
    main.py
  references/
    boolean-examples.md
    mesh-structure.md

SKILL.md

PubMed Search Specialist

Expert tool for constructing sophisticated Boolean queries to search the PubMed/MEDLINE database with precision.

Core Capabilities

  • MeSH Term Mapping: Convert natural language concepts to standardized Medical Subject Headings
  • Boolean Query Builder: Construct complex nested queries with AND/OR/NOT operators
  • Advanced Filters: Apply study type, date, language, age, and species filters
  • Search Strategy Optimization: Refine sensitivity vs specificity trade-offs

Usage Workflow

1. Concept Extraction

Extract key concepts from the user's research question using the PICO framework:

  • Population/Problem
  • Intervention
  • Comparison
  • Outcome
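As a rough illustration, the extracted concepts can be held in a small container before any MeSH mapping happens. The `PICO` class and its field names are hypothetical (they are not taken from `scripts/main.py`), and treating Comparison and Outcome as optional is an assumption made for questions that do not state them.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class PICO:
    """Hypothetical container for the concepts of one research question."""

    population: str
    intervention: str
    comparison: Optional[str] = None  # assumption: not every question has one
    outcome: Optional[str] = None

    def concepts(self) -> Dict[str, str]:
        """Return only the concepts that were actually provided, in PICO order."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name)
        }


question = PICO(
    population="diabetic patients",
    intervention="aspirin",
    outcome="stroke",
)
print(question.concepts())
```

Each entry returned by `concepts()` then becomes one OR-group in the final query.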

2. MeSH Term Mapping

For each concept, identify appropriate MeSH terms:

  • Preferred terms (mapped to MeSH hierarchy)
  • Entry terms (synonyms mapped to preferred)
  • Subheadings for precision
  • Explode vs Focus options

3. Boolean Construction

Build query strings following PubMed syntax:

("Term"[MeSH Terms] OR "Term"[Title/Abstract] OR synonym[Title/Abstract])
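The pattern above can be generated mechanically. The following is a minimal sketch, assuming a hypothetical helper named `concept_block` that takes already-verified MeSH headings and free-text synonyms; it does not look anything up in MeSH.

```python
from typing import Iterable


def quote(term: str) -> str:
    """Wrap a term in double quotes; truncated terms (ending in *) are left bare."""
    return term if term.endswith("*") else f'"{term}"'


def concept_block(mesh_terms: Iterable[str], free_text_terms: Iterable[str]) -> str:
    """Join MeSH headings and free-text synonyms for ONE concept with OR."""
    parts = [f"{quote(t)}[MeSH Terms]" for t in mesh_terms]
    parts += [f"{quote(t)}[Title/Abstract]" for t in free_text_terms]
    if not parts:
        raise ValueError("a concept needs at least one term")
    # Parentheses keep the OR-group intact when it is later combined with AND.
    return "(" + " OR ".join(parts) + ")"


block = concept_block(["Aspirin"], ["Acetylsalicylic Acid", "aspirin"])
print(block)
```

Keeping each concept in its own parenthesized group is what allows the groups to be combined with AND later without changing their meaning.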

4. Filter Application

Append filters as needed:

  • Publication dates: 2020:2024[Publication Date] (from 2020 to 2024)
  • Article types: Clinical Trial, Review, Meta-Analysis
  • Species: humans[MeSH Terms] or animals[MeSH Terms]
  • Languages: english[Language]
  • Age groups: adult[MeSH Terms], aged[MeSH Terms]
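Filters are simply further clauses joined to the core query with AND. The sketch below assumes a hypothetical `apply_filters` helper; the filter strings themselves are the ones listed in this document, and the function does not validate them against PubMed.

```python
from typing import Optional, Sequence, Tuple


def apply_filters(
    query: str,
    year_range: Optional[Tuple[int, int]] = None,
    extra_filters: Sequence[str] = (),
) -> str:
    """Append date and other filters to a query using AND."""
    # Wrap the core query so its internal OR operators stay grouped.
    clauses = [f"({query})"]
    if year_range is not None:
        start, end = year_range
        if start > end:
            raise ValueError("start year must not be after end year")
        clauses.append(f"{start}:{end}[Publication Date]")
    clauses.extend(extra_filters)
    return " AND ".join(clauses)


filtered = apply_filters(
    '"Aspirin"[MeSH Terms] OR "aspirin"[Title/Abstract]',
    year_range=(2020, 2024),
    extra_filters=["humans[MeSH Terms]", "english[Language]"],
)
print(filtered)
```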

5. Search Strategy Output

Provide a complete, copy-paste-ready PubMed search string with:

  • Line-by-line breakdown
  • Estimated result count guidance
  • Alternative strategies for sensitivity/specificity balance
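One possible way to offer alternative strategies is to emit a broad and a narrow variant of the same concept. This is a sketch under assumptions: the `strategy_variants` name is hypothetical, and using `[MeSH Major Topic]` plus `[Title]` for the narrow variant is just one reasonable choice, not a prescribed method.

```python
from typing import Dict, Sequence


def strategy_variants(mesh_heading: str, synonyms: Sequence[str]) -> Dict[str, str]:
    """Build a sensitive (broad) and a specific (narrow) block for one concept."""
    quoted = [f'"{s}"' for s in synonyms]
    # Sensitive: any MeSH indexing, synonyms anywhere in title or abstract.
    sensitive = [f'"{mesh_heading}"[MeSH Terms]'] + [f"{q}[Title/Abstract]" for q in quoted]
    # Specific: heading must be a major topic, synonyms must be in the title.
    specific = [f'"{mesh_heading}"[MeSH Major Topic]'] + [f"{q}[Title]" for q in quoted]
    return {
        "sensitive": "(" + " OR ".join(sensitive) + ")",
        "specific": "(" + " OR ".join(specific) + ")",
    }


variants = strategy_variants("Stroke", ["stroke"])
for name, block in variants.items():
    print(f"{name}: {block}")
```

Running both variants in PubMed and comparing the result counts gives a concrete basis for the result count guidance.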

Key MeSH Features

| Feature | Syntax | Use Case |
| --- | --- | --- |
| MeSH Terms | "Diabetes Mellitus"[MeSH Terms] | Subject heading search (narrower terms are included by default) |
| MeSH Major Topic | "Diabetes Mellitus"[MeSH Major Topic] | Core focus articles |
| No Explode | "Diabetes Mellitus"[MeSH Terms:noexp] | Exclude subcategories (narrower terms) |
| Subheadings | "Diabetes Mellitus/drug therapy"[MeSH Terms] | Specific aspects |
| Entry Terms | "Blood Sugar"[Title/Abstract] | Free-text synonyms |
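The tag variants in this table differ only in the field name, an optional `/subheading`, and an optional `:noexp` suffix, so they can be produced by one small formatter. `mesh_tag` is a hypothetical helper; combining Major Topic with `:noexp` follows the same pattern but should be checked in PubMed before relying on it.

```python
from typing import Optional


def mesh_tag(
    heading: str,
    subheading: Optional[str] = None,
    major: bool = False,
    explode: bool = True,
) -> str:
    """Format one MeSH search term with optional subheading, major topic, noexp."""
    term = f"{heading}/{subheading}" if subheading else heading
    field = "MeSH Major Topic" if major else "MeSH Terms"
    if not explode:
        # PubMed explodes MeSH terms by default; :noexp turns that off.
        field += ":noexp"
    return f'"{term}"[{field}]'


print(mesh_tag("Diabetes Mellitus"))
print(mesh_tag("Diabetes Mellitus", major=True))
print(mesh_tag("Diabetes Mellitus", explode=False))
print(mesh_tag("Diabetes Mellitus", subheading="drug therapy"))
```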

Boolean Operators

  • AND: Both terms must appear (narrows search)
  • OR: Either term may appear (broadens search)
  • NOT: Exclude terms (use sparingly)

Operator Precedence: PubMed evaluates Boolean operators from left to right, so use parentheses to group terms and control evaluation order.

Field Tags Reference

| Tag | Field | Example |
| --- | --- | --- |
| [MeSH Terms] | Medical Subject Headings | "Hypertension"[MeSH Terms] |
| [Title] | Article title only | "stroke"[Title] |
| [Title/Abstract] | Title and abstract | "aspirin"[Title/Abstract] |
| [Author] | Author name | "Smith J"[Author] |
| [Journal] | Journal name | "Lancet"[Journal] |
| [Publication Date] | Date range | 2020:2024[Publication Date] |
| [Language] | Article language | english[Language] |
| [Publication Type] | Article type | clinical trial[Publication Type] |

Clinical Query Filters

Therapy

(randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract]))

Diagnosis

("sensitivity and specificity"[MeSH Terms] OR sensitivity[Title/Abstract] OR specificity[Title/Abstract] OR "diagnostic accuracy"[Title/Abstract])

Prognosis

(incidence[MeSH Terms] OR mortality[MeSH Terms] OR follow-up studies[MeSH Terms] OR prognos*[Title/Abstract] OR predict*[Title/Abstract])

Etiology

(risk[MeSH Terms] OR (risk factors[MeSH Terms]) OR (risk[Title/Abstract] AND factor*[Title/Abstract]))
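The four clinical query filters can be kept in a lookup table and attached by category. This is a sketch only: `CLINICAL_FILTERS` and `with_clinical_filter` are hypothetical names, the filter strings mirror the ones in this section, and multi-word phrases are quoted here so that a word such as "and" inside a heading cannot be read as an operator.

```python
CLINICAL_FILTERS = {
    "therapy": (
        "(randomized controlled trial[Publication Type] OR "
        "(randomized[Title/Abstract] AND controlled[Title/Abstract] "
        "AND trial[Title/Abstract]))"
    ),
    "diagnosis": (
        '("sensitivity and specificity"[MeSH Terms] OR sensitivity[Title/Abstract] '
        'OR specificity[Title/Abstract] OR "diagnostic accuracy"[Title/Abstract])'
    ),
    "prognosis": (
        '(incidence[MeSH Terms] OR mortality[MeSH Terms] OR "follow-up studies"[MeSH Terms] '
        "OR prognos*[Title/Abstract] OR predict*[Title/Abstract])"
    ),
    "etiology": (
        '(risk[MeSH Terms] OR "risk factors"[MeSH Terms] '
        "OR (risk[Title/Abstract] AND factor*[Title/Abstract]))"
    ),
}


def with_clinical_filter(query: str, category: str) -> str:
    """AND a clinical query filter onto an existing query."""
    key = category.strip().lower()
    if key not in CLINICAL_FILTERS:
        valid = ", ".join(sorted(CLINICAL_FILTERS))
        raise ValueError(f"unknown category {category!r}; expected one of: {valid}")
    return f"({query}) AND {CLINICAL_FILTERS[key]}"


print(with_clinical_filter('"Aspirin"[MeSH Terms]', "Therapy"))
```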

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --population | str | Required | Population/Problem |
| --intervention | str | Required | Intervention |
| --comparison | str | Required | Comparison |
| --outcome | str | Required | Outcome |
| --study-type | str | Required | Clinical query category |
| --format | str | 'lines' | Output format |
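For orientation, a command-line interface with these parameters could be declared with `argparse` roughly as follows. This is not the actual contents of `scripts/main.py`, which is not shown here; the sample argument values are made up, and the accepted values for `--study-type` and `--format` are left unrestricted because this document does not enumerate them.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; all PICO flags and --study-type are required."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Build a PubMed Boolean query from PICO concepts.",
    )
    required = [
        ("population", "Population/Problem"),
        ("intervention", "Intervention"),
        ("comparison", "Comparison"),
        ("outcome", "Outcome"),
        ("study-type", "Clinical query category"),
    ]
    for name, help_text in required:
        parser.add_argument(f"--{name}", type=str, required=True, help=help_text)
    parser.add_argument("--format", type=str, default="lines", help="Output format")
    return parser


# Parse a made-up argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--population", "diabetic patients",
    "--intervention", "aspirin",
    "--comparison", "placebo",
    "--outcome", "stroke",
    "--study-type", "therapy",
])
print(args.population, args.study_type, args.format)
```

Note that `argparse` exposes `--study-type` as the attribute `study_type`.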

Example: Complete Search Strategy

Research Question: Does aspirin reduce stroke risk in diabetic patients?

Line 1 - Population:

("Diabetes Mellitus"[MeSH Terms] OR "Diabetic"[Title/Abstract] OR "Diabetics"[Title/Abstract])

Line 2 - Intervention:

("Aspirin"[MeSH Terms] OR "Acetylsalicylic Acid"[Title/Abstract] OR "aspirin"[Title/Abstract])

Line 3 - Outcome:

("Stroke"[MeSH Terms] OR "Cerebrovascular Accident"[Title/Abstract] OR "stroke"[Title/Abstract] OR "cerebrovascular"[Title/Abstract])

Line 4 - Study Type Filter:

(randomized controlled trial[Publication Type] OR systematic review[Publication Type] OR meta-analysis[Publication Type])

Final Query:

(("Diabetes Mellitus"[MeSH Terms] OR "Diabetic"[Title/Abstract] OR "Diabetics"[Title/Abstract]) AND ("Aspirin"[MeSH Terms] OR "Acetylsalicylic Acid"[Title/Abstract] OR "aspirin"[Title/Abstract]) AND ("Stroke"[MeSH Terms] OR "Cerebrovascular Accident"[Title/Abstract] OR "stroke"[Title/Abstract] OR "cerebrovascular"[Title/Abstract]) AND (randomized controlled trial[Publication Type] OR systematic review[Publication Type] OR meta-analysis[Publication Type]))
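The final query is the four lines joined with AND and wrapped in one outer pair of parentheses. A minimal sketch of that assembly step, using a hypothetical `combine` helper and the line strings from this example:

```python
from typing import Sequence


def combine(blocks: Sequence[str]) -> str:
    """AND together concept blocks that are each already parenthesized."""
    cleaned = [b.strip() for b in blocks if b and b.strip()]
    if not cleaned:
        raise ValueError("at least one block is required")
    return "(" + " AND ".join(cleaned) + ")"


lines = [
    '("Diabetes Mellitus"[MeSH Terms] OR "Diabetic"[Title/Abstract] OR "Diabetics"[Title/Abstract])',
    '("Aspirin"[MeSH Terms] OR "Acetylsalicylic Acid"[Title/Abstract] OR "aspirin"[Title/Abstract])',
    '("Stroke"[MeSH Terms] OR "Cerebrovascular Accident"[Title/Abstract] OR "stroke"[Title/Abstract] '
    'OR "cerebrovascular"[Title/Abstract])',
    "(randomized controlled trial[Publication Type] OR systematic review[Publication Type] "
    "OR meta-analysis[Publication Type])",
]

final_query = combine(lines)
print(final_query)
```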

MeSH Browser Usage

When mapping terms:

  1. Check MeSH Browser for exact term hierarchy
  2. Note tree numbers for related terms
  3. Identify entry terms (synonyms)
  4. Consider subheadings for precision
  5. Decide on explode vs noexp based on scope needs

Quality Checklist

Before finalizing the query:

  • All concepts covered, with OR within groups and AND between groups
  • MeSH terms verified against current MeSH database
  • Free-text synonyms included for completeness
  • Filters appropriate for research question
  • Parentheses balanced and precedence correct
  • Copy-paste ready for PubMed search box
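The parentheses item on this checklist is easy to automate. The sketch below assumes a hypothetical `parentheses_balanced` function; it ignores parentheses inside double-quoted phrases and also reports an unclosed quote as a failure. It checks structure only, not whether the grouping expresses the intended logic.

```python
def parentheses_balanced(query: str) -> bool:
    """Return True if parentheses outside quoted phrases are balanced."""
    depth = 0
    in_quotes = False
    for ch in query:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:  # a closing parenthesis with nothing to close
                    return False
    return depth == 0 and not in_quotes


print(parentheses_balanced('("Aspirin"[MeSH Terms] OR aspirin[Title/Abstract])'))
print(parentheses_balanced('("Aspirin"[MeSH Terms] OR aspirin[Title/Abstract]'))
```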

Technical Difficulty

🔴 High - Requires understanding of:

  • MeSH hierarchical structure and term relationships
  • Boolean logic and operator precedence
  • Field tag semantics and limitations
  • Search sensitivity vs specificity trade-offs
  • Clinical query methodology

⚠️ Verification Required: MeSH terms change annually. Always verify current MeSH version at https://meshb.nlm.nih.gov/

References

See references/mesh-structure.md for detailed MeSH hierarchy guidance. See references/boolean-examples.md for categorized query templates.

Risk Assessment

| Risk Indicator | Assessment | Level |
| --- | --- | --- |
| Code Execution | Python scripts executed locally | Medium |
| Network Access | PubMed E-utilities API calls | High |
| File System Access | Read/write search strategies | Low |
| Instruction Tampering | Query construction guidelines | Low |
| Data Exposure | Search terms logged locally | Low |

Security Checklist

  • No hardcoded credentials or API keys
  • NCBI API requests use HTTPS only
  • API rate limits respected (max 3 requests/second without API key)
  • Input validation for search terms (injection prevention)
  • Output directory restricted to workspace
  • Error messages sanitized (no internal paths exposed)
  • API timeout and retry mechanisms implemented
  • No exposure of internal service architecture
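Several of these items (HTTPS only, no hardcoded key, rate limiting) can be illustrated without making a network call. The sketch below builds an ESearch request URL for the NCBI E-utilities endpoint and spaces requests with a simple throttle. The `esearch_url` and `Throttle` names are hypothetical, and the limit of 10 requests/second with an API key reflects NCBI's published policy rather than anything stated in this document.

```python
import os
import time
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str, api_key: Optional[str] = None, retmax: int = 20) -> str:
    """Build an HTTPS ESearch URL; urlencode escapes brackets, quotes, slashes."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


class Throttle:
    """Sleep as needed so calls to wait() are at least 1/max_per_second apart."""

    def __init__(self, max_per_second: float) -> None:
        self.min_interval = 1.0 / max_per_second
        self._last: Optional[float] = None

    def wait(self) -> None:
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


# The key comes from the environment, never from source code.
api_key = os.environ.get("NCBI_API_KEY")
throttle = Throttle(10 if api_key else 3)

throttle.wait()
url = esearch_url("aspirin[Title/Abstract]", api_key=api_key)
# A real client would now fetch `url` with an explicit timeout, for example
# urllib.request.urlopen(url, timeout=10), and retry on transient failures.
print(url.split("?")[0])
```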

Prerequisites

# Python dependencies
pip install -r requirements.txt

# Optional: NCBI API key for higher rate limits
# Set as environment variable: NCBI_API_KEY

Evaluation Criteria

Success Metrics

  • Successfully constructs valid PubMed Boolean queries
  • MeSH term mapping is accurate and current
  • Query syntax is copy-paste ready for PubMed
  • Provides sensitivity/specificity trade-off options
  • Handles complex multi-concept research questions
  • Estimated result counts are reasonable

Test Cases

  1. Basic Query: "diabetes treatment" → Valid MeSH-based query
  2. PICO Framework: Complex clinical question → Complete search strategy
  3. MeSH Mapping: Free-text term → Correct MeSH term identification
  4. Boolean Logic: Multiple concepts → Properly nested AND/OR/NOT
  5. Clinical Query: Therapy-focused question → Includes appropriate filters
  6. API Integration: Execute search via E-utilities → Successful retrieval
  7. Error Handling: Invalid search term → Graceful error with suggestions

Lifecycle Status

  • Current Stage: Draft
  • Next Review Date: 2026-03-06
  • Known Issues:
    • MeSH terms are updated annually and may need periodic validation
    • API requests are rate limited to 3 per second without a key
  • Planned Improvements:
    • Integration with NCBI API key support for higher rate limits
    • Automatic MeSH term validation against current database
    • Support for additional databases (Embase, Cochrane)