PubMed Search Specialist
Construct complex Boolean query strings for precise retrieval from PubMed/MEDLINE.
Expert tool for constructing sophisticated Boolean queries that search the PubMed/MEDLINE database with precision.
Core Capabilities
- MeSH Term Mapping: Convert natural language concepts to standardized Medical Subject Headings
- Boolean Query Builder: Construct complex nested queries with AND/OR/NOT operators
- Advanced Filters: Apply study type, date, language, age, and species filters
- Search Strategy Optimization: Refine sensitivity vs specificity trade-offs
Usage Workflow
1. Concept Extraction
Extract key concepts from the user's research question using the PICO framework:
- Population/Problem
- Intervention
- Comparison
- Outcome
2. MeSH Term Mapping
For each concept, identify appropriate MeSH terms:
- Preferred terms (mapped to MeSH hierarchy)
- Entry terms (synonyms mapped to preferred)
- Subheadings for precision
- Explode vs Focus options
3. Boolean Construction
Build query strings following PubMed syntax:
("Term"[MeSH Terms] OR "Term"[Title/Abstract] OR synonym[Title/Abstract])
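As a minimal sketch of this pattern in Python: the helper name `concept_block` and its parameters are hypothetical and not part of the skill's scripts. It ORs one MeSH heading with its free-text synonyms and wraps the group in parentheses.

```python
def concept_block(mesh_term, synonyms=()):
    """Build one OR-group for a concept (hypothetical helper, for illustration)."""
    # Start with the controlled-vocabulary heading.
    parts = [f'"{mesh_term}"[MeSH Terms]']
    # Add each synonym as a quoted phrase searched in titles and abstracts.
    parts += [f'"{s}"[Title/Abstract]' for s in synonyms]
    # Parentheses keep the OR-group intact when it is later combined with AND.
    return "(" + " OR ".join(parts) + ")"


print(concept_block("Aspirin", ["Acetylsalicylic Acid", "aspirin"]))
```

Keeping one group per concept makes it straightforward to join the groups with AND in a later step.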
4. Filter Application
Append filters as needed:
- Publication dates: from 2020 to 2024
- Article types: Clinical Trial, Review, Meta-Analysis
- Species: humans[MeSH Terms] or animals[MeSH Terms]
- Languages: english[Language]
- Age groups: adult[MeSH Terms], aged[MeSH Terms]
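One way to append such filters programmatically is sketched below. The function `apply_filters` and its keyword arguments are assumptions made for illustration; only the field tags come from this document.

```python
def apply_filters(query, start_year=None, end_year=None, language=None, humans_only=False):
    """AND optional filter clauses onto a base query (hypothetical helper)."""
    # Wrap the base query so its internal OR logic is evaluated first.
    clauses = [f"({query})"]
    if start_year and end_year:
        # Date ranges use the start:end form shown in the field tag reference.
        clauses.append(f"{start_year}:{end_year}[Publication Date]")
    if language:
        clauses.append(f"{language}[Language]")
    if humans_only:
        clauses.append("humans[MeSH Terms]")
    return " AND ".join(clauses)


print(apply_filters('"Stroke"[MeSH Terms]', 2020, 2024, "english", humans_only=True))
```

Each filter narrows the result set, so a sensitivity-oriented strategy would typically apply fewer of them.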
5. Search Strategy Output
Provide a complete, copy-paste-ready PubMed search string with:
- Line-by-line breakdown
- Estimated result count guidance
- Alternative strategies for sensitivity/specificity balance
Key MeSH Features
| Feature | Syntax | Use Case |
|---|---|---|
| MeSH Terms | "Diabetes Mellitus"[MeSH Terms] | Subject heading search |
| MeSH Major Topic | "Diabetes Mellitus"[MeSH Major Topic] | Core focus articles |
| No Explode | "Diabetes Mellitus"[MeSH Terms:noexp] | Exclude subcategories (narrower terms) |
| Subheadings | "Diabetes Mellitus/drug therapy"[MeSH Terms] | Specific aspects |
| Entry Terms | "Blood Sugar"[Title/Abstract] | Non-MeSH synonyms |
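The MeSH variants in the table can be generated from a few options. The following is a sketch only; `mesh_clause` and its parameter names are hypothetical.

```python
def mesh_clause(heading, subheading=None, major=False, explode=True):
    """Format a single MeSH clause (hypothetical helper, for illustration)."""
    # A subheading is attached to the heading with a slash.
    term = f"{heading}/{subheading}" if subheading else heading
    # Major Topic restricts results to articles where the heading is a main focus.
    tag = "MeSH Major Topic" if major else "MeSH Terms"
    # :noexp turns off automatic inclusion of narrower terms.
    if not explode:
        tag += ":noexp"
    return f'"{term}"[{tag}]'


print(mesh_clause("Diabetes Mellitus", explode=False))
print(mesh_clause("Diabetes Mellitus", subheading="drug therapy"))
```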
Boolean Operators
- AND: Both terms must appear (narrows search)
- OR: Either term may appear (broadens search)
- NOT: Exclude terms (use sparingly)
Operator Precedence: PubMed processes Boolean operators from left to right, so use parentheses to control evaluation order.
Field Tags Reference
| Tag | Field | Example |
|---|---|---|
| [MeSH Terms] | Medical Subject Headings | "Hypertension"[MeSH Terms] |
| [Title] | Article title only | "stroke"[Title] |
| [Title/Abstract] | Title and abstract | "aspirin"[Title/Abstract] |
| [Author] | Author name | "Smith J"[Author] |
| [Journal] | Journal name | "Lancet"[Journal] |
| [Publication Date] | Date range | 2020:2024[Publication Date] |
| [Language] | Article language | english[Language] |
| [Publication Type] | Article type | clinical trial[Publication Type] |
Clinical Query Filters
Therapy
(randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract]))
Diagnosis
("sensitivity and specificity"[MeSH Terms] OR sensitivity[Title/Abstract] OR specificity[Title/Abstract] OR diagnostic accuracy[Title/Abstract])
Prognosis
(incidence[MeSH Terms] OR mortality[MeSH Terms] OR follow-up studies[MeSH Terms] OR prognos*[Title/Abstract] OR predict*[Title/Abstract])
Etiology
(risk[MeSH Terms] OR (risk factors[MeSH Terms]) OR (risk[Title/Abstract] AND factor*[Title/Abstract]))
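A lookup table is one simple way to attach these filters to a topic query. The sketch below copies only the Therapy and Prognosis strings from above for brevity; the names `CLINICAL_FILTERS` and `with_clinical_filter` are hypothetical.

```python
# Filter strings copied from the Therapy and Prognosis entries above.
CLINICAL_FILTERS = {
    "therapy": "(randomized controlled trial[Publication Type] OR "
               "(randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract]))",
    "prognosis": "(incidence[MeSH Terms] OR mortality[MeSH Terms] OR follow-up studies[MeSH Terms] "
                 "OR prognos*[Title/Abstract] OR predict*[Title/Abstract])",
}


def with_clinical_filter(query, category):
    """AND a clinical query filter onto a topic query (hypothetical helper)."""
    key = category.lower()
    if key not in CLINICAL_FILTERS:
        known = ", ".join(sorted(CLINICAL_FILTERS))
        raise ValueError(f"Unknown category {category!r}; expected one of: {known}")
    return f"({query}) AND {CLINICAL_FILTERS[key]}"


print(with_clinical_filter('"Aspirin"[MeSH Terms]', "therapy"))
```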
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| --population | str | Required | Population/Problem |
| --intervention | str | Required | Intervention |
| --comparison | str | Required | Comparison |
| --outcome | str | Required | Outcome |
| --study-type | str | Required | Clinical query category |
| --format | str | 'lines' | Output format |
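These parameters could be wired up with `argparse` roughly as follows. This is a sketch under the assumption that the skill exposes a command-line entry point; the actual script and its accepted values for `--study-type` and `--format` are not shown in this document, so none are enforced here.

```python
import argparse


def build_parser():
    """Create a parser mirroring the parameter table (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Build a PubMed Boolean query from PICO elements."
    )
    required = [
        ("population", "Population/Problem"),
        ("intervention", "Intervention"),
        ("comparison", "Comparison"),
        ("outcome", "Outcome"),
        ("study-type", "Clinical query category"),
    ]
    for name, help_text in required:
        parser.add_argument(f"--{name}", type=str, required=True, help=help_text)
    parser.add_argument("--format", type=str, default="lines", help="Output format")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args([
    "--population", "diabetic patients",
    "--intervention", "aspirin",
    "--comparison", "placebo",
    "--outcome", "stroke",
    "--study-type", "therapy",
])
print(args.study_type, args.format)
```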
Example: Complete Search Strategy
Research Question: Does aspirin reduce stroke risk in diabetic patients?
Line 1 - Population:
("Diabetes Mellitus"[MeSH Terms] OR "Diabetic"[Title/Abstract] OR "Diabetics"[Title/Abstract])
Line 2 - Intervention:
("Aspirin"[MeSH Terms] OR "Acetylsalicylic Acid"[Title/Abstract] OR "aspirin"[Title/Abstract])
Line 3 - Outcome:
("Stroke"[MeSH Terms] OR "Cerebrovascular Accident"[Title/Abstract] OR "stroke"[Title/Abstract] OR "cerebrovascular"[Title/Abstract])
Line 4 - Study Type Filter:
(randomized controlled trial[Publication Type] OR systematic review[Publication Type] OR meta-analysis[Publication Type])
Final Query:
(("Diabetes Mellitus"[MeSH Terms] OR "Diabetic"[Title/Abstract] OR "Diabetics"[Title/Abstract]) AND ("Aspirin"[MeSH Terms] OR "Acetylsalicylic Acid"[Title/Abstract] OR "aspirin"[Title/Abstract]) AND ("Stroke"[MeSH Terms] OR "Cerebrovascular Accident"[Title/Abstract] OR "stroke"[Title/Abstract] OR "cerebrovascular"[Title/Abstract]) AND (randomized controlled trial[Publication Type] OR systematic review[Publication Type] OR meta-analysis[Publication Type]))
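The final query is the four lines joined with AND and wrapped in one outer pair of parentheses. A short sketch of that assembly step, with illustrative variable names:

```python
# The four lines of the strategy, copied from the example above.
lines = [
    '("Diabetes Mellitus"[MeSH Terms] OR "Diabetic"[Title/Abstract] OR "Diabetics"[Title/Abstract])',
    '("Aspirin"[MeSH Terms] OR "Acetylsalicylic Acid"[Title/Abstract] OR "aspirin"[Title/Abstract])',
    '("Stroke"[MeSH Terms] OR "Cerebrovascular Accident"[Title/Abstract] '
    'OR "stroke"[Title/Abstract] OR "cerebrovascular"[Title/Abstract])',
    '(randomized controlled trial[Publication Type] OR systematic review[Publication Type] '
    'OR meta-analysis[Publication Type])',
]

# OR within each group, AND between groups.
final_query = "(" + " AND ".join(lines) + ")"
print(final_query)
```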
MeSH Browser Usage
When mapping terms:
- Check MeSH Browser for exact term hierarchy
- Note tree numbers for related terms
- Identify entry terms (synonyms)
- Consider subheadings for precision
- Decide on explode vs noexp based on scope needs
Quality Checklist
Before finalizing the query:
- All concepts covered with OR within, AND between groups
- MeSH terms verified against current MeSH database
- Free-text synonyms included for completeness
- Filters appropriate for research question
- Parentheses balanced and precedence correct
- Copy-paste ready for PubMed search box
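The parenthesis check in this list is easy to automate. The function below is a minimal sketch with a hypothetical name; it ignores parentheses inside quoted phrases and also flags an unclosed quotation mark. It does not validate MeSH terms or field tags.

```python
def parentheses_balanced(query):
    """Return True if parentheses and double quotes in the query are balanced."""
    depth = 0
    in_quotes = False
    for ch in query:
        if ch == '"':
            # Toggle quoted-phrase state; parentheses inside quotes are literal text.
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    # A closing parenthesis appeared before its opener.
                    return False
    return depth == 0 and not in_quotes


print(parentheses_balanced('("Aspirin"[MeSH Terms] OR "aspirin"[Title/Abstract])'))
```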
Technical Difficulty
🔴 High - Requires understanding of:
- MeSH hierarchical structure and term relationships
- Boolean logic and operator precedence
- Field tag semantics and limitations
- Search sensitivity vs specificity trade-offs
- Clinical query methodology
⚠️ Verification Required: MeSH terms change annually. Always verify current MeSH version at https://meshb.nlm.nih.gov/
References
See references/mesh-structure.md for detailed MeSH hierarchy guidance.
See references/boolean-examples.md for categorized query templates.
Risk Assessment
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python scripts executed locally | Medium |
| Network Access | PubMed E-utilities API calls | High |
| File System Access | Read/write search strategies | Low |
| Instruction Tampering | Query construction guidelines | Low |
| Data Exposure | Search terms logged locally | Low |
Security Checklist
- No hardcoded credentials or API keys
- NCBI API requests use HTTPS only
- API rate limits respected (max 3 requests/second without API key)
- Input validation for search terms (injection prevention)
- Output directory restricted to workspace
- Error messages sanitized (no internal paths exposed)
- API timeout and retry mechanisms implemented
- No exposure of internal service architecture
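Several of these items can be illustrated with the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and spaces out calls; it sends no request. The helper names and the `Throttle` class are hypothetical, and the interval with a key assumes NCBI's documented limit of 10 requests/second for keyed clients.

```python
import os
import time
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, retmax=20, api_key=None):
    """Build an HTTPS ESearch URL; urlencode escapes quotes and brackets."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": retmax}
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


def min_interval(api_key=None):
    """Seconds between requests: 3/second without a key, 10/second with one."""
    return 1 / 10 if api_key else 1 / 3


class Throttle:
    """Sleep as needed so consecutive calls are at least `interval` apart."""

    def __init__(self, interval):
        self.interval = interval
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


# The key is read from the environment, never hardcoded.
api_key = os.environ.get("NCBI_API_KEY")
throttle = Throttle(min_interval(api_key))
throttle.wait()
print(build_esearch_url('"Aspirin"[MeSH Terms]', api_key=api_key))
```

Calling `throttle.wait()` before each request keeps a simple sequential client under the rate limit; timeouts and retries would still need to be added around the actual HTTP call.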
Prerequisites
# Python dependencies
pip install -r requirements.txt
# Optional: NCBI API key for higher rate limits
# Set as environment variable: NCBI_API_KEY
Evaluation Criteria
Success Metrics
- Successfully constructs valid PubMed Boolean queries
- MeSH term mapping is accurate and current
- Query syntax is copy-paste ready for PubMed
- Provides sensitivity/specificity trade-off options
- Handles complex multi-concept research questions
- Estimated result counts are reasonable
Test Cases
- Basic Query: "diabetes treatment" → Valid MeSH-based query
- PICO Framework: Complex clinical question → Complete search strategy
- MeSH Mapping: Free-text term → Correct MeSH term identification
- Boolean Logic: Multiple concepts → Properly nested AND/OR/NOT
- Clinical Query: Therapy-focused question → Includes appropriate filters
- API Integration: Execute search via E-utilities → Successful retrieval
- Error Handling: Invalid search term → Graceful error with suggestions
Lifecycle Status
- Current Stage: Draft
- Next Review Date: 2026-03-06
- Known Issues:
  - MeSH terms updated annually, may need periodic validation
  - API rate limits without key
- Planned Improvements:
  - Integration with NCBI API key support for higher rate limits
  - Automatic MeSH term validation against current database
  - Support for additional databases (Embase, Cochrane)