Reference Search

Name: Reference Search
Author: AIPOCH

AIPOCH

Multi-database literature search and search-strategy design that outputs structured, reproducible result lists; use when you need reference retrieval, systematic searching, review topic selection, or to construct a traceable search strategy.

167

FILES

reference-search/

skill.md

scripts

pubmed_search.py

references

evaluation-checklist.md

guide.md

assets

search_log_template.csv

search_results_template.csv

run_query.py

run_search.py

91100Total Score

View Evaluation Report

Core Capability

84 / 100

Functional Suitability

10 / 12

Reliability

10 / 12

Performance & Context

8 / 8

Agent Usability

13 / 16

Human Usability

7 / 8

Security

10 / 12

Maintainability

10 / 12

Agent-Specific

16 / 20

Medical Task

20 / 20 Passed

100Multi-database literature search and search-strategy design that outputs structured, reproducible result lists; use when you need reference retrieval, systematic searching, review topic selection, or to construct a traceable search strategy

4/4

97Multi-database literature search and search-strategy design that outputs structured, reproducible result lists; use when you need reference retrieval, systematic searching, review topic selection, or to construct a traceable search strategy

4/4

95Multi-database literature search and search-strategy design that outputs structured, reproducible result lists

4/4

94Packaged executable path(s): scripts/pubmed_search.py

4/4

94End-to-end case for Scope-focused workflow aligned to: Multi-database literature search and search-strategy design that outputs structured, reproducible result lists; use when you need reference retrieval, systematic searching, review topic selection, or to construct a traceable search strategy

4/4

SKILL.md

Reference Search

When to Use

Use this skill when you need multi-database literature search and search-strategy design that outputs structured, reproducible result lists; use when you need reference retrieval, systematic searching, review topic selection, or to construct a traceable search strategy in a reproducible workflow.
Use this skill when a evidence insight task needs a packaged method instead of ad-hoc freeform output.
Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
Use this skill when scripts/pubmed_search.py is the most direct path to complete the request.
Use this skill when you need the reference-search package behavior rather than a generic answer.

Key Features

Scope-focused workflow aligned to: Multi-database literature search and search-strategy design that outputs structured, reproducible result lists; use when you need reference retrieval, systematic searching, review topic selection, or to construct a traceable search strategy.
Packaged executable path(s): scripts/pubmed_search.py.
Reference material available in references/ for task-specific guidance.
Reusable packaged asset(s), including assets/search_log_template.csv.
Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

Python: 3.10+. Repository baseline for current packaged skills.
Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Evidence Insight/reference-search"
python -m py_compile scripts/pubmed_search.py
python scripts/pubmed_search.py --help

Example run plan:

Confirm the user input, output path, and any required config values.
Edit the in-file CONFIG block or documented parameters if the script uses fixed settings.
Run python scripts/pubmed_search.py with the validated inputs.
Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
Primary implementation surface: scripts/pubmed_search.py.
Reference guidance: references/ contains supporting rules, prompts, or checklists.
Packaged assets: reusable files are available under assets/.
Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

1. When to Use

Use this skill in the following scenarios:

Systematic or scoping reviews where you must document a reproducible search strategy and export structured results.
Rapid evidence retrieval for a research question, with quick export to CSV/JSON for screening.
Search strategy construction (keywords, synonyms, Boolean logic, field restrictions) before running searches at scale.
Review topic selection by exploring the volume and distribution of literature for candidate topics.
Traceable search logging when you need to record search date, query string, and result counts for auditability.

2. Key Features

Multi-database search framework (currently implemented for PubMed).
Automatic keyword extraction and search strategy construction (Boolean logic + field constraints).
Structured outputs:
- Machine-readable JSON
- Spreadsheet-friendly CSV
Reproducible search records (query string, keywords, counts, and record list).
Compliance-oriented network access restricted to official PubMed E-utilities endpoints.

3. Dependencies

Dependency	Version	Notes
Python	3.10+	Uses Python standard library only (no third-party packages).

4. Example Usage

Run the PubMed search script

cd skills/reference-search
python scripts/pubmed_search.py

Configure the script

Edit the CONFIG section in scripts/pubmed_search.py:

from pathlib import Path

CONFIG = {
    "EMAIL": "[email protected]",          # Required (must be provided by the user)
    "API_KEY": "",                               # Optional (can increase rate limits)
    "RETMAX": 20,                                # Max number of records to return
    "OUTPUT_DIR": Path("outputs/pubmed_search"), # Allowed output directory
}

Example output (JSON)

{
  "query": "\"Cancer cachexia\"[Title] AND cachexia[Title/Abstract] AND pancreatic[Title/Abstract]",
  "keywords": ["cachexia", "pancreatic", "cancer", "weight", "muscle", "atrophy", "mortality", "treatment"],
  "count": 20,
  "records": [
    {
      "pmid": "36280389",
      "title": "Role of noncoding RNAs in pancreatic ductal adenocarcinoma associated cachexia.",
      "journal": "Journal of Cachexia, Sarcopenia and Muscle",
      "pubdate": "2022",
      "authors": "Wang X, Li Y, Zhang S"
    }
  ]
}

5. Implementation Details

Supported databases and endpoints

PubMed (NCBI E-utilities) only.
- https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi (search)
- https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi (record summaries)

Search workflow (recommended)

Define requirements and scope
- Confirm research question and core concepts.
- Set inclusion/exclusion criteria (time window, language, publication type).
Design the search strategy
- Expand keywords with synonyms.
- Combine with Boolean operators (AND/OR) and apply field restrictions (e.g., Title/Abstract/MeSH).
Execute and export
- Run the script and export results to JSON/CSV.
- If combining multiple sources, merge and deduplicate externally while preserving source labels.
Record for reproducibility
- Save the final query string, search date, and result counts.

Configuration parameters

EMAIL (required): Must be provided by the user; must not be hard-coded as a real credential.
API_KEY (optional): If provided, can improve throughput under NCBI policies.
RETMAX: Limits the number of returned records.
OUTPUT_DIR: Must point to an outputs/ subdirectory.

Security, compliance, and access constraints

Network access: restricted to the official NCBI host eutils.ncbi.nlm.nih.gov only.
Prohibited: any third-party URLs.
File read constraints: do not read files outside the skill directory.
File write constraints: write outputs only under outputs/ (ensure the directory exists or is created by the script).
Timeout: 20 seconds per API request.
Rate limiting: 0.35 seconds between requests.
Error handling: return semantic, user-facing error messages without exposing sensitive technical details.

Included assets and references (in-repo)

Templates:
- assets/search_log_template.csv
- assets/search_results_template.csv
Additional guidance and checklists:
- references/guide.md
- references/evaluation-checklist.md
Tests:
- tests/test_pubmed_search.py
External documentation:
- PubMed E-utilities: https://www.ncbi.nlm.nih.gov/books/NBK25504/

Details

AuthorAIPOCH

LicenseMIT

Languagebash,python,json

Updated2026-04-13

Version1.0.0

SourceGitHub