Agent Skills
Citation Management
AIPOCH
Comprehensive citation management for academic research; use when you need to discover papers (Google Scholar/PubMed), extract/verify metadata (DOI/PMID/arXiv/URL), and produce validated, clean BibTeX for manuscripts.
Total Score: 87 / 100
Core Capability: 88 / 100
- Functional Suitability: 11 / 12
- Reliability: 10 / 12
- Performance & Context: 8 / 8
- Agent Usability: 14 / 16
- Human Usability: 8 / 8
- Security: 10 / 12
- Maintainability: 10 / 12
- Agent-Specific: 17 / 20
Medical Task: 15 / 20 passed
- You need to find relevant or highly cited papers on a topic using Google Scholar or PubMed (88; 3/4)
- You have identifiers (e.g., DOI, PMID, arXiv ID, URL) and must convert them into correct BibTeX (86; 3/4)
- Paper discovery (86; 3/4)
- Google Scholar search with year filtering, pagination, and citation-count sorting (86; 3/4)
- End-to-end case for Paper discovery (86; 3/4)
SKILL.md
When to Use
- You need to find relevant or highly cited papers on a topic using Google Scholar or PubMed.
- You have identifiers (e.g., DOI, PMID, arXiv ID, URL) and must convert them into correct BibTeX.
- You want to verify citation accuracy (DOI resolution, required fields, consistency with CrossRef/PubMed).
- You need to clean, deduplicate, sort, and standardize an existing `.bib` file before submission.
- You are preparing a thesis/manuscript and need a reproducible workflow from search → extraction → formatting → validation.
Key Features
- Paper discovery
  - Google Scholar search with year filtering, pagination, and citation-count sorting.
  - PubMed search with MeSH terms, field tags, publication-type filters, and date ranges.
- Metadata extraction
  - Resolve DOI/PMID/arXiv/URL to structured metadata via CrossRef, PubMed E-utilities, and arXiv APIs.
  - Batch processing from files containing mixed identifiers.
- BibTeX generation & cleanup
  - Generate BibTeX entries with appropriate entry types and required fields.
  - Format, sort (key/year/author), and deduplicate BibTeX libraries.
- Citation validation
  - DOI resolution checks and metadata cross-checking.
  - Required-field checks by entry type, syntax validation, duplicate detection, and optional auto-fix.
- Workflow integration
  - Produces submission-ready `.bib` files for LaTeX/Overleaf workflows and complements literature review pipelines.
Dependencies
- Python: 3.10+ (recommended)
- Python packages:
  - `requests>=2.31.0`
  - `scholarly>=1.7.11` (optional; required only for Google Scholar automation)
Example Usage
A complete, end-to-end workflow that searches, extracts metadata, formats, deduplicates, and validates a bibliography:
# 1) Search PubMed (biomedical focus)
python scripts/search_pubmed.py \
  --query '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]' \
  --date-start 2020-01-01 \
  --date-end 2024-12-31 \
  --limit 200 \
  --output crispr_pubmed.json

# 2) Search Google Scholar (broad coverage)
python scripts/search_google_scholar.py "CRISPR gene editing therapeutics" \
  --year-start 2020 \
  --year-end 2024 \
  --limit 100 \
  --output crispr_scholar.json

# 3) Extract metadata from search outputs (or mixed identifiers)
# Note: plain concatenation only works if extract_metadata.py accepts several
# JSON documents in one file. If it expects a single JSON array, merge the
# files instead (e.g., jq -s 'add', assuming each file holds a top-level array).
cat crispr_pubmed.json crispr_scholar.json > combined_results.json
python scripts/extract_metadata.py \
  --input combined_results.json \
  --output combined.bib

# 4) Add known papers by DOI (append)
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2 >> combined.bib
python scripts/doi_to_bibtex.py 10.1126/science.aam9317 >> combined.bib

# 5) Format + deduplicate + sort (newest first)
python scripts/format_bibtex.py combined.bib \
  --deduplicate \
  --sort year \
  --descending \
  --output formatted.bib

# 6) Validate + auto-fix common issues + emit report
python scripts/validate_citations.py formatted.bib \
  --auto-fix \
  --report validation.json \
  --output final_references.bib

# 7) Inspect validation results
cat validation.json
Implementation Details
1) Search (Discovery)
- Google Scholar (`scripts/search_google_scholar.py`)
  - Supports query operators such as exact phrases (`"deep learning"`), author filters (`author:LeCun`), title-only (`intitle:"neural networks"`), exclusions (`-survey`), and year ranges.
  - Typical parameters:
    - `--year-start`, `--year-end`: constrain publication years
    - `--limit`: cap results
    - `--sort-by citations`: prioritize highly cited papers (when supported by the script)
- PubMed (`scripts/search_pubmed.py`)
  - Uses NCBI E-utilities (e.g., ESearch/EFetch/ESummary) to retrieve PMIDs and metadata.
  - Typical parameters:
    - `--query`: supports MeSH terms, field tags, and Boolean logic
    - `--date-start`, `--date-end`: publication date filtering
    - `--publication-types`: e.g., `Clinical Trial`, `Review`
    - `--format`: JSON or BibTeX output (if supported)
(See: references/google_scholar_search.md, references/pubmed_search.md)
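To make the PubMed parameters above concrete, here is a minimal sketch of how a date-filtered query could be turned into an NCBI ESearch request URL. The helper name `build_esearch_url` is hypothetical and is not taken from `scripts/search_pubmed.py`; the sketch only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, date_start=None, date_end=None, limit=20):
    """Build a PubMed ESearch URL (hypothetical helper, not the skill's code)."""
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": query,         # MeSH terms, field tags, Boolean logic
        "retmax": str(limit),  # cap the number of returned PMIDs
        "retmode": "json",     # ask for a JSON response
    }
    if date_start and date_end:
        # ESearch expects YYYY/MM/DD and needs mindate and maxdate together.
        params["datetype"] = "pdat"  # filter on publication date
        params["mindate"] = date_start.replace("-", "/")
        params["maxdate"] = date_end.replace("-", "/")
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(
    '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]',
    date_start="2020-01-01",
    date_end="2024-12-31",
    limit=200,
)
print(url)
```

The returned PMIDs would then typically be passed to EFetch or ESummary to retrieve the full records.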
2) Metadata Extraction (Normalization)
- Identifier inputs: DOI, PMID, arXiv ID, URL, or mixed lists/files.
- Primary sources:
  - CrossRef API for DOI-centric journal metadata
  - PubMed E-utilities for biomedical records (PMID/PMCID, MeSH, abstracts)
  - arXiv API for preprints and versioned records
  - DataCite API for datasets/software DOIs (if implemented/used)
- Field mapping goals:
  - Required: `author`, `title`, `year`
  - Articles: `journal`, `volume`, `number`, `pages`, `doi`
  - Conferences: `booktitle`, `pages`
  - Preprints: repository + identifier (e.g., `eprint`, `archivePrefix`)
(See: references/metadata_extraction.md)
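Handling mixed identifier lists implies deciding, per line, which source to query. The following is a rough sketch of such a classifier, assuming simple regular-expression heuristics; the function name `classify_identifier` and the patterns are illustrative and may differ from what `scripts/extract_metadata.py` actually does.

```python
import re

# Heuristic patterns (assumptions for illustration, not the skill's own rules).
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^(?:arxiv:)?(\d{4}\.\d{4,5})(v\d+)?$", re.IGNORECASE)
PMID_RE = re.compile(r"^(?:pmid:?\s*)?(\d{1,8})$", re.IGNORECASE)
DOI_PREFIX_RE = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def classify_identifier(raw):
    """Return (kind, normalized_value) for one identifier string."""
    text = raw.strip()
    # Strip a resolver or "doi:" prefix so DOI URLs count as DOIs, not URLs.
    doi_candidate = DOI_PREFIX_RE.sub("", text)
    if DOI_RE.match(doi_candidate):
        return ("doi", doi_candidate)
    match = ARXIV_RE.match(text)
    if match:
        # New-style arXiv IDs only; old-style IDs (e.g. hep-th/...) are not handled.
        return ("arxiv", match.group(1) + (match.group(2) or ""))
    match = PMID_RE.match(text)
    if match:
        # A bare number is treated as a PMID, which is only a heuristic.
        return ("pmid", match.group(1))
    if re.match(r"^https?://", text, re.IGNORECASE):
        return ("url", text)
    return ("unknown", text)


for line in ["10.1038/s41586-021-03819-2", "arXiv:1706.03762v5", "PMID: 12345678"]:
    print(classify_identifier(line))
```

Each kind would then be routed to the matching source listed above: CrossRef for DOIs, PubMed E-utilities for PMIDs, and the arXiv API for arXiv IDs.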
3) BibTeX Formatting (Quality & Consistency)
- Entry types commonly produced: `@article`, `@inproceedings`, `@book`, `@misc`.
- Formatting rules enforced/encouraged:
  - Page ranges use `--` (e.g., `123--145`)
  - Protect capitalization in titles with braces (e.g., `{CRISPR}`)
  - Consistent author formatting (`Last, First and Last, First`)
  - Stable citation keys (project convention; often `FirstAuthorYearKeyword`)
(See: references/bibtex_formatting.md)
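The formatting rules above can be illustrated with a small sketch. All helper names and the sample record (authors, journal, pages) are hypothetical and exist only for illustration; `scripts/format_bibtex.py` may apply these rules differently.

```python
import re


def normalize_pages(pages):
    """Rewrite hyphen or dash page ranges as BibTeX-style '123--145'."""
    return re.sub(r"\s*(?:-+|[\u2013\u2014])\s*", "--", pages.strip())


def format_authors(authors):
    """Join (family, given) pairs as 'Last, First and Last, First'."""
    return " and ".join(
        f"{family}, {given}" if given else family for family, given in authors
    )


def protect_acronyms(title):
    """Wrap unbraced all-caps words (2+ characters) in braces, e.g. {CRISPR}."""
    return re.sub(r"(?<![{\w])([A-Z][A-Z0-9]+)(?![\w}])", r"{\1}", title)


def make_key(first_author_family, year, title):
    """Build a FirstAuthorYearKeyword key from the first title word over 3 letters."""
    words = re.findall(r"[A-Za-z]+", title)
    keyword = next((word for word in words if len(word) > 3), "untitled")
    family = re.sub(r"[^A-Za-z]", "", first_author_family)
    return f"{family}{year}{keyword.capitalize()}"


def render_article(meta):
    """Render a metadata dict as an @article entry (sketch, limited field set)."""
    fields = {
        "author": format_authors(meta["authors"]),
        "title": protect_acronyms(meta["title"]),
        "journal": meta["journal"],
        "year": str(meta["year"]),
        "pages": normalize_pages(meta["pages"]),
    }
    key = make_key(meta["authors"][0][0], meta["year"], meta["title"])
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{key},\n{body}\n}}"


# Made-up record, for illustration only.
sample = {
    "authors": [("Doe", "Jane"), ("Roe", "Richard")],
    "title": "CRISPR screening in human T cells",
    "journal": "Journal of Examples",
    "year": 2023,
    "pages": "123-145",
}
entry = render_article(sample)
print(entry)
```

Deriving the key from stable metadata (first author, year, first significant title word) keeps keys unchanged across repeated runs, which is the point of a key convention.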
4) Validation (Correctness)
Validation typically checks:
- DOI validity: resolves via `doi.org` and matches CrossRef metadata.
- Required fields: present per entry type; no empty critical fields.
- Consistency: year format, numeric volume/issue, page-range syntax, URL accessibility.
- Duplicates: same DOI, near-identical titles, or same author/year/title combinations.
- BibTeX syntax: braces/quotes, commas, unique keys, special character handling.
Outputs may include a machine-readable report (e.g., JSON) with errors and warnings.
(See: references/citation_validation.md)
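As a rough illustration of the offline checks listed above (required fields, year format, duplicates), the sketch below validates a list of already-parsed entries. The required-field table follows standard BibTeX conventions, and the entry layout, function names, and message strings are assumptions, not the format used by `scripts/validate_citations.py`. Network checks such as DOI resolution are left out.

```python
import re

# Required fields per entry type, following standard BibTeX conventions
# (assumed here; @book also needs an author or editor, not checked below).
REQUIRED_FIELDS = {
    "article": {"author", "title", "journal", "year"},
    "inproceedings": {"author", "title", "booktitle", "year"},
    "book": {"title", "publisher", "year"},
    "misc": set(),
}


def normalize_title(title):
    """Lowercase and drop punctuation/braces so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def validate(entries):
    """Return a list of (level, key, message) tuples for the given entries."""
    issues = []
    seen_doi, seen_title = {}, {}
    for entry in entries:
        key = entry["key"]
        required = REQUIRED_FIELDS.get(entry["type"], set())
        for field in sorted(required):
            if not str(entry.get(field, "")).strip():
                issues.append(("error", key, f"missing required field: {field}"))
        year = str(entry.get("year", "")).strip()
        if year and not re.fullmatch(r"\d{4}", year):
            issues.append(("warning", key, f"unexpected year format: {year}"))
        doi = str(entry.get("doi", "")).strip().lower()
        title = normalize_title(str(entry.get("title", "")))
        if doi and doi in seen_doi:
            issues.append(("error", key, f"duplicate DOI, first seen in {seen_doi[doi]}"))
        elif title and title in seen_title:
            issues.append(
                ("warning", key, f"possible duplicate title, first seen in {seen_title[title]}")
            )
        # Remember the first entry that used each DOI and title.
        if doi:
            seen_doi.setdefault(doi, key)
        if title:
            seen_title.setdefault(title, key)
    return issues


# Made-up entries, for illustration only.
entries = [
    {"key": "Doe2023Crispr", "type": "article", "author": "Doe, Jane",
     "title": "{CRISPR} screening in human T cells",
     "journal": "Journal of Examples", "year": "2023", "doi": "10.1234/example.1"},
    {"key": "Doe2023Dup", "type": "article", "author": "Doe, Jane",
     "title": "CRISPR Screening in Human T Cells",
     "journal": "Journal of Examples", "year": "2023"},
    {"key": "Roe2021", "type": "article", "author": "Roe, Richard",
     "title": "Another study", "year": "21"},
]
issues = validate(entries)
for issue in issues:
    print(issue)
```

A list of tuples like this serializes easily to JSON, which fits the machine-readable report mentioned above.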