Agent Skills

Arxiv Database

AIPOCH

Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading.

2
0
FILES
arxiv-database/
skill.md
scripts
arxiv_search.py
references
api_docs.md
89100Total Score
View Evaluation Report
Core Capability
78 / 100
Functional Suitability
9 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
12 / 16
Human Usability
7 / 8
Security
9 / 12
Maintainability
9 / 12
Agent-Specific
15 / 20
Medical Task
20 / 20 Passed
100Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading
4/4
96Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading
4/4
94Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading
4/4
94Packaged executable path(s): scripts/arxiv_search.py
4/4
94End-to-end case for Scope-focused workflow aligned to: Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading
4/4

SKILL.md

ArXiv Database Skill

When to Use

  • Use this skill when you need search and retrieve scientific preprints from arxiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, doi, pdf url), or download pdfs for offline reading in a reproducible workflow.
  • Use this skill when a evidence insight task needs a packaged method instead of ad-hoc freeform output.
  • Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
  • Use this skill when scripts/arxiv_search.py is the most direct path to complete the request.
  • Use this skill when you need the arxiv-database package behavior rather than a generic answer.

Key Features

  • Scope-focused workflow aligned to: Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading.
  • Packaged executable path(s): scripts/arxiv_search.py.
  • Reference material available in references/ for task-specific guidance.
  • Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

  • Python: 3.10+. Repository baseline for current packaged skills.
  • Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Evidence Insight/arxiv-database"
python -m py_compile scripts/arxiv_search.py
python scripts/arxiv_search.py --help

Example run plan:

  1. Confirm the user input, output path, and any required config values.
  2. Edit the in-file CONFIG block or documented parameters if the script uses fixed settings.
  3. Run python scripts/arxiv_search.py with the validated inputs.
  4. Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

  • Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
  • Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
  • Primary implementation surface: scripts/arxiv_search.py.
  • Reference guidance: references/ contains supporting rules, prompts, or checklists.
  • Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
  • Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

1. When to Use

  • You need to quickly find arXiv preprints by keyword, phrase, author, or category (e.g., cs.AI, cs.CL).
  • You want to collect paper metadata (title, authors, publication date, abstract/summary, PDF link) for review or indexing.
  • You need the latest submissions in a topic area (sorted by submission date or last updated date).
  • You want to download one or more PDFs from search results for offline reading or batch processing.
  • You have a known arXiv identifier and want to retrieve the corresponding paper directly.

2. Key Features

  • arXiv query-based search (supports category filters, author filters, phrases, and ID lookups).
  • Configurable result limits (--max-results).
  • Sort control (--sort-by: Relevance, LastUpdatedDate, SubmittedDate).
  • Metadata output per result (title, authors, published date, abstract/summary, PDF URL; DOI when available via arXiv metadata).
  • Optional PDF download for returned results (--download) with configurable output directory (--dir).

3. Dependencies

  • Python 3.8+
  • arxiv (Python package) — version depends on your environment; install a recent release (e.g., arxiv>=1.4.0)

4. Example Usage

Install dependencies

pip install "arxiv>=1.4.0"

Run searches and downloads

Search for papers in cs.AI about reinforcement learning (top 5 results):

python scripts/arxiv_search.py --query "cat:cs.AI AND reinforcement learning" --max-results 5

Search for “Large Language Models” in cs.CL:

python scripts/arxiv_search.py --query "cat:cs.CL AND \"Large Language Models\""

Get the latest 5 papers on “quantum computing” (sorted by submission date):

python scripts/arxiv_search.py --query "quantum computing" --sort-by SubmittedDate --max-results 5

Download a specific paper by arXiv ID:

python scripts/arxiv_search.py --query "id:2101.12345" --download

Download results into a specific directory:

python scripts/arxiv_search.py --query "cat:cs.LG AND diffusion" --max-results 3 --download --dir ./papers

5. Implementation Details

  • Entry point: scripts/arxiv_search.py wraps the arxiv Python API to execute queries against the arXiv search endpoint.
  • Query syntax: The --query string is passed to arXiv search and can include:
    • Category filters (e.g., cat:cs.AI)
    • Author filters (e.g., au:Smith)
    • Exact phrases using quotes (e.g., "Large Language Models")
    • ID lookup (e.g., id:2101.12345)
    • Boolean operators such as AND
  • Result limiting: --max-results controls how many entries are returned (default: 10).
  • Sorting: --sort-by selects the ordering of results:
    • Relevance (default)
    • LastUpdatedDate
    • SubmittedDate
  • Downloads: When --download is set, the script downloads the PDF for each returned result using the provided PDF URL and saves it to --dir (default: current working directory).
  • Metadata fields: Each result includes core arXiv metadata (title, authors, published date, summary/abstract, PDF URL). DOI is included when present in arXiv’s metadata for that record.