Agent Skills

Ensembl Database

AIPOCH

Access Ensembl REST API for vertebrate genomic data; use when you need gene/ID lookups, sequence retrieval, variant effect prediction (VEP), or homology/assembly coordinate mapping.

71
6
FILES
ensembl-database/
skill.md
scripts
query_ensembl.py
references
api_endpoints.md
89100Total Score
View Evaluation Report
Core Capability
78 / 100
Functional Suitability
9 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
12 / 16
Human Usability
7 / 8
Security
9 / 12
Maintainability
9 / 12
Agent-Specific
15 / 20
Medical Task
20 / 20 Passed
100Access Ensembl REST API for vertebrate genomic data; use when you need gene/ID lookups, sequence retrieval, variant effect prediction (VEP), or homology/assembly coordinate mapping
4/4
96Access Ensembl REST API for vertebrate genomic data; use when you need gene/ID lookups, sequence retrieval, variant effect prediction (VEP), or homology/assembly coordinate mapping
4/4
94Access Ensembl REST API for vertebrate genomic data
4/4
94Packaged executable path(s): scripts/query_ensembl.py
4/4
94End-to-end case for Scope-focused workflow aligned to: Access Ensembl REST API for vertebrate genomic data; use when you need gene/ID lookups, sequence retrieval, variant effect prediction (VEP), or homology/assembly coordinate mapping
4/4

SKILL.md

Ensembl Database Skill

When to Use

  • Use this skill when you need access ensembl rest api for vertebrate genomic data; use when you need gene/id lookups, sequence retrieval, variant effect prediction (vep), or homology/assembly coordinate mapping in a reproducible workflow.
  • Use this skill when a evidence insight task needs a packaged method instead of ad-hoc freeform output.
  • Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
  • Use this skill when scripts/query_ensembl.py is the most direct path to complete the request.
  • Use this skill when you need the ensembl-database package behavior rather than a generic answer.

Key Features

  • Scope-focused workflow aligned to: Access Ensembl REST API for vertebrate genomic data; use when you need gene/ID lookups, sequence retrieval, variant effect prediction (VEP), or homology/assembly coordinate mapping.
  • Packaged executable path(s): scripts/query_ensembl.py.
  • Reference material available in references/ for task-specific guidance.
  • Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

  • Python: 3.10+. Repository baseline for current packaged skills.
  • Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Evidence Insight/ensembl-database"
python -m py_compile scripts/query_ensembl.py
python scripts/query_ensembl.py --help

Example run plan:

  1. Confirm the user input, output path, and any required config values.
  2. Edit the in-file CONFIG block or documented parameters if the script uses fixed settings.
  3. Run python scripts/query_ensembl.py with the validated inputs.
  4. Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

  • Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
  • Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
  • Primary implementation surface: scripts/query_ensembl.py.
  • Reference guidance: references/ contains supporting rules, prompts, or checklists.
  • Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
  • Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

1. When to Use

  • Gene-centric queries: When you need to resolve a gene symbol or region to Ensembl identifiers and basic annotations (e.g., BRCA2 in human).
  • Sequence extraction: When you need DNA/cDNA/protein sequences for a known Ensembl gene/transcript/protein ID in FASTA or JSON.
  • Variant interpretation: When you need to predict functional consequences of variants using VEP from HGVS notation.
  • Comparative genomics: When you need ortholog/paralog relationships across vertebrate species.
  • Assembly/coordinate mapping: When you need to map coordinates between assemblies (e.g., GRCh37 ↔ GRCh38).

2. Key Features

  • Query Ensembl REST endpoints for:
    • Gene lookup by symbol, Ensembl ID, or genomic region
    • Sequence retrieval (DNA, cDNA, protein) in FASTA/JSON
    • Variant Effect Predictor (VEP) analysis from HGVS inputs
    • Homology retrieval (orthologs/paralogs)
    • Assembly/coordinate mapping between common human assemblies
  • CLI helper script for repeatable queries:
    • scripts/query_ensembl.py (wrapper around an ensembl_rest client)
  • Reference documentation for endpoints:

3. Dependencies

  • Python >=3.8
  • ensembl_rest (Python client; version depends on your environment)
  • Network access to https://rest.ensembl.org

4. Example Usage

CLI: Gene lookup by symbol

python scripts/query_ensembl.py --action lookup --species human --symbol BRCA2

CLI: Retrieve sequence by Ensembl ID

python scripts/query_ensembl.py --action sequence --id ENSG00000139618

CLI: Variant effect prediction (VEP) by HGVS

python scripts/query_ensembl.py --action vep --species human --hgvs "ENST00000380152.8:c.68_69delAG"

5. Implementation Details

Script entry point

  • Tool: scripts/query_ensembl.py
  • Purpose: Provide a simple command-line interface that dispatches to Ensembl REST calls via an ensembl_rest client.

Core parameters

  • --action: Operation selector.
    • Supported values: lookup, sequence, vep
  • --species: Target species name used by Ensembl REST (e.g., human).
  • --symbol: Gene symbol used for lookup actions (e.g., BRCA2).
  • --id: Ensembl stable ID used for sequence retrieval (e.g., ENSG..., ENST..., ENSP...).
  • --hgvs: HGVS notation string used for VEP (e.g., ENST...:c.123A>G).

Data types and outputs

  • Lookup: Returns gene/transcript metadata as provided by Ensembl REST.
  • Sequence: Returns DNA/cDNA/protein sequence; format depends on the endpoint/options (commonly FASTA or JSON).
  • VEP: Returns consequence annotations and (when available) population frequency fields as provided by Ensembl VEP REST responses.

Endpoint reference

For the exact REST paths, required parameters, and response schemas, see: