Agent Skills

Cosmic Database

AIPOCH

Access COSMIC to download mutation datasets, query Cancer Gene Census, and retrieve mutational signatures when your genomic analysis requires curated somatic mutation resources.

FILES
cosmic-database/
  skill.md
  scripts/
    download_cosmic.py
  references/
    cosmic_data_reference.md
Total Score: 87 / 100

Core Capability: 81 / 100
  • Functional Suitability: 10 / 12
  • Reliability: 9 / 12
  • Performance & Context: 8 / 8
  • Agent Usability: 13 / 16
  • Human Usability: 7 / 8
  • Security: 9 / 12
  • Maintainability: 9 / 12
  • Agent-Specific: 16 / 20
Medical Task: 20 / 20 Passed

SKILL.md

COSMIC Database Skill

When to Use

  • Use this skill when a reproducible genomic analysis workflow needs curated somatic mutation resources from COSMIC: downloading mutation datasets, querying the Cancer Gene Census, or retrieving mutational signatures.
  • Use this skill when an Evidence Insight task needs a packaged method instead of ad-hoc freeform output.
  • Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
  • Use this skill when scripts/download_cosmic.py is the most direct path to complete the request.
  • Use this skill when you need the cosmic-database package behavior rather than a generic answer.

Key Features

  • Scope-focused workflow aligned to: Access COSMIC to download mutation datasets, query Cancer Gene Census, and retrieve mutational signatures when your genomic analysis requires curated somatic mutation resources.
  • Packaged executable path(s): scripts/download_cosmic.py.
  • Reference material available in references/ for task-specific guidance.
  • Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

  • Python: 3.10+. Repository baseline for current packaged skills.
  • Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Evidence Insight/cosmic-database"
python -m py_compile scripts/download_cosmic.py
python scripts/download_cosmic.py --help

Example run plan:

  1. Confirm the user input, output path, and any required config values.
  2. Edit the in-file CONFIG block or documented parameters if the script uses fixed settings.
  3. Run python scripts/download_cosmic.py with the validated inputs.
  4. Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

  • Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
  • Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
  • Primary implementation surface: scripts/download_cosmic.py.
  • Reference guidance: references/ contains supporting rules, prompts, or checklists.
  • Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
  • Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.
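
The parameters to clarify before a run can be collected in one place and checked up front. The following is a minimal sketch under stated assumptions: `RunConfig` and `validate_config` are hypothetical names introduced for illustration and are not part of scripts/download_cosmic.py.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the parameters listed above."""
    cosmic_filepath: str               # remote resource path to download
    output_dir: Path                   # local directory for the result
    gene_filter: tuple[str, ...] = ()  # optional scope filter (gene symbols)


def validate_config(cfg: RunConfig) -> list[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    if not cfg.cosmic_filepath.strip():
        problems.append("cosmic_filepath is empty")
    if cfg.output_dir.exists() and not cfg.output_dir.is_dir():
        problems.append(f"output_dir is not a directory: {cfg.output_dir}")
    if any(not gene.strip() for gene in cfg.gene_filter):
        problems.append("gene_filter contains an empty gene symbol")
    return problems
```

Returning a list of problems instead of raising on the first one lets the caller report every issue to the user in a single pass before any download starts.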

1. When to Use

Use this skill when you need COSMIC data for tasks such as:

  • Downloading COSMIC mutation exports (TSV/VCF) for cohort or sample-level variant analysis.
  • Retrieving Cancer Gene Census (CGC) gene lists for oncogene/tumor suppressor annotation and prioritization.
  • Working with COSMIC mutational signatures (SBS/DBS/ID) for signature attribution or comparative studies.
  • Accessing additional COSMIC genomics datasets (e.g., copy number, fusions, expression) for multi-omics integration.
  • Building reproducible pipelines that programmatically fetch the latest COSMIC releases.
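
As an illustration of the Cancer Gene Census use case, the sketch below annotates a mutation table with CGC roles using a pandas left join. Both tables are toy data, and the column names (`gene`, `sample_id`, `role_in_cancer`) are assumptions made for this example rather than the real COSMIC headers.

```python
import pandas as pd

# Toy stand-ins for a mutation export and a CGC table.
# Column names are illustrative assumptions, not the real COSMIC headers.
mutations = pd.DataFrame(
    {
        "gene": ["TP53", "KRAS", "TTN", "TP53"],
        "sample_id": ["S1", "S1", "S2", "S3"],
    }
)
cgc = pd.DataFrame(
    {
        "gene": ["TP53", "KRAS"],
        "role_in_cancer": ["TSG", "oncogene"],
    }
)

# A left join keeps every mutation; genes absent from the CGC get NaN for the role.
annotated = mutations.merge(cgc, on="gene", how="left")
annotated["in_cgc"] = annotated["role_in_cancer"].notna()

# Keep only mutations in census genes for prioritization.
prioritized = annotated[annotated["in_cgc"]]
print(prioritized)
```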

2. Key Features

  • Authenticated downloads of COSMIC files (e.g., TSV/VCF; often GZIP-compressed).
  • Cancer Gene Census access for curated cancer gene information.
  • Mutational signature retrieval including SBS, DBS, and ID signatures.
  • Support for multiple COSMIC dataset types, such as mutation, copy number, fusion, and expression resources.
  • Pandas-friendly workflow for loading and filtering downloaded tables.
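
For comparative signature work, a common first step is to compare a sample's mutation profile against reference signatures with cosine similarity. The sketch below uses toy 4-channel vectors and hypothetical signature names (`SigA`, `SigB`); real SBS signatures have 96 channels. It is a similarity ranking only, not a full attribution method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_a * norm_b)


# Toy reference signatures (hypothetical names and values).
signatures = {
    "SigA": [0.7, 0.1, 0.1, 0.1],
    "SigB": [0.1, 0.1, 0.1, 0.7],
}
sample_profile = [14, 2, 3, 1]  # mutation counts per channel

# Rank signatures from most to least similar to the sample.
ranked = sorted(
    ((name, cosine_similarity(sample_profile, sig)) for name, sig in signatures.items()),
    key=lambda item: item[1],
    reverse=True,
)
best_name, best_score = ranked[0]
```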

3. Dependencies

  • Python 3.9+
  • pandas >= 1.5
  • requests >= 2.28

External requirements: a COSMIC account (email and password) for authenticated downloads.
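
Because downloads need COSMIC account credentials, one option is to read them from the environment rather than writing them into scripts. This is a minimal sketch: `load_cosmic_credentials` and the variable names `COSMIC_EMAIL` and `COSMIC_PASSWORD` are assumptions chosen for illustration and are not taken from the packaged script.

```python
import os


def load_cosmic_credentials(env=None):
    """Return (email, password) read from environment variables.

    COSMIC_EMAIL and COSMIC_PASSWORD are hypothetical names chosen for
    this sketch; adapt them to your own deployment conventions.
    """
    env = os.environ if env is None else env
    required = ("COSMIC_EMAIL", "COSMIC_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env["COSMIC_EMAIL"], env["COSMIC_PASSWORD"]
```

The returned pair could then be passed as the `email` and `password` arguments of `download_cosmic_file`.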

4. Example Usage

The following example downloads a COSMIC file and loads it into a pandas DataFrame.

import pandas as pd

from scripts.download_cosmic import download_cosmic_file

# 1) Download a COSMIC dataset (example path; adjust to your target release/build)
download_cosmic_file(
    email="user@example.com",  # placeholder: your COSMIC account email
    password="pwd",            # placeholder: do not commit real passwords
    filepath="GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz"
)

# 2) Load the downloaded GZIP-compressed TSV
df = pd.read_csv(
    "CosmicMutantExport.tsv.gz",
    sep="\t",
    compression="gzip"
)

# 3) Example analysis: filter by gene symbol (column name depends on the dataset)
# df_gene = df[df["Gene name"] == "TP53"]

For dataset field definitions and COSMIC file specifics, see: references/cosmic_data_reference.md.
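
Mutation exports can be large, so loading the whole file at once may not fit in memory. The sketch below streams a GZIP-compressed TSV in chunks and keeps only the rows for one gene. `filter_gene_chunked` is a hypothetical helper, and its default `gene_col` simply reuses the "Gene name" column from the commented example above; check the header of your own file.

```python
import pandas as pd


def filter_gene_chunked(path, gene, gene_col="Gene name", chunksize=100_000):
    """Stream a gzip-compressed TSV and keep only rows for one gene.

    gene_col is an assumption; column names differ between COSMIC datasets.
    """
    kept = []
    # With chunksize set, read_csv returns an iterator of DataFrames.
    with pd.read_csv(path, sep="\t", compression="gzip", chunksize=chunksize) as reader:
        for chunk in reader:
            kept.append(chunk[chunk[gene_col] == gene])
    if not kept:
        return pd.DataFrame()
    return pd.concat(kept, ignore_index=True)
```

A call such as `filter_gene_chunked("CosmicMutantExport.tsv.gz", "TP53")` would return a DataFrame containing only the matching rows.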

5. Implementation Details

  • Authentication: Downloads require COSMIC account credentials (email/password) and are performed via an authenticated HTTP session.
  • File targeting: The filepath parameter specifies the COSMIC resource path (e.g., genome build such as GRCh38, release channel such as latest, and the target filename).
  • Data format: Many COSMIC exports are distributed as GZIP-compressed TSV (and sometimes VCF). Use pandas.read_csv(..., sep="\t", compression="gzip") for TSV .gz files.
  • Typical workflow:
    1. Download the desired COSMIC export.
    2. Load into a DataFrame (or parse VCF with an appropriate library if needed).
    3. Filter/aggregate by gene, tumor type, sample, or signature depending on the analysis goal.
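
The filter/aggregate step of the workflow above can be sketched with a pandas group-by. The table is toy data, and the column names (`gene`, `tumor_type`, `sample_id`) are assumptions for this example; real headers vary by dataset and release.

```python
import pandas as pd

# Toy table standing in for a loaded COSMIC export.
# Column names are assumptions; real headers vary by dataset and release.
df = pd.DataFrame(
    {
        "gene": ["TP53", "TP53", "KRAS", "KRAS", "TP53"],
        "tumor_type": ["lung", "lung", "lung", "colon", "colon"],
        "sample_id": ["S1", "S2", "S2", "S3", "S3"],
    }
)

# Count distinct mutated samples per gene and tumor type,
# then list the most frequently mutated combinations first.
summary = (
    df.groupby(["gene", "tumor_type"])["sample_id"]
    .nunique()
    .reset_index(name="n_samples")
    .sort_values(["n_samples", "gene", "tumor_type"], ascending=[False, True, True])
    .reset_index(drop=True)
)
print(summary)
```

Counting distinct samples rather than rows avoids inflating a gene's frequency when one sample carries several mutations in it.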