Agent Skills
Cosmic Database
AIPOCH
Access COSMIC to download mutation datasets, query Cancer Gene Census, and retrieve mutational signatures when your genomic analysis requires curated somatic mutation resources.
Total Score: 87 / 100
Core Capability: 81 / 100
- Functional Suitability: 10 / 12
- Reliability: 9 / 12
- Performance & Context: 8 / 8
- Agent Usability: 13 / 16
- Human Usability: 7 / 8
- Security: 9 / 12
- Maintainability: 9 / 12
- Agent-Specific: 16 / 20
Medical Task: 20 / 20 Passed
SKILL.md
COSMIC Database Skill
When to Use
- Use this skill when you need to access COSMIC to download mutation datasets, query the Cancer Gene Census, or retrieve mutational signatures as part of a reproducible workflow that requires curated somatic mutation resources.
- Use this skill when an Evidence Insight task needs a packaged method instead of ad-hoc freeform output.
- Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
- Use this skill when scripts/download_cosmic.py is the most direct path to complete the request.
- Use this skill when you need the cosmic-database package behavior rather than a generic answer.
Key Features
- Scope-focused workflow aligned to: Access COSMIC to download mutation datasets, query Cancer Gene Census, and retrieve mutational signatures when your genomic analysis requires curated somatic mutation resources.
- Packaged executable path(s): scripts/download_cosmic.py.
- Reference material available in references/ for task-specific guidance.
- Structured execution path designed to keep outputs consistent and reviewable.
Dependencies
- Python: 3.10+. Repository baseline for current packaged skills.
- Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.
Example Usage
cd "20260316/scientific-skills/Evidence Insight/cosmic-database"
python -m py_compile scripts/download_cosmic.py
python scripts/download_cosmic.py --help
Example run plan:
- Confirm the user input, output path, and any required config values.
- Edit the in-file CONFIG block or documented parameters if the script uses fixed settings.
- Run python scripts/download_cosmic.py with the validated inputs.
- Review the generated output and return the final artifact with any assumptions called out.
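The first step of the run plan (confirming inputs before anything is executed) can be sketched as a small preflight check. This is a minimal illustration, not part of the packaged skill: the function name preflight and the checks it performs are hypothetical. It syntax-checks the script in memory with the built-in compile(), which serves a similar purpose to python -m py_compile without writing a .pyc file.

```python
from pathlib import Path


def preflight(script_path, output_dir):
    """Return a list of problems found before running the script (empty if none).

    Hypothetical helper: neither the name nor the checks come from the skill package.
    """
    problems = []
    script = Path(script_path)
    out = Path(output_dir)

    if not script.is_file():
        problems.append(f"script not found: {script}")
    else:
        try:
            # Syntax-check in memory; nothing is executed or written to disk.
            compile(script.read_text(encoding="utf-8"), str(script), "exec")
        except SyntaxError as exc:
            problems.append(f"syntax error in {script}: {exc.msg} (line {exc.lineno})")

    if out.exists() and not out.is_dir():
        problems.append(f"output path exists but is not a directory: {out}")

    return problems
```

An empty list means the script parses and the output location is usable; otherwise each entry describes one issue to resolve before running python scripts/download_cosmic.py.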
Implementation Details
- Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
- Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
- Primary implementation surface: scripts/download_cosmic.py.
- Reference guidance: references/ contains supporting rules, prompts, or checklists.
- Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
- Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.
1. When to Use
Use this skill when you need COSMIC data for tasks such as:
- Downloading COSMIC mutation exports (TSV/VCF) for cohort or sample-level variant analysis.
- Retrieving Cancer Gene Census (CGC) gene lists for oncogene/tumor suppressor annotation and prioritization.
- Working with COSMIC mutational signatures (SBS/DBS/ID) for signature attribution or comparative studies.
- Accessing additional COSMIC genomics datasets (e.g., copy number, fusions, expression) for multi-omics integration.
- Building reproducible pipelines that programmatically fetch the latest COSMIC releases.
2. Key Features
- Authenticated downloads of COSMIC files (e.g., TSV/VCF; often GZIP-compressed).
- Cancer Gene Census access for curated cancer gene information.
- Mutational signature retrieval including SBS, DBS, and ID signatures.
- Support for multiple COSMIC dataset types, such as mutation, copy number, fusion, and expression resources.
- Pandas-friendly workflow for loading and filtering downloaded tables.
3. Dependencies
- Python 3.9+
- pandas >= 1.5
- requests >= 2.28
External requirements:
- A registered COSMIC account at https://cancer.sanger.ac.uk/cosmic
- Valid COSMIC login credentials (email + password)
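Because downloads need an email and password, it is usually safer to keep them out of source files. A minimal sketch, assuming the credentials are stored in two environment variables; the names COSMIC_EMAIL and COSMIC_PASSWORD and the helper load_cosmic_credentials are illustrative choices, not something the skill package defines:

```python
import os


def load_cosmic_credentials(env=None):
    """Return (email, password) read from environment variables.

    COSMIC_EMAIL and COSMIC_PASSWORD are hypothetical variable names
    chosen for this sketch; rename them to match your own setup.
    """
    env = os.environ if env is None else env
    email = env.get("COSMIC_EMAIL", "").strip()
    password = env.get("COSMIC_PASSWORD", "")

    # Report every missing variable at once rather than failing on the first.
    missing = [
        name
        for name, value in (("COSMIC_EMAIL", email), ("COSMIC_PASSWORD", password))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing COSMIC credentials: " + ", ".join(missing))
    return email, password
```

The returned pair can then be passed as the email and password arguments of download_cosmic_file.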
4. Example Usage
The following example downloads a COSMIC file and loads it into a pandas DataFrame.
from scripts.download_cosmic import download_cosmic_file
import pandas as pd

# 1) Download a COSMIC dataset (example path; adjust to your target release/build)
download_cosmic_file(
    email="user@example.com",  # placeholder; use your registered COSMIC account email
    password="pwd",            # placeholder; do not commit real credentials
    filepath="GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz",
)

# 2) Load the downloaded GZIP-compressed TSV
df = pd.read_csv(
    "CosmicMutantExport.tsv.gz",
    sep="\t",
    compression="gzip",
)

# 3) Example analysis: filter by gene symbol (column name depends on the dataset)
# df_gene = df[df["Gene name"] == "TP53"]
For dataset field definitions and COSMIC file specifics, see: references/cosmic_data_reference.md.
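Since the gene column name depends on the dataset, one option is to inspect the file header before filtering. The sketch below uses only the standard library to read the first line of a GZIP-compressed TSV and pick the first matching column. The helper names are hypothetical and the candidate column names are assumptions ("Gene name" comes from the example above; "GENE_SYMBOL" is a guess for other exports), so check them against references/cosmic_data_reference.md.

```python
import csv
import gzip

# Assumed candidates: "Gene name" appears in the example above;
# "GENE_SYMBOL" is a guess for other exports. Adjust to your file.
GENE_COLUMN_CANDIDATES = ("Gene name", "GENE_SYMBOL")


def read_header(path):
    """Return the column names from the first line of a gzipped TSV."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # An empty file yields an empty header instead of raising StopIteration.
        return next(reader, [])


def resolve_column(header, candidates=GENE_COLUMN_CANDIDATES):
    """Return the first candidate present in the header, or raise KeyError."""
    for name in candidates:
        if name in header:
            return name
    raise KeyError(f"none of {candidates} found in header: {header}")
```

With the column resolved, the commented filter in the example becomes df[df[gene_col] == "TP53"], where gene_col = resolve_column(read_header("CosmicMutantExport.tsv.gz")).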
5. Implementation Details
- Authentication: Downloads require COSMIC account credentials (email/password) and are performed via an authenticated HTTP session.
- File targeting: The filepath parameter specifies the COSMIC resource path (e.g., genome build such as GRCh38, release channel such as latest, and the target filename).
- Data format: Many COSMIC exports are distributed as GZIP-compressed TSV (and sometimes VCF). Use pandas.read_csv(..., sep="\t", compression="gzip") for TSV .gz files.
- Typical workflow:
  - Download the desired COSMIC export.
  - Load into a DataFrame (or parse VCF with an appropriate library if needed).
  - Filter/aggregate by gene, tumor type, sample, or signature depending on the analysis goal.
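The file-targeting convention and the load/filter steps above can also be sketched with the standard library alone, which avoids holding a large export fully in memory. This is an illustration under assumptions: the helper names (cosmic_filepath, iter_rows, count_by) are hypothetical, the path layout is inferred from the example path GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz, and column names vary by dataset.

```python
import csv
import gzip
from collections import Counter

KNOWN_BUILDS = ("GRCh37", "GRCh38")  # genome builds accepted by this sketch


def cosmic_filepath(build, release, filename):
    """Compose '<build>/cosmic/<release>/<filename>' (layout inferred from the example)."""
    if build not in KNOWN_BUILDS:
        raise ValueError(f"unexpected genome build: {build!r}")
    return f"{build}/cosmic/{release}/{filename}"


def iter_rows(path):
    """Yield each data row of a gzipped TSV as a dict keyed by column name."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: treat quote characters inside fields as plain text.
        yield from csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def count_by(rows, column, keep=None):
    """Count rows per value of `column`, keeping only rows where keep(row) is true."""
    counts = Counter()
    for row in rows:
        if keep is None or keep(row):
            counts[row[column]] += 1
    return counts
```

For instance, count_by(iter_rows("CosmicMutantExport.tsv.gz"), "Primary site", keep=lambda r: r["Gene name"] == "TP53") would tally TP53 rows per primary site, assuming the file has columns with those names. For interactive analysis, the pandas route described above remains the more convenient choice.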