Agent Skills
Uniprot Database
AIPOCH
Direct REST API access to UniProt for protein search, entry retrieval, and identifier mapping; use when you need programmatic UniProtKB queries or cross-database ID conversion.
Total Score: 90 / 100
Core Capability: 87 / 100
- Functional Suitability: 11 / 12
- Reliability: 10 / 12
- Performance & Context: 8 / 8
- Agent Usability: 14 / 16
- Human Usability: 8 / 8
- Security: 9 / 12
- Maintainability: 10 / 12
- Agent-Specific: 17 / 20
Medical Task: 20 / 20 Passed
- You need to search UniProtKB with Lucene-style queries (e.g., by gene name, organism, reviewed status): 97 (4/4)
- You want to fetch the full details of a specific protein entry by UniProt accession (e.g., P12345): 93 (4/4)
- Protein search via UniProtKB REST endpoint using Lucene query syntax: 91 (4/4)
- Entry retrieval by accession with selectable output formats: 91 (4/4)
- End-to-end case for Protein search via UniProtKB REST endpoint using Lucene query syntax: 91 (4/4)
SKILL.md
When to Use
- You need to search UniProtKB with Lucene-style queries (e.g., by gene name, organism, reviewed status).
- You want to fetch the full details of a specific protein entry by UniProt accession (e.g., `P12345`).
- You need to map identifiers between databases (e.g., gene names, Ensembl IDs, RefSeq IDs ↔ UniProt accessions).
- You are building pipelines that require automated protein annotation retrieval in JSON/TSV/FASTA formats.
- You need a lightweight client that talks directly to UniProt’s REST API without additional SDKs.
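For pipeline use, non-JSON formats come back as raw text and need a parsing step of their own. The following is a minimal sketch, assuming a tab-separated response body with a header row. `parse_tsv` is a hypothetical helper, and the sample text is hand-written for illustration rather than real API output.

```python
import csv
import io
from typing import Dict, List


def parse_tsv(text: str) -> List[Dict[str, str]]:
    """Turn a tab-separated response body into one dict per data row."""
    # QUOTE_NONE keeps any double quotes inside annotation text untouched.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


# Hand-written sample, for illustration only (not real API output).
sample = "Entry\tGene Names\nP12345\tGENE1\nP67890\tGENE2\n"
rows = parse_tsv(sample)
print(rows[0]["Entry"])  # P12345
```

Keying each row by its column header means downstream code does not depend on column order, which can change when a different set of fields is requested.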
Key Features
- Protein search via UniProtKB REST endpoint using Lucene query syntax.
- Entry retrieval by accession with selectable output formats.
- Identifier mapping between supported source/target databases using UniProt ID mapping service.
- Format control (default `json`) for consistent downstream parsing.
- Reference docs for query syntax and available API fields:
  - `references/query_syntax.md`
  - `references/api_fields.md`
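Lucene-style queries are plain strings of `field:value` terms joined by boolean operators, so they can be assembled programmatically. The following is a minimal sketch that covers only the three criteria named in this document (gene name, organism, reviewed status). `build_query` is a hypothetical helper, not part of the skill.

```python
from typing import Optional


def build_query(gene: Optional[str] = None,
                organism_id: Optional[int] = None,
                reviewed: Optional[bool] = None) -> str:
    """Join the supplied criteria into one AND-combined query string."""
    terms = []
    if gene:
        terms.append(f"gene:{gene}")
    if organism_id is not None:
        terms.append(f"organism_id:{organism_id}")
    if reviewed is not None:
        # Booleans are written in lowercase: reviewed:true / reviewed:false.
        terms.append(f"reviewed:{str(reviewed).lower()}")
    if not terms:
        raise ValueError("at least one search criterion is required")
    return " AND ".join(terms)


print(build_query(gene="BRCA1", organism_id=9606, reviewed=True))
# gene:BRCA1 AND organism_id:9606 AND reviewed:true
```

The sketch does no escaping, so values containing spaces or Lucene special characters would need quoting first; see `references/query_syntax.md` for the full syntax.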
Dependencies
- Python `>=3.8`
- requests `>=2.31.0`
Example Usage

    import time

    import requests

    BASE = "https://rest.uniprot.org"


    def search_protein(query: str, fmt: str = "json", size: int = 5):
        """Search UniProtKB using Lucene-style query syntax."""
        url = f"{BASE}/uniprotkb/search"
        params = {"query": query, "format": fmt, "size": size}
        r = requests.get(url, params=params, timeout=30)
        r.raise_for_status()
        # Parsed JSON for format=json, raw text for every other format.
        return r.json() if fmt == "json" else r.text


    def retrieve_entry(accession: str, fmt: str = "json"):
        """Retrieve a UniProtKB entry by accession."""
        url = f"{BASE}/uniprotkb/{accession}"
        params = {"format": fmt}
        r = requests.get(url, params=params, timeout=30)
        r.raise_for_status()
        return r.json() if fmt == "json" else r.text


    def id_mapping(from_db: str, to_db: str, ids, poll_interval_s: float = 1.0):
        """
        Map identifiers using UniProt ID Mapping.

        ids can be a list of strings or a comma-separated string.
        """
        if isinstance(ids, (list, tuple)):
            ids = ",".join(ids)

        # 1) Submit mapping job
        submit_url = f"{BASE}/idmapping/run"
        r = requests.post(
            submit_url,
            data={"from": from_db, "to": to_db, "ids": ids},
            timeout=30,
        )
        r.raise_for_status()
        job_id = r.json()["jobId"]

        # 2) Poll job status
        status_url = f"{BASE}/idmapping/status/{job_id}"
        while True:
            s = requests.get(status_url, timeout=30)
            s.raise_for_status()
            payload = s.json()
            status = payload.get("jobStatus")
            # A finished job may answer with the results payload itself, which
            # carries no "jobStatus" key; hence None also counts as done.
            if status in (None, "FINISHED"):
                break
            # NEW and RUNNING are assumed to be the only in-progress states.
            # Anything else is treated as a failure so that an unexpected
            # status cannot keep this loop running forever.
            if status not in ("NEW", "RUNNING"):
                raise RuntimeError(f"ID mapping failed: {payload}")
            time.sleep(poll_interval_s)

        # 3) Fetch results (JSON)
        results_url = f"{BASE}/idmapping/results/{job_id}"
        res = requests.get(results_url, params={"format": "json"}, timeout=30)
        res.raise_for_status()
        return res.json()


    if __name__ == "__main__":
        # Search example: human BRCA1
        search = search_protein("gene:BRCA1 AND organism_id:9606", size=3)
        print("Search results (first accessions):",
              [item["primaryAccession"] for item in search.get("results", [])])

        # Retrieve entry example
        entry = retrieve_entry("P38398")  # UniProt accession for human BRCA1 (example)
        print("Entry primaryAccession:", entry.get("primaryAccession"))
        print("Protein name:",
              entry.get("proteinDescription", {})
                   .get("recommendedName", {})
                   .get("fullName", {})
                   .get("value"))

        # ID mapping example: gene name -> UniProtKB
        mapping = id_mapping(from_db="Gene_Name", to_db="UniProtKB", ids=["BRCA1"])
        print("Mapping results keys:", mapping.keys())
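Because `retrieve_entry` places the accession directly into the request path, it can help to reject malformed input before any request is sent. The following is a sketch based on the accession pattern published in UniProt's help pages. `is_valid_accession` is a hypothetical helper and is not part of the skill.

```python
import re

# Accession pattern as published in UniProt's help pages
# (6- or 10-character primary accessions).
ACCESSION_RE = re.compile(
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
)


def is_valid_accession(accession: str) -> bool:
    """Return True only if the whole string is a well-formed accession."""
    return ACCESSION_RE.fullmatch(accession) is not None


print(is_valid_accession("P38398"))       # True
print(is_valid_accession("P38398/../x"))  # False
```

This check is deliberately strict: isoform identifiers such as `P38398-2` and lowercase input are rejected, so a caller that needs them would have to normalize or extend the pattern first.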
Implementation Details
- Search Protein
  - Uses `GET /uniprotkb/search`
  - Key parameters:
    - `query`: Lucene-style query string (see `references/query_syntax.md`)
    - `format`: output format (default `json`)
    - Optional common parameters: `size`, `fields`, `sort`
  - Returns parsed JSON when `format=json`, otherwise raw text.
- Retrieve Entry
  - Uses `GET /uniprotkb/{accession}`
  - Key parameters:
    - `accession`: UniProt accession (e.g., `P12345`)
    - `format`: output format (default `json`)
  - Suitable for fetching full record details for a known accession.
- ID Mapping
  - Uses UniProt asynchronous mapping workflow:
    - `POST /idmapping/run` with `from`, `to`, `ids`
    - Poll `GET /idmapping/status/{jobId}` until finished
    - Fetch `GET /idmapping/results/{jobId}?format=json`
  - `ids` accepts either a list or a comma-separated string.
  - Recommended parameters:
    - `poll_interval_s`: controls polling frequency to avoid excessive requests.
  - `from_db`/`to_db` must match UniProt-supported database identifiers (consult UniProt mapping documentation as needed).
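The polling step of the ID mapping workflow can be bounded so that a stuck job does not block a pipeline indefinitely. The following is a sketch under stated assumptions: `wait_for_job`, its `max_wait_s` parameter, and the caller-supplied `fetch_status` callable are hypothetical, and `NEW`/`RUNNING` are assumed to be the only in-progress states. The status request is replaced by a stand-in so the example runs without network access.

```python
import time
from typing import Callable, Dict


def wait_for_job(fetch_status: Callable[[], Dict],
                 poll_interval_s: float = 1.0,
                 max_wait_s: float = 60.0) -> Dict:
    """Poll until the job is done, fails, or the deadline would be exceeded."""
    deadline = time.monotonic() + max_wait_s
    while True:
        payload = fetch_status()
        status = payload.get("jobStatus")
        # A missing "jobStatus" key is treated as "done", as in the example code.
        if status in (None, "FINISHED"):
            return payload
        # Assumption: anything other than NEW/RUNNING means the job failed.
        if status not in ("NEW", "RUNNING"):
            raise RuntimeError(f"ID mapping failed: {payload}")
        # Give up if the next sleep would carry us past the deadline.
        if time.monotonic() + poll_interval_s > deadline:
            raise TimeoutError(f"job still {status} after {max_wait_s} s")
        time.sleep(poll_interval_s)


# Stand-in for the real status request: two RUNNING replies, then FINISHED.
replies = iter([{"jobStatus": "RUNNING"},
                {"jobStatus": "RUNNING"},
                {"jobStatus": "FINISHED"}])
result = wait_for_job(lambda: next(replies), poll_interval_s=0.01)
print(result["jobStatus"])  # FINISHED
```

Passing the status request in as a callable keeps the waiting logic independent of the HTTP client, which also makes it easy to test offline. In real use, `fetch_status` would wrap the `GET /idmapping/status/{jobId}` call.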