Agent Skills

Medchem

AIPOCH

Medicinal chemistry screening filters for compound prioritization; use when you need to apply drug-likeness rules, PAINS/structural alerts, and complexity metrics to triage or optimize libraries.

101
6
FILES
medchem/
skill.md
scripts
filter_molecules.py
references
api_guide.md
rules_catalog.md
87100Total Score
View Evaluation Report
Core Capability
87 / 100
Functional Suitability
11 / 12
Reliability
10 / 12
Performance & Context
8 / 8
Agent Usability
14 / 16
Human Usability
8 / 8
Security
9 / 12
Maintainability
10 / 12
Agent-Specific
17 / 20
Medical Task
20 / 20 Passed
92Screening large compound libraries to quickly triage for drug-like candidates (e.g., Lipinski/Veber + alerts)
4/4
88Flagging problematic chemotypes (e.g., PAINS, reactive/toxicophores, curated structural alerts) before follow-up assays
4/4
86Drug-likeness and medchem rule sets: Lipinski (Ro5), Veber, Oprea, CNS, lead-like (soft/strict), Rule of Three, REOS, Golden Triangle, etc
4/4
86PAINS and structural alert filtering: curated alert catalogs and pattern-based screening
4/4
86End-to-end case for Drug-likeness and medchem rule sets: Lipinski (Ro5), Veber, Oprea, CNS, lead-like (soft/strict), Rule of Three, REOS, Golden Triangle, etc
4/4

SKILL.md

When to Use

  • Screening large compound libraries to quickly triage for drug-like candidates (e.g., Lipinski/Veber + alerts).
  • Flagging problematic chemotypes (e.g., PAINS, reactive/toxicophores, curated structural alerts) before follow-up assays.
  • Prioritizing lead-optimization candidates with stricter criteria (lead-like rules, demerit systems, complexity caps).
  • Enforcing property constraints (MW/logP/TPSA/rotatable bonds) for target-specific design windows (e.g., CNS).
  • Identifying molecules containing specific functional groups/scaffolds (e.g., Michael acceptors, hinge binders) for SAR or risk assessment.

Key Features

  • Drug-likeness and medchem rule sets: Lipinski (Ro5), Veber, Oprea, CNS, lead-like (soft/strict), Rule of Three, REOS, Golden Triangle, etc.
  • PAINS and structural alert filtering: curated alert catalogs and pattern-based screening.
  • Curated industrial filter sets: e.g., NIBR filters; Lilly demerit scoring with pass/fail thresholds.
  • Functional-group detection via SMARTS-based group matchers (hinge binders, phosphate binders, Michael acceptors, reactive groups, custom patterns).
  • Named catalogs of curated structures (functional groups, protecting groups, reagents, fragments) for matching and annotation.
  • Molecular complexity metrics (e.g., Bertz/Whitlock/Barone-style) and threshold-based complexity filters.
  • Constraint-based filtering for property windows (MW/logP/TPSA/RB, etc.).
  • Query language to combine heterogeneous criteria (rules + alerts + numeric thresholds) into a single expression.

Dependencies

  • medchem (latest)
  • datamol (latest)
  • pandas (latest, for tabular workflows)

Example Usage

# End-to-end, runnable example:
# 1) load SMILES
# 2) apply Ro5 + Veber
# 3) apply common structural alerts
# 4) compute complexity and filter
# 5) export a CSV with decisions

import pandas as pd
import datamol as dm
import medchem as mc

smiles_list = [
    "CC(=O)OC1=CC=CC=C1C(=O)O",  # aspirin
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",  # caffeine
    "c1ccccc1",  # benzene
]

df = pd.DataFrame({"smiles": smiles_list})
mols = [dm.to_mol(smi) for smi in df["smiles"]]

# 1) Drug-likeness rules
rule_filter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_veber"])
rule_res = rule_filter(mols=mols, n_jobs=-1, progress=False)
df["passes_rules"] = rule_res["pass"]

# 2) Structural alerts
alerts = mc.structural.CommonAlertsFilters()
alert_res = alerts(mols=mols, n_jobs=-1, progress=False)
df["has_alerts"] = alert_res["has_alerts"]

# 3) Complexity (example threshold)
complex_filter = mc.complexity.ComplexityFilter(max_complexity=500)
complex_res = complex_filter(mols=mols, n_jobs=-1, progress=False)
df["passes_complexity"] = complex_res["pass"]

# 4) Final decision
df["keep"] = df["passes_rules"] & (~df["has_alerts"]) & df["passes_complexity"]

# 5) Save results
df.to_csv("medchem_screening_results.csv", index=False)
print(df)

Implementation Details

  • Rule evaluation (medchem.rules)

    • Rules are implemented as callable checks over SMILES or RDKit-like molecule objects (commonly via datamol).
    • RuleFilters(rule_list=[...]) applies multiple rules and returns a structured result (typically including an overall pass plus per-rule details).
    • Typical use: start broad (Ro5/Veber), then tighten (CNS/lead-like) as project constraints become clearer.
  • Structural alerts (medchem.structural)

    • Alert systems are primarily SMARTS/pattern-based matchers curated from literature/industrial practice.
    • CommonAlertsFilters, NIBRFilters, and LillyDemeritsFilters provide different philosophies:
      • Common alerts: general-purpose red flags.
      • NIBR: curated industrial filter set.
      • Lilly demerits: assigns penalties per matched rule; a common convention is reject if total demerits > 100.
  • Complexity (medchem.complexity)

    • Complexity scores approximate synthetic difficulty / structural intricacy using established heuristics (e.g., Bertz/Whitlock/Barone-style metrics).
    • ComplexityFilter(max_complexity=...) converts a numeric score into a pass/fail gate for library triage.
  • Constraints (medchem.constraints)

    • Property windows (MW/logP/TPSA/rotatable bonds, etc.) are applied as hard filters.
    • Use constraints to encode target-specific design hypotheses (e.g., CNS-like space) rather than universal “good/bad” judgments.
  • Groups and catalogs (medchem.groups, medchem.catalogs)

    • Group detection is SMARTS-driven and returns boolean matches and/or match details (substructure hits).
    • Named catalogs provide curated sets for consistent annotation and matching across projects.
  • Parallelization

    • Most batch APIs accept n_jobs; set n_jobs=-1 to use all available CPU cores for large libraries.