External Model Validation

AIPOCH

Use when validating an existing prognostic risk signature on an external bulk expression cohort with survival outcomes, producing risk scores, Kaplan-Meier curves, risk distribution plots, a heatmap, and time-dependent ROC curves. NOT for: model training, feature selection, nomogram construction, calibration analysis, or single-cell data.

FILES
external-model-validation/
  skill.md
  scripts/
    functions.R
    io.R
    main.R
    plotting.R
    run_analysis.R
    utils.R
  references/
    algorithm.md
    baseline-run.md
    cli-guide.md
    project-structure.md
    troubleshooting.md

SKILL.md

Input Validation

This skill accepts: an existing prognostic gene signature (model coefficient file with Gene and Coef columns), a bulk expression matrix in CSV format (genes as rows, samples as columns), and a clinical file with OS and OS.time survival columns.

If the user's request does not involve validating a pre-existing prognostic model on an external cohort (for example, asking to train a new model, perform feature selection, build a nomogram, run calibration curves, analyze single-cell data, or process data without survival endpoints), do not proceed with the workflow. Instead respond:

"external-model-validation is designed to validate an existing prognostic risk signature on an external bulk expression cohort with survival outcomes. Your request appears to be outside this scope. Please provide a fixed model coefficient file plus expression and clinical data with OS/OS.time columns, or use a more appropriate tool for model training, nomogram construction, or single-cell analysis."

When to Read External Files

| Situation | File to Read | Purpose |
| --- | --- | --- |
| Need to run the analysis | scripts/main.R | Execute: Rscript scripts/main.R --exp_file ... --cli_file ... --model_file ... |
| Need workflow order or output generation steps | scripts/run_analysis.R | Review the 4-step orchestration of loading, scoring, plotting, and metadata export |
| Need risk score or sample matching logic | scripts/functions.R | Inspect core data preparation and validation logic |
| Need output writing or metadata export details | scripts/io.R | Inspect output directory creation and file-writing helpers |
| Need plotting implementation details | scripts/plotting.R | Inspect Kaplan-Meier, risk, heatmap, and ROC plot generation |
| Need input validation, logging, timeout, or dependency logic | scripts/utils.R | Review validation helpers, SKILL_* error handling, logging, and runtime safeguards |
| Need statistical assumptions or method details | references/algorithm.md | Risk score formula, group cutoff, survival analysis, ROC, and heatmap assumptions |
| Need troubleshooting help | references/troubleshooting.md | Common failures, warnings, and concrete fixes |
| Need CLI usage examples | references/cli-guide.md | Parameter explanations, examples, and command patterns |
| Need expected outputs or benchmark run | references/baseline-run.md | Real-data baseline command, runtime, memory checkpoints, and output inventory |
| Need test inputs | tests/data/ | Example expression, clinical, and model files for validation |
| Need to refresh the retained example output | tests/refresh_example_output.R | Rebuild tests/output/ with --overwrite using the bundled test data |

Usage

Rscript scripts/main.R \
  --exp_file ./expression.csv \
  --cli_file ./clinical.csv \
  --model_file ./model.csv \
  --output_dir ./output/ \
  --time_unit month \
  --seed 42

Arguments

| Short | Long | Type | Default | Description |
| --- | --- | --- | --- | --- |
| -e | --exp_file | character | required | Expression matrix CSV with genes as rows and samples as columns |
| -c | --cli_file | character | required | Clinical CSV with sample IDs as row names and OS, OS.time columns |
| -m | --model_file | character | required | Model coefficient CSV with Gene and Coef columns |
| -o | --output_dir | character | ./output/ | Output directory |
| | --overwrite | flag | FALSE | Allow writing into a non-empty output directory |
| -u | --time_unit | character | month | Survival time unit in the input clinical file: day, month, or year |
| | --col_high | character | #E64B35 | Color for high-risk samples |
| | --col_low | character | #4DBBD5 | Color for low-risk samples |
| | --roc_cols | character | #E64B35,#00A087,#3C5488 | Comma-separated colors for ROC curves |
| | --roc_times | character | 1,3,5 | Comma-separated ROC time points, always in years regardless of --time_unit (e.g., 1,3,5 for 1, 3, and 5 years, even when follow-up is recorded in days or months) |
| | --roc_pos | character | bottomright | ROC legend position |
| | --km_breaks | integer | 0 | Kaplan-Meier x-axis break interval in years; 0 selects it automatically |
| -s | --seed | integer | 42 | Random seed for reproducibility |
| | --timeout_seconds | integer | 3600 | Elapsed timeout limit in seconds |
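
The --overwrite guard described above can be pictured with a short base R sketch. This is only an illustration under stated assumptions: prepare_output_dir is a hypothetical helper name, and the skill's real logic lives in scripts/io.R and may differ.

```r
# Hypothetical helper (not the skill's actual code): refuse to reuse a
# non-empty output directory unless overwrite = TRUE.
prepare_output_dir <- function(output_dir, overwrite = FALSE) {
  if (dir.exists(output_dir)) {
    # no.. = TRUE drops "." and ".." so an empty directory has length 0
    existing <- list.files(output_dir, all.files = TRUE, no.. = TRUE)
    if (length(existing) > 0 && !overwrite) {
      stop("Output directory is not empty; pass --overwrite to reuse it: ",
           output_dir)
    }
  } else {
    dir.create(output_dir, recursive = TRUE)
  }
  invisible(output_dir)
}

# Demo on a throwaway directory
demo_dir <- tempfile("emv_out_")
prepare_output_dir(demo_dir)                    # creates the directory
file.create(file.path(demo_dir, "old_result.txt"))
second_try <- try(prepare_output_dir(demo_dir), silent = TRUE)  # refused
prepare_output_dir(demo_dir, overwrite = TRUE)  # allowed
```

Failing early keeps results from an earlier run from being mixed silently with a new one.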

When to Use

  • You already have a fixed prognostic gene signature and coefficients.
  • You need to test that model on an independent cohort with bulk expression and survival data.
  • You want standard outputs for external validation: risk table, Kaplan-Meier curve, risk score plot, survival status plot, expression heatmap, and time-dependent ROC.

When Not to Use

  • Do not use this skill to train or re-fit a prognostic model.
  • Do not use it for nomogram construction, calibration curves, DCA, or diagnostic classification.
  • Do not use it for single-cell expression matrices or cohorts without survival endpoints.
  • Do not use identifiable patient data without de-identification and local compliance approval.
  • Do not use it for cohorts with very few events; with fewer than 5 events, Kaplan-Meier and ROC results may be unreliable.

Research Use Notice

  • This skill is for research and validation workflows only.
  • It does not provide diagnosis, treatment recommendations, or clinical decision support.
  • Use de-identified data and follow IRB, ethics, and data-use requirements before running on human cohorts.

Input Format

Expression Matrix (exp_file)

CSV with genes as rows and samples as columns. The first column must contain gene identifiers.

"","Sample_1","Sample_2","Sample_3"
"TSPAN6",3.87,4.54,8.12
"TNMD",9.98,5.86,5.38
"DPM1",7.95,6.11,5.41

Clinical File (cli_file)

CSV with sample IDs as row names and at least OS and OS.time columns.

,Age,OS,OS.time
Sample_1,59,0,133.5
Sample_2,60,0,49.13
Sample_3,59,1,22.40

  • OS must use 0/1 encoding.
  • OS.time must be positive and interpretable under --time_unit.
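
Before launching a run, the clinical file rules above can be checked by hand. The following is a minimal base R sketch, not the skill's own validation code; check_clinical is a hypothetical name, and missing values are deliberately left for the later incomplete-case step.

```r
# Hypothetical pre-flight check for the clinical table (illustrative only).
check_clinical <- function(cli) {
  missing_cols <- setdiff(c("OS", "OS.time"), colnames(cli))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  # NA is tolerated here; incomplete cases are assumed to be dropped later
  if (!all(cli$OS %in% c(0, 1, NA))) {
    stop("OS must use 0/1 encoding")
  }
  if (!is.numeric(cli$OS.time) || any(cli$OS.time <= 0, na.rm = TRUE)) {
    stop("OS.time must be numeric and positive")
  }
  invisible(TRUE)
}

# The example table from the text; row.names = 1 turns the first column
# into sample IDs.
csv_text <- ",Age,OS,OS.time
Sample_1,59,0,133.5
Sample_2,60,0,49.13
Sample_3,59,1,22.40"
cli <- read.csv(text = csv_text, row.names = 1)
check_clinical(cli)
```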

Model Coefficient File (model_file)

CSV with two required columns: Gene and Coef.

Gene,Coef
TSPAN6,-0.25
TNMD,0.15
DPM1,0.32

Output Files

| File | Description |
| --- | --- |
| data/risk_data.rds | Serialized analysis dataset containing survival data, model gene expression, risk scores, and risk groups |
| table/out_varifyRisk.txt | Tab-delimited risk table for all matched samples |
| plot/out_varifySurv.pdf | Kaplan-Meier survival curve with risk table |
| plot/out_varify.riskScore.pdf | Ordered risk score plot |
| plot/out_varify.survStat.pdf | Survival status plot |
| plot/out_varify.heatmap.pdf | Heatmap of model genes across ordered samples |
| plot/out_varify.ROC.pdf | Time-dependent ROC curve PDF |
| analysis.log | Runtime log including memory checkpoints and processing steps |
| run_parameters.tsv | Exact parameter values used for the run |
| session_info.txt | R version, platform, and package session information |
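
After a run, it can be useful to confirm that every file in the inventory above was written. A small base R sketch follows; missing_outputs is a hypothetical helper, and the relative paths are copied from the table as listed.

```r
# File inventory copied from the Output Files table.
expected_outputs <- c(
  "data/risk_data.rds",
  "table/out_varifyRisk.txt",
  "plot/out_varifySurv.pdf",
  "plot/out_varify.riskScore.pdf",
  "plot/out_varify.survStat.pdf",
  "plot/out_varify.heatmap.pdf",
  "plot/out_varify.ROC.pdf",
  "analysis.log",
  "run_parameters.tsv",
  "session_info.txt"
)

# Hypothetical helper: returns the expected files that are absent.
missing_outputs <- function(output_dir) {
  present <- file.exists(file.path(output_dir, expected_outputs))
  expected_outputs[!present]
}

# Demo on a throwaway directory that holds only the log file
demo_dir <- tempfile("emv_check_")
dir.create(demo_dir)
file.create(file.path(demo_dir, "analysis.log"))
still_missing <- missing_outputs(demo_dir)
```

An empty result from missing_outputs("./output") would indicate that the full output set is present.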

Workflow

Step 1: Validate Inputs

  • Check required files and CSV extensions.
  • Validate color strings, timeout, seed, KM break setting, and time unit choice.
  • Parse --roc_times and --roc_cols.
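
Parsing the comma-separated --roc_times and --roc_cols values might look like the sketch below. Both function names are hypothetical, and the exact rules enforced by scripts/utils.R are not shown in this document, so treat the checks as assumptions.

```r
# Hypothetical parser for values such as "1,3,5".
parse_csv_numbers <- function(x) {
  parts <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  values <- suppressWarnings(as.numeric(parts))
  if (length(values) == 0 || anyNA(values) || any(values <= 0)) {
    stop("Expected comma-separated positive numbers, got: ", x)
  }
  values
}

# Hypothetical color check: col2rgb() raises an error for strings that R
# cannot interpret as a color.
parse_csv_colors <- function(x) {
  cols <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  ok <- vapply(cols, function(col) {
    !inherits(try(grDevices::col2rgb(col), silent = TRUE), "try-error")
  }, logical(1))
  if (!all(ok)) {
    stop("Invalid color(s): ", paste(cols[!ok], collapse = ", "))
  }
  cols
}

roc_times <- parse_csv_numbers("1,3,5")
roc_cols <- parse_csv_colors("#E64B35,#00A087,#3C5488")
```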

Step 2: Build Matched Validation Dataset

  • Read expression, clinical, and model files.
  • Match samples shared by expression columns and clinical row names.
  • Check all model genes exist in the expression matrix.
  • Remove incomplete cases before downstream analysis.
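
The matching logic in Step 2 can be sketched in base R as follows. The toy values reuse the input examples, the extra clinical rows are invented for illustration, and the object names (expr, cli, model, dat) are assumptions rather than the names used in scripts/functions.R.

```r
# Toy inputs in the documented layouts (illustrative values only).
expr <- matrix(
  c(3.87, 4.54, 8.12,
    9.98, 5.86, 5.38,
    7.95, 6.11, 5.41),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("TSPAN6", "TNMD", "DPM1"),
                  c("Sample_1", "Sample_2", "Sample_3"))
)
cli <- data.frame(
  OS = c(0, NA, 1, 1),
  OS.time = c(133.5, 49.13, 22.4, 10),
  row.names = c("Sample_1", "Sample_2", "Sample_3", "Sample_4")
)
model <- data.frame(Gene = c("TSPAN6", "TNMD", "DPM1"),
                    Coef = c(-0.25, 0.15, 0.32),
                    stringsAsFactors = FALSE)

# Samples present in both expression columns and clinical row names
shared <- intersect(colnames(expr), rownames(cli))
if (length(shared) == 0) stop("No overlapping samples")

# Every model gene must be a row of the expression matrix
missing_genes <- setdiff(model$Gene, rownames(expr))
if (length(missing_genes) > 0) {
  stop("Model genes absent from expression matrix: ",
       paste(missing_genes, collapse = ", "))
}

# One row per sample: survival columns plus model-gene expression
dat <- cbind(cli[shared, c("OS", "OS.time")],
             t(expr[model$Gene, shared, drop = FALSE]))

# Drop incomplete cases (Sample_2 has a missing OS value here)
dat <- dat[complete.cases(dat), , drop = FALSE]
```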

Step 3: Calculate Risk Scores and Groups

  • Compute risk scores with the supplied linear predictor.
  • Convert follow-up time into years.
  • Split patients into low and high groups using the median risk score.
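
The follow-up conversion in Step 3 can be illustrated with the sketch below. The conversion factors are not stated in this document: 365 days and 12 months per year are assumptions made here, and to_years is a hypothetical name, so check references/algorithm.md for the factors the skill actually uses.

```r
# Hypothetical conversion of follow-up time to years.
# ASSUMPTION: 365 days per year and 12 months per year.
to_years <- function(time, unit = c("month", "day", "year")) {
  unit <- match.arg(unit)
  divisor <- c(day = 365, month = 12, year = 1)[[unit]]
  time / divisor
}

from_days <- to_years(c(365, 730), "day")
from_months <- to_years(c(12, 36), "month")
from_years <- to_years(c(1.5, 4), "year")
```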

Step 4: Generate Validation Outputs

  • Save the full risk table and RDS object.
  • Produce Kaplan-Meier, risk score, survival status, heatmap, and time-dependent ROC plots.
  • Save session metadata and exact run parameters.

Methods

Risk Score Formula

For sample i, the skill computes:

riskScore_i = sum(expression_ig * coefficient_g)

where the sum runs over every gene g listed in model_file.
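
As a worked illustration of the formula, the sketch below scores the three example samples with the three example coefficients from the Input Format section. It is a base R illustration of the linear predictor, not an excerpt from scripts/functions.R.

```r
# Expression matrix (genes x samples) and coefficients from the examples.
expr <- matrix(
  c(3.87, 4.54, 8.12,
    9.98, 5.86, 5.38,
    7.95, 6.11, 5.41),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("TSPAN6", "TNMD", "DPM1"),
                  c("Sample_1", "Sample_2", "Sample_3"))
)
model <- data.frame(Gene = c("TSPAN6", "TNMD", "DPM1"),
                    Coef = c(-0.25, 0.15, 0.32),
                    stringsAsFactors = FALSE)

# Rows are reordered to match model$Gene, so the coefficient vector is
# recycled down each column and multiplies the matching gene.
weighted <- expr[model$Gene, , drop = FALSE] * model$Coef

# Sum over genes for each sample: riskScore_i = sum_g(expression_ig * coefficient_g)
risk_score <- colSums(weighted)
```

For Sample_1 this gives 3.87 × (−0.25) + 9.98 × 0.15 + 7.95 × 0.32 = 3.0735.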

Risk Stratification

  • Samples are ordered by riskScore.
  • The median risk score is used as the cutoff.
  • Samples with scores above the median are labeled high; the others are labeled low.
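
A minimal base R sketch of the median split, using made-up scores. Note that a sample whose score equals the median falls into the low group under the rule stated above.

```r
# Made-up risk scores for five samples (illustrative only).
risk_score <- c(S1 = 0.2, S2 = 1.5, S3 = 0.9, S4 = 0.9, S5 = 2.1)

# Order samples by score, as done for the risk plots
ordered_scores <- sort(risk_score)

# Strictly above the median is "high"; everything else is "low"
cutoff <- median(risk_score)
risk_group <- ifelse(risk_score > cutoff, "high", "low")
```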

Survival Analysis

  • Kaplan-Meier curves are fit with survival::survfit.
  • Group difference is shown with the default log-rank p-value in survminer::ggsurvplot.
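
The survival comparison can be reproduced outside the plotting code with the survival package alone. In this sketch the data and column names are invented, and survival::survdiff is swapped in to obtain a log-rank p-value directly; the skill itself reports the p-value through survminer::ggsurvplot.

```r
library(survival)

# Invented toy cohort: follow-up in years, 0/1 event status, risk group.
dat <- data.frame(
  time_years = c(1.2, 2.5, 3.1, 4.0, 0.8, 1.1, 1.9, 2.2),
  OS = c(0, 1, 0, 1, 1, 1, 1, 0),
  risk = rep(c("low", "high"), each = 4),
  stringsAsFactors = FALSE
)

# Kaplan-Meier estimate per risk group
fit <- survfit(Surv(time_years, OS) ~ risk, data = dat)

# Log-rank test; the statistic is chi-squared with (groups - 1) df
lr <- survdiff(Surv(time_years, OS) ~ risk, data = dat)
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
```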

Time-Dependent ROC

  • ROC analysis is performed with timeROC::timeROC using follow-up time in years.
  • All --roc_times values must be smaller than the maximum observed follow-up time, expressed in years.
  • --roc_times is always interpreted in years, regardless of --time_unit.
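
The range rule for --roc_times can be checked before ROC analysis with a few lines of base R. check_roc_times is a hypothetical helper and the follow-up values are made up; how the skill reports the failure may differ.

```r
# Hypothetical check: every ROC time point (in years) must be strictly
# smaller than the longest observed follow-up (also in years).
check_roc_times <- function(roc_times, time_years) {
  max_follow_up <- max(time_years, na.rm = TRUE)
  out_of_range <- roc_times[roc_times >= max_follow_up]
  if (length(out_of_range) > 0) {
    stop("ROC time point(s) not below the maximum follow-up of ",
         round(max_follow_up, 2), " years: ",
         paste(out_of_range, collapse = ", "))
  }
  invisible(roc_times)
}

follow_up_years <- c(2.0, 6.5, 4.1)           # made-up follow-up times
check_roc_times(c(1, 3, 5), follow_up_years)  # passes: all below 6.5
too_long <- try(check_roc_times(c(1, 3, 7), follow_up_years), silent = TRUE)
```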

Examples

Basic Usage

Rscript scripts/main.R \
  -e tests/data/BRCA_data.csv \
  -c tests/data/BRCA_clinic.csv \
  -m tests/data/BRCA_coef.csv \
  -o ./output/

Input Follow-up Recorded in Days

Rscript scripts/main.R \
  -e expression.csv \
  -c clinical.csv \
  -m model.csv \
  -o ./output \
  -u day \
  --roc_times 1,2,3

Note: --roc_times 1,2,3 means 1, 2, and 3 years, even though --time_unit day was supplied. The skill converts OS.time from days to years internally before ROC computation.

Custom Plot Colors and ROC Settings

Rscript scripts/main.R \
  -e expression.csv \
  -c clinical.csv \
  -m model.csv \
  -o ./output \
  --col_high '#B2182B' \
  --col_low '#2166AC' \
  --roc_cols '#B2182B,#4D9221,#2166AC' \
  --roc_pos topleft \
  --km_breaks 2

Error Handling

Common Errors

| Error | Cause | Solution |
| --- | --- | --- |
| SKILL_FILE_NOT_FOUND | Input path is missing or wrong | Check file path and permissions |
| SKILL_MISSING_COLUMNS | Clinical or model file lacks required columns | Ensure OS, OS.time, Gene, and Coef exist |
| SKILL_SAMPLE_MISMATCH | No overlapping samples between expression and clinical data | Align sample IDs exactly |
| SKILL_EMPTY_DATA | An input file is empty after loading | Verify the CSV contains at least one row and one column of usable data |
| SKILL_INVALID_DATA | Duplicate genes, empty data, non-numeric coefficients, or invalid survival values | Clean input tables and verify formats. For duplicate genes, deduplicate with dplyr::distinct() or keep the row with the highest mean expression (e.g., mat[order(-rowMeans(mat[,-1])),] %>% distinct(Gene, .keep_all=TRUE)) |
| SKILL_ANALYSIS_ERROR | Risk groups collapse or event count is too low | Use a valid signature and a cohort with enough events (minimum ~5) |
| SKILL_INVALID_PARAMETER | Bad --time_unit, invalid color, or impossible ROC time point | Correct the parameter value |
| SKILL_DEPENDENCY_MISSING | Required R package is not installed | Install the missing package |
| SKILL_PKG_VERSION | Installed package version is below the required minimum | Upgrade the package to the required version |

IF error persists, READ: references/troubleshooting.md
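
For the duplicate-gene case under SKILL_INVALID_DATA, the dplyr one-liner has a base R equivalent. The sketch below assumes, as that one-liner does, that the first column is named Gene and the remaining columns are numeric samples; the data are made up.

```r
# Made-up expression table with a duplicated gene "A".
mat <- data.frame(
  Gene = c("A", "B", "A"),
  S1 = c(1, 5, 3),
  S2 = c(2, 6, 4),
  stringsAsFactors = FALSE
)

# Mean expression per row across the sample columns
mean_expr <- rowMeans(mat[, -1, drop = FALSE])

# Sort by descending mean, then keep the first row seen for each gene,
# i.e. the duplicate with the highest mean expression.
mat <- mat[order(-mean_expr), , drop = FALSE]
mat <- mat[!duplicated(mat$Gene), , drop = FALSE]
```

Writing the result back with write.csv(mat, "expression_dedup.csv", row.names = FALSE) would give a file with genes in the first column; the output file name here is only an example.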


Testing

Test with Included Data

# Check CLI
Rscript scripts/main.R --help

# Run with bundled test data in a fresh output directory
Rscript scripts/main.R \
  -e tests/data/BRCA_data.csv \
  -c tests/data/BRCA_clinic.csv \
  -m tests/data/BRCA_coef.csv \
  -o ./output/

Validation Commands

# Run R tests
Rscript tests/testthat.R

# Refresh the retained example output bundle
Rscript tests/refresh_example_output.R

# Inspect the generated risk table
wc -l tests/output/table/out_varifyRisk.txt

# Review the retained example outputs
ls -la tests/output/

Real-data Baseline

The repository stores a documented real-data baseline summary in references/baseline-run.md.

IF you need exact benchmark outputs or runtime expectations, READ: references/baseline-run.md

→ Directory structure and implementation details: references/project-structure.md