PPI Network Analysis

AIPOCH

Use when you need a standardized R CLI workflow to build a protein-protein interaction network from a local gene list and an offline STRING cache, export node and edge tables, and render a reproducible PDF network plot. NOT for online API fetching, arbitrary graph databases, multi-omics integration, or non-STRING interaction sources.

FILES

ppi-network-analysis/
  skill.md
  scripts/
    cli_options.R
    core_option_groups.R
    functions.R
    io.R
    main.R
    option_validation.R
    path_utils.R
    plot_option_groups.R
    recording.R
    run_analysis.R
    string_cache.R
    utils.R
    validation_utils.R
    visualization.R
  references/
    algorithm.md
    cli-guide.md
    string_cache/
      10090.protein.aliases.v11.5.txt.gz
      10090.protein.info.v11.5.txt.gz
      10090.protein.links.v11.5.txt.gz
      9606.protein.aliases.v11.5.txt.gz
      9606.protein.info.v11.5.txt.gz
      9606.protein.links.v11.5.txt.gz
    troubleshooting.md
Evaluation

Total Score: 90 / 100

Core Capability: 94 / 100

| Category | Score |
| --- | --- |
| Functional Suitability | 12 / 12 |
| Reliability | 11 / 12 |
| Performance & Context | 8 / 8 |
| Agent Usability | 15 / 16 |
| Human Usability | 8 / 8 |
| Security | 12 / 12 |
| Maintainability | 12 / 12 |
| Agent-Specific | 16 / 20 |

Medical Task: 32 / 32 Passed

| Test case | Score | Checks passed |
| --- | --- | --- |
| Human bundled gene list | 92 | 5/5 |
| Numeric species with styled plot | 90 | 5/5 |
| Lower-bound threshold 400 | 88 | 5/5 |
| Plot-only regeneration | 84 | 4/4 |
| High-option styled run | 92 | 5/5 |
| Unsupported species rejection | 82 | 4/4 |
| Invalid line_type rejection | 84 | 4/4 |

SKILL.md

PPI Network Analysis

When to Read External Files

| Situation | File to Read | Purpose |
| --- | --- | --- |
| Need algorithm details | references/algorithm.md | Explain local STRING mapping, interaction filtering, network metrics, and plot interpretation |
| Need to execute the analysis | scripts/main.R | Run the CLI entry point with a complete Rscript command |
| Encounter an error | references/troubleshooting.md | Map standardized error codes to causes and fixes |
| Need CLI examples or baseline usage | references/cli-guide.md | Review installation notes, offline cache requirements, and runnable examples |
| Need a runnable smoke test | tests/data/ | Use the bundled small gene list for verification |

Usage

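# Full analysis: map genes, filter interactions, write tables and the plot
Rscript scripts/main.R \
  --genelist_file ./input/gene_list.csv \
  --species human \
  --threshold 700 \
  --output_dir output/basic-run \
  --seed 42 \
  --timeout_seconds 600

# Plot-only regeneration: reuses output/basic-run/data/ppi_result.rds from a previous full run
Rscript scripts/main.R \
  --plot_only TRUE \
  --output_dir output/basic-run \
  --seed 42 \
  --timeout_seconds 600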
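
When the CLI is driven from another program rather than typed by hand, building the argument list explicitly avoids shell quoting mistakes. The following is a minimal Python sketch, not part of the skill itself; the helper names `build_ppi_command` and `build_plot_only_command` are hypothetical, and only flags documented in this file are used.

```python
import shlex


def build_ppi_command(genelist_file, species, threshold,
                      output_dir="output", seed=42, timeout_seconds=600):
    """Return the argv list for a full-analysis run (hypothetical helper)."""
    return [
        "Rscript", "scripts/main.R",
        "--genelist_file", str(genelist_file),
        "--species", str(species),
        "--threshold", str(threshold),
        "--output_dir", str(output_dir),
        "--seed", str(seed),
        "--timeout_seconds", str(timeout_seconds),
    ]


def build_plot_only_command(output_dir="output", seed=42, timeout_seconds=600):
    """Return the argv list for a plot-only rerun (hypothetical helper)."""
    return [
        "Rscript", "scripts/main.R",
        "--plot_only", "TRUE",
        "--output_dir", str(output_dir),
        "--seed", str(seed),
        "--timeout_seconds", str(timeout_seconds),
    ]


if __name__ == "__main__":
    cmd = build_ppi_command("./input/gene_list.csv", "human", 700,
                            output_dir="output/basic-run")
    # Print a copy-pasteable shell command; pass the list itself to
    # subprocess.run(cmd, check=True) to actually execute it.
    print(shlex.join(cmd))
```

The list form can be handed directly to `subprocess.run` from the skill root, so paths containing spaces need no extra quoting.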

Arguments

| Short | Long | Type | Default | Required | Description |
| --- | --- | --- | --- | --- | --- |
| -g | --genelist_file | character | none | yes, unless --plot_only TRUE | Gene list file in CSV, TSV, TXT, or XLSX format |
| -s | --species | character | none | yes, unless --plot_only TRUE | Species: human, mouse, 9606, or 10090 |
| -t | --threshold | integer | none | yes, unless --plot_only TRUE | STRING combined-score threshold from 400 to 1000 |
| -o | --output_dir | character | output | no | Output directory inside the skill root |
| -p | --plot_only | logical | FALSE | no | Reuse output_dir/data/ppi_result.rds and regenerate the network plot |
| -d | --seed | integer | 42 | no | Random seed used for layout reproducibility |
| -u | --timeout_seconds | integer | 600 | no | Elapsed time limit in seconds |
| | --string_cache_dir | character | references/string_cache | no | Local STRING cache directory; if omitted, the bundled cache inside the skill is used |
| | --string_version | character | auto | no | Preferred STRING cache version; use auto, v11.5, or v12.0 when available |
| | --figure_family | character | sans | no | PDF font family: sans, serif, or mono |
| | --figure_width | numeric | 12 | no | Plot width in inches |
| | --figure_height | numeric | 10 | no | Plot height in inches |
| | --label | character | node | no | Label mode: node or none |
| | --label_size | numeric | 0.8 | no | Label size |
| | --label_color | character | black | no | Label color |
| | --label_dist | numeric | 0 | no | Label distance from the node center |
| | --line_alpha | numeric | 1 | no | Edge alpha |
| | --line_color | character | built-in palette | no | Comma-separated edge colors |
| | --line_size | numeric | 0.8 | no | Base edge width |
| | --line_type | character | solid | no | Edge line type; supported values in plotting are solid, dashed, or dotted |
| | --mapping_link_alpha | character | value | no | Map edge alpha from interaction score: value or none |
| | --mapping_link_color | character | value | no | Map edge color from interaction score: value or none |
| | --mapping_link_size | character | value | no | Map edge width from interaction score: value or none |
| | --mapping_node_alpha | character | none | no | Map node alpha from degree: value or none |
| | --mapping_node_color | character | none | no | Map node color from degree: value or none |
| | --mapping_node_size | character | value | no | Map node size from degree: value or none |
| | --point_alpha | numeric | 1 | no | Node alpha |
| | --point_color | character | built-in palette | no | Comma-separated node border colors |
| | --point_fill | character | built-in palette | no | Comma-separated node fill colors |
| | --point_shape | character | circle | no | Node shape: circle or square |
| | --point_size | numeric | 12 | no | Base node size |
| | --style_layout | character | nicely | no | Layout style: kk, fr, nicely, circle, star, grid, or randomly |
| | --style_line | character | straight | no | Edge style: straight or curve |
| | --theme_size | numeric | 0.8 | no | Theme size placeholder retained for compatibility |
| | --title | character | empty | no | Main plot title |
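
The allowed values in the table can be checked before launching R, which gives faster feedback than waiting for a SKILL_INVALID_PARAMETER error. The sketch below is an illustrative Python pre-check under stated assumptions: the function names are hypothetical, case-insensitive species matching is an assumption, and the skill's own validation in scripts/option_validation.R may differ in detail.

```python
# Species aliases documented for --species; 9606 and 10090 are the
# NCBI taxonomy IDs for human and mouse used in the STRING cache file names.
SPECIES_TO_TAXON = {
    "human": "9606",
    "9606": "9606",
    "mouse": "10090",
    "10090": "10090",
}
LINE_TYPES = ("solid", "dashed", "dotted")


def normalize_species(species):
    """Map a --species value to its taxon ID (hypothetical helper)."""
    key = str(species).strip().lower()  # case folding is an assumption
    if key not in SPECIES_TO_TAXON:
        raise ValueError(f"SKILL_INVALID_PARAMETER: unsupported species {species!r}")
    return SPECIES_TO_TAXON[key]


def check_threshold(threshold):
    """Return the threshold as an int if it lies in the documented 400-1000 range."""
    value = int(str(threshold).strip())  # raises ValueError for non-integers
    if not 400 <= value <= 1000:
        raise ValueError(f"SKILL_INVALID_PARAMETER: threshold {value} outside 400-1000")
    return value


def check_line_type(line_type):
    """Return the line type if it is one of the documented plotting values."""
    if line_type not in LINE_TYPES:
        raise ValueError(f"SKILL_INVALID_PARAMETER: unsupported line_type {line_type!r}")
    return line_type
```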

Input Format

Supported input types

--genelist_file accepts the following formats:

  • .csv
  • .tsv
  • .txt
  • .xlsx

Gene list parsing rules

  • Plain-text .txt files can be provided as one gene symbol per line without a header.
  • For .csv, .tsv, and .xlsx, the tool automatically selects a likely gene column.
  • Preferred column names include: gene, genes, genename, genesymbol, symbol, hgnc, hgncsymbol, mgi, ensembl, ensemblgeneid, geneid, and id.
  • If no standard gene column name is found, the tool falls back to the column with the strongest non-numeric signal.
  • Values may contain multiple genes separated by commas, semicolons, pipes, tabs, or spaces; these are split automatically.
  • Empty inputs or inputs with no parsable genes raise SKILL_EMPTY_DATA; unsupported file extensions raise SKILL_INVALID_PARAMETER.
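
The parsing rules above can be sketched as follows. This is an illustrative Python approximation, not the skill's R implementation in scripts/io.R. Three points are assumptions: column names are compared after lowercasing and stripping non-alphanumeric characters, the preferred names are tried in the listed order, and "strongest non-numeric signal" is taken to mean the column with the most non-empty, non-numeric cells.

```python
import re

PREFERRED_COLUMNS = (
    "gene", "genes", "genename", "genesymbol", "symbol", "hgnc",
    "hgncsymbol", "mgi", "ensembl", "ensemblgeneid", "geneid", "id",
)


def normalize_name(name):
    """Lowercase a header and drop separators, e.g. 'Gene_Symbol' -> 'genesymbol'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def is_numeric(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


def pick_gene_column(header, rows):
    """Return the index of the likely gene column."""
    if not header:
        raise ValueError("SKILL_EMPTY_DATA: no columns found")
    normalized = [normalize_name(h) for h in header]
    for wanted in PREFERRED_COLUMNS:
        if wanted in normalized:
            return normalized.index(wanted)

    # Fallback: count non-empty cells that do not parse as numbers.
    def non_numeric_count(i):
        return sum(1 for row in rows
                   if i < len(row) and row[i].strip() and not is_numeric(row[i]))

    return max(range(len(header)), key=non_numeric_count)


def split_genes(cell):
    """Split one cell on commas, semicolons, pipes, tabs, or spaces."""
    return [token for token in re.split(r"[,;|\s]+", cell.strip()) if token]


def parse_gene_table(header, rows):
    """Collect gene tokens from the selected column, in input order."""
    col = pick_gene_column(header, rows)
    genes = []
    for row in rows:
        if col < len(row):
            genes.extend(split_genes(row[col]))
    if not genes:
        raise ValueError("SKILL_EMPTY_DATA: no parsable genes")
    return genes
```

For example, `parse_gene_table(["rank", "Gene_Symbol"], [["1", "TP53; EGFR"], ["2", "MYC"]])` selects the second column and splits the first cell into two genes.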

Minimal examples

TXT example

TP53
EGFR
BRCA1
MYC

CSV example

gene
TP53
EGFR
BRCA1
MYC

Output Files

| File | Format | Description |
| --- | --- | --- |
| data/ppi_result.rds | RDS | Serialized PPI bundle with mappings, interactions, nodes, summary, and metadata |
| table/ppi_network_edges.xlsx | XLSX | Edge table with from, to, and combined_score |
| table/ppi_network_nodes.xlsx | XLSX | Node table with gene, degree, betweenness, and closeness |
| table/ppi_summary.csv | CSV | Summary metrics for input genes, mapped genes, unmapped genes, nodes, edges, and threshold |
| plot/ppi_network_plot.pdf | PDF | Rendered PPI network plot from the local STRING interaction graph |
| session_info.txt | TXT | R version, platform, and package version information |
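
To make the edge and node tables concrete, the sketch below shows one plausible way the combined_score threshold and the degree column relate. It is a simplified Python illustration, not the skill's algorithm (see references/algorithm.md for that). Assumptions: the threshold is inclusive, self-interactions are dropped, and a pair listed in both directions counts as a single undirected edge. The function names are hypothetical.

```python
from collections import Counter


def filter_edges(links, mapped_ids, threshold):
    """Keep undirected edges between mapped proteins with score >= threshold.

    links: iterable of (protein1, protein2, combined_score) tuples, the
    three columns of a STRING protein.links table.
    """
    kept = {}
    for a, b, score in links:
        if a == b or a not in mapped_ids or b not in mapped_ids:
            continue
        if score < threshold:  # inclusive threshold is an assumption
            continue
        key = tuple(sorted((a, b)))  # treat A-B and B-A as the same edge
        kept[key] = max(score, kept.get(key, 0))
    return [(a, b, s) for (a, b), s in sorted(kept.items())]


def node_degrees(edges):
    """Count how many kept edges touch each node."""
    degree = Counter()
    for a, b, _score in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


if __name__ == "__main__":
    links = [("TP53", "EGFR", 900), ("EGFR", "TP53", 900), ("TP53", "MYC", 500)]
    edges = filter_edges(links, {"TP53", "EGFR", "MYC"}, 700)
    print(edges, node_degrees(edges))
```

Under these assumptions, raising the threshold can only remove edges, which is why a sparse or empty network is usually addressed by lowering it.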

Error Handling

| Error Code | Meaning | How to Fix |
| --- | --- | --- |
| SKILL_FILE_NOT_FOUND | Input gene list, STRING cache directory, required cache files, or data/ppi_result.rds in plot-only mode was not found | Confirm the path exists, required cache files are present, and run a full analysis before --plot_only TRUE |
| SKILL_EMPTY_DATA | No valid genes were parsed, no genes mapped to STRING, fewer than two mapped STRING IDs remained, no interactions passed filtering, or the interaction table was empty for plotting | Check that the input is not empty, verify gene symbols are supported by the local STRING cache, and lower the threshold if the network is too sparse |
| SKILL_INVALID_PARAMETER | A required argument is missing, a numeric value is out of range, an unsupported choice was supplied, the output path is invalid, or the input extension is unsupported | Recheck the parameter value and allowed choices, especially --species, --threshold, mapping options, plot options, and output paths |
| SKILL_MISSING_COLUMNS | Required columns were not found in a STRING cache table | Confirm the local aliases, info, and links files are valid STRING cache files with expected columns |
| SKILL_PACKAGE_NOT_FOUND | Required R packages are not installed | Install the missing packages listed in the error message before rerunning |

For detailed fixes and troubleshooting steps, read references/troubleshooting.md.
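
An agent wrapping the CLI may want to turn a failed run into a suggested next step. The Python sketch below looks for the documented error codes in captured output and returns a short hint for each. It assumes the code string appears verbatim somewhere in the error text; the exact message format is not specified here, and the names `ERROR_HINTS` and `find_error_hints` are hypothetical.

```python
# Short hints condensed from the error table above.
ERROR_HINTS = {
    "SKILL_FILE_NOT_FOUND": "Check input and cache paths; run a full analysis before --plot_only TRUE.",
    "SKILL_EMPTY_DATA": "Check the gene list and symbols; consider lowering --threshold.",
    "SKILL_INVALID_PARAMETER": "Recheck required arguments, ranges, allowed choices, and output paths.",
    "SKILL_MISSING_COLUMNS": "Verify the aliases, info, and links cache files are valid STRING files.",
    "SKILL_PACKAGE_NOT_FOUND": "Install the R packages named in the error message.",
}


def find_error_hints(output_text):
    """Return (code, hint) pairs for every documented code found in the text."""
    return [(code, hint) for code, hint in ERROR_HINTS.items()
            if code in output_text]


if __name__ == "__main__":
    # Illustrative message only; the real CLI output format may differ.
    sample = "Error: SKILL_EMPTY_DATA: no interactions passed filtering"
    for code, hint in find_error_hints(sample):
        print(f"{code}: {hint}")
```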

Testing

Smoke test with bundled data

Rscript scripts/main.R \
  --genelist_file tests/data/gene_list.csv \
  --species human \
  --threshold 700 \
  --output_dir tests/output/basic-run

Plot-only regeneration test

Rscript scripts/main.R \
  --plot_only TRUE \
  --output_dir tests/output/basic-run \
  --seed 42

Expected outputs after test

  • tests/output/basic-run/data/ppi_result.rds
  • tests/output/basic-run/table/ppi_network_edges.xlsx
  • tests/output/basic-run/table/ppi_network_nodes.xlsx
  • tests/output/basic-run/table/ppi_summary.csv
  • tests/output/basic-run/plot/ppi_network_plot.pdf
  • tests/output/basic-run/session_info.txt
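
Checking the expected outputs can be automated after the smoke test. The following Python sketch lists any documented output file that is missing from an output directory; the helper name `missing_outputs` is hypothetical, and it checks only that each file exists, not its contents.

```python
from pathlib import Path

# Relative paths taken from the Output Files table.
EXPECTED_OUTPUTS = (
    "data/ppi_result.rds",
    "table/ppi_network_edges.xlsx",
    "table/ppi_network_nodes.xlsx",
    "table/ppi_summary.csv",
    "plot/ppi_network_plot.pdf",
    "session_info.txt",
)


def missing_outputs(output_dir):
    """Return the expected relative paths that are not files under output_dir."""
    root = Path(output_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs("tests/output/basic-run")
    if missing:
        print("Missing outputs:", ", ".join(missing))
    else:
        print("All expected outputs are present.")
```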