Data Analysis

svm-model-importance-analysis

Train support vector machine classifiers and assess feature importance via recursive feature elimination (RFE). Inputs: feature matrix, binary class labels. Outputs: trained SVM model, RFE importance ranking, optimal feature subset, ROC performance curve.
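
The ranking step described above can be sketched in a few lines. This is a minimal, from-scratch illustration of linear SVM-RFE, not the audited skill's implementation, which this report does not show. The function names, the toy data, and the training settings (`lam`, `lr`, `epochs`) are assumptions made for the example.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights by full-batch subgradient descent on the L2-regularised
    hinge loss. Labels must be -1 or +1; no bias term (illustrative only)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]            # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:                     # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(sub, y)
        # Drop the feature with the smallest squared weight (ties: first one).
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]  # the last feature eliminated ranks first


# Made-up toy data: feature 0 tracks the label strongly, feature 1 weakly,
# feature 2 not at all.
X = [
    [2.0, 0.5, 1.0], [1.5, -0.5, -1.0], [2.5, 0.5, 1.0], [2.0, 0.5, -1.0],
    [-2.0, -0.5, 1.0], [-1.5, 0.5, -1.0], [-2.5, -0.5, 1.0], [-2.0, -0.5, -1.0],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

ranking = svm_rfe(X, y)
print("Features, most important first:", ranking)
```

Removing one feature per round and refitting is the classic RFE loop. A production workflow would normally add cross-validation to choose the subset size and compute the ROC curve on held-out data.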

Total Score: 93 / 100

Core Capability: 94 / 100

Category	Score
Functional Suitability	11 / 12
Reliability	10 / 12
Performance & Context	8 / 8
Agent Usability	15 / 16
Human Usability	8 / 8
Security	12 / 12
Maintainability	11 / 12
Agent-Specific	20 / 20

Medical Task: 20 / 20 Passed

Scenario	Score	Assertions Passed
Bundled binary SVM-RFE run	95	4/4
Tolerance-rule custom run	94	4/4
Missing plot-only bundle	91	4/4
Successful plot-only regeneration	92	4/4
Corrupt plot-only bundle	89	4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed

Gate	Result	Detail
Operational Stability	PASS	System remains stable across varied inputs and edge cases
Structural Consistency	PASS	Output structure conforms to expected skill contract format
Result Determinism	PASS	Equivalent inputs produce semantically equivalent outputs
System Security	PASS	No prompt injection, data leakage, or unsafe tool use detected

Research Veto: ✅ PASS (Applicable)

Dimension	Result	Detail
Scientific Integrity	PASS	No fabricated scientific claims or invented performance numbers were observed in any output.
Practice Boundaries	PASS	The skill stayed within a computational feature-ranking workflow and made no diagnostic or prescriptive claims.
Methodological Ground	PASS	The documented two-class linear SVM-RFE approach matched the tested binary dataset and scope restrictions.
Code Usability	PASS	Canonical, custom, missing-bundle, plot-only reuse, and corrupt-bundle executions all behaved as documented during the audit.

Core Capability: 94 / 100 (8 Categories)

Category	Score	Percent	Assessment
Functional Suitability	11 / 12	92%	The skill covers the main binary SVM-RFE workflow, plot-only reuse, scope limits, and reproducibility expectations clearly.
Reliability	10 / 12	83%	Input validation, path confinement, and recovery guidance are strong; the main remaining gap is limited automated coverage for negative parameter cases.
Performance & Context	8 / 8	100%	Documentation is layered well and the execution path is linear with no redundant steps.
Agent Usability	15 / 16	94%	Instructions are easy to follow and outputs are well specified; the only deduction is that one selection heuristic is implicit rather than user-controlled.
Human Usability	8 / 8	100%	Trigger language is natural, runnable examples are present, and out-of-scope boundaries are clearly called out.
Security	12 / 12	100%	Path confinement, no network activity, and no dynamic code execution were all confirmed.
Maintainability	11 / 12	92%	Scripts are modular and tests exist, but the automated suite can still cover more documented failure branches.
Agent-Specific	20 / 20	100%	Trigger precision, progressive disclosure, composability, idempotency, and escape hatches are all strong.

Core Capability Total: 94 / 100

Medical Task: Execution Average 92.2 / 100; Assertions: 20/20 Passed

Type	Scenario	Assertions
Canonical	Bundled binary SVM-RFE run	4/4 ✓
Variant A	Tolerance-rule custom run	4/4 ✓
Edge	Missing plot-only bundle	4/4 ✓
Variant B	Successful plot-only regeneration	4/4 ✓
Stress	Corrupt plot-only bundle	4/4 ✓

Canonical: ✅ Pass

Bundled binary SVM-RFE run

Canonical run completed successfully and produced the documented bundle, tables, plots, and session info.

Basic 38/40 | Specialized 57/60 | Total 95/100

✅ A1: Command completes successfully and logs the expected major stages

✅ A2: The documented analysis artifacts are created in the requested output directory

✅ A3: The selected feature table is consistent with the ranking output

✅ A4: The workflow stays within the documented scope

Pass rate: 4 / 4

Variant A: ✅ Pass

Tolerance-rule custom run

Custom tolerance-rule settings executed cleanly and produced a full output set in a separate directory.

Basic 38/40 | Specialized 56/60 | Total 94/100

✅ A1: Custom feature-selection settings execute successfully

✅ A2: The run still produces the expected result bundle, tables, and plots

✅ A3: The skill remains reproducible under fixed seed settings

✅ A4: The workflow remains scoped to two-class SVM-RFE ranking

Pass rate: 4 / 4
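
The report does not define the tolerance rule exercised in this run. One common reading, sketched here purely as an assumption, is to pick the smallest feature subset whose cross-validated score is within a fixed tolerance of the best score. The function name, the `tolerance` parameter, and the scores are hypothetical.

```python
def select_subset_size(scores, tolerance=0.01):
    """Pick the smallest subset size scoring within `tolerance` of the best.

    `scores` maps subset size -> cross-validated score (higher is better).
    This is an assumed rule for illustration, not the skill's documented one.
    """
    best = max(scores.values())
    eligible = [size for size, score in scores.items() if score >= best - tolerance]
    return min(eligible)


# Hypothetical cross-validated scores per subset size.
cv_scores = {1: 0.81, 2: 0.90, 3: 0.915, 5: 0.92, 8: 0.918}

with_tolerance = select_subset_size(cv_scores, tolerance=0.01)
strict_best = select_subset_size(cv_scores, tolerance=0.0)
print(with_tolerance, strict_best)
```

With a tolerance of zero the rule reduces to picking the best-scoring size. A positive tolerance trades a small amount of score for a more compact feature set.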

Edge: ✅ Pass

Missing plot-only bundle

Missing plot-only artifact was rejected with a clear standardized error and an obvious recovery path.

Basic 37/40 | Specialized 54/60 | Total 91/100

✅ A1: Plot-only mode rejects missing result bundles before plotting

✅ A2: The error uses a standardized skill error code

✅ A3: The error is actionable and recoverable

✅ A4: The failure is safe and scoped

Pass rate: 4 / 4

Variant B: ✅ Pass

Successful plot-only regeneration

Existing result bundles were reused cleanly and plots were regenerated without rerunning analysis.

Basic 37/40 | Specialized 55/60 | Total 92/100

✅ A1: Existing result bundles can be reused without re-reading raw inputs

✅ A2: Plot regeneration succeeds with the documented minimal arguments

✅ A3: The command avoids unnecessary reruns of the full SVM-RFE workflow

✅ A4: Behavior remains deterministic on the same stored bundle

Pass rate: 4 / 4
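
A determinism claim of this kind can be checked by hashing the outputs of two runs and comparing digests. The sketch below assumes nothing about the skill itself: the file names and contents are placeholders, and two temporary files stand in for the outputs of repeated runs.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outputs_identical(path_a, path_b):
    """True when both files are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)


# Stand-ins for the outputs of two runs (placeholder names and contents).
with tempfile.TemporaryDirectory() as tmp:
    run1 = Path(tmp) / "run1_ranking.csv"
    run2 = Path(tmp) / "run2_ranking.csv"
    run3 = Path(tmp) / "run3_ranking.csv"
    run1.write_text("feature,rank\ngene_a,1\ngene_b,2\n", encoding="utf-8")
    run2.write_text("feature,rank\ngene_a,1\ngene_b,2\n", encoding="utf-8")
    run3.write_text("feature,rank\ngene_b,1\ngene_a,2\n", encoding="utf-8")

    same = outputs_identical(run1, run2)
    different = outputs_identical(run1, run3)

print(same, different)
```

Byte-level comparison is the strictest check. Plots and session info often embed timestamps, so a semantic comparison may be more appropriate for those artifacts.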

Stress: ✅ Pass

Corrupt plot-only bundle

The corrupt RDS bundle was rejected with a standardized skill error and explicit recovery guidance.

Basic 36/40 | Specialized 53/60 | Total 89/100

✅ A1: Corrupt svm_result.rds input is rejected instead of producing undefined plots

✅ A2: Corrupt bundle errors use a standardized SKILL_* code

✅ A3: Corrupt bundle errors provide actionable recovery guidance

✅ A4: Failure handling remains non-destructive and within scope

Pass rate: 4 / 4
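
The missing-bundle and corrupt-bundle behaviors follow a validate-before-plotting pattern, sketched below in Python. This is an illustration under stated assumptions, not the skill's code: a JSON file stands in for the RDS bundle, and the error codes `SKILL_BUNDLE_MISSING` and `SKILL_BUNDLE_CORRUPT`, the `SkillError` class, and the required keys are all hypothetical. The report confirms only that the real codes follow a SKILL_* pattern.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = {"ranking", "selected_features"}  # hypothetical bundle fields


class SkillError(Exception):
    """Error carrying a standardized code and a recovery hint (hypothetical)."""

    def __init__(self, code, message, recovery):
        super().__init__(f"{code}: {message} Recovery: {recovery}")
        self.code = code
        self.recovery = recovery


def load_bundle(path):
    """Load and validate a result bundle before any plotting is attempted."""
    path = Path(path)
    if not path.is_file():
        raise SkillError("SKILL_BUNDLE_MISSING", f"No result bundle at {path}.",
                         "Run the full analysis first, then retry plot-only mode.")
    try:
        bundle = json.loads(path.read_text(encoding="utf-8"))
    except ValueError as exc:  # covers JSON and Unicode decoding failures
        raise SkillError("SKILL_BUNDLE_CORRUPT", f"Cannot parse {path}.",
                         "Regenerate the bundle by rerunning the analysis.") from exc
    if not isinstance(bundle, dict) or not REQUIRED_KEYS <= bundle.keys():
        raise SkillError("SKILL_BUNDLE_CORRUPT", f"{path} lacks required fields.",
                         "Regenerate the bundle by rerunning the analysis.")
    return bundle


def try_load(path):
    """Return 'OK' on success, otherwise the standardized error code."""
    try:
        load_bundle(path)
        return "OK"
    except SkillError as err:
        return err.code


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text(json.dumps({"ranking": [0, 2, 1], "selected_features": [0]}),
                    encoding="utf-8")
    corrupt = Path(tmp) / "corrupt.json"
    corrupt.write_bytes(b"\x00\x01 not a bundle")
    results = [try_load(good), try_load(corrupt), try_load(Path(tmp) / "absent.json")]

print(results)
```

Failing before any plot is drawn keeps the failure non-destructive: nothing is written, and the caller gets a code to branch on plus a concrete recovery step.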

Medical Task Total: 92.2 / 100
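
The 92.2 figure is consistent with the arithmetic mean of the five scenario totals, and each total is consistent with the sum of its Basic and Specialized parts. The short check below reproduces that arithmetic from the scores listed in this report. Treating the average as a plain mean is an inference, since the report does not state the formula.

```python
# (basic, specialized, total) per scenario, copied from the scenario results.
scenarios = {
    "Canonical": (38, 57, 95),
    "Variant A": (38, 56, 94),
    "Edge": (37, 54, 91),
    "Variant B": (37, 55, 92),
    "Stress": (36, 53, 89),
}

# Each scenario total should equal Basic + Specialized.
parts_consistent = all(basic + spec == total
                       for basic, spec, total in scenarios.values())

# Assumed formula: unweighted mean of the scenario totals.
execution_average = sum(total for _, _, total in scenarios.values()) / len(scenarios)

print(parts_consistent, round(execution_average, 1))
```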

Key Strengths

The skill has a strong CLI contract with clear inputs, outputs, and scope boundaries for two-class SVM-RFE work.
Path confinement is implemented correctly: output traversal outside the skill root was rejected with a standardized error.
Determinism is strong: repeat canonical runs produced byte-identical ranking CSV outputs under a fixed seed.
The implementation is modular and backed by a runnable packaged test suite that passed during the audit.
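
Path confinement of the kind noted above is usually implemented by resolving the requested path and checking that it still lies under the allowed root. The sketch below shows that pattern with Python's `pathlib`. The function name and the `SKILL_PATH_OUTSIDE_ROOT` code are hypothetical, and the skill's actual check is not shown in this report.

```python
import tempfile
from pathlib import Path


def confine(root, requested):
    """Resolve `requested` under `root` and reject anything that escapes it."""
    root = Path(root).resolve()
    target = (root / requested).resolve()   # collapses ".." and follows symlinks
    try:
        target.relative_to(root)            # raises ValueError if outside root
    except ValueError:
        # Hypothetical standardized code, for illustration only.
        raise ValueError(f"SKILL_PATH_OUTSIDE_ROOT: {requested!r} escapes {root}") from None
    return target


with tempfile.TemporaryDirectory() as skill_root:
    inside = confine(skill_root, "outputs/run1")
    inside_ok = inside.relative_to(Path(skill_root).resolve()) == Path("outputs/run1")
    try:
        confine(skill_root, "../escape")
        traversal_rejected = False
    except ValueError:
        traversal_rejected = True

print(inside_ok, traversal_rejected)
```

Resolving before comparing is the important step. A plain string-prefix check would accept `outputs/../../escape`, and it would also accept sibling directories that merely share a prefix with the root.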