Data Analysis

svm-model-importance-analysis

Train support vector machine classifiers and assess feature importance via recursive feature elimination (RFE). Inputs: feature matrix, binary class labels. Outputs: trained SVM model, RFE importance ranking, optimal feature subset, ROC performance curve.
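The ranking step described above can be illustrated with a small, dependency-free sketch. It is not the audited skill's code (which is not shown in this report); it only demonstrates the general SVM-RFE idea for a two-class problem: fit a linear SVM, drop the feature with the smallest squared weight, and repeat. The function names, the toy data, and the hyperparameters (`lam`, `lr`, `epochs`) are all assumptions made for illustration.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge subgradient
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []  # filled from least important to most important
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # RFE criterion for a linear SVM: remove the feature with the smallest w_j^2.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]

def make_toy_data(n=60, seed=0):
    """Toy binary data (labels +1/-1): feature 0 carries the signal, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([2.0 * label + rng.gauss(0, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y

X, y = make_toy_data()
ranking = svm_rfe(X, y)
print("features, most important first:", ranking)
```

Because the toy data is generated from a fixed seed and training is fully deterministic, repeated runs give the same ranking. In practice a library implementation (for example scikit-learn's `RFE` wrapped around a linear `SVC`) would usually replace the hand-written optimizer.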

Total Score: 93 / 100
Core Capability: 94 / 100
  • Functional Suitability: 11 / 12
  • Reliability: 10 / 12
  • Performance & Context: 8 / 8
  • Agent Usability: 15 / 16
  • Human Usability: 8 / 8
  • Security: 12 / 12
  • Maintainability: 11 / 12
  • Agent-Specific: 20 / 20

Medical Task: 20 / 20 Passed
  • Bundled binary SVM-RFE run: 95 (4/4)
  • Tolerance-rule custom run: 94 (4/4)
  • Missing plot-only bundle: 91 (4/4)
  • Successful plot-only regeneration: 92 (4/4)
  • Corrupt plot-only bundle: 89 (4/4)

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
  • Operational Stability: PASS. System remains stable across varied inputs and edge cases.
  • Structural Consistency: PASS. Output structure conforms to expected skill contract format.
  • Result Determinism: PASS. Equivalent inputs produce semantically equivalent outputs.
  • System Security: PASS. No prompt injection, data leakage, or unsafe tool use detected.

Research Veto: ✅ PASS (Applicable)

Dimension | Result | Detail
Scientific Integrity | PASS | No fabricated scientific claims or invented performance numbers were observed in any output.
Practice Boundaries | PASS | The skill stayed within a computational feature-ranking workflow and made no diagnostic or prescriptive claims.
Methodological Ground | PASS | The documented two-class linear SVM-RFE approach matched the tested binary dataset and scope restrictions.
Code Usability | PASS | Canonical, custom, missing-bundle, plot-only reuse, and corrupt-bundle executions all behaved as documented during the audit.

Core Capability: 94 / 100 (8 Categories)

Functional Suitability: 11 / 12 (92%)
The skill covers the main binary SVM-RFE workflow, plot-only reuse, scope limits, and reproducibility expectations clearly.

Reliability: 10 / 12 (83%)
Input validation, path confinement, and recovery guidance are strong; the main remaining gap is limited automated coverage for negative parameter cases.

Performance & Context: 8 / 8 (100%)
Documentation is layered well and the execution path is linear with no redundant steps.

Agent Usability: 15 / 16 (94%)
Instructions are easy to follow and outputs are well specified; the only deduction is that one selection heuristic is implicit rather than user-controlled.

Human Usability: 8 / 8 (100%)
Trigger language is natural, runnable examples are present, and out-of-scope boundaries are clearly called out.

Security: 12 / 12 (100%)
Path confinement, no network activity, and no dynamic code execution were all confirmed.

Maintainability: 11 / 12 (92%)
Scripts are modular and tests exist, but the automated suite can still cover more documented failure branches.

Agent-Specific: 20 / 20 (100%)
Trigger precision, progressive disclosure, composability, idempotency, and escape hatches are all strong.

Core Capability Total: 94 / 100
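
The path confinement credited under Reliability and Security can be sketched as follows. This is a minimal illustration of the general technique (resolve the requested path, then check that it still lies under the allowed root), not the skill's actual implementation. The `SkillError` class, the `confine_output_dir` helper, and the `SKILL_PATH_OUTSIDE_ROOT` code are hypothetical names chosen for this example.

```python
import tempfile
from pathlib import Path

class SkillError(Exception):
    """Hypothetical error type carrying a standardized code."""
    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code

def confine_output_dir(skill_root, requested):
    """Resolve `requested` under `skill_root` and reject anything that escapes it."""
    root = Path(skill_root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        # Illustrative code only; the skill's real error code is not documented here.
        raise SkillError("SKILL_PATH_OUTSIDE_ROOT",
                         f"{requested!r} resolves outside {root}")
    return target

# Usage: one path inside the root, one traversal attempt.
demo_root = tempfile.mkdtemp()
inside = confine_output_dir(demo_root, "results/run1")
try:
    confine_output_dir(demo_root, "../elsewhere")
    rejected_code = None
except SkillError as err:
    rejected_code = err.code
print(inside, rejected_code)
```

Checking the resolved path rather than the raw string is the key design choice: a plain prefix test on the input would miss `..` segments and symlinks that point outside the root.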

Medical Task: Execution Average 92.2 / 100; Assertions 20/20 Passed

Canonical: Bundled binary SVM-RFE run (95 / 100, ✅ Pass)

Canonical run completed successfully and produced the documented bundle, tables, plots, and session info.

Basic 38/40 | Specialized 57/60 | Total 95/100
A1: Command completes successfully and logs the expected major stages
A2: The documented analysis artifacts are created in the requested output directory
A3: The selected feature table is consistent with the ranking output
A4: The workflow stays within the documented scope
Pass rate: 4 / 4

Variant A: Tolerance-rule custom run (94 / 100, ✅ Pass)

Custom tolerance-rule settings executed cleanly and produced a full output set in a separate directory.

Basic 38/40 | Specialized 56/60 | Total 94/100
A1: Custom feature-selection settings execute successfully
A2: The run still produces the expected result bundle, tables, and plots
A3: The skill remains reproducible under fixed seed settings
A4: The workflow remains scoped to two-class SVM-RFE ranking
Pass rate: 4 / 4

Edge: Missing plot-only bundle (91 / 100, ✅ Pass)

Missing plot-only artifact was rejected with a clear standardized error and an obvious recovery path.

Basic 37/40 | Specialized 54/60 | Total 91/100
A1: Plot-only mode rejects missing result bundles before plotting
A2: The error uses a standardized skill error code
A3: The error is actionable and recoverable
A4: The failure is safe and scoped
Pass rate: 4 / 4

Variant B: Successful plot-only regeneration (92 / 100, ✅ Pass)

Existing result bundles were reused cleanly and plots were regenerated without rerunning analysis.

Basic 37/40 | Specialized 55/60 | Total 92/100
A1: Existing result bundles can be reused without re-reading raw inputs
A2: Plot regeneration succeeds with the documented minimal arguments
A3: The command avoids unnecessary reruns of the full SVM-RFE workflow
A4: Behavior remains deterministic on the same stored bundle
Pass rate: 4 / 4

Stress: Corrupt plot-only bundle (89 / 100, ✅ Pass)

The corrupt RDS bundle was rejected with a standardized skill error and explicit recovery guidance.

Basic 36/40 | Specialized 53/60 | Total 89/100
A1: Corrupt svm_result.rds input is rejected instead of producing undefined plots
A2: Corrupt bundle errors use a standardized SKILL_* code
A3: Corrupt bundle errors provide actionable recovery guidance
A4: Failure handling remains non-destructive and within scope
Pass rate: 4 / 4

Medical Task Total: 92.2 / 100
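
The Edge and Stress scenarios above both exercise the same pattern: validate a stored result bundle before plotting, and fail with a standardized code plus recovery guidance. The sketch below is a Python analogue of that pattern under stated assumptions. It uses a JSON file in place of the skill's `svm_result.rds`, and the error codes (`SKILL_BUNDLE_MISSING`, `SKILL_BUNDLE_CORRUPT`), the required keys, and the helper names are hypothetical; the report only confirms that the real codes follow a `SKILL_*` scheme.

```python
import json
import tempfile
from pathlib import Path

class SkillError(Exception):
    """Hypothetical error type: standardized code plus recovery guidance."""
    def __init__(self, code, message, recovery):
        super().__init__(f"{code}: {message} Recovery: {recovery}")
        self.code = code
        self.recovery = recovery

REQUIRED_KEYS = {"ranking", "selected_features"}  # assumed bundle schema
RERUN_HINT = "Rerun the full analysis to regenerate the bundle, then retry plot-only mode."

def load_result_bundle(path):
    """Load a stored result bundle, rejecting missing or corrupt files before plotting."""
    path = Path(path)
    if not path.is_file():
        raise SkillError("SKILL_BUNDLE_MISSING", f"No result bundle at {path}.", RERUN_HINT)
    try:
        bundle = json.loads(path.read_text(encoding="utf-8"))
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise SkillError("SKILL_BUNDLE_CORRUPT", f"{path} could not be parsed.", RERUN_HINT) from exc
    if not isinstance(bundle, dict) or not REQUIRED_KEYS.issubset(bundle):
        raise SkillError("SKILL_BUNDLE_CORRUPT", f"{path} lacks required fields.", RERUN_HINT)
    return bundle

def error_code(path):
    """Return the error code raised for `path`, or None if the bundle loads."""
    try:
        load_result_bundle(path)
        return None
    except SkillError as err:
        return err.code

# Usage: one valid bundle, one corrupt file, one missing path.
workdir = Path(tempfile.mkdtemp())
good = workdir / "svm_result.json"
good.write_text(json.dumps({"ranking": [2, 0, 1], "selected_features": [2, 0]}), encoding="utf-8")
bad = workdir / "corrupt.json"
bad.write_bytes(b"\x00\x01 not a bundle")
print(error_code(good), error_code(bad), error_code(workdir / "missing.json"))
```

The loader only reads the file and never writes or deletes anything, which mirrors the non-destructive failure handling asserted in the Stress scenario.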

Key Strengths

  • The skill has a strong CLI contract with clear inputs, outputs, and scope boundaries for two-class SVM-RFE work.
  • Path confinement is implemented correctly: output traversal outside the skill root was rejected with a standardized error.
  • Determinism is strong: repeat canonical runs produced byte-identical ranking CSV outputs under a fixed seed.
  • The implementation is modular and backed by a runnable packaged test suite that passed during the audit.
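
The byte-identical determinism noted above can be checked by hashing the ranking output of two runs and comparing digests. The snippet below is one way to do that, assuming each run wrote its ranking CSV to a separate directory; the file names and the stand-in CSV contents are hypothetical and exist only to make the example runnable.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def outputs_identical(path_a, path_b):
    """True when both files have exactly the same bytes."""
    return sha256_of(path_a) == sha256_of(path_b)

# Usage with stand-in files; real runs would point at each run's ranking CSV.
tmp = Path(tempfile.mkdtemp())
run1 = tmp / "run1_ranking.csv"
run2 = tmp / "run2_ranking.csv"
other = tmp / "other_ranking.csv"
run1.write_text("feature,rank\nf1,1\nf2,2\n", encoding="utf-8")
run2.write_text("feature,rank\nf1,1\nf2,2\n", encoding="utf-8")
other.write_text("feature,rank\nf2,1\nf1,2\n", encoding="utf-8")

same_seed_match = outputs_identical(run1, run2)
different_match = outputs_identical(run1, other)
print(same_seed_match, different_match)
```

A byte-level comparison is stricter than the "semantically equivalent" requirement of the Result Determinism gate, so it will also flag harmless differences such as row order or float formatting.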