Data Analysis

scvi-tools

86 / 100 Total Score

Core Capability

84 / 100

Functional Suitability

11 / 12

Reliability

9 / 12

Performance & Context

7 / 8

Agent Usability

14 / 16

Human Usability

8 / 8

Security

10 / 12

Maintainability

9 / 12

Agent-Specific

16 / 20

Medical Task

20 / 20 Passed

Canonical: 91 / 100

Deep generative models for single-cell omics; use when you need probabilistic batch correction (scVI), transfer learning, uncertainty-aware differential expression, or multimodal integration (totalVI/MultiVI)

4/4

Variant A: 87 / 100

Deep generative models for single-cell omics; use when you need probabilistic batch correction (scVI), transfer learning, uncertainty-aware differential expression, or multimodal integration (totalVI/MultiVI)

4/4

Edge: 85 / 100

Unified model API: setup_anndata(...) → Model(adata) → train() → get_*() across model families

4/4

Variant B: 85 / 100

Probabilistic latent representations for integration, denoising, and downstream clustering/visualization

4/4

Stress: 85 / 100

End-to-end case for Unified model API: setup_anndata(...) → Model(adata) → train() → get_*() across model families

4/4

Veto Gates: Required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed

Gate	Result	Detail
Operational Stability	PASS	System remains stable across varied inputs and edge cases
Structural Consistency	PASS	Output structure conforms to expected skill contract format
Result Determinism	PASS	Equivalent inputs produce semantically equivalent outputs
System Security	PASS	No prompt injection, data leakage, or unsafe tool use detected

Research Veto: ✅ PASS (Applicable)

Dimension	Result	Detail
Scientific Integrity	PASS	No scientific-integrity problem was surfaced because the package did not claim more than the available records, article text, or script evidence supported.
Practice Boundaries	PASS	The evaluated outputs stayed within the skill's stated scope (deep generative models for single-cell omics) and did not drift into unsupported interpretation beyond the available inputs.
Methodological Ground	PASS	The legacy review kept the package aligned with its named analysis library, data structure, or processing workflow.
Code Usability	PASS	The archived review preserved a usable code path with named scripts, expected inputs, and a recognizable output contract.

Core Capability: 84 / 100 (8 Categories)

Functional Suitability

Functional suitability lost one point to the legacy issue 'Improve stress-case output rigor': stress and boundary scenarios show weaker consistency.

11 / 12

92%

Reliability

Related legacy finding for scvi-tools: 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.

9 / 12

75%

Performance & Context

The package performed well overall, with only a small remaining performance-context deduction.

7 / 8

88%

Agent Usability

The archived review left some headroom in how quickly an agent can lock onto the intended analysis path.

14 / 16

88%

Human Usability

No point loss was recorded for human usability in the legacy audit.

8 / 8

100%

Security

The packaged workflow stayed safe overall, with only a small remaining deduction around boundary signaling.

10 / 12

83%

Maintainability

The analysis package is maintainable overall, though the archived score suggests modest cleanup headroom.

9 / 12

75%

Agent-Specific

Related legacy finding for scvi-tools: 'Improve stress-case output rigor'. Stress and boundary scenarios show weaker consistency.

16 / 20

80%

Core Capability Total: 84 / 100
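The category scores above can be cross-checked with a few lines of arithmetic. The sketch below sums the eight earned scores and recomputes each percentage; the round-half-up rule is an assumption, chosen because it reproduces the 88% shown for 7 / 8 and 14 / 16, and the names `categories` and `percent` are illustrative only.

```python
# Category scores as (earned, maximum), copied from the scorecard above.
categories = {
    "Functional Suitability": (11, 12),
    "Reliability": (9, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (8, 8),
    "Security": (10, 12),
    "Maintainability": (9, 12),
    "Agent-Specific": (16, 20),
}

earned = sum(score for score, _ in categories.values())
maximum = sum(cap for _, cap in categories.values())


def percent(score, cap):
    """Whole-number percentage, rounding halves up (assumed rule)."""
    # Integer arithmetic avoids floating-point surprises at exactly x.5.
    return (200 * score + cap) // (2 * cap)


percentages = {name: percent(s, c) for name, (s, c) in categories.items()}

print(f"Core Capability Total: {earned} / {maximum}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Under that rounding assumption, the recomputed total and percentages agree with the figures reported in this section.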

Medical Task: Execution Average 86.6 / 100; Assertions 20/20 Passed

Canonical

Deep generative models for single-cell omics; use when you need probabilistic batch correction (scVI), transfer learning, uncertainty-aware differential expression, or multimodal integration (totalVI/MultiVI)

4/4 ✓

Variant A

4/4 ✓

Edge

Unified model API: setup_anndata(...) → Model(adata) → train() → get_*() across model families

4/4 ✓

Variant B

Probabilistic latent representations for integration, denoising, and downstream clustering/visualization

4/4 ✓

Stress

End-to-end case for Unified model API: setup_anndata(...) → Model(adata) → train() → get_*() across model families

4/4 ✓

Canonical: ✅ Pass

Deep generative models for single-cell omics; use when you need... remained tied to the documented analysis contract even when the preserved evidence centered on instructions instead of a full rerun.

Basic 36/40 | Specialized 55/60 | Total 91/100

✅ A1: The scvi-tools output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Variant A: ✅ Pass

Deep generative models for single-cell omics; use when you need... remained tied to the documented analysis contract even when the preserved evidence centered on instructions instead of a full rerun.

Basic 34/40 | Specialized 53/60 | Total 87/100

✅ A1: The scvi-tools output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Edge: ✅ Pass

Unified model API: setup_anndata(...) → Model(adata) → train() → get_*() across model families

Unified model API: setup_anndata(...) → Model(adata) → train() →... remained tied to the documented analysis contract even when the preserved evidence centered on instructions instead of a full rerun.

Basic 33/40 | Specialized 52/60 | Total 85/100

✅ A1: The scvi-tools output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4
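The setup_anndata(...) → Model(adata) → train() → get_*() pattern named in this case can be sketched with the scVI model as one instance. This is a minimal illustration, not the evaluated skill's own script: the helper name `embed_with_scvi`, the default `batch_key="batch"`, and the small `max_epochs` value are assumptions made for the example, and it presumes scvi-tools is installed and `adata` holds raw counts.

```python
def embed_with_scvi(adata, batch_key="batch", max_epochs=20):
    """Hypothetical helper: train scVI and return a latent embedding."""
    # Imported lazily so this module still loads where scvi-tools is absent.
    import scvi

    # 1. setup_anndata: register the AnnData object and its batch column.
    scvi.model.SCVI.setup_anndata(adata, batch_key=batch_key)

    # 2. Model(adata): build the model on the registered data.
    model = scvi.model.SCVI(adata)

    # 3. train(): fit the model; max_epochs is kept small for illustration.
    model.train(max_epochs=max_epochs)

    # 4. get_*(): query results; here, one latent vector per cell.
    return model.get_latent_representation()
```

Other model families such as totalVI or MultiVI follow the same four steps with their own class and setup arguments; those arguments differ per model and should be taken from the library documentation rather than from this sketch.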

Variant B: ✅ Pass

Probabilistic latent representations for integration, denoising, and downstream clustering/visualization

The archived run treated Probabilistic latent representations for integration, denoising,... as a bounded analysis workflow rather than a purely narrative instruction path.

Basic 32/40 | Specialized 53/60 | Total 85/100

✅ A1: The scvi-tools output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Stress: ✅ Pass

End-to-end case for Unified model API: setup_anndata(...) → Model(adata) → train() → get_*() across model families

The archived run treated End-to-end case for Unified model API: setup_anndata(...) →... as a bounded analysis workflow rather than a purely narrative instruction path.

Basic 29/40 | Specialized 56/60 | Total 85/100

✅ A1: The scvi-tools output structure matches the documented deliverable

✅ A2: The instruction path remains actionable for the documented case

✅ A3: The output stays fully within the documented skill boundary

✅ A4: The response quality is acceptable for the documented path

Pass rate: 4 / 4

Medical Task Total: 86.6 / 100
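The 86.6 figure is consistent with a plain arithmetic mean of the five scenario totals, each of which is the sum of its Basic and Specialized parts. The sketch below assumes an unweighted mean (the report does not state the weighting) and uses illustrative variable names.

```python
# (Basic, Specialized) points per scenario, copied from the results above.
scenarios = {
    "Canonical": (36, 55),
    "Variant A": (34, 53),
    "Edge": (33, 52),
    "Variant B": (32, 53),
    "Stress": (29, 56),
}

# Each scenario total is Basic (out of 40) plus Specialized (out of 60).
totals = {name: basic + spec for name, (basic, spec) in scenarios.items()}

# Unweighted mean across scenarios (assumed; weighting is not documented).
average = sum(totals.values()) / len(totals)

for name, total in totals.items():
    print(f"{name}: {total}/100")
print(f"Execution average: {average:.1f} / 100")
```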

Key Strengths

Primary routing is Data Analysis with execution mode A
Static quality score is 84/100 and dynamic average is 86.6/100
Assertions and command execution outcomes are recorded per input for human review
Execution verification summary: No script verification was applicable