chemical-storage-sorter

Sort laboratory chemicals into safe storage groups by hazard classification (acids, bases, oxidizers, flammables, toxics). Identifies incompatible pairs, generates storage plans with warnings, and supports OSHA/NFPA compliance for lab safety.

Total Score: 85 / 100

Core Capability: 83 / 100
Functional Suitability: 11 / 12
Reliability: 10 / 12
Performance & Context: 7 / 8
Agent Usability: 14 / 16
Human Usability: 7 / 8
Security: 10 / 12
Maintainability: 11 / 12
Agent-Specific: 13 / 20

Medical Task: 12 / 12 Passed
Sort a lab inventory: HCl, NaOH, ethanol, H2O2, KCN, NaCl (score 86, assertions 4/4)
Check compatibility between concentrated nitric acid and acetone (score 86, assertions 4/4)
Sort a chemical with an unknown/novel name not in the database (score 86, assertions 4/4)

Veto Gates: required pass for any deployment consideration

Skill Veto: ✓ All 4 gates passed
Operational Stability: PASS. System remains stable across varied inputs and edge cases.
Structural Consistency: PASS. Output structure conforms to expected skill contract format.
Result Determinism: PASS. Equivalent inputs produce semantically equivalent outputs.
System Security: PASS. No prompt injection, data leakage, or unsafe tool use detected.

Core Capability: 83 / 100 (8 categories)

Functional Suitability: 11 / 12 (92%)
Multi-hazard chemicals are now automatically assigned to the most restrictive group per the updated Common Pitfalls; a CAS/SDS resolution path was added to Error Handling.

Reliability: 10 / 12 (83%)
Unknown chemicals now include an actionable resolution path (CAS lookup, SDS, safety officer); a fallback for script failure is present.

Performance & Context: 7 / 8 (88%)
SKILL.md is lean at 163 lines; references were replaced with real OSHA/NFPA URLs.

Agent Usability: 14 / 16 (88%)
Clear workflow and classification table; the compatibility matrix is explicit; error prevention now covers multi-hazard auto-assignment.

Human Usability: 7 / 8 (88%)
The description is discoverable; safety-focused language is appropriate for the domain.

Security: 10 / 12 (83%)
No hardcoded secrets; no injection vectors; pure classification skill.

Maintainability: 11 / 12 (92%)
Clean structure; multi-hazard detection logic is documented; adding new chemicals requires only a database update.

Agent-Specific: 13 / 20 (65%)
Trigger precision is good; escape hatches for SDS interpretation and synthesis are present; there is no composability documentation.
Core Capability Total: 83 / 100
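
The Core Capability total is the sum of the eight category scores, and each percentage is the category score over its maximum, rounded to the nearest whole number. The following is a small sketch of that arithmetic using the figures from this report; the variable names are illustrative, and the half-up rounding rule is an assumption that happens to reproduce the percentages shown.

```python
import math

# (earned, maximum) per category, copied from the scorecard in this report.
scores = {
    "Functional Suitability": (11, 12),
    "Reliability": (10, 12),
    "Performance & Context": (7, 8),
    "Agent Usability": (14, 16),
    "Human Usability": (7, 8),
    "Security": (10, 12),
    "Maintainability": (11, 12),
    "Agent-Specific": (13, 20),
}

total = sum(earned for earned, _ in scores.values())
maximum = sum(cap for _, cap in scores.values())

# Assumed rounding rule: round half up to a whole percent.
percent = {
    name: math.floor(100 * earned / cap + 0.5)
    for name, (earned, cap) in scores.items()
}

print(f"Core Capability Total: {total} / {maximum}")
for name, pct in percent.items():
    print(f"{name}: {pct}%")
```

Agent-Specific is the only category below 80%, which matches the note about missing composability documentation.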

Medical Task: Execution Average 86.3 / 100; Assertions 12/12 Passed

Canonical: ✅ Pass (86 / 100)
Sort a lab inventory: HCl, NaOH, ethanol, H2O2, KCN, NaCl

Output completed successfully; the lab-inventory sort (HCl, NaOH, ethanol, H2O2, KCN, NaCl) was handled within expected scope.

Basic 35/40 | Specialized 51/60 | Total 86/100
A1: Output assigns each chemical to the correct hazard group
A2: Output identifies HCl-NaOH and H2O2-ethanol as incompatible pairs
A3: Output generates a storage plan with group assignments
A4: Output includes reminder to verify with SDS
Pass rate: 4 / 4
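
To make the canonical case concrete, the behaviour checked by A1 to A3 can be sketched roughly as below. This is an illustration only, not the skill's actual script: `HAZARD_GROUP`, `INCOMPATIBLE_GROUPS`, `sort_inventory`, and `incompatible_pairs` are hypothetical names, placing NaCl in 'general' is an assumption, and the rule set contains only the two group-level incompatibilities this report confirms (acid with base, oxidizer with flammable).

```python
from itertools import combinations

# Hypothetical lookup table; the skill's real database is not shown in this report.
HAZARD_GROUP = {
    "HCl": "acid",
    "NaOH": "base",
    "ethanol": "flammable",
    "H2O2": "oxidizer",
    "KCN": "toxic",
    "NaCl": "general",  # assumption: a non-hazardous salt falls into 'general'
}

# Only the two group-level rules confirmed by assertion A2.
INCOMPATIBLE_GROUPS = {
    frozenset({"acid", "base"}),
    frozenset({"oxidizer", "flammable"}),
}


def sort_inventory(names):
    """Return {group: [chemicals]} for the given chemical names."""
    plan = {}
    for name in names:
        plan.setdefault(HAZARD_GROUP[name], []).append(name)
    return plan


def incompatible_pairs(names):
    """Return pairs of chemicals whose hazard groups must not share storage."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if frozenset({HAZARD_GROUP[a], HAZARD_GROUP[b]}) in INCOMPATIBLE_GROUPS
    ]


inventory = ["HCl", "NaOH", "ethanol", "H2O2", "KCN", "NaCl"]
plan = sort_inventory(inventory)
pairs = incompatible_pairs(inventory)

for group, chemicals in plan.items():
    print(f"{group}: {', '.join(chemicals)}")
for a, b in pairs:
    print(f"WARNING: store {a} away from {b}")
print("Reminder: verify every assignment against the SDS.")
```

A production compatibility matrix would cover many more combinations than these two rules, so the sketch should not be used for real storage decisions.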
Variant A: ✅ Pass (86 / 100)
Check compatibility between concentrated nitric acid and acetone

HNO3 is now automatically assigned to the most restrictive group (oxidizer) per the updated logic.

Basic 35/40 | Specialized 51/60 | Total 86/100
A1: Output identifies HNO3 as both acid and oxidizer (multi-hazard)
A2: Output flags HNO3 + acetone as incompatible (oxidizer + flammable)
A3: Output assigns HNO3 to the most restrictive storage group automatically
A4: Output does not fabricate compatibility data
Pass rate: 4 / 4
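
One way to picture the multi-hazard rule tested here is a precedence list from which the highest-ranked hazard is chosen. The sketch below assumes such a list; the report confirms only that 'oxidizer' outranks 'acid' for concentrated HNO3, so the rest of the ordering, along with the names `RESTRICTIVENESS`, `HAZARDS`, `storage_group`, and `is_incompatible`, is hypothetical.

```python
# Assumed precedence, most restrictive first. Only "oxidizer before acid"
# is confirmed by this report; the remaining order is a placeholder.
RESTRICTIVENESS = ["oxidizer", "flammable", "toxic", "acid", "base", "general"]

# Hypothetical entries listing every hazard class a chemical carries.
HAZARDS = {
    "HNO3 (conc.)": {"acid", "oxidizer"},
    "acetone": {"flammable"},
}


def storage_group(name):
    """Pick the most restrictive hazard class as the storage group."""
    return min(HAZARDS[name], key=RESTRICTIVENESS.index)


def is_incompatible(a, b):
    """Apply the oxidizer + flammable rule named in assertion A2."""
    return {storage_group(a), storage_group(b)} == {"oxidizer", "flammable"}


print(storage_group("HNO3 (conc.)"))
print(is_incompatible("HNO3 (conc.)", "acetone"))
```

Keeping the full hazard set per chemical, and deriving the storage group from it, preserves the multi-hazard information that A1 expects in the output.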
Edge: ✅ Pass (86 / 100)
Sort a chemical with an unknown/novel name not in the database

Output for the unknown chemical now includes an actionable resolution path.

Basic 35/40 | Specialized 51/60 | Total 86/100
A1: Output assigns unknown chemical to 'general' group
A2: Output flags the unknown chemical for manual review
A3: Output does not fabricate a hazard classification
A4: Output suggests using CAS number or SDS for authoritative classification
Pass rate: 4 / 4
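
The fallback behaviour for this edge case might look something like the sketch below: an unrecognised name is placed in 'general', flagged for manual review, and returned with the resolution steps the report lists, without guessing a hazard class. The `Classification` type, the `KNOWN` table, and `classify` are hypothetical names, and the example compound name is made up.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the skill's chemical database.
KNOWN = {"HCl": "acid", "NaOH": "base"}


@dataclass
class Classification:
    name: str
    group: str
    needs_review: bool = False
    next_steps: list = field(default_factory=list)


def classify(name):
    """Classify a chemical; never guess a hazard class for unknown names."""
    group = KNOWN.get(name)
    if group is not None:
        return Classification(name, group)
    return Classification(
        name,
        group="general",
        needs_review=True,
        next_steps=[
            "Look up the CAS number",
            "Read the GHS classification in the SDS",
            "Contact the lab safety officer",
        ],
    )


result = classify("Compound X-17")  # made-up name for illustration
print(result.group, result.needs_review)
for step in result.next_steps:
    print("-", step)
```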
Medical Task Total: 86.3 / 100

Key Strengths

  • Multi-hazard chemical handling now automatic: concentrated HNO3 and similar chemicals assigned to most restrictive group without manual override
  • Compatibility matrix with reaction risk descriptions provides actionable safety information
  • Storage requirements table with cabinet types and special requirements is publication-quality
  • Unknown chemical resolution path now actionable: CAS lookup, SDS GHS classification, safety officer contact
  • OSHA/NFPA compliance framing with real reference URLs adds regulatory credibility