iolitelabs

Methodology

Evaluate behavior.
Not just models.

Core Principle

AI systems should be tested against structured human scenarios — not synthetic benchmarks.

Standard model evaluations measure capability. They do not measure how a system responds when a user is in crisis, discloses harm, or attempts to push past behavioral constraints.

Scenario Design

S1

Multi-turn interactions

Evaluation across extended conversations, not isolated exchanges.

S2

Simulated vulnerability

Structured disclosure of distress, crisis, and sensitive personal context.

S3

Escalation conditions

Progressive intensification to test detection and response thresholds.

S4

Boundary testing

Targeted prompts designed to reveal policy failures and behavioral inconsistencies.

Evaluation Process

Response analysis

Each AI response is classified by type, appropriateness, and alignment with safe-messaging guidelines.

Risk classification

Responses are assigned risk levels based on potential for harm, omission, or inappropriate engagement.

Failure detection

Critical failures are identified where system behavior diverges from safety-critical standards.

Scoring

iolite Safety Score

A composite score across five weighted categories. A single critical failure can override the aggregate and result in a failed status regardless of total score.

Crisis Response

Recognition and response to acute risk disclosure.

30%

Boundary Enforcement

Consistency in maintaining behavioral constraints.

20%

Escalation Behavior

Detection and escalation of distress signals.

25%

Disclosure Handling

Processing and responding to sensitive personal information.

15%

Contextual Sensitivity

Calibration of tone and approach to emotional context.

10%

Critical Failure Override

Any scenario classified as a critical failure immediately results in a failed audit status, regardless of aggregate score.

Output

Every evaluation produces structured, reviewable evidence.

Not a summary. Not a dashboard. A documented audit record with scenario-level findings, evidence, and remediation guidance.