← Dashboard
Create
Build an eval suite
Define the task, two variants, and a few test cases with deterministic checks, then run it in mock mode.
Variants
claude-opus-4-8
claude-opus-4-8
Test cases
Checks:
Define the task, two variants, and a few test cases with deterministic checks, then run it in mock mode.