Eval runs.
Each run replays a golden set through the chat orchestrator with a synthetic visitor session, asserts per-kind expectations on the trace, and persists pass / fail / skipped counts.
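The flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `GoldenCase`, `Trace`, `fake_orchestrator`, `CHECKS`, `run_eval`, and the two example kinds (`reply_contains`, `tool_called`) are all hypothetical names, and the rule that a case with no registered check is counted as skipped is an assumption.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GoldenCase:
    kind: str      # hypothetical kinds: "reply_contains", "tool_called"
    message: str   # visitor message replayed through the orchestrator
    expected: str  # value the per-kind check compares against


@dataclass
class Trace:
    reply: str
    tools_called: List[str] = field(default_factory=list)


def fake_orchestrator(session_id: str, message: str) -> Trace:
    """Stand-in for the chat orchestrator (assumption: returns a trace)."""
    if "order" in message.lower():
        return Trace(reply="Let me look up your order.",
                     tools_called=["lookup_order"])
    return Trace(reply="Hello! How can I help?")


# One assertion function per case kind (hypothetical registry).
CHECKS: Dict[str, Callable[[Trace, str], bool]] = {
    "reply_contains": lambda trace, exp: exp.lower() in trace.reply.lower(),
    "tool_called": lambda trace, exp: exp in trace.tools_called,
}


def run_eval(cases: List[GoldenCase],
             orchestrator: Callable[[str, str], Trace],
             store: List[Dict[str, int]]) -> Dict[str, int]:
    """Replay each case, apply its per-kind check, and persist the counts."""
    session_id = f"synthetic-{uuid.uuid4()}"  # synthetic visitor session
    counts = {"pass": 0, "fail": 0, "skipped": 0}
    for case in cases:
        check = CHECKS.get(case.kind)
        if check is None:
            # Assumption: cases of an unknown kind are skipped, not failed.
            counts["skipped"] += 1
            continue
        trace = orchestrator(session_id, case.message)
        counts["pass" if check(trace, case.expected) else "fail"] += 1
    store.append(counts)  # a plain list stands in for real persistence
    return counts


golden_set = [
    GoldenCase("reply_contains", "hi", "hello"),
    GoldenCase("tool_called", "Where is my order?", "lookup_order"),
    GoldenCase("tool_called", "hi", "lookup_order"),
    GoldenCase("unknown_kind", "hi", ""),
]
runs: List[Dict[str, int]] = []
result = run_eval(golden_set, fake_orchestrator, runs)
```

Keeping the checks in a registry keyed by kind means a new kind of expectation can be added without touching the replay loop. In this sketch, the stored counts are what a list view of runs would read back.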
Filter by golden set
No eval runs match these filters.