Test runs
A test run is one execution of a test path against the target agent. Each click of Run produces a new run; runs are immutable once finished and grouped into reports.
What happens when you run
sequenceDiagram
    actor U as You
    participant P as AssureAgent
    participant T as Target agent
    U->>P: Click Run
    P->>P: Materialize caller persona from scenario + path
    alt Outbound
        P->>T: Dial target's number
    else Inbound
        T->>P: Target dials AssureAgent number
    end
    P-->>T: Caller LLM speaks (audio / text)
    T-->>P: Target replies
    Note over P,T: Conversation continues until end_call or duration cap
    P->>P: Save transcript + audio
    P->>P: Evaluate success criteria
    P-->>U: Report ready
Inputs to a run
When you click Run, the run is fully described by:
- The path (persona + success criteria).
- The agent (voice, responsiveness, max duration).
- The target (phone number for voice, endpoint for chat).
- Any custom functions declared on the scenario (their tool schemas are visible to the caller LLM).
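As a rough illustration, the four inputs above can be pictured as one immutable record. This is a minimal sketch, not AssureAgent's actual schema: every class name, field name, and type below (`RunInputs`, `PathSpec`, `AgentSettings`, `Target`, `max_duration_s`, and so on) is hypothetical, and the rule that a target has exactly one of a phone number or an endpoint is an assumption drawn from the voice / chat split.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PathSpec:
    # Hypothetical shape: the path carries the persona and success criteria.
    persona: str
    success_criteria: Tuple[str, ...]


@dataclass(frozen=True)
class AgentSettings:
    # Hypothetical field names and types for the agent settings.
    voice: str
    responsiveness: float
    max_duration_s: int


@dataclass(frozen=True)
class Target:
    phone_number: Optional[str] = None  # used for voice tests
    endpoint: Optional[str] = None      # used for chat tests

    def __post_init__(self):
        # Assumption: a target is either voice or chat, never both or neither.
        if (self.phone_number is None) == (self.endpoint is None):
            raise ValueError("set exactly one of phone_number or endpoint")


@dataclass(frozen=True)
class RunInputs:
    path: PathSpec
    agent: AgentSettings
    target: Target
    # Tool schemas of custom functions declared on the scenario.
    custom_functions: Tuple[dict, ...] = ()


inputs = RunInputs(
    path=PathSpec(
        persona="Customer disputing a charge",
        success_criteria=("Agent verifies the caller's identity",),
    ),
    agent=AgentSettings(voice="example-voice", responsiveness=0.5, max_duration_s=300),
    target=Target(phone_number="+15555550100"),
)
```

Freezing the records mirrors the idea that a run is fully described at the moment you click Run.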
Outputs from a run
When the run completes you get:
- Transcript: every turn, speaker-labeled and timestamped.
- Audio: the full call recording for voice tests; click any turn to seek to it.
- Tool invocations: every call the caller made to a built-in (end_call, press_digit) or custom function, shown inline with the transcript.
- Latency stats: turn-by-turn time-to-first-token and total response duration.
- Success / failure: auto-evaluated against the path’s success criteria, with reasoning.
- Termination reason: caller_hung_up, target_hung_up, duration_cap, or error.
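To make the latency stats and termination reasons concrete, here is a small sketch of how a consumer of run results might model them. The four termination values come from the list above; everything else (`TurnLatency`, `summarize_latency`, the millisecond units, and the summary keys) is a hypothetical illustration, not an AssureAgent API.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean
from typing import List, Optional


class TerminationReason(str, Enum):
    # Values as listed in the run outputs.
    CALLER_HUNG_UP = "caller_hung_up"
    TARGET_HUNG_UP = "target_hung_up"
    DURATION_CAP = "duration_cap"
    ERROR = "error"


@dataclass
class TurnLatency:
    # Hypothetical per-turn record; milliseconds are an assumed unit.
    ttft_ms: float   # time to first token
    total_ms: float  # total response duration


def summarize_latency(turns: List[TurnLatency]) -> Optional[dict]:
    """Aggregate per-turn latency into a few headline numbers."""
    if not turns:
        return None  # nothing to summarize, e.g. the run ended before any reply
    ttfts = [t.ttft_ms for t in turns]
    totals = [t.total_ms for t in turns]
    return {
        "turns": len(turns),
        "mean_ttft_ms": mean(ttfts),
        "max_ttft_ms": max(ttfts),
        "mean_total_ms": mean(totals),
    }


summary = summarize_latency([TurnLatency(400.0, 1200.0), TurnLatency(600.0, 1800.0)])
reason = TerminationReason("duration_cap")
```

Parsing the reason into an enum means an unexpected string raises a `ValueError` instead of passing through unnoticed.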
Outbound vs inbound
The mechanics differ:
- Outbound — AssureAgent dials the target’s number from one of your provisioned outbound numbers. Caller speaks first (with the persona’s opening line).
- Inbound — the target’s IVR / system places a call into a number AssureAgent owns. Target speaks first; caller listens, navigates the IVR if needed, and engages.
Choose the direction when you create the scenario; a scenario can’t be switched between outbound and inbound afterward.
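The two directions can be summarized in a few lines of code. This is only a sketch of the rules described above, with hypothetical names (`Direction`, `Scenario`, `dialing_party`, `first_speaker`); a frozen dataclass stands in for the fact that the direction is fixed at scenario creation.

```python
from dataclasses import dataclass, FrozenInstanceError
from enum import Enum


class Direction(Enum):
    OUTBOUND = "outbound"
    INBOUND = "inbound"


@dataclass(frozen=True)
class Scenario:
    # Hypothetical model: direction is set once and cannot be reassigned.
    name: str
    direction: Direction


def dialing_party(direction: Direction) -> str:
    # Outbound: AssureAgent dials the target. Inbound: the target dials in.
    return "AssureAgent" if direction is Direction.OUTBOUND else "target"


def first_speaker(direction: Direction) -> str:
    # Outbound: the caller opens with the persona's opening line.
    # Inbound: the target speaks first and the caller listens.
    return "caller" if direction is Direction.OUTBOUND else "target"


scenario = Scenario(name="Billing dispute", direction=Direction.INBOUND)
try:
    scenario.direction = Direction.OUTBOUND  # rejected: the dataclass is frozen
except FrozenInstanceError:
    switched = False
else:
    switched = True
```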
Run states
| State | Meaning |
|---|---|
| pending | Queued, not yet picked up. |
| running | Conversation is live. Live transcript streams in. |
| completed | Conversation ended; transcript and report are final. |
| failed | A platform-level error stopped the run before a real conversation happened. This is different from a completed run that did not meet its success criteria: failed means the test could not be carried out at all. |
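The distinction between a failed run and a completed run with a bad result is easy to blur in code that consumes run data. The sketch below shows one way to keep them apart. The four state strings come from the table; the `RunState` enum, the `classify` function, and the `passed` flag are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Optional


class RunState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# States after which a run no longer changes.
TERMINAL_STATES = {RunState.COMPLETED, RunState.FAILED}


def classify(state: str, passed: Optional[bool] = None) -> str:
    """Map a run state (plus evaluation result, if any) to a coarse outcome."""
    run_state = RunState(state)  # raises ValueError on an unknown state
    if run_state not in TERMINAL_STATES:
        return "in progress"
    if run_state is RunState.FAILED:
        # Platform-level error: there is no test result to report.
        return "no result (platform error)"
    if passed is None:
        raise ValueError("a completed run needs an evaluation result")
    return "passed" if passed else "did not meet success criteria"
```

For example, `classify("failed")` and `classify("completed", passed=False)` return different outcomes, so a dashboard built on this would not count platform errors against the target agent.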
Where to find them
Open any scenario → click into a path → see the runs for that path. Or open Test results in the left nav for a global feed of recent runs across all scenarios.