Test runs

A test run is one execution of a test path against the target agent. Each click of Run produces a new run; runs are immutable once finished and grouped into reports.

What happens when you run

sequenceDiagram
  actor U as You
  participant P as AssureAgent
  participant T as Target agent
  U->>P: Click Run
  P->>P: Materialize caller persona from scenario + path
  alt Outbound
    P->>T: Dial target's number
  else Inbound
    T->>P: Target dials AssureAgent number
  end
  P-->>T: Caller LLM speaks (audio / text)
  T-->>P: Target replies
  Note over P,T: Conversation continues until end_call or duration cap
  P->>P: Save transcript + audio
  P->>P: Evaluate success criteria
  P-->>U: Report ready
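
The turn-taking loop in the diagram can be sketched as a small simulation. This is an illustrative model only, not AssureAgent's implementation: the function names, the fixed per-turn cost, and the convention that the target returns None to hang up are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

Transcript = List[Tuple[str, str]]  # (speaker, text) pairs

# Hypothetical sentinel standing in for the caller's end_call built-in.
END_CALL = "end_call"


def simulate_run(
    caller: Callable[[Transcript], str],
    target: Callable[[Transcript], Optional[str]],
    max_duration_s: float,
    seconds_per_turn: float = 5.0,  # assumed fixed cost per turn, for illustration
    caller_first: bool = True,      # outbound: caller opens; inbound: target opens
) -> Tuple[Transcript, str]:
    """Alternate turns until someone hangs up or the duration cap is reached."""
    transcript: Transcript = []
    elapsed = 0.0
    speaker = "caller" if caller_first else "target"
    while True:
        # Stop before starting a turn that would exceed the cap.
        if elapsed + seconds_per_turn > max_duration_s:
            return transcript, "duration_cap"
        if speaker == "caller":
            text = caller(transcript)
            if text == END_CALL:
                return transcript, "caller_hung_up"
        else:
            text = target(transcript)
            if text is None:  # assumed convention: None means the target hung up
                return transcript, "target_hung_up"
        transcript.append((speaker, text))
        elapsed += seconds_per_turn
        speaker = "target" if speaker == "caller" else "caller"


def scripted_caller(transcript: Transcript) -> str:
    """Caller that says two lines, then ends the call."""
    lines = ["Hi, I'd like to check my order.", "It's order 1234.", END_CALL]
    caller_turns = sum(1 for who, _ in transcript if who == "caller")
    return lines[caller_turns]


def polite_target(transcript: Transcript) -> Optional[str]:
    """Target that always gives the same reply and never hangs up."""
    return "Sure, one moment."


transcript, reason = simulate_run(scripted_caller, polite_target, max_duration_s=60)
print(reason, len(transcript))
```

Lowering max_duration_s in this sketch makes the loop stop with duration_cap before the caller reaches its end_call line, which mirrors the two exit conditions named in the diagram's note.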

Inputs to a run

When you click Run, the run is fully described by:

The path (persona + success criteria).
The agent (voice, responsiveness, max duration).
The target (phone number for voice, endpoint for chat).
Any custom functions declared on the scenario (their tool schemas are visible to the caller LLM).
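
As a rough mental model, the four inputs above can be pictured as one immutable record. The class and field names below are hypothetical, chosen for illustration; they are not AssureAgent's actual schema, and the type of responsiveness in particular is a guess.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class PathSpec:
    persona: str
    success_criteria: Tuple[str, ...]


@dataclass(frozen=True)
class AgentSettings:
    voice: str
    responsiveness: float  # assumed numeric; the real unit is not specified
    max_duration_s: int


@dataclass(frozen=True)
class TargetSpec:
    kind: str     # "voice" or "chat"
    address: str  # phone number for voice, endpoint for chat

    def __post_init__(self) -> None:
        if self.kind not in ("voice", "chat"):
            raise ValueError(f"unknown target kind: {self.kind!r}")


@dataclass(frozen=True)
class RunInputs:
    path: PathSpec
    agent: AgentSettings
    target: TargetSpec
    # Tool schemas of custom functions declared on the scenario.
    custom_functions: Tuple[Dict[str, Any], ...] = ()


inputs = RunInputs(
    path=PathSpec(
        persona="Customer asking about a late delivery",
        success_criteria=("Agent provides a delivery estimate",),
    ),
    agent=AgentSettings(voice="example-voice", responsiveness=0.5, max_duration_s=300),
    target=TargetSpec(kind="voice", address="+15555550100"),
    custom_functions=({"name": "lookup_order", "parameters": {"order_id": "string"}},),
)
print(inputs.target.kind, inputs.agent.max_duration_s)
```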

Outputs from a run

When the run completes you get:

Transcript: every turn, speaker-labeled and timestamped.
Audio: the full call recording for voice tests. Click any turn to seek to it.
Tool invocations: every call the caller made to a built-in (end_call, press_digit) or custom function, shown inline with the transcript.
Latency stats: turn-by-turn time-to-first-token and total response duration.
Success / failure: auto-evaluated against the path’s success criteria, with reasoning.
Termination reason: one of caller_hung_up, target_hung_up, duration_cap, or error.
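
If you export the turn-by-turn latency figures, a short script can condense them into run-level numbers. The sketch below assumes a hypothetical record shape (TurnLatency and its field names are invented for the example) and simply takes the mean and maximum of each metric.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class TurnLatency:
    turn_index: int
    time_to_first_token_ms: float  # hypothetical field name
    total_response_ms: float       # hypothetical field name


def summarize_latency(turns: List[TurnLatency]) -> Dict[str, float]:
    """Return mean and max of both latency metrics; empty dict if no turns."""
    if not turns:
        return {}
    ttft = [t.time_to_first_token_ms for t in turns]
    total = [t.total_response_ms for t in turns]
    return {
        "ttft_mean_ms": mean(ttft),
        "ttft_max_ms": max(ttft),
        "response_mean_ms": mean(total),
        "response_max_ms": max(total),
    }


summary = summarize_latency([
    TurnLatency(1, 400.0, 2000.0),
    TurnLatency(2, 600.0, 3000.0),
])
print(summary)
```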

Outbound vs inbound

The mechanics differ:

Outbound: AssureAgent dials the target’s number from one of your provisioned outbound numbers. The caller speaks first, using the persona’s opening line.
Inbound: the target’s IVR or system places a call to a number AssureAgent owns. The target speaks first; the caller listens, navigates the IVR if needed, and then engages.

Choose the direction when you create the scenario; you can’t switch a scenario between outbound and inbound afterward.

Run states

| State | Meaning |
| --- | --- |
| `pending` | Queued, not yet picked up. |
| `running` | Conversation is live. The live transcript streams in. |
| `completed` | Conversation ended; transcript and report are final. |
| `failed` | A platform-level error stopped the run before a real conversation happened. This is not the same as a run that completed but did not meet its success criteria: `failed` means the test could not be carried out at all. |
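
The difference between a `failed` run and a `completed` run that did not meet its criteria can be made explicit in code. The sketch below is a reading of the table, not a documented API: the transition graph and the outcome labels are assumptions for illustration.

```python
from enum import Enum
from typing import Optional


class RunState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition graph; terminal states have no outgoing edges,
# matching the rule that runs are immutable once finished.
TRANSITIONS = {
    RunState.PENDING: {RunState.RUNNING, RunState.FAILED},
    RunState.RUNNING: {RunState.COMPLETED, RunState.FAILED},
    RunState.COMPLETED: set(),
    RunState.FAILED: set(),
}


def can_transition(current: RunState, new: RunState) -> bool:
    return new in TRANSITIONS[current]


def outcome(state: RunState, criteria_met: Optional[bool]) -> str:
    """Map a run state plus its evaluation result to a coarse outcome label."""
    if state is RunState.FAILED:
        return "could_not_test"  # platform error: nothing was evaluated
    if state is RunState.COMPLETED:
        return "passed" if criteria_met else "did_not_pass"
    return "in_progress"  # pending or running


print(outcome(RunState("failed"), None))
print(outcome(RunState("completed"), False))
```

Keeping "could_not_test" separate from "did_not_pass" means platform errors never count against the target agent's results.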

Where to find them

Open any scenario → click into a path → see the runs for that path. Or open Test results in the left nav for a global feed of recent runs across all scenarios.