
Reports

Reports are how AssureAgent surfaces what happened. There are two flavors.

Single-run reports

One run, one detail page. You get this view by clicking any run from a scenario, path, or the global Test results feed.

What’s on the page:

  • Header — scenario name, path name, agent, run timestamp, pass / fail badge, termination reason.
  • Transcript — every turn, speaker-labeled, timestamped. Tool invocations are shown inline as collapsible blocks (call → result). Click any turn to seek the audio to that point (voice runs only).
  • Audio player — full call recording for voice runs. Speed control, scrub, download.
  • Success criteria evaluation — the rubric used and the reasoning for the verdict. If the run failed, this is where you see why.
  • Latency — per-turn time-to-first-token and total response time, with a small chart for the whole call (see the sketch after this list).
  • Metadata — caller persona snapshot, success criteria as it was at run time, path name, links to the parent scenario.
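
If you want to sanity-check the latency numbers yourself, the definitions are simple. Here is a minimal sketch of one common way to compute them, assuming hypothetical per-turn timestamps (an illustration of the metrics, not necessarily how AssureAgent measures them):

```python
from datetime import datetime

def ts(s: str) -> datetime:
    return datetime.fromisoformat(s)

# Hypothetical timestamps for one turn: when the caller stopped speaking, when the
# agent produced its first token (or first audio), and when the agent finished.
turn = {
    "caller_end":        "2024-05-01T10:00:02.000",
    "agent_first_token": "2024-05-01T10:00:02.850",
    "agent_end":         "2024-05-01T10:00:06.200",
}

# Time-to-first-token: the silence the caller hears before the agent starts replying.
ttft = (ts(turn["agent_first_token"]) - ts(turn["caller_end"])).total_seconds()

# Total response time (one reasonable reading): end of caller turn to end of agent reply.
total = (ts(turn["agent_end"]) - ts(turn["caller_end"])).total_seconds()

print(f"TTFT {ttft:.2f}s, total response {total:.2f}s")  # TTFT 0.85s, total response 4.20s
```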

View result guide →

Aggregated reports

Many runs, one trend view. Useful for:

  • Pass-rate over time. Did we get worse this week?
  • Regression detection. Which paths started failing after the last deploy?
  • Slowest paths. Where is the target agent spending the most time per turn?
  • Most-failed scenarios. Where does the agent break down most often?

Aggregated views live under Reports in the left nav and can be filtered by the following (a sketch of the same filtering done by hand appears after the list):

  • Scenario
  • Path
  • Date range
  • Label (the tags you put on scenarios)
  • Run status
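
The questions above reduce to filtering and grouping over run records. Here is a minimal sketch, assuming hypothetical records with scenario, label, status, and finished_at fields (illustrative names, not the platform's actual schema or export format):

```python
from collections import defaultdict
from datetime import date

# Hypothetical run records; field names are illustrative only.
runs = [
    {"scenario": "Card dispute",  "label": "billing", "status": "pass", "finished_at": date(2024, 5, 6)},
    {"scenario": "Card dispute",  "label": "billing", "status": "fail", "finished_at": date(2024, 5, 13)},
    {"scenario": "Refund request", "label": "billing", "status": "pass", "finished_at": date(2024, 5, 13)},
]

# Filter: label + date range, mirroring the report filters above.
window = [r for r in runs
          if r["label"] == "billing"
          and date(2024, 5, 1) <= r["finished_at"] <= date(2024, 5, 31)]

# Pass-rate over time: group by ISO week and compute the share of passes.
by_week = defaultdict(list)
for r in window:
    by_week[r["finished_at"].isocalendar()[1]].append(r["status"] == "pass")

for week, results in sorted(by_week.items()):
    print(f"week {week}: {sum(results) / len(results):.0%} pass ({len(results)} runs)")
```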

Sharing

Any single-run report can be shared via a tokenized public link. The recipient sees a read-only view at a path like https://app.assureagent.ai/shared/report/<token> — no AssureAgent account required.
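
Because the token in the path is the only credential, the link can be opened by anything that can make an HTTP request. A minimal sketch with a made-up token, assuming nothing beyond the URL pattern above:

```python
import urllib.request

# Made-up token for illustration; real tokens are generated when you share a report.
token = "rpt_0123456789abcdef"
url = f"https://app.assureagent.ai/shared/report/{token}"

# No API key, cookie, or account needed: the token in the path is the only credential.
with urllib.request.urlopen(url) as resp:
    print(resp.status)   # 200 while the share link is valid
    page = resp.read()   # the read-only report page
```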

Sharing reports →

What success criteria does (and doesn’t)

Success criteria is a free-text rubric — natural language, not code. Examples:

  • “The agent verified the caller using DOB and last 4 of card before discussing the dispute.”
  • “The caller received either a refund confirmation number OR a supervisor callback.”

After every run, the platform reads the transcript and the rubric, and decides pass / fail with a short reasoning paragraph. This is intentionally opinionated and human-readable. It does NOT replace your own judgment for edge cases — read the transcript yourself when something looks weird.
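
Conceptually this is an LLM-as-judge step: rubric and transcript in, verdict and reasoning out. A rough sketch of that shape follows; the prompt, function names, and output fields are assumptions, not the platform's actual implementation:

```python
import json

def evaluate_run(transcript: str, rubric: str, call_llm) -> dict:
    """LLM-as-judge sketch: returns {"pass": bool, "reasoning": str}.

    `call_llm` is any callable that takes a prompt string and returns the
    model's text; it stands in for whichever model/provider you use.
    """
    prompt = (
        "You are grading a conversation against a rubric.\n\n"
        f"Rubric:\n{rubric}\n\n"
        f"Transcript:\n{transcript}\n\n"
        'Reply with JSON only: {"pass": true or false, "reasoning": "<one short paragraph>"}'
    )
    verdict = json.loads(call_llm(prompt))
    return {"pass": bool(verdict["pass"]), "reasoning": str(verdict["reasoning"])}
```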

If you don’t write success criteria, the platform auto-generates a sensible one from the scenario description. You can always override it per path.

Retention

Transcripts, success criteria reasoning, and run metadata are retained indefinitely. Audio recordings have a long-but-finite retention window; check Settings → Usage for the exact policy on your plan.