Reports
Reports are how AssureAgent surfaces what happened. There are two flavors.
Single-run reports
One run, one detail page. You get this view by clicking any run from a scenario, path, or the global Test results feed.
What’s on the page:
- Header — scenario name, path name, agent, run timestamp, pass / fail badge, termination reason.
- Transcript — every turn, speaker-labeled and timestamped. Tool invocations appear inline as collapsible blocks (call → result). Click any turn to seek the recording to that point (voice runs only).
- Audio player — full call recording for voice runs. Speed control, scrub, download.
- Success criteria evaluation — the rubric used and the reasoning for the verdict. If the run failed, this is where you see why.
- Latency — per-turn time-to-first-token and total response time, with a small chart for the whole call.
- Metadata — caller persona snapshot, the success criteria as it stood at run time, path name, and a link to the parent scenario.
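The per-turn latency figures can be recomputed from turn timestamps if you export run data. A minimal sketch, assuming a hypothetical export where each agent turn records when the caller finished speaking, when the first token arrived, and when the response ended (these field names are illustrative, not the actual export schema):

```python
from dataclasses import dataclass

@dataclass
class AgentTurn:
    # Hypothetical export fields; the real schema may differ.
    prompt_end: float    # seconds: caller finished speaking
    first_token: float   # seconds: agent produced its first token
    response_end: float  # seconds: agent finished responding

def latency_stats(turns):
    """Per-turn time-to-first-token and total response time."""
    ttft = [t.first_token - t.prompt_end for t in turns]
    total = [t.response_end - t.prompt_end for t in turns]
    return ttft, total

turns = [AgentTurn(0.0, 0.8, 3.2), AgentTurn(10.0, 11.5, 14.0)]
ttft, total = latency_stats(turns)
print(ttft)   # [0.8, 1.5]
print(total)  # [3.2, 4.0]
```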
Aggregated reports
Many runs, one trend view. Useful for:
- Pass-rate over time. Did we get worse this week?
- Regression detection. Which paths started failing after the last deploy?
- Slowest paths. Where is the target spending the most time per turn?
- Most-failed scenarios. Where does the agent break down most often?
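These aggregates are easy to recompute from exported run records if you want them in your own dashboards. A minimal sketch of the pass-rate-by-path calculation, assuming each run reduces to a (path, passed) pair (an illustrative shape, not the actual export schema):

```python
from collections import defaultdict

def pass_rate_by_path(runs):
    """runs: iterable of (path, passed) pairs -> {path: pass rate}."""
    totals = defaultdict(lambda: [0, 0])  # path -> [passed count, total count]
    for path, passed in runs:
        totals[path][0] += int(passed)
        totals[path][1] += 1
    return {path: ok / n for path, (ok, n) in totals.items()}

runs = [
    ("billing-dispute", True),
    ("billing-dispute", False),
    ("card-replacement", True),
    ("card-replacement", True),
]
print(pass_rate_by_path(runs))
# {'billing-dispute': 0.5, 'card-replacement': 1.0}
```

Comparing this dictionary before and after a deploy is the essence of regression detection: any path whose rate dropped is a candidate.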
Aggregated views live under Reports in the left nav and can be filtered by:
- Scenario
- Path
- Date range
- Label (the tags you put on scenarios)
- Run status
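If you export runs and want to reproduce these filter dimensions offline, the logic is plain predicate composition. A sketch with hypothetical record fields (the real export schema may differ):

```python
from datetime import date

def matches(run, scenario=None, path=None, labels=None,
            status=None, start=None, end=None):
    """Apply the same filter dimensions to an exported run record (a dict)."""
    if scenario and run["scenario"] != scenario:
        return False
    if path and run["path"] != path:
        return False
    if labels and not set(labels) <= set(run["labels"]):
        return False  # run must carry every requested label
    if status and run["status"] != status:
        return False
    if start and run["date"] < start:
        return False
    if end and run["date"] > end:
        return False
    return True

run = {"scenario": "Refund flow", "path": "angry-caller",
       "labels": ["regression", "voice"], "status": "failed",
       "date": date(2025, 6, 3)}
print(matches(run, status="failed", labels=["voice"]))  # True
print(matches(run, scenario="Onboarding"))              # False
```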
Sharing
Any single-run report can be shared via a tokenized public link. The recipient sees a read-only view at a path like https://app.assureagent.ai/shared/report/<token> — no AssureAgent account required.
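If you script report distribution, a shared URL is just the token appended to the fixed prefix shown above. A sketch using a made-up token (real tokens are issued by the platform when you create a share link; `"abc123"` here is purely illustrative):

```python
from urllib.parse import urlparse

SHARE_PREFIX = "https://app.assureagent.ai/shared/report/"

def share_url(token: str) -> str:
    return SHARE_PREFIX + token

def token_from_url(url: str) -> str:
    """Recover the token from a shared-report URL."""
    path = urlparse(url).path
    if not path.startswith("/shared/report/"):
        raise ValueError("not a shared-report URL")
    return path.rsplit("/", 1)[-1]

url = share_url("abc123")
print(url)                  # https://app.assureagent.ai/shared/report/abc123
print(token_from_url(url))  # abc123
```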
What success criteria does (and doesn’t)
Success criteria is a free-text rubric — natural language, not code. Examples:
- “The agent verified the caller using DOB and last 4 of card before discussing the dispute.”
- “The caller received either a refund confirmation number OR a supervisor callback.”
After every run, the platform reads the transcript and the rubric, and decides pass / fail with a short reasoning paragraph. This is intentionally opinionated and human-readable. It does NOT replace your own judgment for edge cases — read the transcript yourself when something looks weird.
If you don’t write success criteria, the platform auto-generates a sensible one from the scenario description. You can always override it per path.
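Conceptually, this evaluation step is an LLM-as-judge call: the transcript and rubric go into a prompt, and a pass/fail verdict plus a short reasoning paragraph comes back. A stubbed sketch of that shape (the prompt wording and the `judge` callable are illustrative; the platform's actual prompt and model are not exposed):

```python
def build_judge_prompt(transcript: str, rubric: str) -> str:
    return (
        "You are grading a support call against a rubric.\n"
        f"Rubric:\n{rubric}\n\nTranscript:\n{transcript}\n\n"
        "Answer PASS or FAIL on the first line, then one short "
        "paragraph of reasoning."
    )

def parse_verdict(reply: str):
    """Split the model reply into (passed, reasoning)."""
    first, _, reasoning = reply.partition("\n")
    return first.strip().upper() == "PASS", reasoning.strip()

# A stub standing in for a real model call.
def judge(prompt: str) -> str:
    return "PASS\nThe agent verified DOB and card digits before the dispute."

passed, why = parse_verdict(judge(build_judge_prompt("...", "...")))
print(passed)  # True
```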
Retention
Transcripts, success criteria reasoning, and run metadata are retained indefinitely. Audio recordings are retained for a long but finite window; check Settings → Usage for the exact policy on your plan.