Vibe testing
Vibe testing is the rapid exploratory mode in AssureAgent. Instead of authoring scenarios one by one, you provide a high-level intent — “stress-test the agent’s hold-music handling”, “find places it leaks PII” — and the platform generates a batch of scenarios on the fly, runs them, and surfaces the most interesting failures.
The Anthropic key
Vibe testing is the only AssureAgent feature that requires a key you supply yourself. You bring your own Anthropic API key, and the platform uses it to generate the scenarios.
This is by design. Vibe testing can produce a large volume of generated scenarios on demand; routing those through a key you own keeps your usage and billing transparent.
Adding the key
- Open Settings.
- In Vibe testing key, paste your Anthropic API key.
- Click Save.
The key is stored encrypted at rest and is never logged. You can rotate it at any time; the new key takes effect on the next vibe test.
You don’t need this key for any other AssureAgent feature. Test runs, custom functions, reports — all of those work without it.
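If you want to confirm that a key is live before pasting it into Settings, a quick check against Anthropic's API can save a failed first run. The sketch below is not part of AssureAgent; it assumes Anthropic's public "list models" endpoint and the `x-api-key` / `anthropic-version` headers as documented by Anthropic at the time of writing, and the function names are illustrative.

```python
import urllib.error
import urllib.request

# Assumption: Anthropic's public "list models" endpoint. Listing models
# does not generate any tokens.
ANTHROPIC_MODELS_URL = "https://api.anthropic.com/v1/models"


def build_key_check_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request authenticated with the key."""
    return urllib.request.Request(
        ANTHROPIC_MODELS_URL,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
        },
        method="GET",
    )


def key_is_accepted(api_key: str, timeout: float = 10.0) -> bool:
    """Return True if Anthropic accepts the key, False on a 401 response.

    Any other HTTP or network error is re-raised, because it says nothing
    about whether the key itself is valid.
    """
    request = build_key_check_request(api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        raise


# Example (performs a real network call):
#   import os
#   print(key_is_accepted(os.environ["ANTHROPIC_API_KEY"]))
```

Read the key from an environment variable or a secrets manager rather than hard-coding it in a script.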
Running a vibe test
- From the left nav, click Vibe testing.
- Pick the target (an existing agent, number, or endpoint you want to probe).
- Write the intent — 1–3 sentences describing what you want to find.
- Pick the batch size — how many scenarios to generate and run.
- Click Run.
The platform generates the scenarios using your Anthropic key, runs them sequentially or in parallel (per your batch settings), and produces a summary report.
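The inputs to a vibe test are small: a target, an intent, and a batch size. If your team drafts intents outside the UI (for example in a planning doc), a lightweight check like the one below can catch drafts that are empty or too long before anyone pastes them in. This is a sketch under stated assumptions: `VibeTestRequest` and `validate` are hypothetical names, not an AssureAgent API, and the sentence count is a rough heuristic.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VibeTestRequest:
    """Hypothetical container mirroring the three fields of the Run form."""
    target: str      # agent, number, or endpoint to probe
    intent: str      # 1-3 sentences describing what to look for
    batch_size: int  # how many scenarios to generate and run


def count_sentences(text: str) -> int:
    """Rough count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([part for part in parts if part.strip()])


def validate(request: VibeTestRequest) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not request.target.strip():
        problems.append("target is empty")
    sentences = count_sentences(request.intent)
    if not 1 <= sentences <= 3:
        problems.append(f"intent has {sentences} sentences; aim for 1-3")
    if request.batch_size < 1:
        problems.append("batch size must be at least 1")
    return problems


draft = VibeTestRequest(
    target="support-line-agent",
    intent="Stress-test the agent's hold-music handling.",
    batch_size=10,
)
print(validate(draft))
```

A short, focused intent tends to be easier to judge against the report afterwards, which is the reason for the 1-3 sentence guideline.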
What the report shows
Vibe test reports group runs into outcomes:
- Behavior matched intent — runs that surfaced what you asked about.
- Notable but off-topic — runs that found something interesting but unrelated.
- No signal — runs that completed without surfacing anything notable.
Click any group to see the underlying runs, full transcripts, and audio. Anything you find worth keeping can be promoted to a permanent scenario for future regression coverage.
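To make the grouping concrete, the sketch below models the three outcome groups and picks out the runs worth reviewing for promotion. It is illustrative only: the `Run` record and its fields are assumptions, not a documented AssureAgent export format.

```python
from dataclasses import dataclass

# The three outcome groups shown in a vibe test report.
OUTCOMES = (
    "Behavior matched intent",
    "Notable but off-topic",
    "No signal",
)


@dataclass
class Run:
    """Hypothetical record for one generated scenario run."""
    run_id: str
    outcome: str
    summary: str


def group_by_outcome(runs: list[Run]) -> dict[str, list[Run]]:
    """Bucket runs under each outcome, keeping empty groups visible."""
    groups: dict[str, list[Run]] = {outcome: [] for outcome in OUTCOMES}
    for run in runs:
        if run.outcome not in groups:
            raise ValueError(f"unknown outcome: {run.outcome!r}")
        groups[run.outcome].append(run)
    return groups


def promotion_candidates(groups: dict[str, list[Run]]) -> list[Run]:
    """Runs that surfaced something: on-topic findings first, then off-topic."""
    return groups["Behavior matched intent"] + groups["Notable but off-topic"]


runs = [
    Run("r1", "No signal", "Call completed normally"),
    Run("r2", "Behavior matched intent", "Agent talked over hold music"),
    Run("r3", "Notable but off-topic", "Agent repeated the caller's card number"),
]
groups = group_by_outcome(runs)
for run in promotion_candidates(groups):
    print(run.run_id, run.summary)
```

Runs in the "No signal" group are left out of the candidates because they did not surface anything to regress against.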
When to use vibe testing
- Early in agent development when you don’t know what to test for yet.
- After a vendor change — quickly probe whether the new model behaves differently.
- During pre-launch — find weird edge cases that hand-authored scenarios miss.
- When triaging a customer complaint — paste the complaint as the intent, see what the platform reproduces.
When NOT to use vibe testing
- For your regression baseline. Hand-authored scenarios are deterministic and reproducible; vibe tests are exploratory and the generated personas vary across runs. Use authored scenarios for the test suite that runs every night.
- When you need exact reproducibility. Two vibe-test runs with the same intent will not produce identical scenarios.
Cost
You pay Anthropic directly for the API usage; AssureAgent does not mark up or proxy those charges. Generation cost depends on intent complexity and batch size; a typical batch of 10 scenarios costs a few cents in tokens.
The call time for the resulting test runs counts against your AssureAgent test minutes quota the same as any other run.
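A vibe test therefore draws on two separate budgets: Anthropic tokens for generation and AssureAgent test minutes for the calls. The back-of-the-envelope estimate below shows how the two scale with batch size. Every number in it is a placeholder assumption (tokens per scenario, per-million-token prices, average call length); substitute the current figures from Anthropic's pricing page and from your own runs.

```python
def estimate_generation_cost_usd(
    batch_size: int,
    input_tokens_per_scenario: int,
    output_tokens_per_scenario: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Token cost of generating a batch, billed by Anthropic.

    Prices are in USD per million tokens.
    """
    input_cost = batch_size * input_tokens_per_scenario * input_price_per_mtok
    output_cost = batch_size * output_tokens_per_scenario * output_price_per_mtok
    return (input_cost + output_cost) / 1_000_000


def estimate_test_minutes(batch_size: int, avg_call_minutes: float) -> float:
    """Call time counted against the AssureAgent test minutes quota."""
    return batch_size * avg_call_minutes


# Placeholder inputs, chosen only to illustrate the arithmetic.
cost = estimate_generation_cost_usd(
    batch_size=10,
    input_tokens_per_scenario=500,
    output_tokens_per_scenario=300,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
minutes = estimate_test_minutes(batch_size=10, avg_call_minutes=2.5)
print(f"generation: ${cost:.2f}, call time: {minutes:.0f} test minutes")
```

With these placeholder inputs the generation cost comes to about six cents, while the same batch uses 25 test minutes, so for larger batches the call time is usually the figure to watch.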