Testing AI Agents
You’re building AI agents that make autonomous decisions. The agent decides WHAT to do, not just HOW. Traditional tests don’t work because behavior is non-deterministic. MockWorld Tests let you assert on outcomes, not paths.
The challenge
Your agent processes requests like this:
- Look up Ana’s account
- Find her recent orders
- Check the return policy
- Create a refund in Stripe
- Update the order in Shopify
- Send Ana an email
- Log a support ticket
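The steps above can be completed in more than one valid order, which is what makes path-based assertions brittle. The following is a minimal sketch in plain Python; the step names are hypothetical stand-ins for real tool calls and are not part of Mokra's or LangChain's API.

```python
# Hypothetical step names for the list above; not a real API.
RUN_A = [
    "lookup_account", "find_orders", "check_return_policy",
    "create_refund", "update_order", "send_email", "log_ticket",
]
# A second run of the same request: the policy is checked before the
# orders are fetched, and the ticket is logged before the email goes out.
RUN_B = [
    "lookup_account", "check_return_policy", "find_orders",
    "create_refund", "update_order", "log_ticket", "send_email",
]

def same_path(expected, actual):
    """Path-based check: the exact sequence of steps must match."""
    return expected == actual

def same_outcome(expected, actual):
    """Outcome-based check: the same effects occurred, in any order."""
    return sorted(expected) == sorted(actual)

print(same_path(RUN_A, RUN_B))     # False
print(same_outcome(RUN_A, RUN_B))  # True
```

Both runs did the right thing, yet only the outcome-based check accepts both. Comparing sorted lists rather than sets means a duplicated step, such as a second refund, still fails.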
The solution
Test what happened, not how it happened.
Complete example with LangChain
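As an illustration of asserting on outcomes, here is a minimal sketch written in plain Python with a stand-in agent, so it runs without LangChain or a Mokra account. `FakeWorld`, `run_agent`, the field names, and the sample values are all hypothetical and do not reflect Mokra's API.

```python
import random
from dataclasses import dataclass, field

@dataclass
class FakeWorld:
    """Hypothetical in-memory stand-in for the external services."""
    refunds: list = field(default_factory=list)
    order_status: dict = field(default_factory=lambda: {"order_1": "paid"})
    emails: list = field(default_factory=list)
    tickets: list = field(default_factory=list)

def run_agent(world, seed):
    """Stand-in agent: performs the follow-up steps in a seed-dependent order."""
    world.refunds.append({"order": "order_1", "amount": 4200})
    follow_ups = [
        lambda: world.order_status.update(order_1="refunded"),
        lambda: world.emails.append({"to": "ana@example.com", "subject": "Refund"}),
        lambda: world.tickets.append({"order": "order_1"}),
    ]
    random.Random(seed).shuffle(follow_ups)  # simulate non-deterministic paths
    for step in follow_ups:
        step()

def assert_refund_outcome(world):
    """Assert on the final state only, never on the order of calls."""
    assert len(world.refunds) == 1
    assert world.order_status["order_1"] == "refunded"
    assert [e["to"] for e in world.emails] == ["ana@example.com"]
    assert len(world.tickets) == 1

# The same assertions hold regardless of which path the agent took.
for seed in range(5):
    world = FakeWorld()
    run_agent(world, seed)
    assert_refund_outcome(world)
```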
Seeding state
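Seeding means putting the test world into a known state before the agent runs, so outcomes can be compared against a known baseline. The following is a minimal sketch using plain dataclasses; the `seed_world` helper, the record fields, and the sample values are hypothetical and are not Mokra's seeding API.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    id: str
    name: str
    email: str

@dataclass
class Order:
    id: str
    customer_id: str
    total_cents: int
    status: str

def seed_world():
    """Return a small, known starting state (all values are made up)."""
    customers = {"cus_1": Customer("cus_1", "Ana", "ana@example.com")}
    orders = {
        "ord_1": Order("ord_1", "cus_1", 4200, "delivered"),
        "ord_2": Order("ord_2", "cus_1", 1500, "refunded"),
    }
    return {"customers": customers, "orders": orders}

def refundable_orders(world, customer_id):
    """Orders for a customer that have not been refunded yet."""
    return [
        order for order in world["orders"].values()
        if order.customer_id == customer_id and order.status != "refunded"
    ]

world = seed_world()
print([o.id for o in refundable_orders(world, "cus_1")])  # ['ord_1']
```

Including an already-refunded order in the seed gives the agent a chance to make a realistic mistake, such as refunding the same order twice.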
Set up realistic test scenarios.
Testing safety boundaries
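A safety boundary can be expressed as a rule over the requests the agent made. The following is a minimal sketch, assuming requests are recorded as plain dicts; the record shape, the `/refunds` path, and the refund limit are hypothetical.

```python
MAX_REFUND_CENTS = 5000  # hypothetical policy limit

def boundary_violations(requests, max_refund_cents=MAX_REFUND_CENTS):
    """Return a readable list of boundary violations (empty if none)."""
    violations = []
    for req in requests:
        if req["method"] == "DELETE":
            violations.append(f"destructive call: DELETE {req['path']}")
        amount = req["body"].get("amount", 0)
        if req["path"] == "/refunds" and amount > max_refund_cents:
            violations.append(f"refund over limit: {amount}")
    return violations

recorded = [
    {"method": "POST", "path": "/refunds", "body": {"amount": 4200}},
    {"method": "POST", "path": "/emails", "body": {"to": "ana@example.com"}},
]
assert boundary_violations(recorded) == []
```

Returning the violations as text, rather than a bare boolean, keeps a failing test readable.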
Verify your agent stays within bounds.
Testing error handling
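Graceful degradation can be checked by making a dependency fail and asserting on what the agent left behind. The following is a minimal sketch with a stand-in agent step and a fake payment call that always fails; every name in it is hypothetical.

```python
class PaymentUnavailable(Exception):
    """Raised by the fake payment service to simulate an outage."""

def failing_create_refund(order_id, amount):
    raise PaymentUnavailable("payment service returned 503")

def handle_refund(order, create_refund):
    """Stand-in agent step: refund the order, or escalate if the refund fails."""
    result = {"order_status": order["status"], "emails": [], "tickets": []}
    try:
        create_refund(order["id"], order["total_cents"])
    except PaymentUnavailable as exc:
        # Degrade gracefully: leave the order untouched and escalate to a human.
        result["tickets"].append({"order": order["id"], "reason": str(exc)})
        return result
    result["order_status"] = "refunded"
    result["emails"].append({"subject": "Your refund is on its way"})
    return result

order = {"id": "ord_1", "total_cents": 4200, "status": "delivered"}
outcome = handle_refund(order, failing_create_refund)
assert outcome["order_status"] == "delivered"  # not marked as refunded
assert outcome["emails"] == []                  # no false confirmation sent
assert len(outcome["tickets"]) == 1             # the failure was escalated
```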
Verify graceful degradation.
Programmatic assertions
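When a validation needs arithmetic or cross-checks between records, it can be written as ordinary code over the recorded state. The following is a minimal sketch, assuming orders and refunds are available as plain dicts; this shape is hypothetical and is not Mokra's API.

```python
def check_refunds(orders, refunds):
    """Cross-check refunds against orders; return a list of problems found."""
    refunded_by_order = {}
    for refund in refunds:
        order_id = refund["order"]
        refunded_by_order[order_id] = refunded_by_order.get(order_id, 0) + refund["amount"]

    problems = []
    for order_id, refunded in refunded_by_order.items():
        if order_id not in orders:
            problems.append(f"{order_id}: refund for unknown order")
        elif refunded > orders[order_id]["total_cents"]:
            total = orders[order_id]["total_cents"]
            problems.append(f"{order_id}: refunded {refunded} > order total {total}")
    return problems

orders = {"ord_1": {"total_cents": 4200}}
one_refund = [{"order": "ord_1", "amount": 4200}]
assert check_refunds(orders, one_refund) == []
assert check_refunds(orders, one_refund * 2) == ["ord_1: refunded 8400 > order total 4200"]
```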
Use programmatic assertions for complex validations.
Mokra vs LangSmith
| Aspect | LangSmith | Mokra |
|---|---|---|
| Layer | LangChain callbacks | HTTP |
| Sees | Tool invocations, LLM calls | All HTTP requests |
| Catches | What LangChain reports | Everything |
| Output | Traces, tokens, latency | Plain English |
| Purpose | Debug LLM reasoning | Test outcomes |
Running in CI/CD
MOKRA_API_KEY.
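In CI, a missing key is easier to diagnose when the test run fails fast with a clear message. The following is a minimal sketch that reads `MOKRA_API_KEY` from the environment; the helper name and the error text are hypothetical, and how the key is then handed to Mokra is not shown here.

```python
import os

def require_api_key(env=None):
    """Return MOKRA_API_KEY from the environment, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get("MOKRA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MOKRA_API_KEY is not set; add it to your CI secrets")
    return key
```

Accepting an optional mapping makes the helper itself testable without touching the real environment.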