LLMs are probabilistic: the same prompt can produce different results on each run. Running a test once tells you it can work, not that it reliably works. Evals run the same test many times and measure accuracy statistically.
```typescript
await test.run(agent, {
  iterations: 30,   // How many times to run
  concurrency: 5,   // Parallel runs (careful with rate limits)
  retries: 2,       // Retry failures
  timeoutMs: 30000, // Per-test timeout
  mcpjam: {
    // Auto-save is enabled when MCPJAM_API_KEY is available
    suiteName: "SDK eval smoke",
    strict: false, // warn by default; true to fail CI on upload errors
  },
  onProgress: (done, total) => {
    console.log(`${done}/${total}`);
  },
});
```
Both EvalTest and EvalSuite can save results to MCPJam automatically when a run completes: set MCPJAM_API_KEY in your environment and no further configuration is needed.
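A minimal sketch of an auto-saving run, reusing the options from the example above (`test` and `agent` are assumed to come from your existing setup):

```typescript
// Requires MCPJAM_API_KEY in the environment; when the key is present,
// results are uploaded to MCPJam as soon as the run completes.
await test.run(agent, {
  iterations: 30,
  mcpjam: {
    suiteName: "SDK eval smoke", // how the run is labeled in MCPJam
  },
});
```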
By default, when results are built from traces (auto-save and reporter helpers), a failed tool execution (an MCP isError response, an errored tool span, or an error tool-result in messages) sets passed: false for that iteration, even if your test function returned true. This keeps the CI pass rate aligned with real server behavior. To disable this for a run (for example, when you only assert which tools were called, not that every call succeeded), set failOnToolError: false on the mcpjam block.
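Again a sketch under the same assumptions as above:

```typescript
await test.run(agent, {
  iterations: 30,
  mcpjam: {
    suiteName: "SDK eval smoke",
    // Don't mark iterations failed on tool errors; only the test
    // function's return value decides passed/failed.
    failOnToolError: false,
  },
});
```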
You can also generate eval code from the MCPJam Inspector. Click ⋮ → Copy markdown for server evals on any server card, then paste it into an LLM; see the Quickstart for details. If you have an MCPJAM_API_KEY, the generated code will automatically save results to the Evals tab in the Inspector. Go to Settings > Workspace API Key to get your key.