Handling failures
Tools fail, networks blip, APIs change. The patterns that keep agents useful through it all.
Real agents fail in real ways. The good news: failures inside an agent run are recoverable by the Mind, not by you. The right prompt patterns turn "the script crashed" into "the agent worked around it."
How failures surface
When a tool fails, it returns { "error": "Jettson <Tool> failed: <reason>" }. The Mind reads this on the next iteration like any other tool result — it doesn't crash the agent. Your prompt decides what happens next.
The whole loop only ends in error if:
- The Mind hits the 20-iteration cap
- The agent exceeds the plan's max duration
- The reasoning proxy is unavailable for the entire run (very rare)
Single-tool failures? The agent keeps going. Your job is to tell it how.
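If you log or inspect tool results yourself, the error shape described above can be picked apart with a small helper. This is a minimal sketch that assumes the string always follows the `Jettson <Tool> failed: <reason>` pattern; `parse_tool_error` is a hypothetical name, not part of any Jettson SDK.

```python
import re
from typing import Optional, Tuple

# Assumed shape, taken from the text above:
#   {"error": "Jettson <Tool> failed: <reason>"}
_TOOL_ERROR = re.compile(r"^Jettson (?P<tool>.+?) failed: (?P<reason>.*)$", re.DOTALL)


def parse_tool_error(result: dict) -> Optional[Tuple[str, str]]:
    """Return (tool, reason) if the result looks like a tool failure, else None."""
    message = result.get("error")
    if not isinstance(message, str):
        return None
    match = _TOOL_ERROR.match(message)
    if match is None:
        return None
    return match.group("tool"), match.group("reason")
```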
Pattern 1 — Ask the Mind to handle errors gracefully
In the task prompt, give the agent permission to recover:
If jettson_browser_navigate fails:
- Wait, then try once more
- If it fails again, try jettson_http_request to the same URL
- If that also fails, return { "error": "could not reach <url>", "fallback": <whatever you have so far> }
Don't error out the whole task; return partial progress.
This converts "the task failed" into "the task returned a partial answer with an error field." Way easier to handle on the caller side.
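On the caller side, a result produced under that prompt might be handled along these lines. This is a sketch that assumes the agent's final result arrives as an already-parsed dict with the `error` and `fallback` fields from the prompt; `summarize_result` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Classify an agent result as complete or partial based on its "error" field."""
    if "error" in result:
        return {
            "complete": False,
            "reason": result["error"],
            # "fallback" holds whatever the agent gathered before giving up.
            "data": result.get("fallback"),
        }
    return {"complete": True, "reason": None, "data": result}
```

A partial result can then be shown with a warning instead of being discarded.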
Pattern 2 — Retry at the prompt level
For transient failures (network blips, 503s), one retry is usually enough.
If a tool call fails with a temporary error (timeout, 5xx, "temporarily unavailable"),
wait a moment and try again ONCE. Don't retry more than once.
If the second attempt also fails, fall back to the alternative source.
The Mind is good at distinguishing retry-worthy errors (transient infra) from permanent ones (bad input).
Pattern 3 — Fallback chains
When a task can be done multiple ways, list them in order:
To find the company's pricing:
1. jettson_http_request to https://<domain>/pricing.json (if their public API has it)
2. jettson_browser_navigate to https://<domain>/pricing (scrape)
3. jettson_browser_navigate to https://<domain> (look for a pricing link)
4. If still nothing, return { "pricing": "unknown", "tried": ["api", "scrape", "homepage"] }
This shape is what makes the customer-research example robust: no single source is reliable, but three of them stacked are.
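If you build task prompts in code, a fallback chain like the one above can be generated from an ordered list. This is a minimal sketch; `fallback_chain_prompt` and its parameters are hypothetical, and the wording of the final step simply mirrors the example.

```python
from typing import List


def fallback_chain_prompt(goal: str, steps: List[str], give_up_json: str) -> str:
    """Render an ordered fallback chain, ending with an explicit give-up step."""
    lines = [f"{goal}:"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step}")
    # The last step tells the agent what to return when every source fails.
    lines.append(f"{len(steps) + 1}. If still nothing, return {give_up_json}")
    return "\n".join(lines)
```

Keeping the give-up step inside the helper means a chain cannot be written without a defined final answer.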
Pattern 4 — Tell the agent when to give up
The default is for the Mind to keep trying. That's usually right but occasionally wasteful. Bound it:
If after 3 distinct attempts you can't get the data, return a partial result.
Do not loop indefinitely.
The 20-iteration cap is the absolute backstop; this prompt-level guidance kicks in earlier and produces cleaner output.
Pattern 5 — Ask the user (when there's a user to ask)
For agents in interactive contexts (chat, copilot UI), the right answer to ambiguity is sometimes "ask back." Bake it in:
If the task lacks information you need to make a confident call, return:
{ "needs_input": "<what you need from the human>" }
Don't guess. The user is on the other end of this and they'll answer.
The caller sees needs_input populated and renders a follow-up prompt UI instead of spawning a new agent blind.
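The caller-side branch might look like the following sketch. It assumes the agent's result is an already-parsed dict; `next_action` and the action labels are hypothetical and stand in for whatever your UI layer expects.

```python
from typing import Any, Dict


def next_action(result: Dict[str, Any]) -> Dict[str, Any]:
    """Decide whether to ask the user a follow-up question or show the result."""
    question = result.get("needs_input")
    if isinstance(question, str) and question.strip():
        # The agent asked for more information: render a follow-up prompt.
        return {"action": "ask_user", "question": question.strip()}
    return {"action": "show_result", "result": result}
```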
Handling agent-level errors
When status transitions to error, the agent doc has:
{
"status": "error",
"errorMessage": "Agent exceeded the maximum duration for its plan."
}
Common causes and reactions:
| errorMessage | Cause | What to do |
| --- | --- | --- |
| "Agent exceeded the maximum duration for its plan." | Hit maxAgentDurationMinutes | Trim the task; upgrade plan for longer runs; split into multiple agents |
| "Agent reached the iteration limit without completing." | 20-iter cap hit | Prompt is too open-ended; constrain the output shape |
| "Jettson Mind is temporarily unavailable. Please retry." | Reasoning proxy transient failure | Retry once with backoff |
| "Reasoning step exceeded the per-call token budget." | A tool returned too much data for one step | Add selector to extracts; use shell to filter before reading |
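The table can be turned into a lookup in your own code. In this sketch the `errorMessage` strings are copied from the table, while the action labels and the `reaction_for` helper are hypothetical; matching on exact message text is an assumption and would break if the wording changes.

```python
from typing import Any, Dict

# Keys copied from the table above; values are hypothetical action labels.
REACTIONS = {
    "Agent exceeded the maximum duration for its plan.": "trim_or_split_task",
    "Agent reached the iteration limit without completing.": "constrain_output_shape",
    "Jettson Mind is temporarily unavailable. Please retry.": "retry_once_with_backoff",
    "Reasoning step exceeded the per-call token budget.": "reduce_tool_output",
}


def reaction_for(agent_doc: Dict[str, Any]) -> str:
    """Map an agent doc to a reaction label; unknown messages need a human look."""
    if agent_doc.get("status") != "error":
        return "none"
    return REACTIONS.get(agent_doc.get("errorMessage", ""), "inspect_manually")
```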
Handling spawn-time errors
These are returned by POST /api/v1/agents itself, so no agent is ever spawned:
- 429 rate_limited: honor Retry-After
- 429 concurrent_limit_reached: wait or stop another agent
- 402 monthly_quota_exceeded: upgrade or wait for the 1st of the month
- 400 invalid_task: your task body is bad, fix it
- 503 temporarily_unavailable: retry with backoff
See Errors for the full catalog and retry guidance.
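The spawn-time rules above can be folded into one decision function. This is a sketch under assumptions: how the error code is delivered (for example in the response body) is not specified here, and the backoff values are illustrative. `spawn_retry_delay` is a hypothetical helper.

```python
from typing import Optional


def spawn_retry_delay(
    status: int, code: str, retry_after: Optional[str], attempt: int
) -> Optional[float]:
    """Return seconds to wait before retrying a spawn, or None if retrying won't help."""
    # Exponential backoff: 1s, 2s, 4s, ... capped at 60s (assumed values).
    backoff = min(60.0, 2.0 ** attempt)
    if status == 429 and code == "rate_limited":
        try:
            # Honor Retry-After when it is a plain number of seconds.
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            # Header missing or given as an HTTP date: fall back to backoff.
            return backoff
    if status == 429 and code == "concurrent_limit_reached":
        return backoff  # wait for a running agent to finish
    if status == 503:
        return backoff
    # 400 invalid_task and 402 monthly_quota_exceeded need a fix, not a retry.
    return None
```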
A note on idempotency
Spawn calls are not idempotent — two POSTs make two agents. If your code might retry due to a network hiccup on the way to Jettson (e.g. a 504 from a load balancer in front of your own service), dedupe on your side before calling.
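A minimal sketch of that dedupe step, assuming an in-memory dict stands in for your database and a hypothetical `spawn` callable wraps the POST to Jettson and returns the new agent ID. The names `task_key` and `spawn_once` are illustrative.

```python
import hashlib
from typing import Callable, Dict


def task_key(task_text: str, user_id: str) -> str:
    """Deterministic ID: SHA-256 over the user and the task text."""
    payload = f"{user_id}\x00{task_text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def spawn_once(
    task_text: str,
    user_id: str,
    store: Dict[str, str],
    spawn: Callable[[str], str],
) -> str:
    """Spawn an agent for this task unless one was already recorded for it."""
    key = task_key(task_text, user_id)
    existing = store.get(key)
    if existing is not None:
        return existing  # a retry: reuse the agent spawned earlier
    agent_id = spawn(task_text)
    store[key] = agent_id
    return agent_id
```

With a real database, a unique constraint on the key would be needed to close the race between the check and the insert; the dict version above does not handle concurrent callers.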
Good shape:
1. Compute a deterministic ID for this task (hash the task text + user)
2. Check your DB: has this ID been spawned?
3. If yes, return the existing agent_id
4. If no, POST to Jettson, store the (id, agent_id) tuple, return
Related
- Errors — full catalog with retry advice
- Rate limits — back-off recipes
- Tool composition patterns — fallback chains in context