Memory

The persistent, searchable, semantic memory agents accumulate across runs. Auto-extracted by default. The moat.

Memory is the persistent knowledge layer the agents share. Every memory belongs to you, the user — not to a single agent — so anything one agent learns becomes available to every future agent you bring online.

This is the property that makes the agents compound. Run them once and they do work. Run them twenty times and they know your business.

What a memory is

| Field | Meaning | | --- | --- | | key | Short snake_case identifier, unique per namespace. brand_color, vercel_pricing. | | value | The content. Up to 5,000 characters. Plain text — JSON-stringify structured data if you need it. | | namespace | Logical bucket. Defaults to default. Use these to scope by use case (research, coding_standards, customer_facts). | | tags | Free-form labels. Up to 20, AND-filtered on search. | | importance | 1-10. Drives ranking. The reasoning model picks reasonable defaults during auto-extraction; you can override. | | source | manual, auto_extracted, or consolidated — how the memory was created. |

The three ways memories are created

  1. Auto-extraction. When an agent completes, the reasoning model reviews the full run and identifies up to five durably-useful facts — preferences, findings, patterns. These land in memory with source: "auto_extracted". Zero work on your part.
  2. Agent calls jettson_memory_put. From inside a task, the agent decides "this is worth saving" and explicitly stores it. Use this when you want fine control.
  3. You call the API directly. Seed memory with facts you already know (style guides, customer preferences, internal vocabulary) via POST /api/v1/memory.

How agents recall memories

Every agent run starts with an implicit recall. At boot time, Jettson runs a hybrid search across the user's memory pool using the task prompt as the query. Any memories scoring above the relevance threshold are prepended to the agent's first message — the agent sees them before it sees the task.

Agents can also call jettson_memory_search mid-run to look up specific things. Both paths use the same hybrid ranker.

Hybrid search ranking

The search engine combines four signals:

| Signal | Weight (hybrid mode) | | --- | --- | | Semantic similarity — match by meaning, not just words | 0.50 | | Keyword overlap — exact-token match against the value | 0.20 | | Importance — your 1-10 rating, normalized | 0.15 | | Time decay — half-life of 30 days; older memories rank lower | 0.15 |

You can run pure semantic or pure keyword mode if you need it (the SDK and the API both accept a mode parameter), but hybrid is the default and almost always the right choice.

Namespaces

Namespaces are logical buckets. Two memories with the same key in different namespaces are distinct.

  • default — the catch-all. Auto-extraction uses this unless the Mind picks a better one.
  • research — used by the customer-research example to cache dossiers.
  • coding_standards — used by the PR reviewer example for the team's accumulated review preferences.

You can have as many namespaces as you want. They cost nothing extra; they exist so search results stay scoped to the right context.

Quotas

| Plan | Live memories | | --- | --- | | Free | 100 | | Pro | 10,000 | | Scale | 100,000 |

A "live" memory is one that's the latest version, not expired, and not soft-deleted. The quota counts those; superseded versions and tombstones don't.

When you hit the quota, the next write returns 402 memory_quota_exceeded with the current count. Run POST /api/v1/memory/dedupe to clean near-duplicates, or upgrade your plan.

Versioning

Memory writes are versioned. Writing a new value under an existing (namespace, key) tuple supersedes the old one — the old doc keeps existing with isLatestVersion: false, and the new doc links back via versionOf. You can inspect history in the Memory Console.

Idempotent writes (same key, same value) return the existing doc without writing a new version.

Dedup and consolidate

Two operations help keep the pool sharp:

  • Dedupe (POST /api/v1/memory/dedupe) — finds pairs of memories with cosine similarity ≥ 0.92 and soft-deletes the less-accessed one. Run on demand; dryRun: true previews.
  • Consolidate (POST /api/v1/memory/consolidate) — clusters related memories above a 0.75 threshold, asks the reasoning model to write a single summary, replaces the cluster with the summary. Useful after a heavy run of accretion (think: the customer-research agent after 100 calls).

Common patterns

  • User preferencesnamespace=user_profile, key=brand_color. Set once, every agent that touches branding sees it.
  • Research cachenamespace=research, key=<domain>. Set with expiresAt 30 days out so the agent re-researches stale targets.
  • Team standardsnamespace=coding_standards. Auto-extraction accumulates patterns from PRs; you tweak importance manually.
  • Customer historynamespace=customers, key=<customer_id>. Notes on past tickets, preferences, edge cases the human team learned the hard way.