how it works

The shape of the index — and the three decisions everything else hangs off of.

The homepage promises a few things that sound suspicious until you see the architecture underneath. This page is that architecture, written plainly. If you're integrating against Inseam, writing a plugin, or evaluating whether to trust it with your stack, this is the page to read.

01 The index

Three slim layers.
Megabytes, not gigabytes.

A standard RAG pipeline chunks every document, embeds every chunk, and stores the chunked text right next to the embedding. The vector store becomes a second copy of your corpus. That copy is what makes every cloud-RAG vendor a cloud-RAG vendor — it's too big to live anywhere else.

Inseam keeps three smaller things and fetches the rest live. That single decision is what makes a 100,000-document index fit on a phone, what makes retrieval always return the current document, what makes indexing paid newsletters license-clean, and what lets a cloud node hold an index of an email account it has never received a body from.

01 Addresses
Where the thing lives + a little metadata

A stable id, the kind of thing it is (an IMAP message, a Drive file, a Notion page), a content hash, a few timestamps, a display name. Tens of bytes per item. Two providers holding the same RFC 5322 message dedupe to one row for free: same kind, same id, one source of truth.

location_kind · path_id · content_hash
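
A minimal sketch of how an address row and its dedupe key might look. Only location_kind, path_id and content_hash are named on this page; the AddressRow type, the other fields, and the addressKey and upsertAddress helpers are hypothetical names chosen for illustration, not Inseam's actual schema.

```typescript
// Hypothetical shape of an address row. Only location_kind, path_id and
// content_hash appear on this page; the remaining fields are assumptions.
interface AddressRow {
  location_kind: string; // e.g. "imap_message", "drive_file", "notion_page"
  path_id: string;       // stable id within that kind, e.g. a Message-ID
  content_hash: string;  // changes when the underlying content changes
  display_name: string;
  updated_at: number;    // epoch milliseconds
}

// Rows are keyed by (kind, id), so two providers reporting the same
// message collapse into a single row.
function addressKey(row: Pick<AddressRow, "location_kind" | "path_id">): string {
  return JSON.stringify([row.location_kind, row.path_id]);
}

// Insert or overwrite a row. Returns true when the row is new or its
// content hash changed, which is the signal to refresh summaries.
function upsertAddress(index: Map<string, AddressRow>, row: AddressRow): boolean {
  const key = addressKey(row);
  const existing = index.get(key);
  const changed = existing === undefined || existing.content_hash !== row.content_hash;
  index.set(key, row);
  return changed;
}
```

Keying on kind plus id, rather than on the provider that reported the item, is what lets the second provider's copy land on the existing row instead of creating a duplicate.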
02 Summaries
A tree the agent can walk

Hierarchical, plugin-generated, queryable. Workspace down to folder down to document down to section. Agents navigate the tree, drill in where it's worth drilling, and only ask for the actual body when they need it. This lines up with where the research community is moving — agentic, hierarchical retrieval — and falls out of the shape for free.

structured · navigable · refreshes when the hash changes
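
A rough sketch of the walk described above, assuming a simple in-memory tree. The SummaryNode type and the drillDown function are hypothetical; the relevance test is passed in by the caller because this page does not say how an agent decides where to drill.

```typescript
// Hypothetical summary node. Levels follow the page:
// workspace > folder > document > section.
type Level = "workspace" | "folder" | "document" | "section";

interface SummaryNode {
  level: Level;
  title: string;
  summary: string;
  content_hash: string;
  children: SummaryNode[];
}

// Walk top-down, descending only into branches the caller judges relevant.
// Returns the deepest relevant nodes: the candidates worth a body fetch.
function drillDown(
  node: SummaryNode,
  isRelevant: (n: SummaryNode) => boolean
): SummaryNode[] {
  if (!isRelevant(node)) return [];
  const deeper: SummaryNode[] = [];
  node.children.forEach((child) => {
    drillDown(child, isRelevant).forEach((hit) => deeper.push(hit));
  });
  // If no child is relevant, this node is itself the best match.
  return deeper.length > 0 ? deeper : [node];
}
```

Irrelevant branches are pruned at the first summary that fails the test, so the agent reads a handful of summaries rather than every section of every document.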
03 Embeddings
Over summaries, not the corpus

Vectors over the summary tree, plus selected detail where it earns its keep. Not every chunk of every document. Bodies get fetched live, through the plugin that owns the credential, when the agent has decided this specific thing is worth reading.

fetched fresh · always current · never re-ingested

The honest tradeoff: a query that needs a body pays a network round-trip that copy-everything systems don't. Summaries are usually enough; we cache by content hash; you can opt into a local body cache for hot documents. We accept higher query-time latency in exchange for lower index-time cost, lower storage cost, freshness, and sovereignty. For the people we built this for, that trade is overwhelmingly the right one.
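
To make the tradeoff concrete, here is a minimal sketch of a body cache keyed by content hash. BodyCache and FetchBody are hypothetical names, and the fetch function stands in for whatever plugin call actually reaches the provider; this is an illustration under those assumptions, not Inseam's cache.

```typescript
// Hypothetical: a function that fetches a body live through the plugin
// that owns the credential.
type FetchBody = (pathId: string) => Promise<string>;

class BodyCache {
  private bodies = new Map<string, string>();
  private fetchLive: FetchBody;

  constructor(fetchLive: FetchBody) {
    this.fetchLive = fetchLive;
  }

  // Return the body for an address. The network round-trip is paid only
  // when nothing is cached under the current content hash.
  async get(pathId: string, contentHash: string): Promise<string> {
    const cached = this.bodies.get(contentHash);
    if (cached !== undefined) return cached;
    const body = await this.fetchLive(pathId);
    this.bodies.set(contentHash, body);
    return body;
  }
}
```

Because the key is the hash rather than the id, an edited document misses the cache automatically and gets fetched fresh. This sketch never evicts old entries; a real cache would need a size bound.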

02 The split

Connection. Source. Access.
Three things, kept separate on purpose.

Every existing system bundles "where the document is", "the login that can fetch it", and "who's allowed to see it" into a single object — usually called a connector or a workspace. Inseam splits them. That separation is what makes the unusual things on the homepage actually possible.

Connection
The credential. The live reach.

A Connection holds the OAuth token, the API key, the SMB share password. It's the part that can actually go out and fetch bytes. Connections are owned by a specific node — your laptop, your phone, your Worker — and the node holding a Connection can be different from the node holding the index of the things that Connection reaches.

Source
The address. The thing in the index.

A Source row is the indexed thing — a stable address, metadata, summary, embeddings. Independent of any specific Connection. That's why disconnecting Gmail doesn't delete your Gmail-derived index, and why switching providers can keep the same Source rows pointed at the same logical things.

Access
Who's allowed to retrieve what.

Access is a first-class layer over Sources, not a workspace boundary. External identities — a customer who emailed you, a client on your portal, a counterparty on a contract — can hold scoped read paths into your index. Identity signals (From/To, account ids, party fields) come from data you already have. The same index becomes N scoped retrieval surfaces, one per identified party, with no per-customer plumbing.
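
The three-way split can be sketched as three independent types. Every type, field, and function name below is a hypothetical illustration; the page names the concepts (Connection, Source, Access) but not their schema. The example derives one scoped grant per party from identity signals already present on the Sources.

```typescript
// The credential, owned by one specific node.
interface Connection { id: string; node_id: string; secret: string }
// The indexed thing. Note that it holds no credential.
interface Source { id: string; location_kind: string; parties: string[] }
// A scoped read path into the index for one identified party.
interface AccessGrant { identity: string; source_ids: string[] }

// Build one grant per party from signals such as From/To addresses.
function grantsFromParties(sources: Source[]): AccessGrant[] {
  const byIdentity = new Map<string, string[]>();
  sources.forEach((s) => {
    s.parties.forEach((party) => {
      const ids = byIdentity.get(party) ?? [];
      ids.push(s.id);
      byIdentity.set(party, ids);
    });
  });
  const grants: AccessGrant[] = [];
  byIdentity.forEach((source_ids, identity) => grants.push({ identity, source_ids }));
  return grants;
}

// The retrieval surface for one identity: only Sources it has a grant for.
function visibleTo(identity: string, sources: Source[], grants: AccessGrant[]): Source[] {
  const allowed: string[] = [];
  grants.forEach((g) => {
    if (g.identity === identity) allowed.push(...g.source_ids);
  });
  return sources.filter((s) => allowed.indexOf(s.id) !== -1);
}
```

Nothing in Source or AccessGrant refers to a Connection, so in this sketch dropping a credential leaves both the index rows and the grants untouched.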

03 The pool

Same code on every node.
They pool, they don't centralize.

A node is a thing running Inseam — an app on a laptop, an iOS binary, a Cloudflare Worker. Same Store contract, same plugin surface, same retrieval API. Two adapters today: SQLite on a device, D1 in a Worker. No local-versus-cloud code fork.

Pair the nodes you own and they federate as peers. Your laptop holds the Gmail Connection; your phone holds the index; your Worker is always-on and runs heavier embedding models. A query to any of them resolves across all of them, transparently. Devices in your graph federate freely; outside identities only reach in through Access grants. The two surfaces are deliberately separate.
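
A minimal sketch of what "a query to any of them resolves across all of them" could look like, assuming every node exposes the same search method. The Store interface, the Hit type, the MemoryStore adapter and federatedSearch are all hypothetical; the in-memory adapter stands in for the SQLite and D1 adapters, and substring matching stands in for real retrieval.

```typescript
interface Hit { source_id: string; score: number }

// Hypothetical Store contract: the same on every node.
interface Store { search(query: string): Promise<Hit[]> }

interface SummaryRow { source_id: string; summary: string }

// Stand-in adapter for illustration only.
class MemoryStore implements Store {
  private rows: SummaryRow[];
  constructor(rows: SummaryRow[]) { this.rows = rows; }

  async search(query: string): Promise<Hit[]> {
    const q = query.toLowerCase();
    return this.rows
      .filter((r) => r.summary.toLowerCase().indexOf(q) !== -1)
      .map((r) => ({ source_id: r.source_id, score: 1 }));
  }
}

// Fan the query out to every paired node, then merge the results,
// keeping the best score per source so duplicates collapse.
async function federatedSearch(nodes: Store[], query: string): Promise<Hit[]> {
  const perNode = await Promise.all(nodes.map((n) => n.search(query)));
  const best = new Map<string, Hit>();
  perNode.forEach((hits) => hits.forEach((h) => {
    const seen = best.get(h.source_id);
    if (seen === undefined || h.score > seen.score) best.set(h.source_id, h);
  }));
  const merged: Hit[] = [];
  best.forEach((h) => merged.push(h));
  return merged.sort((a, b) => b.score - a.score);
}
```

The caller only ever sees the Store interface, so in this sketch it makes no difference whether a given node is a laptop, a phone, or a Worker.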

Hosts are config-driven; capability is dynamic. A node's available services flow from configuration; plugins discover what's there at runtime. That's what makes "the same retrieval surface, locally or remotely" structurally true rather than a slogan.

04 The plugin

A contract small enough
for an AI to fill in.

The whole plugin surface is three things: a ConnectionDefinition (how this provider authenticates), a LocationKind registration (what kind of thing this plugin produces), and a typed Source-write API (the index rows it emits). That's it. Fixtures and tests are framework-enforced so quality doesn't degrade as the catalog grows.
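
The three-part surface might be sketched like this. ConnectionDefinition and LocationKind are names from this page; every field, the SourceWrite and Plugin types, and the checkPlugin function are assumptions made for illustration, not the real contract.

```typescript
// How this provider authenticates (fields are hypothetical).
interface ConnectionDefinition { provider: string; auth: "oauth2" | "api_key" | "password" }
// What kind of thing this plugin produces.
interface LocationKind { name: string }
// One typed index row emitted by the plugin.
interface SourceWrite { location_kind: string; path_id: string; content_hash: string; summary: string }

interface Plugin {
  connection: ConnectionDefinition;
  kinds: LocationKind[];
  index(emit: (row: SourceWrite) => void): void;
}

// A shape check of the sort a framework could run against plugin fixtures:
// every emitted row must use a registered kind and carry an id and a hash.
function checkPlugin(plugin: Plugin): string[] {
  const registered = plugin.kinds.map((k) => k.name);
  const errors: string[] = [];
  plugin.index((row) => {
    if (registered.indexOf(row.location_kind) === -1) {
      errors.push("unregistered kind: " + row.location_kind);
    }
    if (row.path_id === "" || row.content_hash === "") {
      errors.push("missing id or hash for kind: " + row.location_kind);
    }
  });
  return errors;
}
```

An empty error list means the plugin's output has the right shape. With a surface this small, a check like this can cover most of what a generated plugin can get wrong.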

Small on purpose. Most plugin contracts are too big for an agent to fill in correctly without tribal knowledge of the codebase. Ours is the opposite: opinionated, minimal, shape-checked. The result: you describe a tool you use, an agent writes the plugin, and the test suite tells you when it's right. Twenty minutes to a working integration for a SaaS that no vendor will ever build a connector for.

That's the unlock for the long tail. Vendor-side connector catalogs are bounded by vendor headcount. Inseam's is bounded by how many users have an agent and twenty minutes.

05 Read on

The work lives in the open.

The design docs under /architecture are where the actual decisions happen. If a section here was too compressed, the corresponding design doc is the long form. Open an issue if something doesn't line up; argue with us before the code lands, not after.