The shape of the index — and the three decisions everything else hangs off of.
The homepage promises a few things that sound suspicious until you see the architecture underneath. This page is that architecture, written plainly. If you're integrating against Inseam, writing a plugin, or evaluating whether to trust it with your stack, this is the page to read.
Three slim layers.
Megabytes, not gigabytes.
A standard RAG pipeline chunks every document, embeds every chunk, and stores the chunked text right next to the embedding. The vector store becomes a second copy of your corpus. That copy is what makes every cloud-RAG vendor a cloud-RAG vendor — it's too big to live anywhere else.
Inseam keeps three smaller things and fetches the rest live. That single decision is what makes a 100,000-document index fit on a phone, what makes retrieval always return the current document, what makes indexing paid newsletters license-clean, and what lets a cloud node hold an index of an email account it has never received a body from.
Metadata. A stable id, the kind of thing it is (an IMAP message, a Drive file, a Notion page), a content hash, a few timestamps, a display name. Tens of bytes per item. Two providers holding the same RFC-5322 message dedupe to one row for free: same kind, same id, one source of truth.
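For concreteness, a minimal sketch of what one of those rows might hold; the field names are illustrative, not Inseam's actual schema:

```ts
// Illustrative metadata row: identity and bookkeeping only, no body text.
interface SourceMeta {
  id: string;            // stable, provider-scoped id (an IMAP message id, a Drive file id)
  kind: string;          // what kind of thing it is, e.g. "imap.message" or "drive.file"
  contentHash: string;   // hash of the current body, for dedupe and change detection
  createdAt: number;     // a few timestamps (epoch millis)
  updatedAt: number;
  displayName: string;   // what a human sees in results
}

// The same RFC-5322 message reached via two providers carries the same
// kind and id, so both collapse to one row.
const dedupeKey = (m: SourceMeta) => `${m.kind}:${m.id}`;
```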
Summaries. Hierarchical, plugin-generated, queryable. Workspace down to folder down to document down to section. Agents navigate the tree, drill in where it's worth drilling, and only ask for the actual body when they need it. This lines up with where the research community is moving (agentic, hierarchical retrieval) and falls out of the shape for free.
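A sketch of the tree shape and the drill-in loop, with invented names rather than the real API:

```ts
// Hypothetical shape for one node of the summary tree.
interface SummaryNode {
  sourceId: string;                                        // the indexed thing this node summarizes
  level: "workspace" | "folder" | "document" | "section";
  summary: string;                                         // plugin-generated, a few sentences
  children: SummaryNode[];
}

// Collect the sections worth reading in full, judged from summaries alone;
// only these ever trigger a live body fetch.
function selectForReading(node: SummaryNode, looksRelevant: (s: string) => boolean): string[] {
  if (!looksRelevant(node.summary)) return [];
  if (node.level === "section") return [node.sourceId];
  return node.children.flatMap((c) => selectForReading(c, looksRelevant));
}
```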
Embeddings. Vectors over the summary tree, plus selected detail where it earns its keep. Not every chunk of every document. Bodies get fetched live, through the plugin that owns the credential, when the agent has decided this specific thing is worth reading.
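Roughly, under the same assumed types, the embedding pass looks like this sketch rather than a chunk-everything loop:

```ts
// Vectors are computed over summary text (and selected detail), never over
// every chunk of every body. EmbedFn stands in for whatever model the node runs.
type EmbedFn = (text: string) => Promise<Float32Array>;

async function embedSummaryTree(nodes: SummaryNode[], embed: EmbedFn) {
  const rows: { sourceId: string; vector: Float32Array }[] = [];
  for (const node of nodes) {
    rows.push({ sourceId: node.sourceId, vector: await embed(node.summary) });
    rows.push(...await embedSummaryTree(node.children, embed));
  }
  return rows; // megabytes of vectors, not a second copy of the corpus
}
```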
The honest tradeoff: a query that needs a body pays a network round-trip the copy-everything systems don't pay. Summaries are usually enough; we cache by content hash; you can opt into a local body cache for hot documents. We accept query-time latency in exchange for lower index-time cost, lower storage cost, freshness, and sovereignty. For the people we built this for, that trade is overwhelmingly the right one.
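A minimal sketch of that cache, with hypothetical names; the fetch goes through whichever plugin owns the credential:

```ts
// Opt-in body cache keyed by content hash: an unchanged hot document
// skips the round-trip, a changed one gets a new hash and a fresh fetch.
const bodyCache = new Map<string, string>(); // contentHash -> body

async function readBody(contentHash: string, fetchLive: () => Promise<string>): Promise<string> {
  const hit = bodyCache.get(contentHash);
  if (hit !== undefined) return hit;       // hot document, no network
  const body = await fetchLive();          // the round-trip copy-everything systems avoid
  bodyCache.set(contentHash, body);
  return body;
}
```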
Connection. Source. Access.
Three things, kept separate on purpose.
Most existing systems bundle "where the document is", "the login that can fetch it", and "who's allowed to see it" into a single object, usually called a connector or a workspace. Inseam splits them. That separation is what makes the unusual things on the homepage actually possible.
A Connection holds the OAuth token, the API key, the SMB share password. It's the part that can actually go out and fetch bytes. Connections are owned by a specific node — your laptop, your phone, your Worker — and the node holding a Connection can be different from the node holding the index of the things that Connection reaches.
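Sketched as a type, with assumed field names:

```ts
// A Connection is the credential plus its owning node, and nothing about
// the indexed content itself.
interface Connection {
  id: string;
  provider: string;                    // "gmail", "notion", "smb", ...
  credential:
    | { kind: "oauth"; refreshToken: string }
    | { kind: "apiKey"; key: string }
    | { kind: "password"; username: string; password: string };
  ownerNodeId: string;                 // the laptop, phone, or Worker that can actually fetch bytes
}
```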
A Source row is the indexed thing — a stable address, metadata, summary, embeddings. Independent of any specific Connection. That's why disconnecting Gmail doesn't delete your Gmail-derived index, and why switching providers can keep the same Source rows pointed at the same logical things.
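A sketch of that independence, reusing the Connection shape above; the names are illustrative:

```ts
// A Source row names a logical thing by kind and stable address. It carries
// no connection id, so which credential serves it is resolved at fetch time;
// revoking a login or switching providers leaves the row intact.
interface SourceRow {
  kind: string;       // LocationKind, e.g. "imap.message"
  address: string;    // stable address within that kind
  // metadata, summary refs, embedding refs; deliberately no connectionId
}

function resolveConnection(
  source: SourceRow,
  live: Connection[],
  serves: (c: Connection, kind: string) => boolean
): Connection | undefined {
  return live.find((c) => serves(c, source.kind)); // any live credential for this kind will do
}
```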
Access is a first-class layer over Sources, not a workspace boundary. External identities — a customer who emailed you, a client on your portal, a counterparty on a contract — can hold scoped read paths into your index. Identity signals (From/To, account ids, party fields) come from data you already have. The same index becomes N scoped retrieval surfaces, one per identified party, with no per-customer plumbing.
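One way that could look, with hypothetical field names:

```ts
// An Access grant is a scoped read path held by an external identity;
// every query from that identity is filtered through its grants before
// retrieval runs, so one index serves N parties.
interface AccessGrant {
  partyId: string;                  // e.g. "mailto:client@example.com", derived from From/To
  kinds: string[];                  // which LocationKinds the grant covers
  filter: Record<string, string>;   // e.g. a thread, an account id, a contract party field
}

function visibleTo(
  partyId: string,
  grants: AccessGrant[],
  source: SourceRow,
  matches: (g: AccessGrant, s: SourceRow) => boolean
): boolean {
  return grants.some((g) => g.partyId === partyId && matches(g, source));
}
```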
Same code on every node.
They pool, they don't centralize.
A node is a thing running Inseam — an app on a laptop, an iOS binary, a Cloudflare Worker. Same Store contract, same plugin surface, same retrieval API. Two adapters today: SQLite on a device, D1 in a Worker. No local-versus-cloud code fork.
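A plausible reading of that contract as a single interface with per-node adapters; the method names are assumptions, not the real API:

```ts
// One small Store interface, two adapters behind it.
interface Store {
  putSource(row: SourceRow): Promise<void>;
  searchVectors(query: Float32Array, k: number): Promise<string[]>; // -> source addresses
}

declare function makeSqliteStore(): Store;        // device adapter, backed by SQLite
declare function makeD1Store(db: unknown): Store; // Worker adapter, backed by D1

// The only fork is which adapter gets constructed; callers never branch.
function openStore(env: { d1?: unknown }): Store {
  return env.d1 ? makeD1Store(env.d1) : makeSqliteStore();
}
```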
Pair the nodes you own and they federate as peers. Your laptop holds the Gmail Connection; your phone holds the index; your Worker is always-on and runs heavier embedding models. A query to any of them resolves across all of them, transparently. Devices in your graph federate freely; outside identities only reach in through Access grants. The two surfaces are deliberately separate.
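From the caller's side, federation might look roughly like this sketch; pairing, transport, and the Access checks applied to outside identities are elided:

```ts
// Fan a query out to every paired peer, merge, re-rank.
type Hit = { sourceAddress: string; score: number; nodeId: string };
interface Peer { search(query: string, k: number): Promise<Hit[]>; }

async function federatedSearch(query: string, peers: Peer[], k = 20): Promise<Hit[]> {
  const perPeer = await Promise.all(peers.map((p) => p.search(query, k)));
  return perPeer.flat().sort((a, b) => b.score - a.score).slice(0, k);
}
```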
Hosts are config-driven; capability is dynamic. A node's available services flow from configuration; plugins discover what's there at runtime. That's what makes "the same retrieval surface, locally or remotely" structurally true rather than a slogan.
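A sketch of the idea, with illustrative config keys:

```ts
// The node's config declares what it runs; plugins ask what is available
// at runtime instead of compiling in a local-versus-cloud assumption.
interface NodeConfig {
  store: "sqlite" | "d1";
  embedding?: { model: string };   // present only on nodes that run a model
  connections: string[];           // credentials this node holds
}

function capabilities(cfg: NodeConfig): Set<string> {
  const caps = new Set<string>(["retrieve"]);
  if (cfg.embedding) caps.add("embed");
  if (cfg.connections.length > 0) caps.add("fetch-bodies");
  return caps;
}
```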
A contract small enough for an AI to fill in.
The whole plugin surface is three things: a ConnectionDefinition (how this provider authenticates), a LocationKind registration (what kind of thing this plugin produces), and a typed Source-write API (the index rows it emits). That's it. Fixtures and tests are framework-enforced so quality doesn't degrade as the catalog grows.
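To make the size of the contract concrete, here is what those three things could look like from a plugin author's seat; definePlugin and every other identifier below is a stand-in invented for this sketch, not an actual Inseam export:

```ts
declare function definePlugin(spec: unknown): unknown;
declare function listIssues(conn: unknown): Promise<{ id: string; title: string; body: string }[]>;
declare function hashOf(text: string): string;

const plugin = definePlugin({
  // 1. ConnectionDefinition: how this provider authenticates
  connection: { provider: "example-tracker", auth: { kind: "apiKey" } },
  // 2. LocationKind registration: what kind of thing this plugin produces
  locationKinds: ["example-tracker.issue"],
  // 3. Typed Source-write API: the index rows it emits
  async index(
    conn: unknown,
    write: (row: { kind: string; id: string; contentHash: string; displayName: string }) => Promise<void>
  ) {
    for (const issue of await listIssues(conn)) {
      await write({
        kind: "example-tracker.issue",
        id: issue.id,
        contentHash: hashOf(issue.body),
        displayName: issue.title,
      });
    }
  },
});
```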
Small on purpose. Most plugin contracts are too big for an agent to fill in correctly without tribal knowledge of the codebase. Ours is the opposite: opinionated, minimal, shape-checked. The result: you describe a tool you use, an agent writes the plugin, the test suite tells you when it's right. Twenty minutes to a working integration for a SaaS no vendor will ever build a connector for.
That's the unlock for the long tail. Vendor-side connector catalogs are bounded by vendor headcount. Inseam's is bounded by how many users have an agent and twenty minutes.
The work lives in the open.
The design docs under /architecture are where the actual decisions happen. If a section here was too compressed, the corresponding design doc is the long form. Open an issue if something doesn't line up; argue with us before the code lands, not after.