The Hard Problem of Federated Search
Every team has its knowledge spread across a dozen apps. The premise of RetrieveIT is that you shouldn’t have to migrate that knowledge into a new silo to search it. Connect your tools with OAuth, ask a question in plain English, get an answer with citations back to the source.
The painful version of that promise is integration sprawl. Each connector wants its own auth flow, its own pagination quirks, its own file-download URL semantics, its own retry behavior. Without discipline, the sixteenth integration takes longer to build than the first.
That isn’t what happened here. The sixteenth integration shipped in a day. This is how.
The Integration Timeline
Two distinct waves, separated by ~14 weeks of consolidation. The first wave invented the pattern; the second wave harvested it. Both ADRs sit at the inflection points.
First integration. The pattern is invented here.
Pattern from the first 6 connectors gets codified.
Three Microsoft connectors land on the same day.
Per-source dispatch lands. Each new connector now costs one day.
Sixteenth integration.
Wave 1
11 integrations
in 26 days — ~2.4 days each
Pattern discovered. ADR-001 codifies it on day 10.
Wave 2
5 integrations
in 7 days — ~1.4 days each
ADR-002 lands on day 2. Each connector after costs one day.
The ADR That Made Wave 2 Fast
By the end of Wave 1, every per-integration Lambda was doing the same four things: download bytes from the third party, upload them to the workspace’s S3 prefix, record source-specific metadata, hand off to embedding. Eleven copies of nearly identical code, each with its own retry logic.
ADR-002 collapsed all of that into a single shared Lambda with per-source dispatch. The per-integration Lambda’s job shrank to two responsibilities: enumerate files and enqueue one SQS message per file. The shared file-ingestion Lambda owns the download/upload/embed pipeline for every source.
From ADR-002
A single shared lambda/file-ingestion/ lambda owns the download → S3 upload step for every integration. Per-integration lambdas are responsible for enumerating files and enqueueing one SQS message per file onto the shared file_ingestion_queue. Each message carries source: "<integration>" plus a file payload understood by that source's processor.
1. Enumerate
Per-integration Lambda walks the third-party API and finds files. Knows OAuth refresh, pagination, and source quirks — nothing else.
2. Enqueue
One SQS message per file, tagged with source: "dropbox" (or notion, box, slack…).
3. Dispatch
Shared Lambda’s handle_sqs_event routes to process_dropbox_file() and runs the rest of the pipeline.
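The three steps above can be sketched as a minimal dispatch loop. The handler names and the message shape follow the ADR quote; the decorator-based registry is illustrative, not necessarily how the shared Lambda actually wires it up.

```python
import json

# Registry mapping each message's "source" tag to its processor.
PROCESSORS = {}

def register(source):
    """Decorator: register a per-source file processor."""
    def wrap(fn):
        PROCESSORS[source] = fn
        return fn
    return wrap

@register("dropbox")
def process_dropbox_file(file_payload):
    # Real version: download bytes from Dropbox, upload to the
    # workspace's S3 prefix, record metadata, hand off to embedding.
    return f"processed {file_payload['path']} from dropbox"

def handle_sqs_event(event, context=None):
    """Entry point: route each SQS record to its source's processor."""
    results = []
    for record in event["Records"]:
        body = json.loads(record["body"])
        processor = PROCESSORS[body["source"]]  # KeyError = unknown source
        results.append(processor(body["file"]))
    return results
```

Adding a seventeenth source is one `@register("...")` function; the loop itself never changes.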
Workspace-Scoped, Not Org-Scoped
ADR-001 sits at the other end of the integration story. The decision: OAuth tokens are scoped to workspaces, not to organizations. Two workspaces inside the same org can connect different Gmail accounts — one for the legal team, one for engineering — without colliding.
That sounds like a small design choice. It isn’t. The alternative — one Gmail account per org — is how most enterprise search products handle this, and it’s why they don’t scale past a single team buyer. Workspace-scoped tokens let RetrieveIT match the way real organizations actually structure access.
Multi-Tenant Isolation Without Compromise
- org_id is extracted from the JWT at the API boundary, never from user input
- S3 prefixing: orgs/{org_id}/workspaces/{workspace_id}/{source}/
- Every S3 Vectors query MUST filter by org_id (enforced in code, not just documented)
- DynamoDB partition keys (ORG#{org_id}) make cross-org bypass impossible at the data layer
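The tenant-scoping rules above reduce to a handful of key-construction helpers. This is a sketch assuming the prefix and partition-key shapes quoted in the list; the function names are illustrative.

```python
def org_id_from_claims(claims: dict) -> str:
    """org_id comes from the verified JWT claims, never from request input."""
    return claims["org_id"]

def s3_prefix(org_id: str, workspace_id: str, source: str) -> str:
    """Every object a workspace ingests lives under its org/workspace prefix."""
    return f"orgs/{org_id}/workspaces/{workspace_id}/{source}/"

def dynamo_partition_key(org_id: str) -> str:
    """Org-prefixed partition key: a query for one org can't read another's items."""
    return f"ORG#{org_id}"
```

Because every read path goes through these helpers, cross-org access would require a code change, not just a bad input.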
AWS S3 Vectors: A Differentiated Choice
Most teams building semantic search reach for Pinecone, Weaviate, or pgvector. RetrieveIT runs on AWS S3 Vectors — AWS’s first-party vector database. The vectors live in the customer’s AWS account, alongside the documents they describe.
Single Embedding Model
Amazon Titan Embed Text v2 for every document, every query, every integration. No model-mix risk; same vector space across all 16 sources.
Native Filtering
Vector queries are filtered by org_id and workspace_id at the index level. Isolation is a vector-store primitive, not a post-filter.
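A hedged sketch of what an isolation-enforced query looks like. The `s3vectors` boto3 client and the `query_vectors` parameter names reflect the S3 Vectors preview API and may differ in your SDK version; the bucket and index names are hypothetical.

```python
def build_query(org_id: str, workspace_id: str, query_embedding: list,
                top_k: int = 10) -> dict:
    """Assemble query arguments with the tenant filter baked in."""
    return {
        "vectorBucketName": "retrieveit-vectors",    # hypothetical name
        "indexName": "documents",                    # hypothetical name
        "queryVector": {"float32": query_embedding},
        "topK": top_k,
        # Index-level metadata filter: isolation is part of the query
        # itself, not a post-filter on the result set.
        "filter": {"org_id": org_id, "workspace_id": workspace_id},
        "returnMetadata": True,
    }

# Call site (not executed here):
# import boto3
# client = boto3.client("s3vectors")
# results = client.query_vectors(**build_query(org, ws, embedding))
```

The point of the builder function is that no query can be constructed without an org_id.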
No Third-Party SaaS
The vector store is AWS-native. No additional vendor, no separate billing line, no extra DPA. For regulated buyers, this is the difference between a yes and a maybe.
Generation: Claude on Bedrock
Haiku 4.5 for fast retrieval-augmented answers; Sonnet 4.5 for longer synthesis when the query needs it. Both via Bedrock — same trust model as the rest of the stack.
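The Haiku/Sonnet split can be sketched as a routing heuristic in front of Bedrock's Converse API. The model ID strings below are placeholders, not confirmed identifiers, and the routing rule is an assumption about what "when the query needs it" means.

```python
HAIKU_MODEL_ID = "anthropic.claude-haiku-4-5"    # placeholder ID
SONNET_MODEL_ID = "anthropic.claude-sonnet-4-5"  # placeholder ID

def pick_model(query: str, context_chars: int) -> str:
    """Heuristic: long syntheses go to Sonnet, everything else to Haiku."""
    needs_synthesis = context_chars > 20_000 or "summarize" in query.lower()
    return SONNET_MODEL_ID if needs_synthesis else HAIKU_MODEL_ID

def answer(query: str, context: str) -> str:
    """RAG answer via Bedrock: retrieved context + question in one turn."""
    import boto3
    bedrock = boto3.client("bedrock-runtime")
    resp = bedrock.converse(
        modelId=pick_model(query, len(context)),
        messages=[{"role": "user",
                   "content": [{"text": f"{context}\n\nQuestion: {query}"}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]
```

Both branches go through the same Bedrock client, which is what keeps the trust model uniform.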
Token Security That Holds Up to a Review
Sixteen integrations means sixteen sets of refresh tokens to protect. The pattern is the same for every one of them — codified in the per-integration token_manager.py:
- KMS-encrypted at rest. No plaintext token ever lands in DynamoDB. Decryption happens in Lambda memory at use-time.
- Rolling rotation on every use. Every refresh issues a new refresh token; the old one invalidates immediately. No scheduled rotation window for an attacker to ride.
- Magic-link auth, no passwords. SES delivers a 15-minute code; the code exchanges for a 24-hour JWT. Nothing to phish, nothing to leak.
- Query content is not stored. Only metadata (user, org, source count, status) is logged. The retrieval body never persists.
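The first two rules, encrypt-at-rest and rolling rotation, can be sketched as a small manager. This is a minimal sketch, not the actual token_manager.py: the KMS and DynamoDB calls are injected as callables so the rotation logic is visible on its own; production code would use boto3's kms and dynamodb clients in their place.

```python
class TokenManager:
    def __init__(self, encrypt, decrypt, store, load):
        self.encrypt, self.decrypt = encrypt, decrypt  # KMS in production
        self.store, self.load = store, load            # DynamoDB in production

    def save(self, key: str, refresh_token: str) -> None:
        """No plaintext token ever lands in the table."""
        self.store(key, self.encrypt(refresh_token))

    def use(self, key: str, refresh_with_vendor):
        """Decrypt in Lambda memory, exchange with the vendor, and
        immediately persist the NEW refresh token the vendor issued.
        The old one is invalid the moment this returns."""
        current = self.decrypt(self.load(key))
        access_token, new_refresh = refresh_with_vendor(current)
        self.save(key, new_refresh)
        return access_token
```

Because rotation happens inside `use()`, there is no separate rotation job and no stale-token window to schedule around.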
Why the Sixteenth Was Faster Than the First
The first integration (GitHub, January 1) had to figure everything out: how OAuth state should be stored, how files become S3 objects, how the embedding pipeline gets notified. By the time Box landed on May 10, none of that was an open question.
Each new integration in Wave 2 was: scaffold a directory, drop in the OAuth handshake for the new vendor, implement enumerate_files(), register a process_<source>_file() handler in the shared dispatch. Done.
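The per-integration half of that recipe is small enough to sketch. The vendor client and its paging interface are illustrative, as is the SQS send being injected as a callable; in production this would be boto3's sqs `send_message` against the shared queue.

```python
import json

def enumerate_files(client):
    """Walk the vendor API page by page; yield one descriptor per file.
    This is the only place that knows the vendor's pagination quirks."""
    for page in client.pages():
        yield from page

def enqueue_files(client, send_message, source: str = "box") -> int:
    """One SQS message per file, tagged with its source for dispatch."""
    count = 0
    for file_desc in enumerate_files(client):
        send_message(json.dumps({"source": source, "file": file_desc}))
        count += 1
    return count
```

Everything after `send_message` — download, upload, embed — is the shared Lambda's problem, which is why the per-integration code stays this short.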
The patterns aren’t the marketing — they’re the moat. The next ten integrations will land on the same one-day cadence because Claude Code retrieves the relevant ADRs from the MCP at write-time, instead of pretending it remembers them.
The MCP Retrieval Loop
Just-in-time ADR fetch, not context stuffing.
1 · Task in IDE
Scaffold lambda/box-integration/ (the 16th connector)
2 · MCP RAG Query
Claude Code → OutcomeOps MCP
Semantic search over outcomeops-adrs
3 · Only the ADRs This Task Needs
- ADR-001 — Terraform module versions for Lambda + SQS + DynamoDB
- ADR-004 — Lambda handler structure for API Gateway + worker types
- ADR-006 — KMS-encrypted OAuth token storage
- ADR-011 — AWS resource tagging
Why this beats context stuffing: the ADR library can grow without bound; the model’s working memory cannot. The MCP retrieves the two or three ADRs this task touches, while the other thirty-plus stay on the shelf until they’re relevant. The two RetrieveIT-specific ADRs from docs/adrs/ (workspace-scoped tokens and shared file-ingestion dispatch) come along the same way: retrieved on demand, never loaded all at once.
The Point
16 integrations, 543 commits, AWS S3 Vectors, workspace-scoped tokens, multi-tenant isolation enforced at the index level — built by following ADRs that Claude Code queries through the OutcomeOps MCP.
Context Engineering is what makes the sixteenth connector cost less than the first.