EPIC-3: MCP Server (Streamable HTTP)
Drafted 2026-05-19. Beta-sprint epic 3 of 5. Sequencing: Week 1 Days 3-4 + 7 (local-first). Runs in parallel with EPIC-1; depends on EPIC-2 schema.
Goal
Adapt mcp-servers/doc-search from stdio transport to Streamable HTTP transport (MCP spec 2025-03-26+), with OAuth 2.1 scope enforcement via the Library/Connector model. End state: Dan can install a local MCP connector into Claude Desktop and query his novel KB with proper scope enforcement.
Why this epic exists
This is the "does the MCP-as-infrastructure wedge work?" gate. If Claude Desktop can install our local MCP connector and query a real KB with proper scope, the entire product thesis is validated end-to-end before we put a dollar into AWS deployment.
Scope (in)
Transport:
- Stdio → Streamable HTTP (MCP spec 2025-03-26+) — required by AgentCore Runtime per the research finding
- Local dev server listens on localhost:8080/c/{connectorId}/mcp (POST for JSON-RPC; text/event-stream for streaming responses + progress notifications)
- Production endpoint will be mcp.autri.ai/c/{connectorId}/mcp (AgentCore Runtime convention)
- Mcp-Session-Id semantics: server generates on first request, returns in response header, expects client to echo on subsequent requests, persists across the session
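The Mcp-Session-Id semantics above can be sketched as a small framework-agnostic helper. This is an illustration only, not the SDK's implementation: resolveSession and the in-memory sessions set are hypothetical names, and the sketch assumes the ID is minted on the initialize request, with 400 for a missing header and 404 for an unknown session, as we read the Streamable HTTP transport spec.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory registry; a real server may need to persist or expire sessions.
const sessions = new Set<string>();

type SessionResult =
  | { ok: true; sessionId: string; isNew: boolean }
  | { ok: false; status: 400 | 404 };

// headerValue: the incoming Mcp-Session-Id header (undefined if absent).
// isInitialize: true when the JSON-RPC method is "initialize" (assumed to be the first request).
function resolveSession(headerValue: string | undefined, isInitialize: boolean): SessionResult {
  if (isInitialize) {
    const sessionId = randomUUID();
    sessions.add(sessionId);
    // Caller returns this value in the Mcp-Session-Id response header.
    return { ok: true, sessionId, isNew: true };
  }
  if (!headerValue) return { ok: false, status: 400 }; // client did not echo the header
  if (!sessions.has(headerValue)) return { ok: false, status: 404 }; // unknown or ended session
  return { ok: true, sessionId: headerValue, isNew: false };
}
```

In the Hono route this would run before auth-independent parsing finishes, so a stale session fails fast and the client knows to re-initialize.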
HTTP framework + MCP SDK:
- HTTP framework: Hono (lightweight, modern, edge-runtime friendly if we ever go that way)
- MCP SDK: @modelcontextprotocol/sdk (Anthropic official); pin a specific version supporting Streamable HTTP; document version in package.json + EPIC-3 notes
- Local HTTPS: try plain HTTP first (Day 3 verify against Claude Desktop). If Claude Desktop rejects, add mkcert + local CA (~30 min). Don't pre-build complexity.
OAuth 2.1 + PKCE scaffold:
- Local dev: JWT shim for fast iteration; HS256 with shared secret in .env.dev
- Dev user mapping: shim reads DEV_USER_ID=<uuid> from .env.dev (Dan sets this after EPIC-2 backfill assigns him a user_id). Shim signs JWTs with that sub claim.
- Token issuance: pnpm dev:make-token CLI script outputs a JWT to stdout. Dan copies, pastes into Claude Desktop config.
- Production-ready interface: swap-able auth layer that becomes Cognito JWKS verification in EPIC-4. Interface: verifyToken(token: string): Promise<TokenClaims>; dev impl uses HS256, prod impl uses Cognito JWKS.
- Token validation middleware: JWKS cache (prod) / shared-secret verify (dev), expiry check, audience check
- Connector resolution from URL path segment {connectorId} (defense-in-depth check: connector.user_id === token.sub AND connector.revoked_at IS NULL)
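To make the dev shim concrete, here is a minimal sketch of HS256 signing and verification behind the verifyToken interface, using only node:crypto. It is hand-rolled for illustration; the real shim would more likely use a JWT library. The TokenClaims shape (sub, aud, exp), the helper names signDevToken, verifyDevToken and makeDevVerifier, and the audience string are assumptions, not settled decisions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed claim set; the epic only fixes `sub` plus expiry and audience checks.
interface TokenClaims { sub: string; aud: string; exp: number }
type VerifyToken = (token: string) => Promise<TokenClaims>;

const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");
const hmac = (data: string, secret: string): Buffer =>
  createHmac("sha256", secret).update(data).digest();

// What `pnpm dev:make-token` could do with DEV_USER_ID as `sub`.
function signDevToken(claims: TokenClaims, secret: string): string {
  const signingInput = `${b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }))}.${b64url(JSON.stringify(claims))}`;
  return `${signingInput}.${hmac(signingInput, secret).toString("base64url")}`;
}

// Throws on any failure; the middleware would map a throw to HTTP 401.
function verifyDevToken(token: string, secret: string, audience: string, nowSec: number): TokenClaims {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [header, payload, signature] = parts as [string, string, string];
  const alg = JSON.parse(Buffer.from(header, "base64url").toString("utf8")).alg;
  if (alg !== "HS256") throw new Error("unexpected alg"); // never trust the header's alg blindly
  const expected = hmac(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("bad signature");
  }
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as TokenClaims;
  if (claims.exp <= nowSec) throw new Error("token expired");
  if (claims.aud !== audience) throw new Error("wrong audience");
  return claims;
}

// Dev implementation of the swap-able interface; EPIC-4 would supply a Cognito JWKS one.
function makeDevVerifier(secret: string, audience: string): VerifyToken {
  return async (token) => verifyDevToken(token, secret, audience, Math.floor(Date.now() / 1000));
}
```

Passing the clock in as nowSec keeps the expiry check deterministic in unit tests; only makeDevVerifier touches Date.now().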
Scope enforcement at every tool call:
- Use getKbScopeForConnector(connectorId) from EPIC-2 to constrain queries
- Tool calls return ONLY results from KBs in the connector's library
- Defense-in-depth: scope helper handles revoked connectors AND deleted libraries (returns [])
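One way the scope rule could be applied inside a tool handler is sketched below. It assumes getKbScopeForConnector resolves to an array of KB IDs (the epic only says it returns [] for revoked connectors and deleted libraries); narrowScope and scopedKbIds are hypothetical helper names.

```typescript
// Assumed signature of the EPIC-2 helper: connector ID in, allowed KB IDs out.
type KbScopeLookup = (connectorId: string) => Promise<string[]>;

// Intersect the connector's scope with an optional kb_id filter from the tool call.
// An out-of-scope kb_id yields an empty scope rather than an error, so the query
// simply matches nothing and no information about other libraries leaks.
function narrowScope(scope: readonly string[], requestedKbId?: string): string[] {
  if (requestedKbId === undefined) return [...scope];
  return scope.includes(requestedKbId) ? [requestedKbId] : [];
}

// Every tool would call this first and pass the result to the retrieval layer.
async function scopedKbIds(
  getScope: KbScopeLookup,
  connectorId: string,
  requestedKbId?: string,
): Promise<string[]> {
  const scope = await getScope(connectorId); // [] for revoked connector or deleted library
  return narrowScope(scope, requestedKbId);
}
```

Whether an out-of-scope kb_id should return an empty result or an explicit error is a design choice the epic leaves open; the empty result shown here is the quieter of the two.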
connectors.last_used_at write:
- Updated on every successful auth (before tool dispatch). Per EPIC-2 boundary: EPIC-3 owns this write.
Audit events for tool calls (to shared mcp_audit_log table from EPIC-2):
- Every tool call writes an event with event_type='tool_call.{tool_name}', connector_id=<connectorId>, metadata={tool, kb_scope, query, result_count, status, latency_ms}
- Failed tool calls write event_type='tool_call.{tool_name}.error' with metadata.error_kind populated
- ~1ms overhead per call; critical for future audit dashboard, abuse detection, cost analysis
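The event shape described above could be built by a single helper so success and failure paths stay consistent. This is a sketch: the field names follow the bullets, but the status values ("ok" / "error"), the nullable query, and the buildToolCallEvent name are assumptions; the actual column layout is owned by EPIC-2.

```typescript
interface ToolCallAuditEvent {
  event_type: string;
  connector_id: string;
  metadata: {
    tool: string;
    kb_scope: string[];
    query: string | null; // null for tools without a query, e.g. list_knowledge_bases (assumed)
    result_count: number;
    status: "ok" | "error"; // assumed vocabulary
    latency_ms: number;
    error_kind?: string;
  };
}

interface ToolCallOutcome {
  connectorId: string;
  tool: string;
  kbScope: string[];
  query: string | null;
  resultCount: number;
  latencyMs: number;
  errorKind?: string; // present only when the call failed
}

// Builds the row to insert into mcp_audit_log; the insert itself is left to @autri/db.
function buildToolCallEvent(o: ToolCallOutcome): ToolCallAuditEvent {
  const failed = o.errorKind !== undefined;
  return {
    event_type: failed ? `tool_call.${o.tool}.error` : `tool_call.${o.tool}`,
    connector_id: o.connectorId,
    metadata: {
      tool: o.tool,
      kb_scope: o.kbScope,
      query: o.query,
      result_count: o.resultCount,
      status: failed ? "error" : "ok",
      latency_ms: o.latencyMs,
      ...(failed ? { error_kind: o.errorKind } : {}),
    },
  };
}
```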
Tool surface (v1):
- search_knowledge_base(query: string, kb_id?: string): vector + FTS hybrid search across library's KBs. Optional kb_id filter.
- lookup_section(section_id: string): direct lookup by section/rule ID.
- list_knowledge_bases(): list KBs available in this connector's library.
- get_document(doc_id: string): fetch full doc metadata + chunks.
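For reference, the four tools could be declared for tools/list roughly as follows. MCP tool definitions carry a name, a description, and a JSON Schema inputSchema; the description strings and the TOOLS constant here are illustrative wording, not final copy.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: "string"; description: string }>;
    required: string[];
  };
}

// Hypothetical manifest mirroring the v1 tool surface.
const TOOLS: ToolDefinition[] = [
  {
    name: "search_knowledge_base",
    description: "Hybrid vector + FTS search across the KBs in this connector's library.",
    inputSchema: {
      type: "object",
      properties: {
        query: { type: "string", description: "Natural-language search query." },
        kb_id: { type: "string", description: "Optional: restrict the search to one KB." },
      },
      required: ["query"], // kb_id stays optional
    },
  },
  {
    name: "lookup_section",
    description: "Direct lookup by section or rule ID.",
    inputSchema: {
      type: "object",
      properties: { section_id: { type: "string", description: "Section or rule identifier." } },
      required: ["section_id"],
    },
  },
  {
    name: "list_knowledge_bases",
    description: "List the KBs available in this connector's library.",
    inputSchema: { type: "object", properties: {}, required: [] },
  },
  {
    name: "get_document",
    description: "Fetch full document metadata and chunks.",
    inputSchema: {
      type: "object",
      properties: { doc_id: { type: "string", description: "Document identifier." } },
      required: ["doc_id"],
    },
  },
];
```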
Response strategy: buffer + progress notifications
- Run DB query, collect chunks, emit single tool result (buffered)
- For slow tools (>1s estimated): emit notifications/progress mid-execution ("Searching novel KB...", "Found N chunks, formatting...")
- Status messages appear in Claude Desktop's tool-call UI; LLM's reply still streams natively after the tool result lands
- Implementation budget: 30 min per tool on top of the base buffer pattern
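The buffer + progress pattern could look like the sketch below. It assumes the client supplies a progressToken in the request's _meta (progress is only sent when it does) and that the notification params carry progressToken, progress, and an optional message, as in the 2025-03-26 spec. The names progressNotification, toSseFrame and runBuffered are hypothetical, and the chunk fetch is synchronous here only to keep the example short; the real one would await the DB.

```typescript
interface ProgressNotification {
  jsonrpc: "2.0";
  method: "notifications/progress";
  params: { progressToken: string | number; progress: number; message?: string };
}

function progressNotification(
  progressToken: string | number,
  progress: number,
  message: string,
): ProgressNotification {
  return { jsonrpc: "2.0", method: "notifications/progress", params: { progressToken, progress, message } };
}

// One server-sent event carrying a JSON-RPC message on the text/event-stream response.
function toSseFrame(message: object): string {
  return `event: message\ndata: ${JSON.stringify(message)}\n\n`;
}

// Buffered tool run: status messages go out mid-execution, the result lands once at the end.
function runBuffered<T>(
  progressToken: string | number | undefined,
  emit: (n: ProgressNotification) => void,
  fetchChunks: () => T[],
): { chunks: T[] } {
  let step = 0;
  const report = (message: string): void => {
    if (progressToken === undefined) return; // client did not ask for progress
    step += 1; // progress values must keep increasing
    emit(progressNotification(progressToken, step, message));
  };
  report("Searching novel KB...");
  const chunks = fetchChunks();
  report(`Found ${chunks.length} chunks, formatting...`);
  return { chunks };
}
```

The emit callback would write toSseFrame(notification) to the open stream, followed by one final frame with the tool result.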
Error response format:
- HTTP status codes: 401 (missing/invalid/expired token), 403 (token valid but connector revoked / authz failure), 404 (connector not found), 500 (unexpected)
- JSON-RPC error envelope per MCP spec for tool-level failures
- Failed tool calls write audit events with error_kind for diagnostics
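The status-code rules above can be captured in two small helpers: one deciding 404 vs 403 after the token has already verified (a failed verify is the 401 case), and one building the JSON-RPC error envelope. ConnectorRow, statusForConnector and toolErrorEnvelope are hypothetical names; the error code -32603 is the standard JSON-RPC "Internal error" code, and putting error_kind under data is an assumption.

```typescript
// Minimal view of a connectors row for the authorization decision (assumed columns).
interface ConnectorRow { user_id: string; revoked_at: Date | null }

// Called only after the bearer token verified; a missing/invalid/expired token is 401 upstream.
function statusForConnector(connector: ConnectorRow | undefined, tokenSub: string): 200 | 403 | 404 {
  if (!connector) return 404; // connector not found
  if (connector.revoked_at !== null) return 403; // token valid but connector revoked
  if (connector.user_id !== tokenSub) return 403; // defense-in-depth: owner mismatch
  return 200;
}

interface JsonRpcErrorEnvelope {
  jsonrpc: "2.0";
  id: string | number | null;
  error: { code: number; message: string; data?: { error_kind: string } };
}

// Envelope for a tool-level failure; error_kind matches what the audit event records.
function toolErrorEnvelope(
  id: string | number | null,
  message: string,
  errorKind: string,
): JsonRpcErrorEnvelope {
  return { jsonrpc: "2.0", id, error: { code: -32603, message, data: { error_kind: errorKind } } };
}
```

MCP also lets a tool report an execution failure inside a normal result with isError set to true; which failures use that form versus the error envelope is worth deciding on Day 4.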
Local end-to-end validation (Day 7):
- Dev seed script: pnpm dev:seed-connector creates a known connector with known credentials (computes argon2 hash, inserts row, prints connector_id + secret). Avoids hand-computing hashes.
- Install local MCP connector config in Claude Desktop using seeded credentials + dev JWT from pnpm dev:make-token
- Run each of the 4 tools, verify scope enforcement, verify progress notifications render, verify audit events written
Out of scope
- AgentCore Runtime production deploy (EPIC-4)
- Connector creation UI (EPIC-2 — this epic consumes EPIC-2's connectors)
- Per-tool rate limiting (deferred to v1.1)
- Audit logging dashboard UI (events written here AND in EPIC-2 — UI deferred to post-beta)
- Token introspection caching (just use JWKS verification + DB lookup)
- True mid-stream chunk streaming within tool results (Claude Desktop buffers tool results anyway — limited UX benefit; revisit if a real use case surfaces)
- CORS configuration (Claude Desktop is server-side, not browser — not needed for v1)
Dependencies
- EPIC-2: connectors, libraries, library_kbs, mcp_audit_log tables + getKbScopeForConnector helper must exist
- Existing retrieval primitives in @autri/retrieval (vector-search, fts-search, lookup-section; already working)
- @modelcontextprotocol/sdk version that supports Streamable HTTP transport (MCP spec 2025-03-26+); verify + pin SDK version on Day 3
- Hono framework: add to dependencies
- node-argon2 (or @node-rs/argon2 per EPIC-2 risks): already added by EPIC-2; reused here for verify
- @autri/db for the connector lookup + audit-log write
- Dan's user_id from the post-backfill DB: set as DEV_USER_ID in .env.dev
Deliverables
- MCP server running locally on Streamable HTTP at localhost:8080
- Working OAuth scaffold (dev shim) with swap-able interface for production Cognito
- All 4 tools implemented + tested
- Claude Desktop config snippet documented in the epic notes
- End-to-end demo: Dan queries his novel KB from Claude Desktop with proper scope enforcement
Implementation plan
Day 3 — Transport adaptation + dev tooling
- Verify @modelcontextprotocol/sdk version supports Streamable HTTP (upgrade + pin if needed). Document pinned version.
- Scaffold Hono server:
  - POST /c/:connectorId/mcp for MCP RPC
  - Response: JSON for short results, text/event-stream for streaming + progress notifications
  - Mcp-Session-Id generation + return in response header
- Route handler skeleton: parse JSON-RPC request, dispatch to tool by method name
- Write pnpm dev:make-token CLI script: reads DEV_USER_ID from .env.dev, signs HS256 JWT, prints to stdout
- Write pnpm dev:seed-connector CLI script: creates a connector with known credentials, computes argon2 hash, inserts row, prints connector_id + plaintext secret
- Stub tool implementations returning placeholder data
- Verify with raw curl against localhost:8080: tools/list returns the tool surface
- Local HTTPS check: attempt Claude Desktop connection with plain HTTP. If rejected, add mkcert + local CA setup.
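The Day 3 route handler skeleton with stub tools might start out like this sketch, which the raw-curl tools/list check would exercise. It is deliberately independent of Hono and of the SDK (which may well provide its own dispatch); dispatch, rpcError and STUB_TOOLS are hypothetical names, and the error codes are the standard JSON-RPC 2.0 ones.

```typescript
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: string | number | null;
  result?: unknown;
  error?: { code: number; message: string };
}

const STUB_TOOLS = ["search_knowledge_base", "lookup_section", "list_knowledge_bases", "get_document"];

const rpcError = (id: string | number | null, code: number, message: string): JsonRpcResponse =>
  ({ jsonrpc: "2.0", id, error: { code, message } });

type Handler = (params: unknown) => unknown;

// Day 3 stubs: tools/list returns names only, tools/call returns placeholder text.
const handlers = new Map<string, Handler>([
  ["tools/list", () => ({ tools: STUB_TOOLS.map((name) => ({ name })) })],
  ["tools/call", (params) => {
    const name = (params as { name?: string } | undefined)?.name;
    if (!name || !STUB_TOOLS.includes(name)) throw new Error(`Unknown tool: ${String(name)}`);
    return { content: [{ type: "text", text: `placeholder result from ${name}` }] };
  }],
]);

// Parse the POST body and route by JSON-RPC method.
function dispatch(rawBody: string): JsonRpcResponse {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return rpcError(null, -32700, "Parse error");
  }
  if (typeof parsed !== "object" || parsed === null) return rpcError(null, -32600, "Invalid Request");
  const req = parsed as { id?: string | number | null; method?: unknown; params?: unknown };
  const id = req.id ?? null;
  if (typeof req.method !== "string") return rpcError(id, -32600, "Invalid Request");
  const handler = handlers.get(req.method);
  if (!handler) return rpcError(id, -32601, "Method not found");
  try {
    return { jsonrpc: "2.0", id, result: handler(req.params) };
  } catch (err) {
    return rpcError(id, -32602, err instanceof Error ? err.message : "Invalid params");
  }
}
```

Note that in MCP the JSON-RPC method for invoking a tool is tools/call, with the tool name in params.name, so "dispatch by method name" ends up being a two-level lookup.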
Day 4 — OAuth + scope enforcement + real tools + audit + last_used_at
- OAuth middleware: extract Bearer token, verify JWT (HS256 shared-secret for dev), check expiry + audience, write connectors.last_used_at = NOW() on success
- Token → user_id extraction from sub claim
- Connector resolution: SELECT user_id, library_id, revoked_at FROM connectors WHERE id = ?
- Defense-in-depth: assert connector.user_id === token.sub AND connector.revoked_at IS NULL → 403 if mismatch
- Wire getKbScopeForConnector into each tool handler
- Implement all 4 tools, each with buffer + progress notifications pattern:
  - search_knowledge_base → calls @autri/retrieval's vector + FTS hybrid search, scoped to library's KBs
  - lookup_section → direct section lookup, scope check
  - list_knowledge_bases → returns KBs from library_kbs join
  - get_document → fetch doc + chunks, scope check
- Audit event write for every tool call (success + failure)
- Unit tests for scope enforcement: cross-library leak attempts, revoked connector access attempts, expired token attempts, deleted library returns empty scope
- Unit test: audit event written correctly for success + failure paths
Day 7 — Local end-to-end validation (wedge gate)
- Run pnpm dev:seed-connector to create a test connector with known credentials
- Run pnpm dev:make-token to mint a dev JWT for Dan's user
- Create Claude Desktop MCP config:
  { "mcpServers": { "autri": { "url": "http://localhost:8080/c/{connectorId}/mcp", "headers": { "Authorization": "Bearer {jwt}" } } } }
- Restart Claude Desktop, verify it sees the autri tools
- Run each tool with realistic queries:
  - "What does Chapter 5 of mom's novel say about the locked door?" → search_knowledge_base
  - "Show me FIA Technical Reg T-7.2" → lookup_section
  - "What KBs do I have access to?" → list_knowledge_bases
  - "Fetch document <doc_id>" → get_document
- Verify scope enforcement: try a tool call with the wrong connector ID → expect 403
- Verify progress notifications: trigger a slow tool, observe status messages in Claude Desktop UI
- Verify audit events: SELECT * FROM mcp_audit_log ORDER BY created_at DESC LIMIT 20 shows the recent tool calls with correct metadata
- Realistic multi-turn workflow: ask 3-4 follow-up questions about mom's novel, confirm Claude Desktop uses tool results to build the conversation
- Mark wedge gate PASSED → proceed to EPIC-4
Risks
- @modelcontextprotocol/sdk Streamable HTTP support maturity. Verify the SDK version we use supports it cleanly; if not, may need to implement transport manually. Mitigation: fallback to writing a thin transport layer over the SDK's lower-level primitives.
- Claude Desktop MCP client config format for Streamable HTTP may differ from stdio. Check Anthropic's MCP docs for current Claude Desktop config schema. Mitigation: test with a known-working remote MCP server first if our format is unclear.
- OAuth shim must be replaceable with real Cognito without rewrites. Mitigation: design auth layer as a verifyToken(token: string): Promise<TokenClaims> interface; dev impl uses HS256, prod impl uses Cognito JWKS.
- Port conflict on localhost:8080. Mitigation: make port configurable via env var; default to 8181 if 8080 commonly conflicts.
- Performance of getKbScopeForConnector on every tool call. Sub-ms DB lookup with index should be fine, but worth measuring. Mitigation: in-memory cache with 60s TTL if it becomes a hotspot.
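If the scope lookup does become a hotspot, the 60s TTL mitigation could be a small wrapper like the one below. withTtlCache is a hypothetical helper; the clock is injectable so expiry can be unit-tested without waiting.

```typescript
// Wrap any keyed loader with a per-key TTL cache. For an async loader, V is a Promise,
// so concurrent callers share one in-flight lookup. Caveat: a rejected promise would
// also be cached until it expires, which a real version should evict.
function withTtlCache<V>(
  load: (key: string) => V,
  ttlMs: number,
  now: () => number = Date.now,
): (key: string) => V {
  const entries = new Map<string, { value: V; expiresAt: number }>();
  return (key) => {
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) return hit.value; // still fresh
    const value = load(key);
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

Usage would be along the lines of const cachedScope = withTtlCache(getKbScopeForConnector, 60_000). The trade-off is that a revoked connector or deleted library could keep its old scope for up to one TTL, so the revoked_at check on the connector row should stay uncached.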
Definition of done
- MCP server running at localhost:8080 on Streamable HTTP (MCP spec 2025-03-26+) via Hono + pinned @modelcontextprotocol/sdk
- Mcp-Session-Id generated, returned, and honored on subsequent requests
- pnpm dev:make-token CLI script works (reads DEV_USER_ID from .env.dev, outputs JWT)
- pnpm dev:seed-connector CLI script works (creates connector, computes argon2 hash, prints credentials)
- OAuth scaffold validates bearer tokens (dev shim with swap-able interface for Cognito JWKS in EPIC-4)
- Connector resolution + defense-in-depth check works (user_id match, revoked check, deleted-library handling)
- connectors.last_used_at updated on every successful auth
- All 4 tools implemented: search_knowledge_base, lookup_section, list_knowledge_bases, get_document
- Scope enforcement: tool calls return only KBs in connector's library
- Buffer + progress notifications pattern working for slow tools (status messages visible in Claude Desktop)
- Audit events written for every tool call (success + failure) to mcp_audit_log with correct metadata
- Error response format: 401/403/404/500 with JSON-RPC envelope per MCP spec
- Unit tests pass for scope enforcement edge cases + audit event writing
- Claude Desktop installs the connector, lists tools, queries successfully
- Wedge gate (beta-ready definition):
  - All 4 tools called successfully end-to-end
  - Scope enforcement test passed (cross-library leak attempt blocked)
  - One realistic multi-turn workflow completed (3-4 follow-up questions about mom's novel)
  - Progress notifications visible during slow tools
  - Audit events written + queryable
  - → "Yes, MCP-into-Claude-Desktop works." Proceed to EPIC-4.
Notes / open questions
Locked this triage pass (2026-05-19):
- HTTP framework: Hono
- MCP SDK: @modelcontextprotocol/sdk (Anthropic official), pinned version supporting Streamable HTTP
- Dev shim auth: HS256 JWT, signs sub claim with DEV_USER_ID from .env.dev
- Dev token issuance: pnpm dev:make-token CLI
- Dev connector seeding: pnpm dev:seed-connector CLI
- Streaming strategy: buffer + progress notifications via notifications/progress
- connectors.last_used_at write on every successful auth (owned by EPIC-3)
- Tool-call audit events written to mcp_audit_log (shared schema from EPIC-2)
- Library deletion handled implicitly by getKbScopeForConnector joining through library_kbs (defense-in-depth)
- Local HTTPS: try plain HTTP first; mkcert fallback if Claude Desktop rejects
- Error format: 401/403/404/500 + JSON-RPC error envelope per MCP spec
- Wedge gate: beta-ready definition (all 4 tools + scope + workflow + progress notifications + audit events)
- CORS: not needed (Claude Desktop is server-side)
- Mid-stream chunk streaming: out of scope (Claude Desktop buffers anyway)
Still open (low-risk, decide during implementation):
- Should the MCP server live in mcp-servers/doc-search (current location) or be promoted to a top-level mcp-servers/autri package? Current path is fine; rename if it becomes confusing.
- JWT expiry duration in dev: lean 24h (low friction for local dev). Production will be shorter (target 15 min; Cognito's default access-token validity is 60 min, so this needs to be configured on the app client in EPIC-4).
- Tool naming convention: search_knowledge_base (snake_case) per MCP examples. Already in scope; just noting.