
Brehob Launch — Delivery Plan (DRAFT)

Status: ROADMAP LOCKED 2026-06-11. Async review applied (6/10) → red-team applied (6/10: critical path reordered, Gate-0 split with pre-defined pass bars, DB-7 added) → blue-team complete (6/11: scope locked at FULL, with the ordered fallback list in Risks; STEM Racing temporarily paused). Epic docs created 6/11 under projects/autri/epics/ — breakout list at the end of the Roadmap. Ladders to North Star (B1–B7).

How to read: the Decisions block and the Roadmap are the load-bearing parts. Everything after is supporting reference.


Decisions

Statuses: ✅ resolved (locked by an external outcome or explicit agreement) · 🔶 active (refined by review, still open to challenge). Red-teamed 2026-06-10; the reorder and DB-7 below came out of that pass.

  • DB-1 (🔶) — Deliver QuoteAI on the Autri substrate, bundled inside Autri's deployment. QuoteAI becomes a vertical surface consuming Autri's knowledge-base retrieval in-process, behind a clean internal API boundary, riding Autri's existing prod deployment + Cognito + Postgres. Why: a standalone QuoteAI build (own auth/hosting/multi-tenancy/ingestion) is not feasible in the window solo; Autri already ships ~75% of the proposal's "platform" layer, live. Alternative rejected: a separate QuoteAI deployment now. Realizes North Star B6; matches decisions.md D2 ("consume, not merge").

  • DB-2 (✅ RESOLVED 2026-06-10; amended by red-team same day) — Brehob gets a dedicated, segmented AWS account, in scope for Phase 1 — and the vend moves BEFORE the corpus load. Andy-meeting outcome: Brehob's IT is effectively one person, and Dan committed to best-practice isolation (all Brehob data in its own member account under our AWS Organization) as a trust point. Red-team amendment: the account vend (roadmap item 5a) precedes the curated corpus load, so Brehob data ingests once, into its final home, and never sits in the shared beta account — no double-ingestion cost, no migration project, and UAT runs in the environment Brehob actually gets. Bedrock model access in the new account is part of the vend's definition of done (see DB-7). Sized ~3–5 days for the first vend (CDK audit 6/10: de-hardcoding ~0.5–1 day — one account/region pin in bin/autri-infra.ts, two secret ARNs in cdk.json, two content-security-policy pins in lib/web/cdn.ts; resource names don't collide across accounts). Entra SSO folds into this workstream (Brehob is a Microsoft shop; current identity wiring is Google-only). The reusable enterprise-deploy capability (B4); customer #2 is cheap once #1 is parameterized.

  • DB-3 (🔶, reframed 6/10) — Ingestion = two paths: structured docs skip normalization; unstructured docs normalize to markdown. The three-regime reality (verified in code): genuinely-structured markdown → deterministic, no-LLM chunking (parse-markdown.ts routes every .md as STRUCTURED); prose → Haiku paragraph-grouping (the deployed "chunk-grouping-v2" extractor — kept after the real-path A/B; the embedding-valley grouper lost and stays offline); tables/line-items → unhandled today (the chunker has no table-specific handling; chunk_type: 'table' is a label only). Named sub-item: per-chunk FTS + lookup keyword metadata — greenfield (no keyword columns exist today); it requires a schema migration, a new ingestion stage, and wider FTS/lookup queries, and it fixes the known prose FTS=0 weakness — the single biggest retrieval-quality lever for both the Brehob corpus and the dev-memory KB. Keyword-generation LLM calls ride the Batch API from day one. Gate-0 must report the regime mix, not an aggregate recall number. → B1.

  • DB-4 (🔶, promoted 6/10) — Build the consumable API now; dual-use; dogfood-first — and the dev-memory dogfood is load-bearing, not just API-hardening. Verified greenfield: no API-key surface exists today (web routes are Cognito-session-authed; the doc-search MCP server is token-authed). API keys are library-scoped from day one — the library is the RBAC unit (many KBs per library; the same KB can appear in multiple libraries), so keys grant per-use-case access. Sequence: API + key auth → dev-session-memory KB (/stop → transcript episode → KB; backfill past sessions; lives in a restricted library — transcripts carry Brehob pricing/contract detail, and collaborator access must be scopeable) → QuoteAI consumes the same hardened API. Dan's steer (6/10): the dev-memory loop is a compounding quality flywheel — every session retrieves prior sessions' decisions and rationale, which raises the quality of the Brehob build itself — so it earns real epic-level investment in parallel, not a thin de-risking pass. → B3/B6.

  • DB-5 (✅ CONFIRMED 2026-06-10) — Phase 1 only at go-live; Phase 2 trails under its own future MOI. Andy confirmed the signed MOI's timeline covers Phase 1 (quote generation + chat) only; Phase 2 (KPI dashboard + Quotebook) gets a separate MOI later. A +3-week extension was granted → go-live end-August / early-September 2026.

  • DB-6 (🔶, in motion) — $3,000/mo Phase 1 accepted (anchor Y2 renewal to the published $5,000); do not countersign until a real services agreement + attorney review are in place. IP requirements sent to Andy 6/10: Hannah Labs retains all platform/app IP including derivatives; Brehob gets a service license and owns its own data. Brehob drafts the contract → to Dan Thu 6/11 → sign Fri 6/12 only if reviewed and clean. Checklist: ☑ entity/EIN confirmed (6/11 — Hannah Labs LLC + EIN exist) · ☐ attorney engaged — TODAY's action (6/11, draft day) · ☐ attorney has reviewed the draft → only then sign. If review can't happen before Friday, the signature slips — not the review. Contract language to hand the attorney: "all Brehob data is processed within Brehob's dedicated AWS account, including AI inference via AWS Bedrock" (per DB-7).

  • DB-7 (🔶 NEW, red-team 6/10) — Phase-1 inference runs on Bedrock inside brehob-prod; the direct Anthropic API sits behind a fallback flag. Why: with Bedrock in-account, the inference data path never leaves Brehob's dedicated AWS account — the silo pitch becomes literally true and the contract sentence above is clean, with no third-party-processor caveats. Precision note: the Anthropic API does not train on business API data (no-training commercial terms, time-limited retention) — so this is a data-residency and contract decision, not leak prevention; Gate-0's small local slice may use the Anthropic API without ceremony. The Agent SDK switches to Bedrock via environment config; the fallback flag protects the timeline if Bedrock access approval drags. Consumes item 7's per-account Bedrock requests; honors carried decision D16. → B4.
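
The DB-7 routing rule (Bedrock first, direct Anthropic API only behind a fallback flag) can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the Agent SDK's configuration surface: the `InferenceConfig` shape, the `allowAnthropicFallback` flag name, and the `selectProvider` helper are all hypothetical.

```typescript
// Hypothetical sketch of the DB-7 routing rule. None of these names come from
// the Autri codebase or the Agent SDK.
type Provider = "bedrock" | "anthropic-direct";

interface InferenceConfig {
  bedrockAccessGranted: boolean;   // has Bedrock model access been approved in this account?
  allowAnthropicFallback: boolean; // the fallback flag; expected to be off in normal operation
}

function selectProvider(cfg: InferenceConfig): Provider {
  // Bedrock always wins when available, so inference stays inside the account.
  if (cfg.bedrockAccessGranted) return "bedrock";
  // The fallback only applies when explicitly enabled.
  if (cfg.allowAnthropicFallback) return "anthropic-direct";
  throw new Error("Bedrock access not granted and the fallback flag is off");
}
```

Failing loudly when neither path is available keeps a misconfigured account from silently sending data down the path the contract sentence rules out.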


The deal (signed MOI, 2026-06-05)

Brehob's COO (Ed Perry) signed a Memorandum of Implementation Understanding on 6/5; the Hannah Labs signature line is blank (ball in Dan's court). It's an ops/timeline alignment doc, not a full services contract. Key terms:

  • Go-live: originally second week of August 2026; +3-week extension granted 6/10 → end-Aug / early-Sep.
  • Pricing accepted: Phase 1 setup $12,950 (✓), Phase 2 setup $5,250 (✓, Option 2A — no ERP discovery), Phase 2 monthly $1,400 (✓). Changed: Phase 1 monthly $3,000 (proposed $3,500); setup spread over 6 monthly installments (proposed lump-sum at signing).
  • New commitments: hard milestone dates (below); 2 onsite training days pre-go-live + 1 follow-up within 30 days (admin / sales-leadership / end-user / approval-process / best-practice / Q&A tracks); weekly implementation meetings June→Sept.

The 6/9 open clarifications are all answered (Andy meeting, 6/10): (1) Phase 1 only at go-live — confirmed (DB-5); (2) the MOI sits alongside a real services agreement — contract in motion this week (DB-6); (3) data segregation — Dan proactively committed the dedicated account as a trust point (DB-2); no security review needed to force it.


Milestone map (MOI dates ↔ roadmap items)

Andy sent a revised timeline 6/11 (+3 weeks applied uniformly, including kickoff) — pending Dan's confirmation, which should ride with the contract response. Internal clocks do NOT slip: item 1's first measured checkpoint (the former Gate-0a read) stays on the original ~Jun 15 internal target, and the new pre-kickoff runway (now → Jul 6) goes to the substrate items (1, 2, 7, 5a-prep) — run on our clock, not the customer's.

| MOI milestone | Original | Revised (Andy, 6/11) | Our work (roadmap item) |
| --- | --- | --- | --- |
| M1 Kickoff & alignment | Jun 15 | Jul 6 | Item 1's early go/no-go read in hand WELL before (internal target unchanged); curation analysis done; substrate work underway |
| M2 Discovery & workflow mapping | Jun 26 | Jul 7–17 | Brehob intake/template config; approval-routing rules (feeds item 4's approval workflow); corpus triage — execute item 1's curation analysis before any bulk load |
| M3 Document ingestion & KB prep | Jul 10 | Jul 20–31 | Curated corpus load into brehob-prod (inference via Bedrock, DB-7) — ingestion foundation + 5a vend should be DONE before this window opens |
| M4 Platform config & build | Jul 24 | Aug 3–14 | QuoteAI vertical (item 4) |
| M5 UAT | Aug 1 | Aug 17–22 | End-to-end quote gen + chat against the Brehob KB — in brehob-prod, Entra SSO live, F2 isolation probe passed (5b done) |
| M6 Onsite training & deploy prep | Aug 4–8 | Aug 25–29 | Onsite training; training materials; final hardening polish |
| M7 Production go-live | Aug 11–15 | Sep 1–5 | Phase 1 live (matches the verbally-granted extension) |
| M8 Post go-live stabilization | Sep 19 | Sep 8–Oct 10 | Stabilization support; Phase 2 scoping under its own MOI (DB-5) |
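
As a quick arithmetic check on the uniform +3-week shift, the sketch below adds 21 days to each original milestone date (using the last day where the original is a window) and prints the result for comparison with the end of the revised window. The `shiftDate` helper is illustrative only; dates are handled in UTC to avoid timezone drift.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Shift an ISO calendar date (YYYY-MM-DD) by a whole number of days, in UTC.
function shiftDate(isoDate: string, days: number): string {
  const start = Date.parse(`${isoDate}T00:00:00Z`);
  return new Date(start + days * MS_PER_DAY).toISOString().slice(0, 10);
}

// Original MOI dates; for M6 and M7 this is the last day of the original window.
const originalDates: ReadonlyArray<[string, string]> = [
  ["M1", "2026-06-15"], ["M2", "2026-06-26"], ["M3", "2026-07-10"], ["M4", "2026-07-24"],
  ["M5", "2026-08-01"], ["M6", "2026-08-08"], ["M7", "2026-08-15"], ["M8", "2026-09-19"],
];

originalDates.forEach(([milestone, date]) => {
  console.log(milestone, date, "->", shiftDate(date, 21));
});
```

Each shifted date lands on the last day of the corresponding revised window (for example, Jun 15 becomes Jul 6 and Sep 19 becomes Oct 10), which is consistent with the extension having been applied uniformly.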

Item 7 (public face) runs in the pre-kickoff window — landing page live before 5a vends the Brehob account.


Roadmap (LOCKED 2026-06-11)

Two interleaved tracks: Substrate (the shared foundation, hardened by dogfooding) and Brehob delivery (the deadline-bound vertical).

Reconciled 2026-06-15: Gate-0 (formerly item 0) is folded into item 1 — the eval harness is the acceptance gate on the build, not a throwaway spike before it; and the structured-attribute filter is now core substrate built in item 1, which resolves the item-4 "build only if Gate-0b ⑥ demands it" contingency.

Critical path (reordered by red-team 6/10; Gate-0 folded into item 1 on 6/15): 1 (ingestion foundation — the Gate-0 eval-acceptance gate lives inside it) → 5a (vend brehob-prod + Bedrock access) → curated corpus load into brehob-prod → 4 → 5b (SSO + hardening) → UAT → go-live. Item 2 (API) lands between 1 and 4 (item 4 consumes it); item 3 parallels with real investment; item 6 threads through everything; item 7 starts immediately (landing page live before 5a vends). Why the reorder: the corpus loads once, into its final home — no Brehob data ever sits in the shared beta account, no double-ingestion cost, no cross-account migration project, and UAT runs in the environment Brehob actually gets.

1. Ingestion foundation — the shared substrate everything consumes; the former Gate-0 spike is folded in as this epic's eval-acceptance gate (merged 6/15).

What the 6/15 merge changed: Gate-0 is no longer a throwaway spike that runs before the build — the eval harness is the acceptance gate on this build (the capability is wanted in Autri regardless of Brehob, so nothing is thrown away). Pass bars are still authored before any code (epic story S0). The early go/no-go read Gate-0a was meant to give — server-side .doc conversion approach (container-image Lambda or Fargate task; LibreOffice has known Lambda traps), table/line-item fidelity at retrievable-row granularity, regime mix + real $/doc against the $3,000/mo margin — now surfaces as the first measured checkpoint after S1–S2 (production code, kept on a "go"), preserving the same Plan B on a no-go: conversion-approach rework + heavier curation + spend the +3-week extension; the signed deal is not at risk. Runs locally first (full cost instrumentation, migrations 013/014; the small slice's LLM calls use the Anthropic API under no-training commercial terms, DB-7).

Scope: structured-skip / unstructured-normalize split (DB-3); server-side legacy-format conversion; per-chunk FTS + lookup keyword metadata (the biggest retrieval lever — fixes prose FTS=0); typed-attribute extraction + the filter-then-rank operator — Autri's one missing retrieval primitive, now built here as core substrate (serves Brehob spec-match and dev-memory recency/supersession; architecture locked in sub-systems/ingestion-pipeline, OD1–OD12); batch ingestion (validated ~50% cheaper + ~2× faster; confirmed NOT built — BATCH_MULT in pricing.ts, no Batch call anywhere); failure surfacing (Brehob's legacy corpus guarantees conversion failures). Work breakdown (nine dependency-ordered stories, S0–S8) in epics/ingestion-foundation.

Eval slice + curation: seeded from the Slate Trucks pair (recent + complete: spreadsheet + proposal) plus ~8–12 docs spanning format × doc-type × complexity (the Powerex oilless golden scenario, a hospital/NFPA system, 1–2 deliberately-old docs to test whether age = junk). The corpus-curation analysis (former Gate-0b ⑤: recency per doc-type, final-vs-draft dedup, value tiers, coverage by product line) reuses that ingested slice + scorecard and feeds M2 triage. Pull back archive/sub-systems/{eval-corpus-and-doe, pipeline-eval-harness, unified-chunking-markdown} and archive/epics/batch-ingestion.

2. Consumable API + API-key auth (DB-4) — the seam both QuoteAI and dev-memory consume; verified greenfield. Keys are library-scoped from day one: the library is the RBAC unit (many KBs per library; the same KB can appear in multiple libraries), so a key grants per-use-case access — e.g. an Engineering library vs a commercially-sensitive one. QuoteAI consumption needs key scoping anyway; deferring it means retrofitting auth under deadline. Why here: nothing dogfoods or integrates until it exists.
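
A minimal sketch of the library-scoped access check, assuming an in-memory model: a key is granted a set of libraries, a library holds a set of KBs, and a KB may appear in more than one library. The `Library` and `ApiKey` shapes, the `canAccessKb` helper, and the sample IDs are hypothetical, since no API-key surface exists yet.

```typescript
// Hypothetical data model; not taken from the Autri schema.
interface Library {
  id: string;
  kbIds: ReadonlySet<string>; // many KBs per library
}

interface ApiKey {
  id: string;
  libraryIds: ReadonlySet<string>; // the key's grants, expressed as libraries
}

// A key may read a KB if at least one of its granted libraries contains that KB.
function canAccessKb(key: ApiKey, kbId: string, libraries: readonly Library[]): boolean {
  return libraries.some((lib) => key.libraryIds.has(lib.id) && lib.kbIds.has(kbId));
}

// Example: the same KB ("kb-shared") appears in both libraries.
const libraries: readonly Library[] = [
  { id: "engineering", kbIds: new Set(["kb-manuals", "kb-shared"]) },
  { id: "commercial", kbIds: new Set(["kb-pricing", "kb-shared"]) },
];
const engineeringKey: ApiKey = { id: "key-eng", libraryIds: new Set(["engineering"]) };
```

With this shape, `engineeringKey` can reach `kb-manuals` and `kb-shared` but not `kb-pricing`, which is the per-use-case separation described above; granting access means adding a library to a key, never listing KBs individually.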

3. Dev-memory dogfood (promoted) — /stop → transcript episode → KB → backfill past Autri sessions. The API's first real consumer AND the compounding quality flywheel for the rest of this roadmap (Dan's steer, 6/10). Its corpus is unstructured prose, so it exercises exactly the keyword-metadata + prose-grouping path Brehob needs — on our own data first. The dev-memory KB lives in a restricted library — transcripts will carry Brehob pricing and contract detail, and collaborator access (e.g. Jack) must be scopeable from day one. Parallel to 2→4; invested, not thin.

4. QuoteAI vertical — retrieval consumes Autri's generic surface, including the new structured-attribute filter (reconciled 6/15): all six of QuoteAI's drafter tools now map onto Autri operators — five onto generic vector/FTS/lookup (improved by item 1's keyword metadata), and the sixth, search_equipment's numeric-range filtering (horsepower/CFM/PSI), onto the filter-then-rank operator delivered as core substrate in item 1. The 6/10 "build the numeric primitive only if Gate-0b ⑥ demands it" contingency is resolved — it's built regardless (it's also the dev-memory primitive), so QuoteAI just consumes it; no vertical-specific retrieval rewire. Model path = DB-7: Phase-1 inference on Bedrock inside brehob-prod (Agent SDK environment switch), direct Anthropic API behind a fallback flag. Drafting is already an in-app Agent SDK → Sonnet call; build PDF export (disabled by design in the demo) and the review & approval workflow (genuinely absent) — real builds, not re-enables; admin + reviewer routing rides Autri auth. KPI dashboard + Quotebook = Phase 2, trails per DB-5. The actual Phase-1 deliverable, built to scale to consumers beyond QuoteAI.
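
To make the filter-then-rank idea concrete, here is a minimal in-memory sketch: chunks are first filtered on typed numeric attributes (such as horsepower), then the survivors are ordered by their retrieval score. The `Chunk` and `RangeFilter` shapes and the `filterThenRank` name are assumptions for illustration; the real operator's design lives in sub-systems/ingestion-pipeline and would run in the database rather than in application memory.

```typescript
// Hypothetical shapes; the real operator's schema is not shown in this plan.
interface Chunk {
  id: string;
  score: number; // retrieval score from vector/FTS search, higher is better
  attrs: Readonly<Record<string, number | undefined>>; // e.g. { horsepower: 25 }
}

interface RangeFilter {
  attr: string;
  min?: number; // inclusive lower bound, if any
  max?: number; // inclusive upper bound, if any
}

function filterThenRank(
  chunks: readonly Chunk[],
  filters: readonly RangeFilter[],
  topK: number,
): Chunk[] {
  const passes = (c: Chunk): boolean =>
    filters.every((f) => {
      const value = c.attrs[f.attr];
      if (value === undefined) return false; // a missing attribute cannot satisfy a range
      return (f.min === undefined || value >= f.min) && (f.max === undefined || value <= f.max);
    });
  // filter() returns a fresh array, so sorting it does not mutate the input.
  return chunks.filter(passes).sort((a, b) => b.score - a.score).slice(0, topK);
}
```

Filtering before ranking means a high-scoring chunk outside the requested range never displaces an in-range one, which is the behavior a numeric spec match needs.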

5. Brehob dedicated account + enterprise deploy (DB-2) — split by red-team 6/10:

  • 5a — vend (moves BEFORE the corpus load): parameterize the CDK, vend brehob-prod, baseline guardrails, per-account externals (secrets, DNS + certs, Cognito pool + allowlist). Definition of done includes Bedrock model access granted — request it the day the account exists; approval lead time is why item 7's landing page wants to be live first.
  • 5b — enterprise hardening (before UAT): Entra SSO (Brehob is a Microsoft shop; current identity wiring is Google-only — needs Brehob's one-person IT on the Entra side, so start the conversation early), alarm routing past the single SNS email, the one backup-restore drill, the F2 live two-user isolation probe run against brehob-prod (acceptance criterion), and a fixed-infra COGS estimate (RDS/NAT/CloudFront/Cognito baseline — order of $200–400/mo — sanity-checked against $3,000/mo at the 70% margin target).

~3–5 days for 5a + SSO in 5b. Hard requirement (6/10) + the reusable B4 capability.
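
The 5b COGS sanity check reduces to simple arithmetic. The sketch below assumes the 70% target is a gross margin on the monthly fee and that COGS splits into a fixed infrastructure baseline plus a variable remainder (ingestion and inference); that split and the `variableCogsBudget` helper are assumptions for illustration.

```typescript
// Illustrative arithmetic only; the fixed/variable COGS split is an assumption.
function variableCogsBudget(
  monthlyFee: number,
  marginTargetPct: number,
  fixedInfra: number,
): number {
  // Using an integer percent keeps the arithmetic exact for whole-dollar inputs.
  const cogsCeiling = (monthlyFee * (100 - marginTargetPct)) / 100;
  return cogsCeiling - fixedInfra;
}

const MONTHLY_FEE = 3000;  // Phase 1 monthly fee (DB-6)
const MARGIN_TARGET = 70;  // percent, from the 5b sanity check

const budgetAtLowInfra = variableCogsBudget(MONTHLY_FEE, MARGIN_TARGET, 200);  // 700
const budgetAtHighInfra = variableCogsBudget(MONTHLY_FEE, MARGIN_TARGET, 400); // 500
```

Under these assumptions the total COGS ceiling is $900/mo, so a $200–400 fixed baseline leaves roughly $500–700/mo for variable costs, the number the per-doc and per-query cost columns would be compared against.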

6. Reliability + cost (threaded, not last) — failure-surfacing acceptance criteria live in item 1; alarm routing + restore drill in item 5b; the cost columns are the live margin instrument from item 1's first ingest onward. (2026-06-10: the stale 10-message DLQ backlog — May 29–31 initial-pipeline leftovers — was verified and purged to a clean baseline; any future DLQ item is a real signal and gets taken seriously.)

7. Autri public face (parallel; external lead times — START IMMEDIATELY) — implement + deploy the landing page at the autri.ai root (confirmed: nothing serves the apex today) from the existing autri-landing design handoff (full long-form marketing landing + pricing page designs already exist; the CDK web stack already fronts CloudFront, so apex domain + S3 static origin is a contained addition) → Bedrock model-access requests for BOTH accounts (grants are per-account: autri-prod now, brehob-prod at 5a vend; consumed by DB-7) → AWS Activate credits application (needs the live site). Target: landing page live before 5a vends.

Epic docs (CREATED 2026-06-11; gate-0 re-scoped 6/15): gate-0-corpus-spike (re-scoped — no longer a standalone spike; its eval-acceptance role now lives inside ingestion-foundation — the S0 gold plus the extraction and operator stories) · ingestion-foundation · consumable-api · dev-memory · quoteai-vertical · enterprise-deploy. Reliability items thread into each epic's acceptance criteria rather than a standalone doc; item 7 is a short task list, not an epic. archive/epics/batch-ingestion and the eval-harness docs pull into ingestion-foundation.


Architecture

What Autri already provides (live in prod, ~75% of the proposal's "platform"): AWS hosting + monitoring + backups, Cognito auth, document ingestion (PDF/DOCX/MD), vector + full-text + section retrieval, chat against a KB, row-level multi-tenancy, cost tracking (shipped — per-doc stage-breakdown + per-query cost columns; doubles as the Gate-0 margin instrument), custom subdomain. QuoteAI sheds its own pgvector, MCP servers, ingestion CLI, and never-built auth/hosting — collapsing to just the vertical.

What QuoteAI is (the vertical on top): structured intake form (built) → AI quote drafting (built — already an in-app Agent SDK → Sonnet call; retrieval moves to Autri's generic surface, including the structured-attribute filter built as core substrate in item 1, which covers numeric-range equipment filtering; reconciled 6/15) → template render + PDF export (build — disabled by design in the demo) → review & approval workflow (build — genuinely absent) → KPI dashboard + Quotebook (Phase 2; currently seed-data).

Topology: QuoteAI ships as a product surface inside Autri's deployment (shared Lambda/Cognito/Postgres; QuoteAI tables via Autri migrations, org-scoped), calling retrieval in-process behind a clean module boundary so it can split into its own deployment later without a rewrite.


The corpus reality

17,704 files / 3.8 GB. The format split, updated with what our parsers actually cover (6/10 audit):

  • Covered by portable parsers today: .xlsx (2,992) and legacy .xls (4,783) — QuoteAI's SheetJS-based Excel parser (quoteai/ingestion/parsers/excel.ts) handles both, emitting markdown pipe-tables, and it's pure JavaScript — it ports cleanly into Autri's server-side workers. .docx (186) via mammoth (pure JS). .pdf (822) via Autri's existing path.
  • The true gap: .doc (8,618 files — including the actual sent quotes, … - Final.doc) + .rtf. QuoteAI's .doc path shells out to macOS textutil, which cannot run in Lambda. A server-side conversion approach (LibreOffice container or alternative) is the conversion question item 1 answers first (epic story S1).
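
The format split can be checked with a few lines of arithmetic. The counts below are the ones listed above; the .rtf count is not given, so it is left out (the listed formats sum to 17,401 of 17,704 files). Treating .doc and .xls as the "legacy" formats is an assumption, and the `FormatCount` shape and `sharePct` helper are illustrative names.

```typescript
const TOTAL_FILES = 17704;

interface FormatCount {
  ext: string;
  files: number;
  coveredToday: boolean; // handled by a portable parser per the 6/10 audit
  legacy: boolean;       // assumption: legacy means the pre-XML Office formats
}

const formats: readonly FormatCount[] = [
  { ext: ".xlsx", files: 2992, coveredToday: true, legacy: false },
  { ext: ".xls", files: 4783, coveredToday: true, legacy: true },
  { ext: ".docx", files: 186, coveredToday: true, legacy: false },
  { ext: ".pdf", files: 822, coveredToday: true, legacy: false },
  { ext: ".doc", files: 8618, coveredToday: false, legacy: true },
];

// Share of the whole corpus (rounded to a whole percent) matching a predicate.
function sharePct(matches: (f: FormatCount) => boolean): number {
  const files = formats.filter(matches).reduce((sum, f) => sum + f.files, 0);
  return Math.round((files / TOTAL_FILES) * 100);
}

const coveredPct = sharePct((f) => f.coveredToday); // .xlsx + .xls + .docx + .pdf
const gapPct = sharePct((f) => !f.coveredToday);    // .doc only
const legacyPct = sharePct((f) => f.legacy);        // .doc + .xls
```

On these counts about half the corpus is already covered, .doc alone is roughly 49%, and .doc plus .xls rounds to 76%, which matches the legacy-format figure cited under Risks.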

Curation before load (6/10): we do not ingest all 17,704 files. Newer quotes and formats supersede old ones; junk is dropped; old documents enter only if they add retrieval value to current quote scenarios (the eval slice includes deliberately-old docs to measure exactly that — old ≠ junk automatically, since historical quotes carry phrasing/structure value). Criteria from item 1's curation analysis: recency window per doc-type, final-vs-draft dedup, doc-type value tiers, and a coverage check by product line so curation doesn't hollow out lines Brehob still sells. Side benefit: legacy .doc files skew old, so curation likely shrinks the conversion gap before we solve it. (Autri's retrieval already excludes superseded document versions natively; curation governs what enters at all.)

Implication unchanged: ingestion fit — conversion + table/line-item granularity — is the first gate, not scale. Item 1's first measured checkpoint answers it with a recall number on real quotes + a conversion-fidelity verdict + the regime mix + $/doc + the curation rules.
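
Reporting the regime mix rather than one aggregate number could look like the sketch below, which groups per-document eval results by ingestion regime and computes recall within each group. The `EvalResult` shape and `recallByRegime` helper are hypothetical and do not reflect the eval harness's actual scorecard format.

```typescript
type Regime = "structured" | "prose" | "table";

// One eval document: how many of its relevant chunks were actually retrieved.
interface EvalResult {
  regime: Regime;
  relevantRetrieved: number;
  relevantTotal: number;
}

interface RegimeScore {
  docs: number;   // how many eval docs fell into this regime (the "mix")
  recall: number; // retrieved / total relevant, pooled within the regime
}

function recallByRegime(results: readonly EvalResult[]): Map<Regime, RegimeScore> {
  const totals = new Map<Regime, { docs: number; hit: number; total: number }>();
  for (const r of results) {
    const t = totals.get(r.regime) ?? { docs: 0, hit: 0, total: 0 };
    t.docs += 1;
    t.hit += r.relevantRetrieved;
    t.total += r.relevantTotal;
    totals.set(r.regime, t);
  }
  const scores = new Map<Regime, RegimeScore>();
  totals.forEach((t, regime) => {
    scores.set(regime, { docs: t.docs, recall: t.total === 0 ? 0 : t.hit / t.total });
  });
  return scores;
}
```

A strong structured-markdown score can mask a near-zero table score in a pooled average; splitting by regime keeps the table/line-item weakness visible at the checkpoint.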


Risks

  • ⚠️ Legal is TODAY's action (red-team 6/10): the attorney is NOT yet engaged (the Hannah Labs LLC entity and EIN were confirmed 6/11) — with the contract draft arriving Thu 6/11 and a target signature Fri 6/12. The IP terms are a one-way door. If review can't happen before Friday, the signature slips — not the review (DB-6 checklist).
  • Bandwidth, not eng-weeks. ~7–9 focused eng-weeks in the (now ~12-week) window, solo, while keeping Autri's beta alive — plus weekly Brehob meetings, training prep, and the legal track. Still the binding constraint; the +3 weeks is breathing room, not slack. Scope carried in FULL (blue-team call, 6/11) — and the four pre-identified trims are the ordered fallback list if epic sizing busts the window: ① dev-memory backfill → post-go-live; ② approval workflow → single-stage v1; ③ batch ingestion → defer if item 1's early (curated-corpus-size × $/doc) read is modest; ④ item 7 → landing page only. Cut in that order, calmly, if and when sizing says so.
  • Corpus ingestion fidelity (the long pole). 76% legacy formats + tables/line-items must survive conversion at retrievable granularity. De-risk first — it's item 1's first measured checkpoint, with pass bars pre-defined in S0.
  • Item 1's early go/no-go read slipping past kickoff. The conversion-approach / table-fidelity / $-per-doc read (the former Gate-0a) is now the first measured checkpoint inside item 1 (after S1–S2); only those go/no-go measurements block scope-lock. If they slip, kickoff proceeds and the read lands days later, but scope-lock waits for it. The capability is production code now, so a "go" is kept, not rebuilt.
  • F2 multi-tenant isolation — now scheduled. White-box half done + green; the runtime probe is committed and ready; the live two-user run is an item-5b acceptance criterion against brehob-prod, before UAT.
  • Entra SSO depends on Brehob's one-person IT for the Microsoft-side configuration — start that conversation at kickoff, not at M5.
  • Reliability debt is real but now owned: failure-surfacing → item 1; alarm routing + restore drill → item 5b. DLQs purged to a clean baseline 6/10 — any future DLQ item is a real signal.
  • STEM Racing self-serve beta — temporarily PAUSED (6/11). Keep-the-lights-on only during the Brehob window; revisit the beta if time allows and the app is ready post-go-live.

Open questions (carried into epics)

Red/blue-team complete (6/10–6/11): scope locked at FULL, with the ordered fallback list in Risks. Remaining open questions carry into the epics that own them:

  • Does the in-process-bundle boundary stay clean enough to split QuoteAI out later, or does coupling creep? (→ epics/quoteai-vertical)
  • Server-side .doc conversion: LibreOffice container in the ingestion worker vs a dedicated conversion task? (→ ingestion-foundation S1; the first measured checkpoint recommends)
  • Does Gate-0b's drafter-sufficiency read demand the structured-metadata filter primitive? RESOLVED 2026-06-15 — the structured-attribute filter is built as core Autri substrate in item 1 regardless (it is also the dev-memory recency primitive); QuoteAI consumes it rather than triggering a contingent build. Architecture locked in sub-systems/ingestion-pipeline.

Relationship to other docs: ladders to North Star (B1–B7); reorders the archived roadmap (archive/roadmap); the architecture decision is decisions.md D2; the corpus findings, MOI deltas, and spike design originate here. The roadmap above supersedes this doc's earlier flat work breakdown (the P/Q epic list) — epic docs live under projects/autri/epics/.
