Executive Brief#

Board-level summary · one page

We will adopt an agent-native software development lifecycle for our regulated medical-device software — capturing the productivity of AI while meeting our FDA / IEC 62304 obligations, keeping all models and data inside our own infrastructure, and controlling cost. This is an engineering and quality discipline, not “vibe coding.”

The betGeneration is becoming free; correctness, validation, traceability, and cost control are the engineering. Our advantage comes from the system around the model, not the model alone.

Why now

AI now writes a large share of new code industry-wide, compressing implementation from weeks to hours. Our competitors are moving. The risk is not adopting too slowly — it is adopting without assurance, which in a regulated context means recalls, findings, and IP leakage. A disciplined framework lets us move fast and stay defensible.

What we are building

A self-hosted model fleet

Open-weight, fine-tuned models served on our existing Kubernetes/GPU platform. No Claude / OpenAI / Gemini APIs — for cost at scale, data sovereignty, and regulatory control.

A six-level maturity model

ASMM-Med moves us from ungoverned “shadow AI” to validated autonomous agents, in assurance-gated steps each signed off by Engineering, Quality/Regulatory, and Security.

Deterministic assurance

Every AI output passes deterministic verifiers (compilers, tests, static analysis, formal checks) plus human review before it can ship. The model proposes; the system disposes.

Cost as a first-class metric

We optimize cost-per-verified-change, not cost-per-token — via tiered model routing, caching, quantization, and in-loop budget limits.

The three questions the board will ask

Question	Our answer
How do we hit 99.9% accuracy if the AI is probabilistic?	99.9% is a property of the system, not the model. We wrap every generation in deterministic checks and human checkpoints (Generate → Verify → Repair → Gate → Sign). The model is the least-trusted component; trust is manufactured around it. The highest-risk software (IEC 62304 Class C) always requires dual human sign-off.
Can we afford the GPU cost?	Self-hosting converts per-token vendor billing into an owned, amortizable GPU fleet. A small “reflex” model handles the majority of calls cheaply; large reasoning models are used sparingly. At our scale this is materially cheaper than API pricing — and sovereignty/IP control make it non-optional regardless.
Will regulators accept it?	Yes, when the agent is treated as validated production software under FDA Computer Software Assurance, with documented intended use, risk-based evidence, full traceability, and immutable 21 CFR Part 11 records. The framework is built to produce that evidence automatically.

What it unlocks

Throughput — faster implementation, test generation, documentation, and safe modernization of legacy code that was previously “too risky to touch.”
Quality — more comprehensive automated test and evaluation coverage than humans can produce in the same time, lowering escape-rate into regulated products.
Leverage — smaller teams tackle larger problems; engineers shift from writing code to designing, validating, and directing the systems that produce it.

Investment & timeline (illustrative)

Horizon	Maturity target	Focus
Quarters 1–2	L1 Governed Assistance	Kill shadow AI; stand up self-hosted serving + full logging
Quarters 3–4	L2 Spec-Driven	Specs in-repo; deterministic evaluation harness in CI
Quarters 5–7	L3 Orchestrated	Sandboxed agents, model fleet + routing, policy server
Quarters 8–11	L4 Validated Autonomous	CSA-validate agents; 99.9% gates; full traceability
Quarter 12+	L5 Self-Optimizing	Closed-loop fine-tuning; cost-optimized at scale

Primary investment: GPU capacity, a platform/MLOps team, evaluation engineering, and quality/regulatory integration. Resourcing and figures are placeholders for the funded business case.

What we ask of the board

Endorse the self-hosted, assurance-gated strategy (no external LLM APIs).
Fund Phase 1 (L1) — serving platform, logging, and the end of shadow AI.
Affirm the governance principle that autonomy never outruns assurance: no maturity level is granted without joint Engineering + Quality/Regulatory + Security sign-off.

Bottom lineStructure scales; vibes don’t. AI amplifies our engineering and quality culture — both its strengths and its weaknesses. This framework ensures it amplifies the right ones. Full detail: see the Maturity Model and the document set.