01 - Requirements (Normative "Shall" Document)

Project: Agentic-Native SDLC for Regulated Medical Device Engineering
Status: Baseline v1.0 · Date context: May 2026
Classification: Internal Engineering / Quality Reference
Related docs: 02-maturity-model.md · 03-reference-architecture.md · 04-model-strategy-and-finetuning.md · 05-evaluation-and-validation.md · 06-agentic-workflows.md · 07-security-and-compliance.md · 08-token-and-gpu-economics.md · 09-adoption-roadmap.md

1. Purpose, Scope, and How to Read Requirement IDs

1.1 Purpose

This document is the normative requirements baseline for an agentic-native software development lifecycle (SDLC) serving a 1000+ developer medical-device engineering organization. It defines, in testable "shall" form, what the platform, its model fleet, and its agents must do. It is the contract against which the architecture (03-reference-architecture.md), evaluation system (05-evaluation-and-validation.md), and compliance posture (07-security-and-compliance.md) are judged.

1.2 Scope

In scope	Out of scope
AI-assisted and AI-autonomous activities across the IEC 62304 software lifecycle (requirements → design → code → test → docs → review → maintenance)	Hardware design, electrical/mechanical CAD outside software-controlled subsystems
Self-hosted open-weight model fleet, serving, fine-tuning, evaluation, and orchestration	Procurement of SaaS LLM services (explicitly prohibited — see §11)
Governance, traceability, audit, and validation of the agentic tooling itself (CSA / GAMP 5)	Clinical trial design, regulatory submission authoring beyond software evidence
Security, observability, and FinOps for the agent platform	General IT, HR, or non-engineering enterprise systems

1.3 Requirement ID scheme

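Each requirement has the form PREFIX-NNN, a MoSCoW priority, and a mapping to an ASMM-Med maturity level (L0–L5) and one or more of the eight dimensions (D1–D8). Where a requirement spans several levels, the mapping is written as a range (for example, L2→L4 for FR-003).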

Prefix	Domain	Primary owner
FR	Functional — what agents do across the SDLC	Eng + QA/RA
NFR	Non-functional — performance, scale, determinism	MLOps/Platform
REG	Regulatory & quality	QA/RA
DATA	Data & knowledge governance	MLOps + Security
MODEL	Model fleet requirements	MLOps
EVAL	Evaluation & assurance	QA/RA + Eng
SEC	Security & zero-trust	Security
COST	FinOps & cost guardrails	Finance + Platform
OPS	Observability & operations	Platform

MoSCoW priority: M = Must (release-blocking), S = Should, C = Could, W = Won't (this baseline).

Conventions: "shall" = mandatory; "should" = recommended; numeric thresholds are org-set placeholders, not vendor claims, and are owned by the named accountable function. Every requirement is verifiable by inspection, demonstration, test, or analysis (method noted in §12).
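The ID scheme and conventions above can be checked mechanically. The following is a minimal sketch, not a mandated schema: the `Requirement` class, the `parse_requirement_id` helper, and the three-digit zero-padded pattern are assumptions inferred from the IDs used in this document. It accepts a single maturity level only, so a range such as the L2→L4 entry on FR-003 would need an extension.

```python
import re
from dataclasses import dataclass

# Prefixes from the table in §1.3. CON is used only for the §11 constraints
# and is deliberately left out of this illustrative set.
PREFIXES = {"FR", "NFR", "REG", "DATA", "MODEL", "EVAL", "SEC", "COST", "OPS"}
PRIORITIES = {"M", "S", "C", "W"}              # MoSCoW
LEVELS = {f"L{i}" for i in range(6)}           # ASMM-Med L0..L5
DIMENSIONS = {f"D{i}" for i in range(1, 9)}    # D1..D8

_ID_PATTERN = re.compile(r"^([A-Z]+)-(\d{3})$")


def parse_requirement_id(req_id: str) -> tuple[str, int]:
    """Split 'FR-003' into ('FR', 3); raise ValueError on anything else."""
    match = _ID_PATTERN.match(req_id)
    if match is None or match.group(1) not in PREFIXES:
        raise ValueError(f"not a valid requirement ID: {req_id!r}")
    return match.group(1), int(match.group(2))


@dataclass(frozen=True)
class Requirement:
    req_id: str
    priority: str
    level: str
    dimensions: tuple[str, ...]

    def __post_init__(self) -> None:
        # Reject malformed records at construction time.
        parse_requirement_id(self.req_id)
        if self.priority not in PRIORITIES:
            raise ValueError(f"unknown MoSCoW priority: {self.priority!r}")
        if self.level not in LEVELS:
            raise ValueError(f"unknown ASMM-Med level: {self.level!r}")
        if not self.dimensions or not set(self.dimensions) <= DIMENSIONS:
            raise ValueError("dimensions must be a non-empty subset of D1..D8")
```

For example, `Requirement("NFR-006", "M", "L4", ("D4",))` constructs cleanly, while an ID such as `"FR-3"` is rejected.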

Governing principles (referenced throughout):

ID	Principle
P1	99.9% is a system property (Generate→Verify→Repair→Gate)
P2	Determinism wraps probabilism
P3	Risk-proportional autonomy by IEC 62304 class A/B/C
P4	Everything an agent does is evidence
P5	The harness is the product
P6	Cost per verified task
P7	Self-hosted, sovereign, reproducible

2. Stakeholders & Concerns

Stakeholder	Primary concerns	Key requirement families	Veto authority
Engineering (Dev)	Productivity, low latency, correct suggestions, low friction, not babysitting bad output	FR, NFR, OPS	No
QA / Regulatory Affairs (RA)	IEC 62304 conformance, traceability, validated tools, audit-ready records, escape rate	REG, EVAL, FR	Yes (release gate)
Security	Zero-trust, no data egress, supply-chain integrity, prompt-injection defense, secrets	SEC, DATA, MODEL	Yes (deploy gate)
MLOps / Platform	Reproducibility, model lifecycle, serving SLOs, multi-LoRA, GPU efficiency	MODEL, NFR, OPS, DATA	No
Clinical / Product	Safety-class correctness, requirement intent fidelity, time-to-market	FR, REG, EVAL	Yes (intent)
Finance	Cost-per-green-PR, GPU capex/opex, budget predictability	COST, OPS	Yes (budget)

A requirement that any veto-holding stakeholder rejects cannot be marked "Accepted" in §12.

3. Functional Requirements (FR)

All FRs are bounded by IEC 62304 safety class (A/B/C) and a defined review posture per P3. "Dual human control" = two qualified humans (author-reviewer separation) for Class C.

ID	Requirement (shall)	Safety-class bounding	Review posture	MoSCoW	ASMM-Med	Dim
FR-001	Agents shall assist requirements analysis: decompose, classify, detect ambiguity/conflict, and propose acceptance criteria from source specs (incl. PDF/diagram via Tier-V).	A/B/C: proposal only	Human approves all generated/edited requirements	M	L2	D3,D5
FR-002	Agents shall generate design support artifacts (interface specs, sequence/architecture sketches, design-decision rationale) traceable to requirements.	A/B: draft; C: draft + dual review	Human-of-record signs design	M	L3	D3,D5
FR-003	Agents shall perform code generation scoped to a spec/work item, emitting diffs, not silent edits.	A: auto-PR allowed; B: PR + 1 review; C: PR + dual review, no autonomous merge	Per class	M	L2→L4	D5
FR-004	Agents shall perform test generation (unit/integration/property/boundary) mapped to requirements and risk controls (ISO 14971).	All classes: tests are evidence; coverage intent is human-confirmed	Reviewer confirms adequacy	M	L2	D4,D5
FR-005	Agents shall generate documentation (design history, SDS, API docs, traceability narratives) from code+spec, marked AI-authored.	A/B/C: draft	QA/RA approves controlled docs	M	L2	D3
FR-006	Agents shall perform code review producing findings with severity, location, and rationale; review output is advisory, not a gate by itself (P1).	All classes	Augments, never replaces, human reviewer	M	L3	D4,D5
FR-007	Agents shall perform refactoring with behavior-preservation evidence (test pass, diff semantics) attached.	A: auto; B: review; C: dual review	Per class	S	L3	D5
FR-008	Agents shall perform code/dependency/platform migration with before/after equivalence evidence and rollback plan.	B/C: human-gated cutover	Migration plan signed by lead	S	L3	D5
FR-009	Agents shall generate and maintain traceability links (requirement↔design↔code↔test↔risk) and flag gaps.	All classes	QA/RA owns final trace matrix	M	L3	D1,D3,D4
FR-010	Every agent action shall produce a Generate→Verify→Repair→Gate record; an action with no verifier shall not pass the gate (P1).	All classes	System-enforced	M	L2	D4
FR-011	Agents shall abstain ("I cannot safely complete this") and escalate when confidence/coverage thresholds are unmet, rather than emit low-assurance output.	All classes	Escalation routed to human	M	L1	D4,D5
FR-012	Agents shall be orchestrated via the MCP tool plane and A2A for multi-agent workflows with declared, least-privilege tool scopes.	All classes	Policy-bounded	M	L3	D5
FR-013	Agents shall produce risk-analysis support (hazard identification candidates, traceable to ISO 14971), human-confirmed.	A/B/C: proposal only	Risk owner confirms	S	L3	D1
FR-014	The system shall support human-in-the-loop interrupt/override at any step, with reason captured.	All classes	Always available	M	L1	D5,D8
FR-015	Agents shall route tasks across the tiered fleet (Reflex/Worker/Reasoner/Multimodal/Embedding) by task class and cost (P6).	All classes	System-enforced	S	L3	D2,D5
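The class-bounded review posture of FR-003 can be expressed as a small, deterministic policy function. This is a sketch under stated assumptions: the names `SafetyClass`, `REQUIRED_HUMAN_APPROVALS`, and `may_merge` are hypothetical, Class A is assumed to need no human approval because FR-003 allows an auto-PR, and "dual review" is read as two approvers other than the author.

```python
from enum import Enum


class SafetyClass(Enum):
    A = "A"
    B = "B"
    C = "C"


# Minimum independent human approvals before merge, read from FR-003.
# Assumption: the Class A auto-PR may also merge without a human approval.
REQUIRED_HUMAN_APPROVALS = {SafetyClass.A: 0, SafetyClass.B: 1, SafetyClass.C: 2}


def may_merge(safety_class: SafetyClass, approvers: set[str],
              author: str, merged_by_agent: bool) -> bool:
    """Return True only if the merge respects the class-bounded review posture."""
    # Author-reviewer separation: an author's own approval never counts.
    independent = approvers - {author}
    if len(independent) < REQUIRED_HUMAN_APPROVALS[safety_class]:
        return False
    # SEC-010 / CON-004: an agent never merges Class C, whatever the approvals.
    if safety_class is SafetyClass.C and merged_by_agent:
        return False
    return True
```

Keeping the rule in plain code rather than in a prompt is in line with P2: the model proposes, and a deterministic check decides whether the change may proceed.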

4. Non-Functional Requirements (NFR)

ID	Requirement (shall)	Target (org-set placeholder)	MoSCoW	ASMM-Med	Dim
NFR-001	Inline/autocomplete (Tier-S) latency shall meet p95 budget.	p95 ≤ 300 ms	M	L1	D2,D7
NFR-002	Interactive agent step (Tier-M) first-token latency shall meet p95 budget.	p95 ≤ 2 s	M	L2	D2
NFR-003	Reasoning/planning task (Tier-L) end-to-end latency shall meet budget for batch-acceptable workloads.	p95 ≤ 60 s	S	L3	D2
NFR-004	Serving plane shall sustain org-wide concurrent throughput at peak.	≥ 1000 concurrent dev sessions	M	L2	D2,D7
NFR-005	Control/gate-path availability shall meet SLO.	≥ 99.9% monthly	M	L2	D7
NFR-006	Gate evaluation shall be deterministic and reproducible: identical inputs + pinned model/LoRA/seed/config → identical gate verdict (P2).	100% verdict reproducibility	M	L4	D4
NFR-007	Any generated artifact shall be reproducible from recorded {model digest, LoRA, prompt, context snapshot, params, seed} (P7).	100% replayable	M	L4	D2,D4
NFR-008	Platform shall scale to 1000+ developers via K8s horizontal scaling and KEDA autoscale; idle model pools shall scale to zero.	Linear cost-to-load to defined ceiling	M	L2	D2,D7
NFR-009	Multi-LoRA hot-swap shall serve N task-specialized adapters per base without per-adapter cold redeploy.	≥ defined adapters/base online	S	L3	D2
NFR-010	Probabilistic model calls shall be wrapped by deterministic harness logic (validators, parsers, policy) so non-determinism cannot reach a gate verdict (P2).	No stochastic path to verdict	M	L2	D4,D5
NFR-011	Recovery: on serving node/GPU failure, in-flight tasks shall be re-queued without evidence loss.	RTO ≤ defined; zero record loss	S	L2	D2,D7
NFR-012	The harness (not just the model) shall be versioned and treated as the product unit (P5); harness changes shall be release-controlled.	100% harness versioned	M	L3	D5
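NFR-007 lists the fields that must be recorded for an artifact to be replayable. One way to make such a record comparable across runs is to fingerprint it over a canonical serialization, as sketched below. The `ReplayRecord` class and its field names are hypothetical; only the set of recorded items comes from NFR-007.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReplayRecord:
    model_digest: str
    lora_digest: str
    prompt: str
    context_snapshot_id: str
    params: dict
    seed: int

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form of the record."""
        # Sorted keys and fixed separators make the serialization independent
        # of dict insertion order, so equal records hash to equal digests.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two replays are candidates for a bit-stable comparison only when their fingerprints match; a differing fingerprint means the inputs differed, so a differing output does not by itself indicate non-determinism.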

5. Regulatory & Quality Requirements (REG)

ID	Requirement (shall)	Regulatory anchor	MoSCoW	ASMM-Med	Dim
REG-001	The platform shall enforce the IEC 62304 software safety classification (A/B/C) as a first-class attribute gating autonomy (P3).	IEC 62304	M	L2	D1
REG-002	All AI-assisted lifecycle activities shall be validated under a risk-based Computer Software Assurance (CSA) approach proportional to intended use and risk.	FDA CSA, GAMP 5 (2nd ed)	M	L2	D1,D4
REG-003	Every agent action shall be recorded as 21 CFR Part 11-grade evidence: attributable, immutable, time-stamped, and replayable (P4).	21 CFR Part 11	M	L2	D1,D6
REG-004	The platform shall maintain end-to-end traceability (user need → requirement → design → code → test → risk control) and surface coverage gaps.	IEC 62304, ISO 13485/QMSR	M	L3	D1,D3
REG-005	Each AI tool used in the lifecycle shall be subject to tool validation / qualification with documented intended use, acceptance, and re-validation triggers.	CSA, GAMP 5, 21 CFR 820 (QMSR, eff. Feb 2026)	M	L2	D1,D4
REG-006	Risk management activities shall integrate ISO 14971; AI-proposed hazards/controls shall be human-confirmed before becoming controlled records.	ISO 14971	M	L3	D1
REG-007	The AI management system governing the fleet shall conform to a recognized AI management system standard (ISO/IEC 42001).	ISO/IEC 42001	S	L3	D1
REG-008	AI-enabled-device change management shall support a Predetermined Change Control Plan (PCCP) where models influence device behavior.	FDA AI-enabled device guidance + PCCP	S	L4	D1
REG-009	The platform shall meet applicable EU MDR / EU AI Act obligations for high-risk AI used in device engineering.	EU MDR, EU AI Act	S	L3	D1
REG-010	All controlled records shall have defined retention, version, and signature controls under the QMS.	ISO 13485 / FDA QMSR	M	L2	D1
REG-011	Human accountability shall be preserved: a named qualified human of record shall sign every controlled output; AI is never the signer.	21 CFR Part 11, IEC 62304	M	L1	D1,D8

6. Data & Knowledge Requirements (DATA)

ID	Requirement (shall)	MoSCoW	ASMM-Med	Dim
DATA-001	Training, fine-tuning, and RAG corpora shall be governed: cataloged, licensed, owned, and approved before use.	M	L2	D3
DATA-002	The platform shall enforce no external egress of source, specs, or model traffic; all inference, training, and storage are self-hosted (P7).	M	L1	D3,D6
DATA-003	PII/PHI shall be detected (Tier-S redactor) and excluded/masked from corpora, prompts, logs, and evidence stores unless explicitly authorized and controlled.	M	L1	D3,D6
DATA-004	Every datum used by an agent shall carry provenance (source, version, hash, retrieval timestamp) recorded in the action evidence.	M	L2	D3,D4
DATA-005	Knowledge bases shall be versioned and snapshot-able so a retrieval context is reproducible for replay (links NFR-007).	M	L3	D3
DATA-006	Corpus and evidence retention shall follow QMS record-retention policy; deletion shall be controlled and logged.	M	L2	D1,D3
DATA-007	Corpus quality shall be monitored for drift, staleness, and poisoning; suspect sources shall be quarantined.	S	L4	D3,D6
DATA-008	Embeddings/reranking (Tier-E) indices shall be access-controlled per project and safety class.	S	L3	D3,D6
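DATA-004 names four provenance fields: source, version, hash, and retrieval timestamp. A minimal sketch of such a record follows. The `Provenance` class and the `make_provenance` and `content_matches` helpers are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    source: str        # where the datum came from, e.g. a document path
    version: str       # version or revision of the source
    sha256: str        # content hash at retrieval time
    retrieved_at: str  # ISO 8601 timestamp, normalized to UTC


def make_provenance(source: str, version: str, content: bytes,
                    retrieved_at: datetime) -> Provenance:
    """Build the provenance record attached to an agent action."""
    if retrieved_at.tzinfo is None:
        # Naive timestamps are ambiguous in an audit record.
        raise ValueError("retrieved_at must be timezone-aware")
    return Provenance(
        source=source,
        version=version,
        sha256=hashlib.sha256(content).hexdigest(),
        retrieved_at=retrieved_at.astimezone(timezone.utc).isoformat(),
    )


def content_matches(record: Provenance, content: bytes) -> bool:
    """Check on replay that the content is the one originally retrieved."""
    return hashlib.sha256(content).hexdigest() == record.sha256
```

On replay (DATA-005, NFR-007), `content_matches` gives a cheap check that the snapshot being replayed is byte-identical to what the agent originally saw.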

7. Model Requirements (MODEL)

ID	Requirement (shall)	MoSCoW	ASMM-Med	Dim
MODEL-001	Only self-hosted open-weight models shall be used; no SaaS LLM API (Claude/OpenAI/Gemini or equivalent) in any lifecycle path (P7, §11).	M	L1	D2
MODEL-002	All fleet models shall be fine-tunable in-house (full or PEFT/LoRA) on governed corpora.	M	L2	D2
MODEL-003	Every model and adapter shall be signed and version-pinned (cosign), registered in MLflow with an immutable digest.	M	L2	D2,D6
MODEL-004	Serving shall support multi-LoRA hot-swap across task-specialized adapters on shared base weights (vLLM/Triton+TensorRT-LLM/KServe).	M	L3	D2
MODEL-005	Models shall support calibrated abstention — emitting a refusal/low-confidence signal the harness can act on (links FR-011).	M	L2	D2,D4
MODEL-006	The fleet shall be tiered (Tier-S Reflex 1–8B, Tier-M Worker 14–34B, Tier-L Reasoner 70B+/MoE, Tier-V Multimodal, Tier-E Embedding/Rerank) with documented task→tier routing.	M	L3	D2,D5
MODEL-007	Multimodal capability (Tier-V) shall ingest diagrams, imaging, PDF specs, and UI for FR-001/FR-002.	S	L3	D2,D3
MODEL-008	Each model version shall pass acceptance evaluation before promotion to a serving channel (links EVAL-001).	M	L4	D2,D4
MODEL-009	Model lineage (base → fine-tune dataset → adapter → deployed digest) shall be fully reproducible and recorded (P7).	M	L4	D2
MODEL-010	Quantization/optimization (TensorRT-LLM) shall not degrade a model below its gated acceptance thresholds without re-validation.	S	L4	D2,D4
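MODEL-003, MODEL-008, and MODEL-010 together describe a promotion precondition: signed, registered, and at or above the gated acceptance thresholds. The sketch below shows one possible shape for that check. `ModelCandidate`, `promotion_blockers`, and the metric names used in the usage note are hypothetical, and signature verification is assumed to happen elsewhere with only its result passed in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCandidate:
    digest: str               # immutable registry digest (MODEL-003)
    signature_verified: bool  # result of an external signature check
    registered: bool          # present in the model registry
    eval_scores: dict         # metric name -> score on the golden set


def promotion_blockers(candidate: ModelCandidate, thresholds: dict) -> list[str]:
    """Return the reasons a candidate cannot be promoted; empty means eligible."""
    blockers = []
    if not candidate.signature_verified:
        blockers.append("signature not verified")
    if not candidate.registered:
        blockers.append("not registered")
    # Every gated metric must be present and at or above its threshold.
    for metric, minimum in sorted(thresholds.items()):
        score = candidate.eval_scores.get(metric)
        if score is None:
            blockers.append(f"{metric}: no evaluation result")
        elif score < minimum:
            blockers.append(f"{metric}: {score} below threshold {minimum}")
    return blockers
```

A quantized build is treated like any other candidate: if quantization pushes a score such as a hypothetical `unit_test_pass_rate` below its threshold, the blocker list is non-empty and the build stays out of the serving channel until re-validated.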

8. Evaluation & Assurance Requirements (EVAL)

ID	Requirement (shall)	Target (org-set placeholder)	MoSCoW	ASMM-Med	Dim
EVAL-001	Release gates shall be deterministic and produce a binary, reproducible verdict (P1, P2).	100% reproducible	M	L4	D4
EVAL-002	Release-gate acceptance correctness shall meet the org threshold as a system property, achieved via Generate→Verify→Repair→Gate rather than by any single model (P1).	≥ 99.9%	M	L4	D4
EVAL-003	Escape rate (defects passing the gate into controlled artifacts) shall be measured and bounded.	≤ org-set ceiling	M	L4	D4
EVAL-004	Golden datasets per task/safety-class shall exist, be versioned, and gate model/harness promotion.	100% coverage of gated tasks	M	L4	D4
EVAL-005	An LLM shall never be the sole gate; gates shall combine deterministic verifiers (build/test/static analysis/policy) with optional model judgment as advisory only (§11).	Enforced	M	L2	D4
EVAL-006	Gate verifiers shall include compile/build, test execution, static analysis, and policy (OPA/Gatekeeper) checks.	All present	M	L3	D1,D4
EVAL-007	Evaluation results shall be evidence (P4): stored immutable, attributable to model/harness digests, replayable.	100%	M	L4	D4
EVAL-008	Continuous evaluation shall detect model/behavioral drift post-deployment and trigger re-validation.	Monitored	S	L5	D4
EVAL-009	Repair loops shall be bounded (max iterations/budget); on exhaustion the task shall escalate to human (links FR-011, COST).	Bounded	M	L2	D4,D7
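FR-010, EVAL-005, and EVAL-009 combine into one control loop: verify with deterministic checks, repair a bounded number of times, then pass or escalate. The sketch below illustrates that loop. `run_gate`, `GateRecord`, and the verdict strings are hypothetical, and the verifiers stand in for real build, test, static-analysis, and policy checks.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Verifier = Callable[[str], bool]      # deterministic check on an artifact
Repair = Callable[[str, dict], str]   # returns a revised artifact


@dataclass
class GateRecord:
    verdict: str                          # "PASS", "FAIL" or "ESCALATE"
    attempts: int                         # verification passes performed
    results: dict = field(default_factory=dict)
    advisory_note: Optional[str] = None   # model opinion; never read by the gate


def run_gate(artifact: str, verifiers: dict, repair: Repair,
             max_repairs: int, advisory_note: Optional[str] = None) -> GateRecord:
    """Verify, repair up to max_repairs times, then pass or escalate."""
    if max_repairs < 0:
        raise ValueError("max_repairs must be non-negative")
    if not verifiers:
        # FR-010: an action with no verifier shall not pass the gate.
        return GateRecord("FAIL", 0, {}, advisory_note)
    results: dict = {}
    for attempt in range(1, max_repairs + 2):   # first try plus bounded repairs
        results = {name: check(artifact) for name, check in verifiers.items()}
        if all(results.values()):
            return GateRecord("PASS", attempt, results, advisory_note)
        if attempt <= max_repairs:
            artifact = repair(artifact, results)
    # EVAL-009: the repair budget is exhausted, so hand over to a human.
    return GateRecord("ESCALATE", max_repairs + 1, results, advisory_note)
```

The verdict depends only on the verifier results. The advisory note is stored in the record for reviewers but has no path into the decision, which is the property EVAL-005 and NFR-010 ask for.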

9. Security Requirements (SEC)

ID	Requirement (shall)	Mechanism	MoSCoW	ASMM-Med	Dim
SEC-001	The platform shall be zero-trust: every workload identity authenticated and authorized per call.	Istio mesh + SPIFFE/SPIRE	M	L2	D6
SEC-002	Agent code execution shall run in isolated sandboxes with no ambient credentials or network.	gVisor/Kata	M	L2	D6
SEC-003	Supply chain shall be signed and attested: artifacts, models, containers via Sigstore/cosign + SLSA provenance + SBOM.	cosign/SLSA/SBOM	M	L2	D6
SEC-004	The platform shall implement prompt-injection and tool-abuse defenses (input sanitization, tool allow-lists, output schema validation, least privilege).	MCP scopes + validators	M	L3	D5,D6
SEC-005	Secrets shall be managed centrally and never appear in prompts, logs, or evidence.	HashiCorp Vault	M	L1	D6
SEC-006	The audit/evidence store shall be immutable and tamper-evident (append-only, hash-chained) (P4).	Append-only + cosign	M	L2	D1,D6
SEC-007	Policy shall be enforced at admission and runtime via a policy server (OPA/Gatekeeper); no policy bypass path.	OPA/Gatekeeper	M	L2	D6
SEC-008	The platform shall meet applicable medical-device cybersecurity obligations and enforce network segmentation.	IEC 62443, FD&C Act §524B	S	L3	D6
SEC-009	Tool plane (MCP) and multi-agent (A2A) calls shall enforce least-privilege, declared scopes, audited per invocation.	MCP/A2A policy	M	L3	D5,D6
SEC-010	Agents shall never autonomously merge or release Class C changes (links FR-003, §11).	Branch policy + gate	M	L3	D1,D6
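SEC-006 asks for an append-only, hash-chained evidence store. The core idea is that each entry commits to the hash of its predecessor, so altering any earlier entry breaks verification. The sketch below shows the chaining only: `EvidenceLog`, `Entry`, and the all-zero genesis value are hypothetical, and the storage, signing, attribution, and timestamp controls that REG-003 also requires are out of scope here.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


@dataclass(frozen=True)
class Entry:
    index: int
    payload: dict
    prev_hash: str
    entry_hash: str


def _digest(index: int, payload: dict, prev_hash: str) -> str:
    # Canonical JSON so the same entry always hashes to the same value.
    body = json.dumps({"index": index, "payload": payload, "prev": prev_hash},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class EvidenceLog:
    def __init__(self) -> None:
        self._entries: list[Entry] = []

    def append(self, payload: dict) -> Entry:
        """Add an entry that commits to the hash of the previous one."""
        prev_hash = self._entries[-1].entry_hash if self._entries else GENESIS
        index = len(self._entries)
        entry = Entry(index, dict(payload), prev_hash,
                      _digest(index, payload, prev_hash))
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check each link back to the genesis value."""
        prev_hash = GENESIS
        for position, entry in enumerate(self._entries):
            if entry.index != position or entry.prev_hash != prev_hash:
                return False
            if entry.entry_hash != _digest(entry.index, entry.payload, entry.prev_hash):
                return False
            prev_hash = entry.entry_hash
        return True
```

An in-memory list is not tamper-evident by itself; the chain only makes tampering detectable once the latest hash is anchored somewhere the writer cannot change, for example in a signed checkpoint.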

10. Observability & Cost Requirements (OPS / COST)

ID	Requirement (shall)	Target (org-set placeholder)	MoSCoW	ASMM-Med	Dim
OPS-001	Every agent action and model call shall emit OpenTelemetry traces correlatable end-to-end (request→tools→model→gate→evidence).	100% traced	M	L2	D7
OPS-002	The platform shall expose GPU utilization, queue depth, and tokens/sec per tier and per tenant.	Dashboards live	M	L2	D7
OPS-003	Drift, error-rate, abstention-rate, and escape-rate shall be observable in near-real-time.	SLO dashboards	S	L4	D4,D7
COST-001	The platform shall compute cost-per-green-PR (cost per verified task) as the primary efficiency KPI (P6).	Reported per team	M	L3	D7
COST-002	Budget guardrails shall enforce per-team/per-task token & GPU ceilings; overruns throttle or escalate, never silently spend.	Hard ceilings	M	L2	D7
COST-003	Routing shall prefer the cheapest tier that meets the quality gate (links FR-015, MODEL-006).	Enforced	S	L3	D2,D7
COST-004	Idle GPU pools shall scale to zero; cost attribution shall be tenant-accurate.	KEDA + chargeback	S	L2	D7
COST-005	Repair/retry loops shall be cost-bounded (links EVAL-009); runaway loops shall halt and escalate.	Bounded	M	L2	D4,D7
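COST-001 and COST-003 both reduce to simple arithmetic once the inputs are defined. The sketch below makes two assumptions that this document does not settle: the spend on failed or escalated tasks is charged to the numerator of cost-per-green-PR, and each tier has a known expected pass rate against the quality gate. The names `TaskRun`, `Tier`, `cost_per_green_pr`, and `cheapest_adequate_tier` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskRun:
    cost: float        # total spend on the task, all attempts included
    gate_passed: bool  # True only if the deterministic gate returned a pass


@dataclass(frozen=True)
class Tier:
    name: str
    unit_cost: float           # relative cost of one task on this tier
    expected_pass_rate: float  # measured gate pass rate for the task class


def cost_per_green_pr(runs: list[TaskRun]) -> Optional[float]:
    """Total spend divided by the number of gate-passing tasks."""
    green = sum(1 for run in runs if run.gate_passed)
    if green == 0:
        return None  # undefined when nothing passed; report it, do not divide
    # Assumption: failed runs still cost money, so they stay in the numerator.
    return sum(run.cost for run in runs) / green


def cheapest_adequate_tier(tiers: list[Tier], min_pass_rate: float) -> Optional[Tier]:
    """COST-003: the cheapest tier whose pass rate meets the quality bar."""
    adequate = [t for t in tiers if t.expected_pass_rate >= min_pass_rate]
    return min(adequate, key=lambda t: t.unit_cost, default=None)
```

With three runs costing 2.0, 3.0, and 5.0, of which the first and last pass the gate, the KPI is 10.0 / 2 = 5.0. Counting only the two green runs would give 3.5 and hide the cost of the failed attempt.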

11. Constraints & Explicit Non-Goals

11.1 Hard constraints (shall)

ID	Constraint
CON-001	No SaaS/hosted LLM APIs (Claude, OpenAI, Gemini, or equivalent) in any lifecycle path. Open-weight, self-hosted only.
CON-002	No external network egress of source, specs, PHI/PII, or model traffic.
CON-003	No LLM as a sole gate: a deterministic verifier set must back every release decision (EVAL-005).
CON-004	No autonomous merge or release of Class C software by an agent; Class C requires dual human control (SEC-010, FR-003).
CON-005	No agent action without replayable Part 11-grade evidence (REG-003).
CON-006	No model/adapter deployment without signing, registry entry, and acceptance evaluation (MODEL-003, MODEL-008).
CON-007	No non-deterministic path may reach a gate verdict (NFR-010, P2).

11.2 Explicit non-goals (this baseline)

Non-goal	Rationale
Fully unattended Class C autonomy	Prohibited by P3; revisit only with regulatory precedent.
"Vibe coding" / unbounded freeform generation	Counter to spec-driven, evidence-bound philosophy.
General-purpose chatbot assistant outside the SDLC	Out of scope; no validation basis.
Vendor-managed model hosting	Conflicts with P7 sovereignty.
Replacing human reviewers/signers	AI augments; humans remain accountable (REG-011).

12. Acceptance Criteria Summary

Verification methods: I = Inspection, D = Demonstration, T = Test, A = Analysis.

Requirement set	Acceptance criterion (must pass for baseline sign-off)	Method	Priority
FR-001…015	Each SDLC activity demonstrated with safety-class bounding and correct review posture; abstention and human override exercised.	D, T	M
NFR-001…012	Latency/throughput/availability SLOs met under load test; gate verdict + artifact reproducibility shown bit-stable on replay.	T, A	M
REG-001…011	Traceability matrix complete with no orphan links; CSA/tool-validation dossiers present; Part 11 evidence replayed; human-of-record signatures verified.	I, A	M
DATA-001…008	No-egress proven by network policy test; PHI/PII redaction validated; provenance present on sampled actions; corpus versioning replayable.	T, I	M
MODEL-001…010	Fleet confirmed open-weight/self-hosted; signatures and registry digests verified; multi-LoRA hot-swap demonstrated; abstention signal observed; lineage reproducible.	I, D, T	M
EVAL-001…009	Gate determinism proven (identical inputs → identical verdict); system acceptance ≥ 99.9% on golden sets; escape rate within ceiling; no LLM-sole-gate path exists.	T, A	M
SEC-001…010	Zero-trust identity enforced; sandbox isolation tested; SLSA/SBOM/cosign present; prompt-injection suite passed; Class C autonomous-merge attempt blocked.	T, I	M
OPS-001…003, COST-001…005	End-to-end traces present; cost-per-green-PR reported; budget guardrail throttle demonstrated; scale-to-zero and chargeback verified.	D, T	M
CON-001…007	Each hard constraint shown enforced (negative tests: egress blocked, SaaS call blocked, LLM-sole-gate rejected, Class C auto-merge rejected).	T	M

Baseline sign-off requires all Must requirements Accepted with no open veto from any §2 veto-holder (QA/RA, Security, Finance, Clinical/Product), and traceability of every requirement to at least one verification record per P4.
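The sign-off rule above is a conjunction of three conditions, which makes it straightforward to check over a requirements register. The sketch below is illustrative only: `RequirementStatus` and `signoff_blockers` are hypothetical names, and an open veto on any requirement, not only a Must, is assumed to block sign-off.

```python
from dataclasses import dataclass

# Veto-holding stakeholders from §2.
VETO_HOLDERS = frozenset({"QA/RA", "Security", "Clinical/Product", "Finance"})


@dataclass(frozen=True)
class RequirementStatus:
    req_id: str
    priority: str                          # "M", "S", "C" or "W"
    accepted: bool
    verification_records: tuple[str, ...]  # IDs of I/D/T/A records
    open_vetoes: frozenset = frozenset()


def signoff_blockers(statuses: list[RequirementStatus]) -> list[str]:
    """List everything that prevents baseline sign-off; empty means ready."""
    blockers = []
    for status in statuses:
        unknown = status.open_vetoes - VETO_HOLDERS
        if unknown:
            raise ValueError(f"{status.req_id}: not a veto-holder: {sorted(unknown)}")
        if status.priority == "M" and not status.accepted:
            blockers.append(f"{status.req_id}: Must requirement not Accepted")
        if status.open_vetoes:
            holders = ", ".join(sorted(status.open_vetoes))
            blockers.append(f"{status.req_id}: open veto from {holders}")
        if not status.verification_records:
            # P4: every requirement traces to at least one verification record.
            blockers.append(f"{status.req_id}: no verification record")
    return blockers
```

Returning the full list of blockers rather than a single boolean gives each veto-holder a concrete item to close before the baseline can be signed.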

End of 01-requirements.md — proceed to 02-maturity-model.md for the ASMM-Med level definitions that scope phased rollout in 09-adoption-roadmap.md.