
01 — Requirements (Normative "Shall" Document)

Project: Agentic-Native SDLC for Regulated Medical Device Engineering
Status: Baseline v1.0 · Date context: May 2026
Classification: Internal Engineering / Quality Reference
Related docs: 02-maturity-model.md · 03-reference-architecture.md · 04-model-strategy-and-finetuning.md · 05-evaluation-and-validation.md · 06-agentic-workflows.md · 07-security-and-compliance.md · 08-token-and-gpu-economics.md · 09-adoption-roadmap.md


1. Purpose, Scope, and How to Read Requirement IDs

1.1 Purpose

This document is the normative requirements baseline for an agentic-native software development lifecycle (SDLC) serving a 1000+ developer medical-device engineering organization. It defines, in testable "shall" form, what the platform, its model fleet, and its agents must do. It is the contract against which the architecture (03-reference-architecture.md), evaluation system (05-evaluation-and-validation.md), and compliance posture (07-security-and-compliance.md) are judged.

1.2 Scope

| In scope | Out of scope |
| --- | --- |
| AI-assisted and AI-autonomous activities across the IEC 62304 software lifecycle (requirements → design → code → test → docs → review → maintenance) | Hardware design, electrical/mechanical CAD outside software-controlled subsystems |
| Self-hosted open-weight model fleet, serving, fine-tuning, evaluation, and orchestration | Procurement of SaaS LLM services (explicitly prohibited; see §11) |
| Governance, traceability, audit, and validation of the agentic tooling itself (CSA / GAMP 5) | Clinical trial design, regulatory submission authoring beyond software evidence |
| Security, observability, and FinOps for the agent platform | General IT, HR, or non-engineering enterprise systems |

1.3 Requirement ID scheme

Each requirement has the form PREFIX-NNN, a MoSCoW priority, and a mapping to an ASMM-Med maturity level (L0–L5, or a range such as L2→L4) and one or more of the eight dimensions (D1–D8).

| Prefix | Domain | Primary owner |
| --- | --- | --- |
| FR | Functional: what agents do across the SDLC | Eng + QA/RA |
| NFR | Non-functional: performance, scale, determinism | MLOps/Platform |
| REG | Regulatory & quality | QA/RA |
| DATA | Data & knowledge governance | MLOps + Security |
| MODEL | Model fleet requirements | MLOps |
| EVAL | Evaluation & assurance | QA/RA + Eng |
| SEC | Security & zero-trust | Security |
| COST | FinOps & cost guardrails | Finance + Platform |
| OPS | Observability & operations | Platform |

MoSCoW priority: M = Must (release-blocking), S = Should, C = Could, W = Won't (this baseline).

Conventions: "shall" = mandatory; "should" = recommended; numeric thresholds are org-set placeholders, not vendor claims, and are owned by the named accountable function. Every requirement is verifiable by inspection, demonstration, test, or analysis (method noted in §12).
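The ID scheme lends itself to a mechanical lint of the requirement tables. The following is a minimal sketch, assuming requirements are held as simple records; the `Requirement` class and `validate` function are hypothetical names, not part of any tool named in this document. The prefix set is the nine prefixes in the table above, and the level pattern accepts a range because FR-003 in §3 maps to L2→L4.

```python
import re
from dataclasses import dataclass

# Prefixes and priorities as listed in section 1.3.
PREFIXES = {"FR", "NFR", "REG", "DATA", "MODEL", "EVAL", "SEC", "COST", "OPS"}
PRIORITIES = {"M", "S", "C", "W"}
ID_PATTERN = re.compile(r"^([A-Z]+)-(\d{3})$")
LEVEL_PATTERN = re.compile(r"L[0-5](→L[0-5])?")
DIMENSION_PATTERN = re.compile(r"D[1-8]")


@dataclass(frozen=True)
class Requirement:
    """Hypothetical record for one row of a requirement table."""
    req_id: str
    priority: str
    level: str
    dimensions: tuple


def validate(req: Requirement) -> list:
    """Return human-readable scheme violations (an empty list means valid)."""
    errors = []
    match = ID_PATTERN.match(req.req_id)
    if not match or match.group(1) not in PREFIXES:
        errors.append(f"{req.req_id}: ID must be PREFIX-NNN with a known prefix")
    if req.priority not in PRIORITIES:
        errors.append(f"{req.req_id}: priority must be one of M/S/C/W")
    if not LEVEL_PATTERN.fullmatch(req.level):
        errors.append(f"{req.req_id}: level must be L0..L5 or a range")
    if not req.dimensions or any(
        not DIMENSION_PATTERN.fullmatch(d) for d in req.dimensions
    ):
        errors.append(f"{req.req_id}: needs one or more dimensions D1..D8")
    return errors
```

For example, `validate(Requirement("FR-003", "M", "L2→L4", ("D5",)))` returns an empty list, while an unknown prefix or an out-of-range level yields one message per violation.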

Governing principles (referenced throughout): P1 99.9% is a system property (Generate→Verify→Repair→Gate); P2 determinism wraps probabilism; P3 risk-proportional autonomy by IEC 62304 class A/B/C; P4 everything an agent does is evidence; P5 the harness is the product; P6 cost per verified task; P7 self-hosted, sovereign, reproducible.


2. Stakeholders & Concerns

| Stakeholder | Primary concerns | Key requirement families | Veto authority |
| --- | --- | --- | --- |
| Engineering (Dev) | Productivity, low latency, correct suggestions, low friction, not babysitting bad output | FR, NFR, OPS | No |
| QA / Regulatory Affairs (RA) | IEC 62304 conformance, traceability, validated tools, audit-ready records, escape rate | REG, EVAL, FR | Yes (release gate) |
| Security | Zero-trust, no data egress, supply-chain integrity, prompt-injection defense, secrets | SEC, DATA, MODEL | Yes (deploy gate) |
| MLOps / Platform | Reproducibility, model lifecycle, serving SLOs, multi-LoRA, GPU efficiency | MODEL, NFR, OPS, DATA | No |
| Clinical / Product | Safety-class correctness, requirement intent fidelity, time-to-market | FR, REG, EVAL | Yes (intent) |
| Finance | Cost-per-green-PR, GPU capex/opex, budget predictability | COST, OPS | Yes (budget) |

A requirement that any veto-holding stakeholder rejects cannot be marked "Accepted" in §12.


3. Functional Requirements (FR)

All FRs are bounded by IEC 62304 safety class (A/B/C) and a defined review posture per P3. "Dual human control" = two qualified humans (author-reviewer separation) for Class C.

| ID | Requirement (shall) | Safety-class bounding | Review posture | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- | --- | --- |
| FR-001 | Agents shall assist requirements analysis: decompose, classify, detect ambiguity/conflict, and propose acceptance criteria from source specs (incl. PDF/diagram via Tier-V). | A/B/C: proposal only | Human approves all generated/edited requirements | M | L2 | D3,D5 |
| FR-002 | Agents shall generate design support artifacts (interface specs, sequence/architecture sketches, design-decision rationale) traceable to requirements. | A/B: draft; C: draft + dual review | Human-of-record signs design | M | L3 | D3,D5 |
| FR-003 | Agents shall perform code generation scoped to a spec/work item, emitting diffs, not silent edits. | A: auto-PR allowed; B: PR + 1 review; C: PR + dual review, no autonomous merge | Per class | M | L2→L4 | D5 |
| FR-004 | Agents shall perform test generation (unit/integration/property/boundary) mapped to requirements and risk controls (ISO 14971). | All classes: tests are evidence, human-confirmed coverage intent | Reviewer confirms adequacy | M | L2 | D4,D5 |
| FR-005 | Agents shall generate documentation (design history, SDS, API docs, traceability narratives) from code+spec, marked AI-authored. | A/B/C: draft | QA/RA approves controlled docs | M | L2 | D3 |
| FR-006 | Agents shall perform code review producing findings with severity, location, and rationale; review output is advisory, not a gate by itself (P1). | All classes | Augments, never replaces, human reviewer | M | L3 | D4,D5 |
| FR-007 | Agents shall perform refactoring with behavior-preservation evidence (test pass, diff semantics) attached. | A: auto; B: review; C: dual review | Per class | S | L3 | D5 |
| FR-008 | Agents shall perform code/dependency/platform migration with before/after equivalence evidence and rollback plan. | B/C: human-gated cutover | Migration plan signed by lead | S | L3 | D5 |
| FR-009 | Agents shall generate and maintain traceability links (requirement↔design↔code↔test↔risk) and flag gaps. | All classes | QA/RA owns final trace matrix | M | L3 | D1,D3,D4 |
| FR-010 | Every agent action shall produce a Generate→Verify→Repair→Gate record; an action with no verifier shall not pass the gate (P1). | All classes | System-enforced | M | L2 | D4 |
| FR-011 | Agents shall abstain ("I cannot safely complete this") and escalate when confidence/coverage thresholds are unmet, rather than emit low-assurance output. | All classes | Escalation routed to human | M | L1 | D4,D5 |
| FR-012 | Agents shall be orchestrated via the MCP tool plane and A2A for multi-agent workflows with declared, least-privilege tool scopes. | All classes | Policy-bounded | M | L3 | D5 |
| FR-013 | Agents shall produce risk-analysis support (hazard identification candidates, traceable to ISO 14971), human-confirmed. | A/B/C: proposal only | Risk owner confirms | S | L3 | D1 |
| FR-014 | The system shall support human-in-the-loop interrupt/override at any step, with reason captured. | All classes | Always available | M | L1 | D5,D8 |
| FR-015 | Agents shall route tasks across the tiered fleet (Reflex/Worker/Reasoner/Multimodal/Embedding) by task class and cost (P6). | All classes | System-enforced | S | L3 | D2,D5 |
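The class-dependent posture in FR-003 can be expressed as a small policy table that a merge gate consults. The sketch below is illustrative only: `ReviewPolicy`, `CODE_GEN_POLICY`, and `may_merge` are hypothetical names, and two readings are assumptions rather than statements from this document: that "auto-PR allowed" for Class A means zero required human reviews, and that an agent may complete a Class B merge once the single review is recorded. Only the Class C prohibition on autonomous merge is stated outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewPolicy:
    required_human_reviews: int
    agent_may_merge: bool


# Keys are IEC 62304 software safety classes; values follow FR-003.
CODE_GEN_POLICY = {
    "A": ReviewPolicy(required_human_reviews=0, agent_may_merge=True),  # assumption
    "B": ReviewPolicy(required_human_reviews=1, agent_may_merge=True),  # assumption
    "C": ReviewPolicy(required_human_reviews=2, agent_may_merge=False),
}


def may_merge(safety_class: str, human_approvals: int, merged_by_agent: bool) -> bool:
    """Return True only if the merge satisfies the policy for the class.

    An unknown class raises KeyError, so the gate fails closed.
    """
    policy = CODE_GEN_POLICY[safety_class]
    if human_approvals < policy.required_human_reviews:
        return False
    if merged_by_agent and not policy.agent_may_merge:
        return False
    return True
```

Under these assumptions, `may_merge("C", 2, merged_by_agent=True)` is rejected even with both reviews present, which is the behavior SEC-010 and CON-004 require.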

4. Non-Functional Requirements (NFR)

| ID | Requirement (shall) | Target (org-set placeholder) | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- | --- |
| NFR-001 | Inline/autocomplete (Tier-S) latency shall meet p95 budget. | p95 ≤ 300 ms | M | L1 | D2,D7 |
| NFR-002 | Interactive agent step (Tier-M) first-token latency shall meet p95 budget. | p95 ≤ 2 s | M | L2 | D2 |
| NFR-003 | Reasoning/planning task (Tier-L) end-to-end latency shall meet budget for batch-acceptable workloads. | p95 ≤ 60 s | S | L3 | D2 |
| NFR-004 | Serving plane shall sustain org-wide concurrent throughput at peak. | ≥ 1000 concurrent dev sessions | M | L2 | D2,D7 |
| NFR-005 | Control/gate-path availability shall meet SLO. | ≥ 99.9% monthly | M | L2 | D7 |
| NFR-006 | Gate evaluation shall be deterministic and reproducible: identical inputs + pinned model/LoRA/seed/config → identical gate verdict (P2). | 100% verdict reproducibility | M | L4 | D4 |
| NFR-007 | Any generated artifact shall be reproducible from recorded {model digest, LoRA, prompt, context snapshot, params, seed} (P7). | 100% replayable | M | L4 | D2,D4 |
| NFR-008 | Platform shall scale to 1000+ developers via K8s horizontal scaling and KEDA autoscale; idle model pools shall scale to zero. | Linear cost-to-load to defined ceiling | M | L2 | D2,D7 |
| NFR-009 | Multi-LoRA hot-swap shall serve N task-specialized adapters per base without per-adapter cold redeploy. | ≥ defined adapters/base online | S | L3 | D2 |
| NFR-010 | Probabilistic model calls shall be wrapped by deterministic harness logic (validators, parsers, policy) so non-determinism cannot reach a gate verdict (P2). | No stochastic path to verdict | M | L2 | D4,D5 |
| NFR-011 | Recovery: on serving node/GPU failure, in-flight tasks shall be re-queued without evidence loss. | RTO ≤ defined; zero record loss | S | L2 | D2,D7 |
| NFR-012 | The harness (not just the model) shall be versioned and treated as the product unit (P5); harness changes shall be release-controlled. | 100% harness versioned | M | L3 | D5 |
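NFR-007 names the tuple from which an artifact must be reproducible. One way to make that tuple checkable is to derive a stable key from it, so that two runs can be compared by key before comparing outputs. This is a sketch under assumptions: the function name `replay_key` and the field names are hypothetical, and canonical JSON plus SHA-256 is one reasonable choice, not a mechanism this document prescribes.

```python
import hashlib
import json


def replay_key(model_digest: str, lora: str, prompt: str,
               context_snapshot: str, params: dict, seed: int) -> str:
    """Derive a stable SHA-256 key from the NFR-007 replay tuple.

    Sorting keys and fixing separators makes the JSON canonical, so the
    key does not depend on dict insertion order.
    """
    record = {
        "model_digest": model_digest,
        "lora": lora,
        "prompt": prompt,
        "context_snapshot": context_snapshot,
        "params": params,
        "seed": seed,
    }
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two calls with the same inputs yield the same key regardless of the order of entries in `params`; changing any field, including the seed, yields a different key. The key identifies the inputs only; bit-stable output on replay still has to be demonstrated by test (§12).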

5. Regulatory & Quality Requirements (REG)

| ID | Requirement (shall) | Regulatory anchor | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- | --- |
| REG-001 | The platform shall enforce the IEC 62304 software safety classification (A/B/C) as a first-class attribute gating autonomy (P3). | IEC 62304 | M | L2 | D1 |
| REG-002 | All AI-assisted lifecycle activities shall be validated under a risk-based Computer Software Assurance (CSA) approach proportional to intended use and risk. | FDA CSA, GAMP 5 (2nd ed) | M | L2 | D1,D4 |
| REG-003 | Every agent action shall be recorded as 21 CFR Part 11-grade evidence: attributable, immutable, time-stamped, and replayable (P4). | 21 CFR Part 11 | M | L2 | D1,D6 |
| REG-004 | The platform shall maintain end-to-end traceability (user need → requirement → design → code → test → risk control) and surface coverage gaps. | IEC 62304, ISO 13485/QMSR | M | L3 | D1,D3 |
| REG-005 | Each AI tool used in the lifecycle shall be subject to tool validation / qualification with documented intended use, acceptance, and re-validation triggers. | CSA, GAMP 5, 21 CFR 820 (QMSR, eff. Feb 2026) | M | L2 | D1,D4 |
| REG-006 | Risk management activities shall integrate ISO 14971; AI-proposed hazards/controls shall be human-confirmed before becoming controlled records. | ISO 14971 | M | L3 | D1 |
| REG-007 | The AI management system governing the fleet shall conform to an AI management system standard. | ISO/IEC 42001 | S | L3 | D1 |
| REG-008 | AI-enabled-device change management shall support a Predetermined Change Control Plan (PCCP) where models influence device behavior. | FDA AI-enabled device guidance + PCCP | S | L4 | D1 |
| REG-009 | The platform shall meet applicable EU MDR / EU AI Act obligations for high-risk AI used in device engineering. | EU MDR, EU AI Act | S | L3 | D1 |
| REG-010 | All controlled records shall have defined retention, version, and signature controls under the QMS. | ISO 13485 / FDA QMSR | M | L2 | D1 |
| REG-011 | Human accountability shall be preserved: a named qualified human of record shall sign every controlled output; AI is never the signer. | 21 CFR Part 11, IEC 62304 | M | L1 | D1,D8 |

6. Data & Knowledge Requirements (DATA)

| ID | Requirement (shall) | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- |
| DATA-001 | Training, fine-tuning, and RAG corpora shall be governed: cataloged, licensed, owned, and approved before use. | M | L2 | D3 |
| DATA-002 | The platform shall enforce no external egress of source, specs, or model traffic; all inference, training, and storage are self-hosted (P7). | M | L1 | D3,D6 |
| DATA-003 | PII/PHI shall be detected (Tier-S redactor) and excluded/masked from corpora, prompts, logs, and evidence stores unless explicitly authorized and controlled. | M | L1 | D3,D6 |
| DATA-004 | Every datum used by an agent shall carry provenance (source, version, hash, retrieval timestamp) recorded in the action evidence. | M | L2 | D3,D4 |
| DATA-005 | Knowledge bases shall be versioned and snapshot-able so a retrieval context is reproducible for replay (links NFR-007). | M | L3 | D3 |
| DATA-006 | Corpus and evidence retention shall follow QMS record-retention policy; deletion shall be controlled and logged. | M | L2 | D1,D3 |
| DATA-007 | Corpus quality shall be monitored for drift, staleness, and poisoning; suspect sources shall be quarantined. | S | L4 | D3,D6 |
| DATA-008 | Embeddings/reranking (Tier-E) indices shall be access-controlled per project and safety class. | S | L3 | D3,D6 |
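DATA-004 lists four provenance fields: source, version, hash, and retrieval timestamp. A minimal sketch of such a record follows, assuming SHA-256 as the hash and ISO 8601 UTC timestamps; the `Provenance` class and both helper functions are hypothetical, introduced only for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    """The four DATA-004 fields for one retrieved datum."""
    source: str
    version: str
    sha256: str
    retrieved_at: str  # ISO 8601, UTC


def make_provenance(source: str, version: str, content: bytes,
                    now: Optional[datetime] = None) -> Provenance:
    """Build a provenance record; `now` is injectable so replays are testable."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return Provenance(
        source=source,
        version=version,
        sha256=hashlib.sha256(content).hexdigest(),
        retrieved_at=timestamp.isoformat(),
    )


def content_matches(content: bytes, provenance: Provenance) -> bool:
    """Check that content still hashes to the value recorded at retrieval."""
    return hashlib.sha256(content).hexdigest() == provenance.sha256
```

Passing the clock in as a parameter keeps the record deterministic under test, which fits the replay intent of DATA-005 and NFR-007.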

7. Model Requirements (MODEL)

| ID | Requirement (shall) | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- |
| MODEL-001 | Only self-hosted open-weight models shall be used; no SaaS LLM API (Claude/OpenAI/Gemini or equivalent) in any lifecycle path (P7, §11). | M | L1 | D2 |
| MODEL-002 | All fleet models shall be fine-tunable in-house (full or PEFT/LoRA) on governed corpora. | M | L2 | D2 |
| MODEL-003 | Every model and adapter shall be signed and version-pinned (cosign), registered in MLflow with an immutable digest. | M | L2 | D2,D6 |
| MODEL-004 | Serving shall support multi-LoRA hot-swap across task-specialized adapters on shared base weights (vLLM/Triton+TensorRT-LLM/KServe). | M | L3 | D2 |
| MODEL-005 | Models shall support calibrated abstention: emitting a refusal/low-confidence signal the harness can act on (links FR-011). | M | L2 | D2,D4 |
| MODEL-006 | The fleet shall be tiered (Tier-S Reflex 1–8B, Tier-M Worker 14–34B, Tier-L Reasoner 70B+/MoE, Tier-V Multimodal, Tier-E Embedding/Rerank) with documented task→tier routing. | M | L3 | D2,D5 |
| MODEL-007 | Multimodal capability (Tier-V) shall ingest diagrams, imaging, PDF specs, and UI for FR-001/FR-002. | S | L3 | D2,D3 |
| MODEL-008 | Each model version shall pass acceptance evaluation before promotion to a serving channel (links EVAL-001). | M | L4 | D2,D4 |
| MODEL-009 | Model lineage (base → fine-tune dataset → adapter → deployed digest) shall be fully reproducible and recorded (P7). | M | L4 | D2 |
| MODEL-010 | Quantization/optimization (TensorRT-LLM) shall not degrade a model below its gated acceptance thresholds without re-validation. | S | L4 | D2,D4 |
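MODEL-006 asks for documented task→tier routing, and COST-003 adds that routing should prefer the cheapest tier that meets the quality gate. The sketch below shows the shape of such a rule. The tier names come from MODEL-006; everything else (the `Tier` class, the task-class labels, the relative cost figures, and which tiers qualify for which tasks) is hypothetical illustration data, not a routing table from this baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    relative_cost: float       # hypothetical unit cost, for ordering only
    qualified_tasks: frozenset  # task classes this tier has passed gates for


# Illustrative fleet; real qualification would come from EVAL golden sets.
FLEET = (
    Tier("Tier-S", 1.0, frozenset({"autocomplete", "redaction"})),
    Tier("Tier-M", 5.0, frozenset({"autocomplete", "code_generation",
                                   "test_generation"})),
    Tier("Tier-L", 25.0, frozenset({"code_generation", "test_generation",
                                    "planning"})),
)


def route(task_class: str) -> Tier:
    """Pick the cheapest tier qualified for the task class.

    Raises LookupError when no tier qualifies, so the caller can
    escalate instead of silently falling back to an unqualified model.
    """
    candidates = [t for t in FLEET if task_class in t.qualified_tasks]
    if not candidates:
        raise LookupError(f"no qualified tier for task class {task_class!r}")
    return min(candidates, key=lambda t: t.relative_cost)
```

With this data, `route("code_generation")` selects Tier-M even though Tier-L also qualifies, and an unknown task class raises rather than defaulting.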

8. Evaluation & Assurance Requirements (EVAL)

| ID | Requirement (shall) | Target (org-set placeholder) | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- | --- |
| EVAL-001 | Release gates shall be deterministic and produce a binary, reproducible verdict (P1, P2). | 100% reproducible | M | L4 | D4 |
| EVAL-002 | The system release-gate acceptance correctness shall meet the org threshold as a system property via Generate→Verify→Repair→Gate, not from any single model (P1). | ≥ 99.9% | M | L4 | D4 |
| EVAL-003 | Escape rate (defects passing the gate into controlled artifacts) shall be measured and bounded. | ≤ org-set ceiling | M | L4 | D4 |
| EVAL-004 | Golden datasets per task/safety-class shall exist, be versioned, and gate model/harness promotion. | 100% coverage of gated tasks | M | L4 | D4 |
| EVAL-005 | An LLM shall never be the sole gate; gates shall combine deterministic verifiers (build/test/static analysis/policy) with optional model judgment as advisory only (§11). | Enforced | M | L2 | D4 |
| EVAL-006 | Gate verifiers shall include compile/build, test execution, static analysis, and policy (OPA/Gatekeeper) checks. | All present | M | L3 | D1,D4 |
| EVAL-007 | Evaluation results shall be evidence (P4): stored immutable, attributable to model/harness digests, replayable. | 100% | M | L4 | D4 |
| EVAL-008 | Continuous evaluation shall detect model/behavioral drift post-deployment and trigger re-validation. | Monitored | S | L5 | D4 |
| EVAL-009 | Repair loops shall be bounded (max iterations/budget); on exhaustion the task shall escalate to human (links FR-011, COST). | Bounded | M | L2 | D4,D7 |
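Taken together, FR-010, EVAL-005, and EVAL-009 describe a loop: generate, run deterministic verifiers, repair on failure up to a bound, then either pass or escalate. A minimal sketch of that control flow follows. All names (`GateResult`, `run_gate`, the callback signatures) are hypothetical; the real verifiers would be the build, test, static-analysis, and policy checks of EVAL-006. Model judgment is deliberately absent from the verdict path.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class GateResult:
    verdict: str             # "pass" or "escalate"
    verify_rounds: int       # number of times the verifier set was run
    artifact: Optional[str]  # released artifact, or None on escalation


def run_gate(generate: Callable[[], str],
             repair: Callable[[str, List[str]], str],
             verifiers: Dict[str, Callable[[str], bool]],
             max_repairs: int) -> GateResult:
    """Generate -> Verify -> Repair -> Gate with a bounded repair loop."""
    if not verifiers:
        # FR-010: an action with no verifier shall not pass the gate.
        return GateResult("escalate", 0, None)
    artifact = generate()
    for round_no in range(1, max_repairs + 2):
        # Verifiers are plain deterministic predicates on the artifact.
        failed = sorted(name for name, check in verifiers.items()
                        if not check(artifact))
        if not failed:
            return GateResult("pass", round_no, artifact)
        if round_no <= max_repairs:
            artifact = repair(artifact, failed)
    # EVAL-009: repair budget exhausted, hand the task to a human.
    return GateResult("escalate", max_repairs + 1, None)
```

The generator and repairer may be probabilistic model calls; the verdict depends only on the verifier predicates, which is the separation NFR-010 asks for.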

9. Security Requirements (SEC)

| ID | Requirement (shall) | Mechanism | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- | --- |
| SEC-001 | The platform shall be zero-trust: every workload identity authenticated and authorized per call. | Istio mesh + SPIFFE/SPIRE | M | L2 | D6 |
| SEC-002 | Agent code execution shall run in isolated sandboxes with no ambient credentials or network. | gVisor/Kata | M | L2 | D6 |
| SEC-003 | Supply chain shall be signed and attested: artifacts, models, containers via Sigstore/cosign + SLSA provenance + SBOM. | cosign/SLSA/SBOM | M | L2 | D6 |
| SEC-004 | The platform shall implement prompt-injection and tool-abuse defenses (input sanitization, tool allow-lists, output schema validation, least privilege). | MCP scopes + validators | M | L3 | D5,D6 |
| SEC-005 | Secrets shall be managed centrally and never appear in prompts, logs, or evidence. | HashiCorp Vault | M | L1 | D6 |
| SEC-006 | The audit/evidence store shall be immutable and tamper-evident (append-only, hash-chained) (P4). | Append-only + cosign | M | L2 | D1,D6 |
| SEC-007 | Policy shall be enforced at admission and runtime via a policy server (OPA/Gatekeeper); no policy bypass path. | OPA/Gatekeeper | M | L2 | D6 |
| SEC-008 | The platform shall meet medical-device cybersecurity obligations and enforce network segmentation. | IEC 62443, FD&C Act §524B | S | L3 | D6 |
| SEC-009 | Tool plane (MCP) and multi-agent (A2A) calls shall enforce least-privilege, declared scopes, audited per invocation. | MCP/A2A policy | M | L3 | D5,D6 |
| SEC-010 | Agents shall never autonomously merge or release Class C changes (links FR-003, §11). | Branch policy + gate | M | L3 | D1,D6 |
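SEC-006 calls for an append-only, hash-chained evidence store. The idea is that each entry commits to the hash of the entry before it, so altering any past entry breaks every later link. The following in-memory sketch illustrates the chaining only; the function names and the entry layout are hypothetical, and a production store would add signing (the table names cosign) and durable append-only storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical form of the payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> list:
    """Return a new chain with one more entry; the input list is not mutated."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev, "payload": payload,
             "hash": _entry_hash(prev, payload)}
    return chain + [entry]


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited, reordered, or dropped entry fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != _entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Verification here detects tampering after the fact; it does not by itself prevent truncation of the tail, which is one reason an external anchor or signature over the latest hash is usually added.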

10. Observability & Cost Requirements (OPS / COST)

| ID | Requirement (shall) | Target (org-set placeholder) | MoSCoW | ASMM-Med | Dim |
| --- | --- | --- | --- | --- | --- |
| OPS-001 | Every agent action and model call shall emit OpenTelemetry traces correlatable end-to-end (request→tools→model→gate→evidence). | 100% traced | M | L2 | D7 |
| OPS-002 | The platform shall expose GPU utilization, queue depth, and tokens/sec per tier and per tenant. | Dashboards live | M | L2 | D7 |
| OPS-003 | Drift, error-rate, abstention-rate, and escape-rate shall be observable in near-real-time. | SLO dashboards | S | L4 | D4,D7 |
| COST-001 | The platform shall compute cost-per-green-PR (cost per verified task) as the primary efficiency KPI (P6). | Reported per team | M | L3 | D7 |
| COST-002 | Budget guardrails shall enforce per-team/per-task token & GPU ceilings; overruns throttle or escalate, never silently spend. | Hard ceilings | M | L2 | D7 |
| COST-003 | Routing shall prefer the cheapest tier that meets the quality gate (links FR-015, MODEL-006). | Enforced | S | L3 | D2,D7 |
| COST-004 | Idle GPU pools shall scale to zero; cost attribution shall be tenant-accurate. | KEDA + chargeback | S | L2 | D7 |
| COST-005 | Repair/retry loops shall be cost-bounded (links EVAL-009); runaway loops shall halt and escalate. | Bounded | M | L2 | D4,D7 |
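COST-001 and COST-002 can be illustrated with a few lines of arithmetic. The sketch below makes one interpretive assumption that this document does not spell out: the numerator of cost-per-green-PR includes the cost of failed and repaired runs, so the KPI reflects total spend per verified task rather than spend on successful runs only. `TaskRun`, `cost_per_green_pr`, and `check_budget` are hypothetical names, and cost units are arbitrary.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TaskRun:
    team: str
    cost: float        # token + GPU cost in an arbitrary unit
    gate_passed: bool  # True when the run ended in a green (verified) PR


def cost_per_green_pr(runs: Iterable[TaskRun], team: str) -> Optional[float]:
    """Total team spend divided by the number of gate-passing runs.

    Returns None when the team has no green PRs, so a dashboard can
    show "no verified output" instead of dividing by zero.
    """
    team_runs = [r for r in runs if r.team == team]
    green = sum(1 for r in team_runs if r.gate_passed)
    if green == 0:
        return None
    return sum(r.cost for r in team_runs) / green


def check_budget(spent: float, requested: float, ceiling: float) -> str:
    """Hard ceiling: allow only if the request fits, otherwise escalate."""
    return "allow" if spent + requested <= ceiling else "escalate"
```

For three runs costing 2.0 (green), 1.0 (failed), and 3.0 (green), the KPI is 6.0 / 2 = 3.0: the failed run raises the cost of each verified result, which is the incentive P6 is after.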

11. Constraints & Explicit Non-Goals

11.1 Hard constraints (shall)

| ID | Constraint |
| --- | --- |
| CON-001 | No SaaS/hosted LLM APIs (Claude, OpenAI, Gemini, or equivalent) in any lifecycle path. Open-weight, self-hosted only. |
| CON-002 | No external network egress of source, specs, PHI/PII, or model traffic. |
| CON-003 | No LLM as a sole gate: a deterministic verifier set must back every release decision (EVAL-005). |
| CON-004 | No autonomous merge or release of Class C software by an agent; Class C requires dual human control (SEC-010, FR-003). |
| CON-005 | No agent action without replayable Part 11-grade evidence (REG-003). |
| CON-006 | No model/adapter deployment without signing, registry entry, and acceptance evaluation (MODEL-003, MODEL-008). |
| CON-007 | No non-deterministic path may reach a gate verdict (NFR-010, P2). |

11.2 Explicit non-goals (this baseline)

| Non-goal | Rationale |
| --- | --- |
| Fully unattended Class C autonomy | Prohibited by P3; revisit only with regulatory precedent. |
| "Vibe coding" / unbounded freeform generation | Counter to spec-driven, evidence-bound philosophy. |
| General-purpose chatbot assistant outside the SDLC | Out of scope; no validation basis. |
| Vendor-managed model hosting | Conflicts with P7 sovereignty. |
| Replacing human reviewers/signers | AI augments; humans remain accountable (REG-011). |

12. Acceptance Criteria Summary

Verification methods: I = Inspection, D = Demonstration, T = Test, A = Analysis.

| Requirement set | Acceptance criterion (must pass for baseline sign-off) | Method | Priority |
| --- | --- | --- | --- |
| FR-001…015 | Each SDLC activity demonstrated with safety-class bounding and correct review posture; abstention and human override exercised. | D, T | M |
| NFR-001…012 | Latency/throughput/availability SLOs met under load test; gate verdict + artifact reproducibility shown bit-stable on replay. | T, A | M |
| REG-001…011 | Traceability matrix complete with no orphan links; CSA/tool-validation dossiers present; Part 11 evidence replayed; human-of-record signatures verified. | I, A | M |
| DATA-001…008 | No-egress proven by network policy test; PHI/PII redaction validated; provenance present on sampled actions; corpus versioning replayable. | T, I | M |
| MODEL-001…010 | Fleet confirmed open-weight/self-hosted; signatures and registry digests verified; multi-LoRA hot-swap demonstrated; abstention signal observed; lineage reproducible. | I, D, T | M |
| EVAL-001…009 | Gate determinism proven (identical inputs → identical verdict); system acceptance ≥ 99.9% on golden sets; escape rate within ceiling; no LLM-sole-gate path exists. | T, A | M |
| SEC-001…010 | Zero-trust identity enforced; sandbox isolation tested; SLSA/SBOM/cosign present; prompt-injection suite passed; Class C autonomous-merge attempt blocked. | T, I | M |
| OPS-001…003, COST-001…005 | End-to-end traces present; cost-per-green-PR reported; budget guardrail throttle demonstrated; scale-to-zero and chargeback verified. | D, T | M |
| CON-001…007 | Each hard constraint shown enforced (negative tests: egress blocked, SaaS call blocked, LLM-sole-gate rejected, Class C auto-merge rejected). | T | M |

Baseline sign-off requires all Must requirements Accepted with no open veto from any §2 veto-holder (QA/RA, Security, Finance, Clinical/Product), and traceability of every requirement to at least one verification record per P4.
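The sign-off rule combines three conditions: every Must requirement is Accepted, no §2 veto-holder has an open veto, and every requirement traces to at least one verification record. A sketch of that check follows, assuming requirement status is available as a simple mapping; the function name, the dictionary layout, and the record identifiers are hypothetical.

```python
# Veto-holders as listed in the sign-off rule and the section 2 table.
VETO_HOLDERS = frozenset({"QA/RA", "Security", "Finance", "Clinical/Product"})


def signoff_blockers(requirements: dict, open_vetoes: set) -> list:
    """Return the reasons sign-off is blocked (an empty list means ready).

    `requirements` maps a requirement ID to a dict with keys
    "priority" (M/S/C/W), "status", and "verification_records" (a list).
    """
    blockers = []
    for holder in sorted(set(open_vetoes) & VETO_HOLDERS):
        blockers.append(f"open veto: {holder}")
    for req_id in sorted(requirements):
        req = requirements[req_id]
        if req["priority"] == "M" and req["status"] != "Accepted":
            blockers.append(f"{req_id}: Must requirement not Accepted")
        if not req["verification_records"]:
            blockers.append(f"{req_id}: no verification record")
    return blockers
```

Returning the full list of blockers, rather than a single boolean, gives QA/RA an inspectable reason for each refusal, which is in keeping with P4.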


End of 01-requirements.md — proceed to 02-maturity-model.md for the ASMM-Med level definitions that scope phased rollout in 09-adoption-roadmap.md.