Security & Compliance: Agentic SDLC for Regulated Medical Device Engineering
Audience: CISO and security architecture, Quality/Regulatory (QA/RA), MLOps/Platform, and an FDA / Notified Body auditor.
Scope: The security and compliance control set for AI agents that build, test, document, and maintain regulated medical-device software, running Kubernetes-native with self-hosted, fine-tuned open-weight models only (no Claude/OpenAI/Gemini SaaS APIs; these are hard-blocked at the network layer).
Date: May 2026. All numeric thresholds in this document are placeholders pending organizational calibration.
Companion docs: 01-requirements · 02-maturity-model · 03-reference-architecture · 04-model-strategy-and-finetuning · 05-evaluation-and-validation · 06-agentic-workflows · 08-token-and-gpu-economics · 09-adoption-roadmap
0. Purpose and framing
This document is the security and compliance register for the agentic SDLC. It assumes the seven principles defined in 02-maturity-model — in particular P3 (risk-proportional autonomy by IEC 62304 class), P4 (everything an agent does is evidence, 21 CFR Part 11-grade), P5 (the harness is the product), and P7 (self-hosted, sovereign, reproducible).
The governing security thesis: a probabilistic agent is an untrusted actor inside a zero-trust system. We do not trust the model to behave; we constrain what any model can do via deterministic structural controls, sandboxing, and signed supply-chain assurance, and we make every action attributable and replayable so it survives an audit or a recall investigation.
Two regulatory and security framings run in parallel and must never be conflated (see §9):
- Track A — AI that BUILDS the device: production/Quality-System tooling. Validated under FDA Computer Software Assurance (CSA), ISO 13485/QMSR, 21 CFR Part 11. This framework's primary focus.
- Track B — AI shipped INSIDE the device: SaMD / AI-enabled device function, submission-bearing, governed by EU AI Act high-risk obligations, FDA premarket cybersecurity, and a Predetermined Change Control Plan (PCCP).
Threat frameworks of record: OWASP Top 10 for LLM Applications (2025) and MITRE ATLAS. Standards anchors: 21 CFR Part 11, IEC 62304, ISO 13485/QMSR, ISO 14971, ISO/IEC 42001 (AI management system), IEC 62443 (security for industrial automation and control systems), FDA premarket cybersecurity guidance, FD&C Act §524B (SBOM + vulnerability management), EU MDR + EU AI Act.
1. Threat model for agentic dev in a regulated org
1.1 Assets under protection
| Asset | Why it matters | Loss class |
|---|---|---|
| Source IP (regulated product source, DHF, algorithms) | Core competitive + regulated artifact | Confidentiality, Integrity |
| PHI/PII | May appear in test fixtures, bug repros, logs, support data | Confidentiality (HIPAA/GDPR), regulatory breach |
| Model weights + adapters (fleet S/M/L/V/E) | Fine-tuned on proprietary corpora; theft = IP loss + cloneable behavior | Confidentiality, Integrity |
| Signing keys (Sigstore/cosign, Vault transit, SPIRE CA) | Root of all provenance trust; compromise forges everything downstream | Integrity (catastrophic) |
| The regulated product itself | A malicious or erroneous agent commit can injure patients | Safety, Integrity |
| Audit/evidence store (WORM logs) | The Part 11 record of truth; if tamperable, nothing is defensible | Integrity, Non-repudiation |
| Eval gold sets (see 05) | Poisoning the eval = silently lowering the release gate | Integrity |
1.2 Adversaries
| Adversary | Capability | Primary objective |
|---|---|---|
| External attacker | Network probing, supply-chain injection, poisoned public repos/docs | Exfiltration, foothold, weight theft |
| Malicious insider | Authenticated dev/operator access | IP theft, sabotage, gate bypass |
| Compromised dependency / model | Trojaned open-weight base model, poisoned dataset, malicious package | Backdoor in shipped product |
| The agent itself (untrusted-by-design) | Whatever tools it is granted; subject to injection/poisoning | Unintended/rogue action, data exfil |
| Negligent user | Over-broad prompts, pasting PHI, approving without review | Accidental leak, gate erosion |
1.3 Agent-specific attack surface → framework mapping
| Threat | Description in our context | OWASP LLM Top 10 | MITRE ATLAS |
|---|---|---|---|
| Prompt injection | Malicious instructions in an issue, code comment, PR body, or doc the agent reads | LLM01 | AML.T0051 (LLM Prompt Injection) |
| Context poisoning | Tainted retrieval source / repo seeds the agent's working context | LLM01, LLM08 | AML.T0070 (RAG poisoning) |
| Tool misuse | Agent invokes a granted tool with harmful args (e.g., mass email, prod write) | LLM06 (Excessive Agency) | AML.T0053 (LLM Plugin Compromise) |
| Data exfiltration | Source/PHI/weights leak via tool output, egress, or model channel | LLM02 (Sensitive Info Disclosure) | AML.T0024, AML.T0057 |
| Model supply-chain | Trojaned base weights / poisoned adapter / malicious dataset | LLM03 (Supply Chain) | AML.T0010 (ML Supply Chain Compromise) |
| Rogue autonomous action | High-autonomy agent takes an unbounded irreversible action | LLM06 | AML.T0048 (External Harms) |
| Training-data poisoning | Corrupted fine-tune corpus embeds a backdoor or skews behavior | LLM04 (Data/Model Poisoning) | AML.T0020 (Poison Training Data) |
| Insecure output handling | Unsanitized agent output executed downstream (e.g., shell, SQL) | LLM05 | AML.T0050 |
| System prompt / config leak | Harness policy/secrets exposed via model | LLM07 | AML.T0056 |
1.4 Trust-boundary diagram
```mermaid
flowchart TB
  subgraph EXT["UNTRUSTED - external"]
    PUB[Public repos / docs / packages]
    SAAS[(External LLM APIs - HARD BLOCKED)]
    ADV([External attacker])
  end
  subgraph EDGE["TB1: Org perimeter - egress deny-by-default"]
    EG{{Egress allow-list proxy}}
    SAAS -. "BLOCKED at L3/L7" .-x EG
  end
  subgraph MESH["TB2: K8s + Istio mTLS mesh - SPIFFE/SPIRE identity"]
    direction TB
    subgraph CTRL["Control plane (trusted)"]
      POL[Policy Server\nstructural + semantic gating]
      OPA[OPA / Gatekeeper admission]
      VAULT[(HashiCorp Vault)]
      AUD[(WORM audit / evidence store)]
      REG[(Signed model + artifact registry)]
    end
    subgraph SBX["TB3: Agent sandbox - ephemeral, low-priv"]
      AGENT([Dev agent\ngVisor/Kata])
      TOOLS[Tools: git, build, test, browser, term]
    end
    INF[[Self-hosted inference\nfleet S/M/L/V/E]]
  end
  subgraph REG_ASSETS["TB4: Regulated assets (highest trust)"]
    SRC[(Source IP / DHF)]
    PROD[(Regulated product / release branch)]
    KEYS[(Signing keys / SPIRE CA)]
  end
  PUB --> EG --> AGENT
  AGENT -- "every tool call" --> POL
  POL -- allow/deny/sanitize --> TOOLS
  POL --> AUD
  AGENT <-->|mTLS| INF
  TOOLS -->|Vault-brokered, short-TTL| VAULT
  AGENT -. "no default write" .-x PROD
  POL -- "Class C: dual human control" --> PROD
  REG --> OPA --> SBX
  ADV -.-x EDGE
```
Trust boundaries: TB1 perimeter (egress control), TB2 mesh (identity + mTLS), TB3 sandbox (blast-radius containment), TB4 regulated assets (signed, dual-controlled). An agent never crosses from TB3 to TB4 except through the Policy Server and (for Class B/C) human authorization.
2. Zero-trust architecture
Zero trust here means: no implicit trust by network location; every workload authenticates; every call is authorized; deny by default.
| Control | Implementation | What it enforces |
|---|---|---|
| Workload identity | SPIFFE/SPIRE — every agent, tool, model server gets a SPIFFE ID (SVID), attested at startup, short-TTL, auto-rotated | No shared service accounts; every action attributable to a cryptographic identity (feeds P4 attribution) |
| mTLS everywhere | Istio service mesh; PeerAuthentication: STRICT mesh-wide | No cleartext intra-mesh traffic; no spoofed peers |
| Least-privilege RBAC | K8s RBAC + AuthorizationPolicy keyed on SPIFFE ID; agents get only the namespaces/tools their role requires | An agent role cannot reach services outside its task scope |
| Network deny-by-default | Default-deny NetworkPolicy (Cilium); explicit allow per workload pair | Lateral movement blocked; sandbox cannot reach the audit store directly |
| Egress allow-list | L3/L7 egress proxy; allow-list of approved internal endpoints only | Exfiltration channel closed |
| Hard block on external LLM endpoints | Egress proxy + DNS sinkhole deny api.openai.com, *.anthropic.com, generativelanguage.googleapis.com, etc.; alert + auto-quarantine on attempt | Enforces P7 sovereignty; an injected agent cannot call out to an external model |
The external-LLM block is both a control and a detector: any attempt is treated as a potential prompt-injection/exfil indicator and raises a security incident (§11).
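The control-and-detector behaviour can be sketched as a three-way decision at the egress proxy. This is a minimal illustration, not the proxy's actual configuration: the internal host names in `ALLOWED_HOSTS` and the function name `egress_decision` are hypothetical, and only the blocked patterns come from the table above.

```python
from fnmatch import fnmatchcase

# Blocked patterns are taken from the table above; the allow-list entries
# are hypothetical placeholders for approved internal endpoints.
BLOCKED_LLM_HOSTS = [
    "api.openai.com",
    "*.anthropic.com",
    "generativelanguage.googleapis.com",
]
ALLOWED_HOSTS = ["git.internal.example", "*.docs.internal.example"]


def egress_decision(host: str) -> str:
    """Return 'quarantine', 'allow', or 'deny' for an outbound host name."""
    host = host.lower().rstrip(".")  # normalise case and trailing root dot
    if any(fnmatchcase(host, pat) for pat in BLOCKED_LLM_HOSTS):
        # Block and raise an incident: the attempt itself is a signal.
        return "quarantine"
    if any(fnmatchcase(host, pat) for pat in ALLOWED_HOSTS):
        return "allow"
    return "deny"  # deny by default
```

The blocked list is checked before the allow-list so that a misconfigured allow entry cannot re-open an external model endpoint.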
3. The Policy Server in depth
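The Policy Server is the deterministic chokepoint through which every tool call passes before execution. It realizes principle P2 (determinism wraps probabilism) at the action boundary. It has two gating stages, structural (§3.1) and semantic (§3.2), followed by a check of the human-control requirements from the autonomy matrix (§3.3, §10).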
3.1 Structural gating (deterministic, policy-as-code)
Rules are pure functions of (role, environment, tool, args, safety_class) — no model in the loop, fully testable, version-controlled, signed. This is the authoritative, non-bypassable layer.
```yaml
# policies.yaml (illustrative; versioned, cosign-signed, loaded read-only)
version: "2026.05"          # quoted so it stays a string, not a float
defaults:
  effect: deny              # deny-by-default
roles:
  dev-agent-classA:
    allow:
      - tool: git.read
      - tool: build.run
      - tool: test.run
      - tool: pr.propose    # propose only, never merge
      - tool: browser.fetch
        constraints: { url_allowlist: ["internal-docs", "approved-mirror"] }
    deny:
      - tool: email.send    # structurally forbidden for any dev agent
      - tool: prod.write
  dev-agent-classC:
    allow:
      - tool: git.read
      - tool: test.run
      - tool: pr.propose
    require:
      - control: dual_human_authorization   # see §10 matrix
    deny:
      - tool: "*.write"
      - tool: email.send
      - tool: browser.fetch  # no external context for Class C work
environments:
  prod:
    deny:
      - tool: "*.write"
        unless_role: ["release-bot-signed"]
        and_control: dual_human_authorization
guards:
  - id: secret-egress
    match: { args_contains_secret: true }
    effect: deny
  - id: pii-in-args
    match: { semantic.pii_detected: true }   # result from semantic stage
    effect: sanitize                         # mask then allow, or escalate
```
Representative invariants (canonical examples reused across docs): the email.send tool is never granted to a dev agent; prod.write requires a signed release role and dual human authorization; Class C work forbids external browser context.
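To make the "pure function, no model in the loop" property concrete, here is a minimal sketch of a structural evaluator over a hand-copied slice of the policy above. The `POLICY` dictionary layout and the `structural_eval` signature are assumptions for illustration; the sketch ignores `unless_role`, `constraints`, and `guards`, which a real evaluator would have to honour.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of part of policies.yaml (illustrative only).
POLICY = {
    "roles": {
        "dev-agent-classA": {
            "allow": ["git.read", "build.run", "test.run",
                      "pr.propose", "browser.fetch"],
            "deny": ["email.send", "prod.write"],
        },
        "dev-agent-classC": {
            "allow": ["git.read", "test.run", "pr.propose"],
            "deny": ["*.write", "email.send", "browser.fetch"],
        },
    },
    "environments": {"prod": {"deny": ["*.write"]}},
}


def structural_eval(role: str, env: str, tool: str) -> str:
    """Explicit deny wins, then explicit allow, otherwise deny by default."""
    rules = POLICY["roles"].get(role, {})
    env_deny = POLICY["environments"].get(env, {}).get("deny", [])
    denies = rules.get("deny", []) + env_deny
    if any(fnmatchcase(tool, pat) for pat in denies):
        return "DENY"
    if any(fnmatchcase(tool, pat) for pat in rules.get("allow", [])):
        return "ALLOW"
    return "DENY"  # unknown role or unlisted tool
```

Because the function depends only on its arguments and a versioned policy, each invariant listed above can be pinned by an ordinary unit test.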
3.2 Semantic gating (intent / PII check before tool execution)
A secondary self-hosted LLM (a small fleet-tier model, see 04) inspects the intent and content of a tool call that passed structural rules but warrants judgment — e.g., does this pr.propose body or browser.fetch argument carry PHI/PII, exfiltration intent, or signs of injection? Semantic gating is advisory and escalating, never a sole gate (P2): it can downgrade (sanitize) or escalate (require human), but it can never grant something structural rules denied.
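The "never grant what structural rules denied" rule can be expressed as a severity ordering in which the semantic stage may only move a verdict upward. The following is a sketch under that assumption; the `Verdict` enum and `combine` function are hypothetical names, not part of the harness.

```python
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by restrictiveness: a higher value is a tighter outcome.
    ALLOW = 0
    SANITIZE = 1
    ESCALATE = 2
    DENY = 3


def combine(structural: Verdict, semantic: Verdict) -> Verdict:
    """Merge the two stages so the semantic stage can only tighten."""
    if semantic is Verdict.DENY:
        # Per the text, the semantic stage sanitizes or escalates;
        # an outright deny is reserved for structural policy.
        raise ValueError("semantic stage may not issue DENY")
    return max(structural, semantic)
```

With this shape, a structural DENY survives any semantic result, while a structural ALLOW can become SANITIZE or ESCALATE.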
3.3 Interception pseudo-flow
```text
on tool_call(agent_id, role, env, tool, args):
    record = open_evidence_span(agent_id, role, tool, args_hash)   # P4

    # STAGE 1: structural (deterministic, authoritative)
    s = structural_eval(role, env, tool, args, safety_class)
    if s == DENY:
        emit_evidence(record, decision=DENY, stage=structural); return BLOCKED

    # STAGE 2: semantic (judgment; PII/intent/injection)
    sem = semantic_model.assess(tool, args, retrieval_provenance)
    if sem.pii or sem.exfil_intent or sem.injection_signal:
        if policy.allows_sanitize(tool):
            args = mask_placeholders(args)   # [[VAR]] injection, §5
            emit_evidence(record, decision=SANITIZE, findings=sem)
        else:
            emit_evidence(record, decision=ESCALATE, findings=sem)
            return REQUIRE_HUMAN(record)

    # STAGE 3: control requirements (autonomy matrix, §10)
    if requires_dual_control(role, env, safety_class):
        emit_evidence(record, decision=PENDING_DUAL_CONTROL)
        return REQUIRE_HUMAN(record, control=DUAL)

    emit_evidence(record, decision=ALLOW)
    return EXECUTE(tool, args)
```
Every branch emits evidence: input hash, structural verdict, semantic findings, sanitization diff, human-decision pointer, model+policy versions. This record is the Part 11 artifact (§7).
4. Agent sandboxing & blast-radius control
The agent runs untrusted-by-design; the sandbox guarantees that even a fully compromised agent has a small, recoverable blast radius.
| Control | Implementation |
|---|---|
| Ephemeral runtime | One agent run = one fresh ephemeral namespace + pod, torn down on completion; no persistence across runs |
| Kernel-isolated sandbox | gVisor (default) or Kata Containers (stronger isolation for V/E tiers or external-content tasks) — syscall surface contained |
| No prod write by default | Sandbox SVID has zero write capability to release branches / prod; writes only via Policy Server + signed release role + human control |
| Egress control | Per-sandbox egress allow-list (§2); browser/term tools route through the inspecting proxy |
| Terminal & browser isolation | term and browser.fetch tools run in a separate isolation domain; fetched content is untrusted input subject to context hygiene (§5) and cannot self-execute |
| Secrets via Vault | No long-lived secrets in env/image; HashiCorp Vault brokers short-TTL, narrowly-scoped, dynamic credentials; Vault audit log feeds the WORM store |
| Kill-switches | Per-agent and fleet-wide kill-switch: revoke SVID (SPIRE) → mesh denies all further calls from that identity; circuit-breakers on anomalous tool-call rate; "freeze on novel egress" tripwire |
| Resource bounds | CPU/GPU/wall-clock/tool-call quotas — caps runaway loops (cost + blast radius; see 08) |
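The circuit-breaker on anomalous tool-call rate could look like the sliding-window sketch below. The class name, the window parameters, and the latch-until-reset behaviour are all assumptions for illustration; the real thresholds are organizational placeholders and the real trip action would be SVID revocation.

```python
from collections import deque


class ToolCallBreaker:
    """Trips when more than max_calls occur within window_s seconds."""

    def __init__(self, max_calls: int, window_s: float):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = deque()   # timestamps of recent calls
        self.tripped = False

    def record(self, now: float) -> bool:
        """Record a call at time `now`; return True if it may proceed."""
        if self.tripped:
            return False       # latched open until an operator resets it
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        self.calls.append(now)
        if len(self.calls) > self.max_calls:
            self.tripped = True  # hypothetical hook: revoke SVID here
            return False
        return True
```

Latching the breaker rather than auto-resetting keeps a tripped agent frozen until a human has looked at the evidence.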
5. Prompt-injection, context hygiene & PII/PHI protection
Treat all model-facing content not authored by the harness as hostile input.
Input sanitization & provenance. Retrieved context is wrapped with provenance and trust labels; instructions embedded in data (issues, comments, fetched docs) are demarcated and not treated as commands. Retrieval is restricted to a source allow-list (approved internal repos/doc stores); arbitrary web/repos are off the path for regulated work, eliminating most context-poisoning vectors.
Placeholder injection — the [[VAR]] pattern (context hygiene middleware). Before any PHI/PII/secret-bearing content enters a prompt, a deterministic middleware masks sensitive spans into typed placeholders and keeps the mapping in a secure side-table the model never sees:
```text
RAW:    Patient John Doe (MRN 55512) reports error E13 at 10.0.4.7
MASKED: Patient [[NAME_1]] (MRN [[ID_1]]) reports error E13 at [[IP_1]]
side-table (Vault-sealed): NAME_1→"John Doe", ID_1→"55512", IP_1→"10.0.4.7"
```
The model reasons over placeholders; on output, only authorized placeholders are re-hydrated, and only into allow-listed sinks. PHI that the middleware masks never reaches the model, never lands in logs/eval sets in cleartext, and cannot leak through the model channel.
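A minimal sketch of the masking and re-hydration round trip follows. The three regular expressions are toy detectors chosen only to reproduce the example above and are nowhere near adequate for real PHI detection; the function names `mask` and `rehydrate` are hypothetical, and the side-table is a plain dictionary here rather than a Vault-sealed store.

```python
import re

# Toy detectors, tuned only for the example sentence (assumption).
PATTERNS = [
    ("IP", re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")),
    ("ID", re.compile(r"(?<=MRN )\d+")),
    ("NAME", re.compile(r"(?<=Patient )[A-Z][a-z]+ [A-Z][a-z]+")),
]
PLACEHOLDER = re.compile(r"\[\[([A-Z]+_\d+)\]\]")


def mask(text):
    """Replace sensitive spans with typed placeholders; return text and side-table."""
    side_table = {}
    counters = {}
    for kind, pattern in PATTERNS:
        def repl(match, kind=kind):
            counters[kind] = counters.get(kind, 0) + 1
            key = f"{kind}_{counters[kind]}"
            side_table[key] = match.group(0)
            return f"[[{key}]]"
        text = pattern.sub(repl, text)
    return text, side_table


def rehydrate(text, side_table, authorized):
    """Restore only the placeholders whose keys are in `authorized`."""
    def repl(match):
        key = match.group(1)
        if key in authorized and key in side_table:
            return side_table[key]
        return match.group(0)  # leave unauthorized placeholders masked
    return PLACEHOLDER.sub(repl, text)
```

For instance, re-hydrating with `authorized={"IP_1"}` restores the address for a network-diagnostics sink while leaving the name and record number masked.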
Output sanitization & insecure-output-handling defense. Agent output destined for a downstream interpreter (shell, SQL, code) is schema-validated and never auto-executed without passing the Policy Server; outputs are scanned for residual PII and for re-injection patterns.
Defense against poisoned repos/docs. Source allow-listing + signed dependencies (§6) + semantic injection detection (§3.2) + the rule that data is never instruction. A poisoned README cannot redirect the agent's authority because authority lives in structural policy, not in text.
The "rogue agent emails 50 colleagues" failure class. Worked example of defense-in-depth:
- Structural deny: email.send is not in any dev-agent role (§3.1), so the tool cannot be invoked.
- Even if a privileged role had it: the semantic gate flags bulk-recipient/exfil intent and escalates.
- Egress allow-list: the SMTP endpoint is not reachable from the sandbox.
- Kill-switch: anomalous tool-call burst trips the circuit-breaker and revokes the SVID.
- Evidence: the attempt is recorded as a security incident (§11).
No single control is trusted; the action succeeds only if all of them fail simultaneously.
6. Supply-chain assurance
Provenance is required for code, models, AND datasets — models and data are first-class regulated supply-chain artifacts. Aligns to FD&C Act §524B and FDA premarket cybersecurity.
| Artifact | Signing | Provenance | SBOM | Admission check |
|---|---|---|---|---|
| Code / container images | Sigstore/cosign | SLSA build provenance (L3 target) | CycloneDX SBOM | Gatekeeper verifies signature + provenance |
| Model weights + adapters | cosign-signed digest | Build/train provenance (base model lineage, fine-tune run ID) | Model SBOM (base model, datasets, hyperparams, eval hash) | Unsigned/unknown model rejected at admission |
| Datasets | cosign-signed manifest + hash | Source lineage, consent/PHI-handling attestation | Dataset card / data SBOM | Untrusted dataset cannot enter a training run |
| Eval gold sets | signed, version-pinned | provenance to authoring QA | included | tamper = gate integrity incident (05) |
Admission enforcement. OPA/Gatekeeper admission policy: no pod runs a container or loads a model whose cosign signature and SLSA provenance do not verify against the trusted key set (Vault/SPIRE-rooted). Reproducible builds (P7) mean any shipped artifact — code or model — can be regenerated bit-for-bit and defended in an audit or recall. Vulnerability management (continuous SBOM scanning, KEV/CVE feeds) satisfies the §524B postmarket obligation for Track B artifacts and the QS obligation for Track A tooling.
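The admission rule reduces to "fail closed unless signature, provenance, and digest all check out". The sketch below shows only that decision shape: the `Attestation` type and `admit` function are hypothetical, and the two boolean fields stand in for the results of real cosign and SLSA verification, which happen outside this code.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attestation:
    digest: str                # SHA-256 hex digest recorded at signing time
    signature_verified: bool   # result of an external signature check (assumed)
    provenance_verified: bool  # result of an external provenance check (assumed)


def admit(artifact: bytes, att: Optional[Attestation]) -> bool:
    """Admit an artifact only if every check passes; reject otherwise."""
    if att is None:
        return False  # unsigned or unknown artifact
    if not (att.signature_verified and att.provenance_verified):
        return False
    # Recompute the digest so a swapped artifact cannot reuse an attestation.
    return hashlib.sha256(artifact).hexdigest() == att.digest
```

The same function applies to container images, model weights, and dataset manifests, which is the point of treating all three as supply-chain artifacts.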
7. Records, audit & 21 CFR Part 11
Per P4, every agent action is evidence. The evidence record is the regulatory product of the agent, not a byproduct.
What is recorded for every step (immutable, attributable, replayable):
| Field | Source | Part 11 role |
|---|---|---|
| Prompt + full context bundle (hashed; PHI masked) | harness | reconstructs what the agent saw |
| Model tier + weights digest + adapter version | registry (§6) | "which software produced this" |
| Tool call + args (sanitized) + Policy Server verdict | Policy Server (§3) | authorization record |
| Verifier/eval results | gates (05) | objective evidence of correctness |
| Human decision + e-signature (who, when, meaning) | review system | 21 CFR Part 11 §11.50/11.70 |
| SPIFFE identity of every actor | SPIRE | attribution / non-repudiation |
| Policy version + semantic-model version | Policy Server | change-control linkage |
Storage: WORM / immutable store, hash-chained (append-only, tamper-evident), time-synced. Replayability: because weights, adapters, prompts, and policy are all versioned and signed, any decision can be deterministically re-derived for an investigator. e-signatures bind a human's identity, timestamp, and the meaning of their action (reviewed / approved / authorized) to the record.
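Hash chaining is what makes the store tamper-evident: each entry commits to the hash of the one before it, so altering any past record invalidates every later hash. A minimal in-memory sketch follows; the entry layout and the names `append` and `verify` are assumptions, and a real store would add signatures, trusted timestamps, and WORM media.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical anchor value for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes identically.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> dict:
    """Append an evidence record linked to the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev, "payload": payload, "hash": _digest(prev, payload)}
    chain.append(entry)
    return entry


def verify(chain: list) -> bool:
    """Recompute every link; any edit to a past record breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```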
Audit & recall use: in a recall or FDA inspection, the WORM store answers "show me everything the agent did to this Class C module, who authorized it, what it saw, and prove it wasn't tampered with", backed by cryptographic non-repudiation. This is the evidentiary backbone the CSA validation (§8) certifies.
8. Validating the agent as regulated software (CSA)
Under FDA Computer Software Assurance, the agentic harness is production/Quality-System software and is validated risk-proportionately — not exhaustively, but where it matters.
| CSA element | Application here |
|---|---|
| Intended use | Defined per agent role (e.g., "propose unit tests for Class A modules"); autonomy bounded by §10 matrix |
| Risk-based assurance | Test effort scales with the impact of the agent's failure; Class C-touching agents get the deepest scrutiny (P3) |
| Security testing | Threat-led: each §1.3 threat has corresponding adversarial tests and red-team coverage (§11) |
| Threat-led validation | Validation cases derived from the threat model + OWASP LLM / ATLAS mappings, not just happy-path |
| Objective evidence | The §7 WORM record + the 05 eval evidence constitute validation evidence |
| Change control | Harness, policies, and model versions are controlled items; ISO/IEC 42001 governs the AI management system |
The assurance argument ties directly to 05-evaluation-and-validation: deterministic eval + ≥99.9% release-gate correctness is the functional assurance; this document supplies the security assurance. Together they form the CSA validation package.
9. Two regulated tracks
This is the distinction most frequently muddled — and the one an auditor will test.
| Dimension | Track A — AI that BUILDS the device (this framework) | Track B — AI shipped INSIDE the device (SaMD / AI function) |
|---|---|---|
| What it is | Dev/test/doc agents = production & QS tooling | The model is part of the medical device / its output is a device function |
| Submission-bearing? | No — not in the 510(k)/PMA submission as a function | Yes — part of premarket submission |
| Primary regime | CSA, ISO 13485/QMSR, 21 CFR Part 11 | IEC 62304, ISO 14971, FDA premarket cyber, EU AI Act high-risk, PCCP |
| Change control | QS change control; ISO/IEC 42001 | Predetermined Change Control Plan (PCCP) — pre-authorized model-update envelope |
| Clinical evidence | Not required | Required (clinical validation of the AI function) |
| Failure consequence | Bad tooling → defective product (caught by gates) | Bad model → direct patient harm in the field |
Shared assurance muscles (build once, apply to both): self-hosted signed model supply chain (§6), immutable evidence + Part 11 records (§7), threat-led validation (§8), drift/anomaly monitoring (§11), ISO/IEC 42001 AI governance. Where obligations diverge: Track B additionally owns clinical validation, a PCCP, premarket cybersecurity documentation, and EU AI Act high-risk conformity. This document governs Track A; it deliberately reuses controls that a Track B program will also need, but Track B's submission obligations are out of scope here.
10. Autonomy Authorization Matrix (canonical)
This is the canonical autonomy matrix. It is referenced by 02-maturity-model and 06-agentic-workflows. It maps (ASMM-Med governing level × IEC 62304 safety class) → permitted agent action and required human control. Per P3, Class C is ALWAYS dual human control regardless of maturity level.
Action legend: Suggest (advisory only) · Propose-PR (opens a PR, no merge authority) · Auto-bounded (autonomous within signed, pre-authorized bounds) · Forbidden. Human control legend: None · Single review · Dual control (two qualified humans; author ≠ approver).
| ASMM-Med level ↓ / IEC 62304 class → | Class A (no injury) | Class B (non-serious injury) | Class C (death / serious injury) |
|---|---|---|---|
| L0 Ad-hoc | Suggest / None | Suggest / Single review | Suggest / Dual control |
| L1 Governed Assistance | Suggest / None | Propose-PR / Single review | Propose-PR / Dual control |
| L2 Spec-Driven Bounded | Propose-PR / Single review | Propose-PR / Single review | Propose-PR / Dual control |
| L3 Orchestrated Agentic | Auto-bounded / Single review (post-hoc) | Propose-PR / Single review | Propose-PR / Dual control |
| L4 Validated Autonomous | Auto-bounded / None within bounds | Auto-bounded / Single review | Propose-PR / Dual control |
| L5 Self-Optimizing | Auto-bounded / None within bounds; sampled audit | Auto-bounded / Single review | Propose-PR / Dual control |
Reading the matrix:
- The leash lengthens with maturity (rows) but is capped by safety class (columns).
- Class C never reaches Auto-bounded or "None." The highest Class C autonomy is Propose-PR under dual control: the agent proposes and evidences; two qualified humans make the merge decision (P3).
- "Auto-bounded" requires the bounds to be signed, version-controlled policy enforced by the Policy Server (§3); outside the bounds, the agent escalates.
- Every cell's enforcement is mechanical: the Policy Server reads (role→level, target→safety_class) and applies the corresponding require: control (§3.1).
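Since enforcement is a lookup, the matrix can be carried as data and its Class C invariant checked mechanically. The sketch below transcribes the table above into a dictionary; the names `MATRIX`, `required_control`, and `class_c_invariant_holds` are hypothetical, and falling back to Forbidden for an unknown cell is an assumed fail-closed choice rather than something the matrix states.

```python
SINGLE, DUAL = "Single review", "Dual control"

# (action, human control) per ASMM-Med level and IEC 62304 class,
# transcribed from the table above.
MATRIX = {
    "L0": {"A": ("Suggest", "None"), "B": ("Suggest", SINGLE), "C": ("Suggest", DUAL)},
    "L1": {"A": ("Suggest", "None"), "B": ("Propose-PR", SINGLE), "C": ("Propose-PR", DUAL)},
    "L2": {"A": ("Propose-PR", SINGLE), "B": ("Propose-PR", SINGLE), "C": ("Propose-PR", DUAL)},
    "L3": {"A": ("Auto-bounded", "Single review (post-hoc)"), "B": ("Propose-PR", SINGLE), "C": ("Propose-PR", DUAL)},
    "L4": {"A": ("Auto-bounded", "None within bounds"), "B": ("Auto-bounded", SINGLE), "C": ("Propose-PR", DUAL)},
    "L5": {"A": ("Auto-bounded", "None within bounds; sampled audit"), "B": ("Auto-bounded", SINGLE), "C": ("Propose-PR", DUAL)},
}


def required_control(level: str, safety_class: str):
    """Look up the permitted action and human control; fail closed if unknown."""
    return MATRIX.get(level, {}).get(safety_class, ("Forbidden", DUAL))


def class_c_invariant_holds(matrix: dict) -> bool:
    """Class C is always dual control and never Auto-bounded (P3)."""
    return all(
        row["C"][1] == DUAL and row["C"][0] != "Auto-bounded"
        for row in matrix.values()
    )
```

Running `class_c_invariant_holds(MATRIX)` as a check on every policy change would catch an edit that loosens a Class C cell before it is signed.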
11. Continuous security
Security is a steady-state operation, not a one-time gate.
| Capability | Implementation |
|---|---|
| Red-team agents | Standing adversarial agents continuously attempt prompt injection, context poisoning, tool misuse, and exfil against the live harness; findings feed §8 validation and §3 policy |
| Adversarial eval | OWASP-LLM / ATLAS-derived adversarial suites run in the deterministic eval pipeline (05); regressions block model/harness promotion |
| Drift & anomaly response | Monitor tool-call distributions, egress patterns, semantic-gate hit rates, and model-output drift; anomalies trip circuit-breakers (§4) and open incidents |
| Incident handling | Defined runbooks: SVID revocation, namespace freeze, fleet kill-switch, WORM-log forensic replay; incidents link to QMS CAPA |
| Secure model-update path | New weights/adapters: signed → SBOM'd → SLSA-provenanced (§6) → adversarial + functional eval gates (05) → Gatekeeper admission → staged rollout with rollback. For Track B models, this path executes within the PCCP envelope; for Track A, under QS change control + ISO/IEC 42001 |
| Vulnerability management | Continuous SBOM/CVE scanning of code + model dependencies; §524B-aligned triage and disclosure |
Appendix A: Control-to-standard traceability
| Control (this doc) | Standard / framework anchor |
|---|---|
| Policy Server, autonomy matrix (§3, §10) | IEC 62304 §5–§9, P3; CSA |
| Evidence / WORM records, e-signature (§7) | 21 CFR Part 11, ISO 13485/QMSR |
| Zero-trust, mTLS, egress, sandboxing (§2, §4) | IEC 62443, FDA premarket cybersecurity |
| Supply chain: signing/SBOM/SLSA (§6) | FD&C Act §524B, SLSA, Sigstore |
| Threat model, red-team, adversarial eval (§1, §11) | OWASP LLM Top 10, MITRE ATLAS |
| AI management system, change control (§8, §11) | ISO/IEC 42001 |
| PII/PHI masking, context hygiene (§5) | HIPAA, GDPR, ISO 14971 (risk) |
| Track A vs B, PCCP, high-risk (§9) | EU AI Act, EU MDR, FDA PCCP guidance |
Cross-references: autonomy bounds and maturity levels — 02-maturity-model; harness/sandbox architecture — 03-reference-architecture; model/adapter signing and fleet — 04-model-strategy-and-finetuning; deterministic eval and validation evidence — 05-evaluation-and-validation; workflow-level human controls — 06-agentic-workflows; cost of controls — 08-token-and-gpu-economics.