Search ARuntime.com

Find runtime definitions and implementation guidance

Search page titles, summaries, headings, glossary terms, use cases, and runtime-directory entries.

Enter at least two characters.

Operations

Security and Governance

AI runtime security and governance guide covering prompt injection, tool authorization, data exfiltration, model supply chain, sandboxing, memory poisoning, multi-tenancy, audit, approval, and incident response.

Audience: Technical readers Reading time: 7 minutes Status: Production guidance Last reviewed:

Key takeaways

  • Prompt text cannot be the final security boundary; deterministic controls must govern tool and data access.
  • Treat retrieved content, model output, tool output, and third-party protocol servers as untrusted inputs.
  • Identity must distinguish human actor, tenant, runtime service, delegated authority, and tool credential.
  • Model, adapter, tokenizer, runtime, container, tool, and policy artifacts form one supply chain.
  • Memory and telemetry require classification, provenance, retention, access, deletion, and incident procedures.
  • Human approval is effective only when bound to an exact action, target, context, and expiry.

Runtime boundary

A useful architecture identifies what this layer receives, owns, emits, measures, and refuses to own. That boundary prevents overlapping products from being treated as interchangeable.

Receives

Identity, delegated authority, classified data, model/tool artifacts, runtime request, policy, approval requirements, and threat context.

Owns

Trust boundaries, least privilege, validation, isolation, artifact integrity, egress, retention, policy enforcement, and incident controls.

Emits

Access decisions, constrained execution, redacted results, audit events, provenance, incidents, and remediation evidence.

Does not own

Legal advice, authority inferred from model confidence, or security delegated to prompt wording.

Failure modes

Prompt injection, confused deputy, unauthorized tool use, exfiltration, model tampering, memory poisoning, tenant leakage, secret exposure, and untraceable actions.

Evidence and metrics

Authorization deny, approval, data-classification blocks, egress, integrity failure, sandbox violation, memory writes, redaction, incident detection, and recovery.

Trust boundaries and identities

A runtime crosses user, product, model provider, data source, tool, memory, telemetry, and administrative boundaries.

Implementation

Model actor, tenant, service, delegated authority, target resource, and credential separately; authenticate at every boundary.

Operational implications

Do not trust tenant or permission claims copied from prompt text or client JSON.

Measure

Authentication failure, tenant mismatch, service identity, delegated-scope use, and credential rotation.

Prompt injection and untrusted content

Retrieved documents, web pages, emails, tool outputs, and messages can contain instructions intended to override policy.

Implementation

Label untrusted data, minimize context, separate instructions from content, restrict tools, validate actions, and use independent policy/approval.

Operational implications

Detection is defense in depth, not a replacement for least privilege.

Measure

Injection detections, context blocks, denied tools, escalations, and incident outcome.

Tool authorization and confused deputy

The model can propose an action but cannot establish the actor’s authority.

Implementation

Resolve normalized action/target, required permission, tenant scope, side effects, idempotency, approval, rate, and budget before execution.

Operational implications

Use credentials scoped to the permitted operation; never expose raw secrets to the model.

Measure

Tool validation/deny, approval, privileged action, credential scope, and side-effect verification.

Data classification and egress

Context, prompts, outputs, caches, logs, and external calls can expose sensitive data.

Implementation

Classify sources and fields, filter context, control destinations, redact, tokenize or reference protected values, and audit egress.

Operational implications

Model providers and telemetry backends are separate destinations with separate policies.

Measure

Data-class blocks, redaction, outbound bytes/domain, policy violations, and retention.

Model and dependency supply chain

Models can include custom code, unsafe formats, malicious weights, vulnerable libraries, and unexpected license obligations.

Implementation

Use approved registries, hashes/signatures, provenance, scanning, isolated conversion, SBOMs, runtime compatibility, and reproducible build evidence.

Operational implications

Avoid loading arbitrary remote-code models in privileged serving processes.

Measure

Integrity failures, provenance completeness, vulnerabilities, license review, and artifact promotion.

Sandboxing and isolation

Tool code, generated code, parsers, converters, and model plugins may require containment.

Implementation

Use process/container/VM isolation, non-root execution, read-only filesystems, minimal mounts, seccomp/capability restrictions, network egress policy, and resource quotas.

Operational implications

Isolation strength must match side effects and attacker control; containers are not the only boundary.

Measure

Sandbox violations, syscalls/network denies, quota events, escape indicators, and cleanup.

Multi-tenant inference

Tenants can share accelerators, memory pools, caches, queues, models, admin APIs, and telemetry.

Implementation

Enforce identity-aware quotas, cache keys, data separation, namespace/RBAC, admin access, encrypted transport, and protected traces.

Operational implications

Performance isolation and data isolation are separate requirements.

Measure

Cross-tenant alerts, quota denies, cache-sharing attempts, queue fairness, and access audit.

Memory governance

Durable memory can amplify false or malicious content across future sessions.

Implementation

Require schemas, provenance, confidence, owner, write permission, review, expiry, deletion, and conflict handling.

Operational implications

Treat durable memory writes as side effects; do not store hidden reasoning or unrestricted raw content.

Measure

Writes by source, approval, rejection, conflicts, expiry/deletion, and poisoning detections.

Output validation and structured constraints

Runtime outputs may feed APIs, databases, UI, or automation.

Implementation

Use JSON Schema/typed contracts, allowlists, bounded values, citations/evidence, encoding, and downstream validation.

Operational implications

Structured output reduces syntax errors but does not prove truth or permission.

Measure

Contract-valid output, semantic rejection, citation verification, and downstream errors.

Human approval and irreversible actions

High-impact changes should pause with a reviewable proposal.

Implementation

Bind approval to exact arguments, target, version, evidence, side-effect class, expiry, and one-time token; verify reviewer authority.

Operational implications

Revalidate if the action changes after approval.

Measure

Approval time/rate, expired/replayed tokens, modified proposals, and post-action verification.

Incident response and replay

Response requires immutable versions, trace context, protected evidence, state changes, and side-effect records.

Implementation

Define containment, credential revocation, cache/memory invalidation, replay, correction, notification, and lessons-learned workflows.

Operational implications

Do not depend on unavailable raw prompts when policy forbids storing them; use safe references and hashes.

Measure

Detection/containment/recovery time, affected runs, replay completeness, and corrective actions.

Reference tables

Runtime threat and control map
Threat Boundary Primary controls Evidence
Prompt injection Context → model/tool Untrusted-data separation, least privilege, action validation Context provenance and denied actions
Confused deputy Model → tool Actor/tenant authorization and scoped credentials Policy decision and tool audit
Data exfiltration Runtime → provider/tool/telemetry Classification, egress allowlist, redaction Destination and redaction events
Memory poisoning Model/tool → durable memory Schema, provenance, review, expiry Memory-change record
Supply-chain compromise Artifact → runtime Registry, hash/signature, SBOM, sandbox Provenance and integrity check
Cross-tenant leakage Shared serving/cache/telemetry Tenant keys, quotas, isolation, RBAC Access/cache audit
Duplicate side effect Retry → external system Idempotency and authoritative outcome check Invocation/result/compensation
Denial of wallet Request loop → capacity/providers Budgets, step/tool/token limits, backpressure Budget decisions and termination

Decision checklist

  1. Where are the runtime trust boundaries and identities authenticated?
  2. Which data classes can reach each model, tool, cache, and telemetry destination?
  3. How is tool authority determined independently of the prompt?
  4. Which artifacts can execute code and how are they verified?
  5. What isolation and egress controls match each side effect?
  6. How are tenant caches, queues, and traces separated?
  7. Which memory writes require review or expiry?
  8. What actions require human approval?
  9. Can an incident reconstruct versions, decisions, state changes, and side effects?

Common mistakes

  • Relying on a system prompt as the security boundary.
  • Passing production credentials into model context.
  • Using one service identity for every tenant and tool.
  • Sharing prefix cache across tenants without explicit policy.
  • Loading unverified model artifacts or custom code.
  • Logging full sensitive prompts and tool results.
  • Writing model output directly into long-term memory.
  • Approving an agent broadly rather than an exact action.
  • Retrying ambiguous side effects without checking state.

Sources and further reading


  1. OWASP Top 10 for LLM Applications
    (opens in a new tab)

    OWASP GenAI Security Project · Official documentation · accessed 2026-06-21 UTC

  2. NIST AI Risk Management Framework
    (opens in a new tab)

    NIST · Government framework · accessed 2026-06-21 UTC

  3. MITRE ATLAS
    (opens in a new tab)

    MITRE · Threat knowledge base · accessed 2026-06-21 UTC

  4. Model Context Protocol security
    (opens in a new tab)

    MCP · Official documentation · accessed 2026-06-21 UTC

  5. Supply-chain Levels for Software Artifacts
    (opens in a new tab)

    OpenSSF · Supply-chain specification · accessed 2026-06-21 UTC

  6. NIST Privacy Framework
    (opens in a new tab)

    NIST · Government framework · accessed 2026-06-21 UTC

Last reviewed: 2026-06-21 UTC

Maintenance record

Found an error, outdated capability, or unclear category boundary? Submit a correction with a supporting source.