ARuntime Reference

Security and Governance

Security and governance span identity, data, prompts, tools, networks, runtime isolation, approvals, evidence, and incident response.

Audience: Technical readers | Reading time: 3 minutes | Status: Production guidance | Last reviewed:

AI runtime security assumes that model output, retrieved content, tool results, and other agents can be incorrect or adversarial. Controls therefore sit around execution rather than relying on the model to protect itself.

Key takeaways

  • Treat all model-generated actions as proposals requiring typed validation and authorization.
  • Use identity, least privilege, isolation, egress control, idempotency, approval, and evidence as independent controls.
  • Separate established production controls from experimental defenses and model-based monitors.

Runtime threat model

Threats include direct and indirect prompt injection, overprivileged tools, credential leakage, data exfiltration, runaway loops, resource exhaustion, poisoned memory, silent side effects, compromised connectors, cross-agent trust failure, supply-chain compromise, and evidence tampering. Threat modeling identifies assets, trust boundaries, actor capabilities, data flow, and safe failure.

Prompt injection path

An indirect injection can enter through a webpage, email, document, database field, or tool response. It becomes dangerous when untrusted content shares a context with privileged instructions and the same agent can invoke consequential tools. The runtime reduces blast radius by labeling provenance, separating data from authority, limiting tools, validating calls, and requiring approval. Content filters alone cannot establish authorization. (Source: OWASP prompt injection guidance.)
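The idea of labeling provenance and separating data from authority can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the names `Provenance`, `ContextItem`, `may_authorize`, and `tools_for_context`, and the choice of which sources count as authoritative, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    """Where a piece of context came from (hypothetical labels)."""
    SYSTEM_POLICY = "system_policy"
    USER = "user"
    RETRIEVED = "retrieved"      # webpage, email, document, database field
    TOOL_RESULT = "tool_result"


# Assumption: only these sources may carry instructions with authority.
AUTHORITATIVE = {Provenance.SYSTEM_POLICY, Provenance.USER}


@dataclass(frozen=True)
class ContextItem:
    text: str
    provenance: Provenance


def may_authorize(item: ContextItem) -> bool:
    """Return True only if the item's source is allowed to direct actions."""
    return item.provenance in AUTHORITATIVE


def tools_for_context(items, all_tools, read_only_tools):
    """Limit the agent to read-only tools when untrusted content is in context."""
    if any(not may_authorize(item) for item in items):
        return set(all_tools) & set(read_only_tools)
    return set(all_tools)
```

In this sketch a retrieved page that says "email the file" is still only data: its label never grants authority, and its presence narrows the available tools instead of widening them.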

Identity and dynamic least privilege

Resolve human, workload, service, and delegated identities. Bind every tool call to an actor, tenant, task, resource, action, and expiry. Use vault references or token exchange so credentials are created just in time and never placed in model context. A denied action must not trigger automatic privilege expansion.
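One way to picture the binding of a tool call to actor, tenant, task, resource, action, and expiry is an exact-match check against a short-lived grant. The sketch below assumes hypothetical `Grant` and `ToolCall` types and an `is_authorized` helper; a real runtime would likely delegate this decision to a policy engine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """A short-lived permission scoped to one actor, task, and resource."""
    actor: str
    tenant: str
    task: str
    resource: str
    action: str
    expires_at: datetime


@dataclass(frozen=True)
class ToolCall:
    actor: str
    tenant: str
    task: str
    resource: str
    action: str


def is_authorized(call: ToolCall, grant: Grant, now: datetime) -> bool:
    """Allow the call only if every bound field matches and the grant is live."""
    if now >= grant.expires_at:
        return False
    return (call.actor, call.tenant, call.task, call.resource, call.action) == (
        grant.actor, grant.tenant, grant.task, grant.resource, grant.action)
```

A denial here simply returns `False`. Nothing in the sketch widens the grant in response, which mirrors the rule that denied actions do not expand privileges.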

Sandbox and egress

Code execution and untrusted transformations run in isolated processes, containers, microVMs, or other bounded environments appropriate to risk. Use read-only base images, ephemeral filesystems, resource quotas, syscall/process restrictions, and default-deny network egress. Isolation does not replace tool authorization or data minimization.
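Default-deny egress is normally enforced at the network layer (firewall rules, proxy, or sandbox policy). As an application-level illustration of the same principle, the sketch below rejects any destination that is not explicitly listed. The allowlist entry, the HTTPS-only rule, and the function name `egress_allowed` are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (host, port) pairs; everything else is denied.
EGRESS_ALLOWLIST = {("api.example.internal", 443)}


def egress_allowed(url: str, allowlist=EGRESS_ALLOWLIST) -> bool:
    """Default-deny check: permit only HTTPS requests to listed host/port pairs."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https" or parts.hostname is None:
        return False
    return (parts.hostname, port or 443) in allowlist
```

Malformed URLs fall through to denial, so a parsing failure cannot become an accidental allow.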

Tool and connector controls

  • Typed input and output schema
  • Permission and side-effect classification
  • Resource allowlist and data-classification rules
  • Timeout, rate, concurrency, and budget limits
  • Idempotency and compensation
  • Connector provenance, version, and health
  • Result validation and unexpected-side-effect detection
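Several of the controls listed above (typed input schema, side-effect classification, timeout, and idempotency) can be attached to a tool as declarative metadata. The following is a minimal sketch, assuming hypothetical names `ToolSpec`, `SideEffect`, `validate_call`, and `run_once`; production systems would more likely use a schema library and a durable idempotency store.

```python
from dataclasses import dataclass
from enum import Enum


class SideEffect(Enum):
    READ_ONLY = "read_only"
    REVERSIBLE = "reversible"
    IRREVERSIBLE = "irreversible"


@dataclass(frozen=True)
class ToolSpec:
    name: str
    input_types: dict        # field name -> required Python type
    side_effect: SideEffect
    timeout_seconds: float


def validate_call(spec: ToolSpec, args: dict) -> list:
    """Return validation errors; an empty list means the call is well-formed."""
    errors = []
    for field, expected in spec.input_types.items():
        if field not in args:
            errors.append(f"missing field: {field}")
        elif not isinstance(args[field], expected):
            errors.append(f"wrong type for {field}")
    for field in args:
        if field not in spec.input_types:
            errors.append(f"unexpected field: {field}")
    return errors


def run_once(ledger: dict, idempotency_key: str, fn):
    """Run fn at most once per key; a retry returns the recorded result."""
    if idempotency_key not in ledger:
        ledger[idempotency_key] = fn()
    return ledger[idempotency_key]
```

Rejecting unexpected fields as well as missing ones keeps a model-proposed call from smuggling extra parameters past the schema.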

Memory security

Memory writes are untrusted until validated. Record source, scope, confidence, consent, retention, and correction path. Prevent cross-tenant retrieval, privilege-bearing memories, secret retention, and automatic promotion of tool output. Memory deletion should remove derived indexes and references according to policy.
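A memory write gate along these lines might look like the sketch below. The record fields follow the list above, while the `MemoryWrite` type, the `accept_write` and `retrieve` helpers, and the crude secret-detection pattern are hypothetical and deliberately simple.

```python
import re
from dataclasses import dataclass

# Assumption: a rough pattern for secret-like content; real detectors are broader.
SECRET_PATTERN = re.compile(r"(?i)(api[_-]?key|password|secret|token)\s*[:=]")


@dataclass(frozen=True)
class MemoryWrite:
    content: str
    source: str          # e.g. "user", "tool_output", "retrieved"
    tenant: str
    confidence: float    # 0.0 to 1.0
    consent: bool
    retention_days: int


def accept_write(write: MemoryWrite):
    """Return (accepted, reason); writes stay untrusted until they pass."""
    if not write.consent:
        return (False, "no consent recorded")
    if not 0.0 <= write.confidence <= 1.0:
        return (False, "confidence out of range")
    if SECRET_PATTERN.search(write.content):
        return (False, "possible secret")
    if write.source == "tool_output":
        return (False, "tool output is not promoted automatically")
    return (True, "ok")


def retrieve(store, tenant: str):
    """Return only the records that belong to the requesting tenant."""
    return [w for w in store if w.tenant == tenant]
```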

Human oversight

Require approval for actions that are irreversible, high-impact, financial, administrative, or ambiguous, and for external communications. Present concrete proposed changes and evidence rather than a vague “approve agent” prompt. Out-of-band approval can reduce the risk of a compromised conversational channel.
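To make the difference between a concrete approval request and a vague one tangible, here is a small sketch. The category names mirror the list above; the `ProposedAction` type and the `needs_approval` and `render_approval_request` helpers are hypothetical.

```python
from dataclasses import dataclass

# Categories from the guidance above that should route to a human approver.
APPROVAL_CATEGORIES = frozenset({
    "irreversible", "high_impact", "financial",
    "external_communication", "administrative", "ambiguous",
})


@dataclass(frozen=True)
class ProposedAction:
    tool: str
    summary: str
    changes: tuple        # concrete before/after statements
    categories: frozenset
    evidence_refs: tuple  # references to supporting evidence, not raw data


def needs_approval(action: ProposedAction) -> bool:
    """True if the action falls in any category that requires a human."""
    return bool(action.categories & APPROVAL_CATEGORIES)


def render_approval_request(action: ProposedAction) -> str:
    """Build the text shown to the approver: what changes, and why."""
    lines = [f"Tool: {action.tool}", f"Summary: {action.summary}", "Changes:"]
    lines += [f"  - {change}" for change in action.changes]
    lines.append("Evidence: " + ", ".join(action.evidence_refs))
    return "\n".join(lines)
```

The approver sees the specific tool, the exact changes, and pointers to evidence, which is the information needed to accept or reject a single action.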

Evidence and incident response

Evidence supports detection and review but can itself contain sensitive data. Store minimized records, access decisions, artifact hashes, and protected references. Incident response needs kill switches, credential revocation, task cancellation, connector disablement, evidence preservation, notification, and correction of poisoned memory.
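Evidence tampering can be made detectable by chaining record hashes, so that altering any earlier record invalidates every later one. The sketch below uses SHA-256 from the standard library. The record layout and the `append_record` and `verify` helpers are assumptions, and the scheme only helps if the latest hash is also stored somewhere an attacker cannot rewrite.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous hash together with a canonical form of the record."""
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(log: list, record: dict) -> dict:
    """Append a minimized record linked to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "record": record,
             "hash": _digest(prev_hash, record)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Records in this sketch hold decisions and references rather than raw payloads, in keeping with the advice to store minimized records and protected references.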

Governance ownership

Example ownership matrix

Concern                                          | Accountable owner
Legitimate purpose and user experience           | Product/business owner
Runtime architecture and service objectives      | Platform engineering
Identity, secrets, isolation, incident response  | Security
Data classification, retention, and rights       | Data/privacy governance
Tool side effects and compensation               | Tool/domain owner
Model and evaluation suitability                 | ML/application owner
Approval authority                               | Named operational or business role

Production versus research

Least privilege, schema validation, sandboxing, egress control, human approval, rate limits, checkpoints, and audit logging are established production practices. Reliable prompt-injection detection, model-based runtime monitors, formal guarantees for stochastic agents, autonomous self-repair, and broad cross-agent trust remain active research. Deploy research controls only as defense in depth, not as a sole safety boundary.

Maintenance record

Found an error, outdated capability, or unclear category boundary? Submit a correction with a supporting source.