Evidence Schema

Evidence records show what request ran, which versions and sources were used, what actions occurred, what changed, how failures were handled, and what uncertainty remains.

Audience: Technical readers
Reading time: 2 minutes
Status: Developer reference
Last reviewed: 2026-06-23 UTC

Evidence is designed for review, not indiscriminate payload retention.

Key takeaways

Evidence correlates decisions and outcomes across runtime layers.
Protected references and hashes are often safer than raw prompts and tool payloads.
Evidence can support accountability without claiming perfect replay or model explainability.

[ar_diagram id="evidence-trace-lifecycle"]

Purpose

Operational logs are optimized for debugging and may be sampled or deleted. Evidence is selected for durable review according to purpose, access, retention, and integrity policy. It includes negative events such as denials, failed attempts, compensation, and missing dependencies.

Evidence record

The schema includes evidenceVersion, eventId, requestId, correlationId, traceId, sequence, occurredAtUtc, eventType, actorRef, tenantRef, contractVersions, contextRefs, modelRoute, toolInvocations, policyDecisions, approvals, sideEffects, changedResources, failures, retries, recoverySteps, evaluationResults, finalOutputRef, redaction, and retentionPolicy.
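As a rough illustration, the fields listed above could be modeled as a typed record. This is a minimal sketch, not the published schema: the field names come from the list, but every type, default, and sample value below is an assumption, and the downloadable JSON Schema files remain the authority.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Optional


@dataclass
class EvidenceRecord:
    # Field names follow the schema list; all types and defaults are assumptions.
    evidenceVersion: str
    eventId: str
    requestId: str
    correlationId: str
    traceId: str
    sequence: int
    occurredAtUtc: str  # assumed to be an ISO 8601 UTC timestamp string
    eventType: str
    actorRef: str
    tenantRef: str
    contractVersions: dict[str, str] = field(default_factory=dict)
    contextRefs: list[str] = field(default_factory=list)
    modelRoute: Optional[str] = None
    toolInvocations: list[dict[str, Any]] = field(default_factory=list)
    policyDecisions: list[dict[str, Any]] = field(default_factory=list)
    approvals: list[dict[str, Any]] = field(default_factory=list)
    sideEffects: list[dict[str, Any]] = field(default_factory=list)
    changedResources: list[str] = field(default_factory=list)
    failures: list[dict[str, Any]] = field(default_factory=list)
    retries: list[dict[str, Any]] = field(default_factory=list)
    recoverySteps: list[dict[str, Any]] = field(default_factory=list)
    evaluationResults: list[dict[str, Any]] = field(default_factory=list)
    finalOutputRef: Optional[str] = None
    redaction: dict[str, Any] = field(default_factory=dict)
    retentionPolicy: Optional[str] = None


# Illustrative placeholder values only; none of these identifiers are real.
example = EvidenceRecord(
    evidenceVersion="1.0",
    eventId="evt-001",
    requestId="req-001",
    correlationId="corr-001",
    traceId="trace-001",
    sequence=1,
    occurredAtUtc="2026-06-23T00:00:00Z",
    eventType="request.admitted",
    actorRef="actor:example",
    tenantRef="tenant:example",
)

# Convert to a plain dict, for example before JSON serialization.
record_dict = asdict(example)
```

The camelCase attribute names are kept deliberately so the sketch lines up with the schema's field names rather than with Python naming conventions.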

Event sequence

request.admitted
context.selected or context.failed
route.selected
model.completed or model.failed
policy.decided
approval.requested, then approval.approved, approval.denied, or approval.expired
tool.started, then tool.completed, tool.partial, or tool.failed
recovery.started, then recovery.completed or recovery.failed
artifact.produced
evaluation.completed
request.completed, request.failed, or request.terminated

Events are append-oriented and ordered within a request using sequence values. Corrections link to the prior record rather than erasing history.
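The ordering and correction rules can be sketched as a small append guard. This is an illustrative sketch under stated assumptions: `Event`, `append_event`, and the `correctsEventId` link field are hypothetical names, and "ordered" is interpreted here as strictly increasing sequence values within a request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    eventId: str
    requestId: str
    sequence: int
    eventType: str
    # Hypothetical field: points at the earlier record this event corrects.
    correctsEventId: Optional[str] = None


def append_event(log: list[Event], event: Event) -> None:
    """Append an event, rejecting out-of-order sequences and dangling corrections."""
    same_request = [e for e in log if e.requestId == event.requestId]
    if same_request and event.sequence <= same_request[-1].sequence:
        raise ValueError("sequence must increase within a request")
    if event.correctsEventId is not None:
        known_ids = {e.eventId for e in log}
        if event.correctsEventId not in known_ids:
            raise ValueError("correction must reference an existing event")
    # Existing entries are never modified or removed; corrections are new events.
    log.append(event)


log: list[Event] = []
append_event(log, Event("e1", "r1", 1, "request.admitted"))
append_event(log, Event("e2", "r1", 2, "route.selected"))
# A correction is appended as a new record that links back to "e2".
append_event(log, Event("e3", "r1", 3, "route.selected", correctsEventId="e2"))
```

A stricter policy, such as gapless sequences, would be a further assumption on top of this sketch.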

Redaction and minimization

Store context references, source identifiers, classification, and hashes rather than full text when possible.
Never persist credentials or secret values.
Separate restricted payload stores from broadly accessible event metadata.
Record why a field was redacted and which authorized system can resolve it.
Apply deletion and legal-hold rules to derived evidence as well as source data.
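One way to apply the first and fourth rules above is to store a digest and redaction metadata in place of the payload. The sketch below is a minimal example, assuming SHA-256 as the hash; the function name `to_protected_ref` and the keys `sha256`, `reason`, and `resolvableBy` are hypothetical and not taken from the schema.

```python
import hashlib
import json


def to_protected_ref(payload: str, classification: str, reason: str, resolver: str) -> dict:
    """Return a reference that identifies a payload without containing its text."""
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {
        "sha256": digest,
        "classification": classification,
        "redaction": {
            "reason": reason,          # why the raw field was withheld
            "resolvableBy": resolver,  # which authorized system can resolve it
        },
    }


# Illustrative values only.
raw_prompt = "Customer asked about invoice 1234"
ref = to_protected_ref(
    raw_prompt,
    classification="restricted",
    reason="contains customer data",
    resolver="restricted-payload-store",
)
serialized = json.dumps(ref)
```

A plain hash of a short or guessable value can be recovered by brute force, so a keyed hash (for example, Python's `hmac` module) may be a better fit for low-entropy fields.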

Integrity and retention

Use append-only storage controls, authenticated writers, sequence checks, artifact hashes, and monitored export, adding signatures or other tamper-evident mechanisms where required. Retention is field- and event-aware: a policy decision may need a different lifetime from a temporary model trace.
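A hash chain is one common tamper-evident mechanism, shown here only as an example of the category. The sketch assumes SHA-256 and a canonical JSON encoding; `GENESIS`, `chain_hash`, `build_chain`, and `verify_chain` are hypothetical names, and the source does not prescribe this technique.

```python
import hashlib
import json

# Assumed starting value for the chain; any fixed, agreed constant would do.
GENESIS = "0" * 64


def chain_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the previous link's hash."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def build_chain(events: list[dict]) -> list[str]:
    """Return one hash per event, each depending on all earlier events."""
    hashes = []
    prev = GENESIS
    for event in events:
        prev = chain_hash(prev, event)
        hashes.append(prev)
    return hashes


def verify_chain(events: list[dict], hashes: list[str]) -> bool:
    """Recompute the chain and compare it with the stored hashes."""
    return build_chain(events) == hashes


events = [
    {"sequence": 1, "eventType": "request.admitted"},
    {"sequence": 2, "eventType": "policy.decided"},
]
stored_hashes = build_chain(events)

# Editing an earlier event changes its hash and every hash after it.
tampered = [dict(events[0], eventType="request.failed"), events[1]]
```

A chain like this detects modification only if the stored hashes are themselves protected, for example by signing them or keeping them in a separate store.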

Limits of evidence

Evidence cannot prove that a model’s internal reasoning was correct, reconstruct omitted sensitive payloads, or guarantee that an external system reported truthfully. It should clearly identify observed facts, derived assessments, unavailable data, and editorial or model-generated interpretation.
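The four categories in the last sentence could be made explicit by tagging each statement in a record. This is a small sketch with hypothetical names (`ClaimKind`, `Claim`, `group_by_kind`); the source does not define such a structure.

```python
from dataclasses import dataclass
from enum import Enum


class ClaimKind(Enum):
    OBSERVED = "observed"              # directly recorded fact
    DERIVED = "derived"                # assessment computed from observed facts
    UNAVAILABLE = "unavailable"        # data that was omitted or could not be obtained
    INTERPRETATION = "interpretation"  # editorial or model-generated reading


@dataclass(frozen=True)
class Claim:
    kind: ClaimKind
    statement: str


def group_by_kind(claims: list[Claim]) -> dict[ClaimKind, list[str]]:
    """Group statements by kind so a reviewer can see what rests on what."""
    grouped: dict[ClaimKind, list[str]] = {kind: [] for kind in ClaimKind}
    for claim in claims:
        grouped[claim.kind].append(claim.statement)
    return grouped


# Illustrative statements only.
claims = [
    Claim(ClaimKind.OBSERVED, "tool.completed was recorded at sequence 7"),
    Claim(ClaimKind.UNAVAILABLE, "raw tool payload was not retained"),
    Claim(ClaimKind.INTERPRETATION, "the model summary describes the change as low risk"),
]
grouped = group_by_kind(claims)
```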

Example and schema

[ar_downloads file="evidence-record-example.json"]
[ar_downloads file="aruntime-evidence-record.schema.json"]
[ar_downloads file="aruntime-evidence-event.schema.json"]
