The trace schema correlates infrastructure, model, tool, policy, business, and evaluation activity without assuming every organization can or should store raw prompts.
Key takeaways
- Use one trace context with explicit span kinds rather than flattening all events into “LLM trace.”
- Identifiers and protected references support correlation without copying payloads.
- Sampling must preserve required security and evidence events.
Purpose
Tracing explains latency, dependencies, failures, and causal relationships during an execution. Evidence is a durable, review-oriented subset of that activity. The same request and correlation identifiers connect the two.
Trace kinds
Infrastructure
Queues, workers, devices, networks, storage, and process lifecycle.
Model
Route, deployment version, prefill, decode, tokens, cache, and stop reason.
Tool
Tool ID/version, authorization, operation key, dependency latency, and result status.
Policy
Policy version, decision, reason codes, and approval relation.
Business
Domain step and outcome reference without unnecessary payload.
Evaluation
Evaluator version, dataset/case, score, pass/fail, and uncertainty.
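As a rough illustration of keeping span kinds explicit inside one trace, the sketch below models the six kinds as an enumeration and attaches kind-specific attributes to each span. The class names, field names, and sample values are assumptions made for this example and are not taken from the published schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class SpanKind(str, Enum):
    """The six trace kinds described above (identifier spelling is assumed)."""
    INFRASTRUCTURE = "infrastructure"
    MODEL = "model"
    TOOL = "tool"
    POLICY = "policy"
    BUSINESS = "business"
    EVALUATION = "evaluation"


@dataclass
class Span:
    trace_id: str
    span_id: str
    kind: SpanKind
    name: str
    parent_span_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)


# One trace context, several span kinds (all values are hypothetical).
trace_id = "trace-001"
spans = [
    Span(trace_id, "s1", SpanKind.INFRASTRUCTURE, "worker.dequeue"),
    Span(trace_id, "s2", SpanKind.MODEL, "model.generate", "s1",
         {"deployment_version": "example-v1", "stop_reason": "tool_call"}),
    Span(trace_id, "s3", SpanKind.POLICY, "policy.check", "s2",
         {"policy_version": "p-7", "decision": "allow",
          "reason_codes": ["within_limit"]}),
    Span(trace_id, "s4", SpanKind.TOOL, "tool.invoke", "s3",
         {"tool_id": "example.lookup", "tool_version": "1.2",
          "result_status": "ok"}),
]


def spans_of_kind(all_spans, kind):
    """Filter by kind without losing the shared trace context."""
    return [s for s in all_spans if s.kind is kind]
```

Because the kind is a field rather than a separate event stream, a reader can filter to policy or tool spans and still walk parent links back to the model call that triggered them.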
Correlation
Use traceId and parent/child spanId relationships, plus requestId, correlationId, tool operation ID, policy decision ID, approval ID, and artifact references. Async waits and message delivery use trace links when strict parentage is misleading.
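The following sketch shows one way the link case could look: a consumer span that runs after an asynchronous wait carries a link to the producer span instead of claiming it as a parent, while the shared request and correlation identifiers still tie the two together. The record layout, the `relation` label, and all identifier values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SpanLink:
    """Non-parent reference to another span (field names are assumed)."""
    trace_id: str
    span_id: str
    relation: str  # for example "enqueued_by"


@dataclass
class CorrelatedSpan:
    trace_id: str
    span_id: str
    request_id: str
    correlation_id: str
    parent_span_id: Optional[str] = None
    links: list = field(default_factory=list)
    refs: dict = field(default_factory=dict)  # decision, approval, artifact IDs


# The producer enqueues work; the consumer picks it up later in its own trace.
producer = CorrelatedSpan("trace-a", "span-1", "req-42", "corr-9")
consumer = CorrelatedSpan(
    trace_id="trace-b",
    span_id="span-7",
    request_id="req-42",
    correlation_id="corr-9",
    links=[SpanLink("trace-a", "span-1", "enqueued_by")],
    refs={"approval_id": "appr-3", "policy_decision_id": "pd-11"},
)


def related(a: CorrelatedSpan, b: CorrelatedSpan) -> bool:
    """True when two spans belong to the same logical request."""
    return a.request_id == b.request_id and a.correlation_id == b.correlation_id
```

Leaving `parent_span_id` empty on the consumer avoids implying that the producer's duration includes the queue wait, which is the usual way strict parentage becomes misleading.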
Payload policy
Default to metadata: model deployment, token counts, schema name, tool ID, status, classification, and hashes. Raw prompts, completions, tool inputs, and outputs require an explicit purpose, a restricted store, access controls, redaction, and a retention limit. Trace attributes must never contain credentials.
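A minimal sketch of the metadata-first default, assuming an allowlist of attribute keys and a SHA-256 digest standing in for the raw prompt. The key names and the credential check are illustrative assumptions; a real deployment would define its own lists.

```python
import hashlib

# Assumed allowlist: only these metadata keys may become trace attributes.
ALLOWED_KEYS = {
    "model_deployment", "input_tokens", "output_tokens",
    "schema_name", "tool_id", "status", "classification",
}
# Assumed markers for credential-like keys; a match rejects the whole record.
FORBIDDEN_KEY_PARTS = ("password", "secret", "api_key", "authorization", "credential")


def payload_ref(payload: str) -> str:
    """Return a hash reference so the payload itself is never copied."""
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_attributes(metadata: dict, raw_prompt: str) -> dict:
    """Keep allowlisted metadata, replace the prompt with a hash reference."""
    for key in metadata:
        if any(part in key.lower() for part in FORBIDDEN_KEY_PARTS):
            raise ValueError(f"credential-like attribute rejected: {key}")
    attrs = {k: v for k, v in metadata.items() if k in ALLOWED_KEYS}
    attrs["prompt_hash"] = payload_ref(raw_prompt)
    return attrs


attrs = build_attributes(
    {"tool_id": "example.lookup", "status": "ok", "raw_prompt": "hello"},
    raw_prompt="hello",
)
```

Here `raw_prompt` is silently dropped because it is not allowlisted, and only its digest is recorded. A plain hash of a short or predictable payload can be guessed by brute force, so a keyed hash (for example HMAC with a protected key) may be the safer choice where that matters.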
Sampling
Head sampling can reduce cost but may miss rare failures. Tail or rule-based sampling can retain errors, denials, approvals, high latency, and high-risk actions. Required evidence events are never dropped merely because the observability trace they belong to was not sampled.
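The sketch below combines the two approaches under stated assumptions: a deterministic head sampler keyed on the trace ID, a rule-based tail decision that always keeps notable traces, and a router that sends required evidence events to a separate store whatever the sampling decision was. The rate, the latency threshold, the summary keys, and the store names are all hypothetical.

```python
import hashlib

HEAD_SAMPLE_RATE = 0.10       # assumed baseline rate
LATENCY_THRESHOLD_MS = 5000   # assumed "high latency" cutoff


def head_sampled(trace_id: str, rate: float = HEAD_SAMPLE_RATE) -> bool:
    """Deterministic head sampling: map the trace ID to a bucket in [0, 1]."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def keep_trace(trace_id: str, summary: dict) -> bool:
    """Tail rule: keep notable traces, otherwise fall back to head sampling."""
    if summary.get("error") or summary.get("policy_decision") == "deny":
        return True
    if summary.get("approval_id") or summary.get("high_risk"):
        return True
    if summary.get("latency_ms", 0) >= LATENCY_THRESHOLD_MS:
        return True
    return head_sampled(trace_id)


def route_event(event: dict, trace_kept: bool) -> list:
    """Evidence routing is independent of the trace sampling decision."""
    sinks = []
    if event.get("required_evidence"):
        sinks.append("evidence_store")
    if trace_kept:
        sinks.append("trace_store")
    return sinks
```

For example, `route_event({"required_evidence": True}, trace_kept=False)` returns `["evidence_store"]`: the trace was dropped, but the evidence event still has a destination.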
Errors and status
Record stable error type, source layer, retry eligibility, and sanitized message. A span can be successful while the business outcome fails, or failed while a recovery succeeds. Keep execution status and outcome evaluation separate.
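One way to keep the two notions apart is to give each span record both an execution status and a separate outcome field, as in this sketch. The enum values, field names, and sample errors are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionStatus(str, Enum):
    OK = "ok"
    ERROR = "error"


class Outcome(str, Enum):
    ACHIEVED = "achieved"
    NOT_ACHIEVED = "not_achieved"
    UNKNOWN = "unknown"


@dataclass
class SpanError:
    error_type: str    # stable, machine-readable type
    source_layer: str  # e.g. "tool", "model", "infrastructure"
    retryable: bool
    message: str       # sanitized: no payloads, no credentials


@dataclass
class SpanResult:
    span_id: str
    status: ExecutionStatus  # did this span execute without error?
    outcome: Outcome         # outcome of the business step this span served
    error: Optional[SpanError] = None


# The tool call ran cleanly, but the business step was not achieved.
declined = SpanResult("s4", ExecutionStatus.OK, Outcome.NOT_ACHIEVED)

# This attempt failed, yet a later retry recovered the business step.
timed_out = SpanResult(
    "s5", ExecutionStatus.ERROR, Outcome.ACHIEVED,
    SpanError("dependency_timeout", "tool", True, "upstream timed out"),
)
```

Querying on `status` answers reliability questions, while querying on `outcome` answers whether the work achieved its goal. Collapsing them into one field would hide both of the cases shown.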
Example and schema
Downloads: trace-correlation-example.json, aruntime-trace-record.schema.json
