AI Runtime Glossary - aRuntime.com

Agentic runtime

Definition: A production execution layer for agentic work that coordinates context, tools, memory, policy, state, and traceability.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Tool broker, Tool contract, Side-effect classification
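
To make the definition above concrete, here is a minimal sketch of a layer that coordinates tools, policy, state, and traceability. The class AgentRuntime, its fields, and the "add" tool are hypothetical names chosen for illustration; they are not part of any aRuntime.com API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class AgentRuntime:
    """Hypothetical minimal runtime: tools, policy, state, and a trace."""
    tools: Dict[str, Callable[[dict], dict]]
    allowed_tools: Set[str]                           # policy: tools that may run
    state: dict = field(default_factory=dict)         # run state / memory
    trace: List[dict] = field(default_factory=list)   # traceability log

    def call_tool(self, name: str, args: dict) -> dict:
        # The policy check happens before any tool executes.
        if name not in self.allowed_tools:
            self.trace.append({"tool": name, "status": "denied"})
            raise PermissionError(f"tool {name!r} is not permitted")
        result = self.tools[name](args)
        # Keep the latest result in state and record the call in the trace.
        self.state[name] = result
        self.trace.append({"tool": name, "args": args, "status": "ok"})
        return result


runtime = AgentRuntime(
    tools={"add": lambda a: {"sum": a["x"] + a["y"]}},
    allowed_tools={"add"},
)
result = runtime.call_tool("add", {"x": 2, "y": 3})
```

A production runtime would also need context assembly, durable state, and richer policy; the sketch only shows where those concerns meet in one call path.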

Tool broker

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool contract, Side-effect classification

Tool contract

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Side-effect classification

Side-effect classification

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Idempotency

Definition: The property that repeating an operation with the same inputs has the same effect as performing it once, which makes tool calls safe to retry.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract
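
One common way to get idempotent tool calls is to key each operation and reuse the stored result on a retry. The sketch below assumes an in-memory store; IdempotentExecutor, the key format, and send_email are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict


class IdempotentExecutor:
    """Hypothetical helper: runs each keyed operation at most once."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def run(self, key: str, operation: Callable[[], Any]) -> Any:
        # A repeated key returns the stored result instead of re-running.
        if key in self._results:
            return self._results[key]
        result = operation()
        self._results[key] = result
        return result


sent = []
executor = IdempotentExecutor()


def send_email() -> int:
    # Stand-in for a side-effecting tool call.
    sent.append("welcome")
    return len(sent)


first = executor.run("email:user-1", send_email)
retry = executor.run("email:user-1", send_email)  # simulated retry
```

In this sketch the retry returns the first result and the side effect happens once. A real runtime would persist the key store so the guarantee survives restarts.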

Memory manager

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Working memory

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Long-term memory

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Context assembly

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Human review

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Rollback

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Compensation

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Evaluation envelope

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Runtime contract

Definition: An agentic runtime concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Agentic runtime

Related: Agentic runtime, Tool broker, Tool contract

Intermediate representation

Definition: A representation of a program or model used inside a compiler, between the source form and the target code, on which analysis and optimization passes operate.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: ONNX, StableHLO, MLIR

ONNX

Definition: An open model graph format used to exchange models between frameworks and inference runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, StableHLO, MLIR

StableHLO

Definition: A portable high-level operation set used in compiler workflows around OpenXLA-compatible systems.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, MLIR

MLIR

Definition: A compiler infrastructure from the LLVM project for defining and transforming multiple levels of intermediate representation through extensible dialects.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

TOSA

Definition: Tensor Operator Set Architecture, a specification of a small, stable set of tensor operations commonly used by neural networks.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

HLO

Definition: High Level Operations, the operation set that the XLA compiler uses as its intermediate representation.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

PTX

Definition: Parallel Thread Execution, NVIDIA's virtual instruction set for GPU code, which is compiled further into machine code for a specific GPU architecture.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

SPIR-V

Definition: A binary intermediate language from the Khronos Group for graphics shaders and compute kernels, consumed by APIs such as Vulkan and OpenCL.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Tracing

Definition: Capturing a model graph by running the model on example inputs and recording the operations that execute.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Scripting

Definition: Capturing a model graph by compiling the model's source code directly, so that data-dependent control flow is preserved.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

JIT compilation

Definition: Just-in-time compilation: compiling at run time, typically on first execution, when the actual shapes, types, and hardware are known.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

AOT compilation

Definition: Ahead-of-time compilation: compiling before deployment, which produces an artifact that runs without a compile step at run time.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Lowering

Definition: Translating a higher-level representation into a lower-level one that is closer to the target hardware.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Code generation

Definition: The compiler stage that emits target-specific code, such as machine instructions or GPU kernels, from a lowered representation.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Shape guard

Definition: A run-time check that input shapes match the assumptions under which code was compiled, triggering recompilation or fallback when they do not.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Dynamic shape

Definition: A tensor dimension whose size is not known until run time, such as batch size or sequence length.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Graph partitioning

Definition: Splitting a model graph into subgraphs that are assigned to different backends or devices.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Execution provider

Definition: A backend abstraction used to run supported graph partitions on specific hardware or libraries.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO
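
As a rough illustration of how supported graph partitions might be assigned to backends, the sketch below groups consecutive operations by the first provider that supports them and falls back to a default for the rest. It is a simplification: the model is treated as a linear list of op names rather than a graph, and assign_partitions, the provider names, and the op names are all hypothetical, not taken from any specific runtime.

```python
from typing import List, Set, Tuple


def assign_partitions(
    ops: List[str],
    providers: List[Tuple[str, Set[str]]],
    fallback: str = "cpu",
) -> List[Tuple[str, List[str]]]:
    """Group consecutive ops by the first provider that supports them."""
    partitions: List[Tuple[str, List[str]]] = []
    for op in ops:
        # Providers are checked in priority order; unsupported ops fall back.
        target = next(
            (name for name, supported in providers if op in supported),
            fallback,
        )
        if partitions and partitions[-1][0] == target:
            partitions[-1][1].append(op)   # extend the current partition
        else:
            partitions.append((target, [op]))  # start a new partition
    return partitions


plan = assign_partitions(
    ops=["conv", "relu", "custom_nms", "matmul"],
    providers=[("gpu", {"conv", "relu", "matmul"})],
)
# plan == [("gpu", ["conv", "relu"]), ("cpu", ["custom_nms"]), ("gpu", ["matmul"])]
```

Each boundary between partitions usually implies a data transfer between backends, so one unsupported operation in the middle of a model can cost more than its own execution time.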

Delegate

Definition: A mobile or edge backend that accelerates supported operations on a target processor.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

BYOC

Definition: Bring Your Own Codegen, the Apache TVM mechanism for offloading supported subgraphs to an external compiler or library.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Compiler and IR terms

Related: Intermediate representation, ONNX, StableHLO

Model repository

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model versioning, Canary rollout, Blue-green deployment

Model versioning

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Canary rollout, Blue-green deployment

Canary rollout

Definition: Releasing a new model or runtime version to a small share of traffic first, then widening exposure while its metrics stay healthy.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Blue-green deployment
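
A canary rollout needs some rule for deciding which requests reach the new version. The sketch below shows one possible rule, assuming a stable request or user identifier is available: hash the identifier into 100 buckets and send the lowest buckets to the canary. The function route_version and the labels "canary" and "stable" are hypothetical.

```python
import hashlib


def route_version(request_id: str, canary_percent: int) -> str:
    """Send a stable share of requests to the canary model version."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the identifier to one of 100 buckets, deterministically.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


# Count how many of 1000 synthetic requests land on the canary at 10%.
canary_hits = sum(
    route_version(f"req-{i}", 10) == "canary" for i in range(1000)
)
```

Hashing rather than random sampling keeps a given identifier on the same version across requests, which makes metrics easier to compare while the rollout percentage is raised.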

Blue-green deployment

Definition: Running two complete environments side by side and switching traffic from the current one to the new one in a single step, with the old one kept available for rollback.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Autoscaling

Definition: Automatically adjusting the number of serving replicas in response to load.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Cold start

Definition: The added latency of the first request served by a new instance, which must start its process and load model weights before it can respond.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Warmup

Definition: Running representative requests through a newly started instance before it takes live traffic, so that caches, kernels, and compiled graphs are ready.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Scale to zero

Definition: Removing all running instances of a service while it is idle, at the cost of a cold start on the next request.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Kubernetes

Definition: An open-source system for deploying, scaling, and managing containerized workloads across a cluster of machines.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Serverless runtime

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

MicroVM

Definition: A lightweight virtual machine with a minimal device model that combines hardware-level isolation with fast startup.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Private cloud runtime

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Managed cloud runtime

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Air-gapped runtime

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Hybrid runtime

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Local runtime

Definition: A deployment concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Deployment

Related: Model repository, Model versioning, Canary rollout

Tensor parallelism

Definition: Splitting the weight tensors of individual layers across devices, so that each device computes part of every layer and the partial results are combined.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Pipeline parallelism, Data parallelism, Expert parallelism
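
Tensor parallelism is commonly described as sharding a layer's weight matrix across devices and combining the partial outputs. The sketch below shows the column-sharded case in plain Python, with the shards computed one after another instead of on separate devices. All function names are hypothetical, and the column count is assumed to divide evenly by the number of shards.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w: Matrix, parts: int) -> List[Matrix]:
    """Split a weight matrix into column shards (assumes even division)."""
    width = len(w[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(parts)]


def tensor_parallel_matmul(x: Matrix, w: Matrix, parts: int) -> Matrix:
    # Each shard would live on its own device; here they run sequentially.
    partials = [matmul(x, shard) for shard in split_columns(w, parts)]
    # Concatenating per-row outputs stands in for an all-gather.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0]]
w = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]
sharded = tensor_parallel_matmul(x, w, parts=2)
reference = matmul(x, w)
# Both give [[1.0, 2.0, 4.0, 7.0]].
```

On real hardware the concatenation step is a collective operation over the interconnect, which is why this form of parallelism is sensitive to link bandwidth and latency.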

Pipeline parallelism

Definition: Assigning consecutive groups of layers to different devices and passing activations from one stage to the next.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Data parallelism, Expert parallelism

Data parallelism

Definition: Running full replicas of a model on multiple devices, with each replica handling a different share of the requests or batch.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Expert parallelism

Expert parallelism

Definition: Placing the experts of a mixture-of-experts model on different devices and routing each token to the devices that hold its selected experts.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Sequence parallelism

Definition: Splitting work along the sequence (token) dimension across devices, so that each device holds and processes part of a long input.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Disaggregated serving

Definition: Running the prefill and decode phases of LLM inference on separate worker pools, so that each phase can be scaled and tuned independently.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Prefill worker

Definition: In disaggregated serving, a worker that processes the input prompt and produces the initial KV cache.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Decode worker

Definition: In disaggregated serving, a worker that generates output tokens one step at a time from an existing KV cache.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Collective communication

Definition: Communication operations that involve a group of devices at once, such as all-reduce, all-gather, and broadcast.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Interconnect

Definition: The links that carry data between devices or nodes, such as NVLink, PCIe, InfiniBand, or Ethernet.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Elasticity

Definition: A distributed inference concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Rebalancing

Definition: A distributed inference concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Distributed inference

Related: Tensor parallelism, Pipeline parallelism, Data parallelism

Edge runtime

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: On-device runtime, TinyML, Mobile delegate

On-device runtime

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, TinyML, Mobile delegate

TinyML

Definition: Machine learning on microcontrollers and other highly constrained devices with very limited memory and power.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, Mobile delegate

Mobile delegate

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

Browser runtime

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

WebAssembly

Definition: A portable binary instruction format that runs in a sandboxed virtual machine in browsers and other hosts.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

WebGPU

Definition: A web API exposing GPU compute capabilities to browser applications.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

WebNN

Definition: A web API for constructing and executing neural network graphs using operating system and hardware capabilities.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

Worker

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

IndexedDB model cache

Definition: Storing downloaded model files in the browser's IndexedDB so that later visits can load them locally instead of fetching them again.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

Progressive enhancement

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

NPU delegate

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

Thermal throttling

Definition: A device's reduction of its clock speed to limit heat, which lowers sustained inference throughput.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

Offline inference

Definition: An edge/mobile/browser concept used when designing, implementing, or operating AI runtimes.

In the aRuntime.com taxonomy, this term should be interpreted by layer, workload, deployment boundary, and source context.

Where it fits: Edge/mobile/browser

Related: Edge runtime, On-device runtime, TinyML

Constant folding

Definition: A graph optimization concept used when designing, implementing, or operating AI runtimes.