Observability
Comprehensive observability is crucial for production agentic workloads. ARK provides integrated monitoring, tracing, and logging capabilities to help you understand and optimize the performance of your AI agents.
OpenTelemetry Integration
ARK provides observability through OpenTelemetry integration, allowing you to monitor and trace all operations across the controller, execution engines, and any other services. You can connect to any OpenTelemetry-compatible provider using standard environment variables.
Telemetry is enabled by setting the OpenTelemetry environment variables:
| Variable | Description | Example |
|---|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint URL | http://localhost:4318/v1/traces |
| OTEL_EXPORTER_OTLP_HEADERS | Authentication headers | Authorization=Basic <token> |
| OTEL_SERVICE_NAME | Service name for telemetry | ark-controller |
| OTEL_RESOURCE_ATTRIBUTES | Additional resource attributes | environment=production |
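As a minimal sketch, the variables in the table could be set on a container as shown here. The container name `your-app` and the Secret `otel-auth` with its `headers` key are hypothetical placeholders, not names defined by ARK; the values mirror the examples in the table.

```yaml
# Sketch only: the container name and Secret reference are illustrative assumptions.
spec:
  template:
    spec:
      containers:
        - name: your-app
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://localhost:4318/v1/traces"
            - name: OTEL_SERVICE_NAME
              value: "ark-controller"
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: "environment=production"
            # Read credentials from a Secret rather than writing them into the manifest.
            - name: OTEL_EXPORTER_OTLP_HEADERS
              valueFrom:
                secretKeyRef:
                  name: otel-auth   # hypothetical Secret
                  key: headers      # hypothetical key holding "Authorization=Basic <token>"
```

Sourcing the header from a Secret keeps the token out of version-controlled manifests; a plain `value:` also works for local testing.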
Architecture
Some queries go directly from the controller to the OTEL endpoint, while others flow through execution engines when multi-framework agent orchestration is used.
  OTEL Env Vars
┌─────────────────┐
│ ARK Controller  │
└─────────┬───────┘
          │
          ├─── → Query Executor → OTEL Endpoint
          │
          └─── → Query Executor → Execution Engine → OTEL Endpoint

  OTEL Env Vars
┌─────────────────────────┐
│ Services, Engines, etc  ├─── → OTEL Endpoint
└─────────────────────────┘

Per-Tenant OTEL Routing
For multi-tenant deployments, ARK supports routing traces to tenant-specific OTEL endpoints. This enables:
- Tenant isolation - Each tenant’s traces go to their own observability backend
- Backend flexibility - Different tenants can use different OTEL backends (Langfuse, Phoenix, Jaeger, Honeycomb, etc.)
- Cost attribution - Observability costs can be attributed per tenant
- Compliance - Meet data residency or access control requirements
Enabling Per-Tenant Routing
Enable per-tenant OTEL discovery in your Helm values:
telemetry:
  tenantRouting:
    otelDiscovery: true

Tenant Configuration
Each tenant configures their OTEL endpoint by creating a Secret named otel-environment-variables in their namespace:
apiVersion: v1
kind: Secret
metadata:
  name: otel-environment-variables
  namespace: <tenant-namespace>
type: Opaque
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://otel-backend.example.com/v1/traces"
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer <token>"

The controller discovers these Secrets at startup and routes traces based on the query.namespace attribute.
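To illustrate, the following sketch defines Secrets for two tenants. The namespaces `tenant-a` and `tenant-b` are hypothetical; the endpoints and header formats are taken from the backend examples on this page. Under the routing described here, traces for a query in `tenant-a` would be expected to go to the Langfuse endpoint, and traces for a query in `tenant-b` to Phoenix.

```yaml
# Hypothetical namespaces; endpoints reuse the example backends from this page.
apiVersion: v1
kind: Secret
metadata:
  name: otel-environment-variables
  namespace: tenant-a
type: Opaque
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://langfuse.svc:3000/api/public/otel"
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Basic <base64(pk:sk)>"
---
apiVersion: v1
kind: Secret
metadata:
  name: otel-environment-variables
  namespace: tenant-b
type: Opaque
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://app.phoenix.arize.com/v1/traces"
  OTEL_EXPORTER_OTLP_HEADERS: "api_key=<phoenix_api_key>"
```

Both Secrets use the same name; only the namespace distinguishes one tenant's configuration from another's.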
Example Backend Configurations
Langfuse:
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://langfuse.svc:3000/api/public/otel"
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Basic <base64(pk:sk)>"

Phoenix (Arize):
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://app.phoenix.arize.com/v1/traces"
  OTEL_EXPORTER_OTLP_HEADERS: "api_key=<phoenix_api_key>"

Honeycomb:
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://api.honeycomb.io/v1/traces"
  OTEL_EXPORTER_OTLP_HEADERS: "x-honeycomb-team=<api_key>"

Jaeger:
stringData:
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://jaeger-collector.svc:4318/v1/traces"

Architecture with Per-Tenant Routing
When per-tenant OTEL routing is enabled, traces are routed based on the query’s namespace:
             ARK Controller
                    │
      ┌─────────────┼─────────────┐
      │             │             │
      ▼             ▼             ▼
Primary OTEL  Tenant-A OTEL Tenant-B OTEL
 (platform)    (Langfuse)     (Phoenix)

Applying Changes
After creating or updating tenant OTEL Secrets, restart the controller to pick up new configurations:
kubectl rollout restart deployment/ark-controller -n ark-system
kubectl rollout status deployment/ark-controller -n ark-system --timeout=120s

Automatic Injection of OTEL Configuration
One way to set up automatic OpenTelemetry configuration is through standardized ConfigMap and Secret references. This pattern allows any Kubernetes resource to automatically pick up OTEL environment variables when available:
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: your-app
          envFrom:
            # Standard OTEL configuration - will be injected if available
            - configMapRef:
                name: otel-environment-variables
                optional: true
            - secretRef:
                name: otel-environment-variables
                optional: true

When you create or update the standardized otel-environment-variables ConfigMap and Secret, all deployments and pods that reference them must be restarted to pick up the new environment variables:
# Restart components to pick up changes
kubectl rollout restart deployment/ark-controller -n ark-system

Service Name Configuration
You can optionally set the service name used for telemetry in your containers, using the OTEL_SERVICE_NAME variable:
spec:
  template:
    spec:
      containers:
        - name: your-app
          env:
            - name: OTEL_SERVICE_NAME
              value: "my-custom-service"

Additional OTEL Variables
These OpenTelemetry environment variables are also supported:
| Variable | Description | Example |
|---|---|---|
| OTEL_RESOURCE_ATTRIBUTES | Additional resource attributes | environment=production,version=1.0 |
| OTEL_EXPORTER_OTLP_TIMEOUT | Request timeout in milliseconds | 30000 |
| OTEL_PROPAGATORS | Trace context propagation format | tracecontext,baggage |
| OTEL_TRACES_SAMPLER | Sampling strategy | always_on, always_off, traceidratio |
| OTEL_TRACES_SAMPLER_ARG | Sampler configuration | 0.1 (for 10% sampling) |
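For example, sampling roughly 10% of traces could be configured as in the sketch below. The container name is a placeholder, and the sketch assumes the component's OpenTelemetry SDK honors these standard variables. Kubernetes requires `env` values to be strings, so numeric settings are quoted.

```yaml
# Sketch only: the container name is a placeholder.
spec:
  template:
    spec:
      containers:
        - name: your-app
          env:
            - name: OTEL_TRACES_SAMPLER
              value: "traceidratio"
            - name: OTEL_TRACES_SAMPLER_ARG
              value: "0.1"      # sample about 10% of traces
            - name: OTEL_EXPORTER_OTLP_TIMEOUT
              value: "30000"    # 30 seconds, expressed in milliseconds
            - name: OTEL_PROPAGATORS
              value: "tracecontext,baggage"
```

An unquoted `0.1` or `30000` would be parsed as a YAML number and rejected by the Kubernetes API, which is why every value here is a quoted string.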
Next: Learn about observability options:
- Phoenix Service - AI/ML model observability and evaluation
- Langfuse Service - Open-source LLM application and agent observability, evaluation, and prompt management