Ark Completions Engine
Execution engine that runs the LLM turn loop, tool execution, team orchestration, memory, and streaming for Ark queries.
- Deployed as a standalone service in ark-system
- Communicates with the controller via A2A protocol over K8s service
- Handles built-in agent, team, model, and tool execution (agents with an executionEngine are dispatched directly by the controller)
- Streams chunks directly to ark-broker
Architecture
The controller delegates query execution to the completions engine via A2A SendMessage. The engine reads CRDs using its own K8s client, executes the turn loop, and returns results.
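To make "executes the turn loop" concrete, here is a minimal, generic sketch of an LLM turn loop with tool execution. It is an illustration of the general pattern only, not the engine's actual code: `run_turn_loop`, `fake_model`, the message shapes, and the `max_turns` limit are all assumptions introduced for this example.

```python
# Generic LLM turn loop sketch. All names and message shapes here are
# hypothetical; they are not taken from the Ark completions engine.


def run_turn_loop(model, tools, messages, max_turns=8):
    """Call the model until it replies without requesting a tool."""
    history = list(messages)
    for _ in range(max_turns):
        reply = model(history)
        history.append(reply)
        calls = reply.get("tool_calls", [])
        if not calls:
            # No tool requested: this reply is the final answer.
            return history
        for call in calls:
            # Run each requested tool and feed the result back to the model.
            result = tools[call["name"]](**call["arguments"])
            history.append(
                {"role": "tool", "name": call["name"], "content": str(result)}
            )
    raise RuntimeError("turn limit reached without a final answer")


def fake_model(history):
    """Stand-in for an LLM: requests one tool call, then answers."""
    if not any(m["role"] == "tool" for m in history):
        return {
            "role": "assistant",
            "content": "",
            "tool_calls": [{"name": "add", "arguments": {"a": 2, "b": 3}}],
        }
    return {"role": "assistant", "content": "The sum is " + history[-1]["content"]}


transcript = run_turn_loop(
    fake_model,
    {"add": lambda a, b: a + b},
    [{"role": "user", "content": "What is 2 + 3?"}],
)
```

In this sketch the loop ends as soon as the model returns a reply with no tool calls, and the turn limit guards against a model that never stops requesting tools.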
Kubernetes Cluster (ark-system namespace)
┌─────────────────────┐    K8s Service    ┌─────────────────────────┐
│ Controller Pod      │       (A2A)       │ Completions Pod         │
│ (reconciler)        │──────────────────►│ (ark-completions:80)    │
│                     │◄──────────────────│                         │
│ - watch CRs         │                   │ - agent loop            │
│ - resolve target    │                   │ - team orchestration    │
│ - write status      │                   │ - tool execution        │
│                     │                   │ - LLM providers         │
│                     │                   │ - memory load/save      │
│                     │                   │ - stream -> ark-broker  │
└─────────────────────┘                   └─────────────────────────┘
Configuration
The completions engine is deployed as a standalone Helm chart.
Helm Values
image:
  repository: ghcr.io/mckinsey/agents-at-scale-ark/ark-completions
  pullPolicy: IfNotPresent
  tag: ""
port: 80
targetPort: 9090
resources:
  limits:
    cpu: 500m
    memory: 256Mi
  requests:
    cpu: 10m
    memory: 64Mi
serviceAccountName: ark-completions
Helm chart location: ark/executors/completions/chart/
Controller Flag
The controller connects to the engine via the --completions-addr flag (default: http://ark-completions.ark-system).
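The default address follows the Kubernetes in-cluster DNS form `<service>.<namespace>`, with the port omitted because 80 is the HTTP default. The sketch below shows how such an address is composed. The `service_url` helper is hypothetical and exists only for this illustration; it is not part of Ark or the controller.

```python
def service_url(service, namespace, port=80):
    """Build an in-cluster HTTP URL of the form http://<service>.<namespace>.

    Hypothetical helper for illustration; the port is appended only when
    it differs from the HTTP default of 80.
    """
    host = f"{service}.{namespace}"
    return f"http://{host}" if port == 80 else f"http://{host}:{port}"


# Reproduces the documented default for --completions-addr.
default_addr = service_url("ark-completions", "ark-system")
```

If the chart's service port were changed from 80, the address passed to --completions-addr would presumably need the port suffix as well.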
Source Code
The completions engine source is in ark/executors/completions/.
Local Development
In dev mode, the completions engine runs as a standalone pod with independent code sync and restart.
cd ark
devspace dev
DevSpace deploys the engine using the chart at ark/executors/completions/chart/. The controller’s --completions-addr is set to http://ark-completions.ark-system.
Changes to engine code restart only the engine pod. Changes to controller code restart only the controller pod.
A2A Message Contract
The controller sends the same fat metadata contract used by external execution engines (Python SDK’s BaseExecutor), extended with a query reference:
{
  "role": "user",
  "parts": [{ "text": "<user input>" }],
  "metadata": {
    "ark.mckinsey.com/execution-engine": {
      "agent": { "name": "...", "namespace": "..." },
      "tools": [],
      "history": [],
      "query": { "name": "q-123", "namespace": "default" }
    }
  }
}
The engine reads the Query CR and target CRDs using its K8s client for tool execution, MCP discovery, memory, and streaming config.
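As a rough illustration of the contract, the following sketch builds a message with the shape shown above and pulls the query reference back out of it. The helper names `build_message` and `query_ref` and the agent name `my-agent` are hypothetical; only the field names and the metadata key come from the example payload.

```python
import json

# Metadata key taken from the example payload above.
EXECUTION_ENGINE_KEY = "ark.mckinsey.com/execution-engine"


def build_message(user_input, agent, query, tools=None, history=None):
    """Assemble a message dict matching the documented shape (hypothetical helper)."""
    return {
        "role": "user",
        "parts": [{"text": user_input}],
        "metadata": {
            EXECUTION_ENGINE_KEY: {
                "agent": agent,
                "tools": tools or [],
                "history": history or [],
                "query": query,
            }
        },
    }


def query_ref(message):
    """Return (namespace, name) of the Query referenced in the metadata."""
    ref = message["metadata"][EXECUTION_ENGINE_KEY]["query"]
    return ref["namespace"], ref["name"]


message = build_message(
    "hello",
    agent={"name": "my-agent", "namespace": "default"},
    query={"name": "q-123", "namespace": "default"},
)
# Round-trip through JSON to mimic what a receiver would parse off the wire.
namespace, name = query_ref(json.loads(json.dumps(message)))
```

A receiver could use the extracted namespace and name to look up the Query CR with its own K8s client, which is the step the paragraph above describes.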