Query Execution Flow
This page explains how a Query is executed in the ARK platform, from creation to completion.
Overview
When you create a Query resource, it triggers a hierarchical execution flow:
- Query - Created by user with input and targets
- Query Controller - Orchestrates the execution
- Agents or Teams - Process the query
  - Teams use strategies to coordinate multiple agents
  - Agents execute prompts against models
- Models - Generate responses and can invoke:
  - Tools - Custom functions
  - MCP Servers - External service integrations
- Optional Components:
  - Memory - Provides conversation context
  - Execution Engines - Custom implementations for queries, agents, or teams
Execution Steps
1. Query Creation
A user creates a Query resource specifying:
- `input`: The user's question or request
- `targets`: One or more Agents or Teams
- `parameters` (optional): Values to use in the query or pass to agent parameters via `queryParameterRef`
- `memory` (optional): Reference to a Memory resource
- `sessionId` (optional): Conversation session identifier
- `serviceAccount` (optional): For RBAC isolation
```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Query
metadata:
  name: my-query
spec:
  input: "What is the weather today?"
  targets:
    - type: agent
      name: weather-agent
  parameters:
    - name: agent_name
      value: "WeatherBot"
  memory:
    name: conversation-memory
```
2. Controller Processing
The Query controller detects the new Query and:
- Resolves the target (agent, team, model, or tool) from the spec or label selector
- If the target agent has an `executionEngine` (A2A or a named engine), the controller executes it directly, since these agents proxy to external services
- Otherwise, sends an A2A `SendMessage` to the completions engine with the target and query reference
- Waits for the response and writes the result to the Query status
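The routing decision above can be sketched in a few lines. This is an illustrative Python sketch, not the controller's actual code; the `executionEngine` field name comes from the description above, while `route_query` and the returned labels are assumptions:

```python
# Hedged sketch of the controller's routing decision: agents with an
# explicit executionEngine are called directly over A2A; everything else
# goes to the built-in completions engine.

def route_query(target_agent_spec):
    """Return which component should execute the query for this agent."""
    engine = target_agent_spec.get("executionEngine")
    if engine:
        # External engine: the agent proxies to an external service over A2A
        return ("a2a-direct", engine["name"])
    # Default path: send an A2A SendMessage to the completions engine
    return ("completions-engine", None)
```

An agent spec with `executionEngine: {name: langchain-engine}` routes to the external engine; a plain agent takes the default completions path.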
3. Completions Engine Execution
The completions engine receives the A2A message and:
- Reads the Query CR and target CRD from Kubernetes
- Creates a memory client and loads conversation history
- Creates an EventStream for streaming chunks to ark-broker
- Executes the target
Agent Execution
- Model Resolution: Resolves the agent’s model reference
- Parameter Resolution: Resolves agent parameters from static values, ConfigMaps, Secrets, and query parameters
- Prompt Construction: Builds the prompt from agent system prompt, memory context, and user input
- Tool Preparation: Prepares available tools and MCP servers
- Turn Loop: Calls the model, executes tool calls, repeats until no more tool calls
- Streaming: Chunks stream directly from the engine to ark-broker
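The turn loop in step 5 can be sketched as follows. This is a minimal Python illustration of the pattern, not ARK's implementation; `call_model`, `execute_tool`, and the message shapes are assumptions:

```python
# Minimal sketch of the agent turn loop: call the model, run any requested
# tool calls, feed results back, and repeat until no tool calls remain.

def run_turn_loop(call_model, execute_tool, messages, max_turns=10):
    """Drive the model/tool loop until the model returns a plain response."""
    for _ in range(max_turns):
        response = call_model(messages)          # one completion request
        messages.append(response)
        tool_calls = response.get("tool_calls", [])
        if not tool_calls:                       # no tools requested: done
            return response
        for call in tool_calls:                  # run each tool, append result
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "name": call["name"], "content": result})
    raise RuntimeError("turn limit reached without a final response")
```

The `max_turns` cap guards against a model that keeps requesting tools indefinitely.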
Team Execution
- Strategy Application: Uses the team’s strategy (sequential, round-robin, selector, graph)
- Member Coordination: Orchestrates execution across team members
- Recursive Routing: Members with explicit execution engines route via A2A; others execute locally
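As an example of strategy application, the sequential strategy can be sketched as a simple fold over the member list. This is an illustrative Python sketch under assumed names (`run_sequential`, `run_member`), not the team controller's code:

```python
# Hedged sketch of a "sequential" team strategy: each member processes
# the output of the previous member in order.

def run_sequential(members, run_member, user_input):
    """Pass the input through each team member in turn."""
    output = user_input
    for member in members:
        output = run_member(member, output)   # each member sees prior output
    return output
```

Other strategies differ in how the next member is chosen: round-robin cycles, selector picks dynamically, and graph follows explicit edges.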
4. Response Handling
After execution:
- The engine saves new messages to memory, sends a final stream chunk, and closes the stream
- The engine returns the A2A response with the assistant message, token usage, and conversation ID
- The controller writes the response, token usage, and conversation ID to the Query CR status
- The controller marks the Query as completed
Execution Engines
Custom execution engines override default agent execution using the A2A protocol:
- Agent Engines: Custom agent execution via A2A (e.g., LangChain, AutoGen)
- Agents referencing an ExecutionEngine send requests via A2A with agent config, tools, and history in message metadata
- The engine processes the request and returns results through the A2A protocol
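A minimal sketch of an Agent that opts into a custom engine might look like the fragment below. The `executionEngine` field and resource kind follow the description above, but the exact schema is an assumption, not a verified ARK manifest:

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Agent
metadata:
  name: langchain-agent
spec:
  executionEngine:          # assumed field name, per the routing rule above
    name: langchain-engine  # named engine; triggers direct A2A execution
```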
Error Handling
The system handles errors gracefully:
- Model Errors: API failures, rate limits, invalid responses
- Tool Errors: Tool execution failures, timeouts
- Resource Errors: Missing agents, models, or tools
- Permission Errors: RBAC violations, service account issues
Failed queries are marked with error status and detailed error messages.
Observability
Query execution is fully observable through:
- Kubernetes Events: Resource creation and status changes
- OpenTelemetry Traces: Detailed execution spans
- Logs: Structured logging throughout the pipeline
- Metrics: Performance and error metrics
Example Flow
Here’s a complete example of a weather query execution:
- User creates a Query targeting `weather-agent`
- Controller resolves `weather-agent` and its `gpt-4` model
- Agent constructs a prompt with weather tools available
- Model calls the `get-weather` tool with a location parameter
- Tool executes and returns weather data
- Model generates a natural language response
- Controller updates the Query with the final response
- Memory stores the conversation for future context
This flow demonstrates ARK’s orchestration of multiple components to deliver intelligent, tool-enabled responses.