Query Execution Flow
This page explains how a Query is executed in the ARK platform, from creation to completion.
Overview
When you create a Query resource, it triggers a hierarchical execution flow:
- Query - Created by user with input and targets
- Query Controller - Orchestrates the execution
- Agents or Teams - Process the query
  - Teams use strategies to coordinate multiple agents
  - Agents execute prompts against models
- Models - Generate responses and can invoke:
  - Tools - Custom functions
  - MCP Servers - External service integrations
- Optional Components:
  - Memory - Provides conversation context
  - Execution Engines - Custom implementations for queries, agents, or teams
Execution Steps
1. Query Creation
A user creates a Query resource specifying:
- `input`: The user's question or request
- `targets`: One or more Agents or Teams
- `parameters` (optional): Values to use in the query or pass to agent parameters via `queryParameterRef`
- `memory` (optional): Reference to a Memory resource
- `sessionId` (optional): Conversation session identifier
- `serviceAccount` (optional): For RBAC isolation
```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Query
metadata:
  name: my-query
spec:
  input: "What is the weather today?"
  targets:
    - type: agent
      name: weather-agent
  parameters:
    - name: agent_name
      value: "WeatherBot"
  memory:
    name: conversation-memory
```
2. Controller Processing
The Query controller detects the new Query and:
- Resolves the target (agent, team, model, or tool) from the spec or label selector
- If the target agent has an `executionEngine` (A2A or a named engine), the controller executes it directly, since these agents proxy to external services
- Otherwise, sends an A2A `SendMessage` to the completions engine with the target and query reference
- Waits for the response and writes the result to the Query status
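The routing decision above can be sketched in a few lines. This is an illustrative Python sketch, not the controller's actual code; the `executionEngine` field name comes from the description above, while `route_query` and the returned labels are assumptions:

```python
# Hedged sketch of the controller's routing decision: agents with an
# explicit executionEngine are called directly over A2A; everything else
# goes to the built-in completions engine.

def route_query(target_agent_spec):
    """Return which component should execute the query for this agent."""
    engine = target_agent_spec.get("executionEngine")
    if engine:
        # External engine: the agent proxies to an external service over A2A
        return ("a2a-direct", engine["name"])
    # Default path: send an A2A SendMessage to the completions engine
    return ("completions-engine", None)
```

An agent spec with `executionEngine: {name: langchain-engine}` routes to the external engine; a plain agent takes the default completions path.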
3. Completions Engine Execution
The completions engine receives the A2A message and:
- Reads the Query CR and target CRD from Kubernetes
- Creates a memory client and loads conversation history
- Creates an EventStream for streaming chunks to ark-broker
- Executes the target
Agent Execution
- Model Resolution: Resolves the agent’s model reference
- Parameter Resolution: Resolves agent parameters from static values, ConfigMaps, Secrets, and query parameters
- Prompt Construction: Builds the prompt from agent system prompt, memory context, and user input
- Tool Preparation: Prepares available tools and MCP servers
- Turn Loop: Calls the model, executes tool calls, repeats until no more tool calls
- Streaming: Chunks stream directly from the engine to ark-broker
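The turn loop in step 5 can be sketched as follows. This is a minimal Python illustration of the pattern, not ARK's implementation; `call_model`, `execute_tool`, and the message shapes are assumptions:

```python
# Minimal sketch of the agent turn loop: call the model, run any requested
# tool calls, feed results back, and repeat until no tool calls remain.

def run_turn_loop(call_model, execute_tool, messages, max_turns=10):
    """Drive the model/tool loop until the model returns a plain response."""
    for _ in range(max_turns):
        response = call_model(messages)          # one completion request
        messages.append(response)
        tool_calls = response.get("tool_calls", [])
        if not tool_calls:                       # no tools requested: done
            return response
        for call in tool_calls:                  # run each tool, append result
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "name": call["name"], "content": result})
    raise RuntimeError("turn limit reached without a final response")
```

The `max_turns` cap guards against a model that keeps requesting tools indefinitely.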
Team Execution
- Strategy Application: Uses the team’s strategy (sequential, round-robin, selector, graph)
- Member Coordination: Orchestrates execution across team members
- Recursive Routing: Members with explicit execution engines route via A2A; others execute locally
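As an example of strategy application, the sequential strategy can be sketched as a simple fold over the member list. This is an illustrative Python sketch under assumed names (`run_sequential`, `run_member`), not the team controller's code:

```python
# Hedged sketch of a "sequential" team strategy: each member processes
# the output of the previous member in order.

def run_sequential(members, run_member, user_input):
    """Pass the input through each team member in turn."""
    output = user_input
    for member in members:
        output = run_member(member, output)   # each member sees prior output
    return output
```

Other strategies differ in how the next member is chosen: round-robin cycles, selector picks dynamically, and graph follows explicit edges.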
4. Response Handling
After execution:
- The engine saves new messages to memory, sends a final stream chunk, and closes the stream
- The engine returns the A2A response with the assistant message, token usage, and conversation ID
- The controller writes the response, token usage, and conversation ID to the Query CR status
- The controller marks the Query as completed
Execution Engines
Custom execution engines override default agent execution using the A2A protocol:
- Agent Engines: Custom agent execution via A2A (e.g., LangChain, AutoGen)
- Agents referencing an ExecutionEngine send requests via A2A with agent config, tools, and history in message metadata
- The engine processes the request and returns results through the A2A protocol
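A minimal sketch of an Agent that opts into a custom engine might look like the fragment below. The `executionEngine` field and resource kind follow the description above, but the exact schema is an assumption, not a verified ARK manifest:

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Agent
metadata:
  name: langchain-agent
spec:
  executionEngine:          # assumed field name, per the routing rule above
    name: langchain-engine  # named engine; triggers direct A2A execution
```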
Error Handling
The system handles errors gracefully:
- Model Errors: API failures, rate limits, invalid responses
- Tool Errors: Tool execution failures, timeouts
- Resource Errors: Missing agents, models, or tools
- Permission Errors: RBAC violations, service account issues
Failed queries are marked with error status and detailed error messages.
Observability
Query execution is fully observable through:
- Kubernetes Events: Resource creation and status changes
- OpenTelemetry Traces: Detailed execution spans
- Logs: Structured logging throughout the pipeline
- Metrics: Performance and error metrics
Example Flow
Here’s a complete example of a weather query execution:
- User creates a Query targeting `weather-agent`
- Controller resolves `weather-agent` and its `gpt-4` model
- Agent constructs a prompt with weather tools available
- Model calls the `get-weather` tool with a location parameter
- Tool executes and returns weather data
- Model generates a natural language response
- Controller updates the Query with the final response
- Memory stores the conversation for future context
This flow demonstrates ARK’s orchestration of multiple components to deliver intelligent, tool-enabled responses.