
Query Execution Flow

This page explains how a Query is executed in the ARK platform, from creation to completion.

Overview

When you create a Query resource, it triggers a hierarchical execution flow:

  1. Query - Created by user with input and targets
  2. Query Controller - Orchestrates the execution
  3. Agents or Teams - Process the query
    • Teams use strategies to coordinate multiple agents
    • Agents execute prompts against models
  4. Models - Generate responses and can invoke:
    • Tools - Custom functions
    • MCP Servers - External service integrations
  5. Optional Components:
    • Memory - Provides conversation context
    • Execution Engines - Custom implementations for queries, agents, or teams

Execution Steps

1. Query Creation

A user creates a Query resource specifying:

  • input: The user’s question or request
  • targets: One or more Agents or Teams
  • parameters (optional): Values to use in the query or pass to agent parameters via queryParameterRef
  • memory (optional): Reference to a Memory resource
  • sessionId (optional): Conversation session identifier
  • serviceAccount (optional): For RBAC isolation
```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Query
metadata:
  name: my-query
spec:
  input: "What is the weather today?"
  target:
    type: agent
    name: weather-agent
  parameters:
    - name: agent_name
      value: "WeatherBot"
  memory:
    name: conversation-memory
```

2. Controller Processing

The Query controller detects the new Query and:

  1. Resolves the target (agent, team, model, or tool) from the spec or label selector
  2. If the target agent has an executionEngine (A2A or named engine), the controller executes it directly since these agents proxy to external services
  3. Otherwise, sends an A2A SendMessage to the completions engine with the target and query reference
  4. Waits for the response and writes the result to the Query status
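Once the response is written back in step 4, the result is visible on the Query resource itself. As a rough sketch of what a completed Query could look like (the status field names below are illustrative assumptions, not the authoritative ARK schema):

```yaml
# Illustrative sketch only; status field names are assumed, not taken from the ARK CRD.
apiVersion: ark.mckinsey.com/v1alpha1
kind: Query
metadata:
  name: my-query
spec:
  input: "What is the weather today?"
  target:
    type: agent
    name: weather-agent
status:
  phase: done                 # e.g. a lifecycle such as pending -> running -> done / error
  responses:
    - target: weather-agent
      content: "It is sunny and 22 degrees today."
```

Checking this status (for example with `kubectl get query my-query -o yaml`) is how a user observes completion without watching the controller logs.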

3. Completions Engine Execution

The completions engine receives the A2A message and:

  1. Reads the Query CR and target CRD from Kubernetes
  2. Creates a memory client and loads conversation history
  3. Creates an EventStream for streaming chunks to ark-broker
  4. Executes the target

Agent Execution

  1. Model Resolution: Resolves the agent’s model reference
  2. Parameter Resolution: Resolves agent parameters from static values, ConfigMaps, Secrets, and query parameters
  3. Prompt Construction: Builds the prompt from agent system prompt, memory context, and user input
  4. Tool Preparation: Prepares available tools and MCP servers
  5. Turn Loop: Calls the model, executes any requested tool calls, and repeats until the model returns a response with no tool calls
  6. Streaming: Chunks stream directly from the engine to ark-broker
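The resolution steps above operate over fields on the Agent resource. A minimal sketch of an Agent that exercises each step, assuming illustrative field names (the exact ARK CRD schema may differ):

```yaml
# Illustrative sketch; field names are assumptions, not the authoritative ARK schema.
apiVersion: ark.mckinsey.com/v1alpha1
kind: Agent
metadata:
  name: weather-agent
spec:
  modelRef:
    name: gpt-4                        # step 1: model resolution
  parameters:
    - name: agent_name
      valueFrom:
        queryParameterRef: agent_name  # step 2: resolved from the Query's parameters
  prompt: "You are a helpful weather assistant."   # step 3: system prompt
  tools:
    - name: get-weather                # step 4: offered to the model in the turn loop
```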

Team Execution

  1. Strategy Application: Uses the team’s strategy (sequential, round-robin, selector, graph)
  2. Member Coordination: Orchestrates execution across team members
  3. Recursive Routing: Members with explicit execution engines route via A2A; others execute locally
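A Team resource ties these three behaviors together. As a hedged sketch, assuming illustrative field names for the strategy and member list:

```yaml
# Illustrative sketch; strategy/member field shapes are assumptions.
apiVersion: ark.mckinsey.com/v1alpha1
kind: Team
metadata:
  name: research-team
spec:
  strategy: sequential     # one of: sequential, round-robin, selector, graph
  members:
    - type: agent
      name: researcher
    - type: agent
      name: writer         # under the sequential strategy, runs after researcher
```

Under the sequential strategy shown here, each member's output feeds the next; a member with its own execution engine would be routed via A2A instead of executing locally.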

4. Response Handling

After execution:

  1. The engine saves new messages to memory, sends a final stream chunk, and closes the stream
  2. The engine returns the A2A response with the assistant message, token usage, and conversation ID
  3. The controller writes the response, token usage, and conversation ID to the Query CR status
  4. The controller marks the Query as completed

Execution Engines

Custom execution engines override default agent execution using the A2A protocol:

  • Agent Engines: Custom agent execution via A2A (e.g., LangChain, AutoGen)
  • Agents referencing an ExecutionEngine send requests via A2A with agent config, tools, and history in message metadata
  • The engine processes the request and returns results through the A2A protocol
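Referencing a custom engine is declared on the Agent. A minimal sketch, assuming an `executionEngine` field shape (illustrative, not the exact schema):

```yaml
# Illustrative sketch; the executionEngine field shape is assumed.
apiVersion: ark.mckinsey.com/v1alpha1
kind: Agent
metadata:
  name: langchain-agent
spec:
  executionEngine:
    name: langchain-engine   # controller routes this agent to the engine via A2A
```

Because this agent names an execution engine, the Query controller bypasses the default completions engine and sends the A2A request, with agent config, tools, and history in the message metadata, to the named engine.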

Error Handling

The system handles errors gracefully:

  • Model Errors: API failures, rate limits, invalid responses
  • Tool Errors: Tool execution failures, timeouts
  • Resource Errors: Missing agents, models, or tools
  • Permission Errors: RBAC violations, service account issues

Failed queries are marked with error status and detailed error messages.
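For example, a resource error might surface on the Query status roughly like this (field names are illustrative assumptions):

```yaml
# Illustrative sketch of a failed Query's status; field names are assumed.
status:
  phase: error
  error: 'model "gpt-4" not found in namespace "default"'   # a resource error
```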

Observability

Query execution is fully observable through:

  • Kubernetes Events: Resource creation and status changes
  • OpenTelemetry Traces: Detailed execution spans
  • Logs: Structured logging throughout the pipeline
  • Metrics: Performance and error metrics

Example Flow

Here’s a complete example of a weather query execution:

  1. User creates Query targeting weather-agent
  2. Controller resolves weather-agent and its gpt-4 model
  3. Agent constructs prompt with weather tools available
  4. Model calls get-weather tool with location parameter
  5. Tool executes and returns weather data
  6. Model generates natural language response
  7. Controller updates Query with final response
  8. Memory stores conversation for future context

This flow demonstrates ARK’s orchestration of multiple components to deliver intelligent, tool-enabled responses.
