# OpenAI Responses Executor
Executor for Ark agents backed by the OpenAI Responses API. Supports built-in tools (web search, code interpreter, file search), CFG/Lark grammar-constrained output, structured JSON output, MCP function tools, and stateless multi-turn threading via `previous_response_id`.
## Overview
- **Built-in Tools** — `web_search_preview`, `file_search`, `code_interpreter`, `computer_use`, configured via annotations
- **CFG/Grammar Output** — Lark grammar constraints enforced at the token level (not by prompt) via the `custom` tool type
- **Structured Output** — JSON schema enforcement via the `text.format` annotation; the response is a valid JSON object
- **Multi-turn Threading** — conversations thread via `previous_response_id`; no full history is resent each turn
- **MCP Tools** — custom function tools from `spec.tools` wired through Ark's tool infrastructure
- **GPT-5 Support** — reasoning parameter (`effort`) for `gpt-5` models; temperature disabled automatically
- **OTEL Tracing** — optional observability via `openinference-instrumentation-openai`
- **A2A Protocol** — compliant with the Agent-to-Agent protocol for seamless Ark integration
## Conversation Threading
Each request carries an A2A `context_id`, which is mapped to `conversationId` in the executor and used as the key to look up the last `response_id` on disk (`/data/sessions/<conversationId>/response_id`). Subsequent turns pass `previous_response_id` to the API instead of resending history, keeping payloads small and preserving server-side context.
```
Query CR                 A2A layer            Executor                OpenAI API
─────────────────────    ──────────────────   ────────────────────    ──────────────────
conversationId: "abc" →  context_id: "abc" →  lookup session file →   previous_response_id: "resp_xyz"
                                              save response.id    ←   response.id: "resp_xyz2"
```

## Install

```bash
ark install marketplace/executors/executor-openai-responses
```

Or with DevSpace:

```bash
cd executors/openai-responses
devspace deploy
```

Or with Helm:

```bash
helm install executor-openai-responses ./chart -n default --create-namespace
```

## Prerequisites
### Model CRD (Required)
```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: openai-gpt-4o
spec:
  provider: openai
  type: completions
  model:
    value: gpt-4o
  config:
    openai:
      apiKey:
        valueFrom:
          secretKeyRef:
            name: openai-credentials
            key: api-key
```

For GPT-5 models, include `baseUrl`:

```yaml
      baseUrl:
        valueFrom:
          secretKeyRef:
            name: openai-credentials
            key: base-url
```

### OpenAI Credentials Secret

```bash
kubectl create secret generic openai-credentials \
  --from-literal=api-key=sk-... \
  --from-literal=base-url=https://your-endpoint  # optional
```

## Annotations
All configuration uses annotations with a cascade of ExecutionEngine → Agent → Query (highest priority wins; entries are merged by their `type` key).
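As an illustration of the merge-by-`type` rule, a small sketch (assumed semantics; the helper name is hypothetical):

```python
def merge_tools(*levels: list[dict]) -> list[dict]:
    """Merge tool lists across the cascade: ExecutionEngine → Agent → Query.

    Later (higher-priority) levels win; entries sharing a "type" replace earlier ones.
    """
    merged: dict[str, dict] = {}
    for level in levels:
        for tool in level:
            merged[tool["type"]] = tool  # same type key: higher-priority entry wins
    return list(merged.values())

engine_tools = [{"type": "web_search_preview"}]
query_tools = [
    {"type": "web_search_preview", "user_location": {"country": "GB"}},
    {"type": "code_interpreter"},
]

# The Query-level web_search_preview (with location) replaces the engine-level one;
# code_interpreter is simply added.
merged = merge_tools(engine_tools, query_tools)
```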
### Built-in Tools
```yaml
annotations:
  executor-openai-responses.ark.mckinsey.com/tools: |
    [
      {
        "type": "web_search_preview",
        "user_location": {"type": "approximate", "country": "GB", "city": "London", "region": "London"}
      }
    ]
```

Available types: `web_search_preview`, `file_search`, `code_interpreter`, `computer_use`.
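Since the annotation value is a JSON string, it can also be built programmatically; a sketch of the round-trip the executor relies on:

```python
import json

tools = [
    {
        "type": "web_search_preview",
        "user_location": {"type": "approximate", "country": "GB", "city": "London", "region": "London"},
    }
]

# The annotation value is the JSON-encoded tool list; the executor parses it back.
annotations = {"executor-openai-responses.ark.mckinsey.com/tools": json.dumps(tools)}
parsed = json.loads(annotations["executor-openai-responses.ark.mckinsey.com/tools"])
```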
### Reasoning (GPT-5 only)
```yaml
annotations:
  executor-openai-responses.ark.mckinsey.com/reasoning: '{"effort": "low"}'
```

Effort values: `"low"`, `"medium"`, `"high"`. Omitting the annotation defaults to `"medium"`.
Use `"low"` for focused single-task agents (e.g. finding one URL). Use `"medium"` or higher for agents that must gather multiple pieces of information (e.g. a structured lookup with web search across several fields); lower effort may not perform enough searches to find all required data.
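How the annotation might translate into API parameters, as a sketch of the behavior described above (the helper name is hypothetical; default `"medium"`, reasoning only for `gpt-5` models, temperature omitted):

```python
import json

REASONING_ANNOTATION = "executor-openai-responses.ark.mckinsey.com/reasoning"

def reasoning_params(annotations: dict, model: str) -> dict:
    """Derive Responses API keyword arguments from the reasoning annotation."""
    params: dict = {}
    if model.startswith("gpt-5"):
        raw = annotations.get(REASONING_ANNOTATION)
        effort = json.loads(raw)["effort"] if raw else "medium"  # default per the docs above
        params["reasoning"] = {"effort": effort}
        # temperature is deliberately not set for reasoning models
    return params
```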
### Structured Output
Constrains the response to a JSON object matching the schema — enforced at token level:
```yaml
annotations:
  executor-openai-responses.ark.mckinsey.com/output-schema: |
    {
      "type": "object",
      "properties": {
        "company_name": {"type": "string"},
        "website_url": {"type": "string"}
      },
      "required": ["company_name", "website_url"],
      "additionalProperties": false
    }
```
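Because enforcement happens at the token level, the response text parses directly into the declared shape; a sketch with a hypothetical response body:

```python
import json

required = ["company_name", "website_url"]

# Hypothetical raw response text; with token-level enforcement it always matches the schema.
raw = '{"company_name": "Acme Ltd", "website_url": "https://acme.example"}'
data = json.loads(raw)

# No defensive parsing needed: the required keys are guaranteed present and typed.
assert all(key in data for key in required)
assert all(isinstance(data[key], str) for key in required)
```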
## Examples

See `examples/` for ready-to-use YAML manifests and demo scripts.
### Running the demo
Against a live cluster:
```bash
# Apply all CRDs and run each example, printing prompt, input and response
examples/demo.sh
```

Locally without Kubernetes:

```bash
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://your-endpoint   # optional
export MODEL_NAME=gpt-5.2-2025-12-11           # optional
python3 examples/demo_local.py
```

### Example manifests
| Example | What it shows |
|---|---|
| `website-search-agent.yaml` | `web_search_preview` with UK location context |
| `company-lookup-agent.yaml` | Web search + structured JSON output (company data) |
| `sql-generator-agent.yaml` | CFG/Lark grammar-constrained SQL generation |
| `dsl-generator-agent.yaml` | CFG/Lark grammar for a functional pipeline DSL |
| `companies-house-agent.yaml` | MCP function tools via `spec.tools` |
## Configuration
| Env Var | Default | Description |
|---|---|---|
| `SESSIONS_DIR` | `/data/sessions` | Directory for persisting `response_id` per conversation |
| `MAX_TOOL_ITERATIONS` | `10` | Max function-call loop iterations before returning |
| `OTEL_INSTRUMENTATION_ENABLED` | `false` | Enable OpenAI OTEL instrumentation |
| `PORT` | `8000` | HTTP server port |
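How the executor might read these variables at startup, as a sketch (names and defaults follow the table; the parsing details are assumptions):

```python
import os

# Defaults mirror the table above; environment variables override them.
SESSIONS_DIR = os.environ.get("SESSIONS_DIR", "/data/sessions")
MAX_TOOL_ITERATIONS = int(os.environ.get("MAX_TOOL_ITERATIONS", "10"))
OTEL_INSTRUMENTATION_ENABLED = os.environ.get("OTEL_INSTRUMENTATION_ENABLED", "false").lower() == "true"
PORT = int(os.environ.get("PORT", "8000"))
```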