
OpenAI Responses Executor

Executor for Ark agents backed by the OpenAI Responses API. Supports built-in tools (web search, code interpreter, file search), CFG/Lark grammar-constrained output, structured JSON output, MCP function tools, and stateless multi-turn threading via previous_response_id.

Overview

  • Built-in Tools — web_search_preview, file_search, code_interpreter, computer_use, configured via annotations
  • CFG/Grammar Output — Lark grammar constraints enforced at token level (not by prompt) via custom tool type
  • Structured Output — JSON schema enforcement via text.format annotation; response is a valid JSON object
  • Multi-turn Threading — Conversations thread via previous_response_id — no full history resent each turn
  • MCP Tools — Custom function tools from spec.tools wired through Ark’s tool infrastructure
  • GPT-5 Support — Reasoning parameter (effort) for gpt-5 models; temperature disabled automatically
  • OTEL Tracing — Optional observability via openinference-instrumentation-openai
  • A2A Protocol — Compliant with the Agent-to-Agent protocol for seamless Ark integration

Conversation Threading

Each request carries an A2A context_id → mapped to conversationId in the executor → used as a key to look up the last response_id on disk (/data/sessions/<conversationId>/response_id). Subsequent turns pass previous_response_id to the API instead of resending history — keeping payloads small and preserving server-side context.

Query CR                A2A layer            Executor                  OpenAI API
─────────────────────   ──────────────────   ────────────────────      ──────────────────
conversationId: "abc" → context_id: "abc"  → lookup session file    →  previous_response_id: "resp_xyz"
                                             save response.id       ←  response.id: "resp_xyz2"

Install

ark install marketplace/executors/executor-openai-responses

Or with DevSpace:

cd executors/openai-responses
devspace deploy

Or with Helm:

helm install executor-openai-responses ./chart -n default --create-namespace

Prerequisites

Model CRD (Required)

apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: openai-gpt-4o
spec:
  provider: openai
  type: completions
  model:
    value: gpt-4o
  config:
    openai:
      apiKey:
        valueFrom:
          secretKeyRef:
            name: openai-credentials
            key: api-key

For GPT-5 models, include baseUrl:

baseUrl:
  valueFrom:
    secretKeyRef:
      name: openai-credentials
      key: base-url

OpenAI Credentials Secret

kubectl create secret generic openai-credentials \
  --from-literal=api-key=sk-... \
  --from-literal=base-url=https://your-endpoint  # optional

Annotations

All configuration uses annotations with a cascade: ExecutionEngine → Agent → Query. The highest-priority level wins, and entries are merged by their type key.
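The merge-by-type-key cascade can be illustrated with a small sketch (a hypothetical helper, not the executor's internals): a higher-priority level's entry replaces a lower-priority entry of the same type, while entries with new types are appended.

```python
import json

def merge_tool_annotations(*levels: str) -> list[dict]:
    """Merge JSON tool lists, keyed by 'type'.

    Pass levels in cascade order: ExecutionEngine, Agent, Query.
    A later (higher-priority) level's entry replaces an earlier
    entry with the same 'type'; new types are added.
    """
    merged: dict[str, dict] = {}
    for level in levels:
        for entry in json.loads(level or "[]"):
            merged[entry["type"]] = entry
    return list(merged.values())

engine_level = '[{"type": "web_search_preview"}]'
query_level = '[{"type": "web_search_preview", "user_location": {"type": "approximate", "country": "GB"}}]'
# The Query-level entry wins for type "web_search_preview".
tools = merge_tool_annotations(engine_level, query_level)
```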

Built-in Tools

annotations:
  executor-openai-responses.ark.mckinsey.com/tools: |
    [
      {
        "type": "web_search_preview",
        "user_location": {"type": "approximate", "country": "GB", "city": "London", "region": "London"}
      }
    ]

Available types: web_search_preview, file_search, code_interpreter, computer_use.

Reasoning (GPT-5 only)

annotations:
  executor-openai-responses.ark.mckinsey.com/reasoning: '{"effort": "low"}'

Effort values: "low", "medium", "high". Omitting the annotation defaults to "medium".

Use "low" for focused single-task agents (e.g. find one URL). Use "medium" or higher for agents that must gather multiple pieces of information (e.g. structured lookup with web search across several fields) — lower effort may not perform enough searches to find all required data.

Structured Output

Constrains the response to a JSON object matching the schema — enforced at token level:

annotations:
  executor-openai-responses.ark.mckinsey.com/output-schema: |
    {
      "type": "object",
      "properties": {
        "company_name": {"type": "string"},
        "website_url": {"type": "string"}
      },
      "required": ["company_name", "website_url"],
      "additionalProperties": false
    }
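Because enforcement happens at the token level, the reply is guaranteed to parse as a JSON object with the required fields. A minimal client-side check for the schema above (illustrative only, since the API already guarantees the shape) could look like:

```python
import json

REQUIRED_FIELDS = ["company_name", "website_url"]  # from the schema above

def parse_structured_output(raw: str) -> dict:
    """Parse the model's JSON reply and assert the required keys exist."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"response missing required fields: {missing}")
    return data

reply = '{"company_name": "Acme Ltd", "website_url": "https://acme.example"}'
company = parse_structured_output(reply)["company_name"]  # "Acme Ltd"
```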

Examples

See examples/ for ready-to-use YAML manifests and demo scripts.

Running the demo

Against a live cluster:

# Apply all CRDs and run each example, printing prompt, input and response
examples/demo.sh

Locally without Kubernetes:

export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://your-endpoint  # optional
export MODEL_NAME=gpt-5.2-2025-12-11          # optional
python3 examples/demo_local.py

Example manifests

| Example | What it shows |
| --- | --- |
| website-search-agent.yaml | web_search_preview with UK location context |
| company-lookup-agent.yaml | Web search + structured JSON output (company data) |
| sql-generator-agent.yaml | CFG/Lark grammar-constrained SQL generation |
| dsl-generator-agent.yaml | CFG/Lark grammar for a functional pipeline DSL |
| companies-house-agent.yaml | MCP function tools via spec.tools |

Configuration

| Env Var | Default | Description |
| --- | --- | --- |
| SESSIONS_DIR | /data/sessions | Directory for persisting response_id per conversation |
| MAX_TOOL_ITERATIONS | 10 | Max function-call loop iterations before returning |
| OTEL_INSTRUMENTATION_ENABLED | false | Enable OpenAI OTEL instrumentation |
| PORT | 8000 | HTTP server port |
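MAX_TOOL_ITERATIONS caps the function-call loop: each iteration sends tool results back to the model and checks whether it requests further calls. A simplified sketch, where `call_model` and `run_tool` are hypothetical stand-ins for the executor's OpenAI and MCP plumbing:

```python
MAX_TOOL_ITERATIONS = 10  # mirrors the env var default

def run_tool_loop(call_model, run_tool, user_input: str) -> str:
    """Loop until the model stops requesting tools or the cap is hit.

    call_model(payload) -> (text, pending_tool_calls)
    run_tool(call) -> tool result to feed back to the model
    """
    text, tool_calls = call_model(user_input)
    for _ in range(MAX_TOOL_ITERATIONS):
        if not tool_calls:
            return text  # model produced a final answer
        results = [run_tool(call) for call in tool_calls]
        text, tool_calls = call_model(results)
    # Cap reached: return the last text rather than loop forever.
    return text
```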