
Models

Models define AI language model configurations for agents to use. Agents use the model named default if no specific model is configured.

OpenAI

```yaml
# Example OpenAI model.
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: default
spec:
  # The 'completion' type is for text completion models.
  # The 'openai' provider supports any OpenAI specification compatible model,
  # including OpenAI, Google Gemini (in OpenAI compatibility mode), Anthropic
  # Claude and so on.
  provider: openai
  type: completion
  model:
    # The specific model to use.
    value: gpt-4o
  config:
    openai:
      # API endpoint URL
      baseUrl:
        value: "https://api.openai.com/v1"
      # API authentication key - this should be set to a Kubernetes Secret
      # for security purposes.
      apiKey:
        valueFrom:
          secretKeyRef:
            name: default-model-token
            key: token
      # Optional model generation parameters
      properties:
        temperature:
          value: "0.7"
        max_tokens:
          value: "4096"
---
# Example of a secret that can be used to configure the API key for a model.
apiVersion: v1
kind: Secret
metadata:
  name: default-model-token
type: Opaque
stringData:
  token: "your-api-key-here"
```

An API key secret can also be created like so:

```bash
kubectl create secret generic default-model-token --from-literal=token="your-api-key-here"
```

Azure OpenAI

Azure models support three authentication methods via config.azure.auth: API Key (default), Managed Identity (AKS node identity), and Workload Identity (K8s ServiceAccount federated to Azure). Use exactly one.

API Key (legacy or explicit):

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: gpt-4o-mini
spec:
  provider: azure
  type: completion
  model:
    value: gpt-4o-mini
  config:
    azure:
      baseUrl:
        value: "https://your-resource.openai.azure.com"
      apiKey:
        valueFrom:
          secretKeyRef:
            name: azure-openai-key
            key: token
      apiVersion:
        value: "2024-12-01-preview"
```

Managed Identity (AKS): Use when Ark runs on AKS and the cluster or node pool has a User-Assigned Managed Identity with access to the Azure OpenAI resource. No API key is stored on the cluster.

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: gpt-4o-managed-identity
spec:
  provider: azure
  model:
    value: gpt-4o
  config:
    azure:
      baseUrl:
        value: "https://your-resource.openai.azure.com"
      apiVersion:
        value: "2024-02-15-preview"
      auth:
        managedIdentity: {}
        # Or with user-assigned identity:
        # managedIdentity:
        #   clientId:
        #     value: "12345678-1234-1234-1234-123456789abc"
```

Workload Identity: Use when running on any Kubernetes cluster (including non-Azure) with Azure Workload Identity configured. The pod’s ServiceAccount is federated to an Azure Managed Identity.

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: gpt-4o-workload-identity
spec:
  provider: azure
  model:
    value: gpt-4o
  config:
    azure:
      baseUrl:
        value: "https://your-resource.openai.azure.com"
      apiVersion:
        value: "2024-02-15-preview"
      auth:
        workloadIdentity:
          clientId:
            value: "12345678-1234-1234-1234-123456789abc"
          tenantId:
            value: "87654321-4321-4321-4321-210987654321"
```

Testing Azure auth:

  • API Key: create a Secret with your key, apply a model using config.azure.auth.apiKey (or the legacy top-level apiKey), then run a query.
  • Managed Identity: on AKS with managed identity enabled, apply a model with auth.managedIdentity and ensure the identity has the “Cognitive Services User” role on the Azure OpenAI resource.
  • Workload Identity: configure federation (e.g. Azure AD Workload Identity), apply a model with auth.workloadIdentity and the matching clientId/tenantId, then run a query from a pod using the federated ServiceAccount.
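A model can be exercised directly with a query. The following is a minimal sketch, assuming ARK's Query resource and the workload-identity model defined above (the query name and input are illustrative):

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Query
metadata:
  name: azure-auth-smoke-test
spec:
  input: "Reply with the single word: ok"
  targets:
    # Target the model directly rather than an agent.
    - type: model
      name: gpt-4o-workload-identity
```

If authentication is working, the query status should contain a response; a failure here usually points at a missing role assignment or a federation misconfiguration rather than at the model itself.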

AWS Bedrock

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: claude-haiku
spec:
  # The AWS Bedrock provider with completion model type.
  provider: bedrock
  type: completion
  model:
    value: "us.anthropic.claude-3-5-haiku-20241022-v1:0"
  config:
    bedrock:
      # AWS region (optional, uses default)
      region:
        value: "us-west-2"
      # Base URL - optional; only needed if a non-default endpoint is required.
      baseUrl:
        value: "https://aws-bedrock.prod.ai-gateway.quantumblack.com/your-project-id"
      # Explicit credentials (optional, defaults to IAM role)
      accessKeyId:
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: access-key-id
      secretAccessKey:
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: secret-access-key
      # Session token for temporary credentials or JWT tokens
      sessionToken:
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: session-token
      # Custom model ARN (optional)
      modelArn:
        value: "arn:aws:bedrock:..."
      properties:
        temperature:
          value: "0.7"
        max_tokens:
          value: "4096"
```

Google Gemini and Anthropic Models

Both Google Gemini and Anthropic provide OpenAI-compatible endpoints, allowing you to use their models with the openai provider and completion type. The base URLs are:

  • https://generativelanguage.googleapis.com/v1beta/openai for Google Gemini
  • https://api.anthropic.com/v1 for Anthropic Claude

Most other providers also support OpenAI-compatible base URLs; check their documentation for details.
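As a sketch, a Gemini model configured through the openai provider might look like the following (the model name gemini-2.0-flash and the secret name are illustrative):

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: gemini-flash
spec:
  provider: openai
  type: completion
  model:
    value: gemini-2.0-flash
  config:
    openai:
      # Gemini's OpenAI-compatibility endpoint.
      baseUrl:
        value: "https://generativelanguage.googleapis.com/v1beta/openai"
      apiKey:
        valueFrom:
          secretKeyRef:
            name: gemini-model-token
            key: token
```

The same pattern applies to Anthropic Claude with the base URL listed above and an Anthropic API key in the referenced Secret.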

Model Properties

All model providers support a flexible properties system that allows you to customize model behavior by setting parameters like temperature, max tokens, and other OpenAI ChatCompletion parameters.

Basic Properties Example

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: gpt-4-custom
spec:
  provider: openai
  type: completion
  model:
    value: gpt-4o
  config:
    openai:
      properties:
        temperature:
          value: "0.1"
        max_tokens:
          value: "1000"
      baseUrl:
        value: "https://api.openai.com/v1"
      apiKey:
        valueFrom:
          secretKeyRef:
            name: openai-secret
            key: token
```

Any OpenAI ChatCompletion parameters can be provided through the properties system, including temperature, max_tokens, top_p, frequency_penalty, presence_penalty, stop, seed, and more.
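For example, a config fragment biasing a model towards reproducible output might look like this (the parameter names follow the OpenAI ChatCompletion API; the values are illustrative, and note that all property values are strings):

```yaml
config:
  openai:
    properties:
      # Low temperature for near-deterministic sampling
      temperature:
        value: "0"
      top_p:
        value: "1"
      # Fixed seed for best-effort reproducibility
      seed:
        value: "42"
```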

Custom HTTP Headers

OpenAI and Azure models support custom HTTP headers for advanced authentication and routing scenarios. Headers can be specified with direct values or loaded from Kubernetes Secrets and ConfigMaps.

Supported Providers:

  • OpenAI
  • Azure OpenAI

Basic Headers Example

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: default
spec:
  provider: azure
  type: completion
  model:
    value: gpt-4o
  config:
    azure:
      baseUrl:
        value: "https://your-resource.openai.azure.com"
      apiKey:
        valueFrom:
          secretKeyRef:
            name: azure-openai-key
            key: token
      apiVersion:
        value: "2024-12-01-preview"
      # Custom HTTP headers sent with every request
      headers:
        - name: X-Custom-Header
          value:
            value: "direct-header-value"
        - name: X-Request-ID
          value:
            value: "my-app-v1"
```

Headers from Secrets and ConfigMaps

Load sensitive header values from Kubernetes Secrets or configuration from ConfigMaps:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gateway-credentials
type: Opaque
stringData:
  api-key: "your-gateway-api-key"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  user-agent: "MyApp/1.0"
---
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: default
spec:
  provider: azure
  type: completion
  model:
    value: gpt-4o
  config:
    azure:
      baseUrl:
        value: "https://your-resource.openai.azure.com"
      apiKey:
        valueFrom:
          secretKeyRef:
            name: azure-openai-key
            key: token
      apiVersion:
        value: "2024-12-01-preview"
      headers:
        # Load from Secret
        - name: X-API-Gateway-Key
          value:
            valueFrom:
              secretKeyRef:
                name: gateway-credentials
                key: api-key
        # Load from ConfigMap
        - name: User-Agent
          value:
            valueFrom:
              configMapKeyRef:
                name: app-config
                key: user-agent
        # Direct value
        - name: X-Client-ID
          value:
            value: "production-client"
```

OpenAI Provider Headers

Headers work the same way with the OpenAI provider:

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Model
metadata:
  name: openai-with-headers
spec:
  provider: openai
  type: completion
  model:
    value: gpt-4o
  config:
    openai:
      baseUrl:
        value: "https://api.openai.com/v1"
      apiKey:
        valueFrom:
          secretKeyRef:
            name: openai-secret
            key: token
      headers:
        - name: X-Custom-Header
          value:
            value: "my-value"
```

Status and Health Checking

ARK continuously monitors model availability through periodic health checks. The model controller probes each model at regular intervals to ensure it remains accessible and functional.

Health Check Configuration

The pollInterval field controls how often the model is probed:

```yaml
spec:
  pollInterval: 1m  # Default: 1 minute
```

Status Conditions

Model status is tracked using Kubernetes conditions pattern. The primary condition is ModelAvailable:

```yaml
status:
  conditions:
    - type: ModelAvailable
      status: "True"  # True/False/Unknown
      reason: "Available"  # Short reason for the condition
      message: "Model is available and probed successfully"
      lastTransitionTime: "2024-01-15T10:30:00Z"
```

Condition States:

  • ModelAvailable: True - Model successfully responds to test prompts
  • ModelAvailable: False - Model probe failed (network error, authentication issue, etc.)
  • ModelAvailable: Unknown - Initial state before first probe completes

Viewing Model Status

Check model availability using kubectl:

```bash
# List models with availability status
kubectl get models

NAME           TYPE      MODEL                AVAILABLE   AGE
gpt-4-model    azure     gpt-4.1-mini         True        5m
claude-model   bedrock   claude-3-sonnet-v1   False       3m

# Get detailed status
kubectl describe model gpt-4-model
```

The AVAILABLE column shows the current state of the ModelAvailable condition, making it easy to identify models that may have connectivity or configuration issues.
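The condition can also be read directly with a JSONPath query, e.g. for a model named default:

```bash
# Print the status of the ModelAvailable condition (True/False/Unknown)
kubectl get model default \
  -o jsonpath='{.status.conditions[?(@.type=="ModelAvailable")].status}'
```

Because ModelAvailable follows the standard Kubernetes conditions pattern, kubectl wait model/default --for=condition=ModelAvailable --timeout=60s can be used to block until the model becomes available, for example in CI or bootstrap scripts.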

Agent Model Configuration

Agents can specify which model to use; if no model is specified, the default model is used. If an agent references a model that doesn’t exist, the agent remains in a pending state. The modelRef field specifies the model name:

```yaml
apiVersion: ark.mckinsey.com/v1alpha1
kind: Agent
metadata:
  name: weather-agent
spec:
  prompt: "You are a helpful weather assistant"
  # Explicitly set the model to use
  modelRef:
    # Specify the model name. If no modelRef is provided then 'default' is used.
    name: gpt-4o-mini
```