KubeChain is a cloud-native orchestrator for AI Agents built on Kubernetes. It supports long-lived outer-loop agents that can process asynchronous execution of both LLM inference and long-running tool calls. It's designed for simplicity and gives strong durability and reliability guarantees for agents that make asynchronous tool calls like contacting humans or delegating work to other agents.
- LLM: Provider + API Keys + Parameters
- Agent: LLM + System Prompt + Tools
- Tool: Function, API, Docker container, or another Agent
- Task: Agent + User Message
- TaskRun: Task + Current context window
To run KubeChain, you'll need:
- kubectl - Command-line tool for Kubernetes
- kind - For running local Kubernetes clusters
- OpenAI API Key - For LLM functionality
- Docker - For building and running container images
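Before going further, it can help to confirm the tools above are installed. The following is a minimal sketch and not part of KubeChain: `missing_tools` is a hypothetical helper name, and it only checks that each command is on your `PATH`, not that its version is compatible.

```shell
# Hypothetical helper (not shipped with KubeChain): print the name of every
# listed command that cannot be found on PATH. Prints nothing if all exist.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, `missing_tools kubectl kind docker` prints nothing when all three are installed. The steps below read the API key from the `OPENAI_API_KEY` environment variable, so `[ -n "$OPENAI_API_KEY" ]` is a quick way to confirm it is set.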
- Create a Kind cluster
kind create cluster --config kubechain-example/kind/kind-config.yaml
- Add your OpenAI API key as a Kubernetes secret
kubectl create secret generic openai \
  --from-literal=OPENAI_API_KEY=$OPENAI_API_KEY \
  --namespace=default
- Deploy the KubeChain operator to your cluster:
kubectl apply -f https://raw.githubusercontent.com/humanlayer/smallchain/refs/heads/main/kubechain/config/release/latest.yaml
To install just the CRDs:
kubectl apply -f https://raw.githubusercontent.com/humanlayer/smallchain/refs/heads/main/kubechain/config/release/latest-crds.yaml
To install a specific version:
kubectl apply -f https://raw.githubusercontent.com/humanlayer/smallchain/refs/heads/main/kubechain/config/release/v0.1.0.yaml
Applying the full release manifest creates the necessary CRDs and deploys the KubeChain operator components to your cluster.
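To confirm the install registered KubeChain's resource types, you can list the cluster's CRDs and filter on the API group. This is a small sketch, assuming the CRDs live under the `kubechain.humanlayer.dev` group that the manifests in this guide use; `kubechain_crds` is a hypothetical helper name.

```shell
# Hypothetical helper: list installed CRDs in the kubechain.humanlayer.dev group.
# Returns non-zero (grep's exit status) if none are found.
kubechain_crds() {
  kubectl get crds -o name | grep 'kubechain\.humanlayer\.dev'
}
```

Run `kubechain_crds` after the apply. An empty result suggests the CRDs are not installed yet.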
- Define an LLM resource
cat <<EOF | kubectl apply -f -
apiVersion: kubechain.humanlayer.dev/v1alpha1
kind: LLM
metadata:
  name: gpt-4o
spec:
  provider: openai
  apiKeyFrom:
    secretKeyRef:
      name: openai
      key: OPENAI_API_KEY
EOF
Check the created LLM:
kubectl get llm
Output:
NAME     PROVIDER   READY   STATUS
gpt-4o   openai     true    Ready
Using `-o wide` and `describe`
kubectl get llm -o wide
Output:
NAME     PROVIDER   READY   STATUS   DETAIL
gpt-4o   openai     true    Ready    OpenAI API key validated successfully
kubectl describe llm
Output:
Name:         gpt-4o
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  kubechain.humanlayer.dev/v1alpha1
Kind:         LLM
Metadata:
  Creation Timestamp:  2025-03-21T20:18:17Z
  Generation:          2
  Resource Version:    1682222
  UID:                 973098fb-2b8d-46b3-be49-81592e0b8f4e
Spec:
  API Key From:
    Secret Key Ref:
      Key:   OPENAI_API_KEY
      Name:  openai
  Provider:  openai
Status:
  Ready:          true
  Status:         Ready
  Status Detail:  OpenAI API key validated successfully
Events:
  Type    Reason               Age                 From            Message
  ----    ------               ----                ----            -------
  Normal  ValidationSucceeded  32m (x3 over 136m)  llm-controller  OpenAI API key validated successfully
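The same readiness check can be scripted, which is handy before creating resources that depend on this one. This is a sketch under one assumption: that the `Ready` value shown under `Status` in the `describe` output is exposed as the `.status.ready` field on the resource. `is_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the named resource reports ready.
# Assumes the describe output's "Status: Ready" maps to the .status.ready field.
# Usage: is_ready KIND NAME
is_ready() {
  kind=$1
  name=$2
  [ "$(kubectl get "$kind" "$name" -o jsonpath='{.status.ready}')" = "true" ]
}
```

For example, `is_ready llm gpt-4o && echo "LLM is ready"`. Because the kind is an argument, the same call should work for the Agent and Task resources created in the next steps, provided they expose the same field.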
- Create an Agent resource
cat <<EOF | kubectl apply -f -
apiVersion: kubechain.humanlayer.dev/v1alpha1
kind: Agent
metadata:
  name: my-assistant
spec:
  llmRef:
    name: gpt-4o
  system: |
    You are a helpful assistant. Your job is to help the user with their tasks.
EOF
Check the created Agent:
kubectl get agent
Output:
NAME           READY   STATUS
my-assistant   true    Ready
Using `-o wide` and `describe`
kubectl get agent -o wide
Output:
NAME           READY   STATUS   DETAIL
my-assistant   true    Ready    All dependencies validated successfully
kubectl describe agent
Output:
Name:         my-assistant
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  kubechain.humanlayer.dev/v1alpha1
Kind:         Agent
Metadata:
  Creation Timestamp:  2025-03-21T22:06:27Z
  Generation:          1
  Resource Version:    1682754
  UID:                 e389b3e5-c718-4abd-aa72-d4fc82c9b992
Spec:
  Llm Ref:
    Name:  gpt-4o
  System:  You are a helpful assistant. Your job is to help the user with their tasks.
Status:
  Ready:          true
  Status:         Ready
  Status Detail:  All dependencies validated successfully
Events:
  Type    Reason               Age                From              Message
  ----    ------               ----               ----              -------
  Normal  Initializing         64m                agent-controller  Starting validation
  Normal  ValidationSucceeded  64m (x2 over 64m)  agent-controller  All dependencies validated successfully
- Create a Task resource
cat <<EOF | kubectl apply -f -
apiVersion: kubechain.humanlayer.dev/v1alpha1
kind: Task
metadata:
  name: hello-world-task
spec:
  agentRef:
    name: my-assistant
  message: "What is the capital of the moon?"
EOF
Check the created Task:
kubectl get task
Output:
NAME               READY   STATUS   AGENT          MESSAGE
hello-world-task   true    Ready    my-assistant   What is the capital of the moon?
Using `-o wide` and `describe`
kubectl get task -o wide
Output:
NAME               READY   STATUS   DETAIL             AGENT          MESSAGE                            OUTPUT
hello-world-task   true    Ready    Task Run Created   my-assistant   What is the capital of the moon?
kubectl describe task
Output:
Name:         hello-world-task
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  kubechain.humanlayer.dev/v1alpha1
Kind:         Task
Metadata:
  Creation Timestamp:  2025-03-21T22:14:09Z
  Generation:          1
  Resource Version:    1683590
  UID:                 8d0c7d4a-88db-4005-b212-a2c3a6956af3
Spec:
  Agent Ref:
    Name:   my-assistant
  Message:  What is the capital of the moon?
Status:
  Ready:          true
  Status:         Ready
  Status Detail:  Task Run Created
Events:
  Type    Reason          Age  From             Message
  ----    ------          ---- ----             -------
  Normal  Initializing    56m  task-controller  Starting validation
  Normal  TaskRunCreated  56m  task-controller  Created TaskRun hello-world-task-1
By default, creating a task will create an initial TaskRun to execute the task.
For now, our task run should complete quickly and return a FinalAnswer.
kubectl get taskrun
Output:
NAME                 READY   STATUS   PHASE         TASK               PREVIEW   OUTPUT
hello-world-task-1   true    Ready    FinalAnswer   hello-world-task             The Moon does not have a capital. It is a natural satellite of Earth and lacks any governmental structure or human habitation that would necessitate a capital city.
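For task runs that take longer, a script can poll until the run finishes instead of re-running `kubectl get taskrun` by hand. This is a minimal sketch, assuming the PHASE and OUTPUT columns above reflect the `.status.phase` and `.status.output` fields of the TaskRun. `wait_for_final_answer` and the `POLL_INTERVAL` variable are hypothetical names, not part of KubeChain.

```shell
# Hypothetical helper: poll a TaskRun until its phase is FinalAnswer,
# then print its output.
# Usage: wait_for_final_answer NAME [MAX_TRIES]
# POLL_INTERVAL (seconds between polls) defaults to 2.
wait_for_final_answer() {
  taskrun=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    phase=$(kubectl get taskrun "$taskrun" -o jsonpath='{.status.phase}')
    if [ "$phase" = "FinalAnswer" ]; then
      # The run has finished; print the final output.
      kubectl get taskrun "$taskrun" -o jsonpath='{.status.output}'
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-2}"
  done
  echo "timed out waiting for $taskrun" >&2
  return 1
}
```

For example, `wait_for_final_answer hello-world-task-1` polls up to 30 times and exits non-zero if the run never reaches `FinalAnswer`.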
As shown above, kubectl get taskrun reports the status of a taskrun.
For more detail, such as the full context window, you can use:
kubectl describe taskrun
Output:
Name:         hello-world-task-1
Namespace:    default
Labels:       kubechain.humanlayer.dev/task=hello-world-task
Annotations:  <none>
API Version:  kubechain.humanlayer.dev/v1alpha1
Kind:         TaskRun
Metadata:
  Creation Timestamp:  2025-03-21T22:14:09Z
  Generation:          1
  Owner References:
    API Version:     kubechain.humanlayer.dev/v1alpha1
    Controller:      true
    Kind:            Task
    Name:            hello-world-task
    UID:             8d0c7d4a-88db-4005-b212-a2c3a6956af3
  Resource Version:  1683602
  UID:               53b1b69a-fb49-431b-857a-1cafe017a544
Spec:
  Task Ref:
    Name:  hello-world-task
Status:
  Context Window:
    Content:  You are a helpful assistant. Your job is to help the user with their tasks.
    Role:     system
    Content:  What is the capital of the moon?
    Role:     user
    Content:  The Moon does not have a capital. It is a natural satellite of Earth and lacks any governmental structure or human habitation that would necessitate a capital city.
    Role:     assistant
  Output:         The Moon does not have a capital. It is a natural satellite of Earth and lacks any governmental structure or human habitation that would necessitate a capital city.
  Phase:          FinalAnswer
  Ready:          true
  Status:         Ready
  Status Detail:  LLM final response received
Events:
  Type    Reason               Age  From                Message
  ----    ------               ---- ----                -------
  Normal  Waiting              17m  taskrun-controller  Waiting for task "hello-world-task" to become ready
  Normal  ValidationSucceeded  17m  taskrun-controller  Task validated successfully
  Normal  LLMFinalAnswer       17m  taskrun-controller  LLM response received successfully
or
kubectl get taskrun -o yaml
Output (truncated for brevity)
```
apiVersion: v1
items:
- apiVersion: kubechain.humanlayer.dev/v1alpha1
  kind: TaskRun
  metadata:
    labels:
      kubechain.humanlayer.dev/task: hello-world-task
    name: hello-world-task-1
    namespace: default
  # ...snip...
  spec:
    taskRef:
      name: hello-world-task
  status:
    contextWindow:
    - content: |
        You are a helpful assistant. Your job is to help the user with their tasks.
      role: system
    - content: What is the capital of the moon?
      role: user
    - content: The Moon does not have a capital. It is a natural satellite of Earth and lacks any governmental structure or human habitation that would necessitate a capital city.
      role: assistant
    output: The Moon does not have a capital. It is a natural satellite of Earth and lacks any governmental structure or human habitation that would necessitate a capital city.
    phase: FinalAnswer
    ready: true
    status: Ready
    statusDetail: LLM final response received
  # ...snip...
```
Agents aren't that interesting without tools. Let's add a basic MCP server tool to our agent:
cat <<EOF | kubectl apply -f -
apiVersion: kubechain.humanlayer.dev/v1alpha1
kind: MCPServer
metadata:
  name: fetch
spec:
  type: "stdio"
  command: "uvx"
  args: ["mcp-server-fetch"]
EOF
kubectl get mcpserver
Output:
NAME    READY   STATUS
fetch   true    Ready
kubectl describe mcpserver
Output:
Name:         fetch
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  kubechain.humanlayer.dev/v1alpha1
Kind:         MCPServer
Metadata:
  Creation Timestamp:  2025-03-21T22:18:45Z
  Generation:          1
  Resource Version:    1684392
  UID:                 b2c43f91-c8e2-4d3a-9c82-f39d12e48a92
Spec:
  Command:  uvx
  Args:
    - mcp-server-fetch
Status:
  Connected:  true
  Tools:
    - Name:         fetch_url
      Description:  Fetches content from a URL
      Input Schema:
        Type:  object
        Properties:
          url:
            Type:  string
        Required:
          - url
Events:
  Type    Reason           Age  From            Message
  ----    ------           ---- ----            -------
  Normal  ToolsDiscovered  2m   mcp-controller  Discovered 1 tool(s)
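If you only need the names of the discovered tools, a JSONPath query is more compact than reading the full `describe` output. This is a sketch that assumes the `Status: Tools: Name` entries above are stored at `.status.tools[*].name`. That field path is inferred from the output, and `mcp_tool_names` is a hypothetical helper name.

```shell
# Hypothetical helper: print the tool names an MCPServer has discovered,
# one per line. Assumes tools are listed under .status.tools with a name field.
# Usage: mcp_tool_names SERVER_NAME
mcp_tool_names() {
  kubectl get mcpserver "$1" -o jsonpath='{range .status.tools[*]}{.name}{"\n"}{end}'
}
```

For example, run `mcp_tool_names fetch` against the server defined above.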
Then we can update our agent in-place to give it access to the fetch tool:
cat <<EOF | kubectl apply -f -
apiVersion: kubechain.humanlayer.dev/v1alpha1
kind: Agent
metadata:
  name: my-assistant
spec:
  llmRef:
    name: gpt-4o
  system: |
    You are a helpful assistant. Your job is to help the user with their tasks.
  mcpServers:
    - name: fetch
EOF
Let's make a new task that uses the fetch tool:
cat <<EOF | kubectl apply -f -
apiVersion: kubechain.humanlayer.dev/v1alpha1
kind: Task
metadata:
  name: fetch-task
spec:
  agentRef:
    name: my-assistant
  message: "What is on the front page of news.google.com?"
EOF
Remove our agent, task and related resources:
kubectl delete taskruntoolcall --all
kubectl delete taskrun --all
kubectl delete task --all
kubectl delete agent --all
kubectl delete mcpserver --all
kubectl delete llm --all
Remove the OpenAI secret:
kubectl delete secret openai
Remove the operator, resources and custom resource definitions:
kustomize build kubechain/config/default | kubectl delete --ignore-not-found=true -f -
If you made a kind cluster, you can delete it with:
kind delete cluster --name kubechain-local
- Kubernetes-Native Architecture: KubeChain is built as a Kubernetes operator, using Custom Resource Definitions (CRDs) to define and manage LLMs, Agents, Tools, Tasks, and TaskRuns.
- Durable Agent Execution: KubeChain implements something like async/await at the infrastructure layer, checkpointing a conversation chain whenever a tool call or agent delegation occurs, with the ability to resume from that checkpoint when the operation completes.
- Dynamic Workflow Planning: Allows agents to reprioritize and replan their workflows mid-execution.
- Observable Control Loop Architecture: KubeChain uses a simple control loop architecture that makes agent execution easy to observe and debug.
- Scalable: Leverages Kubernetes for scalability and resilience. If you have k8s / etcd, you can run reliable distributed async agents.
- Human Approvals and Input: Because task execution is durable across long-running function calls, an agent can ask a human for input or wait for an approval through a simple tool-based interface.
- Simplicity: Leverages the unique property of AI applications where the entire "call stack" can be expressed as the rolling context window accumulated through interactions and tool calls. No separate execution state.
- Clarity: Easy to understand what's happening and what the framework is doing with your prompts.
- Control: Ability to customize every aspect of agent behavior without framework limitations.
- Modularity: Composed of small control loops with limited scope that each progress the state of the world.
- Durability: Resilient to failures as a distributed system.
- Extensibility: Because agents are YAML, it's easy to build and share agents, tools, and tasks.
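The durable-execution and simplicity points above rest on one idea: if the context window is the only execution state, whatever component picks up a run can decide what to do next just by reading it. The toy sketch below illustrates that idea with a plain file standing in for a TaskRun's stored context window. It is not KubeChain's implementation: the function names, the tab-separated file format, and the `tool_call` and `tool` roles are all assumptions made for illustration (only `system`, `user`, and `assistant` appear in the output earlier in this guide).

```shell
# Toy illustration only; nothing here is KubeChain's real API or storage format.
# A file holds the context window as "role<TAB>content" lines.

# Append one message to the context window file.
# Usage: append_message FILE ROLE CONTENT
append_message() {
  printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# Decide the next step from the last recorded message alone.
# Usage: next_step FILE
next_step() {
  last_role=$(tail -n 1 "$1" | cut -f 1)
  case "$last_role" in
    user|tool) echo "call-llm" ;;    # the model owes a response
    tool_call) echo "await-tool" ;;  # checkpointed: waiting on a tool result
    assistant) echo "done" ;;        # final answer recorded
    *)         echo "unknown" ;;
  esac
}
```

Because `next_step` reads only the file, a process that crashes after recording a `tool_call` can be replaced by a fresh one that resumes at `await-tool`, with no other state to recover.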
KubeChain is open-source and we welcome contributions in the form of issues, documentation, pull requests, and more. See CONTRIBUTING.md for more details.
KubeChain is licensed under the Apache 2 License.