Tracing
This page includes details about the Tracing feature, which provides comprehensive tracing capabilities for AI agents.
Feature overview
The Tracing feature is a powerful monitoring and debugging tool that captures detailed information about agent runs, including:
- Strategy execution
- LLM calls
- Tool invocations
- Node execution within the agent graph
This feature operates by intercepting key events in the agent pipeline and forwarding them to configurable message processors. These processors can output the trace information to various destinations such as log files or other types of files in the filesystem, enabling developers to gain insights into agent behavior and troubleshoot issues effectively.
Event flow
- The Tracing feature intercepts events in the agent pipeline.
- Events are filtered based on the configured message filter.
- Filtered events are passed to registered message processors.
- Message processors format and output the events to their respective destinations.
Configuration and initialization
Basic setup
To use the Tracing feature, you need to:
- Have one or more message processors (you can use the existing ones or create your own).
- Install
Tracing
in your agent. - Configure the message filter (optional).
- Add the message processors to the feature.
// Defining a logger/file that will be used as a destination of trace messages
val logger = LoggerFactory.create("my.trace.logger")
val fs = JVMFileSystemProvider.ReadWrite
val path = Paths.get("/path/to/trace.log")
// Creating an agent
val agent = AIAgent(...) {
install(Tracing) {
// Configure message processors to handle trace events
addMessageProcessor(TraceFeatureMessageLogWriter(logger))
addMessageProcessor(TraceFeatureMessageFileWriter(outputPath, fileSystem::sink))
// Optionally filter messages
messageFilter = { message ->
// Only trace LLM calls and tool calls
message is LLMCallStartEvent || message is ToolCallEvent
}
}
}
Message filtering
You can process all existing events or select some of them based on specific criteria. The message filter lets you control which events are processed. This is useful for focusing on specific aspects of agent runs:
// Filter for LLM-related events only
messageFilter = { message ->
message is LLMCallStartEvent ||
message is LLMCallEndEvent ||
message is LLMCallWithToolsStartEvent ||
message is LLMCallWithToolsEndEvent
}
// Filter for tool-related events only
messageFilter = { message ->
message is ToolCallsEvent ||
message is ToolCallResultEvent ||
message is ToolValidationErrorEvent ||
message is ToolCallFailureEvent
}
// Filter for node execution events only
messageFilter = { message ->
message is AIAgentNodeExecutionStartEvent || message is AIAgentNodeExecutionEndEvent
}
Large trace volumes
For agents with complex strategies or long-running executions, the volume of trace events can be substantial. Consider using the following methods to manage the volume of events:
- Use specific message filters to reduce the number of events.
- Implement custom message processors with buffering or sampling.
- Use file rotation for log files to prevent them from growing too large.
Dependency graph
The Tracing feature has the following dependencies:
Tracing
├── AIAgentPipeline (for intercepting events)
├── TraceFeatureConfig
│ └── FeatureConfig
├── Message Processors
│ ├── TraceFeatureMessageLogWriter
│ │ └── FeatureMessageLogWriter
│ ├── TraceFeatureMessageFileWriter
│ │ └── FeatureMessageFileWriter
│ └── TraceFeatureMessageRemoteWriter
│ └── FeatureMessageRemoteWriter
└── Event Types (from ai.koog.agents.core.feature.model)
├── AIAgentStartedEvent
├── AIAgentFinishedEvent
├── AIAgentRunErrorEvent
├── AIAgentStrategyStartEvent
├── AIAgentStrategyFinishedEvent
├── AIAgentNodeExecutionStartEvent
├── AIAgentNodeExecutionEndEvent
├── LLMCallStartEvent
├── LLMCallWithToolsStartEvent
├── LLMCallEndEvent
├── LLMCallWithToolsEndEvent
├── ToolCallEvent
├── ToolValidationErrorEvent
├── ToolCallFailureEvent
└── ToolCallResultEvent
Examples and quickstarts
Basic tracing to logger
// Create a logger
val logger = LoggerFactory.create("my.agent.trace")
// Create an agent with tracing
val agent = AIAgent(...) {
install(Tracing) {
addMessageProcessor(TraceFeatureMessageLogWriter(logger))
}
}
// Run the agent
agent.run("Hello, agent!")
Error handling and edge cases
No message processors
If no message processors are added to the Tracing feature, a warning will be logged:
The feature will still intercept events, but they will not be processed or output anywhere.
Resource management
Message processors may hold resources (like file handles) that need to be properly released. Use the use
extension
function to ensure proper cleanup:
TraceFeatureMessageFileWriter(fs, path).use { writer ->
// Use the writer
install(Tracing) {
addMessageProcessor(writer)
}
// Run the agent
agent.run(input)
// Writer will be automatically closed when the block exits
}
Tracing specific events to file
// Create a file writer
val fs = JVMFileSystemProvider.ReadWrite
val path = Paths.get("/path/to/llm-calls.log")
val writer = TraceFeatureMessageFileWriter(fs, path)
// Create an agent with filtered tracing
val agent = AIAgent(...) {
install(Tracing) {
// Only trace LLM calls
messageFilter = { message ->
message is LLMCallWithToolsStartEvent || message is LLMCallWithToolsEndEvent
}
addMessageProcessor(writer)
}
}
// Run the agent
agent.run("Generate a story about a robot.")
Tracing specific events to remote endpoint
You use tracing to remote endpoints when you need to send event data via the network. Once initiated, tracing to a remote endpoint launches a light server at the specified port number and sends events via Kotlin Server-Sent Events (SSE).
// Create a file writer
val port = 4991
val serverConfig = ServerConnectionConfig(port = port)
val writer = TraceFeatureMessageRemoteWriter(connectionConfig = serverConfig)
// Create an agent with filtered tracing
val agent = AIAgent(...) {
install(Tracing) {
// Only trace LLM calls
messageFilter = { message ->
message is LLMCallWithToolsStartEvent || message is LLMCallWithToolsEndEvent
}
addMessageProcessor(writer)
}
}
// Run the agent
agent.run("Generate a story about a robot.")
FeatureMessageRemoteClient
to receive events and deserialize them.
// Create the client configuration
// Use the same port number as for the server emitting agent events
val clientConfig = AIAgentFeatureClientConnectionConfig(
host = "127.0.0.1",
port = 4991
)
// Create a client instance
val client = FeatureMessageRemoteClient(
connectionConfig = clientConfig,
scope = this
)
// Connect the client to the remote feature messaging service
client.connect()
// Collect events from the remote feature messaging service
val collectEvents = launch {
client.receivedMessages.consumeAsFlow().collect { message: FeatureMessage ->
// Process the received agent event
}
}
API documentation
The Tracing feature follows a modular architecture with these key components:
- Tracing: the main feature class that intercepts events in the agent pipeline.
- TraceFeatureConfig: configuration class for customizing feature behavior.
- Message Processors: components that process and output trace events:
- TraceFeatureMessageLogWriter: writes trace events to a logger.
- TraceFeatureMessageFileWriter: writes trace events to a file.
- TraceFeatureMessageRemoteWriter: sends trace events to a remote server.
FAQ and troubleshooting
The following section includes commonly asked questions and answers related to the Tracing feature.
How do I trace only specific parts of my agent's execution?
Use the messageFilter
property to filter events. For example, to trace only node execution:
install(Tracing) {
messageFilter = { message ->
message is AIAgentNodeExecutionStartEvent || message is AIAgentNodeExecutionEndEvent
}
addMessageProcessor(writer)
}
Can I use multiple message processors?
Yes, you can add multiple message processors to trace to different destinations simultaneously:
install(Tracing) {
addMessageProcessor(TraceFeatureMessageLogWriter(logger))
addMessageProcessor(TraceFeatureMessageFileWriter(fs, path))
addMessageProcessor(TraceFeatureMessageRemoteWriter(connectionConfig))
}
How can I create a custom message processor?
Implement the FeatureMessageProcessor
interface:
class CustomTraceProcessor : FeatureMessageProcessor {
override suspend fun onMessage(message: FeatureMessage) {
// Custom processing logic
when (message) {
is AIAgentNodeExecutionStartEvent -> {
// Process node start event
}
is LLMCallWithToolsEndEvent -> {
// Process LLM call end event
}
// Handle other event types
}
}
}
// Use your custom processor
install(Tracing) {
addMessageProcessor(CustomTraceProcessor())
}
For more information about existing event types that can be handled by message processors, see Predefined event types.
Predefined event types
Koog provides predefined event types that can be used in custom message processors. The predefined events can be classified into several categories, depending on the entity they relate to:
Agent events
AIAgentStartedEvent
Represents the start of an agent run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
strategyName |
String | Yes | The name of the strategy that the agent should follow. | |
eventId |
String | No | AIAgentStartedEvent |
The identifier of the event. Usually the simpleName of the event class. |
AIAgentFinishedEvent
Represents the end of an agent run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
strategyName |
String | Yes | The name of the strategy that the agent followed. | |
result |
String | Yes | The result of the agent run. Can be null if there is no result. |
|
eventId |
String | No | AIAgentFinishedEvent |
The identifier of the event. Usually the simpleName of the event class. |
AIAgentRunErrorEvent
Represents the occurrence of an error during an agent run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
strategyName |
String | Yes | The name of the strategy that the agent followed. | |
error |
AIAgentError | Yes | The specific error that occurred during the agent run. For more information, see AIAgentError. | |
eventId |
String | No | AIAgentRunErrorEvent |
The identifier of the event. Usually the simpleName of the event class. |
The AIAgentError
class provides more details about an error that occurred during an agent run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
message |
String | Yes | The message that provides more details about the specific error. | |
stackTrace |
String | Yes | The collection of stack records until the last executed code. | |
cause |
String | No | null | The cause of the error, if available. |
Strategy events
AIAgentStrategyStartEvent
Represents the start of a strategy run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
strategyName |
String | Yes | The name of the strategy. | |
eventId |
String | No | AIAgentStrategyStartEvent |
The identifier of the event. Usually the simpleName of the event class. |
AIAgentStrategyFinishedEvent
Represents the end of a strategy run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
strategyName |
String | Yes | The name of the strategy. | |
result |
String | Yes | The result of the run. | |
eventId |
String | No | AIAgentStrategyFinishedEvent |
The identifier of the event. Usually the simpleName of the event class. |
Node events
AIAgentNodeExecutionStartEvent
Represents the start of a node run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
nodeName |
String | Yes | The name of the node whose run started. | |
input |
String | Yes | The input value for the node. | |
eventId |
String | No | AIAgentNodeExecutionStartEvent |
The identifier of the event. Usually the simpleName of the event class. |
AIAgentNodeExecutionEndEvent
Represents the end of a node run. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
nodeName |
String | Yes | The name of the node whose run ended. | |
input |
String | Yes | The input value for the node. | |
output |
String | Yes | The output value produced by the node. | |
eventId |
String | No | AIAgentNodeExecutionEndEvent |
The identifier of the event. Usually the simpleName of the event class. |
LLM call events
LLMCallStartEvent
Represents the start of an LLM call. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
prompt |
Prompt | Yes | The prompt that is sent to the model. For more information, see Prompt. | |
tools |
List<String> | Yes | The list of tools that the model can call. | |
eventId |
String | No | LLMCallStartEvent |
The identifier of the event. Usually the simpleName of the event class. |
The Prompt
class represents a data structure for a prompt, consisting of a list of messages, a unique identifier, and
optional parameters for language model settings. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
messages |
List<Message> | Yes | The list of messages that the prompt consists of. | |
id |
String | Yes | The unique identifier for the prompt. | |
params |
LLMParams | No | LLMParams() | The settings that control the way the LLM generates content. |
LLMCallEndEvent
Represents the end of an LLM call. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
responses |
List<Message.Response> | Yes | One or more responses returned by the model. | |
eventId |
String | No | LLMCallEndEvent |
The identifier of the event. Usually the simpleName of the event class. |
Tool call events
ToolCallEvent
Represents the event of a model calling a tool. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
toolName |
String | Yes | The name of the tool. | |
toolArgs |
Tool.Args | Yes | The arguments that are provided to the tool. | |
eventId |
String | No | ToolCallEvent |
The identifier of the event. Usually the simpleName of the event class. |
ToolValidationErrorEvent
Represents the occurrence of a validation error during a tool call. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
toolName |
String | Yes | The name of the tool for which validation failed. | |
toolArgs |
Tool.Args | Yes | The arguments that are provided to the tool. | |
errorMessage |
String | Yes | The validation error message. | |
eventId |
String | No | ToolValidationErrorEvent |
The identifier of the event. Usually the simpleName of the event class. |
ToolCallFailureEvent
Represents a failure to call a tool. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
toolName |
String | Yes | The name of the tool. | |
toolArgs |
Tool.Args | Yes | The arguments that are provided to the tool. | |
error |
AIAgentError | Yes | The specific error that occurred when trying to call a tool. For more information, see AIAgentError. | |
eventId |
String | No | ToolCallFailureEvent |
The identifier of the event. Usually the simpleName of the event class. |
ToolCallResultEvent
Represents a successful tool call with the return of a result. Includes the following fields:
Name | Data type | Required | Default | Description |
---|---|---|---|---|
toolName |
String | Yes | The name of the tool. | |
toolArgs |
Tool.Args | Yes | The arguments that are provided to the tool. | |
result |
ToolResult | Yes | The result of the tool call. | |
eventId |
String | No | ToolCallResultEvent |
The identifier of the event. Usually the simpleName of the event class. |