Description
The OpenAI Agents SDK has introduced more sophisticated built-in support for human-in-the-loop (HITL) flows, primarily for seeking approval for sensitive tool executions. This includes mechanisms to pause agent runs, request approval, and resume execution based on human or external intervention. This epic aims to evaluate and integrate these new HITL features into EAT's OpenAI provider and potentially enhance or streamline EAT's existing `IntentReviewAgent` and related tools when interacting with OpenAI agents.
The key features in the OpenAI Agents SDK to consider are:
- `needsApproval` flag/function on tools.
- Agent interruption when approval is required.
- `interruptions` array in the `RunResult` containing `ToolApprovalItem`.
- `result.state.approve(interruption)` and `result.state.reject(interruption)` methods.
- Resuming execution with `runner.run(agent, state)`.
- State serialization (`JSON.stringify(result.state)`) and deserialization (`RunState.fromString(agent, serializedState)`) for longer approval times.
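
The control flow these features imply (run, pause on a sensitive tool call, record a decision, resume) can be sketched as a toy model. This is plain Python with hypothetical stand-in classes (`ToolCall`, `ToyRunState`, `toy_run`); none of these names come from the OpenAI Agents SDK, and the sketch only illustrates the pause/decide/resume loop, not the SDK's actual behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolCall:
    """Hypothetical stand-in for a tool call the agent wants to make."""
    call_id: str
    name: str
    arguments: dict
    needs_approval: bool = False


@dataclass
class ToyRunState:
    """Hypothetical stand-in for a resumable run state (not the SDK's RunState)."""
    pending: List[ToolCall]                                    # calls not yet executed
    decisions: Dict[str, bool] = field(default_factory=dict)   # call_id -> approved?
    outputs: List[str] = field(default_factory=list)

    def approve(self, call: ToolCall) -> None:
        self.decisions[call.call_id] = True

    def reject(self, call: ToolCall) -> None:
        self.decisions[call.call_id] = False


def toy_run(state: ToyRunState) -> List[ToolCall]:
    """Execute pending calls in order until one needs an approval that has
    not been decided yet. Returns the interruptions (empty when finished)."""
    while state.pending:
        call = state.pending[0]
        if call.needs_approval and call.call_id not in state.decisions:
            return [call]  # pause: the caller must approve or reject, then re-run
        state.pending.pop(0)
        if call.needs_approval and not state.decisions[call.call_id]:
            state.outputs.append(f"{call.name}: rejected")
        else:
            state.outputs.append(f"{call.name}: executed")
    return []
```

Calling `toy_run` again with the same state after `approve` or `reject` plays the role that resuming with `runner.run(agent, state)` plays in the SDK.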
2. Motivation & Benefits
Integrating these features aims to:
- Standardize HITL for OpenAI Agents: Leverage the SDK's native HITL mechanism for OpenAI agents managed by EAT, potentially reducing custom EAT logic for this specific case.
- Improve Granularity of Approval: Allow approval requests at the individual tool-call level for OpenAI agents, based on tool sensitivity.
- Enhance EAT's Intent Review System:
  - The SDK's HITL can complement EAT's `IntentReviewAgent` and `ApprovePlanTool`.
  - EAT's system can be the "human" (or AI reviewer) responding to the SDK's approval requests, using EAT's existing review infrastructure and MongoDB persistence.
- Support for Asynchronous/Longer Approvals: The SDK's state serialization allows EAT to persist the state of an OpenAI agent awaiting approval in MongoDB (similar to how `IntentPlan` objects are stored) and resume it later.
- Maintain Alignment with OpenAI SDK: Keep EAT's OpenAI integration up-to-date with the latest SDK features.
3. Proposed Plan & Tasks
Phase 1: Investigation & Design (OpenAI Agents SDK Version: [Specify target version, e.g., 0.0.4+ or latest])
- Research & Prototyping:
  - Thoroughly review the OpenAI Agents SDK documentation and examples for the new HITL features.
  - Build a small standalone prototype using the latest OpenAI Agents SDK to understand the HITL flow, state serialization, and resumption mechanics.
- EAT Integration Design:
  - Define how EAT's `OpenAIAgentsProvider` will manage the SDK's HITL lifecycle.
  - Design how EAT's `IntentReviewAgent` and/or `ApprovePlanTool` (or new specialized tools) will interact with the SDK's `interruptions` and `approve`/`reject` mechanisms.
  - Determine how the serialized `RunState` from the OpenAI SDK will be stored and retrieved using EAT's MongoDB backend (e.g., in a new collection or alongside `eat_intent_plans`).
  - Consider how EAT's `Firmware` or component metadata can inform the `needsApproval` setting for tools adapted for OpenAI agents.
  - Evaluate how this new SDK feature aligns with EAT's existing multi-level intent review (design, components, intents). Will this primarily apply to the "intents" level execution for OpenAI agents?
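
As a starting point for the metadata question above, here is a minimal sketch of deriving an approval setting from component metadata. The metadata keys (`requires_approval`, `sensitivity`, `guarded_arguments`), the sensitivity levels, and the function name `resolve_needs_approval` are all assumptions made for illustration; EAT's actual metadata schema may look quite different. The result is either a boolean or a predicate over the tool arguments, mirroring the "boolean or function" shape of `needsApproval`; an adapter would still need to wrap the predicate in whatever callable form the SDK expects.

```python
from typing import Any, Callable, Dict, Union

# Either a fixed answer or a per-call predicate over the tool arguments.
ApprovalSetting = Union[bool, Callable[[Dict[str, Any]], bool]]

# Hypothetical sensitivity levels that always require approval.
SENSITIVE_LEVELS = {"high", "critical"}


def resolve_needs_approval(metadata: Dict[str, Any]) -> ApprovalSetting:
    """Derive an approval setting from (hypothetical) component metadata."""
    # 1. An explicit flag always wins.
    if "requires_approval" in metadata:
        return bool(metadata["requires_approval"])

    # 2. Otherwise, a coarse sensitivity label can force approval.
    if metadata.get("sensitivity") in SENSITIVE_LEVELS:
        return True

    # 3. Otherwise, approval may depend on which arguments are present.
    guarded = metadata.get("guarded_arguments")
    if guarded:
        guarded_keys = set(guarded)

        def needs_approval_for(arguments: Dict[str, Any]) -> bool:
            return any(key in arguments for key in guarded_keys)

        return needs_approval_for

    # 4. Default: no approval needed.
    return False
```

Letting the explicit flag override the sensitivity label is one possible precedence; the design phase should decide whether `Firmware` rules may be overridden per component at all.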
Phase 2: Implementation in EAT
- Update `OpenAIAgentsProvider` (`evolving_agents/providers/openai_agents_provider.py`):
  - Modify agent execution logic to handle `RunResult.interruptions` of type `tool_approval_item`.
  - When an interruption occurs:
    - Serialize `result.state` from the OpenAI SDK.
    - Store this serialized state in MongoDB (e.g., associated with an EAT `IntentPlan` or a new "PendingApproval" record type).
    - Signal to EAT's orchestration layer (e.g., `SystemAgent`) that human/AI approval is required.
  - Implement logic to resume execution:
    - Retrieve the stored serialized state from MongoDB.
    - Deserialize it using `RunState.fromString(agent, serializedState)`.
    - Apply approvals/rejections (obtained via EAT's review system) to the `RunState` object.
    - Call `runner.run(agent, state)` to resume.
- Update/Create EAT Review Tools:
  - Adapt `ApprovePlanTool` or create a new tool (e.g., `OpenAIToolApprovalTool`) that:
    - Fetches pending OpenAI tool approvals from MongoDB.
    - Presents the `interruption.rawItem` (tool name, arguments) to the reviewer (human or AI).
    - Records the approval/rejection decision.
    - Triggers the `OpenAIAgentsProvider` to resume the agent run with the updated state.
- Update `OpenAIToolAdapter` (`evolving_agents/adapters/openai_tool_adapter.py`):
  - Add a mechanism to pass the `needsApproval` property (boolean or async function) from an EAT tool's definition/metadata to the converted OpenAI tool.
  - This might involve extending EAT's tool metadata schema in `SmartLibrary`.
- MongoDB Schema (if new collection needed):
  - Define a schema for storing serialized OpenAI `RunState` and associated EAT review metadata (e.g., `eat_openai_pending_approvals`).
  - Implement CRUD operations for this collection.
- SystemAgent Integration:
  - Modify `SystemAgent`'s logic (potentially within `ProcessWorkflowTool` or when directly executing OpenAI agents via `RequestAgentTool`) to recognize and handle the "awaiting OpenAI tool approval" state.
  - Ensure `SystemAgent` can invoke the appropriate EAT review tool for OpenAI approvals.
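
To make the "PendingApproval" record and its CRUD operations more concrete, here is a rough sketch that uses an in-memory dictionary as a stand-in for a MongoDB collection such as `eat_openai_pending_approvals`. The class name `PendingApprovalStore`, the document fields, and the status values are assumptions for illustration only; a real implementation would go through EAT's MongoDB client and would treat the serialized state as an opaque string produced by the SDK.

```python
import uuid
from datetime import datetime, timezone
from typing import Dict, List, Optional


class PendingApprovalStore:
    """In-memory stand-in for a pending-approvals collection (hypothetical schema)."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def create(self, agent_id: str, serialized_state: str,
               tool_name: str, tool_arguments: dict) -> str:
        """Persist a paused run and return the id of the new record."""
        approval_id = uuid.uuid4().hex
        self._docs[approval_id] = {
            "_id": approval_id,
            "agent_id": agent_id,
            "serialized_state": serialized_state,  # opaque to EAT
            "tool_name": tool_name,
            "tool_arguments": tool_arguments,
            "status": "pending",
            "decided_by": None,
            "created_at": datetime.now(timezone.utc),
            "decided_at": None,
        }
        return approval_id

    def list_pending(self) -> List[dict]:
        """Return copies of all records still awaiting a decision."""
        return [dict(d) for d in self._docs.values() if d["status"] == "pending"]

    def get(self, approval_id: str) -> Optional[dict]:
        doc = self._docs.get(approval_id)
        return dict(doc) if doc else None

    def decide(self, approval_id: str, approved: bool, reviewer: str) -> dict:
        """Record a reviewer's decision; a record can only be decided once."""
        doc = self._docs[approval_id]
        if doc["status"] != "pending":
            raise ValueError(f"approval {approval_id} is already {doc['status']}")
        doc["status"] = "approved" if approved else "rejected"
        doc["decided_by"] = reviewer
        doc["decided_at"] = datetime.now(timezone.utc)
        return dict(doc)

    def delete(self, approval_id: str) -> bool:
        """Remove a record, e.g. after the run has been resumed."""
        return self._docs.pop(approval_id, None) is not None
```

In this shape, the provider would call `create` when an interruption occurs, the review tool would use `list_pending` and `decide`, and the provider would read the record back with `get` before deserializing the state and resuming the run.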
Phase 3: Testing
- Unit Tests:
  - Test `OpenAIAgentsProvider`'s ability to pause, serialize state, deserialize state, and resume OpenAI agent runs.
  - Test the EAT review tool(s) for handling OpenAI tool approval requests.
  - Test `OpenAIToolAdapter` for correctly setting the `needsApproval` property.
- Integration Tests:
  - Test the end-to-end flow: OpenAI agent attempts sensitive tool -> EAT captures interruption -> EAT review tool approves/rejects -> OpenAI agent resumes/alters course.
- Example Scripts:
  - Create new example scripts in `examples/openai_agents/` demonstrating the new HITL flow integrated with EAT's review system.
  - One example with `needsApproval: true`.
  - One example with `needsApproval` as an async function.
  - Test long-running approvals where the state is persisted and resumed later.
Phase 4: Documentation & Finalization
- Update EAT Documentation:
  - `README.md`: Mention the enhanced HITL for OpenAI agents.
  - `docs/ARCHITECTURE.md`: Update diagrams and descriptions to reflect the new flow.
  - `docs/TUTORIAL.md` (or a new tutorial): Provide guidance on using this feature.
- Update Code Comments: Add/update comments.
- Code Review & Merge.
4. Acceptance Criteria
- EAT can manage OpenAI agents that use tools requiring approval via the SDK's `needsApproval` mechanism.
- The `OpenAIAgentsProvider` correctly handles interruptions, state serialization/deserialization, and resumption for these agents.
- EAT's review system (e.g., `ApprovePlanTool` or a new dedicated tool) can be used to approve/reject OpenAI tool calls.
- Serialized OpenAI agent state can be persisted in EAT's MongoDB backend for asynchronous approvals.
- New example scripts successfully demonstrate the HITL functionality.
- Documentation is updated to cover this new feature.
- The integration is robust and does not negatively impact EAT's existing `IntentReview` system for non-OpenAI agents or other review levels.
5. Potential Risks & Challenges
- Complexity of State Management: Ensuring reliable serialization, storage, and deserialization of OpenAI's `RunState` within EAT's MongoDB infrastructure.
- Alignment with EAT's Existing Review System: Making the OpenAI SDK's HITL feel like a natural extension of EAT's `IntentReviewAgent` rather than a separate, disjointed process.
- OpenAI SDK Versioning: The `RunState` serialization format might change between SDK versions, requiring careful management if EAT needs to support resuming states created with older SDK versions (as noted in the OpenAI SDK docs). EAT will likely target one SDK version at a time.
- UI/UX for Review: Ensuring the information presented to the EAT reviewer (from `interruption.rawItem`) is clear and sufficient for making an informed decision.
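
One possible mitigation for the versioning risk is to store the serialized state inside a small envelope that records which SDK version produced it, and to refuse to resume on a mismatch. The sketch below assumes the serialized state is an opaque string; the helper names (`wrap_state`, `unwrap_state`), the exception `StateVersionMismatch`, and the envelope fields are hypothetical, and exact-match comparison is only the simplest policy.

```python
import json


class StateVersionMismatch(Exception):
    """Raised when a stored state was produced by a different SDK version."""


def wrap_state(serialized_state: str, sdk_version: str) -> str:
    """Wrap an opaque serialized state with the SDK version that produced it."""
    return json.dumps({"sdk_version": sdk_version, "state": serialized_state})


def unwrap_state(envelope: str, current_sdk_version: str) -> str:
    """Return the stored state only if it matches the SDK version in use."""
    data = json.loads(envelope)
    if data["sdk_version"] != current_sdk_version:
        raise StateVersionMismatch(
            f"state was saved with SDK {data['sdk_version']}, "
            f"but SDK {current_sdk_version} is in use"
        )
    return data["state"]
```

Failing loudly keeps an incompatible state from being fed back into the SDK; the affected approval can then be surfaced to a reviewer or the run restarted.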
6. Relevant Context & Links
- OpenAI Agents SDK HITL Guide: [Provided in the prompt]
- EAT `OpenAIAgentsProvider`: `evolving_agents/providers/openai_agents_provider.py`
- EAT `OpenAIToolAdapter`: `evolving_agents/adapters/openai_tool_adapter.py`
- EAT `IntentReviewAgent`: `evolving_agents/agents/intent_review_agent.py`
- EAT `ApprovePlanTool`: `evolving_agents/tools/intent_review/approve_plan_tool.py`
This epic focuses on integrating the OpenAI SDK's specific HITL mechanism. It complements, and should be coordinated with, any broader updates to the OpenAI Agents SDK version used by EAT.