Integrate Latest OpenAI Agents SDK Human-in-the-Loop (HITL) Features #138

@matiasmolinas

Description

The OpenAI Agents SDK has introduced more sophisticated built-in support for human-in-the-loop (HITL) flows, primarily for seeking approval for sensitive tool executions. This includes mechanisms to pause agent runs, request approval, and resume execution based on human/external intervention. This epic aims to evaluate and integrate these new HITL features into EAT's OpenAI provider and potentially enhance or streamline EAT's existing IntentReviewAgent and related tools when interacting with OpenAI agents.

The key features in the OpenAI Agents SDK to consider are:

  • needsApproval flag/function on tools.
  • Agent interruption when approval is required.
  • interruptions array in the RunResult containing ToolApprovalItem.
  • result.state.approve(interruption) and result.state.reject(interruption) methods.
  • Resuming execution with runner.run(agent, state).
  • State serialization (JSON.stringify(result.state)) and deserialization (RunState.fromString(agent, serializedState)) for longer approval times.
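
To make the pause/approve/resume cycle listed above concrete, here is a minimal sketch in plain Python. It does not use the OpenAI Agents SDK: `ToolCall`, `RunState`, `RunResult`, and `run` are hypothetical stand-ins that only mirror the names in the list, and the tool registry format is an assumption made for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    call_id: str
    tool_name: str
    arguments: dict


@dataclass
class RunState:
    """Stand-in for the SDK's run state; NOT the real SDK class."""
    decisions: dict = field(default_factory=dict)  # call_id -> "approved" | "rejected"

    def approve(self, call: ToolCall) -> None:
        self.decisions[call.call_id] = "approved"

    def reject(self, call: ToolCall) -> None:
        self.decisions[call.call_id] = "rejected"

    def to_json(self) -> str:
        return json.dumps({"decisions": self.decisions})

    @classmethod
    def from_json(cls, raw: str) -> "RunState":
        return cls(decisions=json.loads(raw)["decisions"])


@dataclass
class RunResult:
    outputs: list
    interruptions: list
    state: RunState


def run(tools: dict, calls: list, state: Optional[RunState] = None) -> RunResult:
    """Pause if any sensitive call lacks a decision; otherwise execute all calls."""
    state = state if state is not None else RunState()
    waiting = [c for c in calls
               if tools[c.tool_name]["needs_approval"] and c.call_id not in state.decisions]
    if waiting:
        # Nothing is executed while approvals are outstanding.
        return RunResult(outputs=[], interruptions=waiting, state=state)
    outputs = []
    for call in calls:
        if state.decisions.get(call.call_id) == "rejected":
            outputs.append(f"{call.tool_name}: rejected by reviewer")
        else:
            outputs.append(tools[call.tool_name]["fn"](**call.arguments))
    return RunResult(outputs=outputs, interruptions=[], state=state)


# Hypothetical tool registry: one sensitive tool.
tools = {"delete_file": {"needs_approval": True, "fn": lambda path: f"deleted {path}"}}
calls = [ToolCall("call-1", "delete_file", {"path": "/tmp/report.txt"})]

first = run(tools, calls)                              # pauses with one interruption
restored = RunState.from_json(first.state.to_json())   # round trip through storage
restored.approve(first.interruptions[0])
second = run(tools, calls, restored)                   # resumes and executes the tool
```

The point of the sketch is the shape of the loop: a run returns interruptions instead of executing, the state survives serialization, and a second run with the decided state completes the work.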

2. Motivation & Benefits

Integrating these features aims to:

  • Standardize HITL for OpenAI Agents: Leverage the SDK's native HITL mechanism for OpenAI agents managed by EAT, potentially reducing custom EAT logic for this specific case.
  • Improve Granularity of Approval: Allow approval requests at the individual tool-call level for OpenAI agents, based on tool sensitivity.
  • Enhance EAT's Intent Review System:
    • The SDK's HITL can complement EAT's IntentReviewAgent and ApprovePlanTool.
    • EAT's system can be the "human" (or AI reviewer) responding to the SDK's approval requests, using EAT's existing review infrastructure and MongoDB persistence.
  • Support for Asynchronous/Longer Approvals: The SDK's state serialization allows EAT to persist the state of an OpenAI agent awaiting approval in MongoDB (similar to how IntentPlan objects are stored) and resume it later.
  • Maintain Alignment with OpenAI SDK: Keep EAT's OpenAI integration up-to-date with the latest SDK features.

3. Proposed Plan & Tasks

Phase 1: Investigation & Design (OpenAI Agents SDK Version: [Specify target version, e.g., 0.0.4+ or latest])

  • Research & Prototyping:
    • Thoroughly review the OpenAI Agents SDK documentation and examples for the new HITL features.
    • Build a small standalone prototype using the latest OpenAI Agents SDK to understand the HITL flow, state serialization, and resumption mechanics.
  • EAT Integration Design:
    • Define how EAT's OpenAIAgentsProvider will manage the SDK's HITL lifecycle.
    • Design how EAT's IntentReviewAgent and/or ApprovePlanTool (or new specialized tools) will interact with the SDK's interruptions and approve/reject mechanisms.
    • Determine how the serialized RunState from the OpenAI SDK will be stored and retrieved using EAT's MongoDB backend (e.g., in a new collection or alongside eat_intent_plans).
    • Consider how EAT's Firmware or component metadata can inform the needsApproval setting for tools adapted for OpenAI agents.
    • Evaluate how this new SDK feature aligns with EAT's existing multi-level intent review (design, components, intents). Will this primarily apply to the "intents" level execution for OpenAI agents?
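
One of the design questions above is how component metadata could inform the needsApproval setting. The sketch below shows one possible mapping, under stated assumptions: the metadata keys `requires_review` and `sensitive_args` are hypothetical and not part of EAT's current schema, and the returned value is simply a boolean or an async predicate, matching the two forms the issue mentions.

```python
import asyncio
from typing import Awaitable, Callable, Union

# A needs-approval setting is either a fixed boolean or an async predicate
# evaluated against the arguments of a specific tool call.
NeedsApproval = Union[bool, Callable[[dict], Awaitable[bool]]]


def needs_approval_from_metadata(metadata: dict) -> NeedsApproval:
    """Derive an approval policy from (hypothetical) EAT tool metadata."""
    if metadata.get("requires_review") is True:
        return True  # always ask for approval

    sensitive = set(metadata.get("sensitive_args", []))
    if sensitive:
        async def check(arguments: dict) -> bool:
            # Ask for approval only when a sensitive argument is actually used.
            return any(name in arguments for name in sensitive)
        return check

    return False  # no metadata hint: never ask


# Usage: a tool that needs approval only when it targets an external recipient.
policy = needs_approval_from_metadata({"sensitive_args": ["recipient"]})
must_review = asyncio.run(policy({"recipient": "ops@example.com", "body": "hi"}))
```

Keeping the mapping in one function would let OpenAIToolAdapter stay unaware of how the metadata is structured.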

Phase 2: Implementation in EAT

  • Update OpenAIAgentsProvider (evolving_agents/providers/openai_agents_provider.py):
    • Modify agent execution logic to handle RunResult.interruptions of type tool_approval_item.
    • When an interruption occurs:
      • Serialize result.state from the OpenAI SDK.
      • Store this serialized state in MongoDB (e.g., associated with an EAT IntentPlan or a new "PendingApproval" record type).
      • Signal to EAT's orchestration layer (e.g., SystemAgent) that human/AI approval is required.
    • Implement logic to resume execution:
      • Retrieve the stored serialized state from MongoDB.
      • Deserialize it using RunState.fromString(agent, serializedState).
      • Apply approvals/rejections (obtained via EAT's review system) to the RunState object.
      • Call runner.run(agent, state) to resume.
  • Update/Create EAT Review Tools:
    • Adapt ApprovePlanTool or create a new tool (e.g., OpenAIToolApprovalTool) that:
      • Can fetch pending OpenAI tool approvals from MongoDB.
      • Presents the interruption.rawItem (tool name, arguments) to the reviewer (human or AI).
      • Records the approval/rejection decision.
      • Triggers the OpenAIAgentsProvider to resume the agent run with the updated state.
  • Update OpenAIToolAdapter (evolving_agents/adapters/openai_tool_adapter.py):
    • Add a mechanism to pass the needsApproval property (boolean or async function) from an EAT tool's definition/metadata to the converted OpenAI tool.
    • This might involve extending EAT's tool metadata schema in SmartLibrary.
  • MongoDB Schema (if new collection needed):
    • Define a schema for storing serialized OpenAI RunState and associated EAT review metadata (e.g., eat_openai_pending_approvals).
    • Implement CRUD operations for this collection.
  • SystemAgent Integration:
    • Modify SystemAgent's logic (potentially within ProcessWorkflowTool or when directly executing OpenAI agents via RequestAgentTool) to recognize and handle the "awaiting OpenAI tool approval" state.
    • Ensure SystemAgent can invoke the appropriate EAT review tool for OpenAI approvals.
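
The store/decide/resume steps described in this phase can be sketched as follows. This is an illustration only: `PendingApprovalStore` is an in-memory stand-in for the proposed MongoDB collection, the record fields are assumptions, and the `deserialize` and `resume` callables are placeholders for whatever the SDK and OpenAIAgentsProvider actually expose.

```python
import json
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class PendingApproval:
    approval_id: str
    plan_id: str                    # hypothetical link to an EAT IntentPlan
    serialized_state: str           # opaque string produced by the SDK
    tool_name: str
    tool_arguments: dict
    status: str = "PENDING"         # PENDING | APPROVED | REJECTED
    reviewer_comment: Optional[str] = None


class PendingApprovalStore:
    """In-memory stand-in for a MongoDB collection of pending approvals."""

    def __init__(self) -> None:
        self._records: Dict[str, PendingApproval] = {}

    def create(self, plan_id: str, serialized_state: str,
               tool_name: str, tool_arguments: dict) -> PendingApproval:
        record = PendingApproval(str(uuid.uuid4()), plan_id,
                                 serialized_state, tool_name, tool_arguments)
        self._records[record.approval_id] = record
        return record

    def list_pending(self) -> List[PendingApproval]:
        return [r for r in self._records.values() if r.status == "PENDING"]

    def decide(self, approval_id: str, approved: bool,
               comment: Optional[str] = None) -> None:
        record = self._records[approval_id]
        record.status = "APPROVED" if approved else "REJECTED"
        record.reviewer_comment = comment

    def get(self, approval_id: str) -> PendingApproval:
        return self._records[approval_id]

    def delete(self, approval_id: str) -> None:
        del self._records[approval_id]


def resume_run(store: PendingApprovalStore, approval_id: str,
               deserialize: Callable[[str], Any],
               resume: Callable[[Any, bool], Any]) -> Any:
    """Load a decided record, rebuild the state, and hand it back to the runner."""
    record = store.get(approval_id)
    if record.status == "PENDING":
        raise RuntimeError("approval has not been decided yet")
    state = deserialize(record.serialized_state)
    result = resume(state, record.status == "APPROVED")
    store.delete(approval_id)       # each stored state is consumed exactly once
    return result
```

A review tool would call `list_pending` and `decide`, while the provider would call `create` when an interruption occurs and `resume_run` once a decision exists.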

Phase 3: Testing

  • Unit Tests:
    • Test OpenAIAgentsProvider's ability to pause, serialize state, deserialize state, and resume OpenAI agent runs.
    • Test the EAT review tool(s) for handling OpenAI tool approval requests.
    • Test OpenAIToolAdapter for correctly setting the needsApproval property.
  • Integration Tests:
    • Test the end-to-end flow: OpenAI agent attempts a sensitive tool call -> EAT captures the interruption -> EAT review tool approves/rejects -> OpenAI agent resumes or alters course.
  • Example Scripts:
    • Create new example scripts in examples/openai_agents/ demonstrating the new HITL flow integrated with EAT's review system.
    • One example with needsApproval: true.
    • One example with needsApproval as an async function.
    • One example of a long-running approval, where the state is persisted and resumed later.
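
As a rough idea of what the unit tests for the pause/persist/resume behaviour might look like, here is a sketch using `unittest` against a test double. `FakeProvider` is hypothetical and stands in for OpenAIAgentsProvider; its method names and return shapes are assumptions, not the provider's real interface.

```python
import json
import unittest


class FakeProvider:
    """Hypothetical test double: pauses on every tool call and stores state as JSON."""

    def __init__(self) -> None:
        self.saved = {}

    def start(self, run_id: str, tool_name: str, arguments: dict) -> dict:
        self.saved[run_id] = json.dumps({"tool": tool_name, "args": arguments})
        return {"status": "awaiting_approval", "run_id": run_id}

    def resume(self, run_id: str, approved: bool) -> dict:
        state = json.loads(self.saved.pop(run_id))
        if not approved:
            return {"status": "completed", "output": f"{state['tool']} was rejected"}
        return {"status": "completed",
                "output": f"{state['tool']} executed with {state['args']}"}


class HitlFlowTest(unittest.TestCase):
    def setUp(self) -> None:
        self.provider = FakeProvider()

    def test_pause_persists_state(self) -> None:
        result = self.provider.start("run-1", "delete_file", {"path": "/tmp/x"})
        self.assertEqual(result["status"], "awaiting_approval")
        self.assertIn("run-1", self.provider.saved)

    def test_approve_resumes_with_original_arguments(self) -> None:
        self.provider.start("run-1", "delete_file", {"path": "/tmp/x"})
        result = self.provider.resume("run-1", approved=True)
        self.assertIn("/tmp/x", result["output"])

    def test_reject_alters_course(self) -> None:
        self.provider.start("run-1", "delete_file", {"path": "/tmp/x"})
        result = self.provider.resume("run-1", approved=False)
        self.assertEqual(result["output"], "delete_file was rejected")

    def test_state_is_consumed_once(self) -> None:
        self.provider.start("run-1", "delete_file", {"path": "/tmp/x"})
        self.provider.resume("run-1", approved=True)
        with self.assertRaises(KeyError):
            self.provider.resume("run-1", approved=True)
```

Saved as a module, these cases can be run with `python -m unittest`; the real tests would replace `FakeProvider` with the provider under test and a mocked SDK runner.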

Phase 4: Documentation & Finalization

  • Update EAT Documentation:
    • README.md: Mention the enhanced HITL for OpenAI agents.
    • docs/ARCHITECTURE.md: Update diagrams and descriptions to reflect the new flow.
    • docs/TUTORIAL.md (or a new tutorial): Provide guidance on using this feature.
  • Update Code Comments: Add or update comments and docstrings in the code changed by this epic.
  • Code Review & Merge.

4. Acceptance Criteria

  • EAT can manage OpenAI agents that use tools requiring approval via the SDK's needsApproval mechanism.
  • The OpenAIAgentsProvider correctly handles interruptions, state serialization/deserialization, and resumption for these agents.
  • EAT's review system (e.g., ApprovePlanTool or a new dedicated tool) can be used to approve/reject OpenAI tool calls.
  • Serialized OpenAI agent state can be persisted in EAT's MongoDB backend for asynchronous approvals.
  • New example scripts successfully demonstrate the HITL functionality.
  • Documentation is updated to cover this new feature.
  • The integration is robust and does not negatively impact EAT's existing IntentReview system for non-OpenAI agents or other review levels.

5. Potential Risks & Challenges

  • Complexity of State Management: Ensuring reliable serialization, storage, and deserialization of OpenAI's RunState within EAT's MongoDB infrastructure.
  • Alignment with EAT's Existing Review System: Making the OpenAI SDK's HITL feel like a natural extension of EAT's IntentReviewAgent rather than a separate, disjointed process.
  • OpenAI SDK Versioning: The RunState serialization format might change between SDK versions, requiring careful management if EAT needs to support resuming states created with older SDK versions (as noted in the OpenAI SDK docs). EAT will likely target one SDK version at a time.
  • UI/UX for Review: Ensuring the information presented to the EAT reviewer (from interruption.rawItem) is clear and sufficient for making an informed decision.
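
One way to limit the SDK versioning risk above is to store the serialized state inside a small envelope that records which SDK version produced it, and to refuse to resume on a mismatch. The sketch below assumes the serialized state is an opaque string; the envelope layout and the function names are hypothetical.

```python
import json


class IncompatibleStateError(Exception):
    """Raised when a stored state was produced by a different SDK version."""


def wrap_state(serialized_state: str, sdk_version: str) -> str:
    """Tag an opaque serialized state with the SDK version that produced it."""
    return json.dumps({"sdk_version": sdk_version, "state": serialized_state})


def unwrap_state(envelope: str, current_sdk_version: str) -> str:
    """Return the stored state only if it matches the SDK version in use."""
    data = json.loads(envelope)
    if data["sdk_version"] != current_sdk_version:
        raise IncompatibleStateError(
            f"state was saved with SDK {data['sdk_version']}, "
            f"but SDK {current_sdk_version} is installed"
        )
    return data["state"]
```

Failing fast here keeps a format change from surfacing later as a confusing deserialization error; whether stale records should then be re-run from scratch or escalated to a reviewer is a separate design decision.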

6. Relevant Context & Links

  • OpenAI Agents SDK HITL Guide: [Provided in the prompt]
  • EAT OpenAIAgentsProvider: evolving_agents/providers/openai_agents_provider.py
  • EAT OpenAIToolAdapter: evolving_agents/adapters/openai_tool_adapter.py
  • EAT IntentReviewAgent: evolving_agents/agents/intent_review_agent.py
  • EAT ApprovePlanTool: evolving_agents/tools/intent_review/approve_plan_tool.py

This epic focuses on integrating the OpenAI SDK's specific HITL mechanism. It complements, and should be coordinated with, any broader updates to the OpenAI Agents SDK version used by EAT.


Labels: enhancement (New feature or request)
