Integrate Latest OpenAI Agents SDK Human-in-the-Loop (HITL) Features

The OpenAI Agents SDK now includes built-in support for human-in-the-loop (HITL) flows, primarily for seeking approval before sensitive tool executions. It provides mechanisms to pause an agent run, request approval, and resume execution once a human or external system has responded. This epic aims to evaluate these HITL features, integrate them into EAT's OpenAI provider, and potentially enhance or streamline EAT's existing `IntentReviewAgent` and related tools when they interact with OpenAI agents.

The key features in the OpenAI Agents SDK to consider are:
*   `needsApproval` flag/function on tools.
*   Agent interruption when approval is required.
*   `interruptions` array in the `RunResult` containing `ToolApprovalItem`.
*   `result.state.approve(interruption)` and `result.state.reject(interruption)` methods.
*   Resuming execution with `runner.run(agent, state)`.
*   State serialization (`JSON.stringify(result.state)`) and deserialization (`RunState.fromString(agent, serializedState)`) for longer approval times.
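
The features listed above combine into a pause, approve, resume cycle. The following is a minimal Python sketch of that cycle using stand-in classes only: `ToolCall`, `FakeRunState`, and `fake_run` are hypothetical names invented for illustration and are not the SDK's API. It only mirrors the shape of the flow (interruptions, approve/reject, serialize, resume) so the mechanics can be reasoned about before prototyping against the real SDK.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolCall:
    """A tool call awaiting approval (hypothetical stand-in for a ToolApprovalItem)."""
    call_id: str
    tool_name: str
    arguments: dict


@dataclass
class FakeRunState:
    """Hypothetical stand-in for the SDK's RunState."""
    pending: list = field(default_factory=list)
    decisions: dict = field(default_factory=dict)  # call_id -> "approved" | "rejected"

    def approve(self, item: ToolCall) -> None:
        self.decisions[item.call_id] = "approved"

    def reject(self, item: ToolCall) -> None:
        self.decisions[item.call_id] = "rejected"

    def interruptions(self) -> list:
        """Tool calls that still have no decision recorded."""
        return [c for c in self.pending if c.call_id not in self.decisions]

    def to_string(self) -> str:
        """Serialize so the state can be stored while approval is pending."""
        return json.dumps(
            {"pending": [asdict(c) for c in self.pending], "decisions": self.decisions}
        )

    @classmethod
    def from_string(cls, raw: str) -> "FakeRunState":
        data = json.loads(raw)
        return cls(
            pending=[ToolCall(**c) for c in data["pending"]],
            decisions=data["decisions"],
        )


def fake_run(state: FakeRunState) -> dict:
    """Stand-in for resuming a run: stops while any call is undecided."""
    waiting = state.interruptions()
    if waiting:
        return {"status": "interrupted", "interruptions": waiting}
    executed = [
        c.tool_name for c in state.pending if state.decisions[c.call_id] == "approved"
    ]
    return {"status": "completed", "executed": executed}
```

In this sketch a run is first interrupted, its state is serialized and restored, the reviewer approves the pending call, and a second `fake_run` completes. The real SDK calls named in the list above would take the place of the stand-ins.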

### 2. Motivation & Benefits

Integrating these features aims to:

*   **Standardize HITL for OpenAI Agents:** Leverage the SDK's native HITL mechanism for OpenAI agents managed by EAT, potentially reducing custom EAT logic for this specific case.
*   **Improve Granularity of Approval:** Allow approval requests at the individual tool-call level for OpenAI agents, based on tool sensitivity.
*   **Enhance EAT's Intent Review System:**
    *   The SDK's HITL can complement EAT's `IntentReviewAgent` and `ApprovePlanTool`.
    *   EAT's system can be the "human" (or AI reviewer) responding to the SDK's approval requests, using EAT's existing review infrastructure and MongoDB persistence.
*   **Support for Asynchronous/Longer Approvals:** The SDK's state serialization allows EAT to persist the state of an OpenAI agent awaiting approval in MongoDB (similar to how `IntentPlan` objects are stored) and resume it later.
*   **Maintain Alignment with OpenAI SDK:** Keep EAT's OpenAI integration up-to-date with the latest SDK features.

### 3. Proposed Plan & Tasks

**Phase 1: Investigation & Design (OpenAI Agents SDK Version: [Specify target version, e.g., 0.0.4+ or latest])**

*   [ ] **Research & Prototyping:**
    *   Thoroughly review the OpenAI Agents SDK documentation and examples for the new HITL features.
    *   Build a small standalone prototype using the latest OpenAI Agents SDK to understand the HITL flow, state serialization, and resumption mechanics.
*   [ ] **EAT Integration Design:**
    *   Define how EAT's `OpenAIAgentsProvider` will manage the SDK's HITL lifecycle.
    *   Design how EAT's `IntentReviewAgent` and/or `ApprovePlanTool` (or new specialized tools) will interact with the SDK's `interruptions` and `approve`/`reject` mechanisms.
    *   Determine how the serialized `RunState` from the OpenAI SDK will be stored and retrieved using EAT's MongoDB backend (e.g., in a new collection or alongside `eat_intent_plans`).
    *   Consider how EAT's `Firmware` or component metadata can inform the `needsApproval` setting for tools adapted for OpenAI agents.
    *   Evaluate how this new SDK feature aligns with EAT's existing multi-level intent review (design, components, intents). Will this primarily apply to the "intents" level execution for OpenAI agents?
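
One possible way to let component metadata drive the approval setting is sketched below. Everything here is an assumption for discussion: the metadata keys `needs_approval`, `sensitivity`, and `guarded_arguments`, the `SENSITIVE_LEVELS` set, and the function name `resolve_needs_approval` do not exist in EAT today. The function returns either a boolean or an async predicate, matching the two forms the `needsApproval` property is described as accepting.

```python
import asyncio
from typing import Awaitable, Callable, Union

# A policy is either a fixed boolean or an async predicate over the call arguments.
ApprovalPolicy = Union[bool, Callable[[dict], Awaitable[bool]]]

# Hypothetical sensitivity levels that always require approval.
SENSITIVE_LEVELS = {"high", "critical"}


def resolve_needs_approval(metadata: dict) -> ApprovalPolicy:
    """Derive an approval policy from (hypothetical) EAT component metadata."""
    # 1. An explicit boolean in the metadata wins over everything else.
    explicit = metadata.get("needs_approval")
    if isinstance(explicit, bool):
        return explicit

    # 2. A coarse sensitivity label maps to "always ask".
    if metadata.get("sensitivity") in SENSITIVE_LEVELS:
        return True

    # 3. Otherwise ask only when a guarded argument is present in the call.
    guarded = metadata.get("guarded_arguments")
    if guarded:
        async def policy(arguments: dict) -> bool:
            return any(name in arguments for name in guarded)
        return policy

    # 4. Default: no approval required.
    return False
```

For example, `resolve_needs_approval({"sensitivity": "critical"})` returns `True`, while metadata listing `guarded_arguments` yields an async predicate that can be awaited per tool call.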

**Phase 2: Implementation in EAT**

*   [ ] **Update `OpenAIAgentsProvider` (`evolving_agents/providers/openai_agents_provider.py`):**
    *   [ ] Modify agent execution logic to handle `RunResult.interruptions` of type `tool_approval_item`.
    *   [ ] When an interruption occurs:
        *   [ ] Serialize `result.state` from the OpenAI SDK.
        *   [ ] Store this serialized state in MongoDB (e.g., associated with an EAT `IntentPlan` or a new "PendingApproval" record type).
        *   [ ] Signal to EAT's orchestration layer (e.g., `SystemAgent`) that human/AI approval is required.
    *   [ ] Implement logic to resume execution:
        *   [ ] Retrieve the stored serialized state from MongoDB.
        *   [ ] Deserialize it using `RunState.fromString(agent, serializedState)`.
        *   [ ] Apply approvals/rejections (obtained via EAT's review system) to the `RunState` object.
        *   [ ] Call `runner.run(agent, state)` to resume.
*   [ ] **Update/Create EAT Review Tools:**
    *   [ ] Adapt `ApprovePlanTool` or create a new tool (e.g., `OpenAIToolApprovalTool`) that:
        *   [ ] Can fetch pending OpenAI tool approvals from MongoDB.
        *   [ ] Presents the `interruption.rawItem` (tool name, arguments) to the reviewer (human or AI).
        *   [ ] Records the approval/rejection decision.
        *   [ ] Triggers the `OpenAIAgentsProvider` to resume the agent run with the updated state.
*   [ ] **Update `OpenAIToolAdapter` (`evolving_agents/adapters/openai_tool_adapter.py`):**
    *   [ ] Add a mechanism to pass the `needsApproval` property (boolean or async function) from an EAT tool's definition/metadata to the converted OpenAI tool.
    *   [ ] This might involve extending EAT's tool metadata schema in `SmartLibrary`.
*   [ ] **MongoDB Schema (if new collection needed):**
    *   [ ] Define a schema for storing serialized OpenAI `RunState` and associated EAT review metadata (e.g., `eat_openai_pending_approvals`).
    *   [ ] Implement CRUD operations for this collection.
*   [ ] **SystemAgent Integration:**
    *   [ ] Modify `SystemAgent`'s logic (potentially within `ProcessWorkflowTool` or when directly executing OpenAI agents via `RequestAgentTool`) to recognize and handle the "awaiting OpenAI tool approval" state.
    *   [ ] Ensure `SystemAgent` can invoke the appropriate EAT review tool for OpenAI approvals.
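
The storage and review tasks above could start from a record shape like the one sketched here. This is an in-memory stand-in, not a MongoDB implementation: the class name `PendingApprovalStore`, its methods, and every field name are hypothetical, chosen only to illustrate what a document in a collection such as `eat_openai_pending_approvals` might hold and which transitions a review tool would need.

```python
import uuid
from datetime import datetime, timezone


class PendingApprovalStore:
    """In-memory stand-in for a MongoDB collection of pending tool approvals."""

    def __init__(self) -> None:
        self._docs = {}  # approval id -> document

    def create(self, agent_id: str, serialized_state: str,
               tool_name: str, arguments: dict) -> str:
        """Persist a paused run and return the id of the new record."""
        approval_id = str(uuid.uuid4())
        self._docs[approval_id] = {
            "_id": approval_id,
            "agent_id": agent_id,
            "serialized_state": serialized_state,  # opaque string from the SDK
            "tool_name": tool_name,
            "arguments": arguments,
            "status": "pending",
            "reviewer": None,
            "created_at": datetime.now(timezone.utc),
            "decided_at": None,
        }
        return approval_id

    def list_pending(self) -> list:
        """Records a review tool would present to the human or AI reviewer."""
        return [d for d in self._docs.values() if d["status"] == "pending"]

    def decide(self, approval_id: str, approved: bool, reviewer: str) -> dict:
        """Record a decision once; a second decision on the same record is an error."""
        doc = self._docs[approval_id]
        if doc["status"] != "pending":
            raise ValueError(f"approval {approval_id} already {doc['status']}")
        doc["status"] = "approved" if approved else "rejected"
        doc["reviewer"] = reviewer
        doc["decided_at"] = datetime.now(timezone.utc)
        return doc
```

Rejecting a second decision on the same record is a deliberate choice in this sketch: it keeps a resumed run from being driven by two conflicting reviews. After `decide` returns, the provider would read `serialized_state` from the record, apply the decision, and resume the run.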

**Phase 3: Testing**

*   [ ] **Unit Tests:**
    *   Test `OpenAIAgentsProvider`'s ability to pause, serialize state, deserialize state, and resume OpenAI agent runs.
    *   Test the EAT review tool(s) for handling OpenAI tool approval requests.
    *   Test `OpenAIToolAdapter` for correctly setting the `needsApproval` property.
*   [ ] **Integration Tests:**
    *   Test the end-to-end flow: OpenAI agent attempts sensitive tool -> EAT captures interruption -> EAT review tool approves/rejects -> OpenAI agent resumes/alters course.
*   [ ] **Example Scripts:**
    *   [ ] Create new example scripts in `examples/openai_agents/` demonstrating the new HITL flow integrated with EAT's review system.
    *   [ ] One example with `needsApproval: true`.
    *   [ ] One example with `needsApproval` as an async function.
    *   [ ] One example of a long-running approval where the state is persisted and resumed later.

**Phase 4: Documentation & Finalization**

*   [ ] **Update EAT Documentation:**
    *   `README.md`: Mention the enhanced HITL for OpenAI agents.
    *   `docs/ARCHITECTURE.md`: Update diagrams and descriptions to reflect the new flow.
    *   `docs/TUTORIAL.md` (or a new tutorial): Provide guidance on using this feature.
*   [ ] **Update Code Comments:** Add or update comments and docstrings in the provider, adapter, and review-tool code touched by this epic.
*   [ ] **Code Review & Merge.**

### 4. Acceptance Criteria

*   EAT can manage OpenAI agents that use tools requiring approval via the SDK's `needsApproval` mechanism.
*   The `OpenAIAgentsProvider` correctly handles interruptions, state serialization/deserialization, and resumption for these agents.
*   EAT's review system (e.g., `ApprovePlanTool` or a new dedicated tool) can be used to approve/reject OpenAI tool calls.
*   Serialized OpenAI agent state can be persisted in EAT's MongoDB backend for asynchronous approvals.
*   New example scripts successfully demonstrate the HITL functionality.
*   Documentation is updated to cover this new feature.
*   The integration is robust and does not negatively impact EAT's existing `IntentReview` system for non-OpenAI agents or other review levels.

### 5. Potential Risks & Challenges

*   **Complexity of State Management:** Ensuring reliable serialization, storage, and deserialization of OpenAI's `RunState` within EAT's MongoDB infrastructure.
*   **Alignment with EAT's Existing Review System:** Making the OpenAI SDK's HITL feel like a natural extension of EAT's `IntentReviewAgent` rather than a separate, disjointed process.
*   **OpenAI SDK Versioning:** The `RunState` serialization format might change between SDK versions, requiring careful management if EAT needs to support resuming states created with older SDK versions (as noted in the OpenAI SDK docs). EAT will likely target one SDK version at a time.
*   **UI/UX for Review:** Ensuring the information presented to the EAT reviewer (from `interruption.rawItem`) is clear and sufficient for making an informed decision.
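
The SDK versioning risk could be contained by tagging each stored state with the SDK version that produced it and refusing to resume on a mismatch. The sketch below assumes a simple JSON envelope. The names `wrap_state`, `unwrap_state`, and `StateVersionMismatch` are hypothetical, and exact-match comparison is only one possible policy.

```python
import json


class StateVersionMismatch(Exception):
    """Raised when a stored state was produced by a different SDK version."""


def wrap_state(serialized_state: str, sdk_version: str) -> str:
    """Wrap the SDK's serialized state in an envelope that records its version."""
    return json.dumps({"sdk_version": sdk_version, "state": serialized_state})


def unwrap_state(envelope: str, current_sdk_version: str) -> str:
    """Return the serialized state only if it matches the running SDK version."""
    data = json.loads(envelope)
    if data["sdk_version"] != current_sdk_version:
        raise StateVersionMismatch(
            f"state written by SDK {data['sdk_version']}, "
            f"running SDK {current_sdk_version}"
        )
    return data["state"]
```

Failing loudly at load time lets EAT surface a stale pending approval to the reviewer instead of attempting to resume with a state the installed SDK may not be able to read.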

### 6. Relevant Context & Links

*   **OpenAI Agents SDK HITL Guide:** [Provided in the prompt]
*   **EAT `OpenAIAgentsProvider`:** `evolving_agents/providers/openai_agents_provider.py`
*   **EAT `OpenAIToolAdapter`:** `evolving_agents/adapters/openai_tool_adapter.py`
*   **EAT `IntentReviewAgent`:** `evolving_agents/agents/intent_review_agent.py`
*   **EAT `ApprovePlanTool`:** `evolving_agents/tools/intent_review/approve_plan_tool.py`

---

This epic focuses on integrating the OpenAI SDK's *specific* HITL mechanism. It complements, and should be coordinated with, any broader updates to the OpenAI Agents SDK version used by EAT.
