[Question] Managing multi-step user interactions and task state in Remote Agent execution #902

@Yoojkim


I'm currently developing a remote agent that handles email writing tasks based on natural language commands (e.g., "Send a report request email to amy"). While implementing this, I came across a few questions about how task flow and agent responsibilities are expected to be handled in the A2A framework.

Use case
The remote agent needs to handle a multi-step interaction like this:

  1. Recipient Identification
     • The agent queries an internal API to find recipient candidates.
     • If multiple candidates are found, it asks the user to choose one (input_required).
     • This can involve multiple user interactions.
  2. Email Drafting
     • The agent drafts an email and presents it to the user.
     • The user may request modifications and iterate.
  3. Confirmation & Sending
     • The agent asks for final confirmation before sending.
     • If confirmed, it triggers the email API.
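To make the flow above concrete, here is a minimal sketch of it as a per-task state machine in plain Python. It does not use the A2A SDK: `EmailTask`, `TASKS`, `find_candidates`, and `handle_turn` are hypothetical names, the recipient directory is made up, and the returned state strings only mirror the task states mentioned in this issue. The assumption is that each user turn arrives with the same task ID, so the stored step decides what the turn means.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Optional, Tuple


class Step(Enum):
    PICK_RECIPIENT = auto()
    REVIEW_DRAFT = auto()
    CONFIRM_SEND = auto()
    DONE = auto()


@dataclass
class EmailTask:
    step: Step = Step.PICK_RECIPIENT
    candidates: List[str] = field(default_factory=list)
    recipient: Optional[str] = None
    draft: str = ""


# Hypothetical in-memory store keyed by task ID; a real agent would persist this.
TASKS: Dict[str, EmailTask] = {}


def find_candidates(name: str) -> List[str]:
    """Stand-in for the internal recipient lookup API (made-up directory)."""
    directory = ["amy.lee@example.com", "amy.park@example.com", "bob@example.com"]
    return [addr for addr in directory if addr.startswith(name.lower())]


def handle_turn(task_id: str, user_text: str) -> Tuple[str, str]:
    """Advance one task by one user turn and return (task_state, reply)."""
    task = TASKS.setdefault(task_id, EmailTask())
    text = user_text.strip()

    if task.step is Step.PICK_RECIPIENT:
        if not task.candidates:
            # First turn: treat the text as a name to look up.
            task.candidates = find_candidates(text)
            if not task.candidates:
                return "input_required", "No match. Who should receive the email?"
            if len(task.candidates) > 1:
                return "input_required", "Which one? " + ", ".join(task.candidates)
            choice = task.candidates[0]
        else:
            # Later turn: treat the text as the user's pick.
            choice = text
        if choice not in task.candidates:
            return "input_required", "Please pick one of: " + ", ".join(task.candidates)
        task.recipient = choice
        task.draft = "Hi, could you send me the report?"
        task.step = Step.REVIEW_DRAFT
        return "input_required", f"Draft: {task.draft} (reply 'ok' or send new text)"

    if task.step is Step.REVIEW_DRAFT:
        if text.lower() != "ok":
            task.draft = text  # any other reply replaces the draft
            return "input_required", f"Updated draft: {task.draft} (reply 'ok' or edit again)"
        task.step = Step.CONFIRM_SEND
        return "input_required", f"Send to {task.recipient}? (yes/no)"

    if task.step is Step.CONFIRM_SEND:
        task.step = Step.DONE
        if text.lower() == "yes":
            return "completed", f"Sent to {task.recipient}."  # email API call would go here
        return "canceled", "Email was not sent."

    return "completed", "Task already finished."
```

With this shape, every call except the last returns input_required, and the caller only has to send the next user message with the same task ID. Whether the framework expects this bookkeeping to live inside the remote agent or in the host agent is exactly what the questions below ask.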

❓ Questions

  1. Is it possible for the Host Agent to manage the state of a Remote Agent's task?
    For example, if the remote agent returns input_required, can the host agent remember the task context and resume the same task after receiving additional input from the user?

  2. The AgentExecutor.execute() method seems to act as the only endpoint on the remote agent side. Does this mean that the entire multi-step flow must be implemented sequentially inside execute() using conditional logic and internal state tracking?
    If so, this feels rather rigid and hard to manage for complex, interaction-heavy workflows.
    Is this the intended usage pattern, or is there a better way to decompose such flows?

  3. Can each AgentSkill be mapped to a specific method within the remote agent (1:1)?
    One of my teammates suggested that this is possible. However, based on my understanding, since AgentExecutor.execute() is the single exposed entry point, it doesn't seem feasible to dispatch skills to distinct internal methods directly. Is there a built-in way to route or differentiate skill-level logic within execute()?
