这是indexloc提供的服务,不要输入任何密码
Skip to content

PettingZoo AECEnv wrapper causing issues with Batch class #1244

@encratite

Description

@encratite

I'm trying to do MARL with a PettingZoo environment and Tianshou's PPO implementation but it looks like the PettingZoo AECEnv wrapper is inserting agent_id values that are causing issues with the batch processing. I'm not sure if this is a bug. I might just be misusing some components. The OnpolicyTrainer is throwing the following exception in test_episode:

Image

The offending agent_id properties were likely generated by the PettingZooEnv wrapper:

    def reset(self, *args: Any, **kwargs: Any) -> tuple[dict, dict]:
        self.env.reset(*args, **kwargs)

        observation, reward, terminated, truncated, info = self.env.last(self)

        if isinstance(observation, dict) and "action_mask" in observation:
            observation_dict = {
                "agent_id": self.env.agent_selection,
                "obs": observation["observation"],
                "mask": [obm == 1 for obm in observation["action_mask"]],
            }
        else:
            if isinstance(self.action_space, spaces.Discrete):
                observation_dict = {
                    "agent_id": self.env.agent_selection,
                    "obs": observation,
                    "mask": [True] * self.env.action_space(self.env.agent_selection).n,
                }
            else:
                observation_dict = {
                    "agent_id": self.env.agent_selection,
                    "obs": observation,
                }

        return observation_dict, info

The test_ppo.py sample, which works fine but doesn't support MARL, is similar to my code but uses the CartPole Gymnasium env instead, without the PettingZoo wrapper.

Here's my PettingZoo env:

https://github.com/encratite/thumper/blob/master/environment.py

Here's the Tianshou PPO training code:

https://github.com/encratite/thumper/blob/master/train.py

Any idea what I'm doing wrong? Or is this an actual incompatibility between different Tianshou components?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions