action_space in SubprocVectorEnv makes it harder to reproduce results. #299

@ChenDRAG

train_envs = SubprocVectorEnv(
            [lambda: gym.make(args.task) for _ in range(args.training_num)])
train_envs.action_space[0]
>Box(8,)
a = train_envs.action_space[0]
b = train_envs.action_space[0]
a is b
>False  # With SubprocVectorEnv; with DummyVectorEnv the result is True.

This happens because SubprocVectorEnv returns a deep copy of the action space on each access, whereas DummyVectorEnv returns a reference to the same object.
As a result, if you use SubprocVectorEnv with tianshou's collector and pass the 'random' option to the collect() method, you get different random actions every time you run the script, even when using the same random seeds.

So you have to be very careful about when you call 'train_envs.action_space[0].seed(1)', since the call seeds only the copy returned by that particular access.
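To make the effect easier to see without spawning subprocesses, here is a minimal sketch that uses only the standard library. ToySpace, ReferenceEnvs and CopyEnvs are hypothetical stand-ins, not tianshou or gym classes: CopyEnvs imitates the copy-on-access behaviour with copy.deepcopy, and building a fresh object each time stands in for a separate run of the script.

```python
import copy
import random


class ToySpace:
    """Hypothetical stand-in for a gym space that owns its own RNG."""

    def __init__(self):
        self._rng = random.Random()  # seeded from OS entropy, differs per "run"

    def seed(self, seed):
        self._rng.seed(seed)

    def sample(self):
        return self._rng.random()


class ReferenceEnvs:
    """Imitates the DummyVectorEnv behaviour: access returns the stored object."""

    def __init__(self):
        self._space = ToySpace()

    @property
    def action_space(self):
        return [self._space]


class CopyEnvs(ReferenceEnvs):
    """Imitates the SubprocVectorEnv behaviour: every access yields a new copy."""

    @property
    def action_space(self):
        return [copy.deepcopy(self._space)]


def seed_then_sample(envs, seed):
    envs.action_space[0].seed(seed)        # seeds whatever this access returns
    return envs.action_space[0].sample()   # samples from a second access


# Each freshly built object plays the role of one run of the script.
ref_reproducible = (seed_then_sample(ReferenceEnvs(), 1)
                    == seed_then_sample(ReferenceEnvs(), 1))
copy_reproducible = (seed_then_sample(CopyEnvs(), 1)
                     == seed_then_sample(CopyEnvs(), 1))

# Keeping one copy and both seeding and sampling it restores determinism.
def seed_held_copy(envs, seed):
    space = envs.action_space[0]
    space.seed(seed)
    return space.sample()

held_reproducible = (seed_held_copy(CopyEnvs(), 1)
                     == seed_held_copy(CopyEnvs(), 1))

print(ref_reproducible, copy_reproducible, held_reproducible)
```

In this sketch the seed lands on a throwaway copy in the CopyEnvs case, so the later sample comes from an unseeded RNG. Holding on to a single returned object and using it for both seeding and sampling avoids that, although whether this helps with the collector depends on which object the collector samples from.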

Labels: bug