import gym
from tianshou.env import SubprocVectorEnv

train_envs = SubprocVectorEnv(
    [lambda: gym.make(args.task) for _ in range(args.training_num)])
train_envs.action_space[0]
>Box(8,)
a = train_envs.action_space[0]
b = train_envs.action_space[0]
a is b
>False  # with SubprocVectorEnv; with DummyVectorEnv the result is True
This is because SubprocVectorEnv returns a deep copy of the action space, whereas DummyVectorEnv returns a reference to it.
As a result, if you use SubprocVectorEnv together with tianshou's collector and pass the 'random' option to collect(), you get different random actions every time you run the script, even with the same random seed. A call such as 'train_envs.action_space[0].seed(1)' only seeds the copy it was made on, not the object that is sampled from later.
So you have to be very careful about when you call 'train_envs.action_space[0].seed(1)'.
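
To make the effect easier to see without gym or tianshou installed, here is a minimal sketch using only the standard library. `ToySpace`, `ReferenceEnvs` and `CopyEnvs` are hypothetical stand-ins, not tianshou classes: they only mimic the reference-versus-deep-copy behaviour described above, under the assumption that the space owns its own random number generator.

```python
import copy
import random


class ToySpace:
    """Hypothetical stand-in for an action space that owns its own RNG."""

    def __init__(self):
        self._rng = random.Random()  # unseeded: state differs from run to run

    def seed(self, seed):
        self._rng.seed(seed)

    def sample(self):
        return self._rng.random()


class ReferenceEnvs:
    """Mimics the DummyVectorEnv behaviour: every access returns the same object."""

    def __init__(self):
        self._space = ToySpace()

    @property
    def action_space(self):
        return [self._space]


class CopyEnvs(ReferenceEnvs):
    """Mimics the SubprocVectorEnv behaviour: every access returns a deep copy."""

    @property
    def action_space(self):
        return [copy.deepcopy(self._space)]


def seed_then_sample(envs, seed):
    envs.action_space[0].seed(seed)       # seeds whatever this access returns
    return envs.action_space[0].sample()  # second access: same object or a new copy


expected = random.Random(1).random()  # what a correctly seeded RNG produces first

ref_sample = seed_then_sample(ReferenceEnvs(), 1)   # seed reaches the real space
copy_sample = seed_then_sample(CopyEnvs(), 1)       # seed lands on a throwaway copy

# Possible workaround: keep one handle, then seed and sample through it.
space = CopyEnvs().action_space[0]
space.seed(1)
held_sample = space.sample()

print(ref_sample == expected, copy_sample == expected, held_sample == expected)
```

In this sketch the seed only takes effect when the seeded object is the same one that is later sampled. With the copying variant, that means holding on to a single handle, or seeding before whichever component stores its own copy of the action space.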