Hello,
You advertise Tianshou as being fast and provide a comparison table in the README.
However, no reference code is linked to reproduce the results.
So I decided to create a Colab notebook to get a fair comparison (same hyperparameters) between Tianshou, Stable-Baselines (SB2) and Stable-Baselines3 (SB3).
On the two environments tested (Pendulum and Cartpole), the results are quite far from those reported for Stable-Baselines...
Even worse, when compared to SB2 on the Pendulum environment, Tianshou seems slower (and seems to give worse final performance).
Time to reach a mean reward of -200 for TD3 on Pendulum-v0 (1 env):
SB2: ~30s (vs 99s in the readme)
SB3: ~70s
Tianshou: >110s (vs 44s in the readme)
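For anyone who wants to reproduce this kind of timing outside the notebook, here is a minimal sketch of a time-to-threshold measurement. It is library-agnostic: `time_to_threshold`, `train_step` and `evaluate` are hypothetical names, not part of Tianshou or Stable-Baselines, and this is not necessarily exactly how the notebook measures it.

```python
import time
from typing import Callable, Optional


def time_to_threshold(
    train_step: Callable[[], None],
    evaluate: Callable[[], float],
    threshold: float = -200.0,
    eval_every: int = 10,
    max_steps: int = 100_000,
) -> Optional[float]:
    """Return wall-clock seconds until evaluate() first reaches threshold.

    train_step and evaluate are hypothetical callables supplied by the
    caller: one performs a unit of training, the other returns the current
    mean reward. Returns None if the threshold is never reached.
    """
    start = time.perf_counter()
    for step in range(1, max_steps + 1):
        train_step()
        # Evaluation time is included in the measurement, so use the same
        # eval_every for every library being compared.
        if step % eval_every == 0 and evaluate() >= threshold:
            return time.perf_counter() - start
    return None
```

Each library's training loop would be wrapped in its own `train_step`, with the same evaluation frequency and number of evaluation episodes, so that the measured times are comparable.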
You can find the notebook here
(Please check the Tianshou code; I'm not 100% sure that I reused the same hyperparameters.)