Description
Hello,
I used Tianshou 0.5 with a custom environment on a Windows PC. I was impressed by the training speed of the PPO agent, which exceeded 2000 iterations per second.
import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

Output:
0.5.0 0.26.3 2.5.1 1.26.4 3.11.10 | packaged by conda-forge | (main, Oct 16 2024, 01:17:14) [MSC v.1941 64 bit (AMD64)] win32
Training using the Tianshou library, version 0.5:
Epoch #1: 10112it [00:03, 2610.28it/s, env_step=10112, len=0, loss=0.303, loss/clip=0.000, loss/ent=0.916, loss/vf=0.623, n/ep=0, n/st=128, rew=0.00]
Epoch #1: test_reward: 27.403868 ± 0.000000, best_reward: 27.403868 ± 0.000000 in #1
Epoch #2: 10112it [00:03, 2634.27it/s, env_step=20224, len=0, loss=0.500, loss/clip=0.000, loss/ent=0.915, loss/vf=1.018, n/ep=0, n/st=128, rew=0.00]
Epoch #2: test_reward: 33.482427 ± 0.000000, best_reward: 33.482427 ± 0.000000 in #2
Epoch #3: 10112it [00:04, 2442.15it/s, env_step=30336, len=0, loss=0.713, loss/clip=-0.000, loss/ent=0.913, loss/vf=1.445, n/ep=0, n/st=128, rew=0.00]
Epoch #3: test_reward: 35.236934 ± 0.000000, best_reward: 35.236934 ± 0.000000 in #3
Epoch #4: 10112it [00:04, 2508.40it/s, env_step=40448, len=0, loss=0.547, loss/clip=0.000, loss/ent=0.910, loss/vf=1.112, n/ep=0, n/st=128, rew=0.00]
Epoch #4: test_reward: 22.770667 ± 0.000000, best_reward: 35.236934 ± 0.000000 in #3
Epoch #5: 10112it [00:04, 2479.57it/s, env_step=50560, len=334, loss=0.476, loss/clip=0.000, loss/ent=0.911, loss/vf=0.970, n/ep=0, n/st=128, rew=54.33]
Epoch #5: test_reward: 29.205846 ± 0.000000, best_reward: 35.236934 ± 0.000000 in #3

Recently, I upgraded to Tianshou 1.2, keeping the agent configuration the same. However, I observed a significant performance drop: the new version runs approximately 12 times slower, as shown below. I also tested this on Linux and observed the same results:
import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

Output:
1.2.0-dev 0.28.1 2.1.1+cu121 1.24.4 3.11.10 (main, Sep 7 2024, 18:35:41) [GCC 11.4.0] linux
Training using the Tianshou library, version 1.2:
Epoch #1: 10112it [00:59, 169.41it/s, env_episode=0, env_step=10112, gradient_step=158, len=0, n/ep=0, n/st=128, rew=0.00]
Epoch #2: 10112it [00:59, 171.18it/s, env_episode=0, env_step=20224, gradient_step=316, len=0, n/ep=0, n/st=128, rew=0.00]
Epoch #3: 10112it [00:59, 170.92it/s, env_episode=0, env_step=30336, gradient_step=474, len=0, n/ep=0, n/st=128, rew=0.00]
Epoch #4: 10112it [00:59, 171.19it/s, env_episode=0, env_step=40448, gradient_step=632, len=0, n/ep=0, n/st=128, rew=0.00]
Epoch #5: 10112it [00:59, 170.45it/s, env_episode=128, env_step=50560, gradient_step=790, len=1, n/ep=0, n/st=128, rew=41.83]

Have there been changes to the library that impact execution performance, and can I restore the previous performance levels through configuration adjustments?
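In case it helps narrow this down, here is a minimal profiling sketch (not part of my original runs) for locating where the extra time is spent under 1.2. The `trainer` name below is a placeholder for whatever trainer object the actual setup constructs; only the standard library is used.

import cProfile
import pstats

# Profile a single training run to see which calls dominate the epoch time.
# `trainer` is a placeholder for the actual trainer used in the setup above.
profiler = cProfile.Profile()
profiler.enable()
result = trainer.run()  # placeholder for the actual training call
profiler.disable()

# Print the 30 most expensive call sites, sorted by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(30)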
- I have marked all applicable categories:
- exception-raising bug
- RL algorithm bug
- documentation request (i.e. "X is missing from the documentation.")
- new feature request
- design request (i.e. "X should be changed to Y.")
- I have visited the source website
- I have searched through the issue tracker for duplicates
- I have mentioned version numbers, operating system, and environment.