Hi team,
I've run into a performance issue with the example/atari/atari_dqn.py script: training is slow from the start and slows down drastically as it progresses.
Environment:
GPU: NVIDIA A6000
PyTorch Version: 2.1.1+cu113, 2.6.0+cu126 (tried both)
Script: example/atari/atari_dqn.py
Env: Alien & Pong
Seed: 1
Steps to Reproduce:
Run the example/atari/atari_dqn.py script.
Monitor the training speed (iterations per second, it/s).
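In case it helps with reproducing this, here is a minimal sketch of how the it/s could be logged against the step count rather than read off the progress bar. `ThroughputMeter`, `tick`, and `history` are hypothetical names for illustration and are not part of the example script or the library; I'm assuming `tick()` would be called once per training iteration.

```python
import time
from typing import Callable, List, Tuple


class ThroughputMeter:
    """Hypothetical helper: records iterations per second over fixed-size windows."""

    def __init__(
        self,
        window: int = 100,
        clock: Callable[[], float] = time.perf_counter,
    ) -> None:
        if window <= 0:
            raise ValueError("window must be a positive number of iterations")
        self.window = window
        self.clock = clock  # injectable so the meter can be tested without sleeping
        self.step = 0
        self._window_start = clock()
        # One (step, iterations_per_second) entry per completed window.
        self.history: List[Tuple[int, float]] = []

    def tick(self) -> None:
        """Call once per iteration; closes a window every `window` calls."""
        self.step += 1
        if self.step % self.window == 0:
            now = self.clock()
            elapsed = now - self._window_start
            rate = self.window / elapsed if elapsed > 0 else float("inf")
            self.history.append((self.step, rate))
            self._window_start = now
```

Dumping `history` to a CSV would show at which step the rate starts to fall, which should make runs with different settings easier to compare.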
Observed Behavior:
Initially, the training speed is around 40-50 it/s.
It then decreases continuously. At around 2,400 training steps it drops below 10 it/s, and later falls below 7 it/s.
I attempted to mitigate this by setting training_num=100.
This seems to delay the severe drop slightly, but the it/s still falls below 10 at around 20,000 training steps.
At this reduced speed, a single epoch is estimated to take over 2 hours to complete.
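For reference, the epoch estimate is simple arithmetic. The sketch below assumes 100,000 steps per epoch, which is my assumption for illustration and not a value confirmed above, and it assumes one progress-bar iteration corresponds to one step. `epoch_hours` is a hypothetical helper name.

```python
def epoch_hours(steps_per_epoch: int, its_per_sec: float) -> float:
    """Estimated wall-clock hours for one epoch at a constant iteration rate."""
    if steps_per_epoch < 0:
        raise ValueError("steps_per_epoch must be non-negative")
    if its_per_sec <= 0:
        raise ValueError("its_per_sec must be positive")
    return steps_per_epoch / its_per_sec / 3600.0


# ASSUMPTION: 100,000 steps per epoch (illustrative, not taken from the report).
ASSUMED_STEPS_PER_EPOCH = 100_000

for rate in (45.0, 10.0, 7.0):
    hours = epoch_hours(ASSUMED_STEPS_PER_EPOCH, rate)
    print(f"{rate:5.1f} it/s -> {hours:.2f} h per epoch")
```

Under that assumption, anything below roughly 14 it/s puts a single epoch over 2 hours, whereas the initial 40-50 it/s would finish in well under an hour.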
Could you please look into this? Let me know if you need any more information from my end.
Thanks!