Add QR-DQN algorithm #276
Codecov Report
@@            Coverage Diff             @@
##           master     #276      +/-   ##
==========================================
+ Coverage   94.31%   94.39%   +0.08%
==========================================
  Files          44       45       +1
  Lines        2866     2892      +26
==========================================
+ Hits         2703     2730      +27
+ Misses        163      162       -1
Same as #263: I suggest adding experiments before pushing to tianshou/policy to ensure correctness.
For Atari 2600 games, it's hard for my PC to run with the same parameters, i.e. a memory size of 1M and 200M training frames. I have tested QR-DQN on some Atari environments to confirm that the algorithm converges. You can see some Atari results in examples/atari/results/qrdqn/.
How about results for another 3 Atari games? I can help if you lack computational resources.
Thank you for your help. I've been on holiday recently, so it's not convenient for me to run more tests.
Okay, that's fine. I'll do that.
This is the PR for the QR-DQN algorithm: https://arxiv.org/abs/1710.10044
1. Add the QR-DQN policy in tianshou/policy/modelfree/qrdqn.py.
2. Add the QR-DQN net in examples/atari/atari_network.py.
3. Add the QR-DQN Atari example in examples/atari/atari_qrdqn.py.
4. Add the QR-DQN statement in tianshou/policy/__init__.py.
5. Add the QR-DQN unit test in test/discrete/test_qrdqn.py.
6. Add the QR-DQN Atari results in examples/atari/results/qrdqn/.
7. Add compute_q_value in DQNPolicy and C51Policy to simplify the forward function.
8. Move `with torch.no_grad():` from `_target_q` to BasePolicy.

Running `python3 atari_qrdqn.py --task "PongNoFrameskip-v4" --batch-size 64` gives 'best_result': '19.8 ± 0.40' in epoch 8.
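For readers unfamiliar with QR-DQN, here is a minimal pure-Python sketch of the quantile Huber loss described in the linked paper, plus the usual way of reducing a set of quantiles to a scalar Q-value (their mean). It is not the code from this PR, which is written in PyTorch in tianshou/policy/modelfree/qrdqn.py. The function names `huber`, `quantile_huber_loss`, and `q_value` are hypothetical and chosen only for illustration, and the sketch handles a single state-action pair without batching.

```python
from typing import List


def huber(u: float, kappa: float = 1.0) -> float:
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    if abs(u) <= kappa:
        return 0.5 * u * u
    return kappa * (abs(u) - 0.5 * kappa)


def quantile_huber_loss(
    pred: List[float], target: List[float], kappa: float = 1.0
) -> float:
    """Quantile Huber loss for one state-action pair (illustrative only).

    pred[i] is the estimate of the quantile at midpoint
    tau_i = (2i + 1) / (2N); target holds samples of the target distribution.
    """
    n = len(pred)
    taus = [(2 * i + 1) / (2 * n) for i in range(n)]
    total = 0.0
    for tau, theta in zip(taus, pred):
        for t in target:
            u = t - theta  # pairwise TD error
            # Asymmetric weight: tau when under-estimating, 1 - tau otherwise.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber(u, kappa)
    # Sum over predicted quantiles, mean over target samples.
    return total / len(target)


def q_value(quantiles: List[float]) -> float:
    """Scalar Q-value as the mean of the predicted quantiles."""
    return sum(quantiles) / len(quantiles)


if __name__ == "__main__":
    predicted = [-1.0, 0.0, 1.0]
    targets = [0.5, 1.5, 2.5]
    print(quantile_huber_loss(predicted, targets))
    print(q_value(predicted))
```

The asymmetric weight is what pushes each output toward a different quantile: under-estimates are penalised by tau and over-estimates by 1 - tau.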