Add QR-DQN algorithm #276

Merged
merged 23 commits into thu-ml:master on Jan 28, 2021

Conversation

shengxiang19
Contributor

This is my PR for the QR-DQN algorithm: https://arxiv.org/abs/1710.10044

  1. add QR-DQN policy in tianshou/policy/modelfree/qrdqn.py.
  2. add QR-DQN net in tianshou/utils/net/discrete.py.
  3. add QR-DQN Atari example in examples/atari/atari_qrdqn.py.
  4. add QR-DQN statement in tianshou/policy/__init__.py.
  5. add QR-DQN test in test/discrete/test_qrdqn.py.
  6. add QR-DQN Atari results in examples/atari/results/qrdqn/.
  7. add compute_q in DQNPolicy and C51Policy to simplify the forward function.
  8. add a huber function in BasePolicy (see the loss sketch below).

Running "python3 atari_qrdqn.py --task PongNoFrameskip-v4 --batch-size 64" gives best_result '19.8 ± 0.40' at epoch 8.
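For readers unfamiliar with QR-DQN, here is a minimal, self-contained sketch of the quantile Huber loss from the paper, which is what the huber function in item 8 is used for. It is an illustration only, not the tianshou implementation; the function name, tensor shapes, and reduction are assumptions made for the example.

```python
import torch


def quantile_huber_loss(pred: torch.Tensor, target: torch.Tensor,
                        kappa: float = 1.0) -> torch.Tensor:
    """Quantile Huber loss of QR-DQN (illustrative sketch).

    pred:   (batch, N)  predicted quantiles of Q(s, a) for the taken action
    target: (batch, N') target quantiles, e.g. r + gamma * quantiles of Q'(s', a*)
    """
    n = pred.shape[1]
    # Quantile midpoints tau_hat_i = (2i - 1) / (2N) for i = 1..N.
    tau_hat = (torch.arange(n, dtype=pred.dtype, device=pred.device) + 0.5) / n
    # Pairwise TD errors u[b, i, j] = target[b, j] - pred[b, i].
    u = target.unsqueeze(1) - pred.unsqueeze(2)
    abs_u = u.abs()
    # Elementwise Huber loss L_kappa(u).
    huber = torch.where(abs_u <= kappa,
                        0.5 * u.pow(2),
                        kappa * (abs_u - 0.5 * kappa))
    # Asymmetric quantile weight |tau_hat_i - 1{u < 0}|.
    weight = (tau_hat.view(1, -1, 1) - (u.detach() < 0).float()).abs()
    # Mean over target quantiles j, sum over predicted quantiles i, mean over batch.
    loss = (weight * huber / kappa).mean(dim=2).sum(dim=1)
    return loss.mean()
```

Here pred would come from a network head that outputs one value per (action, quantile) pair, gathered at the action actually taken, and target from the target network with the usual n-step return.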

@codecov-io

codecov-io commented Jan 19, 2021

Codecov Report

Merging #276 (7ab8160) into master (a511cb4) will increase coverage by 0.08%.
The diff coverage is 95.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #276      +/-   ##
==========================================
+ Coverage   94.31%   94.39%   +0.08%     
==========================================
  Files          44       45       +1     
  Lines        2866     2892      +26     
==========================================
+ Hits         2703     2730      +27     
+ Misses        163      162       -1     
Flag Coverage Δ
unittests 94.39% <95.00%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
tianshou/policy/modelfree/dqn.py 98.71% <90.00%> (+0.03%) ⬆️
tianshou/policy/modelfree/qrdqn.py 93.18% <93.18%> (ø)
tianshou/policy/__init__.py 100.00% <100.00%> (ø)
tianshou/policy/base.py 73.52% <100.00%> (+0.26%) ⬆️
tianshou/policy/imitation/discrete_bcq.py 98.38% <100.00%> (-0.03%) ⬇️
tianshou/policy/modelfree/c51.py 93.61% <100.00%> (+4.55%) ⬆️
tianshou/policy/modelfree/ddpg.py 98.73% <100.00%> (-0.02%) ⬇️
tianshou/policy/modelfree/discrete_sac.py 87.30% <100.00%> (-0.20%) ⬇️
tianshou/policy/modelfree/sac.py 86.86% <100.00%> (-0.14%) ⬇️
tianshou/policy/modelfree/td3.py 100.00% <100.00%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a511cb4...7ab8160.

@ChenDRAG
Collaborator

Same as #263, I suggest adding experiments before pushing to tianshou/policy, to ensure correctness.

@shengxiang19
Contributor Author

Same as #263, I suggest adding experiments before pushing to tianshou/policy, to ensure correctness.

For Atari 2600 games, it's hard for my PC to reproduce the paper's settings, i.e. a replay memory of 1M and 200M training frames. I have tested QR-DQN on some Atari environments to ensure the algorithm converges. You can see some Atari results in examples/atari/results/qrdqn/.

@Trinkle23897
Collaborator

Trinkle23897 commented Jan 22, 2021

How about results on another three Atari games? I can help you if you lack computational resources.

@shengxiang19
Contributor Author

How about results on another three Atari games? I can help you if you lack computational resources.

Thank you for your help. I've been on holiday recently, so it's not convenient for me to run more tests.

@Trinkle23897
Collaborator

Okay, that's fine. I'll do that.

@Trinkle23897 Trinkle23897 merged commit 1eb6137 into thu-ml:master Jan 28, 2021
@shengxiang19 shengxiang19 deleted the qrdqn branch January 28, 2021 07:10
BFAnas pushed a commit to BFAnas/tianshou that referenced this pull request May 5, 2024
This is the PR for the QR-DQN algorithm: https://arxiv.org/abs/1710.10044

1. add QR-DQN policy in tianshou/policy/modelfree/qrdqn.py.
2. add QR-DQN net in examples/atari/atari_network.py.
3. add QR-DQN Atari example in examples/atari/atari_qrdqn.py.
4. add QR-DQN statement in tianshou/policy/__init__.py.
5. add QR-DQN unit test in test/discrete/test_qrdqn.py.
6. add QR-DQN Atari results in examples/atari/results/qrdqn/.
7. add compute_q_value in DQNPolicy and C51Policy to simplify the forward function (see the sketch below).
8. move `with torch.no_grad():` from `_target_q` to BasePolicy.

Running "python3 atari_qrdqn.py --task PongNoFrameskip-v4 --batch-size 64" gives best_result '19.8 ± 0.40' at epoch 8.
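Item 7 above factors greedy action selection through a shared compute_q_value hook so that DQN-style policies can reuse one forward path; for QR-DQN, the Q-value of an action is simply the mean over its quantile dimension. A minimal sketch of the idea (illustrative only, not the actual tianshou code; the standalone function and the shape check are assumptions for the example):

```python
import torch


def compute_q_value(logits: torch.Tensor) -> torch.Tensor:
    # Plain DQN: the network already outputs Q-values of shape
    # (batch, num_actions); return them unchanged.
    if logits.dim() == 2:
        return logits
    # QR-DQN: the network outputs quantiles of shape
    # (batch, num_actions, num_quantiles); the Q-value of each action
    # is the mean over its quantiles.
    return logits.mean(dim=2)


# Greedy action selection then looks identical for both policies:
#   q = compute_q_value(model(obs))
#   act = q.argmax(dim=1)
```

C51 would override the same hook with a probability-weighted sum over its support atoms instead of a plain mean.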