Add QR-DQN algorithm #276

shengxiang19 · 2021-01-19T07:45:26Z

This is my PR for the QR-DQN algorithm: https://arxiv.org/abs/1710.10044

add QR-DQN policy in tianshou/policy/modelfree/qrdqn.py.
add QR-DQN net in tianshou/utils/net/discrete.py.
add QR-DQN atari example in examples/atari/atari_qrdqn.py.
add QR-DQN import statement in tianshou/policy/__init__.py.
add QR-DQN test in test/discrete/test_qrdqn.py.
add QR-DQN atari results in examples/atari/results/qrdqn/.
add compute_q in DQNPolicy and C51Policy to simplify the forward function.
add huber function in BasePolicy.

Running `python3 atari_qrdqn.py --task "PongNoFrameskip-v4" --batch-size 64` gives 'best_result': '19.8 ± 0.40' in epoch 8.
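For readers who have not seen the loss behind the huber function mentioned above, here is a minimal sketch of the quantile Huber loss described in the QR-DQN paper. It is not the code from this PR: the function names `huber` and `quantile_huber_loss` are hypothetical, it works on plain Python lists for a single state-action pair instead of batched tensors, and it assumes quantile midpoints tau_i = (2i + 1) / (2N) with kappa = 1 by default.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    a = abs(u)
    if a <= kappa:
        return 0.5 * u * u
    return kappa * (a - 0.5 * kappa)


def quantile_huber_loss(current, target, kappa=1.0):
    """Quantile Huber loss for one state-action pair.

    current: list of N quantile estimates from the online network.
    target:  list of target samples (e.g. r + gamma * target quantiles).
    Both argument names are illustrative assumptions.
    """
    n = len(current)
    # Quantile midpoints: tau_i = (2i + 1) / (2N)
    taus = [(2 * i + 1) / (2 * n) for i in range(n)]
    total = 0.0
    for tau, theta in zip(taus, current):
        acc = 0.0
        for t in target:
            u = t - theta  # TD error between a target sample and this quantile
            # Asymmetric weight: tau when the estimate is too low (u >= 0),
            # 1 - tau when it is too high (u < 0).
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            acc += weight * huber(u, kappa)
        total += acc / len(target)  # mean over target samples
    return total  # summed over the N quantiles


if __name__ == "__main__":
    print(quantile_huber_loss([0.0, 0.0], [1.0]))
```

The asymmetric weight is what pushes each output toward a different quantile of the return distribution; with a single quantile (tau = 0.5) the loss reduces to half the ordinary Huber loss.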

fix readme (#273)

codecov-io · 2021-01-19T09:35:43Z

Codecov Report

Merging #276 (7ab8160) into master (a511cb4) will increase coverage by 0.08%.
The diff coverage is 95.00%.

@@            Coverage Diff             @@
##           master     #276      +/-   ##
==========================================
+ Coverage   94.31%   94.39%   +0.08%     
==========================================
  Files          44       45       +1     
  Lines        2866     2892      +26     
==========================================
+ Hits         2703     2730      +27     
+ Misses        163      162       -1

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `94.39% <95.00%> (+0.08%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| tianshou/policy/modelfree/dqn.py | `98.71% <90.00%> (+0.03%)` | ⬆️ |
| tianshou/policy/modelfree/qrdqn.py | `93.18% <93.18%> (ø)` | |
| tianshou/policy/__init__.py | `100.00% <100.00%> (ø)` | |
| tianshou/policy/base.py | `73.52% <100.00%> (+0.26%)` | ⬆️ |
| tianshou/policy/imitation/discrete_bcq.py | `98.38% <100.00%> (-0.03%)` | ⬇️ |
| tianshou/policy/modelfree/c51.py | `93.61% <100.00%> (+4.55%)` | ⬆️ |
| tianshou/policy/modelfree/ddpg.py | `98.73% <100.00%> (-0.02%)` | ⬇️ |
| tianshou/policy/modelfree/discrete_sac.py | `87.30% <100.00%> (-0.20%)` | ⬇️ |
| tianshou/policy/modelfree/sac.py | `86.86% <100.00%> (-0.14%)` | ⬇️ |
| tianshou/policy/modelfree/td3.py | `100.00% <100.00%> (ø)` | |
| ... and 1 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a511cb4...7ab8160.

ChenDRAG · 2021-01-20T02:14:27Z

Same as #263, suggest adding experiment before pushing to tianshou/policy to ensure correctness.

shengxiang19 · 2021-01-20T02:36:02Z

> Same as #263, suggest adding experiment before pushing to tianshou/policy to ensure correctness.

For Atari 2600 games, my PC cannot reproduce the same parameters, i.e. a memory size of 1M and 200M training frames. I have tested QR-DQN on some Atari environments to confirm that the algorithm converges. You can see some Atari results in examples/atari/results/qrdqn/.

Trinkle23897 · 2021-01-22T08:36:34Z

How about results for another 3 Atari games? I can help if you lack computational resources.

shengxiang19 · 2021-01-22T09:05:29Z

> How about results for another 3 Atari games? I can help if you lack computational resources.

Thank you for your help. I've been on holiday recently. It's not convenient for me to do more tests.

Trinkle23897 · 2021-01-22T09:09:05Z

Okay, that's fine. I'll do that.

This is the PR for the QR-DQN algorithm: https://arxiv.org/abs/1710.10044

1. add QR-DQN policy in tianshou/policy/modelfree/qrdqn.py.
2. add QR-DQN net in examples/atari/atari_network.py.
3. add QR-DQN atari example in examples/atari/atari_qrdqn.py.
4. add QR-DQN import statement in tianshou/policy/__init__.py.
5. add QR-DQN unit test in test/discrete/test_qrdqn.py.
6. add QR-DQN atari results in examples/atari/results/qrdqn/.
7. add compute_q_value in DQNPolicy and C51Policy to simplify the forward function.
8. move `with torch.no_grad():` from `_target_q` to BasePolicy.

Running `python3 atari_qrdqn.py --task "PongNoFrameskip-v4" --batch-size 64` gives 'best_result': '19.8 ± 0.40' in epoch 8.
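To illustrate why a separate compute_q_value step simplifies the forward function, here is a rough sketch of how a distributional policy could turn per-action quantile outputs into scalar Q-values before choosing an action. This is an assumption-laden illustration, not the implementation merged here: `q_values_from_quantiles` and `greedy_action` are hypothetical names, inputs are nested Python lists for one state, and the mask handling is a simplified stand-in.

```python
def q_values_from_quantiles(quantiles):
    """Collapse quantile estimates into one Q-value per action.

    quantiles[a] holds the N quantile estimates for action a. With
    equally weighted quantiles, the expected return is their mean.
    """
    return [sum(qs) / len(qs) for qs in quantiles]


def greedy_action(quantiles, mask=None):
    """Pick the action with the highest Q-value.

    mask (optional, hypothetical): list of booleans, False marks an
    action that is not available in the current state.
    """
    q = q_values_from_quantiles(quantiles)
    if mask is not None:
        # Unavailable actions can never win the argmax.
        q = [v if ok else float("-inf") for v, ok in zip(q, mask)]
    return max(range(len(q)), key=q.__getitem__)


if __name__ == "__main__":
    outputs = [[0.0, 1.0, 2.0], [4.0, 4.0, 4.0]]
    print(q_values_from_quantiles(outputs))
    print(greedy_action(outputs))
    print(greedy_action(outputs, mask=[True, False]))
```

With the reduction isolated like this, the action-selection code can stay the same across DQN-style policies and only the reduction step differs (for example, a probability-weighted sum over a fixed support for a C51-style output instead of a mean over quantiles).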

shengxiang19 and others added 8 commits January 19, 2021 14:17

Add QR-DQN algorithm

aaa6a79

Merge pull request #1 from thu-ml/master

5ae80be

fix readme (#273)

fix PEP8

c03c5a1

fix PEP8

137fa55

fix PEP8

00315be

fix PEP8

ab9fe0e

fix PEP8

6afd8c7

fix PEP8

be4de10

shengxiang19 added 3 commits January 19, 2021 17:38

fix

207afac

fix PEP8

d57aae4

fix PEP8

5d44d33

Trinkle23897 reviewed Jan 19, 2021

tianshou/utils/net/discrete.py (outdated, resolved)

tianshou/policy/base.py (outdated, resolved)

remove one redundant view operation

fb5d889

duburcqa reviewed Jan 19, 2021

tianshou/utils/net/discrete.py (outdated, resolved)

tianshou/utils/net/discrete.py (outdated, resolved)

Fix typing errors

216350e

Trinkle23897 added 4 commits January 20, 2021 17:02

merge master

109f199

fix test

3d4ccdc

small tweak

748757a

merge master

3c4dc69

shengxiang19 requested a review from Trinkle23897 January 22, 2021 07:03

Trinkle23897 added 4 commits January 24, 2021 09:16

update Enduro and SpaceInvader result

254dc6f

update seaquest result

22028fd

move 'with torch.no_grad():' from _target_q to BasePolicy

86f4d39

move huber loss to utils/loss

8c3f7cb

revert

504cd9e

Trinkle23897 reviewed Jan 27, 2021

tianshou/policy/modelfree/qrdqn.py (outdated, resolved)

try smooth_l1_loss

7ab8160

Trinkle23897 approved these changes Jan 27, 2021

Trinkle23897 merged commit 1eb6137 into thu-ml:master Jan 28, 2021

shengxiang19 deleted the qrdqn branch January 28, 2021 07:10
