Step collector implementation #280

ChenDRAG · 2021-01-28T09:32:34Z

This is the third of the 6 commits mentioned in #274. It refactors Collector to fix #245. See #274 for more detail.

codecov-io · 2021-02-18T12:05:39Z

Codecov Report

Merging #280 (9b46678) into dev (d918022) will decrease coverage by 0.16%.
The diff coverage is 93.67%.

@@            Coverage Diff             @@
##              dev     #280      +/-   ##
==========================================
- Coverage   94.64%   94.47%   -0.17%     
==========================================
  Files          45       45              
  Lines        3027     3152     +125     
==========================================
+ Hits         2865     2978     +113     
- Misses        162      174      +12

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `94.47% <93.67%> (-0.17%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| tianshou/policy/modelfree/a2c.py | `86.20% <ø> (ø)` | |
| tianshou/policy/modelfree/pg.py | `97.36% <ø> (ø)` | |
| tianshou/policy/modelfree/ppo.py | `96.51% <ø> (ø)` | |
| tianshou/policy/base.py | `76.80% <67.39%> (+3.27%)` | ⬆️ |
| tianshou/trainer/onpolicy.py | `96.87% <90.90%> (-1.54%)` | ⬇️ |
| tianshou/data/collector.py | `94.46% <94.11%> (-1.54%)` | ⬇️ |
| tianshou/data/buffer.py | `98.57% <97.12%> (-1.43%)` | ⬇️ |
| tianshou/data/__init__.py | `100.00% <100.00%> (ø)` | |
| tianshou/data/batch.py | `99.74% <100.00%> (+0.51%)` | ⬆️ |
| tianshou/policy/__init__.py | `100.00% <100.00%> (ø)` | |

... and 12 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d918022...9b46678.

ChenDRAG · 2021-02-19T01:57:08Z

I think this PR is ready to merge now.

Trinkle23897 · 2021-02-19T02:19:51Z

Other suggestions will appear in the next PR because this one is too large (over 2000 lines, though many of the changes are in the tests).

This is the third PR of the 6 commits mentioned in thu-ml#274. It refactors Collector to fix thu-ml#245. You can check thu-ml#274 for more detail.

Things changed in this PR:

1. refactor collector to be cleaner; split out AsyncCollector to support asyncvenv
2. change the buffer.add API to add(batch, buffer_ids); add several types of buffer (VectorReplayBuffer, PrioritizedVectorReplayBuffer, etc.)
3. add policy.exploration_noise(act, batch) -> act
4. small change in BasePolicy.compute_*_returns
5. move reward_metric from collector to trainer
6. fix np.asanyarray issue (different NumPy versions produce different output)
7. flake8 maxlength=88
8. polish docs and fix tests

Co-authored-by: n+e <trinkle23897@gmail.com>
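The `policy.exploration_noise(act, batch) -> act` hook listed above can be illustrated with a small standalone sketch. This is not Tianshou's code: `GreedyPolicy`, `collect_step`, the plain-dict `batch`, and the epsilon-greedy rule are all hypothetical stand-ins, and the sketch assumes the collector applies the hook only while gathering training data.

```python
import random
from typing import List, Sequence


class GreedyPolicy:
    """Hypothetical toy policy; it does not subclass Tianshou's BasePolicy."""

    def __init__(self, n_actions: int, eps: float, seed: int = 0) -> None:
        self.n_actions = n_actions
        self.eps = eps
        self._rng = random.Random(seed)

    def forward(self, q_values: Sequence[Sequence[float]]) -> List[int]:
        # Greedy action per environment: index of the largest Q-value.
        return [max(range(len(q)), key=q.__getitem__) for q in q_values]

    def exploration_noise(self, act: List[int], batch: dict) -> List[int]:
        # Same shape as the hook in the PR: takes (act, batch), returns act.
        # Assumed rule: with probability eps, swap in a uniform random action.
        return [
            self._rng.randrange(self.n_actions)
            if self._rng.random() < self.eps
            else a
            for a in act
        ]


def collect_step(
    policy: GreedyPolicy, q_values: Sequence[Sequence[float]], training: bool
) -> List[int]:
    # Hypothetical collector step: noise is added after the policy has
    # chosen its action, and only when collecting training data.
    act = policy.forward(q_values)
    if training:
        act = policy.exploration_noise(act, {"q_values": q_values})
    return act
```

Keeping the noise in a separate method means the same `forward` output can be used unchanged at test time, which is the separation the hook's signature suggests.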

ChenDRAG and others added 29 commits January 19, 2021 17:00

add first version of cached replay buffer(baseline), add standard api for replaybuffer

8229b47

add cached buffer, vec buffer

483404f

simple pep8 fix

942c2a3

init

ec4b246

Merge branch 'master(net/utils change thu-ml#275)' into cached

e7f631e

Merge branch 'master' into cached

da564be

some change

36e799e

update ReplayBuffer

50b20a0

Merge branch 'cached' of github.com:ChenDRAG/tianshou into cached

3e487dc

refactor ReplayBuffer

0ac97af

refactor vec/cached buffer

bd90b97

pep8 fix

2b7c227

update VectorReplayBuffer and add test

3afb2cb

update cached

443969d

order change, small fix

ee51e64

try unittest

a5bc4ad

add more test and fix bugs

17e3612

fix a bug and add some corner-case tests

7eba23d

re-implement sample_avail function and add test for CachedReplayBuffer(size=0, )

b9f4f2a

improve documents

8fe85f8

ReplayBuffers._offset

f59a530

fix atari-style update; support CachedBuffer with main_buffer==PrioBuffer

425c2bd

assert _meta.is_empty() in ReplayBuffers init

2361755

Merge branch 'master' into cached

0160a7f

small fix

75d581b

small fix

b5d93f3

improve coverage

c8f27c9

small buffer change

9c879f2

draft of step_collector, not finished yet

679fe27

ChenDRAG mentioned this pull request Jan 28, 2021

Add CachedReplayBuffer and ReplayBufferManager #278

Merged

Trinkle23897 added 3 commits February 18, 2021 17:09

it works!

64fe3ce

fix test

64e04df

fix dqn family eps-test

2275efb

Trinkle23897 added 7 commits February 18, 2021 20:08

fix dead loop in creating new Batch (drqn _is_scalar replace np.asanyarray with np.isscalar)

1145359

split exploration_noise in bcq

c7a624f

add test for priovecbuf

d5fa008

improve coverage

c3b35d4

fix several bugs of documentation

0cae28c

fix test

b7efc68

add a test of batch

b2098ee

Trinkle23897 requested a review from duburcqa February 18, 2021 15:13

Trinkle23897 added 2 commits February 18, 2021 23:32

fix test

4f181cc

add a note

8be5b2c

Trinkle23897 mentioned this pull request Feb 19, 2021

Plans of releasing mujoco benchmark with ddpg/sac/td3 on Tianshou #274

Closed

This was linked to issues Feb 19, 2021

Collector.collect run an entire episode when only set n_step #255

Closed

Increasing training time of a2c #140

Closed

remove redundant code

cb4dbda

polish docs organization

9b46678

Trinkle23897 approved these changes Feb 19, 2021

Trinkle23897 merged commit 150d0ec into thu-ml:dev Feb 19, 2021

This was referenced Feb 19, 2021

Trainer refactor : some definition change #293

Merged

Trainer refactor : flexible logger #295

Merged

ChenDRAG mentioned this pull request Mar 6, 2021

Action noise in testing state #304

Closed

8 tasks

Trinkle23897 mentioned this pull request Mar 7, 2021

Some atari envs use 30GB of RAM #225

Closed

Trinkle23897 linked an issue Apr 21, 2021 that may be closed by this pull request

Plans of releasing mujoco benchmark with ddpg/sac/td3 on Tianshou #274

Closed

Trinkle23897 mentioned this pull request Jun 17, 2021

How to do self-play correctly for tic-tac-toe? #381

Closed
