unify single-env and multi-env in collector #157
Conversation
This is a prerequisite for #134.
To me it is a great improvement. Reducing the number of branches is not easy, but it often makes everything simpler and clearer. This part of the code in particular was unnecessarily complicated and hard to understand (probably an artefact of the past), since a single env is nothing conceptually different from a vector env with a single instance. Good that you took the time to clean up that part of the code!
I realized this when I wanted to support sync simulation, and got support from @Trinkle23897. It is indeed an artefact of the past: he wanted to support partial episodes, but later realized that it is hardly necessary, so I removed this part of the code.
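For context, a minimal sketch of that idea, assuming Gym's CartPole-v0 and a Tianshou vector-env wrapper that takes a list of environment factories (the exact class name may differ between versions; this is not verbatim from the PR):

```python
import gym
from tianshou.env import VectorEnv  # assumed 0.2.x-era name for the vector wrapper

# A single environment is conceptually just a vector env with one instance,
# so the collector can use the same batched code path in both cases.
envs = VectorEnv([lambda: gym.make('CartPole-v0')])  # one factory -> one env
obs = envs.reset()  # batched observation of shape (1, *obs_shape), even for one env
```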
Ready for review now @Trinkle23897 @duburcqa
The base branch was changed.
According to my experiments:
We ran some experiments on Bipedal-walker and Lunar-lander; the final results are slightly different from each other. @imoneoi
The point of this PR is to enforce that there are only full episodes in the replay buffer. If fully online learning really matters to you, I think you can use a single environment and write a collector with simple for-loops:

```python
# Pseudocode: hand-rolled step-level collection with a single environment.
obs = env.reset()
for _ in range(num_iterations):
    data = []
    while len(data) < n:
        obs, rew, done, info = env.step(policy(obs))
        data.append((obs, rew, done, info))
        if done:
            obs = env.reset()
    policy.learn(data)
```

In my opinion, if you need a replay buffer, fully online learning (learning after every few steps) does not matter so much. The current implementation (Tianshou 0.2.5) allows users to perform online learning at the episode level (learning after every few episodes), and I think that is enough. However, feel free to provide feedback and to discuss :)
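(For illustration, a minimal hedged sketch of what episode-level online learning with a collector looks like; the collector and policy construction are omitted, and the method signatures are assumptions based on the 0.2.x-era API rather than verbatim code.)

```python
# Sketch: learn after every few full episodes instead of after every few steps.
for _ in range(num_iterations):
    collector.collect(n_episode=4)           # gather a few complete episodes
    batch = collector.sample(batch_size=64)  # draw transitions from the buffer
    policy.learn(batch)                      # one update step
```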
Unify the implementation with multi-environments (wrap a single environment in a multi-environment with one env) to greatly simplify the code.

This changes the behavior of the single-environment case. Prior to this PR, for a single environment, collector.collect(n_step=n) would step exactly n steps. After this PR, for a single environment, collector.collect(n_step=n) steps m episodes until the number of steps is greater than n. That is to say, collectors now always collect full episodes.
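A small illustration of the semantic change, as a hedged sketch (the collector setup is omitted and the episode/step counts in the comments are an assumed example, not output from a real run):

```python
# After this PR, with a single environment wrapped as a one-instance vector env:
stats = collector.collect(n_step=100)
# Whole episodes are collected until the step count reaches at least 100, so the
# actual number of steps may exceed 100 (e.g. 3 full episodes totalling 112 steps),
# and the replay buffer only ever holds full episodes.
# Before this PR, the same call stepped exactly 100 times and could cut an
# episode in the middle.
```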