Standardized behavior of Batch.cat and misc code refactor #137

youkaichao · 2020-07-14T06:34:26Z

Misc code refactor and add Batch.condense

These changes are picked from #122 to be an independent PR.


youkaichao · 2020-07-14T12:42:24Z

Currently, cat only supports a partially empty Batch, that is, a non-empty Batch in which some fields are empty Batch objects.

To enable cat with empty Batch like Batch(a=Batch(), b=Batch(c=Batch())), we have to know its length, which should be resolved after #138 .
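To illustrate why the length matters, here is a minimal sketch that uses plain dicts of lists as a stand-in for Batch (an assumption for illustration, not tianshou's implementation). The helper name `infer_length` is hypothetical. A partially empty structure still has a leaf to read a length from, while a fully empty one has none.

```python
from typing import Optional


def infer_length(batch: dict) -> Optional[int]:
    """Return the length of the first non-empty leaf, or None if there is none.

    Plain dicts model Batch here; an empty dict models an empty Batch.
    """
    for value in batch.values():
        if isinstance(value, dict):
            length = infer_length(value)
            if length is not None:
                return length
        else:
            return len(value)
    return None


# Non-empty, but field "b" is empty: a length can still be inferred.
partially_empty = {"a": [1, 2, 3], "b": {}}
# Only empty containers: there is no leaf to read a length from.
fully_empty = {"a": {}, "b": {"c": {}}}

print(infer_length(partially_empty))  # 3
print(infer_length(fully_empty))      # None
```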


duburcqa · 2020-07-14T13:53:35Z

> To enable cat with empty Batch like Batch(a=Batch(), b=Batch(c=Batch())), we have to know its length, which should be resolved after #138.

is_empty is provided for such cases. No need to change or implement anything else.

youkaichao · 2020-07-14T14:22:27Z

Batch.cat([Batch(a=np.random.rand(3, 4)), Batch(b=Batch(), a=Batch(c=Batch()))])

@duburcqa What about this? Maybe this should cause an error? Then we should distinguish between Batch() and Batch(c=Batch()), where the former is used to reserve keys but the latter should not be treated as empty.

This raises a natural question: is it ok to use Batch() to reserve keys? Or maybe we should explicitly reserve the keys? (e.g. have a field named self.reserved_keys).

Or, must we reserve any keys? Maybe we can reserve no keys, return a special token to indicate the absence of a key, and let the caller deal with it.

One more issue: None is also a valid object (Batch(a=None) works just fine), but we are using b.get(k, None) is None to judge the existence of a key.
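The pitfall with None can be shown with a plain dict (used here only as a stand-in for Batch). This is a minimal sketch of the "special token" idea; the names `_MISSING`, `has_key_buggy` and `has_key` are hypothetical and do not come from tianshou.

```python
# A unique sentinel object: no stored value can ever be identical to it.
_MISSING = object()


def has_key_buggy(d: dict, k: str) -> bool:
    """Treats a stored None as if the key were absent."""
    return d.get(k, None) is not None


def has_key(d: dict, k: str) -> bool:
    """Distinguishes 'key absent' from 'key present with value None'."""
    return d.get(k, _MISSING) is not _MISSING


d = {"a": None}
print(has_key_buggy(d, "a"))  # False, although "a" is present
print(has_key(d, "a"))        # True
print(has_key(d, "b"))        # False
```

For a plain dict, `k in d` gives the same answer; the sentinel is useful when the lookup goes through a `get`-style accessor.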

duburcqa · 2020-07-14T14:37:35Z

> @duburcqa What about this?

Batch(c=Batch()) is not an empty Batch -> a=Batch(c=Batch()) is not a key reservation -> the shape of Batch(b=Batch(), a=Batch(c=Batch())) is not compatible with the shape of Batch(a=np.random.rand(3, 4)) -> it should raise an exception. I see no ambiguity here.
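The chain of reasoning above can be sketched with plain dicts standing in for Batch. This is only an illustration under the strict rule stated in this comment (a bare empty container is the only key reservation); `is_reserved` and `check_cat_compatible` are hypothetical names, and the choice of ValueError is an assumption.

```python
def is_reserved(value) -> bool:
    """Strict rule: only a bare empty container reserves a key."""
    return isinstance(value, dict) and len(value) == 0


def check_cat_compatible(batches) -> None:
    """Raise ValueError if a key holds array data in one batch and a
    non-empty nested structure in another."""
    kinds = {}
    for batch in batches:
        for key, value in batch.items():
            if is_reserved(value):
                continue  # a reservation is compatible with anything
            kind = "nested" if isinstance(value, dict) else "array"
            if kinds.setdefault(key, kind) != kind:
                raise ValueError(
                    f"key {key!r} mixes {kinds[key]} and {kind} values"
                )


# "a" is array data in the first batch but {"c": {}} in the second.
# {"c": {}} is not a bare empty container, so it is not a reservation.
try:
    check_cat_compatible([{"a": [0.1, 0.2, 0.3]}, {"b": {}, "a": {"c": {}}}])
except ValueError as err:
    print("rejected:", err)
```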

> This raises a natural question: is it ok to use Batch() to reserve keys? Or maybe we should explicitly reserve the keys? (e.g. have a field named self.reserved_keys).
> Or, must we reserve any keys? Maybe we can reserve no keys, return a special token to indicate the absence of a key, and let the caller deal with it.
> One more issue: None is also a valid object (Batch(a=None) works just fine), but we are using b.get(k, None) is None to judge the existence of a key.

I don't know. But those design issues should be discussed in another dedicated issue + PR.

youkaichao · 2020-07-14T14:47:16Z

> I see no ambiguity here.

Currently, Batch(c=Batch()) is empty. And maybe someone would want to use Batch(b=Batch(), a=Batch(c=Batch())) for hierarchical key reservation?

duburcqa · 2020-07-14T14:51:56Z

> Currently, Batch(c=Batch()) is empty.

Yes, but I don't think it should be (at least for this PR).

> And maybe someone would want to use Batch(b=Batch(), a=Batch(c=Batch())) for hierarchical key reservation?

Maybe, but I think that is another issue, since it requires a change of paradigm. In my opinion, supporting hierarchical key reservation is irrelevant. I may be wrong.

Trinkle23897 · 2020-07-14T15:05:42Z

Hierarchical key reservation is a must for the current collector implementation, since it creates something like self.data = Batch(obs=..., ..., policy=Batch(_state=Batch()), ...)
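As a rough illustration of what hierarchical reservation means, the sketch below walks a nested dict (a stand-in for Batch, not the collector's actual data structure) and lists every reserved slot by its key path. The helper `reserved_paths` and the sample `data` values are hypothetical.

```python
from typing import Iterator, Tuple


def reserved_paths(
    batch: dict, prefix: Tuple[str, ...] = ()
) -> Iterator[Tuple[str, ...]]:
    """Yield the key path of every empty dict, i.e. every reserved slot."""
    for key, value in batch.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            if value:
                # Non-empty nested container: look for reservations inside.
                yield from reserved_paths(value, path)
            else:
                yield path


# Loosely modelled on the shape quoted above; the values are made up.
data = {"obs": [0.1, 0.2], "policy": {"_state": {}}, "info": {}}
print(list(reserved_paths(data)))  # [('policy', '_state'), ('info',)]
```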

youkaichao · 2020-07-16T04:36:25Z

> Currently, Batch(c=Batch()) is empty.

Now we support is_empty(recursive=bool), which lets the caller decide whether only Batch() or also Batch(c=Batch()) counts as empty.
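A minimal sketch of the two notions of emptiness, again with plain dicts standing in for Batch. It mirrors the flag name used in this comment, but it is an illustrative model under that assumption, not the library's code.

```python
def is_empty(batch: dict, recursive: bool = False) -> bool:
    """Model of the two emptiness checks.

    recursive=False: only a container with no keys at all is empty.
    recursive=True: a container is also empty if every value is itself
    an empty container, at any depth.
    """
    if len(batch) == 0:
        return True
    if not recursive:
        return False
    return all(
        isinstance(v, dict) and is_empty(v, recursive=True)
        for v in batch.values()
    )


print(is_empty({}))                          # True
print(is_empty({"c": {}}))                   # False
print(is_empty({"c": {}}, recursive=True))   # True
print(is_empty({"a": [1], "c": {}}, recursive=True))  # False
```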

> Hierarchical key reservation is a must for the current collector implementation

Hierarchical key reservation stays.

With 6d2cda6, Batch.cat_ now behaves exactly as described here.

tianshou/data/batch.py

youkaichao · 2020-07-16T07:41:25Z

Torch is using recurse not recursive.

Fixed in 02f9d1a.

duburcqa · 2020-07-16T07:52:33Z

Nice job! Very high-quality PR again 👍 @Trinkle23897 Ready to be merged.

NB: @Trinkle23897 you should not allow every collaborator to merge PRs, or else clearly state whether you agree to me merging some PRs.

Trinkle23897 · 2020-07-16T08:37:21Z

> NB: @Trinkle23897 you should not allow every collaborator to merge PRs, or else clearly state whether you agree to me merging some PRs.

I don't know how to set it, since the minimum role for approval is "write". I enabled the "At least 1 approving review by reviewers with write access" option. Maybe setting it to 2 is better for the three of us?

duburcqa · 2020-07-16T08:40:56Z

> I don't know how to set it, since the minimum role for approval is "write".

Settings -> Branches -> Branch protection rules -> master -> Protect matching branches -> Require review from Code Owners + Restrict who can dismiss pull request reviews

> Maybe setting it to 2 is better for the three of us?

Why not


Trinkle23897 · 2020-07-16T10:38:55Z

@duburcqa I've changed the number to 2.

* code refactor; remove unused kwargs; add reward_normalization for dqn
* bugfix for __setitem__ with torch.Tensor; add Batch.condense
* minor fix
* support cat with empty Batch
* remove the dependency of is_empty on len; specify the semantic of empty Batch by test cases
* support stack with empty Batch
* remove condense
* refactor code to reflect the shared / partial / reserved categories of keys
* add is_empty(recursive=False)
* doc fix
* docfix and bugfix for _is_batch_set
* add doc for key reservation
* bugfix for algebra operators
* fix cat with lens hint
* code refactor
* bugfix for storing None
* use ValueError instead of exception
* hide lens away from users
* add comment for __cat
* move the computation of the initial value of lens in cat_ itself.
* change the place of doc string
* doc fix for Batch doc string
* change recursive to recurse
* doc string fix
* minor fix for batch doc

* make fileds with empty Batch rather than None after reset * dummy code * remove dummy * add reward_length argument for collector * Improve Batch (#126) * make sure the key type of Batch is string, and add unit tests * add is_empty() function and unit tests * enable cat of mixing dict and Batch, just like stack * bugfix for reward_length * add get_final_reward_fn argument to collector to deal with marl * minor polish * remove multibuf * minor polish * improve and implement Batch.cat_ * bugfix for buffer.sample with field impt_weight * restore the usage of a.cat_(b) * fix 2 bugs in batch and add corresponding unittest * code fix for update * update is_empty to recognize empty over empty; bugfix for len * bugfix for update and add testcase * add testcase of update * make fileds with empty Batch rather than None after reset * dummy code * remove dummy * add reward_length argument for collector * bugfix for reward_length * add get_final_reward_fn argument to collector to deal with marl * make sure the key type of Batch is string, and add unit tests * add is_empty() function and unit tests * enable cat of mixing dict and Batch, just like stack * dummy code * remove dummy * add multi-agent example: tic-tac-toe * move TicTacToeEnv to a separate file * remove dummy MANet * code refactor * move tic-tac-toe example to test * update doc with marl-example * fix docs * reduce the threshold * revert * update player id to start from 1 and change player to agent; keep coding * add reward_length argument for collector * Improve Batch (#128) * minor polish * improve and implement Batch.cat_ * bugfix for buffer.sample with field impt_weight * restore the usage of a.cat_(b) * fix 2 bugs in batch and add corresponding unittest * code fix for update * update is_empty to recognize empty over empty; bugfix for len * bugfix for update and add testcase * add testcase of update * fix docs * fix docs * fix docs [ci skip] * fix docs [ci skip] Co-authored-by: Trinkle23897 <463003665@qq.com> * 
refact * re-implement Batch.stack and add testcases * add doc for Batch.stack * reward_metric * modify flag * minor fix * reuse _create_values and refactor stack_ & cat_ * fix pep8 * fix reward stat in collector * fix stat of collector, simplify test/base/env.py * fix docs * minor fix * raise exception for stacking with partial keys and axis!=0 * minor fix * minor fix * minor fix * marl-examples * add condense; bugfix for torch.Tensor; code refactor * marl example can run now * enable tic tac toe with larger board size and win-size * add test dependency * Fix padding of inconsistent keys with Batch.stack and Batch.cat (#130) * re-implement Batch.stack and add testcases * add doc for Batch.stack * reuse _create_values and refactor stack_ & cat_ * fix pep8 * fix docs * raise exception for stacking with partial keys and axis!=0 * minor fix * minor fix Co-authored-by: Trinkle23897 <463003665@qq.com> * stash * let agent learn to play as agent 2 which is harder * code refactor * Improve collector (#125) * remove multibuf * reward_metric * make fileds with empty Batch rather than None after reset * many fixes and refactor Co-authored-by: Trinkle23897 <463003665@qq.com> * marl for tic-tac-toe and general gomoku * update default gamma to 0.1 for tic tac toe to win earlier * fix name typo; change default game config; add rew_norm option * fix pep8 * test commit * mv test dir name * add rew flag * fix torch.optim import error and madqn rew_norm * remove useless kwargs * Vector env enable select worker (#132) * Enable selecting worker for vector env step method. * Update collector to match new vecenv selective worker behavior. * Bug fix. 
* Fix rebase Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu> * show the last move of tictactoe by capital letters * add multi-agent tutorial * fix link * Standardized behavior of Batch.cat and misc code refactor (#137) * code refactor; remove unused kwargs; add reward_normalization for dqn * bugfix for __setitem__ with torch.Tensor; add Batch.condense * minor fix * support cat with empty Batch * remove the dependency of is_empty on len; specify the semantic of empty Batch by test cases * support stack with empty Batch * remove condense * refactor code to reflect the shared / partial / reserved categories of keys * add is_empty(recursive=False) * doc fix * docfix and bugfix for _is_batch_set * add doc for key reservation * bugfix for algebra operators * fix cat with lens hint * code refactor * bugfix for storing None * use ValueError instead of exception * hide lens away from users * add comment for __cat * move the computation of the initial value of lens in cat_ itself. * change the place of doc string * doc fix for Batch doc string * change recursive to recurse * doc string fix * minor fix for batch doc * write tutorials to specify the standard of Batch (#142) * add doc for len exceptions * doc move; unify is_scalar_value function * remove some issubclass check * bugfix for shape of Batch(a=1) * keep moving doc * keep writing batch tutorial * draft version of Batch tutorial done * improving doc * keep improving doc * batch tutorial done * rename _is_number * rename _is_scalar * shape property do not raise exception * restore some doc string * grammarly [ci skip] * grammarly + fix warning of building docs * polish docs * trim and re-arrange batch tutorial * go straight to the point * minor fix for batch doc * add shape / len in basic usage * keep improving tutorial * unify _to_array_with_correct_type to remove duplicate code * delegate type convertion to Batch.__init__ * further delegate type convertion to Batch.__init__ * bugfix for setattr * add a 
_parse_value function * remove dummy function call * polish docs Co-authored-by: Trinkle23897 <463003665@qq.com> * bugfix for mapolicy * pretty code * remove debug code; remove condense * doc fix * check before get_agents in tutorials/tictactoe * tutorial * fix * minor fix for batch doc * minor polish * faster test_ttt * improve tic-tac-toe environment * change default epoch and step-per-epoch for tic-tac-toe * fix mapolicy * minor polish for mapolicy * 90% to 80% (need to change the tutorial) * win rate * show step number at board * simplify mapolicy * minor polish for mapolicy * remove MADQN * fix pep8 * change legal_actions to mask (need to update docs) * simplify maenv * fix typo * move basevecenv to single file * separate RandomAgent * update docs * grammarly * fix pep8 * win rate typo * format in cheatsheet * use bool mask directly * update doc for boolean mask Co-authored-by: Trinkle23897 <463003665@qq.com> Co-authored-by: Alexis DUBURCQ <alexis.duburcq@gmail.com> Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>


youkaichao added 2 commits July 14, 2020 14:29

code refactor; remove unused kwargs; add reward_normalization for dqn

1f84806

bugfix for __setitem__ with torch.Tensor; add Batch.condense

80a4f98

youkaichao requested a review from Trinkle23897 July 14, 2020 06:39

duburcqa reviewed Jul 14, 2020

tianshou/data/batch.py (outdated)

duburcqa reviewed Jul 14, 2020

tianshou/data/batch.py

duburcqa reviewed Jul 14, 2020

tianshou/policy/base.py

duburcqa reviewed Jul 14, 2020

tianshou/policy/base.py (outdated)

Trinkle23897 linked an issue Jul 14, 2020 that may be closed by this pull request

Remove some useless kwargs or not #131

Closed

Trinkle23897 and others added 4 commits July 14, 2020 16:11

minor fix

3d7cc24

support cat with empty Batch

fc84433

remove the dependency of is_empty on len; specify the semantic of empty Batch by test cases

9fa118c

support stack with empty Batch

ceca419

remove condense

4557baa

youkaichao mentioned this pull request Jul 14, 2020

The length of empty Batch #138

Closed

refactor code to reflect the shared / partial / reserved categories of keys

3de0218

youkaichao added 6 commits July 16, 2020 09:57

add is_empty(recursive=False)

f840c73

doc fix

ce08bac

docfix and bugfix for _is_batch_set

35b1533

add doc for key reservation

8c2847f

bugfix for algebra operators

e1e36e0

fix cat with lens hint

6d2cda6

code refactor

ebf19ea

duburcqa reviewed Jul 16, 2020

tianshou/data/batch.py

youkaichao added 2 commits July 16, 2020 15:08

add comment for __cat

fba94a6

move the computation of the initial value of lens in cat_ itself.

6287326

duburcqa reviewed Jul 16, 2020

tianshou/data/batch.py (outdated)

duburcqa reviewed Jul 16, 2020

tianshou/data/batch.py (outdated)

youkaichao added 3 commits July 16, 2020 15:23

change the place of doc string

f64faf5

doc fix for Batch doc string

b795493

change recursive to recurse

02f9d1a

duburcqa previously approved these changes Jul 16, 2020

This was linked to issues Jul 16, 2020

The length of empty Batch #138

Closed

Standardize the behavior of Batch aggregation (stack/cat) when dealing with reserved keys #139

Closed

Trinkle23897 reviewed Jul 16, 2020

doc string fix

2a063f0

youkaichao dismissed duburcqa’s stale review via 2a063f0 July 16, 2020 09:55

minor fix for batch doc

7912eae

Trinkle23897 approved these changes Jul 16, 2020

Trinkle23897 assigned youkaichao Jul 16, 2020

duburcqa approved these changes Jul 16, 2020

Trinkle23897 merged commit f8ad6df into thu-ml:dev Jul 16, 2020

youkaichao deleted the misc branch July 16, 2020 11:49
