Tags: JoeZJH/tianshou

v0.4.5

Fix critic network for Discrete CRR (thu-ml#485)

- Fixes an inconsistency in the implementation of Discrete CRR: it now uses the `Critic` class for its critic, following the convention in other actor-critic policies;
- Updates several offline policies to use the `ActorCritic` class for their optimizers, eliminating randomness caused by parameter sharing between actor and critic;
- Adds `writer.flush()` in TensorboardLogger to ensure real-time results;
- Enables `test_collector=None` in 3 trainers to turn off testing during training;
- Updates the Atari offline results in README.md;
- Moves Atari offline RL examples to `examples/offline` and tests to `test/offline`, per review comments.

v0.4.4

bump to 0.4.4

v0.4.3

bump to v0.4.3 (thu-ml#432)

* add makefile
* bump version
* add isort and yapf
* update contributing.md
* update PR template
* spelling check

v0.4.2

add vizdoom example, bump version to 0.4.2 (thu-ml#384)

v0.4.1

Fix SAC loss explode (thu-ml#333)

* change SAC action_bound_method to "clip" (tanh is hardcoded in forward)
* docstring update
* modelbase -> modelbased

v0.4.0

Merge pull request thu-ml#302 from thu-ml/dev

v0.4.0

v0.3.2

v0.3.2 (thu-ml#292)

Throw a warning in ListReplayBuffer.

This version update is needed because of thu-ml#289: the previous v0.3.1 does not work well under torch<=1.6.0 in a CUDA environment.

v0.3.1

Add offline trainer and discrete BCQ algorithm (thu-ml#263)

The results need to be tuned after the `done` issue is fixed.

Co-authored-by: n+e <trinkle23897@gmail.com>

v0.3.0.post1

specify the meaning of logits in documentation (thu-ml#238)

v0.3.0

change API of train_fn and test_fn (thu-ml#229)

train_fn(epoch) -> train_fn(epoch, num_env_step)
test_fn(epoch) -> test_fn(epoch, num_env_step)
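
The signature change above can be illustrated with a minimal sketch that does not import tianshou. Only the two-argument form `fn(epoch, num_env_step)` comes from the commit message; the names `epsilon_at`, `adapt_legacy_fn`, `schedule_log`, and the decay constants are hypothetical and chosen for illustration. One plausible use of the extra argument is a schedule driven by environment steps rather than epochs:

```python
from typing import Callable, List, Tuple

# Hypothetical record of the values a callback would hand to a policy.
schedule_log: List[Tuple[int, int, float]] = []


def epsilon_at(num_env_step: int, start: float = 1.0, end: float = 0.1,
               decay_steps: int = 10_000) -> float:
    """Linearly interpolate from `start` to `end` over `decay_steps` env steps."""
    frac = min(num_env_step / decay_steps, 1.0)
    return start + frac * (end - start)


def train_fn(epoch: int, num_env_step: int) -> None:
    """New-style callback: receives both the epoch and the env step count."""
    # A real callback would pass this value to the policy; here we only log it.
    schedule_log.append((epoch, num_env_step, epsilon_at(num_env_step)))


def adapt_legacy_fn(legacy_fn: Callable[[int], None]) -> Callable[[int, int], None]:
    """Wrap an old-style fn(epoch) so it accepts (epoch, num_env_step)."""
    def wrapped(epoch: int, num_env_step: int) -> None:
        legacy_fn(epoch)  # the step count is ignored by legacy callbacks
    return wrapped


# Usage: call the callbacks the way a trainer with the new API would.
seen_epochs: List[int] = []
legacy_test_fn = adapt_legacy_fn(seen_epochs.append)

train_fn(1, 0)
train_fn(2, 5_000)
legacy_test_fn(2, 5_000)
```

The adapter is only a migration aid under these assumptions; callbacks written directly against the two-argument form need no wrapper.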