v1.2.0

This is the last major release before version 2.0.0.

It resolves the regression in data collection performance and introduces several fixes. Importantly, it also adds support for determinism testing, which is used to ensure that the refactoring in the upcoming 2.0.0 release does not affect any aspect of training or inference.

Changes/Improvements

trainer:
- Custom scoring now supported for selecting the best model. #1202
highlevel:
- DiscreteSACExperimentBuilder: Expose method with_actor_factory_default #1248 #1250
- ActorFactoryDefault: Fix parameters for hidden sizes and activation not being
  passed on in the discrete case (affects with_actor_factory_default method of experiment builders)
- ExperimentConfig: Do not inherit from other classes, as this breaks automatic handling by
  jsonargparse when the class is used to define interfaces (as in high-level API examples)
- AutoAlphaFactoryDefault: Differentiate discrete and continuous action spaces
  and allow coefficient to be modified, adding an informative docstring
  (previous implementation was reasonable only for continuous action spaces)
  - Adjust usage in atari_sac_hl example accordingly.
- NPGAgentFactory, TRPOAgentFactory: Fix optimizer instantiation, which wrongly included the actor parameters
  (as was misleadingly suggested in the docstrings of the respective policy classes; the docstrings were fixed).
  The actor parameters are intended to be handled via natural gradients internally
data:
- ReplayBuffer: Fix collection of empty episodes being disallowed
- Collection was slow due to isinstance checks on Protocols and due to buffer integrity validation. This was resolved
  by no longer performing isinstance checks on Protocols and by disabling integrity validation by default.
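
To illustrate why isinstance checks on Protocols can be costly in a hot path such as data collection, here is a minimal, self-contained sketch. The names BatchLike and ConcreteBatch are hypothetical and are not Tianshou classes. The sketch only compares a structural check against a nominal one and does not reproduce the library's actual code.

```python
import timeit
from typing import Protocol, runtime_checkable


@runtime_checkable
class BatchLike(Protocol):
    """Hypothetical structural interface (an assumption, not a Tianshou class)."""

    def keys(self): ...

    def to_dict(self): ...


class ConcreteBatch:
    """Hypothetical concrete class that happens to satisfy BatchLike."""

    def __init__(self):
        self._data = {}

    def keys(self):
        return self._data.keys()

    def to_dict(self):
        return dict(self._data)


def time_check(target, number=20_000):
    """Time repeated isinstance checks of one object against `target`."""
    obj = ConcreteBatch()
    return timeit.timeit(lambda: isinstance(obj, target), number=number)


# A Protocol check inspects the object for every protocol member at runtime,
# whereas a check against a concrete class is a plain type lookup.
protocol_time = time_check(BatchLike)
concrete_time = time_check(ConcreteBatch)
print(f"Protocol check: {protocol_time:.4f}s")
print(f"Concrete check: {concrete_time:.4f}s")
```

The exact timings depend on the Python version and machine, so no expected output is given. The structural check is typically the slower of the two, which matters when it runs once per collected step.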
tests:
- We have introduced extensive determinism tests, which make it possible to validate whether
  training processes deterministically compute the same results across different development branches.
  This is an important step towards ensuring reproducibility and consistency, and it will be
  instrumental in supporting Tianshou developers in their work, especially in the context of
  algorithm development and evaluation.
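
The idea behind a determinism test can be sketched as follows: run a seeded process, serialize its results into a snapshot, and compare that snapshot with one produced by another run (for example, on a different branch). This is a minimal illustration using only the standard library. The function toy_training_run and the snapshot format are assumptions made for the example and are not Tianshou's test implementation.

```python
import json
import random


def toy_training_run(seed, steps=50):
    """Hypothetical stand-in for a training process, driven only by `seed`."""
    rng = random.Random(seed)  # local generator, so no global state leaks in
    weight = 0.0
    losses = []
    for _ in range(steps):
        target = rng.uniform(-1.0, 1.0)
        error = weight - target
        weight -= 0.1 * error  # simple gradient-style update
        losses.append(error * error)
    return {"final_weight": weight, "losses": losses}


def snapshot(result):
    """Serialize results to a canonical string that can be stored and diffed."""
    return json.dumps(result, sort_keys=True)


# In practice the reference snapshot would be produced on one branch and
# stored; here both runs happen in the same process for illustration.
reference = snapshot(toy_training_run(seed=42))
candidate = snapshot(toy_training_run(seed=42))
different_seed = snapshot(toy_training_run(seed=43))

assert reference == candidate, "same seed must give identical results"
assert reference != different_seed, "a different seed should change results"
```

Comparing serialized snapshots rather than a single final metric makes it easier to see at which step two runs diverge.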
