Description
Tianshou is moving towards a new release that will improve and stabilize its interfaces and make the library better suited to industry applications, while keeping (and even improving) the experience for researchers and beginners. The related issues are collected in the corresponding milestone.
This issue gives an overview of the most important aspects and a brief summary of the current status.
Aspects of Industry Grade RL
Generally, in industry applications one cares about:
- ease of use (addressed with high level interfaces)
- reproducibility
- ease of restoration of agents, and inference with them (will need to be improved)
- versatility of algorithms, logging, and integrations
- stability of interfaces
- readability of code, typing, documentation
- performance guarantees (through reliable benchmarking on each release)
Aspects of RL for research
- Various details of algorithms and trainers should be easily customizable
- Low code complexity of algo/trainer implementations
- Good evaluation protocols (HPO, best practices in statistics for reporting)
- Easily customizable logging (via callbacks/method overrides)
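
As an illustration of the "best practices in statistics for reporting" item, here is a minimal sketch of reporting a mean return across seeds together with a percentile-bootstrap confidence interval. It is not Tianshou code: the function name `bootstrap_mean_ci`, its parameters, and the example return values are all hypothetical, and a real evaluation protocol may well use a different estimator.

```python
import random
import statistics


def bootstrap_mean_ci(returns, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap over runs.

    `returns` holds one final evaluation return per independent seed.
    This helper is a hypothetical illustration, not part of any library.
    """
    if not returns:
        raise ValueError("need at least one return value")
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(returns)
    # Resample the runs with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(returns, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return statistics.fmean(returns), means[lo_idx], means[hi_idx]


# Hypothetical per-seed returns, made up for this example.
per_seed_returns = [212.4, 198.7, 230.1, 205.3, 189.9]
mean, lower, upper = bootstrap_mean_ci(per_seed_returns)
print(f"mean return {mean:.1f}, 95% CI [{lower:.1f}, {upper:.1f}]")
```

Reporting an interval over several seeds, rather than a single best run, makes it easier to tell whether a difference between two configurations is more than seed noise.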
Current Status
@opcode81 created a first version of the high-level interfaces, as well as example scripts. These will be further refined over the coming months. See #970.
@bordeauxred and @MischaPanch are working on #978 to create a proper training/evaluation protocol
@carlocagnetta is working on including tutorial notebooks in the repo, as well as generally updating and extending the documentation (#916)
@maxhuettenrauch is working on #933 and general improvements of logging/callbacks
@spacegoing is working on fixing RNN-related issues #937
@ivanappliedai is fixing some bugs in the offline RL part and migrating from D4RL to Minari (#932)
@MischaPanch is coordinating overall efforts, and is involved in several tangential improvements