26 Apr 07:25

Trinkle23897

0.2.2

Algorithm Implementation

Add Generalized Advantage Estimation (GAE);
Update the PPO algorithm following arXiv:1811.02553 and arXiv:1912.09729;
Add vanilla Imitation Learning (BC & DA, with continuous/discrete action space);
Add Prioritized DQN;
Add RNN-style policy network;
Fix SAC with torch==1.5.0
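
For readers unfamiliar with GAE, the following is a minimal sketch of the standard backward recursion, written in plain Python. It is not Tianshou's implementation: the function name gae_advantages, its parameters, and the default values of gamma and gae_lambda are assumptions chosen for illustration.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,       # discount factor (illustrative default)
    gae_lambda: float = 0.95,  # GAE smoothing parameter (illustrative default)
) -> List[float]:
    """Compute GAE advantages for one trajectory (hypothetical helper).

    values[t] is V(s_t) and next_values[t] is V(s_{t+1}).
    dones[t] marks that the episode ended after step t.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        # A_t = delta_t + gamma * lambda * A_{t+1}, reset at episode ends
        running = delta + gamma * gae_lambda * not_done * running
        advantages[t] = running
    return advantages
```

With gae_lambda=0 the result reduces to the one-step TD error, and with gae_lambda=1 it becomes the discounted return minus the value baseline; intermediate values trade bias against variance.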

API change

Change __call__ to forward in policy;
Add save_fn to the trainer;
Add __repr__ to tianshou.data, e.g. print(buffer)

07 Apr 03:52

Trinkle23897

0.2.1

First version with full documentation.
Supported algorithms: DQN/VPG/A2C/DDPG/PPO/TD3/SAC