SAC implementation update #212

Merged
merged 4 commits into from
Sep 12, 2020

Conversation

danagi
Collaborator

@danagi danagi commented Sep 11, 2020

@Trinkle23897
Collaborator

Could you please try examples/box2d/bipedal_hardcore_sac.py with auto-alpha tuning? Thanks very much!

@danagi danagi closed this Sep 11, 2020
@danagi danagi reopened this Sep 11, 2020
- replace DiagGaussian with Independent(Normal) (PyTorch already supports this)
- detach alpha from autograd
- add value/alpha to result (more informative)
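
The first two points can be illustrated with a minimal PyTorch sketch. It is not the code from this PR: the tensor shapes, the `target_entropy` value, the learning rate, and all variable names are assumptions chosen for illustration. It shows that `Independent(Normal(...), 1)` sums log-probabilities over the action dimension the way a diagonal Gaussian does, and one common way to keep alpha out of the autograd graph when the temperature is tuned automatically.

```python
import torch
from torch.distributions import Independent, Normal

torch.manual_seed(0)

# Hypothetical actor outputs: a batch of 4 states with 3-dimensional actions.
mu = torch.zeros(4, 3)
sigma = torch.ones(4, 3)

# Independent(..., 1) reinterprets the last batch dimension as an event
# dimension, so log_prob is summed over the action dimensions.
dist = Independent(Normal(mu, sigma), 1)
action = dist.rsample()
log_prob = dist.log_prob(action)  # shape: (4,)

# The same quantity computed by hand from the per-dimension Normal.
manual = Normal(mu, sigma).log_prob(action).sum(-1)

# Automatic temperature tuning: log_alpha is the only learnable parameter here.
target_entropy = -3.0  # assumed heuristic: -action_dim
log_alpha = torch.zeros(1, requires_grad=True)
alpha_optim = torch.optim.Adam([log_alpha], lr=3e-4)  # assumed learning rate

# Temperature loss; log_prob is detached so only log_alpha receives gradients.
alpha_loss = -(log_alpha * (log_prob.detach() + target_entropy)).mean()
alpha_optim.zero_grad()
alpha_loss.backward()
alpha_optim.step()

# Detached alpha: a plain constant when used inside actor and critic losses.
alpha = log_alpha.detach().exp()
```

Because `alpha` carries no gradient history, using it in the actor or critic loss cannot backpropagate into `log_alpha`; only `alpha_loss` updates the temperature.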
duburcqa
duburcqa previously approved these changes Sep 11, 2020
@Trinkle23897 Trinkle23897 changed the title clean sac SAC implementation update and accelerate auto-alpha training speed Sep 11, 2020
@Trinkle23897 Trinkle23897 changed the title SAC implementation update and accelerate auto-alpha training speed SAC implementation update Sep 11, 2020
@Trinkle23897 Trinkle23897 linked an issue Sep 11, 2020 that may be closed by this pull request
@Trinkle23897 Trinkle23897 merged commit 16d8e9b into thu-ml:master Sep 12, 2020
danagi added a commit to danagi/tianshou that referenced this pull request Sep 12, 2020
BFAnas pushed a commit to BFAnas/tianshou that referenced this pull request May 5, 2024
- replace DiagGaussian with Independent(Normal) (PyTorch already supports this)
- detach alpha from autograd
- add value/alpha to result (more informative)
- revert thu-ml#204 to fix thu-ml#211

Co-authored-by: Trinkle23897 <463003665@qq.com>
Successfully merging this pull request may close these issues.

Potential bug caused by calling policy.eval() before collecting experience
3 participants