Implement CQLPolicy and offline_cql example #506
Codecov Report: All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
+ Coverage   94.18%   94.42%   +0.24%
==========================================
  Files          62       63       +1
  Lines        4127     4252     +125
==========================================
+ Hits         3887     4015     +128
+ Misses        240      237       -3
- make format (required)
- make commit-checks (required)

This PR implements CQLPolicy, which can be used to train an offline agent in environments with a continuous action space. An experimental result on 'halfcheetah-medium-v1', a d4rl environment for offline reinforcement learning, is provided.

Example usage is in examples/offline/offline_cql.py. Documentation updates and a unit test for CQLPolicy are also included.
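
For context, CQL augments a SAC-style critic loss with a conservative term that pushes down Q-values of out-of-distribution actions while keeping Q-values of dataset actions high. The snippet below is a minimal, simplified sketch of that regularizer in PyTorch; cql_regularizer, critic, and actor are illustrative placeholders, not the actual CQLPolicy internals.

```python
import torch


def cql_regularizer(critic, actor, obs, act, num_sampled=10, temperature=1.0):
    """Simplified CQL(H)-style conservative penalty (illustrative only).

    Assumes critic(obs, act) -> Q-values of shape (B, 1) and
    actor(obs) -> sampled actions of shape (B, act_dim); these are
    placeholder callables, not the exact Tianshou interfaces.
    """
    batch_size, act_dim = act.shape

    # Q-values of the actions stored in the offline dataset (kept high).
    q_data = critic(obs, act)  # (B, 1)

    # Repeat each observation to score several sampled actions per state.
    obs_rep = obs.repeat_interleave(num_sampled, dim=0)

    # Q-values of uniformly random actions in [-1, 1] (pushed down).
    rand_act = torch.empty(batch_size * num_sampled, act_dim,
                           device=act.device).uniform_(-1.0, 1.0)
    q_rand = critic(obs_rep, rand_act).view(batch_size, num_sampled)

    # Q-values of actions drawn from the current policy (also pushed down).
    with torch.no_grad():
        pi_act = actor(obs_rep)
    q_pi = critic(obs_rep, pi_act).view(batch_size, num_sampled)

    # Soft maximum over sampled actions via temperature-scaled logsumexp.
    cat_q = torch.cat([q_rand, q_pi], dim=1)  # (B, 2 * num_sampled)
    soft_max_q = temperature * torch.logsumexp(
        cat_q / temperature, dim=1, keepdim=True
    )

    # Conservative gap: large when the critic overestimates unseen actions.
    return (soft_max_q - q_data).mean()
```

In the full CQL(H) objective, these sampled Q-values are additionally importance-weighted by their sampling log-probabilities, and the resulting gap is scaled by a tunable weight before being added to the ordinary Bellman error of each critic.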