Implement ContinuousBCQPolicy and offline_bcq example #480
Conversation
Could you please add
Codecov Report
@@            Coverage Diff             @@
##           master     #480      +/-   ##
==========================================
+ Coverage   94.06%   94.24%   +0.17%
==========================================
  Files          60       61       +1
  Lines        3910     4031     +121
==========================================
+ Hits         3678     3799     +121
  Misses        232      232
finish test_bcq
This PR implements ContinuousBCQPolicy, which can be used to train an offline agent in environments with a continuous action space. An experimental result on 'halfcheetah-expert-v1', a d4rl environment (for offline reinforcement learning), is provided. Example usage is in examples/offline/offline_bcq.py.
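For reviewers unfamiliar with the algorithm, BCQ selects actions by sampling candidates from a generative model trained on the offline dataset, applying a small learned perturbation, and taking the candidate with the highest Q-value. Below is a minimal NumPy sketch of just that selection rule; the `sample`, `perturb`, and `qval` callables are toy stand-ins (in the actual policy these are a VAE and critic networks), not the PR's implementation.

```python
import numpy as np

def bcq_select_action(state, sample_actions, perturb, q_value,
                      n_candidates=10, rng=None):
    """BCQ-style action selection: sample candidate actions, perturb
    them slightly, and return the candidate with the highest Q-value."""
    rng = rng or np.random.default_rng(0)
    # Sample candidate actions from the (stand-in) generative model.
    candidates = np.stack([sample_actions(state, rng)
                           for _ in range(n_candidates)])
    # Apply a bounded perturbation (BCQ constrains it to [-Phi, Phi]).
    candidates = candidates + perturb(state, candidates)
    # Score each candidate with the (stand-in) critic and pick the best.
    q = np.array([q_value(state, a) for a in candidates])
    return candidates[np.argmax(q)]

# Toy stand-ins for the learned components (hypothetical, for illustration):
dim = 3
sample = lambda s, rng: rng.normal(size=dim)
perturb = lambda s, acts: np.clip(0.05 * acts, -0.05, 0.05)  # Phi = 0.05
qval = lambda s, a: -np.sum((a - 0.5) ** 2)  # Q peaks near a = 0.5

action = bcq_select_action(np.zeros(dim), sample, perturb, qval)
print(action.shape)  # (3,)
```

The restriction to generated-then-perturbed candidates is what keeps the policy close to the behavior distribution in the dataset, which is the core idea behind training purely offline.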
make format (required)
make commit-checks (required)