-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Closed
Labels
discussionDiscussion of a typical issueDiscussion of a typical issuerefactoringNo change to functionalityNo change to functionality
Milestone
Description
Not really a coding issue.
You assert in your paper that the code base is highly modular but the algorithms are very strongly tied to your implementation of a replay buffer. All of the static methods in the base policy require the replay buffer.
It would be non-trivial to decouple the algorithmic implementations from the replay.
I would like to recommend that you rename learn to _learn. learn without the underscore implies that it is a stand alone method that can be publicly called but, at least for DDPG and child classes, learn requires everything surrounding it in update.
if buffer is None:
return {}
batch, indices = buffer.sample(sample_size)
self.updating = True
batch = self.process_fn(batch, buffer, indices)
result = self.learn(batch, **kwargs)
self.post_process_fn(batch, buffer, indices)
if self.lr_scheduler is not None:
self.lr_scheduler.step()
self.updating = False
return result
The docstring of learn is also not indicative of this limitation.
Metadata
Metadata
Assignees
Labels
discussionDiscussion of a typical issueDiscussion of a typical issuerefactoringNo change to functionalityNo change to functionality
Type
Projects
Status
Done