Closed
Labels
enhancement (Feature that is not a new algorithm or an algorithm enhancement), refactoring (No change to functionality)
Description
Currently, the distribution function (e.g. for PPO) and the actor are specified independently, yet there is obviously a very strong connection: The nature of the actor's output determines the distribution function to apply.
This can cause errors that are hard to debug, especially when the distribution function accepts the actor's output without raising an error but interprets it with the wrong semantics.
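To illustrate the kind of silent mismatch meant here, consider a minimal sketch in plain Python. It assumes a hypothetical discrete actor that already outputs probabilities, paired with a distribution function that expects logits; the names `softmax`, `actor_probs`, `misread_probs` and `correct_probs` are made up for this example.

```python
import math


def softmax(values):
    """Turn unnormalized logits into probabilities."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical actor output that is already a probability vector.
actor_probs = [0.7, 0.2, 0.1]

# A distribution function that expects logits applies softmax again.
# Nothing fails, but the result is a flatter distribution than intended.
misread_probs = softmax(actor_probs)

# What a logits-based distribution function should have received instead.
correct_probs = softmax([math.log(p) for p in actor_probs])

print("intended:", actor_probs)
print("misread: ", misread_probs)
print("correct: ", correct_probs)
```

Both calls return a valid probability vector, so no exception points at the mistake; only the sampled behaviour differs.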
Therefore, I propose to create a stronger link, which we can already implement at least in the high-level API:
The ActorFactory, which knows the nature of the actor's output, shall also create the matching distribution function, eliminating the issue. In turn, the dist_fn parameter can be entirely removed from the high-level API (breaking change).
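The proposed coupling could look roughly like the following minimal sketch in plain Python. Everything beyond the name `ActorFactory` is an assumption made for illustration: the method names `create_actor` and `create_dist_fn`, the two subclasses, the stub actors and the toy distribution classes are hypothetical and not the library's actual API. A real implementation would use the library's own network and distribution types.

```python
import math
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Categorical:
    """Toy discrete distribution built from unnormalized logits."""
    logits: Sequence[float]

    def probs(self) -> List[float]:
        shift = max(self.logits)
        exps = [math.exp(x - shift) for x in self.logits]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self) -> int:
        return random.choices(range(len(self.logits)), weights=self.probs())[0]


@dataclass
class DiagonalGaussian:
    """Toy continuous distribution with independent dimensions."""
    mean: Sequence[float]
    std: Sequence[float]

    def sample(self) -> List[float]:
        return [random.gauss(m, s) for m, s in zip(self.mean, self.std)]


class ActorFactory(ABC):
    """Hypothetical interface: one object creates the actor AND its dist_fn."""

    @abstractmethod
    def create_actor(self) -> Callable:
        ...

    @abstractmethod
    def create_dist_fn(self) -> Callable:
        ...


class DiscreteActorFactory(ActorFactory):
    def __init__(self, num_actions: int):
        self.num_actions = num_actions

    def create_actor(self) -> Callable:
        n = self.num_actions
        return lambda obs: [0.0] * n  # stub actor: uniform logits

    def create_dist_fn(self) -> Callable:
        # The factory knows its actor emits logits, so it picks Categorical.
        return lambda logits: Categorical(logits)


class ContinuousActorFactory(ActorFactory):
    def __init__(self, action_dim: int):
        self.action_dim = action_dim

    def create_actor(self) -> Callable:
        d = self.action_dim
        return lambda obs: ([0.0] * d, [1.0] * d)  # stub actor: (mean, std)

    def create_dist_fn(self) -> Callable:
        # The factory knows its actor emits (mean, std), so it picks a Gaussian.
        return lambda out: DiagonalGaussian(mean=out[0], std=out[1])


def build_policy(factory: ActorFactory) -> Callable:
    """No dist_fn argument: the factory supplies the matching one."""
    actor = factory.create_actor()
    dist_fn = factory.create_dist_fn()
    return lambda obs: dist_fn(actor(obs))


discrete_policy = build_policy(DiscreteActorFactory(num_actions=3))
continuous_policy = build_policy(ContinuousActorFactory(action_dim=2))
print(discrete_policy([0.0]).probs())
print(continuous_policy([0.0]).sample())
```

Because `build_policy` takes no separate `dist_fn` argument, a caller cannot combine an actor with a distribution function that misreads its output, which is the breaking change described above.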