Actor and distribution function should not be specified independently #1194

@opcode81

Currently, the distribution function (e.g. for PPO) and the actor are specified independently, even though they are tightly coupled: the form of the actor's output determines which distribution function must be applied.

This can cause errors that are hard to debug, especially when the distribution function accepts the actor's output without complaint but interprets it with the wrong semantics.
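To illustrate the kind of silent mismatch meant here, the following is a minimal, library-free sketch. The function names and the example values are hypothetical and not taken from the code base; the point is only that the same actor output yields a valid but different distribution depending on how it is interpreted.

```python
import math


def probs_from_logits(outputs):
    """Softmax: the appropriate reading when the actor emits logits."""
    peak = max(outputs)
    exps = [math.exp(x - peak) for x in outputs]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def probs_from_weights(outputs):
    """Plain normalization: appropriate only for non-negative, unnormalized weights."""
    total = sum(outputs)
    return [x / total for x in outputs]


# Hypothetical actor output, intended by the actor's author as logits.
actor_output = [2.0, 1.0, 0.1]

intended = probs_from_logits(actor_output)     # matching distribution function
mismatched = probs_from_weights(actor_output)  # mismatched, but raises no error

# Both results are valid probability vectors, so nothing fails at this point;
# the policy simply samples from a different distribution than intended.
largest_gap = max(abs(a - b) for a, b in zip(intended, mismatched))
print(intended, mismatched, largest_gap)
```

Because neither interpretation raises an exception, a mismatch of this kind would typically surface only as degraded training results.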

Therefore, I propose to create a stronger link, which we can already implement at least in the high-level API:
The ActorFactory, which knows the nature of the actor's output, shall also create the matching distribution function, eliminating the issue. In turn, the dist_fn parameter can be entirely removed from the high-level API (breaking change).
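A rough sketch of what this coupling could look like is given below. It is not the library's actual API: apart from the name ActorFactory, every class, method, and helper here (create_dist_fn, DiscreteLogitsActorFactory, GaussianActorFactory, build_policy_parts, the stand-in actors and distribution records) is an assumption made for illustration, and the actors are trivial placeholders rather than neural networks.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Categorical:  # hypothetical stand-in for a discrete distribution
    probs: tuple[float, ...]


@dataclass(frozen=True)
class DiagonalGaussian:  # hypothetical stand-in for a continuous distribution
    mean: tuple[float, ...]
    std: tuple[float, ...]


class ActorFactory(ABC):
    """Creates the actor and the distribution function that matches its output."""

    @abstractmethod
    def create_actor(self) -> Callable[[Any], Any]: ...

    @abstractmethod
    def create_dist_fn(self) -> Callable[[Any], Any]: ...


class DiscreteLogitsActorFactory(ActorFactory):
    def __init__(self, num_actions: int) -> None:
        self.num_actions = num_actions

    def create_actor(self):
        # Placeholder actor: emits all-zero logits regardless of the observation.
        return lambda obs: [0.0] * self.num_actions

    def create_dist_fn(self):
        def dist_fn(logits):
            peak = max(logits)
            exps = [math.exp(x - peak) for x in logits]
            total = sum(exps)
            return Categorical(tuple(e / total for e in exps))

        return dist_fn


class GaussianActorFactory(ActorFactory):
    def __init__(self, action_dim: int) -> None:
        self.action_dim = action_dim

    def create_actor(self):
        # Placeholder actor: emits a (mean, std) pair regardless of the observation.
        return lambda obs: ([0.0] * self.action_dim, [1.0] * self.action_dim)

    def create_dist_fn(self):
        return lambda out: DiagonalGaussian(tuple(out[0]), tuple(out[1]))


def build_policy_parts(factory: ActorFactory):
    """Hypothetical high-level entry point: note the absence of a dist_fn parameter."""
    return factory.create_actor(), factory.create_dist_fn()


actor, dist_fn = build_policy_parts(DiscreteLogitsActorFactory(num_actions=3))
discrete_dist = dist_fn(actor(obs=None))

actor, dist_fn = build_policy_parts(GaussianActorFactory(action_dim=2))
gaussian_dist = dist_fn(actor(obs=None))
```

In this arrangement the user chooses only the factory, so an actor and a distribution function from different families cannot be combined by accident.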

Labels

enhancement: Feature that is not a new algorithm or an algorithm enhancement
refactoring: No change to functionality
