- I have marked all applicable categories:
  - exception-raising bug
  - RL algorithm bug
  - documentation request (i.e. "X is missing from the documentation.")
  - new feature request
- I have visited the source website
- I have searched through the issue tracker for duplicates
- I have mentioned version numbers, operating system and environment, where applicable:
I find that action noise is added in both the training and testing phases. For example, the following code is from `DDPGPolicy.forward`. Is there any reason to add action noise when evaluating the policy? And what is the best practice if I want a deterministic policy evaluation?
```python
def forward(
    self,
    batch: Batch,
    state: Optional[Union[dict, Batch, np.ndarray]] = None,
    model: str = "actor",
    input: str = "obs",
    **kwargs: Any,
) -> Batch:
    model = getattr(self, model)
    obs = batch[input]
    actions, h = model(obs, state=state, info=batch.info)
    actions += self._action_bias
    # noise is added whenever the policy is not updating,
    # i.e. during both train *and* test collection
    if self._noise and not self.updating:
        actions += to_torch_as(self._noise(actions.shape), actions)
    actions = actions.clamp(self._range[0], self._range[1])
    return Batch(act=actions, state=h)
```
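For reference, the workaround I currently use is to clear the exploration noise before test collection and restore it for training. This is a minimal sketch assuming `DDPGPolicy.set_exp_noise` and the `train_fn`/`test_fn` hooks of the off-policy trainer; `policy` stands for an already-constructed `DDPGPolicy` instance, and I am not sure whether this is the intended approach:

```python
from tianshou.exploration import GaussianNoise

# exploration noise used during training rollouts
train_noise = GaussianNoise(sigma=0.1)

def train_fn(epoch: int, env_step: int) -> None:
    # re-enable exploration noise for training collection
    policy.set_exp_noise(train_noise)

def test_fn(epoch: int, env_step: int) -> None:
    # disable noise so test episodes use the deterministic actor output
    policy.set_exp_noise(None)
```

Passing these as `train_fn=train_fn, test_fn=test_fn` to the trainer would then make test episodes deterministic while keeping exploration during training. Is this the recommended way, or is there a built-in option I am missing?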