Hi everyone, it seems that in Collector, if random is true, we sample actions from the action space using self._action_space[i].sample().
We then apply action_remap = self.policy.map_action(self.data.act) to them, just as we do to actions generated by the policy.
Is this correct? It seems to me that actions sampled from the action space are already within the environment's bounds, so clipping or squashing them and then rescaling probably changes their distribution.
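To illustrate the concern, here is a minimal NumPy sketch. The `remap` function is a hypothetical stand-in for the policy's remapping step, not the library's actual code: I am assuming it first bounds the action to [-1, 1] (by clipping or tanh) and then rescales linearly to [low, high]. The bounds of -2 and 2 are also just an example.

```python
import numpy as np


def remap(act, low, high, bound_method="clip", scaling=True):
    """Hypothetical stand-in for the policy's action remapping (an assumption)."""
    act = np.asarray(act, dtype=np.float64)
    if bound_method == "clip":
        act = np.clip(act, -1.0, 1.0)   # bound to [-1, 1] by clipping
    elif bound_method == "tanh":
        act = np.tanh(act)              # bound to (-1, 1) by squashing
    if scaling:
        # Linear map from [-1, 1] to [low, high]
        act = low + (high - low) * (act + 1.0) / 2.0
    return act


rng = np.random.default_rng(0)
low, high = -2.0, 2.0

# Stand-in for action_space.sample(): already uniform within [low, high]
sampled = rng.uniform(low, high, size=100_000)

clipped = remap(sampled, low, high, bound_method="clip")
squashed = remap(sampled, low, high, bound_method="tanh")

# Fraction of remapped random actions that end up exactly on a bound
frac_at_bounds = np.mean(np.isclose(np.abs(clipped), high))

print("fraction at bounds after clip + scale:", frac_at_bounds)
print("std of sampled / clipped / squashed:",
      sampled.std(), clipped.std(), squashed.std())
```

Under these assumptions, every sample with magnitude above 1 is pushed onto a bound by the clip variant, so the remapped random actions are no longer uniform over the action space. The tanh variant keeps them inside the bounds but still shifts mass towards the edges.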