-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Closed
Labels
enhancementFeature that is not a new algorithm or an algorithm enhancementFeature that is not a new algorithm or an algorithm enhancement
Description
In recurrent-style training, it stores data in the format of [batch_size, stack_num, ...]
. And if we want to get the first hidden state, batch[:, 0]
is the best way instead of manually and recursively get the value in the batch. (#19)
Also with the multi-agent training scenario (someone is working on it), the batch-data will store in the format of [env_num, agent_num, ...]
. If we want to get data from a specific agent's perspective, the best way is batch[:, agent_id]
.
Metadata
Metadata
Assignees
Labels
enhancementFeature that is not a new algorithm or an algorithm enhancementFeature that is not a new algorithm or an algorithm enhancement