Closed
Labels
enhancement: Feature that is not a new algorithm or an algorithm enhancement
Description
- I have marked all applicable categories:
  - exception-raising bug
  - RL algorithm bug
  - documentation request (i.e. "X is missing from the documentation.")
  - new feature request
- I have visited the source website
- I have searched through the issue tracker for duplicates
- I have mentioned version numbers, operating system and environment, where applicable:
import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)
0.5.1 0.29.0 2.0.1+cu117 1.24.4 3.11.4 | packaged by conda-forge | (main, Jun 10 2023, 18:08:17) [GCC 12.2.0] linux
Hello, I noticed that custom keys do not seem to be properly supported by the ReplayBuffer class. In my case I wanted to add a custom key, returns, to the buffer; however, the __getitem__ and add methods did not return my custom key. Here is the code to reproduce this behaviour:
from tianshou.data import Batch, ReplayBuffer
import numpy as np

batch = Batch(
    **{'obs_next': np.array([[ 1.174 , -0.1151, -0.609 , -0.5205, -0.9316, 3.236 , -2.418 ,
                               0.386 , 0.2227, -0.5117, 2.293 ]]),
       'rew': np.array([4.28125]),
       'act': np.array([[-0.3088, -0.4636, 0.4956]]),
       'truncated': np.array([False]),
       'obs': np.array([[ 1.193 , -0.1203, -0.6123, -0.519 , -0.9434, 3.32 , -2.266 ,
                          0.9116, 0.623 , 0.1259, 0.363 ]]),
       'terminated': np.array([False]),
       'done': np.array([False]),
       'returns': np.array([74.70343082])
    }
)
print("Original batch: \n", batch)

buffer_size = len(batch.rew)
buffer = ReplayBuffer(buffer_size)
buffer.add(batch)
print("Buffer: \n", buffer)

sampled_batch, _ = buffer.sample(1)
print("Sampled batch: \n", sampled_batch)
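My guess at what is happening, sketched as a toy model rather than Tianshou's actual code: if the buffer keeps only a fixed whitelist of keys when adding a transition, any extra key such as returns is silently dropped. The names RESERVED_KEYS, ToyBuffer and dropped_keys below are hypothetical and exist only for this illustration. Under the same assumption, and assuming info is among the whitelisted keys, nesting the custom value inside info might be a workaround.

```python
# Toy stand-in for a replay buffer that whitelists keys on add().
# RESERVED_KEYS and ToyBuffer are hypothetical names for illustration only;
# this is NOT Tianshou's implementation.
RESERVED_KEYS = (
    "obs", "act", "rew", "terminated", "truncated", "done", "obs_next", "info",
)


class ToyBuffer:
    def __init__(self):
        self._storage = []

    def add(self, transition):
        # Keep only whitelisted keys; anything else is silently dropped.
        kept = {k: v for k, v in transition.items() if k in RESERVED_KEYS}
        self._storage.append(kept)

    def __getitem__(self, index):
        return self._storage[index]


def dropped_keys(original, stored):
    """Return the keys present in `original` but missing from `stored`."""
    return sorted(set(original) - set(stored))


transition = {
    "obs": [1.193],
    "act": [-0.3088],
    "rew": 4.28125,
    "done": False,
    "returns": 74.70343082,  # the custom key
}

buffer = ToyBuffer()
buffer.add(transition)
print(dropped_keys(transition, buffer[0]))  # ['returns']

# Possible workaround under the same assumption:
# nest the custom value inside a whitelisted key.
nested = {k: v for k, v in transition.items() if k != "returns"}
nested["info"] = {"returns": transition["returns"]}
buffer.add(nested)
print(buffer[1]["info"]["returns"])  # 74.70343082
```

Comparing the key sets before and after add, as dropped_keys does here, would also make the reproduction above self-checking instead of relying on reading the printed output.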