-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add CachedReplayBuffer and ReplayBufferManager #278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@Trinkle23897 Could you please help me check if it is possible to separate of stack option and other abilities of ReplayBuffer? I have not done that part. And it is a little diffcult for me because I'm not familar with RNN stack usage. Thanks a lot. |
Update: resolved @duburcqa I have some trouble with the new buffer type's pickle. (I think the current code is clean but lack of unit test, will be fixed soon...) Because the CachedReplayBuffer needs to perform Here is an example: we now have a buffer that contains three sub-buffers, with each of size=10 (total=30). At this time, we add a new key-value pair called
The current implementation use I haven't figure out the solution to this issue, could you please give us some suggestions? |
Codecov Report
@@ Coverage Diff @@
## master #278 +/- ##
==========================================
+ Coverage 94.39% 94.64% +0.24%
==========================================
Files 45 45
Lines 2892 3006 +114
==========================================
+ Hits 2730 2845 +115
+ Misses 162 161 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
In my current thought it is a special case. I haven't come up with other demands or requirements similar to |
Why not then ! But may be the name is not very appropriate if it is a instance method (supposedly private) instead of a hook ? |
What are you using for profiling ? Personally I'm now using
|
Hum... that's look kind of hacky but not meaningless actually. Is it still pickle correctly after that change at instance-level ? |
I think so, but I haven't come up with other approaches better than this. (compatible with pickle)
Sure. I add the related test under |
What's your suggestion? (Also I'm not sure if the name |
What about |
I don't know, maybe just |
…self._episode_rew/len defaults to 0.0/0 instead of np.array
Your profiling graph is polluted by non-significant method calls (such as import.load modules), you should use a script running for a longer duration. I was using this tool before, but I find it less readable than |
py-spy result profiling.zip I can't find the part related to replaybuffer. Is there anything wrong? |
And yet it is here! Just use the search functionality integrated in the svg itself, and look for |
By looking at the time spend in this module, I don't think it is relevant to optimize it further x) NB: Just disable |
I think cachedbuffer is ready now. |
@duburcqa There is a question that makes me confused. import numpy as np
from tianshou.data import ReplayBuffer, CachedReplayBuffer
buf = CachedReplayBuffer(ReplayBuffer(10000), 1, 10000)
buf.add(obs=[np.random.rand(4, 84, 84)], act=[1], rew=[1], done=[1])
buf = CachedReplayBuffer(ReplayBuffer(10000), 1, 10000)
buf.add(obs=[np.random.rand(4, 84, 84)], act=[1], rew=[1], done=[1]) Repeating executing the last two lines, you'll see that the actual memory is still increasing. That's weird because theoretically the reference count of previous |
What about this ?
If there is an actual memory leak, you should be able to find the line that is responsible of it using memory_profiler. I'm using it frequently and it is quite good, even though time consuming to use. |
|
This is the second commit of 6 commits mentioned in thu-ml#274, which features minor refactor of ReplayBuffer and adding two new ReplayBuffer classes called CachedReplayBuffer and ReplayBufferManager. You can check thu-ml#274 for more detail. 1. Add ReplayBufferManager (handle a list of buffers) and CachedReplayBuffer; 2. Make sure the reserved keys cannot be edited by methods like `buffer.done = xxx`; 3. Add `set_batch` method for manually choosing the batch the ReplayBuffer wants to handle; 4. Add `sample_index` method, same as `sample` but only return index instead of both index and batch data; 5. Add `prev` (one-step previous transition index), `next` (one-step next transition index) and `unfinished_index` (the last modified index whose done==False); 6. Separate `alloc_fn` method for allocating new memory for `self._meta` when a new `(key, value)` pair comes in; 7. Move buffer's documentation to `docs/tutorials/concepts.rst`. Co-authored-by: n+e <trinkle23897@gmail.com>
This is the second commit of 6 commits mentioned in #274, which features minor refactor of ReplayBuffer and adding two new ReplayBuffer classes called CachedReplayBuffer and ReplayBufferManager. You can check #274 for more detail.
buffer.done = xxx
;set_batch
method for manually choosing the batch the ReplayBuffer wants to handle;sample_index
method, same assample
but only return index instead of both index and batch data;prev
(one-step previous transition index),next
(one-step next transition index) andunfinished_index
(the last modified index whose done==False);alloc_fn
method for allocating new memory forself._meta
when a new(key, value)
pair comes in;docs/tutorials/concepts.rst
.