Yet another 3 fix #160
Conversation
There's no difference from the previous version because …

Reshape is good, but it would be better if you could avoid shape [bsz] altogether (reward has this shape), because of its confusing behavior.

I don't agree; usually I prefer to use …
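The "confusing behavior" of shape [bsz] referred to above is most likely silent broadcasting: subtracting a [bsz, 1] tensor from a [bsz] tensor produces a [bsz, bsz] matrix instead of an error. A minimal sketch of that pitfall (variable names are illustrative, not from the codebase):

```python
import torch

reward = torch.rand(4)       # shape [bsz]
value = torch.rand(4, 1)     # shape [bsz, 1]
# Broadcasting silently expands this to shape [4, 4] instead of raising:
diff = reward - value
```

Keeping both tensors at a consistent [bsz, 1] (or both at [bsz]) avoids this class of bug.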
But a 1-dim tensor cannot apply `flatten(1)`:

```python
In [6]: b = torch.rand(3)

In [7]: b.flatten(1, -1)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-0fd10ca82bb9> in <module>
----> 1 b.flatten(1, -1)

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
```
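For contrast, a short sketch of why `reshape` covers the case that `flatten(1)` cannot: it accepts a 1-dim tensor and produces the [bsz, 1] column shape directly.

```python
import torch

b = torch.rand(3)
# b.flatten(1, -1) raises IndexError on a 1-dim tensor,
# but reshape(-1, 1) works regardless of the input's ndim:
col = b.reshape(-1, 1)  # shape [3, 1]
```

This is the practical argument for `reshape` in the discussion above: one call handles both the 1-dim and multi-dim layouts.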
OK, so your example before the edit …

`reward` is the case.

Is the reward used as input to its own neural network? In which algorithm?

I'm not talking about using reward as input, though. My suggestion is to reshape reward to be [bsz, 1] in …

I don't agree with reshaping this as [bsz, 1], since it already caused a bug previously.

This is what we were talking about initially.

Okay. This is because I wrote a small test with MyTestEnv under the …
1. DQN learn should keep eps=0
2. Add a warning of env.seed in VecEnv
3. Fix thu-ml#162 of multi-dim action
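For item 3, a hedged sketch of the multi-dim-action shape handling discussed in this thread (the tensor names and shapes are hypothetical, not the actual fix): `flatten(1)` collapses all trailing action dimensions when the batch has 2 or more dims, while `reshape` is needed to cover the 1-dim case.

```python
import torch

act = torch.rand(5, 2, 3)            # hypothetical multi-dim action batch
flat = act.flatten(1)                # shape [5, 6]; fine for ndim >= 2
vec = torch.rand(5).reshape(-1, 1)   # shape [5, 1]; reshape also covers ndim == 1
```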