This repository was archived by the owner on Aug 3, 2021. It is now read-only.

Conversation

@amoussawi
Contributor

Hello. Thanks for the great work. From the paper, the last layer is supposed to be computed as a plain affine transformation `W*x + b`, but that is not what the code does (apparently unintentionally). In the `decode()` function you check whether the current weight matrix is the last one with `kind = self._nl_type if ind != self._last else 'none'`. However, the number of weight matrices is `len(layer_sizes) - 1`, and since `ind` is the zero-based index of the current weight matrix, the last one has index `len(layer_sizes) - 2`. So `self._last` should be set as `self._last = len(layer_sizes) - 2`, not `self._last = len(layer_sizes) - 1`; with the off-by-one value, `ind != self._last` is always true, so the non-linearity is applied to every layer, including the last. A sketch of the fix is below.
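
A minimal, self-contained PyTorch sketch of the indexing issue and the fix. The names (`decode`, `weights`, `layer_sizes`) mirror the discussion, but the code is illustrative, not the repository's exact implementation:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch, not the repository's exact code. With
# L = len(layer_sizes) layers there are only L - 1 weight matrices,
# indexed 0 .. L - 2, so the last matrix has index L - 2.
layer_sizes = [16, 8, 4]                        # hypothetical sizes
weights = [torch.randn(n_out, n_in)             # one matrix per layer pair
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [torch.zeros(n) for n in layer_sizes[1:]]

last = len(layer_sizes) - 2   # fixed; the bug used len(layer_sizes) - 1

def decode(z, nl_type='sigmoid'):
    for ind, (w, b) in enumerate(zip(weights, biases)):
        z = F.linear(z, w, b)                   # affine transform W*x + b
        kind = nl_type if ind != last else 'none'
        if kind == 'sigmoid':
            z = torch.sigmoid(z)
        elif kind == 'tanh':
            z = torch.tanh(z)
        # kind == 'none': keep the raw affine output, so predicted
        # ratings are not squashed into [0, 1]
    return z

print(decode(torch.randn(2, 16)).shape)         # torch.Size([2, 4])
```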

A good test case is to run the AutoEncoder unit tests with `nl_type` set to `sigmoid`: they perform badly because the last layer produces values in the range [0, 1], which does not match the range of the ratings. After applying the fix, they perform much better.

That actually explains why the sigmoid and tanh activation functions did not perform well, as reported in the paper.

@okuchaiev self-assigned this on Oct 9, 2017
@okuchaiev
Member

Thanks @amoussawi! This is indeed a bug, and it invalidates the sigmoid and tanh results in section 3.2 of the paper. I'll need to re-run those experiments.
I'll merge your fix and will also add an option so the user can choose whether or not to skip the last layer's non-linearity.

There are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.
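
A hypothetical sketch of the opt-in flag mentioned above; the parameter name `apply_last_layer_nl` is illustrative and not necessarily what was merged into the repository:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the opt-in flag; names are illustrative.
class TinyAutoEncoder(nn.Module):
    def __init__(self, layer_sizes, nl_type='sigmoid', apply_last_layer_nl=False):
        super().__init__()
        self._nl_type = nl_type
        self._last = len(layer_sizes) - 2     # index of the final weight matrix
        self._apply_last_layer_nl = apply_last_layer_nl
        self.layers = nn.ModuleList(
            nn.Linear(n_in, n_out)
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

    def forward(self, x):
        for ind, layer in enumerate(self.layers):
            x = layer(x)                      # affine transform W*x + b
            # Skip the non-linearity on the last layer unless the user opts in.
            if ind != self._last or self._apply_last_layer_nl:
                x = torch.sigmoid(x) if self._nl_type == 'sigmoid' else torch.tanh(x)
        return x

model = TinyAutoEncoder([16, 8, 16])          # encoder + decoder in one stack
print(model(torch.randn(2, 16)).shape)        # torch.Size([2, 16])
```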

@okuchaiev merged commit fcf3161 into NVIDIA:master on Oct 9, 2017