accurate-gelu #813
Conversation
Also, I'm not sure that the CUDA kernel actually supports erf, as I can't build CUDA locally. Here's a link to the CUDA 32-bit error function.
Finally, the naming is pretty terrible. Would it make sense to make the GeLU activation an enum?

```rust
enum Gelu {
    Fast,
    Accurate,
}
```

But then probably the …
Thank you for contributing! This looks good so far, I just have a few comments.
`cargo +nightly clippy -F cuda,ci-check`
Okay @nkoppel, I updated the name. Accurate GeLU is much better, and I think distinguishing that the other GeLU is faster will help. I couldn't find an explicit citation for the fact that it's faster, but the tanh approximation seems to only require a single exponential, while the error function requires a much higher-degree polynomial and still an exponential. I also beefed up the docs. I feel like the changes to the docs for the activations don't quite fit in the code, but I'm guessing most people will use them as activations rather than as postfix operations, so I added some info there even though it breaks the nice code block.
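To make that cost comparison concrete, here is a standalone sketch of the two scalar formulas being discussed; this is not dfdx's actual kernel code, and erf is filled in with the Abramowitz & Stegun 7.1.26 approximation purely for illustration. The point is that the tanh form needs a single transcendental call, while the erf form needs a degree-5 polynomial plus an exponential.

```rust
use std::f32::consts::{FRAC_1_SQRT_2, PI};

/// Fast GeLU: the tanh approximation; the only transcendental call is tanh.
fn fast_gelu(x: f32) -> f32 {
    0.5 * x * (1.0 + ((2.0 / PI).sqrt() * (x + 0.044715 * x * x * x)).tanh())
}

/// Accurate GeLU: 0.5 * x * (1 + erf(x / sqrt(2))).
fn accurate_gelu(x: f32) -> f32 {
    0.5 * x * (1.0 + erf(x * FRAC_1_SQRT_2))
}

/// std Rust has no erf, so this uses the Abramowitz & Stegun 7.1.26
/// approximation: a degree-5 polynomial in t = 1/(1 + p*x), multiplied by
/// exp(-x^2) — the extra work relative to the tanh form.
fn erf(x: f32) -> f32 {
    let sign = x.signum();
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = t
        * (0.254829592
            + t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - poly * (-x * x).exp())
}

fn main() {
    for &x in &[-2.0f32, -0.5, 0.0, 0.5, 2.0] {
        println!("x = {x:5.2}  fast = {:8.5}  accurate = {:8.5}", fast_gelu(x), accurate_gelu(x));
    }
}
```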
src/tensor_ops/gelu/gelu.cu
Outdated
Are the changes in this file just indentation? Can you revert them if so? Just for easier reviewing 😀
Should be, and will do; my LSP automatically updated them.
Just a few little fixes to documentation and this should be good to go!
Note that I don't have write access to dfdx, I've just contributed a lot, so @coreylowman gets the final say on everything.
src/tensor_ops/fast_gelu/mod.rs
Outdated
```rust
/// See [gelu]
pub fn fast_gelu(self) -> Self {
    self.try_fast_gelu().unwrap()
}

/// See [gelu]
pub fn try_fast_gelu(self) -> Result<Self, D::Err> {
    try_unary_op(FastGeLUKernelOp, self)
}

#[deprecated(since = "0.12.0", note = "Use `fast_gelu` instead")]
pub fn gelu(self) -> Self {
    self.fast_gelu()
}

#[deprecated(since = "0.12.0", note = "Use `try_fast_gelu` instead")]
pub fn try_gelu(self) -> Result<Self, D::Err> {
    self.try_fast_gelu()
}
```
Top two methods should link to `fast_gelu`, and deprecated items should have a link to their non-deprecated counterparts.
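To make the request concrete, here is a tiny standalone sketch of the rustdoc intra-doc link pattern involved; the free functions are placeholders, not dfdx's actual methods, and the same `[fast_gelu]` form is what the `/// See [gelu]` comments above would be updated to.

```rust
// Placeholder module, only to show the intra-doc link + deprecation pattern.
pub mod gelu_docs_example {
    /// Fast (tanh-based) GeLU.
    pub fn fast_gelu(x: f32) -> f32 {
        0.5 * x * (1.0 + ((2.0 / std::f32::consts::PI).sqrt() * (x + 0.044715 * x * x * x)).tanh())
    }

    /// See [fast_gelu], the non-deprecated replacement for this function.
    #[deprecated(since = "0.12.0", note = "Use `fast_gelu` instead")]
    pub fn gelu(x: f32) -> f32 {
        fast_gelu(x)
    }
}
```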
src/tensor_ops/fast_gelu/mod.rs
Outdated
```rust
#[derive(Debug, Default, Copy, Clone)]
pub struct FastGeLUKernelOp;

/// [Fast Gaussian Linear Unit (GeLU)](https://paperswithcode.com/method/gelu). `0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))`
```
These docs should include a link to `AccurateGeLU`.
src/tensor_ops/accurate_gelu/mod.rs
Outdated
```rust
/// GeLU(x) ~ 0.5 ∗ x ∗ (1.0 + tanh(sqrt(2.0/π) ∗ (x + 0.044715 ∗ x^3)))
/// ```
///
/// See [gelu](crate::tensor_ops::gelu::gelu) to use this approximation
```
This link needs to be fixed with the new naming
```rust
#[deprecated(since = "0.12.0", note = "please use `FastGeLU` instead")]
#[derive(Default, Debug, Clone, Copy)]
pub struct GeLU;
```
Needs to link to its non-deprecated counterpart.
link is 3 lines up
@nkoppel alright, should be addressed.
This looks good to me - any other updates planned here?
I'm not planning on anything new. Just waiting on confirmation from @nkoppel that everything is addressed.
We can open another PR if there's something else to add/change!
This PR adds an accurate GeLU function (used at least in the GPT2 model from huggingface; see Issue #804). Importantly, in order to make the operators generic over `Dtype`, I introduce an `Erf` trait that allows us to call `d_type.erf()` to get the error function of the value `d_type`. I'm currently getting a compile error with feature `cuda`
, and I have a hunch that this trait might be the issue. I'm having trouble debugging the build further as I don't have the CUDA headers anywhere (working on a mac), so I would appreciate it if someone more familiar with the code could point out the error.

For Github: Resolves #804
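The description mentions the new `Erf` trait but doesn't quote it, so here is a minimal sketch of the general idea, generic over the element type. The trait name follows the description, but the bounds, the generic `accurate_gelu` helper, and the use of the `libm` crate for the concrete erf values are all illustrative assumptions rather than dfdx's actual code.

```rust
// Sketch of an erf abstraction over the element type; the real trait may
// differ in bounds and in where the implementations come from. The bodies
// lean on the `libm` crate (add it to Cargo.toml) so the example runs.
trait Erf {
    fn erf(self) -> Self;
}

impl Erf for f32 {
    fn erf(self) -> Self {
        libm::erff(self)
    }
}

impl Erf for f64 {
    fn erf(self) -> Self {
        libm::erf(self)
    }
}

/// Accurate GeLU written against the trait, so the same code works for any
/// dtype that knows its error function.
fn accurate_gelu<T>(x: T) -> T
where
    T: Erf + Copy + std::ops::Add<Output = T> + std::ops::Mul<Output = T> + From<f32>,
{
    let half: T = T::from(0.5);
    let one: T = T::from(1.0);
    let inv_sqrt2: T = T::from(std::f32::consts::FRAC_1_SQRT_2);
    half * x * (one + (x * inv_sqrt2).erf())
}

fn main() {
    println!("{}", accurate_gelu(1.0f32));
    println!("{}", accurate_gelu(1.0f64));
}
```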