
Conversation

ViliamVadocz
Contributor

@ViliamVadocz ViliamVadocz commented Feb 4, 2023

I implemented my suggestion of converting to long long int in the atomic max/min functions. I could not test it since the Rust portion of the code does not compile.

I also added a missing header that is needed to use __half.
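
For readers unfamiliar with the trick, here is a minimal host-side C++ sketch of how a floating-point atomic max can be built from integer atomics by reinterpreting the double's bits as a 64-bit integer. It is not the code from this PR: all names (`double_to_bits`, `fetch_max_signed`, `fetch_min_unsigned`, `atomic_max_double`) are hypothetical, the two integer helpers are compare-and-swap stand-ins for what on the device would presumably be the built-in integer `atomicMax`/`atomicMin` on `long long int` and `unsigned long long int`, and NaN inputs are not handled.

```cpp
#include <atomic>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a double as a 64-bit integer and back.
static std::uint64_t double_to_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

static double bits_to_double(std::uint64_t u) {
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// View the same 64 bits as a signed integer.
static std::int64_t as_signed(std::uint64_t u) {
    std::int64_t s;
    std::memcpy(&s, &u, sizeof s);
    return s;
}

// Stand-in for an integer atomic max that compares the bits as SIGNED.
// Returns the value that was stored before the call.
static std::uint64_t fetch_max_signed(std::atomic<std::uint64_t>& cell,
                                      std::uint64_t val) {
    std::uint64_t old = cell.load();
    while (as_signed(old) < as_signed(val) &&
           !cell.compare_exchange_weak(old, val)) {
        // On failure `old` is refreshed with the current value; retry.
    }
    return old;
}

// Stand-in for an integer atomic min that compares the bits as UNSIGNED.
static std::uint64_t fetch_min_unsigned(std::atomic<std::uint64_t>& cell,
                                        std::uint64_t val) {
    std::uint64_t old = cell.load();
    while (old > val && !cell.compare_exchange_weak(old, val)) {
    }
    return old;
}

// Atomic max on a double stored as raw bits (hypothetical helper).
// Non-negative doubles order the same way as their bits read as signed
// integers, so a signed integer max works. For negative doubles a larger
// unsigned bit pattern means a more negative number, so an unsigned
// integer min selects the larger double.
static double atomic_max_double(std::atomic<std::uint64_t>& cell, double val) {
    std::uint64_t bits = double_to_bits(val);
    std::uint64_t old = std::signbit(val) ? fetch_min_unsigned(cell, bits)
                                          : fetch_max_signed(cell, bits);
    return bits_to_double(old);
}
```

The point of widening to a 64-bit integer is that the comparison must cover all 64 bits of the double; a 32-bit `int` reinterpretation only fits `float`. An atomic min for doubles follows by swapping the two branches.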

@coreylowman
Owner

Thanks!

@coreylowman coreylowman merged commit d6887c5 into coreylowman:num-traits-float Feb 6, 2023
coreylowman added a commit that referenced this pull request Feb 13, 2023
* impl Cpu kernels for num_traits::Float

* Update examples

* Fixing cuda issues

* Add dtype skeleton to cuda kernels

* Working commit of TestDtype & cuda kernels

* Adding test-f64 feature

* Temp commit

* Moving more things to generic dtype

* Possibly fix max_to and min_to for doubles (#431)

* Temp commit

* Fixing optim tests

* Compiling

* Tests passing for Cpu

* Cuda compiling

* f64 cuda tests passing

* Fixing warnings

* Requiring minimum compute_60

* f64 nightly tests passing for cpu

* pool2d f64 kernels

* Conv & pool passing on cuda

* Making nn/dropout.rs generic over dtype

* Styling & using DeviceBuildExt

* Cleanup optimizers

* Minifying cuda kernels

* Adding f64 cmp kernels

* Removing macros in optim

* Reduce macro usage in cuda kernels

* minify cuda kernel impls

* Fixing docs

---------

Co-authored-by: Viliam Vadocz <viliam.vadocz@gmail.com>