-
-
Notifications
You must be signed in to change notification settings - Fork 308
Add ARMv6 version of slide_hash #1538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #1538 +/- ##
===========================================
+ Coverage 83.88% 83.91% +0.03%
===========================================
Files 132 133 +1
Lines 10843 10882 +39
Branches 2801 2805 +4
===========================================
+ Hits 9096 9132 +36
- Misses 1048 1049 +1
- Partials 699 701 +2 Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
|
This looks pretty good to me, but I have not tested it myself. @ccawley2011 What about your other pending PRs? Should we try to find someone else to come in and finish them up for inclusion? |
|
As I've said earlier, we support all vector instruction sets, no matter if 32-bit wide or wider... As such I have nothing against adding support for optimizations using After adding |
|
CMake tests seem to fail even though configure tests pass: |
|
I also tried compiling for 32-bit ARMv8-A and short-circuiting the feature test: |
7c91e3e to
f25c582
Compare
da62d60 to
1ccb42a
Compare
|
This should be ready to merge now. |
|
This PR seems to fail: |
1ccb42a to
1f44293
Compare
This was because configure checked if |
nmoinvaz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed with one change request, after that I think it is ready for approval.
1f44293 to
2e7aa08
Compare
nmoinvaz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
|
||
| static inline uint16x2_t __uqsub16(uint16x2_t __a, uint16x2_t __b) { | ||
| uint16x2_t __c; | ||
| __asm__ __volatile__("uqsub16\t%0, %1, %2" : "=r" (__c) : "r"(__a), "r"(__b)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ccawley2011 is it intentional that \t follows uqsub16? I noticed that the CMake checks use a space.
This uses the
__uqsub16instruction from newer versions ofarm_acle.h, with fallbacks for MSVC and older GCC.Results from
benchmark_zlibon a Raspberry Pi 1B: