
Faster signatures w/o timing attack issue #268


Open
wants to merge 11 commits into
base: master


@linegel linegel commented Mar 15, 2025

On M4 Max (MBP Nov 2024), tested in Chrome Version 134.0.6998.89 - arm64:

THIS PR

| Operation | Operations | Time per operation | Operations per second | Delta |
| --- | --- | --- | --- | --- |
| sign | 787 ops | 0.64 ms/op | 1573.69 ops/sec | [+91.1%] |
| sign.open | 8641 ops | 0.06 ms/op | 17261.29 ops/sec | [+4067.3%] |

PR #134

| Operation | Operations | Time per operation | Operations per second | Delta |
| --- | --- | --- | --- | --- |
| sign | 889 ops | 0.56 ms/op | 1775.51 ops/sec | [+115.6%] |
| sign.open | 454 ops | 1.10 ms/op | 906.55 ops/sec | [+118.9%] |

CURRENT IMPLEMENTATION

| Operation | Operations | Time per operation | Operations per second | Delta |
| --- | --- | --- | --- | --- |
| sign | 412 ops | 1.21 ms/op | 823.34 ops/sec | - |
| sign.open | 208 ops | 2.41 ms/op | 414.18 ops/sec | - |

The migration from Float64Array to Uint32Array preserves constant-time behavior because:

  1. Explicit Carry Chain Management
  • TweetNaCl's implementation never relies on implicit type behavior for overflow
  • All carries are explicitly calculated and propagated (e.g., carry = Math.floor((x[j] + 128) / 256))
  • Intermediate values are explicitly reduced with bitwise operations (e.g., x[j] &= 255)
  2. Uniform Data Access Patterns
  • Uint32Array provides consistent memory access timing regardless of value magnitude
  • Unlike regular JavaScript arrays, typed arrays don't change their internal representation based on value size
  • All operations on Uint32Array elements have constant-time behavior within the 32-bit range
  3. Modular Arithmetic Implementation
  • All large integer operations implement modular arithmetic with explicit reduction steps
  • Example from multiplication: v = t0 + c + 65535; c = Math.floor(v / 65536); t0 = v - c * 65536
  • This ensures values stay within defined bounds regardless of array type
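
The carry step quoted in the last point can be sketched in isolation. This is only an illustration built around that one expression, not the code from this PR: the name `carryPass` is hypothetical, and the final fold assumes arithmetic modulo 2^255 - 19 with sixteen 16-bit limbs, where 2^256 ≡ 38.

```javascript
// Hypothetical helper (the name is an assumption): one carry pass over
// 16 limbs, leaving each limb in the range 0..65535 before the final fold.
function carryPass(limbs) {
  // c stores (true carry + 1), so adding c + 65535 is the same as adding
  // the true carry plus a bias of 65536.
  let c = 1;
  for (let i = 0; i < 16; i++) {
    const v = limbs[i] + c + 65535;
    c = Math.floor(v / 65536);
    limbs[i] = v - c * 65536;
  }
  // Fold the carry out of the top limb back into limb 0.
  // Assumes the field is 2^255 - 19, where 2^256 is congruent to 38.
  limbs[0] += (c - 1) + 37 * (c - 1);
  return limbs;
}

const limbs = new Float64Array(16);
limbs[0] = 65536 + 5;          // one bit too many for a 16-bit limb
carryPass(limbs);
console.log(limbs[0], limbs[1]); // limbs[0] is now 5, limbs[1] is 1
```

Note that the sketch uses a Float64Array on purpose: it shows that the carry logic itself is explicit, but says nothing about whether the values fed into it fit in 32 bits.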

Evgenii Popov added 3 commits March 16, 2025 01:02
Huge increase of performance in signing/verifying WITHOUT timing variations that could leak information about secret keys (that made PR dchest#134 unmergeable)

┌───────────────────────────────────────────────────────────────────────────────┐
│                        PERFORMANCE IMPROVEMENTS SUMMARY                        │
├───────────────────────────────────────────────────────────────────────────────┤
│ • Signing: ~2x faster on M4 Max compared to current implementation             │
│   (slightly slower than previous solution)                                     │
│                                                                               │
│ • Verification: ~40x faster than current implementation                        │
│   (~20x faster than previous PR dchest#134 solution)                                 │
│                                                                               │
├───────────────────────────────────────────────────────────────────────────────┤
│                             BENCHMARK RESULTS                                  │
├─────────────────────────────────────────────────────────────────────────────────────────────┤
│ On M4 Max (MBP Nov 2024), tested in Chrome Version 134.0.6998.89 - arm64:                   │
│                                                                                             │
│ THIS PR                                                                                     │
│ sign                 787 ops           0.64 ms/op      1573.69 ops/sec      [+91.1%]        │
│ sign.open           8641 ops           0.06 ms/op     17261.29 ops/sec      [+4067.3%]      │
│                                                                                             │
│ PR dchest#134 with risk of timing attacks                                                         │
│ sign                 889 ops           0.56 ms/op      1775.51 ops/sec      [+115.6%]       │
│ sign.open            454 ops           1.10 ms/op       906.55 ops/sec      [+118.9%]       │
│                                                                                             │
│ CURRENT IMPLEMENTATION                                                                      │
│ sign                 412 ops           1.21 ms/op       823.34 ops/sec      -               │
│ sign.open            208 ops           2.41 ms/op       414.18 ops/sec      -               │
└─────────────────────────────────────────────────────────────────────────────────────────────┘

linegel (Author) commented Mar 15, 2025

I dedicate any and all copyright interest in this software to the
public domain. I make this dedication for the benefit of the public at
large and to the detriment of my heirs and successors. I intend this
dedication to be an overt act of relinquishment in perpetuity of all
present and future rights to this software under copyright law.

Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.

dchest (Owner) commented Mar 16, 2025

Unfortunately, Uint32Array can overflow here, because the intermediate values need more than 32 bits. See #187 for a similar discussion and a quick test.

The tests in this PR also fail if you run:

NACL_SRC=nacl-fast.js npm test
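
The overflow concern can be reproduced outside the library. The following is a minimal sketch under stated assumptions, not the test from #187: it assumes 16-bit limbs and accumulates 16 limb products into one element, which is roughly the shape of one column of a schoolbook multiplication. The variable names are illustrative only.

```javascript
// Largest possible 16-bit limb values.
const a = 65535;
const b = 65535;

const asFloat = new Float64Array(1); // holds integers exactly up to 2^53
const asUint = new Uint32Array(1);   // stores values modulo 2^32

// Accumulate 16 products, as one column of a 16-limb multiplication might.
for (let i = 0; i < 16; i++) {
  asFloat[0] += a * b;
  asUint[0] += a * b;
}

const exact = 16 * a * b;
console.log(asFloat[0] === exact);          // true: no precision lost
console.log(asUint[0] === exact % 2 ** 32); // true: the sum wrapped around
console.log(asUint[0] === exact);           // false: high bits are gone
```

The wraparound happens silently on each store, so an explicit carry step that runs afterwards has no way to recover the lost high bits.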

Evgenii Popov and others added 5 commits March 16, 2025 17:58
…arking

Restructured workflow into specialized jobs for targeted testing and benchmarking
Added separate test jobs for nacl.js and nacl-fast.min.js
Implemented comprehensive benchmarking with detailed performance comparisons
Added visual performance indicators (🟢 improvements, 🔴 regressions, ⚪ neutral)
Updated Node.js from v16 to v22
Added test artifacts for better debugging and analysis
Enhanced PR comments with detailed benchmark comparisons and performance metrics
Added contextual notes about benchmark variation thresholds
Enhance GitHub Actions workflow with comprehensive testing and benchmarking