Correctly apply global scale in nvfp4 quantization. #30467

jreiffers · 2025-07-24T07:26:35Z

Currently, the global scale is applied to the block scales but not to the input tensor itself. With the global scale logic from the test, this leads to all values being rounded to zero in most cases.

The existing tests did not catch this because they use matrix multiplications on dequantized tensors as the reference. With every value quantized to zero, they ended up just comparing 0@0 to 0@0.

Instead of adjusting the matmul tests, I opted to add test coverage for quantization logic directly, since that doesn't require carefully chosen tolerances in most cases.
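The blind spot of a matmul-based reference can be sketched in a few lines of plain Python. Everything here is illustrative: the `matmul` helper, the `kernel_out` value, and the `0.25` tolerance are assumptions made for the example, not code or numbers from the test suite.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


original = [[0.3, -0.7], [0.5, 0.2]]
# What a broken quantize/dequantize round trip would hand back.
dequantized = [[0.0, 0.0], [0.0, 0.0]]
# Hypothetical output of the fused kernel when fed the same all-zero quantized data.
kernel_out = [[0.0, 0.0], [0.0, 0.0]]

# Matmul-based check: the reference is built from the dequantized tensors,
# so it is all zeros too and the comparison passes.
matmul_check_passes = kernel_out == matmul(dequantized, dequantized)

# Direct check: compare the dequantized tensor against the original input.
max_error = max(
    abs(o - d)
    for row_o, row_d in zip(original, dequantized)
    for o, d in zip(row_o, row_d)
)
direct_check_passes = max_error <= 0.25  # hypothetical tolerance
```

Here `matmul_check_passes` is true even though the quantized data carries no information, while `direct_check_passes` is false because the round trip lost the input.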

jreiffers · 2025-07-24T07:30:25Z

jax/_src/cudnn/scaled_matmul_stablehlo.py

    SCALE_MAX = dtypes.finfo(config.scale_type).max.astype(x.dtype)

-    scales_q = jnp.clip(scales / config.global_scale, 0, SCALE_MAX)
+    x /= config.global_scale
+    scales_q = jnp.clip(get_scales_per_block(x), 0, SCALE_MAX)
    scales_q = lax.optimization_barrier(scales_q.astype(config.scale_type))
    scaled_x = x / scales_q.astype(np.float32)


(The x in the scaled_x line is where we previously didn't apply the global scale.)
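To make the effect of the change concrete, here is a minimal pure-Python sketch of both code paths on a single 16-element block. It is not the code from scaled_matmul_stablehlo.py. The `block_scale` helper is a hypothetical stand-in for `get_scales_per_block` (assumed here to be the block's absolute maximum divided by the FP4 maximum), and the global scale is assumed to be chosen as `amax / (448 * 6)`, which may differ from what the test actually does.

```python
# Illustrative sketch only; names and scale choices are assumptions.
FP4_MAX = 6.0      # largest finite float4 E2M1 magnitude
SCALE_MAX = 448.0  # largest finite float8 E4M3 magnitude
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # non-negative E2M1 values


def round_to_fp4(v):
    """Round to the nearest E2M1 magnitude, keeping the sign (no special tie rule)."""
    clamped = min(abs(v), FP4_MAX)
    mag = min(FP4_GRID, key=lambda g: abs(g - clamped))
    return -mag if v < 0 else mag


def block_scale(block):
    """Hypothetical stand-in for get_scales_per_block: block amax / FP4_MAX."""
    return max(abs(v) for v in block) / FP4_MAX


def quantize_block(block, global_scale, apply_global_scale_to_x):
    if apply_global_scale_to_x:
        # New behaviour: scale the input first, then derive the block scale from it.
        block = [v / global_scale for v in block]
        scale_q = min(max(block_scale(block), 0.0), SCALE_MAX)
    else:
        # Old behaviour: only the block scale sees the global scale.
        scale_q = min(max(block_scale(block) / global_scale, 0.0), SCALE_MAX)
    return [round_to_fp4(v / scale_q) for v in block], scale_q


x = [0.1 * i - 0.8 for i in range(16)]  # one block, values roughly in [-0.8, 0.7]
global_scale = max(abs(v) for v in x) / (SCALE_MAX * FP4_MAX)  # assumed choice

old_q, old_scale = quantize_block(x, global_scale, apply_global_scale_to_x=False)
new_q, new_scale = quantize_block(x, global_scale, apply_global_scale_to_x=True)
# old_q contains only zeros; new_q uses the FP4 range up to magnitude 6.
```

In both paths the block scale comes out near 448. The difference is that the old path divides the unscaled x (magnitude at most 0.8) by it, which lands far below the smallest nonzero FP4 value of 0.5, while the new path divides the globally scaled x and so fills the representable range.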

jreiffers requested a review from IvyZX July 25, 2025 06:05
