Description
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
source
TensorFlow version
2.19.0
Custom code
Yes
OS platform and distribution
No response
Mobile device
No response
Python version
Python 3.12
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
When tf.raw_ops.NotEqual is called with two tensors whose shapes are not broadcastable, the behavior is inconsistent between the CPU and GPU implementations.
The GPU correctly identifies the invalid input and raises an InvalidArgumentError, which is the expected behavior for a mathematically invalid operation.
The CPU, however, raises no error and returns a misleading scalar value, tf.Tensor(True, shape=(), dtype=bool). Both calls pass incompatible_shape_error=False, so the flag alone does not explain the difference.
This violates device consistency: the same operation with the same inputs should yield the same result or the same error on every device.
The GPU's strict handling is preferable because failing loudly on invalid input prevents silent bugs and hard-to-debug numerical issues. The CPU implementation should be updated to match it and raise an error when given non-broadcastable shapes for this operation.
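To make "not broadcastable" concrete, here is a minimal pure-Python sketch of the usual NumPy-style broadcasting rule applied to the two shapes from the reproduction. It is only an illustration and not TensorFlow's implementation; `broadcast_shape` is a hypothetical helper name introduced for this example.

```python
from itertools import zip_longest


def broadcast_shape(shape_x, shape_y):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Hypothetical helper: dimensions are compared right to left, and a pair
    is compatible when the sizes are equal or one of them is 1.
    """
    result = []
    pairs = zip_longest(reversed(shape_x), reversed(shape_y), fillvalue=1)
    for dim_x, dim_y in pairs:
        if dim_x != dim_y and dim_x != 1 and dim_y != 1:
            raise ValueError(
                f"shapes {tuple(shape_x)} and {tuple(shape_y)} are not broadcastable"
            )
        # When one side is 1, the other side determines the output size.
        result.append(dim_y if dim_x == 1 else dim_x)
    return tuple(reversed(result))


# Shapes from the reproduction: the 4 in x lines up against the 3 in y.
try:
    broadcast_shape((4, 1), (1, 28, 2, 3, 2))
except ValueError as e:
    print("Rejected:", e)

# A compatible pair for comparison.
print(broadcast_shape((4, 1), (1, 28, 2, 4, 2)))
```

Aligned from the right, x.shape = (4, 1) meets y.shape = (1, 28, 2, 3, 2) at the pair (4, 3), which is neither equal nor contains a 1, so the shapes cannot be broadcast.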
Standalone code to reproduce the issue
import numpy as np
import tensorflow as tf

# Set seed for reproducibility
np.random.seed(202)

# Generate input tensors with non-broadcastable shapes
# x.shape = (4, 1)
# y.shape = (1, 28, 2, 3, 2)
x = np.random.uniform(-32767., 127., size=(4, 1)).astype(np.float32)
y = np.random.uniform(0., 89., size=(1, 28, 2, 3, 2)).astype(np.float32)

# Convert to TensorFlow tensors
x_tensor = tf.constant(x, dtype=tf.float32)
y_tensor = tf.constant(y, dtype=tf.float32)

# --- CPU Execution ---
# This runs without error and produces a misleading result
try:
    with tf.device("/CPU:0"):
        result_cpu = tf.raw_ops.NotEqual(
            x=x_tensor,
            y=y_tensor,
            incompatible_shape_error=False,
            name="selu_cpu",
        )
    print("CPU Result:", result_cpu)
except Exception as e:
    print("CPU Error:", e)

# --- GPU Execution ---
# This correctly fails with an InvalidArgumentError
try:
    with tf.device("/GPU:0"):
        result_gpu = tf.raw_ops.NotEqual(
            x=x_tensor,
            y=y_tensor,
            incompatible_shape_error=False,
            name="selu_gpu",
        )
    print("GPU Result:", result_gpu)
except Exception as e:
    print("\nGPU Error:", e)
Relevant log output
**CPU Output:**
CPU Result: tf.Tensor(True, shape=(), dtype=bool)
**GPU Output:**
GPU Error: {{function_node __wrapped__NotEqual_device_/job:localhost/replica:0/task:0/device:GPU:0}} required broadcastable shapes [Op:NotEqual] name: selu_gpu