Description
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
source
TensorFlow version
2.19.0
Custom code
Yes
OS platform and distribution
No response
Mobile device
No response
Python version
Python 3.12
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
When tf.raw_ops.NotEqual is called with two tensors whose shapes are
not broadcastable, the behavior is inconsistent between the CPU and GPU
implementations.
- The GPU correctly identifies the invalid input and raises an InvalidArgumentError, which is the expected behavior for a mathematically invalid operation.
- The CPU, however, fails silently and returns a misleading scalar value (tf.Tensor(True, shape=(), dtype=bool)), even when the incompatible_shape_error=False flag is used.
This violates the principle of device consistency, where the same
operation with the same inputs should yield the same result or error
across all devices. The GPU's strict error handling is preferable as it
prevents silent bugs in user code.
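For reference, the shapes in this report fail the usual right-aligned broadcasting rule, under which two dimensions are compatible only when they are equal or one of them is 1. The following is a minimal pure-Python sketch of that rule. `shapes_broadcastable` is a hypothetical helper written for illustration; it is not a TensorFlow API and does not claim to mirror TensorFlow's internal shape check.

```python
from itertools import zip_longest


def shapes_broadcastable(shape_a, shape_b):
    """Return True if the two shapes satisfy the right-aligned broadcasting rule.

    Hypothetical helper for illustration only, not part of TensorFlow.
    """
    # Compare dimensions starting from the trailing axis; a missing
    # leading dimension is treated as size 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True


# Shapes from this report: the second-to-last axes are 4 and 3, which conflict.
print(shapes_broadcastable((4, 1), (1, 28, 2, 3, 2)))  # False
# A compatible pair for comparison.
print(shapes_broadcastable((4, 1), (1, 4)))  # True
```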
Expected Behavior
The behavior should be consistent across all devices. The most correct
and safest behavior would be for both the CPU and GPU to raise an
InvalidArgumentError.
Failing loudly on invalid inputs is crucial for preventing silent errors
and difficult-to-debug numerical issues. The CPU implementation should
be updated to match the GPU's stricter and more correct behavior of
erroring out when presented with non-broadcastable shapes for this
operation.
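One way to check for this class of inconsistency is to run the same call on each device and compare the outcomes, treating a raised exception as part of the outcome. The sketch below is plain Python and does not call TensorFlow. `run_and_capture` and `outcomes_consistent` are hypothetical helpers, and `fake_cpu` and `fake_gpu` are stand-ins that only mimic the CPU and GPU behavior described in this report.

```python
def run_and_capture(fn):
    """Run fn() and return ("ok", value) or ("error", exception type name)."""
    try:
        return ("ok", fn())
    except Exception as exc:
        return ("error", type(exc).__name__)


def outcomes_consistent(outcomes):
    """Return True if every device either succeeded or every device failed."""
    kinds = {kind for kind, _ in outcomes.values()}
    return len(kinds) == 1


def fake_cpu():
    # Stand-in for the reported CPU behavior: silently returns a scalar.
    return True


def fake_gpu():
    # Stand-in for the reported GPU behavior; ValueError is used here in
    # place of TensorFlow's InvalidArgumentError.
    raise ValueError("required broadcastable shapes")


outcomes = {
    "CPU": run_and_capture(fake_cpu),
    "GPU": run_and_capture(fake_gpu),
}
print(outcomes)
print("consistent:", outcomes_consistent(outcomes))
```

In an actual check, each callable would wrap the tf.raw_ops.NotEqual call inside the corresponding tf.device(...) block, as in the reproduction script.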
Standalone code to reproduce the issue
import numpy as np
import tensorflow as tf
# Set seed for reproducibility
np.random.seed(202)
# Generate input tensors with non-broadcastable shapes
# x.shape = (4, 1)
# y.shape = (1, 28, 2, 3, 2)
x = np.random.uniform(-32767., 127., size=(4, 1)).astype(np.float32)
y = np.random.uniform(0., 89., size=(1, 28, 2, 3, 2)).astype(np.float32)
# Convert to TensorFlow tensors
x_tensor = tf.constant(x, dtype=tf.float32)
y_tensor = tf.constant(y, dtype=tf.float32)
# --- CPU Execution ---
# This runs without error and produces a misleading result
try:
    with tf.device("/CPU:0"):
        result_cpu = tf.raw_ops.NotEqual(
            x=x_tensor,
            y=y_tensor,
            incompatible_shape_error=False,
            name="selu_cpu",
        )
    print("CPU Result:", result_cpu)
except Exception as e:
    print("CPU Error:", e)
# --- GPU Execution ---
# This correctly fails with an InvalidArgumentError
try:
    with tf.device("/GPU:0"):
        result_gpu = tf.raw_ops.NotEqual(
            x=x_tensor,
            y=y_tensor,
            incompatible_shape_error=False,
            name="selu_gpu",
        )
    print("GPU Result:", result_gpu)
except Exception as e:
    print("\nGPU Error:", e)
Relevant log output
**CPU Output:**
CPU Result: tf.Tensor(True, shape=(), dtype=bool)
**GPU Output:**
GPU Error: {{function_node __wrapped__NotEqual_device_/job:localhost/replica:0/task:0/device:GPU:0}} required broadcastable shapes [Op:NotEqual] name: selu_gpu