Open
Description
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
binary
TensorFlow version
tf-nightly-2.21.0.dev20250722
Custom code
No
OS platform and distribution
Ubuntu 20.04
Mobile device
no
Python version
3.11
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
12.8.1/9.8
GPU model and memory
RTX 5080 16 GB
Current behavior?
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1753232013.685876 24341 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
CUDA_ERROR_INVALID_HANDLE
Standalone code to reproduce the issue
import tensorflow as tf
import numpy as np
from tensorflow.python.client import device_lib
import keras
print("Keras version: ", keras.__version__)
print(device_lib.list_local_devices())
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
tensor = tf.convert_to_tensor(x)
print("Tensor: ", tensor)
# Define the model
input_data = keras.Input(shape=(8, 1))
# Data encoder
dx = keras.layers.Dense(16, activation='relu')(input_data)
print("dx", dx.shape)
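The traceback in the log output ends in an `Op:Cast` placed on `/device:GPU:0`, so the Keras model may not be needed to trigger the failure. Below is a minimal sketch for narrowing it down, assuming (not confirmed in this report) that a plain `tf.cast` on the GPU fails the same way. `cast_on` and `compare_devices` are hypothetical helper names, and the int64 to float32 dtypes are an illustrative choice rather than the dtypes Keras used.

```python
def cast_on(device: str) -> list:
    """Run a single Cast op pinned to `device` and return the result as a list."""
    # Imported lazily so the helper can be defined even where TensorFlow is absent.
    import tensorflow as tf

    with tf.device(device):
        x = tf.constant([[1, 2, 3]], dtype=tf.int64)  # illustrative input
        return tf.cast(x, tf.float32).numpy().tolist()


def compare_devices(devices=("/CPU:0", "/GPU:0")) -> dict:
    """Try the same Cast on each device and record success or the exception type."""
    results = {}
    for device in devices:
        try:
            results[device] = f"ok: {cast_on(device)}"
        except Exception as exc:  # broad on purpose: this is only a diagnostic
            results[device] = f"failed: {type(exc).__name__}"
    return results


if __name__ == "__main__":
    for device, outcome in compare_devices().items():
        print(device, "->", outcome)
```

If the CPU entry succeeds while the GPU entry fails, that would suggest the problem lies in the GPU kernel path rather than in Keras itself.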
Relevant log output
Keras version: 3.10.0.dev2025072204
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1753232219.958168 25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1753232220.029881 25258 gpu_device.cc:2020] Created device /device:GPU:0 with 11546 MB memory: -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
W0000 00:00:1753232220.033014 25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
W0000 00:00:1753232220.035555 25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1753232220.037159 25258 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11546 MB memory: -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13363326776403279234
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 12107055104
locality {
  bus_id: 1
  links {
  }
}
incarnation: 7207726466463925696
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0"
xla_global_id: 416903419
]
Tensor: tf.Tensor(
[[1 2 3]
 [4 5 6]
 [7 8 9]], shape=(3, 3), dtype=int64)
2025-07-23 10:57:00.115747: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'
2025-07-23 10:57:00.115757: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-07-23 10:57:00.115761: W tensorflow/core/framework/op_kernel.cc:1842] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-07-23 10:57:00.115766: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
Traceback (most recent call last):
  File "/home/mike/catkin_ws2/src/mypy311/scripts/tftest.py", line 19, in <module>
    dx = keras.layers.Dense(16, activation='relu')(input_data)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mike/PycharmProjects/py311/.venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/mike/PycharmProjects/py311/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/core.py", line 152, in convert_to_tensor
    return tf.cast(x, dtype)
           ^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__Cast_device_/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:Cast] name:
Process finished with exit code 1
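
Reading the log in order, the first failing driver call appears to be `cuModuleLoadData` with `CUDA_ERROR_INVALID_PTX`. The later `CUDA_ERROR_INVALID_HANDLE` errors plausibly follow from it, since no module was loaded for `cuModuleGetFunction` and `cuLaunchKernel` to use. The sketch below extracts that chain from the warning lines quoted above. `failure_chain` is a hypothetical helper written for this report, not part of TensorFlow.

```python
import re

# Matches fragments such as: 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'
FAILURE = re.compile(r"'(cu\w+)\(.*?\)' failed with '(CUDA_ERROR_\w+)'")

# Three warning lines copied verbatim from the log output in this report.
LOG = """\
2025-07-23 10:57:00.115747: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'
2025-07-23 10:57:00.115757: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-07-23 10:57:00.115761: W tensorflow/core/framework/op_kernel.cc:1842] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
"""


def failure_chain(log_text: str) -> list:
    """Return (driver_call, error_code) pairs in first-seen order, without repeats."""
    chain = []
    for line in log_text.splitlines():
        match = FAILURE.search(line)
        if match:
            pair = (match.group(1), match.group(2))
            if pair not in chain:
                chain.append(pair)
    return chain


if __name__ == "__main__":
    for call, error in failure_chain(LOG):
        print(f"{call}: {error}")
```

The first pair in the returned list is the one worth investigating first; the `CUDA_ERROR_INVALID_HANDLE` entries are likely symptoms rather than the root cause.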