Question

I have an RTX 4070 in my Dell XPS laptop, which also has Intel Iris Xe integrated graphics. I am using Windows 11.

I have NVIDIA Graphics Driver Version 535.98 installed, which supports CUDA versions up to 12.2. I need to use CUDA 11.2, and that's what I have installed on my system.

Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

is what I get with nvcc --version.

I am working on a project that compiles to an exe. The exe depends on multiple DLLs, including tensorflow_cc.dll, since the neural network code is based on TensorFlow. When I run the application, I receive the following error:

E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

The application then proceeds to run successfully from there on (although a lot slower, since it is running on the CPU).

I am able to run the DeviceQuery sample from NVIDIA successfully. DeviceQuery detects 1 CUDA capable device (Device0 = NVIDIA GeForce RTX 4070 Laptop GPU) with CUDA Driver Version 12.2 and CUDA Runtime Version 11.2 as expected.

I have tried everything I could to debug this issue. I reinstalled CUDA 11.2 as well as the NVIDIA drivers, but with no success. I know that this question has been asked and answered multiple times before, but nothing works for me. I have already set the CUDA_VISIBLE_DEVICES environment variable to 0.
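To see what the driver reports inside an arbitrary process, and not only inside DeviceQuery, the same cuInit call that fails in the log can be made directly from Python. This is a minimal sketch, assuming Python is available on the machine; `describe` and `probe_driver` are hypothetical helper names, and only the two CUresult codes relevant here (0 and 100, as defined in cuda.h) are named.

```python
import ctypes
import os
import sys

# CUresult values from cuda.h for the two outcomes relevant here.
CU_RESULTS = {0: "CUDA_SUCCESS", 100: "CUDA_ERROR_NO_DEVICE"}


def describe(code):
    """Return a readable name for a CUresult code, falling back to the number."""
    return CU_RESULTS.get(code, f"CUresult {code}")


def probe_driver():
    """Call cuInit and cuDeviceGetCount in this process and report the results."""
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        if sys.platform == "win32":
            # The CUDA driver library installed with the NVIDIA display driver.
            driver = ctypes.WinDLL("nvcuda.dll")
        else:
            driver = ctypes.CDLL("libcuda.so.1")
    except OSError as exc:
        report["error"] = f"could not load the driver library: {exc}"
        return report

    status = driver.cuInit(0)
    report["cuInit"] = describe(status)
    if status == 0:
        count = ctypes.c_int(0)
        status = driver.cuDeviceGetCount(ctypes.byref(count))
        report["cuDeviceGetCount"] = describe(status)
        report["device_count"] = count.value
    return report


if __name__ == "__main__":
    for key, value in probe_driver().items():
        print(f"{key}: {value}")
```

Running this from the same shell and with the same environment as the exe makes it possible to compare the two: if cuInit succeeds here but fails in the application, the difference is likely in how the application process is launched rather than in the driver installation.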

To debug further, I tried GPUtil.getAvailable(), which returns [0] (device 0 detected as a GPU) when run in a Python interpreter. I wanted to run torch.cuda.is_available() as an additional test, but I am unable to install a GPU-enabled build of torch with pip. Installing torch always gives me a +cpu version. I have tried installing all kinds of previous and current torch versions from the PyTorch website with +cu110, +cu111, +cu113, +cu115, +cu116, +cu117, etc., but it fails each time with the following error:

pip install torch==1.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
Looking in links: https://download.pytorch.org/whl/torch_stable.html
ERROR: Could not find a version that satisfies the requirement torch==1.10.0+cu111 (from versions: 2.0.0, 2.0.0+cpu, 2.0.0+cu117, 2.0.0+cu118, 2.0.1, 2.0.1+cpu, 2.0.1+cu117, 2.0.1+cu118)
ERROR: No matching distribution found for torch==1.10.0+cu111

The same happens with torchvision and torchaudio. All I can pip install are +cpu versions. Until 2021, PyTorch did not support CUDA 11.2, so I could probably have blamed my CUDA version, but not anymore: PyTorch now offers support up to CUDA 11.7.
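The candidate list in the pip error above is easier to read when grouped by build tag. The following is a minimal sketch that parses the error text exactly as pasted; `group_by_build` is a hypothetical helper name, and the input string is copied from the output above.

```python
import re
from collections import defaultdict

# Error line copied from the pip output above.
PIP_ERROR = (
    "ERROR: Could not find a version that satisfies the requirement "
    "torch==1.10.0+cu111 (from versions: 2.0.0, 2.0.0+cpu, 2.0.0+cu117, "
    "2.0.0+cu118, 2.0.1, 2.0.1+cpu, 2.0.1+cu117, 2.0.1+cu118)"
)


def group_by_build(error_text):
    """Map each build tag ('cpu', 'cu117', ...) to the base versions offering it."""
    match = re.search(r"\(from versions: ([^)]*)\)", error_text)
    if match is None:
        return {}
    groups = defaultdict(list)
    for candidate in match.group(1).split(","):
        candidate = candidate.strip()
        if not candidate or candidate == "none":
            continue
        # "2.0.1+cu118" -> base "2.0.1", tag "cu118"; untagged entries get "default".
        base, _, tag = candidate.partition("+")
        groups[tag or "default"].append(base)
    return dict(groups)


if __name__ == "__main__":
    for tag, versions in group_by_build(PIP_ERROR).items():
        print(f"{tag}: {', '.join(versions)}")
```

For the error shown, pip is offering only 2.0.0 and 2.0.1, each as an untagged, cpu, cu117, and cu118 build. So CUDA builds are visible to pip in this environment, but nothing older than 2.0.0 is, which is why every pinned 1.x request fails.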

TL;DR

With CUDA 11.2 installed on my system, TensorFlow is unable to find a CUDA-capable device even though I have an RTX 4070. The GPU also shows up under Display adapters in Device Manager.
