CUDA driver (libcuda.so)

Scope: the CUDA driver (libcuda.so, the user-mode half of the NVIDIA GPU driver), its Driver API vs the runtime API, the driver version vs the CUDA version nvidia-smi reports, minor-version and forward (cuda-compat) compatibility, and why it is the only host-level CUDA dependency for containers.

What it is

libcuda.so is the user-mode CUDA driver (UMD). It ships inside the NVIDIA GPU driver package, not the CUDA Toolkit: "The driver package includes both the user mode CUDA driver (libcuda.so) and kernel mode components necessary to run the application." ¹ Below it sits the kernel-mode driver (KMD): nvidia.ko and friends (NVIDIA Kernel Modules) plus GSP firmware (GPU Firmware and GSP); above it sits everything else.

Two distinct CUDA interfaces sit on top of libcuda.so:

Driver API: the low-level cu* entry points exported directly by libcuda.so (handle-based contexts, modules, explicit init). Versioned with the driver.
Runtime API: the higher-level cuda* entry points in libcudart.so, which ships in the CUDA Toolkit (CUDA Toolkit and Runtime) and itself calls down into the driver. Versioned with the toolkit.

Practical consequence: a developer "has to install only the CUDA Toolkit and necessary libraries required for linking" ¹. The driver is a separate, host-level component on its own release cadence (Driver Versions and Branches, Driver Install and Lifecycle).

flowchart LR
  APP["Application / framework"] --> RT["Runtime API: libcudart, in toolkit"]
  APP --> DRV_API["Driver API: cu functions in libcuda.so"]
  RT --> DRV_API
  DRV_API --> UMD["User-mode driver: libcuda.so"]
  UMD --> KMD["Kernel-mode driver: nvidia.ko + GSP"]

Why it's needed (and when)

The driver is non-optional: there is no CUDA without libcuda.so and its matching kernel modules. The reason it gets its own page is the version split, which is the single most common source of "CUDA version mismatch" confusion.

nvidia-smi prints two numbers in its header: a Driver Version (the installed NVIDIA driver, e.g. 580.167.08) and a CUDA Version (the maximum CUDA Toolkit that driver supports, not what is installed). These are independent of any toolkit on the box. The classic symptom (nvcc --version and nvidia-smi reporting different CUDA versions) is expected: nvcc reports the runtime/toolkit it came from; nvidia-smi reports the driver's ceiling. ⁸

When the distinction matters:

Sizing a driver for a toolkit. Each CUDA major release has a minimum driver: CUDA 13.x needs Linux driver >= 580, CUDA 12.x >= 525, CUDA 11.x >= 450. ² Pick the host driver from this floor plus your branch policy (Driver and Feature Support by GPU Tier).
Containers (see below): the host driver is the contract; the toolkit lives in the image.
Forward compatibility on a frozen driver: cuda-compat, below.
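
The driver floor above can be turned into a small preflight check. The sketch below is illustrative only: `min_driver_for_cuda` and `driver_meets_floor` are hypothetical helper names, and the comparison looks only at the driver branch number (the part before the first dot), which matches the floors as quoted on this page but is coarser than the full version strings in the NVIDIA tables.

```shell
# Hypothetical helper: minimum Linux driver branch for a CUDA major release,
# using the floors quoted on this page (13.x >= 580, 12.x >= 525, 11.x >= 450).
min_driver_for_cuda() {
  case "$1" in
    13) echo 580 ;;
    12) echo 525 ;;
    11) echo 450 ;;
    *)  return 1 ;;   # unknown major: refuse to guess
  esac
}

# Succeeds when a driver version string (e.g. "580.167.08") meets the floor
# for the given CUDA major. Only the branch number is compared.
driver_meets_floor() {
  cuda_major=$1
  driver_branch=${2%%.*}                 # "580.167.08" -> "580"
  floor=$(min_driver_for_cuda "$cuda_major") || return 2
  [ "$driver_branch" -ge "$floor" ]
}
```

On a host, the second argument would typically come from `nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1`; a non-zero exit status means the driver is below the floor (or the CUDA major is not in the table).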

How it's installed & managed

The driver (and therefore libcuda.so) is installed by the NVIDIA driver package, not the toolkit. Install and lifecycle (DKMS, branch pinning, open vs proprietary modules) are covered on the adjacent pages: NVIDIA Kernel Modules, Driver Versions and Branches, and Driver Install and Lifecycle. This page covers only what is specific to the user-mode driver and its compatibility envelope.

Read the installed driver version (scriptable):

# reference template, not hardware-tested
nvidia-smi --query-gpu=driver_version --format=csv,noheader

driver_version is the version of the installed NVIDIA (Kernel Mode / display) driver. ⁵ Add name to correlate per GPU, and use csv,noheader for clean machine parsing across a fleet (nvidia-smi Reference):

# reference template, not hardware-tested
nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
nvidia-smi --help-query-gpu        # enumerate every queryable field
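
For fleet use, the CSV output can be reduced to a per-version count so driver skew is visible at a glance. This is a minimal sketch, assuming the `name,driver_version` query shown above and one GPU per line on stdin; `driver_version_summary` and `driver_versions_consistent` are hypothetical names, not part of nvidia-smi.

```shell
# Hypothetical helper: count GPUs per driver version from
# `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader`
# output read on stdin ("<name>, <driver_version>" per line).
driver_version_summary() {
  # The driver version is the last comma-separated field; the separator
  # pattern also swallows the space nvidia-smi prints after the comma.
  awk -F', *' 'NF >= 2 { count[$NF]++ }
               END { for (v in count) print count[v], v }' | sort -k2
}

# Succeeds only when every input line reports the same driver version.
driver_versions_consistent() {
  awk -F', *' 'NF >= 2 && !seen[$NF]++ { n++ }
               END { exit (n == 1 ? 0 : 1) }'
}
```

Collected output from several hosts can be concatenated and piped through `driver_version_summary`; each output line is `<gpu_count> <driver_version>`.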

Forward compatibility with cuda-compat. To run a newer CUDA Toolkit on an older base driver (across major families), install the matching forward-compat package rather than upgrading the kernel driver. It is named cuda-compat-<cuda_major>-<cuda_minor> ³ and ships a newer user-mode libcuda.so.* that runs against the existing older kernel driver:

# reference template, not hardware-tested
# e.g. run a CUDA 13.0 build on an older base driver
apt-get install -y cuda-compat-13-0
# then prepend the compat libcuda to the loader path for that process
# (the :+ expansion avoids a trailing ':' when LD_LIBRARY_PATH is unset):
LD_LIBRARY_PATH=/usr/local/cuda/compat${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} ./your_cuda_app

Hard constraints. Do not assume this works everywhere:

Datacenter-only. Forward Compatibility "is applicable only for systems with: NVIDIA Data Center GPUs. Select NGC Server Ready SKUs of RTX cards. Jetson boards." ³ On a plain GeForce box it is unsupported (a wrong-driver/non-datacenter-GPU error is the documented failure). ⁴ GeForce has no datacenter licence path either (Driver and Feature Support by GPU Tier).
Branch-gated. Supported only on the datacenter LLB/LTS driver branches for select GPUs. ⁴
Minimum base driver. CUDA 13.x compat needs a base install >= r580; CUDA 12.x compat needs >= r525. ³

Prefer a clean driver upgrade (Rolling Driver / CUDA Upgrade) when you control the host; reserve cuda-compat for fleets whose kernel driver is frozen by qualification windows.

Validated usage & tests

The commands below are reference templates. Output shapes are described from the official docs; exact version strings and counts depend on the host and are not asserted here.

1. Driver vs CUDA-supported version, side by side. The nvidia-smi header shows the Driver Version and, separately, the CUDA Version (max supported, not installed):

# reference template, not hardware-tested
nvidia-smi

Expect a banner line of the form Driver Version: <X> CUDA Version: <Y>, where <Y> is the toolkit ceiling for driver <X>, followed by the per-GPU table. Treat <Y> as a maximum, never as evidence a toolkit is installed.
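
When the two banner values are needed in a script, they can be extracted with a single sed expression. The sketch below assumes the banner contains `Driver Version: <X>` followed later on the same line by `CUDA Version: <Y>`, as described above; `smi_header_versions` is a hypothetical helper name, and the exact spacing of the banner is not asserted.

```shell
# Hypothetical helper: pull both numbers out of the nvidia-smi banner read
# on stdin. Prints "<driver_version> <cuda_ceiling>"; fails if no line matches.
smi_header_versions() {
  line=$(sed -n 's/.*Driver Version: *\([0-9.][0-9.]*\).*CUDA Version: *\([0-9.][0-9.]*\).*/\1 \2/p' | head -n 1)
  [ -n "$line" ] && echo "$line"
}
```

Typical use would be `nvidia-smi | smi_header_versions`. The second value is still only the driver's ceiling, so it says nothing about which toolkit, if any, is installed.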

2. Driver API vs runtime version from a process. The toolkit's deviceQuery sample reports both, e.g. a line of the form CUDA Driver Version / Runtime Version <driver> / <runtime>. ² A driver number lower than the runtime number means the host driver predates the toolkit, the case cuda-compat (or a driver upgrade) exists to resolve.
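
The driver/runtime comparison can be sketched as a three-way classification. This is an illustration under stated assumptions: both inputs are `major.minor` strings (for example the pair deviceQuery prints), `classify_runtime_vs_driver` is a hypothetical name, and the middle case relies on minor-version compatibility as described in the cited Minor Version Compatibility page, where features that need a newer driver may still fail.

```shell
# Hypothetical helper: compare a runtime/toolkit version against the driver's
# CUDA version. Both arguments must be "major.minor".
classify_runtime_vs_driver() {
  driver=$1
  runtime=$2
  d_major=${driver%%.*};  d_minor=${driver#*.};  d_minor=${d_minor%%.*}
  r_major=${runtime%%.*}; r_minor=${runtime#*.}; r_minor=${r_minor%%.*}
  if [ "$r_major" -gt "$d_major" ]; then
    echo "insufficient"            # newer major: driver upgrade or cuda-compat
  elif [ "$r_major" -eq "$d_major" ] && [ "$r_minor" -gt "$d_minor" ]; then
    echo "minor-version-compat"    # same major family, newer minor
  else
    echo "ok"                      # runtime is at or below the driver's version
  fi
}
```

For example, `classify_runtime_vs_driver 12.2 12.6` prints `minor-version-compat`, while `classify_runtime_vs_driver 12.8 13.0` prints `insufficient`.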

3. Confirm the loaded libcuda.so. Verify which user-mode driver a process resolves (stock vs compat path), useful when validating a cuda-compat install:

# reference template, not hardware-tested
ldconfig -p | grep -E 'libcuda\.so'        # system-resolved UMD
ldd ./your_cuda_app | grep -E 'libcuda\.so' # what this binary binds

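Expect the path to resolve to the stock driver's libcuda.so.1 normally, or to the …/compat/ copy when LD_LIBRARY_PATH points there. Note that ldd lists libcuda.so only for binaries that link the Driver API directly; applications that use only the Runtime API load it at run time via dlopen, so an empty ldd result is not by itself a fault.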
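
For a process that is already running, the mapped library can be read from `/proc/<pid>/maps` instead, which also covers libraries loaded via dlopen. The sketch below assumes a Linux host and that the forward-compat copy lives under a directory named `compat`, as in the LD_LIBRARY_PATH example earlier on this page; `mapped_libcuda` and `classify_libcuda` are hypothetical helper names.

```shell
# Hypothetical helper: list the distinct libcuda.so files mapped into a
# process, reading /proc/<pid>/maps-formatted text on stdin.
mapped_libcuda() {
  # The pathname is the last whitespace-separated field of each maps line.
  awk '$NF ~ /\/libcuda\.so/ { print $NF }' | sort -u
}

# Label a path as the forward-compat copy or the stock driver library.
classify_libcuda() {
  case "$1" in
    */compat/*) echo "compat" ;;
    *)          echo "stock" ;;
  esac
}
```

Typical use would be `mapped_libcuda < /proc/<pid>/maps` once the application has initialised CUDA, followed by `classify_libcuda` on each path printed.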

Containers: the driver is the only host CUDA dependency. A CUDA container "only requires the NVIDIA driver, the CUDA toolkit doesn't have to be installed" on the host. ⁶ The image bundles libcudart and the math/comm libraries (CUDA Math and Communication Libraries); the NVIDIA Container Toolkit (NVIDIA Container Toolkit and CDI) injects the host's user-mode driver and device nodes into the container at runtime. The host floor is therefore just a driver new enough for the image's toolkit, per the major-release minimums above (and >= 418.81.07 for the container toolkit itself). ⁷

# reference template, not hardware-tested
# driver present on host; CUDA runtime supplied by the image
docker run --rm --gpus all nvidia/cuda:13.0.0-base-ubuntu24.04 nvidia-smi

Expect the in-container nvidia-smi to report the host driver version (nvidia-smi and the user-mode driver are injected from the host, not shipped in the image), confirming the container saw the GPU without a host-side toolkit install.
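
Because the host floor depends on the image's CUDA major, it can help to recover that major from the image reference before pulling. This is a rough sketch that assumes the tag begins with the CUDA version, as in the `nvidia/cuda:13.0.0-base-ubuntu24.04` example above; `cuda_major_from_image` is a hypothetical helper, and tags such as `latest` or digest references are rejected rather than guessed.

```shell
# Hypothetical helper: extract the CUDA major version from an image reference
# whose tag starts with the CUDA version (e.g. "13.0.0-base-ubuntu24.04").
cuda_major_from_image() {
  tag=${1##*:}             # text after the last ':'
  major=${tag%%.*}         # text before the first '.'
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # tag does not start with a numeric version
    *) echo "$major" ;;
  esac
}
```

The printed major can then be compared against the per-major driver minimums listed earlier on this page.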

Failure modes

nvidia-smi/nvcc show different CUDA versions: not a fault. Driver ceiling vs installed toolkit; see "Why it's needed". ⁸
Toolkit newer than driver (CUDA driver version is insufficient for CUDA runtime version): host driver below the major-release minimum. ² Upgrade the driver (Rolling Driver / CUDA Upgrade) or install cuda-compat on a supported datacenter GPU.
cuda-compat on the wrong silicon: non-datacenter GPU or unsupported branch; documented as a wrong-driver / no-Data-Center-GPU error. ⁴ Resolve per Driver and Feature Support by GPU Tier.
libcuda.so missing / mismatched after a partial install: UMD/KMD or GSP skew; modules fail to load. Stack-level signatures and recovery are in GSP Firmware / Driver Mismatch and Kernel Upgrade: GPU Missing.

References

CUDA Compatibility — Why CUDA Compatibility (UMD libcuda.so in the driver package; toolkit separate): https://docs.nvidia.com/deploy/cuda-compatibility/why-cuda-compatibility.html
CUDA Compatibility — Minor Version Compatibility (per-major minimum driver: 13.x≥580, 12.x≥525, 11.x≥450; deviceQuery output): https://docs.nvidia.com/deploy/cuda-compatibility/minor-version-compatibility.html
CUDA Compatibility — Forward Compatibility (cuda-compat-<major>-<minor>; Data Center GPU / NGC-Ready RTX / Jetson only; base driver minimums): https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html
CUDA Compatibility — FAQ (forward-compat on LTS/Production branches, select GPUs; wrong-driver / non-Data-Center-GPU error): https://docs.nvidia.com/deploy/cuda-compatibility/frequently-asked-questions.html
CUDA Driver API — Driver vs Runtime API: https://docs.nvidia.com/cuda/cuda-driver-api/driver-vs-runtime-api.html
nvidia-smi manual (--query-gpu, --format=csv, driver_version): https://docs.nvidia.com/deploy/nvidia-smi/index.html
Useful nvidia-smi Queries (driver_version = installed Kernel Mode Driver version): https://nvidia.custhelp.com/app/answers/detail/a_id/3751/
NVIDIA Container Toolkit — Installing (host needs the driver, not the CUDA toolkit; >= 418.81.07): https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/1.13.5/install-guide.html
NGC CUDA container (host requires only the NVIDIA driver): https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda
Driver branch/CUDA-version cadence (R580 LTS, CUDA 13; releases.json): https://docs.nvidia.com/datacenter/tesla/drivers/supported-drivers-and-cuda-toolkit-versions.html

¹ CUDA Compatibility: Why CUDA Compatibility. https://docs.nvidia.com/deploy/cuda-compatibility/why-cuda-compatibility.html
² CUDA Compatibility: Minor Version Compatibility. https://docs.nvidia.com/deploy/cuda-compatibility/minor-version-compatibility.html
³ CUDA Compatibility: Forward Compatibility. https://docs.nvidia.com/deploy/cuda-compatibility/forward-compatibility.html
⁴ CUDA Compatibility: Frequently Asked Questions. https://docs.nvidia.com/deploy/cuda-compatibility/frequently-asked-questions.html
⁵ Useful nvidia-smi Queries (NVIDIA Enterprise Support). https://nvidia.custhelp.com/app/answers/detail/a_id/3751/
⁶ NGC CUDA container / NVIDIA Container Toolkit install guide (host requires only the NVIDIA driver). https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
⁷ NVIDIA Container Toolkit: Installing (minimum driver >= 418.81.07). https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/1.13.5/install-guide.html
⁸ nvcc vs nvidia-smi CUDA version discrepancy (driver ceiling vs installed toolkit). https://dev.to/moseo/solving-the-version-conflicts-between-the-nvidia-driver-and-cuda-toolkit-2n2