NVIDIA GPU generations & families

Scope: the cornerstone reference for the whole NVIDIA range. It covers the three datacenter generations (Ampere → Hopper → Blackwell), what changed each generation (Tensor Core gen, precisions, NVLink, memory, MIG, Confidential Computing), and the four operational tiers (datacenter SXM/PCIe, workstation RTX PRO, consumer GeForce, and the GB10/DGX Spark desktop class). Built around a canonical difference matrix. Keep current as datasheets shift.

Figures verified against NVIDIA primary sources (product pages, architecture pages, datasheets) as of June 2026. Re-check the relevant datasheet before relying on any single number: vendor figures vary by configuration, by node-vs-GPU framing, and by dense-vs-sparse and peak-vs-sustained precision framing. Where two NVIDIA figures disagree, both are noted.

What it is

NVIDIA ships one microarchitecture per generation across many SKUs and tiers. The generation (Ampere, Hopper, Blackwell) fixes the silicon capabilities: Tensor Core generation, supported numeric precisions, NVLink generation, Confidential Computing support. The tier (datacenter, workstation, consumer, desktop superchip) fixes the operational envelope: which driver branch, whether NVLink/MIG/vGPU/ECC exist, the form factor, the power connector, and the licence terms.

The two axes are orthogonal and both matter. A consumer GeForce RTX 5090 and a datacenter B200 share the same fifth-generation Blackwell Tensor Cores and FP4 math, yet differ on nearly every operational property that governs a cluster: the 5090 has no NVLink, no MIG, no ECC, no vGPU, no GPUDirect RDMA, and a driver licence that restricts datacenter use. Choosing hardware means picking a point on both axes.

This page is the map. The per-family pages carry the detail: Ampere, Hopper, Blackwell platform, RTX & workstation, DGX systems, and DGX Spark.

flowchart TB
  ROOT["NVIDIA GPU range"]
  ROOT --> GEN["Generation axis: silicon capability"]
  ROOT --> TIER["Tier axis: operational envelope"]

  GEN --> AMP["Ampere 2020: 3rd-gen Tensor, TF32, NVLink 3 (600 GB/s)"]
  GEN --> HOP["Hopper 2022: 4th-gen Tensor + FP8, NVLink 4 (900 GB/s), Confidential Computing"]
  GEN --> BLK["Blackwell 2024: 5th-gen Tensor + FP4, NVLink 5 (1.8 TB/s), TEE-I/O"]

  TIER --> DC["Datacenter SXM/PCIe: NVLink, MIG, vGPU, ECC, datacenter driver"]
  TIER --> WS["Workstation RTX PRO: MIG, vGPU, ECC, NO NVLink, RTX Enterprise driver"]
  TIER --> GF["Consumer GeForce: NO NVLink/MIG/vGPU/ECC, GeForce driver + datacenter-EULA limit"]
  TIER --> SOC["Desktop superchip GB10/GB300: unified memory, C2C internal, DGX OS"]

The three datacenter generations

Each generation is a die family on a newer process node, defined by what its Tensor Cores and interconnect can do. The progression below is the load-bearing summary; the canonical matrix and per-family pages carry the SKU detail.

Ampere (GA100, TSMC 7 nm, 2020)

Third-generation Tensor Cores introduced TF32 (a 19-bit format that runs FP32-range training at Tensor-Core speed with no code change) and added structural sparsity, alongside FP16/BF16/INT8/INT4 and FP64 acceleration. There is no FP8 on Ampere. Interconnect is third-generation NVLink at 600 GB/s per GPU. MIG debuts here, partitioning an A100 into up to 7 isolated instances. No Confidential Computing.
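
To make the 19-bit figure concrete: TF32 keeps FP32's sign bit and 8-bit exponent but only 10 mantissa bits (1 + 8 + 10 = 19). The sketch below emulates that precision loss on the CPU by rounding an FP32 value to 10 mantissa bits. It is an illustration only: the function name is hypothetical, and the round-to-nearest-even mode is an assumption, not a statement about what the Tensor Cores do internally.

```python
import struct


def round_to_tf32(x: float) -> float:
    """Round a value to TF32 precision (8-bit exponent, 10-bit mantissa).

    Illustrative only: the rounding mode is an assumption and NaN payloads
    are not treated specially.
    """
    # Reinterpret the value as IEEE-754 binary32 bits.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # FP32 carries 23 mantissa bits; TF32 keeps the top 10, so 13 are dropped.
    drop = 13
    # Round to nearest, ties to even, on the dropped bits.
    lsb = (bits >> drop) & 1
    bits = (bits + (1 << (drop - 1)) - 1 + lsb) & 0xFFFFFFFF
    # Clear the dropped mantissa bits.
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 1 + 2**-10, 1 + 2**-11, 0.1):
        print(value, "->", round_to_tf32(value))
```

Values that need more than 10 mantissa bits, such as 1 + 2^-11, collapse onto a neighbouring representable value. That is the precision TF32 trades away while keeping the full FP32 exponent range.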

A100 (datacenter, SXM/PCIe): 40 GB HBM2 or 80 GB HBM2e (~1.6 / ~2.0 TB/s), 6,912 CUDA cores, 432 Tensor Cores, ECC, MIG up to 7. 400 W SXM / up to 300 W PCIe.
A30 (datacenter, PCIe): 24 GB HBM2, MIG-capable (the smaller MIG-capable Ampere part).
A40 / A10 (workstation/inference, GDDR6): display and render-focused, no MIG.
A800: China-export A100 with reduced NVLink bandwidth.

Detail and citations: Ampere.

Hopper (GH100, TSMC 4N, 2022–2024)

Fourth-generation Tensor Cores plus the first-generation Transformer Engine bring FP8 to training and inference, roughly doubling peak Tensor Core throughput versus FP16 for transformer workloads. Interconnect steps to fourth-generation NVLink at 900 GB/s per GPU. MIG is second-generation (still up to 7 instances) and now supports per-instance monitoring. Hopper is the first NVIDIA GPU generation with Confidential Computing, a hardware TEE that protects data and code in use on the GPU and supports remote attestation. It also adds DPX instructions for dynamic programming.
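
As a rough illustration of why FP8 depends on the scaling management that the Transformer Engine provides, the sketch below computes the largest finite value of the two common FP8 encodings, E4M3 and E5M2. Those encodings and their special-value rules come from the OCP FP8 specification rather than from this page, and the helper name is hypothetical.

```python
def fp8_max_finite(exp_bits: int, man_bits: int, ieee_specials: bool) -> float:
    """Largest finite value of a small floating-point format.

    ieee_specials=True  : the all-ones exponent is reserved for inf/NaN (E5M2).
    ieee_specials=False : only the all-ones exponent with an all-ones mantissa
                          is NaN, so the top exponent is usable (E4M3).
    """
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_specials:
        max_exp_code = 2 ** exp_bits - 2
        max_mantissa = 2 ** man_bits - 1
    else:
        max_exp_code = 2 ** exp_bits - 1
        max_mantissa = 2 ** man_bits - 2
    return 2.0 ** (max_exp_code - bias) * (1 + max_mantissa / 2 ** man_bits)


if __name__ == "__main__":
    print("E4M3 max:", fp8_max_finite(4, 3, ieee_specials=False))
    print("E5M2 max:", fp8_max_finite(5, 2, ieee_specials=True))
```

Both ranges are far narrower than FP16 or BF16, which is why FP8 tensors are normally paired with scale factors rather than cast directly.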

H100 (datacenter, SXM/PCIe): 80 GB (HBM3 at 3.35 TB/s on SXM). The SXM part runs at up to 700 W with full NVLink; the PCIe variant runs at 350 W. Host interface is PCIe Gen5.
H200 (datacenter, SXM): same GH100 die, 141 GB HBM3e at 4.8 TB/s, 700 W, NVLink 4, MIG 7, Confidential Computing (a memory-and-bandwidth upgrade on the same compute).
GH200 Grace Hopper: 1 Grace CPU + 1 Hopper GPU joined by NVLink-C2C at 900 GB/s with coherent unified memory.
H800 / H20: China-export variants with reduced interconnect/compute.

Detail and citations: Hopper.

Blackwell (2024–2025)

Fifth-generation Tensor Cores plus the second-generation Transformer Engine add FP4 / NVFP4 (NVIDIA's block-scaled 4-bit floating-point format; the community-defined OCP microscaling formats such as MXFP4 are also supported) for the inference-and-reasoning era, on top of FP8 and FP6. Interconnect steps to fifth-generation NVLink at 1.8 TB/s per GPU (900 GB/s each direction), forming a single 72-GPU domain in NVL72 racks (130 TB/s aggregate). Blackwell is the first TEE-I/O–capable GPU, extending Confidential Computing across NVLink so a trusted domain can span multiple GPUs. Memory is HBM3e. Blackwell deprioritises FP64, so classical double-precision HPC does not benefit equally.
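
The idea behind block-scaled 4-bit formats can be sketched in a few lines: each small block of values shares one scale factor, and every element is stored as one of the eight magnitudes a 4-bit E2M1 float can represent. The code below is a simplified illustration under stated assumptions, not NVIDIA's implementation: the function names are hypothetical, the scale is kept as a full-precision Python float, and the block size is whatever list is passed in, whereas real formats fix the block size and store the scale in low precision.

```python
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block(block):
    """Return (scale, codes) so that scale * code approximates each input value."""
    amax = max(abs(v) for v in block)
    # Map the largest magnitude in the block onto the largest E2M1 value (6.0).
    scale = amax / 6.0 if amax > 0 else 1.0
    codes = []
    for v in block:
        nearest = min(E2M1_MAGNITUDES, key=lambda q: abs(abs(v) / scale - q))
        codes.append(-nearest if v < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate values from a shared scale and 4-bit codes."""
    return [scale * c for c in codes]


if __name__ == "__main__":
    scale, codes = quantize_block([12.0, -6.0, 1.0, 0.3])
    print(scale, codes, dequantize_block(scale, codes))
```

The per-block scale is what lets a format with only eight magnitudes cover tensors whose values vary widely from block to block.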

B200 (datacenter, SXM): 192 GB HBM3e (~8 TB/s), dual-die package, ~9 PFLOPS dense FP4 per GPU, 1,000 W. Fits much existing datacenter power/cooling.
B300 (Blackwell Ultra) (datacenter, SXM): 288 GB HBM3e via 12-high stacks, ~15 PFLOPS dense FP4 per GPU, 1,400 W, liquid cooling mandatory. (NVIDIA's HGX B300 8-GPU page lists 2.1 TB node memory; the widely cited per-GPU figure is 288 GB; verify the exact per-GPU bandwidth and node total on the current datasheet before quoting.)
GB200 superchip = 1 Grace CPU + 2 B200 GPUs via NVLink-C2C. GB200 NVL72 = 72 B200 + 36 Grace (36 superchips), ~1.4 EF FP4 (NVIDIA's headline figure, quoted with sparsity; confirm the dense figure on the datasheet).
GB300 NVL72 = 72 B300 + 36 Grace, ~1.1 EF dense FP4 (vendor sparse/peak figures up to ~1.44 EF), ~20.7 TB HBM3e, NVLink 5.
B100: early Blackwell at 700 W, sized to fit Hopper-class infrastructure.
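
The rack-level figures above follow from the per-GPU numbers by simple multiplication. The sketch below redoes that arithmetic using only values quoted in this section; the variable names are illustrative, and decimal units (1 TB = 1,000 GB) are an assumption.

```python
GRACE_PER_RACK = 36          # NVL72: 36 Grace CPUs
GPUS_PER_GRACE = 2           # each Grace is paired with 2 Blackwell GPUs
NVLINK5_TB_S_PER_GPU = 1.8   # fifth-generation NVLink, per GPU
B300_HBM_GB = 288            # per-GPU HBM3e on B300
B300_DENSE_FP4_PFLOPS = 15   # approximate per-GPU dense FP4

gpus_per_rack = GRACE_PER_RACK * GPUS_PER_GRACE
nvlink_aggregate_tb_s = gpus_per_rack * NVLINK5_TB_S_PER_GPU
gb300_hbm_tb = gpus_per_rack * B300_HBM_GB / 1000
gb300_dense_fp4_ef = gpus_per_rack * B300_DENSE_FP4_PFLOPS / 1000

if __name__ == "__main__":
    print(f"GPUs per rack:       {gpus_per_rack}")
    print(f"NVLink aggregate:    {nvlink_aggregate_tb_s:.1f} TB/s")
    print(f"GB300 NVL72 HBM3e:   {gb300_hbm_tb:.1f} TB")
    print(f"GB300 NVL72 FP4:     {gb300_dense_fp4_ef:.2f} EF dense")
```

The results are 72 GPUs, roughly 130 TB/s, roughly 20.7 TB, and roughly 1.1 EF, which matches the NVL72 figures quoted above.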

Detail and citations: Blackwell platform.

What changed each generation

| Property | Ampere (2020) | Hopper (2022) | Blackwell (2024) |
| --- | --- | --- | --- |
| Tensor Core gen | 3rd | 4th | 5th |
| New precision | TF32, BF16, structural sparsity | FP8 (1st-gen Transformer Engine) | FP4 / NVFP4 (2nd-gen Transformer Engine) |
| FP64 | accelerated | accelerated | deprioritised |
| NVLink gen / per-GPU BW | 3rd / 600 GB/s | 4th / 900 GB/s | 5th / 1.8 TB/s |
| Memory type | HBM2 (A100 40 GB) / HBM2e | HBM3 (H100) / HBM3e (H200) | HBM3e |
| MIG generation | 1st (up to 7) | 2nd (up to 7) | yes (B-series) |
| Confidential Computing | no | yes (TEE) | yes (TEE-I/O over NVLink) |
| Process node | TSMC 7 nm | TSMC 4N | TSMC 4NP (custom) |

Sources: NVIDIA Ampere, Hopper, and Blackwell architecture pages and the per-card product pages cited under References.
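
For scripting against this table, the "new precision" row can be turned into a small lookup. The sketch below is a hypothetical helper that covers only the three datacenter generations (Ada is deliberately left out) and assumes each listed format stays available in later generations.

```python
GENERATION_ORDER = ("Ampere", "Hopper", "Blackwell")

# Generation in which each precision first appears, per the table above.
INTRODUCED_IN = {
    "TF32": "Ampere",
    "BF16": "Ampere",
    "FP8": "Hopper",
    "FP4": "Blackwell",
}


def generations_supporting(precision: str) -> list:
    """Datacenter generations that offer the given Tensor Core precision."""
    first = GENERATION_ORDER.index(INTRODUCED_IN[precision])
    return list(GENERATION_ORDER[first:])


if __name__ == "__main__":
    for p in ("TF32", "FP8", "FP4"):
        print(p, "->", generations_supporting(p))
```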

The four tiers

The same silicon ships in four tiers with very different operations. The tier, not the generation, decides driver branch, whether you get NVLink/MIG/vGPU/ECC, the form factor and power connector, and the licence.

flowchart LR
  subgraph DC["Datacenter SXM/PCIe"]
    DC1["A100 / H100 / H200 / B200 / B300"]
    DC2["NVLink + NVSwitch, MIG, vGPU, ECC, GPUDirect RDMA, Confidential Computing"]
    DC3["Datacenter driver (LTS), Fabric Manager on NVSwitch systems"]
  end
  subgraph WS["Workstation RTX PRO"]
    WS1["RTX PRO 6000 Blackwell"]
    WS2["MIG, vGPU, ECC, GPUDirect RDMA — but NO NVLink"]
    WS3["RTX Enterprise / production-branch driver"]
  end
  subgraph GF["Consumer GeForce"]
    GF1["RTX 5090 / 4090"]
    GF2["NO NVLink, NO MIG, NO vGPU, NO ECC, NO GPUDirect RDMA"]
    GF3["GeForce driver + datacenter-EULA restriction"]
  end
  subgraph SOC["Desktop superchip"]
    SOC1["DGX Spark GB10 / DGX Station GB300"]
    SOC2["Unified CPU-GPU memory, NVLink-C2C internal only"]
    SOC3["DGX OS, networking via ConnectX"]
  end

Datacenter (SXM / PCIe)

The A100/H100/H200/B200/B300 line. SXM boards mount on an HGX baseboard and connect through NVSwitch into a single NVLink domain (up to 8 GPUs per node, 72 per NVL72 rack); PCIe variants are add-in cards with optional bridges. This tier has the full feature set: NVLink, MIG, vGPU, ECC, GPUDirect RDMA, and (Hopper+) Confidential Computing. It runs the datacenter driver (a Long-Term-Support branch such as R580), and NVSwitch systems require Fabric Manager (nv-fabricmanager, lockstep-versioned with the driver) to program the fabric into one domain. Cooling is air or liquid depending on the part; the 1,400 W B300 and the NVL72 racks are liquid-cooled. This is the tier the rest of this knowledge base assumes. See GPU software stack, networking fabric, and datacentre readiness.
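
The lockstep requirement is easy to encode as a pre-flight check. The sketch below is a hypothetical helper, not part of any NVIDIA tool: it assumes the driver and Fabric Manager version strings have already been collected from the host (for example from the package inventory), and the version strings in the usage example are placeholders.

```python
from typing import Optional


def fabric_manager_status(
    has_nvswitch: bool,
    driver_version: str,
    fabric_manager_version: Optional[str],
) -> str:
    """Classify a host's Fabric Manager state against its driver version."""
    if not has_nvswitch:
        # PCIe-only or single-GPU hosts have no fabric to program.
        return "not-applicable"
    if fabric_manager_version is None:
        return "missing"
    if fabric_manager_version.strip() != driver_version.strip():
        # Lockstep means the two versions must match exactly.
        return "mismatch"
    return "ok"


if __name__ == "__main__":
    # Placeholder version strings for illustration only.
    print(fabric_manager_status(True, "580.0.0", "580.0.0"))
    print(fabric_manager_status(True, "580.0.0", "575.0.0"))
    print(fabric_manager_status(False, "580.0.0", None))
```

Treating a missing or mismatched Fabric Manager as a hard failure before scheduling jobs avoids a node that boots cleanly but never forms its NVLink domain.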

Bringing a fabric up and proving it (driver/Fabric-Manager install, InfiniBand/RoCE link checks, NVLink-domain formation, and nccl-tests line-rate benchmarks) is one ops procedure shared across generations: see the keystone Fabric bring-up, validation & benchmarking. The per-family pages carry the generation-specific networking detail (NVLink bandwidth, NIC, IB generation): Ampere (NVLink 3 / HDR), Hopper (NVLink 4 / NDR, Quantum-2), Blackwell platform (NVLink 5 1.8 TB/s / XDR, ConnectX-8, Quantum-X800, Spectrum-X, IMEX for cross-node NVLink).
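
When reading nccl-tests output, two bandwidth columns matter: algorithm bandwidth (message size over elapsed time) and bus bandwidth, which for all-reduce applies the factor 2(n-1)/n described in the nccl-tests performance notes so the result can be compared with hardware link speed. The sketch below reproduces that calculation; the function name and the sample measurement are hypothetical.

```python
def allreduce_bandwidth(size_bytes: float, seconds: float, n_ranks: int):
    """Return (algbw, busbw) in GB/s for one all-reduce measurement."""
    if n_ranks < 2 or seconds <= 0:
        raise ValueError("need at least 2 ranks and a positive duration")
    # Algorithm bandwidth: payload size divided by wall-clock time.
    algbw = size_bytes / seconds / 1e9
    # Bus bandwidth: all-reduce correction factor 2 * (n - 1) / n.
    busbw = algbw * 2 * (n_ranks - 1) / n_ranks
    return algbw, busbw


if __name__ == "__main__":
    # Hypothetical measurement: 8 GB all-reduce across 8 GPUs in 20 ms.
    algbw, busbw = allreduce_bandwidth(8e9, 0.020, 8)
    print(f"algbw = {algbw:.1f} GB/s, busbw = {busbw:.1f} GB/s")
```

Bus bandwidth is the number to hold against the per-generation NVLink or NIC figures, because it is largely independent of the rank count.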

Workstation (RTX PRO / RTX Enterprise)

The RTX PRO 6000 Blackwell and prior RTX 6000 Ada / L40 / L40S. These are professional GPUs with ECC GDDR7/GDDR6, MIG (RTX PRO 6000 Blackwell only: up to 4×24 GB / 2×48 GB / 1×96 GB), vGPU, and GPUDirect RDMA, but no NVLink (Ada and the Blackwell workstation parts have none), so multi-GPU is PCIe peer-to-peer only and there is no NVSwitch domain or Fabric Manager. They run the RTX Enterprise / production-branch driver, not the GeForce or datacenter branch. The RTX PRO 6000 Blackwell ships in Workstation, Server Edition (passive dual-slot for datacenter racks), and Max-Q editions. Detail and citations: RTX & workstation.

Consumer (GeForce)

The GeForce RTX 5090 (Blackwell) and RTX 4090 (Ada). Top consumer cards with current-generation Tensor Cores and large fast memory (5090: 32 GB GDDR7), but the operational envelope is deliberately minimal: no NVLink (Ada Lovelace dropped it; the RTX 3090 was the last NVLink GeForce, and Blackwell consumer has none either), no MIG, no vGPU, no ECC, no GPUDirect RDMA, and a GeForce driver whose licence restricts datacenter deployment. Multi-GPU is PCIe-only with NCCL falling back to PCIe/host transport. These are workstation/desktop and prototyping parts; the EULA and missing features are the reason they are not a datacenter substitute. Detail and citations: RTX & workstation.

Desktop superchip (GB10 / GB300 desktop)

The DGX Spark (GB10) and DGX Station (GB300 desktop) are a class of their own. The GB10 Grace Blackwell Superchip joins a 20-core Arm CPU and a Blackwell GPU over an internal NVLink-C2C link, and both share 128 GB of LPDDR5x unified memory (~273 GB/s); this is neither HBM nor discrete VRAM. There is no external NVLink, so multi-unit scaling runs over networking: the Spark has 2× QSFP ConnectX-7 ports, which can cluster two units (256 GB combined). It runs DGX OS with CUDA/cuDNN/TensorRT preloaded. The DGX Station is the high-end desktop: 1× B300 Grace Blackwell Ultra with 748 GB coherent memory (252 GB HBM3e + 496 GB LPDDR5X) and ConnectX-8. Detail and citations: DGX Spark, DGX systems.
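
What the bandwidth gap means in practice can be estimated with a common rule of thumb: single-stream token generation is roughly memory-bandwidth-bound, so an upper bound on tokens per second is memory bandwidth divided by the bytes of weights read per token. The sketch below applies that rule. It is a back-of-envelope assumption, not a benchmark, and it ignores KV-cache traffic, batching, and compute limits; the function name and the 70B example model are hypothetical.

```python
def decode_tokens_per_second_bound(
    params_billion: float, bits_per_weight: int, bandwidth_gb_s: float
) -> float:
    """Rough upper bound on single-stream decode speed (tokens/s)."""
    # Weight footprint in GB: billions of parameters times bytes per weight.
    weights_gb = params_billion * bits_per_weight / 8
    # Each generated token reads every weight once in this simplified model.
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Hypothetical 70B-parameter model stored at 4 bits per weight.
    for name, bw in (("DGX Spark (~273 GB/s)", 273), ("B200 (~8 TB/s)", 8000)):
        bound = decode_tokens_per_second_bound(70, 4, bw)
        print(f"{name}: <= {bound:.1f} tokens/s")
```

The large model fits in the Spark's 128 GB, but the bound shows why the unified memory suits development and evaluation rather than throughput serving.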

Canonical difference matrix

The single table to reason from. Per-GPU figures unless stated; verify fast-moving numbers against the linked datasheet.

| Feature | A100 (Ampere) | H100 / H200 (Hopper) | B200 / B300 (Blackwell) | RTX 5090 (GeForce) | RTX PRO 6000 BW (WS) | DGX Spark (GB10) |
| --- | --- | --- | --- | --- | --- | --- |
| Tier | datacenter | datacenter | datacenter | consumer | workstation | desktop superchip |
| Tensor gen | 3rd | 4th (+FP8) | 5th (+FP4) | 5th | 5th | 5th |
| Memory | HBM2 40 GB / HBM2e 80 GB | HBM3/HBM3e 80/141 GB | HBM3e 192/288 GB | GDDR7 32 GB (no ECC) | GDDR7 96 GB ECC | LPDDR5x 128 GB unified |
| Mem bandwidth | ~1.6 / ~2.0 TB/s | 3.35 / 4.8 TB/s | ~8 TB/s (verify B300) | 1,792 GB/s | 1,792 GB/s | ~273 GB/s |
| NVLink | 3rd, 600 GB/s | 4th, 900 GB/s | 5th, 1.8 TB/s | none (PCIe Gen5) | none (PCIe Gen5) | C2C internal only |
| MIG | yes (7) | yes (7) | yes | no | yes (4/2/1) | no |
| ECC | yes | yes | yes | no | yes | on-die LPDDR |
| Fabric Manager | NVSwitch sys | NVSwitch sys | NVSwitch sys | no | no | no |
| GPUDirect RDMA | yes | yes | yes | no (GeForce) | yes | via ConnectX |
| vGPU | yes | yes | yes | no | yes | n/a |
| Confidential Computing | no | yes | yes (TEE-I/O) | no | verify datasheet | no |
| Driver branch | datacenter LTS | datacenter LTS | datacenter LTS | GeForce | RTX Enterprise / prod | DGX OS |
| Power (board) | 400 W SXM | 700 W SXM | 1,000–1,400 W | 575 W | 600 W | ~240 W system |
| Power connector | SXM busbar | SXM busbar | SXM busbar | 12V-2x6 | CEM5 16-pin | desktop PSU |
| Form factor | SXM / PCIe | SXM / PCIe | SXM / PCIe | PCIe consumer | PCIe | desktop SoC |
| Cooling | air / liquid | air / liquid | liquid (DC) | air | air | desktop |

Sources: NVIDIA A100, H100, H200, B200, RTX 5090, RTX PRO 6000 Blackwell, and DGX Spark pages and datasheets cited under References; cross-checked against the canonical session brief and the in-repo Glossary.
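
A few rows of the matrix can be encoded as data so that feature requirements become a filter. The sketch below is a minimal, hypothetical encoding of the Tier, NVLink, MIG, and vGPU rows only. Here `nvlink` means an external NVLink domain, so the Spark's internal C2C link counts as False, and its "n/a" vGPU entry is also mapped to False.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixColumn:
    name: str
    tier: str
    nvlink: bool  # external NVLink domain available
    mig: bool
    vgpu: bool


MATRIX = (
    MatrixColumn("A100", "datacenter", True, True, True),
    MatrixColumn("H100 / H200", "datacenter", True, True, True),
    MatrixColumn("B200 / B300", "datacenter", True, True, True),
    MatrixColumn("RTX 5090", "consumer", False, False, False),
    MatrixColumn("RTX PRO 6000 Blackwell", "workstation", False, True, True),
    MatrixColumn("DGX Spark (GB10)", "desktop superchip", False, False, False),
)


def matching(**required: bool) -> list:
    """Names of matrix columns whose features equal every required value."""
    return [
        col.name
        for col in MATRIX
        if all(getattr(col, key) == value for key, value in required.items())
    ]


if __name__ == "__main__":
    print("MIG without NVLink:", matching(mig=True, nvlink=False))
    print("NVLink domain:", matching(nvlink=True))
```

Filtering on both axes at once makes the generation-versus-tier distinction explicit: three of the six columns are fifth-generation Blackwell, yet only one of them offers an NVLink domain.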

How to choose

Pick by the binding constraint, in this order.

Numeric precision the workload needs. FP4/NVFP4 inference and reasoning → Blackwell. FP8 transformer training/inference → Hopper or newer (Ada also has FP8 Tensor Cores). TF32-class mixed-precision training with broad availability → Ampere or newer. Heavy FP64 HPC → favour Ampere/Hopper, since Blackwell deprioritises FP64.
Scale-out / collective topology. Multi-GPU training that must scale on an NVLink/NVSwitch domain (tensor parallelism, large all-reduces) → datacenter SXM only. Embarrassingly parallel or single-GPU jobs tolerate PCIe-only tiers. There is no NVLink on consumer or current workstation parts, so NCCL falls back to PCIe/host, which is acceptable for inference and small jobs but a bottleneck for tightly-coupled training.
Multi-tenancy and isolation. Hard partitioning with fault isolation → MIG (A100/A30, H100/H200, B-series, RTX PRO 6000 Blackwell). Hypervisor-mediated GPU sharing across VMs → vGPU (datacenter and workstation, not GeForce). Regulated data in use → Confidential Computing (Hopper+, TEE-I/O on Blackwell). See security & multi-tenancy.
Memory footprint. The model and KV-cache must fit. Largest single-GPU memory: B300 (288 GB) > B200 (192 GB) > H200 (141 GB). Unified large-but-slow memory for local dev of big models: DGX Spark (128 GB) or DGX Station (748 GB).
Deployment context and licence. Production datacenter → datacenter or RTX-Enterprise driver; the GeForce driver licence restricts datacenter use, so consumer cards are a compliance non-starter there regardless of raw performance. Desk-side development → DGX Spark / workstation. See cloud, neoclouds & cost.
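
The memory-footprint step can be roughed out numerically. The sketch below estimates the weight footprint from parameter count and precision and lists which of the three largest-memory GPUs named above can hold it. The helper names are hypothetical, and the 20% headroom for KV-cache and activations is an illustrative assumption, not a sizing rule.

```python
# Per-GPU memory in GB, from the figures quoted on this page.
GPU_MEMORY_GB = {"H200": 141, "B200": 192, "B300": 288}


def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Weight footprint in GB for a model of the given size and precision."""
    return params_billion * bits_per_weight / 8


def gpus_that_fit(params_billion: float, bits_per_weight: int,
                  headroom: float = 0.2) -> list:
    """GPUs whose memory covers the weights plus a fractional headroom."""
    needed = weights_gb(params_billion, bits_per_weight) * (1 + headroom)
    return [name for name, gb in GPU_MEMORY_GB.items() if gb >= needed]


if __name__ == "__main__":
    # Hypothetical 70B-parameter model at three precisions.
    for bits in (16, 8, 4):
        print(f"70B @ {bits}-bit:", weights_gb(70, bits), "GB ->",
              gpus_that_fit(70, bits))
```

Precision and memory interact: dropping from 16-bit to 8-bit or 4-bit weights can move a model from multi-GPU to single-GPU, which in turn changes whether the NVLink criterion applies at all.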

flowchart TD
  START["Choose a GPU"] --> Q1{"Need NVLink/NVSwitch scale-out for tightly-coupled multi-GPU?"}
  Q1 -->|Yes| DC["Datacenter SXM: pick generation by precision (FP4 -> Blackwell, FP8 -> Hopper)"]
  Q1 -->|No| Q2{"Datacenter deployment or regulated multi-tenancy?"}
  Q2 -->|Yes| Q3{"Need MIG / vGPU / ECC but single-node is fine?"}
  Q3 -->|Yes| WS["Workstation RTX PRO 6000 Blackwell (RTX Enterprise driver)"]
  Q3 -->|No| DC2["Datacenter PCIe card (full feature set, no facility NVSwitch)"]
  Q2 -->|No| Q4{"Desk-side dev of large models on unified memory?"}
  Q4 -->|Yes| SPARK["DGX Spark GB10 / DGX Station GB300 (DGX OS)"]
  Q4 -->|No| GF["Consumer GeForce for prototyping only — NOT datacenter (EULA + no NVLink/MIG/ECC)"]
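
The same decision flow can be written as a plain function, which is convenient for intake forms or capacity-planning scripts. This is a direct transcription of the flowchart; the function and parameter names are illustrative.

```python
def choose_tier(
    needs_nvlink_scale_out: bool,
    datacenter_or_regulated: bool,
    partitioning_on_single_node_ok: bool,
    desk_side_unified_memory: bool,
) -> str:
    """Walk the flowchart questions in order and return the suggested tier."""
    if needs_nvlink_scale_out:
        return "Datacenter SXM"
    if datacenter_or_regulated:
        if partitioning_on_single_node_ok:
            return "Workstation RTX PRO 6000 Blackwell"
        return "Datacenter PCIe card"
    if desk_side_unified_memory:
        return "DGX Spark GB10 / DGX Station GB300"
    return "Consumer GeForce (prototyping only)"


if __name__ == "__main__":
    print(choose_tier(True, True, False, False))
    print(choose_tier(False, True, True, False))
    print(choose_tier(False, False, False, True))
    print(choose_tier(False, False, False, False))
```

Because the questions are evaluated in order, a need for NVLink scale-out settles the tier before any later criterion is consulted.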

What each tier does NOT support

Explicit non-support, because these gaps cause the most operational surprises.

Consumer GeForce (RTX 5090 / 4090): no NVLink, no MIG, no vGPU, no ECC, no GPUDirect RDMA, no Confidential Computing, and a GeForce driver licence that restricts datacenter deployment. Multi-GPU is PCIe peer-to-peer only; NCCL falls back to PCIe/host; there is no NVSwitch domain and Fabric Manager is not used. These are prototyping/workstation parts, not datacenter substitutes.
Workstation RTX PRO 6000 Blackwell / RTX 6000 Ada / L40S: no NVLink. Multi-GPU is PCIe-only; no NVSwitch domain, no Fabric Manager. RTX PRO 6000 Blackwell does have MIG/vGPU/ECC; RTX 6000 Ada and L40S do not have MIG. Confirm Confidential Computing on the current datasheet; do not assume it.
MIG is not universal even within a generation. MIG-capable: A100, A30, H100, H200, B200, B300, GB200, GB300, and RTX PRO 6000 Blackwell. Not MIG-capable: GeForce (all), A40, A10, L40/L40S, RTX 6000 Ada, DGX Spark GB10.
DGX Spark (GB10): no external NVLink, no MIG, no discrete HBM. Memory is 128 GB LPDDR5x unified at ~273 GB/s (far below datacenter HBM bandwidth), and multi-unit scaling is over ConnectX-7 networking, not an NVLink domain. It is a development appliance, not a training-cluster node.
Confidential Computing is Hopper-and-newer. Ampere (A100/A30/A40/A10) does not support it. TEE-I/O across NVLink is Blackwell-only.
FP8 needs Hopper+ (or Ada); FP4/NVFP4 needs Blackwell. Ampere has neither. FP64 is deprioritised on Blackwell, so do not size double-precision HPC on B-series.

Gotchas & failure modes

Generation ≠ tier. "It's Blackwell" does not imply NVLink, MIG, or datacenter-legal. The RTX 5090 is Blackwell and has none of those. Always state both axes.
Driver-branch mismatch. Datacenter, GeForce, and RTX-Enterprise are different driver branches with different licences and feature gates; installing the wrong one silently loses MIG/vGPU/ECC support or violates the EULA. See GPU software stack and the driver upgrade runbook.
Fabric Manager only applies to NVSwitch systems. On PCIe-only or single-GPU tiers there is no fabric to program; expecting nv-fabricmanager there is a category error. It must be lockstep-versioned with the driver where it does apply.
B300 / GB300 numbers move. NVIDIA's HGX B300 node memory (2.1 TB across 8 GPUs) and the commonly cited 288 GB per-GPU figure do not divide cleanly (8 × 288 GB is about 2.3 TB, not 2.1 TB); per-GPU FP4 PFLOPS and bandwidth vary by dense-vs-sparse and node-vs-GPU framing. Re-pull the datasheet before quoting a single number, and ignore third-party "384 GB B300" claims. NVIDIA's figure is 288 GB.
China-export variants differ on interconnect. A800/H800/H20 reduce NVLink or compute versus the base part; never assume base-SKU bandwidth from the family name.
Consumer-card datacenter deployment is a licence risk, not just a feature gap. Even where a GeForce card is performant, the GeForce driver EULA is the blocker, so verify the exact current clause before any datacenter use.
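
The B300 node-versus-GPU mismatch is easy to quantify, and doing so shows why a single figure should not be quoted without naming its source. The sketch below uses only the two numbers given above; decimal units (1 TB = 1,000 GB) are an assumption.

```python
GPUS_PER_NODE = 8
PER_GPU_GB = 288          # widely cited B300 per-GPU figure
NODE_LISTED_GB = 2100     # HGX B300 page: 2.1 TB node memory

# Node total implied by the per-GPU figure.
node_from_per_gpu_gb = GPUS_PER_NODE * PER_GPU_GB
# Per-GPU memory implied by the node figure.
per_gpu_from_node_gb = NODE_LISTED_GB / GPUS_PER_NODE
# Gap between the two framings at node level.
gap_gb = node_from_per_gpu_gb - NODE_LISTED_GB

if __name__ == "__main__":
    print(f"8 x 288 GB        = {node_from_per_gpu_gb} GB")
    print(f"2.1 TB / 8        = {per_gpu_from_node_gb} GB per GPU")
    print(f"Node-level gap    = {gap_gb} GB")
```

The two framings differ by roughly 200 GB per node, so capacity plans should state which figure they are built on.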

References

NVIDIA primary sources (each URL confirmed reachable, June 2026):

NVIDIA A100 product page: https://www.nvidia.com/en-us/data-center/a100/
NVIDIA A100 datasheet (PDF): https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf
NVIDIA Ampere architecture page: https://www.nvidia.com/en-us/data-center/ampere-architecture/
NVIDIA Ampere architecture whitepaper: https://resources.nvidia.com/en-us-tensor-core/nvidia-ampere-architecture-whitepaper
NVIDIA A30 product page: https://www.nvidia.com/en-us/data-center/products/a30-gpu/
NVIDIA A40 product page: https://www.nvidia.com/en-us/data-center/a40/
NVIDIA H100 product page: https://www.nvidia.com/en-us/data-center/h100/
NVIDIA H200 product page: https://www.nvidia.com/en-us/data-center/h200/
NVIDIA H200 datasheet (PDF): https://resources.nvidia.com/en-us-data-center-overview/hpc-datasheet-sc23-h200
NVIDIA Hopper architecture page: https://www.nvidia.com/en-us/data-center/technologies/hopper-architecture/
NVIDIA Hopper architecture deep dive (developer blog): https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/
NVIDIA Grace Hopper Superchip page: https://www.nvidia.com/en-us/data-center/grace-hopper-superchip/
NVIDIA Blackwell architecture page: https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/
NVIDIA B200 (HGX B200 / B300) page: https://www.nvidia.com/en-us/data-center/b200/
NVIDIA GB200 NVL72 page: https://www.nvidia.com/en-us/data-center/gb200-nvl72/
NVIDIA GB300 NVL72 page: https://www.nvidia.com/en-us/data-center/gb300-nvl72/
NVIDIA Blackwell launch announcement: https://nvidianews.nvidia.com/news/nvidia-blackwell-platform-arrives-to-power-a-new-era-of-computing
NVIDIA GeForce RTX 5090 page: https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/
NVIDIA GeForce RTX 4090 page: https://www.nvidia.com/en-us/geforce/graphics-cards/40-series/rtx-4090/
NVIDIA GeForce driver licence (EULA): https://www.nvidia.com/en-us/drivers/geforce-license/
NVIDIA RTX PRO 6000 Blackwell page: https://www.nvidia.com/en-us/products/workstations/professional-desktop-gpus/rtx-pro-6000/
NVIDIA RTX PRO 6000 Blackwell Server Edition page: https://www.nvidia.com/en-us/data-center/rtx-pro-6000-blackwell-server-edition/
NVIDIA L40S product page: https://www.nvidia.com/en-us/data-center/l40s/
NVIDIA DGX Spark page: https://www.nvidia.com/en-us/products/workstations/dgx-spark/
NVIDIA DGX Spark hardware docs: https://docs.nvidia.com/dgx/dgx-spark/hardware.html
NVIDIA DGX Station page: https://www.nvidia.com/en-us/data-center/dgx-station/
NVIDIA DGX H100 page: https://www.nvidia.com/en-us/data-center/dgx-h100/
NVIDIA DGX B200 page: https://www.nvidia.com/en-us/data-center/dgx-b200/
NVIDIA HGX platform page: https://www.nvidia.com/en-us/data-center/hgx/
NVIDIA NVLink & NVSwitch page: https://www.nvidia.com/en-us/data-center/nvlink/
NVIDIA NVLink-C2C page: https://www.nvidia.com/en-us/data-center/nvlink-c2c/
NVIDIA Multi-Instance GPU (MIG) page: https://www.nvidia.com/en-us/technologies/multi-instance-gpu/
NVIDIA MIG user guide: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/
NVIDIA virtual GPU (vGPU) solutions: https://www.nvidia.com/en-us/data-center/virtual-solutions/
NVIDIA Confidential Computing page: https://www.nvidia.com/en-us/data-center/solutions/confidential-computing/
NVIDIA Transformer Engine (GitHub): https://github.com/NVIDIA/TransformerEngine
NVIDIA Transformer Engine user guide: https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html