Datacentre physical readiness

Scope: reading datacentre drawings and confirming the facility can actually host the cluster. Power, UPS, cooling, airflow, weight, and the schematics that describe them. This is the most facility-specific layer in this knowledge base.

flowchart LR
  FLOOR["Floor plan"] --> POWER["Power path"]
  FLOOR --> COOLING["Cooling path"]
  POWER --> READY["Rack readiness"]
  COOLING --> READY
  READY --> BOM["BOM validation"]

Overview

Blackwell-class density has moved the deployment risk from compute to facility. Memory capacity is no longer the constraint; power delivery and heat rejection are. The skill is to read a floor plan, power schematic, and cooling layout and state, with numbers, whether the hall can take the load.

Core knowledge

Drawing types to read

  • Floor plan / whitespace layout: rack positions, aisles, clearances, structural load zones, routes for power and liquid.
  • Rack elevations: U-by-U layout, weight distribution, top-of-rack switch placement.
  • Power distribution / one-line diagram: utility feed, transformers, switchgear, UPS, PDU, busway, down to rack whips. Phases and redundancy.
  • UPS schematic: topology (N, N+1, 2N), runtime, static transfer switches.
  • Cooling / mechanical layout: CDUs, manifolds, pipework, CRAH/CRAC or rear-door heat exchangers, hot/cold aisle containment, airflow direction.
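
As a rough aid when reading a UPS schematic, the redundancy notations above can be translated into module counts. The sketch below assumes equal-sized UPS modules; the function name and the example load and module rating are illustrative and do not come from any facility drawing.

```python
import math

def ups_modules(load_kw: float, module_kw: float, topology: str) -> int:
    """Number of equal-sized UPS modules implied by a redundancy topology.

    N   : just enough modules to carry the load
    N+1 : one spare module on top of N
    2N  : two fully independent sets of N
    """
    n = math.ceil(load_kw / module_kw)
    if topology == "N":
        return n
    if topology == "N+1":
        return n + 1
    if topology == "2N":
        return 2 * n
    raise ValueError(f"unknown topology: {topology!r}")

# Hypothetical example: 1,000 kW critical load on 250 kW modules.
for topo in ("N", "N+1", "2N"):
    print(topo, ups_modules(1000.0, 250.0, topo))
```

Runtime and static-transfer behaviour still have to be read off the schematic itself; this only checks that the module count on the drawing matches the topology it claims.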

Power density and the Blackwell reality

  • B300 GPU TDP is 1,400 W. A GB300 NVL72 rack draws roughly 120 to 140 kW (sources cite ~120 kW; Supermicro quotes 132 to 140 kW; Microsoft ~136 kW). This is an order of magnitude above a conventional enterprise rack.
  • Transient behaviour: power can spike to about 1.4x steady-state during gradient synchronisation, at microsecond scale. NVIDIA uses power smoothing (energy storage and burn mechanisms) and multiple power-shelf configurations to absorb synchronous load ramps. Plan feeds and protection for the transient, not just the average.
  • Harmonics: GB300 racks show meaningful total harmonic distortion under training load. Beyond roughly eight racks, dedicated transformers with 12-pulse rectifiers are typically needed to stay IEEE 519 compliant.
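
The "plan for the transient" point can be turned into a quick number. The sketch below applies the ~1.4x spike factor quoted above to a steady-state rack figure and converts both to line current using the standard balanced three-phase relation I = P / (√3 · V · PF). The 415 V line voltage and 0.95 power factor are assumptions for illustration; take the real values from the one-line diagram.

```python
import math

def transient_kw(steady_kw: float, spike_factor: float = 1.4) -> float:
    """Peak rack power if the transient multiplier applies to the whole rack."""
    return steady_kw * spike_factor

def line_current_a(power_kw: float,
                   line_voltage_v: float = 415.0,   # assumed line-to-line voltage
                   power_factor: float = 0.95) -> float:  # assumed power factor
    """Line current of a balanced three-phase load."""
    return power_kw * 1000.0 / (math.sqrt(3) * line_voltage_v * power_factor)

# Rack figures quoted in this section (kW).
for steady in (120.0, 132.0, 140.0):
    peak = transient_kw(steady)
    print(f"{steady:.0f} kW steady: {line_current_a(steady):.0f} A, "
          f"{peak:.0f} kW peak: {line_current_a(peak):.0f} A")
```

How much of the spike actually reaches the feed depends on the rack's power smoothing and power-shelf configuration, so treat the peak column as an upper bound to discuss with the electrical engineer rather than a breaker rating.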

Power & cooling by GPU generation

Per-GPU board power has risen 3.5x from A100 to B300 (400 W to 1,400 W), and the connector and cooling story changes with it (GPU generations):

| GPU | Board power | Cooling viability |
| --- | --- | --- |
| A100 SXM (Ampere) | 400 W | Air |
| H100 SXM (Hopper) | 700 W (configurable) | Air (dense) to liquid |
| B200 (Blackwell) | ~1,000 W | Liquid in DC density |
| B300 (Blackwell Ultra) | 1,400 W | Liquid mandatory |

  • Air cooling stays viable through Hopper. H100-class racks are still routinely air-cooled. Blackwell-class density (B200 ~1 kW, B300 1.4 kW per GPU) pushes per-rack load to roughly 120 to 140 kW for a GB300 NVL72 and makes direct-to-chip liquid cooling mandatory.
  • Connectors differ by tier. Datacenter SXM modules draw from a baseboard/busbar (no per-card cable). Consumer cards use a single 16-pin cable: 12VHPWR on Ada (RTX 4090) and 12V-2x6 on Blackwell consumer (RTX 5090, 575 W total graphics power, fed by a Gen5 12V cable or 4x 8-pin adapter). The RTX PRO 6000 Blackwell (up to 600 W) takes a single CEM5 16-pin connector.
  • Per-rack implication. Air-cooled enterprise racks top out well below a single Blackwell GPU tray. Sizing PDU phase, whip, and connector to the SXM busbar (datacenter) versus a per-card 16-pin (PCIe pro/consumer) is a distinct BOM check (BOM validation); the consumer 12V connectors carry near their rated limit and demand correct seating and gauge.
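
A short arithmetic check ties the per-GPU figures to the rack figure. The sketch below uses the board powers from the table and the 72 GPUs of an NVL72 rack; the names are illustrative. It shows that the GPUs alone account for about 100 kW, with CPUs, switches, and conversion losses making up the rest of the 120 to 140 kW range.

```python
# Board power per GPU in watts, as listed in the table above.
BOARD_POWER_W = {
    "A100 SXM": 400,
    "H100 SXM": 700,
    "B200": 1000,
    "B300": 1400,
}

def ratio_to_baseline(power_w: dict, baseline: str = "A100 SXM") -> dict:
    """Board power of each GPU relative to the baseline part."""
    base = power_w[baseline]
    return {name: watts / base for name, watts in power_w.items()}

def gpu_only_rack_kw(gpu_count: int, board_power_w: float) -> float:
    """Rack power attributable to GPUs alone, in kW."""
    return gpu_count * board_power_w / 1000.0

for name, ratio in ratio_to_baseline(BOARD_POWER_W).items():
    print(f"{name}: {ratio:.2f}x A100")

print("GB300 NVL72 GPU-only load:",
      gpu_only_rack_kw(72, BOARD_POWER_W["B300"]), "kW")
```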

Cooling

  • At 1,400 W per GPU, liquid cooling is mandatory in all B300 form factors; air cooling is insufficient. GB300 NVL72 ships with integrated liquid cooling; HGX B300 baseboards need an OEM liquid solution.
  • The typical stack combines direct-to-chip cold plates, CDUs, and rear-door heat exchangers. Rear-door HX programmes claim per-rack capacities well above 100 kW.
  • Practical rule from the field: size cooling for about 110% of rated TDP to absorb thermal spikes without throttling.
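
The 110% rule and the resulting coolant flow can be estimated with the basic heat balance Q = ṁ · c_p · ΔT. The sketch below is a first-pass estimate only: it assumes plain water (c_p ≈ 4.186 kJ/kg·K, about 1 kg per litre), a 10 K supply-to-return temperature rise, and that all rack heat goes into the liquid loop. Glycol mixes, a different ΔT, or an air-side share will change the numbers, so use the CDU vendor's data for the real design.

```python
def cooling_design_kw(rack_tdp_kw: float, margin: float = 1.10) -> float:
    """Cooling capacity to provision: rated TDP plus the ~10% field margin."""
    return rack_tdp_kw * margin

def coolant_flow_lpm(heat_kw: float,
                     delta_t_k: float = 10.0,          # assumed loop temperature rise
                     cp_kj_per_kg_k: float = 4.186,    # water; lower for glycol mixes
                     density_kg_per_l: float = 1.0) -> float:
    """Coolant flow in litres per minute needed to carry heat_kw at delta_t_k."""
    mass_flow_kg_s = heat_kw / (cp_kj_per_kg_k * delta_t_k)
    return mass_flow_kg_s / density_kg_per_l * 60.0

rack_tdp_kw = 132.0  # one of the rack figures quoted earlier
design_kw = cooling_design_kw(rack_tdp_kw)
print(f"design load {design_kw:.1f} kW, "
      f"flow about {coolant_flow_lpm(design_kw):.0f} L/min at 10 K rise")
```

Halving the temperature rise doubles the required flow, which is why supply temperature and flow rate have to be confirmed together against the CDU capacity.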

Mechanical and structural

  • A GB300 NVL72 cabinet weighs roughly 1.36 t (about 3,000 lb). It is a 48U rack but occupies a standard 42U floor footprint, so the load lands on a conventional tile area. Confirm floor loading.
  • Centre of gravity sits higher than in standard server racks because of the dense upper-tray compute. Positive seismic/anchor hardware (bolted to the slab, not standard cage nuts) is advised to guard against micro-vibration and tip risk under full load.
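
A first-pass floor-loading check divides the cabinet mass by its footprint and compares the result with the floor rating on the structural drawing. In the sketch below the 0.6 m x 1.2 m footprint and the 1,500 kg/m² rating are placeholder assumptions, not figures from the source; a uniform load is also a simplification, since the real load lands on feet or casters and needs a structural engineer's sign-off.

```python
def floor_load_kg_per_m2(mass_kg: float, width_m: float, depth_m: float) -> float:
    """Average load if the cabinet mass is spread evenly over its footprint."""
    return mass_kg / (width_m * depth_m)

def floor_ok(mass_kg: float, width_m: float, depth_m: float,
             rating_kg_per_m2: float) -> bool:
    """True if the average footprint load is within the floor rating."""
    return floor_load_kg_per_m2(mass_kg, width_m, depth_m) <= rating_kg_per_m2

cabinet_kg = 1360.0          # GB300 NVL72 cabinet mass from this section
width_m, depth_m = 0.6, 1.2  # assumed footprint; read the real one off the floor plan
rating = 1500.0              # assumed floor rating in kg/m^2

load = floor_load_kg_per_m2(cabinet_kg, width_m, depth_m)
print(f"{load:.0f} kg/m^2 against a rating of {rating:.0f} kg/m^2:",
      "ok" if floor_ok(cabinet_kg, width_m, depth_m, rating) else "exceeds rating")
```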

Connectivity to the outside

  • Uplink to customer edge: at least 2x 100 GbE over single-mode fibre (DR1 optics), BGP peering for route handover, in-band and OOB routes announced.

Don't-miss checklist

  • Confirm rack power feed, phase, connector, and redundancy match the BOM PDUs (BOM validation).
  • Confirm the hall's per-rack cooling capacity meets or exceeds rack TDP plus margin.
  • Confirm liquid-cooling loop: CDU capacity, flow rate, supply temperature, leak detection.
  • Confirm floor loading and anchoring for rack weight and centre of gravity.
  • Confirm UPS topology and runtime against the load.
  • Check harmonic mitigation for multi-rack deployments.
  • Confirm cable routes (power and fibre) on the floor plan match the run lengths in the BOM.
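
Several of these checks are numeric and can be captured in a small script so the same comparison is run for every hall. The sketch below is one possible shape, not an established tool: the class names, field names, and thresholds are hypothetical, and it covers only feed, cooling, liquid loop, and floor rating. UPS runtime, harmonics, and cable routes still need the drawings.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RackRequirement:
    steady_kw: float               # steady-state rack draw
    spike_factor: float = 1.4      # transient multiplier from the power section
    cooling_margin: float = 1.10   # 110% sizing rule from the cooling section
    mass_kg: float = 1360.0        # cabinet mass from the mechanical section

@dataclass
class HallCapability:
    feed_kw_per_rack: float        # what the power path can deliver per rack
    cooling_kw_per_rack: float     # what the cooling path can reject per rack
    max_rack_mass_kg: float        # floor rating expressed per rack position
    has_liquid_loop: bool          # CDU and manifold available at the position

def readiness_issues(rack: RackRequirement, hall: HallCapability) -> List[str]:
    """Return the checklist items the hall fails for this rack (empty if none)."""
    issues = []
    if hall.feed_kw_per_rack < rack.steady_kw * rack.spike_factor:
        issues.append("feed sized below transient peak")
    if hall.cooling_kw_per_rack < rack.steady_kw * rack.cooling_margin:
        issues.append("cooling capacity below TDP plus margin")
    if not hall.has_liquid_loop:
        issues.append("no liquid-cooling loop at rack position")
    if hall.max_rack_mass_kg < rack.mass_kg:
        issues.append("floor rating below cabinet mass")
    return issues

# Hypothetical air-cooled hall checked against a 132 kW rack.
legacy_hall = HallCapability(feed_kw_per_rack=40.0, cooling_kw_per_rack=30.0,
                             max_rack_mass_kg=1000.0, has_liquid_loop=False)
for issue in readiness_issues(RackRequirement(steady_kw=132.0), legacy_hall):
    print("FAIL:", issue)
```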

Failure modes

  • Hall rated for air cooling or for far lower per-rack kW than a GB300 rack needs.
  • Feed sized for steady-state, tripping on transient spikes during training.
  • Floor loading or anchoring inadequate for cabinet weight and high CoG.
  • Harmonics out of compliance once several racks are populated.
  • Cable routes on the plan that do not match the run lengths the procured media supports.

Open questions & validation

  • Build fluency reading a one-line power diagram and a cooling P&ID against real facility drawings.
  • Learn the per-rack cooling maths well enough to assess a layout live.
  • Confirm per-rack cooling capacity and harmonic mitigation against the actual hall before populating multiple racks (BOM validation).

References

  • GB300 deployment, power, cooling, weight, harmonics: https://introl.com/blog/why-nvidia-gb300-nvl72-blackwell-ultra-matters
  • Blackwell Ultra infrastructure requirements (power trajectory, liquid cooling): https://introl.com/blog/nvidia-blackwell-ultra-b300-infrastructure-requirements-2025
  • Facility planning framing (whitespace, cooling loops, power phases): https://radiant.co/blog/nvidia-blackwell-ultra-b300-gb300-gpus
  • NVIDIA A100 (400 W SXM): https://www.nvidia.com/en-us/data-center/a100/
  • NVIDIA H100 (700 W SXM, configurable): https://www.nvidia.com/en-us/data-center/h100/
  • NVIDIA GeForce RTX 5090 (575 W, Gen5 12V connector): https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/
  • NVIDIA RTX PRO 6000 Blackwell (up to 600 W): https://www.nvidia.com/en-us/products/workstations/professional-desktop-gpus/rtx-pro-6000/

Related: BOM · Commissioning · Platform · Glossary