Overview
The NVIDIA H200 NVL is a Hopper-architecture PCIe AI accelerator with 141GB of HBM3e memory, the highest HBM capacity available in a PCIe form factor. Passively cooled and with NVLink bridge support, it delivers up to 989 TFLOPS TF32 and up to 1,979 TFLOPS BF16 Tensor Core performance, targeting inference serving for long-context LLMs and multimodal foundation models at enterprise scale.
Key Features
- 141GB HBM3e: maximum memory capacity in a PCIe accelerator format
- Hopper architecture: up to 989 TFLOPS TF32 Tensor Core performance
- Up to 1,979 TFLOPS BF16/FP16 for generative AI throughput
- FP8 inference at up to 3,958 TFLOPS
- Passive PCIe cooling with NVLink bridge support for multi-GPU scaling
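As a rough sizing aid, the sketch below estimates whether a model's weights alone fit in the 141GB listed above. The parameter counts, the bytes-per-parameter figures, and the `reserve_gb` headroom for KV cache and activations are illustrative assumptions, not vendor guidance.

```python
H200_NVL_MEMORY_GB = 141  # capacity from the listing, in decimal GB


def weights_gb(params_billion, bytes_per_param):
    """Weight footprint in decimal GB: (params_billion * 1e9 * bytes) / 1e9."""
    return params_billion * bytes_per_param


def fits(params_billion, bytes_per_param, reserve_gb=0):
    """True if the weights plus an assumed reserve fit on a single card."""
    needed = weights_gb(params_billion, bytes_per_param) + reserve_gb
    return needed <= H200_NVL_MEMORY_GB


if __name__ == "__main__":
    # Hypothetical model sizes; 2 bytes/param for FP16/BF16, 1 byte/param for FP8.
    for params_b in (70, 180):
        for label, nbytes in (("FP16/BF16", 2), ("FP8", 1)):
            size = weights_gb(params_b, nbytes)
            ok = fits(params_b, nbytes, reserve_gb=10)
            print(f"{params_b}B @ {label}: {size} GB, fits with 10 GB reserve: {ok}")
```

Under these assumptions, a 70B-parameter model at 2 bytes per parameter needs about 140GB for weights alone, which leaves almost no headroom on one card, while the same model at 1 byte per parameter needs about 70GB.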
Ideal For
Specifications
- FP64 Tensor Core: 67 TFLOPS
- TF32 Tensor Core²: 989 TFLOPS
- Architecture: Hopper
- BFLOAT16 Tensor Core²: 1,979 TFLOPS
- FP16 Tensor Core²: 1,979 TFLOPS
- FP8 Tensor Core²: 3,958 TFLOPS
- INT8 Tensor Core²: 3,958 TFLOPS
- GPU Memory Bandwidth: 4.8TB/s
- Decoders: 7 NVDEC, 7 JPEG
- Confidential Computing: Supported
- Max Thermal Design Power (TDP): Up to 600W (configurable)
- Multi-Instance GPUs: Up to 7 MIGs @16.5GB each
- Interconnect: 2- or 4-way NVIDIA NVLink bridge (900GB/s); PCIe Gen5 (128GB/s)
- Server Options: NVIDIA MGX™ H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs
- NVIDIA AI Enterprise: Add-on
- Warranty: 3 years, manufacturer parts or replacement
² With sparsity.
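To put the bandwidth and MIG figures in context, here is a back-of-the-envelope sketch. It assumes single-stream decoding is limited purely by reading all model weights from memory once per token, and it ignores KV cache traffic, compute time, and software overhead, so the result is a loose ceiling and not a benchmark. The function names are hypothetical.

```python
MEMORY_BANDWIDTH_GB_S = 4800  # 4.8TB/s from the specifications, in GB/s
MIG_INSTANCES = 7             # maximum MIG instances from the specifications
MIG_SLICE_GB = 16.5           # memory per MIG instance from the specifications


def decode_tokens_per_s_ceiling(weights_gb, bandwidth_gb_s=MEMORY_BANDWIDTH_GB_S):
    """Upper bound on tokens/s if each token requires one full read of the weights."""
    return bandwidth_gb_s / weights_gb


def mig_total_gb(instances=MIG_INSTANCES, slice_gb=MIG_SLICE_GB):
    """Total memory exposed when the card is split into equal MIG slices."""
    return instances * slice_gb


if __name__ == "__main__":
    for weights in (70, 140):  # assumed weight footprints in GB
        ceiling = decode_tokens_per_s_ceiling(weights)
        print(f"{weights} GB of weights: at most ~{ceiling:.1f} tokens/s per stream")
    print(f"7-way MIG exposes {mig_total_gb()} GB in total")
```

With these inputs, seven 16.5GB MIG slices add up to 115.5GB, and a model whose weights occupy 140GB is capped at roughly 34 tokens per second per stream under the stated assumption.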
Ships in 2 weeks from payment. All sales final. No returns or cancellations. For bulk inquiries, consult a live chat agent or call our toll-free number.
Prices may vary. Verify on vendor site.
Quick Specs
- FP64 Tensor Core: 67 TFLOPS
- TF32 Tensor Core²: 989 TFLOPS
- Architecture: Hopper
- BFLOAT16 Tensor Core²: 1,979 TFLOPS
- FP16 Tensor Core²: 1,979 TFLOPS
- FP8 Tensor Core²: 3,958 TFLOPS
