AI VRAM Calculator

Estimate GPU memory requirements for running, fine-tuning, or training large language models. Pick your model and workload to get a VRAM budget and matching hardware from our catalog of 649 systems.

Configure your workload

What are you doing with it?
Precision
Usage pattern

Estimate

Minimum VRAM: 19 GB
Recommended VRAM (with 20% headroom): 23 GB
Recommended GPUs: RTX 4090, RTX 5090, RTX A5000
VRAM breakdown
  • Model weights: 16 GB
  • KV cache: 1 GB
  • Activations / buffers: 1.6 GB

Hardware that can run this

8 systems from our catalog with matching GPUs, sorted by fit and price.

Tracer VIII Ultra Gaming I16U 300

CyberpowerPC • Laptops
Computer Vision, LLM Inference, Generative AI
GPU: NVIDIA® GeForce RTX™ 4090 16GB GDDR6 Video Card [AI-Powered Graphics] (Included)
CPU: Intel® Core™ Processor i9-14900HX 8P/16 + 16E Max Turbo 5.8GHz 36MB Cache
$2,185
VSXR5 540CU

Thinkmate • Workstations
Generative AI, LLM Training, HPC
GPU: Integrated Video (Included with Motherboard) | NVIDIA® GeForce® RTX 5060 Ti 16GB GDDR7 (2-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5070 12GB GDDR7 (2.4-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5070 Ti 16GB GDDR7 (2.98-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5080 16GB GDDR7 (3-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5090 32GB GDDR7 (3.5-Slot) (3xDP, 1xHDMI) | NVIDIA® RTX A400 - 4GB GD…
CPU: Intel® Core™ Ultra 5 Processor 225 - 10 Cores (6P+4E) 10 Threads - 4.9 GHz Max - 121W Max - Graphics | Intel® Core™ Ultra 5 Processor 225F - 10 Cores (6P+4E) 10 Threads - 4.9 GHz Max - 121W Max | Intel® Core™ Ultra 5 Processor 245KF - 14 Cores (6P+8E) 14 Threads - 5.2 GHz Max - 159W Max | Intel® Core™ Ultra 7 Processor 265F - 20 Cores (8P+12E) 20 Threads - 5.3 GHz Max - 182W Max | Intel® Core™ Ult…
RAM: 16GB PC5-44800 5600MHz DDR5 Non-ECC UDIMM (Up To 4) | 32GB PC5-44800 5600MHz DDR5 Non-ECC UDIMM (Up To 4)
$2,356
Tracer VII Gaming I16G LC 300

CyberpowerPC • Laptops
Computer Vision, LLM Inference, Generative AI
GPU: NVIDIA® GeForce RTX™ 4090 16GB GDDR6 Video Card [AI-Powered Graphics] (Included)
CPU: Intel® Core™ Processor i9-13900HX 8P/16 + 16E Max Turbo 5.4GHz 36MB Cache (Raptor Lake)
$2,395
Tracer VII Gaming I16G LC 400

CyberpowerPC • Laptops
Computer Vision, LLM Inference, Generative AI
GPU: NVIDIA® GeForce RTX™ 4090 16GB GDDR6 Video Card [AI-Powered Graphics] (Included)
CPU: Intel® Core™ Processor i9-13900HX 8P/16 + 16E Max Turbo 5.4GHz 36MB Cache (Raptor Lake)
$2,495
PNY NVIDIA Quadro RTX A5000 24GB GDDR6 Graphics Card (One Pack)

Amazon • Accelerators
LLM Inference, Data Analytics
$2,500
VSXR5 340R7

Thinkmate • Workstations
Generative AI, LLM Training, HPC
GPU: Integrated Video (Included with Motherboard) | NVIDIA® GeForce® RTX 5060 Ti 16GB GDDR7 (2-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5070 12GB GDDR7 (2.4-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5070 Ti 16GB GDDR7 (2.98-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5080 16GB GDDR7 (3-Slot) (3xDP, 1xHDMI) | NVIDIA® GeForce® RTX 5090 32GB GDDR7 (3.5-Slot) (3xDP, 1xHDMI) | NVIDIA® RTX A400 - 4GB GD…
CPU: AMD Ryzen™ 5 9600X Processor 6-core 3.90GHz 32MB L3 Cache (65W) | AMD Ryzen™ 7 9700X Processor 8-core 3.80GHz 32MB L3 Cache (65W) | AMD Ryzen™ 9 9900X Processor 12-core 4.40GHz 64MB L3 Cache (120W) | AMD Ryzen™ 9 9950X Processor 16-core 4.30GHz 64MB L3 Cache (170W) | AMD Ryzen™ 7 9800X3D Processor 8-core 4.70GHz 96MB L3 Cache (120W) | AMD Ryzen™ 9 9900X3D Processor 12-core 4.40GHz 128MB L3 Cache (…
RAM: 16GB PC5-44800 5600MHz DDR5 Non-ECC UDIMM (Up To 4) | 32GB PC5-44800 5600MHz DDR5 Non-ECC UDIMM (Up To 4) | 16GB PC5-41600 5600MHz DDR5 ECC UDIMM (Up To 4) | 32GB PC5-41600 5600MHz DDR5 ECC UDIMM (Up To 4)
$2,556
NVIDIA RTX A5000 Enterprise 24GB 105MH/s 230W

Viperatech • Accelerators
Computer Vision, LLM Inference, Generative AI
$2,556
MSI GeForce RTX 4090 Gaming X Slim 24G

Viperatech • Accelerators
Computer Vision, LLM Inference, Generative AI
$2,600

How this is calculated

Bytes per parameter

  • FP16 / BF16: 2 bytes — training + high-quality inference
  • INT8: 1 byte — ~2× memory savings, minor quality loss
  • INT4: 0.5 bytes — ~4× memory savings, used by GGUF / GPTQ / AWQ
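
As a rough illustration of the bytes-per-parameter figures above, weight memory is just parameter count times bytes per parameter. The sketch below is not the calculator's actual code; the function name `weights_gb`, the 8B-parameter example size, and the convention 1 GB = 1e9 bytes are assumptions made for illustration.

```python
# Bytes per parameter for each precision, taken from the list above.
BYTES_PER_PARAM = {"fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weights_gb(params_billions: float, precision: str) -> float:
    """Approximate weight memory in GB, assuming 1 GB = 1e9 bytes.

    (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes-per-GB
    simplifies to params_billions * bytes_per_param.
    """
    return params_billions * BYTES_PER_PARAM[precision.lower()]


if __name__ == "__main__":
    # Hypothetical 8B-parameter model at each precision.
    for precision in ("fp16", "int8", "int4"):
        print(f"8B @ {precision}: {weights_gb(8, precision):.1f} GB")
```

Under these assumptions an 8B-parameter model needs about 16 GB at FP16, 8 GB at INT8, and 4 GB at INT4, before any KV cache or runtime overhead.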

Workload overhead

  • Inference: weights + KV cache + ~10% runtime buffers
  • LoRA fine-tune: frozen weights + adapter states + activation memory (QLoRA if INT4)
  • Full training (Adam): weights + gradients + 2× optimizer states + activations, typically ~16–20 GB per 1B params with FP16/BF16 mixed precision
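
The inference and full-training rules above can be sketched as simple sums. This is a minimal sketch, not the calculator's implementation. Two things are assumptions: the "~10% runtime buffers" is read as 10% of the weight memory (which matches the 1.6 GB shown for 16 GB of weights in the estimate), and the training figure uses a common mixed-precision accounting of 2 bytes for weights, 2 for gradients, and 12 for FP32 master weights plus two FP32 Adam moments. LoRA is left out because adapter size depends on rank and target modules, which this page does not specify.

```python
def inference_gb(weights_gb: float, kv_cache_gb: float,
                 buffer_frac: float = 0.10) -> float:
    """Weights + KV cache + runtime buffers.

    Assumption: buffers are buffer_frac of the weight memory.
    """
    return weights_gb + kv_cache_gb + buffer_frac * weights_gb


def full_training_gb(params_billions: float,
                     activations_gb: float = 0.0) -> float:
    """Mixed-precision Adam training memory in GB (1 GB = 1e9 bytes).

    Assumed per-parameter accounting:
      2 bytes   FP16/BF16 weights
      2 bytes   FP16/BF16 gradients
      12 bytes  FP32 master weights + two FP32 Adam moments
    Activations depend on batch size and sequence length, so the caller
    supplies them separately.
    """
    weights = 2.0 * params_billions
    gradients = 2.0 * params_billions
    optimizer = 12.0 * params_billions
    return weights + gradients + optimizer + activations_gb


if __name__ == "__main__":
    print(f"Inference, 16 GB weights + 1 GB KV: {inference_gb(16, 1):.1f} GB")
    print(f"Full training, 1B params, no activations: {full_training_gb(1):.1f} GB")
```

With these assumptions, 1B parameters cost 16 GB before activations, which is the low end of the range quoted above; activation memory accounts for the rest.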

Usage pattern overhead

The KV cache grows linearly with context length and with the number of concurrent requests. Serving 16 concurrent requests at 8k tokens each can add tens of GB on top of the model weights. FlashAttention-2 and PagedAttention (vLLM) improve efficiency but don't eliminate the cost.
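
To make the linear growth concrete, here is a small sketch of the standard KV cache size formula: two tensors (K and V) per layer, each holding one vector per KV head per token. The model shape used in the example (32 layers, head dimension 128, 8 or 32 KV heads) is hypothetical and chosen only for illustration; the function name and parameters are likewise assumptions, not part of this calculator.

```python
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_tokens: int, concurrent_requests: int,
                bytes_per_elem: float = 2.0) -> float:
    """Approximate KV cache size in GB (1 GB = 1e9 bytes).

    Per token, each layer stores one K and one V vector of size
    n_kv_heads * head_dim; bytes_per_elem defaults to FP16/BF16.
    """
    bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return bytes_per_token * context_tokens * concurrent_requests / 1e9


if __name__ == "__main__":
    # Hypothetical model shape: 32 layers, head_dim 128.
    gqa = kv_cache_gb(32, 8, 128, context_tokens=8192, concurrent_requests=16)
    mha = kv_cache_gb(32, 32, 128, context_tokens=8192, concurrent_requests=16)
    print(f"GQA (8 KV heads):  {gqa:.1f} GB")
    print(f"MHA (32 KV heads): {mha:.1f} GB")
```

For this hypothetical shape, 16 requests at 8k tokens come to roughly 17 GB with grouped-query attention and roughly 69 GB with full multi-head attention, which is why MQA/GQA matters so much for serving.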

Caveats

Estimates are approximate. Real requirements depend on model architecture (attention heads, hidden dim, MQA/GQA), kernel choice, and framework overhead. Budget 10–20% headroom beyond the minimum.
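
The headroom rule can be sketched as follows, using the breakdown from the estimate (16 GB weights, 1 GB KV cache, 1.6 GB buffers). Rounding up to whole GB is an assumption that happens to reproduce the 19 GB and 23 GB figures shown; the function name `vram_budget` is hypothetical.

```python
import math


def vram_budget(weights_gb: float, kv_cache_gb: float, buffers_gb: float,
                headroom: float = 0.20) -> tuple[int, int]:
    """Return (minimum, recommended) VRAM in whole GB.

    Assumption: both figures are rounded up to the next whole GB, and
    headroom is applied to the unrounded total.
    """
    total = weights_gb + kv_cache_gb + buffers_gb
    minimum = math.ceil(total)
    recommended = math.ceil(total * (1 + headroom))
    return minimum, recommended


if __name__ == "__main__":
    minimum, recommended = vram_budget(16, 1, 1.6)
    print(f"Minimum: {minimum} GB, recommended: {recommended} GB")
```

Passing `headroom=0.10` gives the lower end of the 10–20% range.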