B300

Optimized for AI Reasoning with Breakthrough Attention Performance

2X attention performance over B200 GPUs
1.5X dense FP4 performance boost vs B200
192 petaFLOPS inference / 70 petaFLOPS training per system
Designed for AI reasoning models (such as o1 and o3), which require massive attention compute

Massive Memory Capacity for Trillion-Parameter Models

2.1TB total GPU memory (263GB per GPU across 8 GPUs)
Up from 1.4TB in B200, a 50% increase
14.4TB/s NVLink aggregate bandwidth
Holds the largest frontier models and multi-modal workloads in GPU memory, reducing the need to shard across systems or offload to host memory
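
As a rough illustration of what the memory figures above mean in practice, the sketch below checks whether a model's weights alone would fit in 2.1TB at different precisions. The bytes-per-parameter values are standard, but the 80% headroom factor (leaving room for KV cache and activations) and the function names are assumptions made for this example, not vendor guidance.

```python
# Weights-only memory estimate using the figures quoted above.
# Ignores KV cache, activations, and framework overhead.
GPUS = 8
B300_TOTAL_GB = 2100  # 2.1TB, decimal units assumed
B200_TOTAL_GB = 1400  # 1.4TB, decimal units assumed

BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}


def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Size of the model weights in GB at the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits(num_params: float, precision: str, headroom: float = 0.8) -> bool:
    """True if weights fit within `headroom` of total GPU memory.

    The 0.8 default is an assumed safety margin, not a measured value.
    """
    return weight_footprint_gb(num_params, precision) <= B300_TOTAL_GB * headroom


per_gpu_gb = B300_TOTAL_GB / GPUS
memory_increase = (B300_TOTAL_GB - B200_TOTAL_GB) / B200_TOTAL_GB

print(f"Per-GPU memory: {per_gpu_gb} GB")
print(f"Increase over B200: {memory_increase:.0%}")
for precision in BYTES_PER_PARAM:
    size = weight_footprint_gb(1e12, precision)
    print(f"1T params at {precision}: {size:.0f} GB, fits: {fits(1e12, precision)}")
```

Under these assumptions, a one-trillion-parameter model fits at FP8 or FP4 but not at FP16. Real deployments also need to budget for context length and batch size.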

Data Center-Optimized Form Factor with Flexible Power

First DGX system compatible with NVIDIA MGX standard racks
Available in both AC/PDU and DC/busbar configurations for deployment flexibility
10U form factor (vs 4U for B200) designed for modern hyperscale datacenter layouts
~14kW power consumption with industry-leading efficiency
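
For a sense of scale, the quoted peak compute and power figures can be combined into a simple performance-per-kilowatt ratio. This is a back-of-the-envelope sketch only: it divides peak numbers by an approximate power draw, so it does not represent measured efficiency under a real workload. The function name is hypothetical.

```python
# Ratio of the peak figures quoted above; not a measured efficiency.
INFERENCE_PFLOPS = 192
TRAINING_PFLOPS = 70
SYSTEM_POWER_KW = 14  # "~14kW" is approximate


def pflops_per_kw(pflops: float, power_kw: float) -> float:
    """Peak petaFLOPS delivered per kilowatt of system power."""
    return pflops / power_kw


inference_per_kw = pflops_per_kw(INFERENCE_PFLOPS, SYSTEM_POWER_KW)
training_per_kw = pflops_per_kw(TRAINING_PFLOPS, SYSTEM_POWER_KW)

print(f"Inference: {inference_per_kw:.1f} petaFLOPS per kW")  # 13.7
print(f"Training:  {training_per_kw:.1f} petaFLOPS per kW")   # 5.0
```

The same calculation can be repeated with figures for another system to compare peak efficiency on equal terms.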

Use Cases

Large Model Inference

Run massive models with predictable latency. Optimize for throughput, batch size, and performance per watt.

Generative AI applications for text, image, and audio.

Scaling ML infrastructure as your customer base grows.