Chips & Compute layer
H100 vs H200 vs B200 TCO
5-year TCO for three NVIDIA GPU generations (H100, H200, B200) on a training cluster of a given size, with the same GPU count for each generation.
The engineer question
Is upgrading from H100 to B200 worth the cost?
Result
- 5-yr TCO, H100: $32.83 M (acq $30.72 M + power $2.11 M)
- 5-yr TCO, H200: $37.95 M (acq $35.84 M + power $2.11 M)
- 5-yr TCO, B200: $43.97 M (acq $40.96 M + power $3.01 M)
- $ / eff PFLOP-yr, H100: $9.3 k
- $ / eff PFLOP-yr, H200: $10.7 k
- $ / eff PFLOP-yr, B200: $5.5 k (41% better value/PFLOP than H100)
- Best value (lowest $/eff PFLOP-yr): B200 SXM (Blackwell)
Recommendation
Upgrading to B200 is worth it on this metric: despite a ~33% higher street price and a 1000 W TDP (vs 700 W for H100), its ~2.3× dense FP16 throughput makes it 41% cheaper per effective PFLOP-year than H100. The result holds because, at 70% utilization, B200's throughput gain outweighs both its higher purchase price and its higher power bill. Caveat: this ignores networking (NVLink/NVSwitch/InfiniBand), which is often 15–30% of Blackwell rack cost.
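The 41% figure can be checked by hand from the 5-year TCO results and the dense FP16 spec figures. Because utilization, GPU count, and lifetime are identical for every generation in this model, they cancel, and the cost-per-PFLOP-year ratio reduces to the TCO ratio divided by the throughput ratio. This is a sanity check on the rounded figures shown here, not the tool's internal calculation:

```latex
\frac{(\$/\text{eff PFLOP-yr})_{\text{B200}}}{(\$/\text{eff PFLOP-yr})_{\text{H100}}}
  = \frac{\text{TCO}_{\text{B200}} / \text{TCO}_{\text{H100}}}
         {\text{TFLOPS}_{\text{B200}} / \text{TFLOPS}_{\text{H100}}}
  = \frac{43.97 / 32.83}{2250 / 989}
  \approx \frac{1.339}{2.275}
  \approx 0.59
```

A ratio of about 0.59 means B200 costs roughly 41% less per effective PFLOP-year. By the same arithmetic, B200 would keep its lead on this metric until its 5-year TCO reached about 2.275 times that of H100.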
Assumptions
- Street prices are ROUGH public estimates only; NVIDIA publishes no list price. Mid-2026 typical figures: H100 ≈ $30,000, H200 ≈ $35,000, B200 ≈ $40,000 per GPU (±20–40% by volume, SKU, and quarter). Source: trade-press surveys, analyst notes, secondary-market listings. NOT an audited quote.
- Throughput is public DENSE spec-sheet figures: H100/H200 989 TFLOPS FP16 (1979 FP8); B200 2250 TFLOPS FP16 (4500 FP8). Sparse numbers are ~2× higher but rarely realized, so we use dense. Real-workload MFU (model FLOPs utilization) is typically only 30–50% of peak; this tool does not apply MFU, so absolute $/PFLOP figures are optimistic. Use them for relative comparison only.
- Power: TDP H100 700 W, H200 700 W, B200 1000 W (board-level, public spec). Energy cost per GPU = TDP × utilization (70%) × PUE 1.2 × 8760 h/yr × 5 yr × $0.08/kWh. PUE 1.2 is a typical modern AI-DC figure; older facilities run 1.4–1.6.
- "Effective PFLOP-years" = dense FP16 PFLOPS per GPU × utilization × GPU count × lifetime in years. It is an idealized sizing metric, not measured delivered work.
- EXCLUDED: networking (NVLink/NVSwitch/InfiniBand, often 15–30% of Blackwell rack cost), host CPU + memory, storage, datacenter build/lease, cooling capex, staff/ops, software, depreciation/financing, resale value, and supply lead-time risk. Adding networking generally shifts the answer toward fewer, faster GPUs (favoring B200).
- Single-generation cluster assumed (no mixed fleets). The acquisition figures correspond to 1,024 GPUs per cluster (acquisition cost ÷ per-GPU price). All figures are deterministic from the inputs above; there is no live pricing feed.
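The formulas in the assumptions above can be sketched in a few lines of Python to reproduce the default results. This is an illustrative reconstruction, not the tool's actual code: the names (`Gpu`, `tco_usd`, `usd_per_eff_pflop_year`, and so on) are hypothetical, and `GPU_COUNT = 1024` is inferred from acquisition cost divided by per-GPU price rather than stated as an input.

```python
from dataclasses import dataclass

# Default inputs from the assumptions list.
# GPU_COUNT is inferred ($30.72 M / $30,000 = 1024), not a stated input.
GPU_COUNT = 1024
YEARS = 5
UTILIZATION = 0.70
PUE = 1.2
USD_PER_KWH = 0.08
HOURS_PER_YEAR = 8760


@dataclass(frozen=True)
class Gpu:
    name: str
    price_usd: float          # rough street price per GPU
    tdp_w: float              # board-level TDP in watts
    dense_fp16_pflops: float  # dense FP16 spec-sheet throughput per GPU


GPUS = {
    "H100": Gpu("H100", 30_000, 700, 0.989),
    "H200": Gpu("H200", 35_000, 700, 0.989),
    "B200": Gpu("B200", 40_000, 1000, 2.250),
}


def acquisition_usd(gpu: Gpu) -> float:
    """Purchase cost of the whole cluster."""
    return gpu.price_usd * GPU_COUNT


def power_usd(gpu: Gpu) -> float:
    """Energy cost over the lifetime: TDP x utilization x PUE x hours x price."""
    kwh_per_gpu = gpu.tdp_w / 1000 * UTILIZATION * PUE * HOURS_PER_YEAR * YEARS
    return kwh_per_gpu * USD_PER_KWH * GPU_COUNT


def tco_usd(gpu: Gpu) -> float:
    """Acquisition plus power; every excluded cost item is ignored here too."""
    return acquisition_usd(gpu) + power_usd(gpu)


def effective_pflop_years(gpu: Gpu) -> float:
    """Dense FP16 PFLOPS x utilization x GPU count x lifetime (no MFU applied)."""
    return gpu.dense_fp16_pflops * UTILIZATION * GPU_COUNT * YEARS


def usd_per_eff_pflop_year(gpu: Gpu) -> float:
    return tco_usd(gpu) / effective_pflop_years(gpu)


if __name__ == "__main__":
    for gpu in GPUS.values():
        print(
            f"{gpu.name}: TCO ${tco_usd(gpu) / 1e6:.2f} M "
            f"(acq ${acquisition_usd(gpu) / 1e6:.2f} M "
            f"+ power ${power_usd(gpu) / 1e6:.2f} M), "
            f"${usd_per_eff_pflop_year(gpu) / 1e3:.1f} k per eff PFLOP-yr"
        )
    saving = 1 - usd_per_eff_pflop_year(GPUS["B200"]) / usd_per_eff_pflop_year(GPUS["H100"])
    print(f"B200 vs H100: {saving:.0%} cheaper per eff PFLOP-yr")
```

Changing `PUE`, `USD_PER_KWH`, or `UTILIZATION` at the top shows how sensitive the ranking is to facility assumptions. Because power is a small share of TCO at these prices, raising PUE to the 1.4–1.6 range of older facilities moves the absolute figures only slightly.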