H100 vs H200 vs B200 TCO

5-year TCO for three NVIDIA data-center GPUs (Hopper H100 and H200, Blackwell B200) on a training workload of a given size.

The engineer question
Is upgrading from H100 to B200 worth the cost?

Inputs

Number of accelerators in the cluster (one GPU model at a time). Default: 1,024.

Amortization / service horizon before refresh. Default: 5 years.

All-in delivered electricity price (industrial / colo rate). Default: $0.08/kWh.

Result (default inputs)

5-yr TCO, H100: $32.83 M (acquisition $30.72 M + power $2.11 M)
5-yr TCO, H200: $37.95 M (acquisition $35.84 M + power $2.11 M)
5-yr TCO, B200: $43.97 M (acquisition $40.96 M + power $3.01 M)
$ / eff PFLOP-yr, H100: $9.3 k
$ / eff PFLOP-yr, H200: $10.7 k
$ / eff PFLOP-yr, B200: $5.5 k (41% better value/PFLOP than H100)
Best value (lowest $/eff PFLOP-yr): B200 SXM (Blackwell)

Recommendation

Upgrading to B200 is worth it: despite a ~33% higher street price and a 1000W TDP (vs 700W), its ~2.3× dense FP16 throughput makes it 41% cheaper per effective PFLOP-year than H100. The advantage holds because B200's throughput gain per dollar outweighs its higher power cost at 70% utilization. Caveat: this ignores networking (NVLink/NVSwitch/InfiniBand), which is often 15–30% of Blackwell rack cost.
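To make the size of that margin concrete, the sketch below recomputes the 41% figure per GPU (GPU count cancels out of the ratio) and then solves for the B200 street price at which its cost per effective PFLOP-year would equal H100's. It is a minimal illustration using only the prices, TDPs, throughput figures, utilization, PUE, and electricity price listed under Assumptions; the function names are hypothetical and are not part of the tool.

```python
HOURS_PER_YEAR = 8760

def power_cost_usd(tdp_watts, years=5, usd_per_kwh=0.08,
                   utilization=0.70, pue=1.2):
    """Lifetime electricity cost for one GPU, including facility overhead (PUE)."""
    facility_kw = tdp_watts / 1000 * utilization * pue
    return facility_kw * HOURS_PER_YEAR * years * usd_per_kwh

def eff_pflop_years(dense_fp16_tflops, years=5, utilization=0.70):
    """Idealized effective PFLOP-years delivered by one GPU (no MFU applied)."""
    return dense_fp16_tflops / 1000 * utilization * years

def cost_per_eff_pflop_year(price_usd, tdp_watts, dense_fp16_tflops):
    """(acquisition + power) divided by effective PFLOP-years, per GPU."""
    total = price_usd + power_cost_usd(tdp_watts)
    return total / eff_pflop_years(dense_fp16_tflops)

h100 = cost_per_eff_pflop_year(30_000, 700, 989)
b200 = cost_per_eff_pflop_year(40_000, 1000, 2250)
saving = 1 - b200 / h100  # fraction by which B200 is cheaper per eff PFLOP-yr

# B200 price at which its $/eff PFLOP-yr would match H100's:
# (price + power_b200) / pflop_years_b200 == h100
breakeven_price = h100 * eff_pflop_years(2250) - power_cost_usd(1000)

print(f"B200 is {saving:.0%} cheaper per effective PFLOP-year than H100")
print(f"Break-even B200 price: ${breakeven_price:,.0f}")
```

Under these inputs the break-even price comes out at roughly $70,000 per GPU, so the ranking would survive a B200 price well above the assumed $40,000. The excluded items (networking in particular) are not modeled here.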

Assumptions

  • Street prices are ROUGH public estimates only; NVIDIA publishes no list price. Mid-2026 typical figures: H100 ≈ $30,000, H200 ≈ $35,000, B200 ≈ $40,000 per GPU (±20–40% by volume, SKU, and quarter). Source: trade-press surveys, analyst notes, secondary-market listings. NOT an audited quote.
  • Throughput is public DENSE spec-sheet figures: H100/H200 989 TFLOPS FP16 (1979 FP8); B200 2250 TFLOPS FP16 (4500 FP8). Sparse numbers are ~2× higher but rarely realized, so we use dense. Real-workload MFU (model FLOPs utilization) is typically only 30–50% of peak; this tool does not apply MFU, so absolute $/PFLOP figures are optimistic. Use them for relative comparison.
  • Power: TDP H100 700W, H200 700W, B200 1000W (board-level, public spec). Energy cost = GPU count × TDP × utilization (70%) × PUE 1.2 × 8760 h/yr × 5 yr × $0.08/kWh. PUE 1.2 is a typical modern AI-DC figure; older facilities run 1.4–1.6.
  • "Effective PFLOP-years" = dense FP16 PFLOPS × utilization × GPU count × lifetime in years. It is an idealized sizing metric, not measured delivered work.
  • EXCLUDED: networking (NVLink/NVSwitch/InfiniBand, often 15–30% of Blackwell rack cost), host CPU + memory, storage, datacenter build/lease, cooling capex, staff/ops, software, depreciation/financing, resale value, and supply lead-time risk. Adding networking generally shifts the answer toward fewer, faster GPUs (favoring B200).
  • Single-model cluster assumed (no mixed fleets). All figures are deterministic from the inputs above; there is no live pricing feed.
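The formulas in the assumptions above can be sketched as a small model. This is a minimal reconstruction from the stated figures, not the tool's actual source: the `Gpu` class, the `tco` function, and the default of 1,024 GPUs (inferred from the $30.72 M H100 acquisition total at $30,000 per GPU) are assumptions. The optional `mfu` parameter is also hypothetical; the tool does not apply MFU, so it defaults to 1.0.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760

@dataclass(frozen=True)
class Gpu:
    name: str
    price_usd: float          # rough street price per GPU (assumption list)
    tdp_watts: float          # board-level TDP
    dense_fp16_tflops: float  # dense spec-sheet FP16 throughput

GPUS = [
    Gpu("H100", 30_000, 700, 989),
    Gpu("H200", 35_000, 700, 989),
    Gpu("B200", 40_000, 1000, 2250),
]

def tco(gpu, count=1024, years=5, usd_per_kwh=0.08,
        utilization=0.70, pue=1.2, mfu=1.0):
    """Acquisition + electricity only; every EXCLUDED item is ignored."""
    acquisition = gpu.price_usd * count
    # Average facility draw in kW: TDP x GPU count x utilization x PUE.
    facility_kw = gpu.tdp_watts / 1000 * count * utilization * pue
    power = facility_kw * HOURS_PER_YEAR * years * usd_per_kwh
    # Effective PFLOP-years; mfu=1.0 reproduces the tool's idealized metric.
    pflop_years = (gpu.dense_fp16_tflops / 1000 * utilization
                   * count * years * mfu)
    total = acquisition + power
    return {
        "acquisition": acquisition,
        "power": power,
        "total": total,
        "usd_per_eff_pflop_year": total / pflop_years,
    }

for gpu in GPUS:
    r = tco(gpu)
    print(f"{gpu.name}: acq ${r['acquisition'] / 1e6:.2f} M"
          f" + power ${r['power'] / 1e6:.2f} M"
          f" = ${r['total'] / 1e6:.2f} M,"
          f" ${r['usd_per_eff_pflop_year'] / 1e3:.1f} k per eff PFLOP-yr")
```

Passing, say, `mfu=0.4` scales every $/eff PFLOP-yr figure by 1/0.4 but leaves the ratio between GPUs unchanged, which is why the figures are better read as a relative comparison than as absolute costs.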
