What does the HBM3 vs HBM3e vs HBM4 tool output?

Recommended HBM gen; Availability quarter; Per-GB bandwidth

What inputs does HBM3 vs HBM3e vs HBM4 need?

Workload (training / inference); Model size; Cost ceiling per GB

HBM Memory layer

HBM3 vs HBM3e vs HBM4

Bandwidth, capacity, and availability cross-cut for HBM generations across SK hynix / Micron / Samsung.

The engineer question
Which HBM generation matches my training workload economics?

Result

Recommended generationshipping today: HBM3e
HBM3 — bandwidth / stack6.4 Gt/s/pin: 819 GB/s
HBM3 — max capacity / stack8-Hi / 12-Hi: 24 GB
HBM3 — availabilityGA since 2022 (H100, MI300): Shipping
HBM3 — stacks for footprint~1,260 GB working set: 53 stacks
HBM3e — bandwidth / stack9.6 Gt/s/pin: 1.20 TB/s
HBM3e — max capacity / stack8-Hi / 12-Hi: 36 GB
HBM3e — availabilityGA since 2024 (H200, B200, MI325X): Shipping
HBM3e — stacks for footprint~1,260 GB working set: 35 stacks
HBM4 (target) — bandwidth / stack8 Gt/s/pin: 1.80 TB/s
HBM4 (target) — max capacity / stack12-Hi / 16-Hi: 64 GB
HBM4 (target) — availabilitySampling 2025–26, volume 2026+ (forward-looking): Target / forward-looking
HBM4 (target) — stacks for footprint~1,260 GB working set: 20 stacks

Recommendation

Working set ≈ 1,260 GB before sharding across GPUs. HBM3e is the mid-2026 sweet spot: ~1.2 TB/s/stack and up to 36 GB/stack are shipping in volume (H200 / B200 / MI325X), so you avoid the supply + maturity risk of HBM4. Pay the HBM4 premium only if you specifically need >1.6 TB/s or 48–64 GB/stack and can wait for volume.

Assumptions

· Source: JEDEC standards (HBM3 JESD238, HBM4 JESD270-4) + public SK hynix / Samsung / Micron datasheets & roadmap slides, mid-2026. All figures are approximate / typical — not audited.
· HBM3 ≈ 6.4 Gt/s/pin × 1024-bit = ~819 GB/s; HBM3e ≈ 9.2–9.6 Gt/s/pin → ~1.18–1.23 TB/s. Vendor speed grades vary ±5–10%.
· HBM4 is forward-looking: JESD270-4 doubles the interface to 2048-bit; per-stack bandwidth target ~1.6–2.0 TB/s and capacity ~48–64 GB (12-Hi/16-Hi) are vendor ROADMAP TARGETS, not shipping silicon. Treated with an 0.8 risk discount in the recommendation.
· Footprint heuristic: ~18 GB/B-param (training, mixed precision incl. optimizer+grads+activations) and ~2.5 GB/B-param (inference, fp8 weights + KV cache). Real usage swings ±50% with parallelism strategy, precision, batch and context length.
· Cost index is RELATIVE (HBM3e = 1.0), reflecting mid-2026 contract/street $/GB direction, not a quoted price. HBM4 carries a launch premium that erodes over time.
· Excluded: GPU/accelerator package limits (how many stacks a part actually carries, e.g. 6–8 on current top parts), interposer/CoWoS capacity, NVLink/Infinity-Fabric topology, power & cooling, total $/token, and supply allocation. This sizes a per-stack spec fit only, not a full BOM.

Worked example (default inputs)

Result

Recommended generationshipping today: HBM3e
HBM3 — bandwidth / stack6.4 Gt/s/pin: 819 GB/s
HBM3 — max capacity / stack8-Hi / 12-Hi: 24 GB
HBM3 — availabilityGA since 2022 (H100, MI300): Shipping
HBM3 — stacks for footprint~1,260 GB working set: 53 stacks
HBM3e — bandwidth / stack9.6 Gt/s/pin: 1.20 TB/s
HBM3e — max capacity / stack8-Hi / 12-Hi: 36 GB
HBM3e — availabilityGA since 2024 (H200, B200, MI325X): Shipping
HBM3e — stacks for footprint~1,260 GB working set: 35 stacks
HBM4 (target) — bandwidth / stack8 Gt/s/pin: 1.80 TB/s
HBM4 (target) — max capacity / stack12-Hi / 16-Hi: 64 GB
HBM4 (target) — availabilitySampling 2025–26, volume 2026+ (forward-looking): Target / forward-looking
HBM4 (target) — stacks for footprint~1,260 GB working set: 20 stacks

Recommendation

Assumptions

· Source: JEDEC standards (HBM3 JESD238, HBM4 JESD270-4) + public SK hynix / Samsung / Micron datasheets & roadmap slides, mid-2026. All figures are approximate / typical — not audited.
· HBM3 ≈ 6.4 Gt/s/pin × 1024-bit = ~819 GB/s; HBM3e ≈ 9.2–9.6 Gt/s/pin → ~1.18–1.23 TB/s. Vendor speed grades vary ±5–10%.
· HBM4 is forward-looking: JESD270-4 doubles the interface to 2048-bit; per-stack bandwidth target ~1.6–2.0 TB/s and capacity ~48–64 GB (12-Hi/16-Hi) are vendor ROADMAP TARGETS, not shipping silicon. Treated with an 0.8 risk discount in the recommendation.
· Footprint heuristic: ~18 GB/B-param (training, mixed precision incl. optimizer+grads+activations) and ~2.5 GB/B-param (inference, fp8 weights + KV cache). Real usage swings ±50% with parallelism strategy, precision, batch and context length.
· Cost index is RELATIVE (HBM3e = 1.0), reflecting mid-2026 contract/street $/GB direction, not a quoted price. HBM4 carries a launch premium that erodes over time.
· Excluded: GPU/accelerator package limits (how many stacks a part actually carries, e.g. 6–8 on current top parts), interposer/CoWoS capacity, NVLink/Infinity-Fabric topology, power & cooling, total $/token, and supply allocation. This sizes a per-stack spec fit only, not a full BOM.

Related tools in the HBM Memory layer

→ Get this data as JSON

Inputs

Result

Result

Related tools in the HBM Memory layer