Chips & Compute layer
Inference Cost Calculator
Per-million-token cost for self-hosted inference across H100 / H200 / B200 / MI300.
The engineer question
What does it cost to self-host a 70B model at 100k QPS?
Status · Coming soon
Inputs
- Model size + variant
- Throughput target (QPS)
- Hardware mix
Outputs
- $ / 1M tokens
- GPU hours / day
- Recommended cluster shape
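The first two outputs reduce to simple arithmetic once throughput and hardware pricing are fixed. A minimal sketch of that math, using purely illustrative numbers (the $/GPU-hour rate and tokens/s/GPU below are assumptions, not benchmarks):

```python
# Back-of-envelope estimate of self-hosted inference cost.
# All rates and throughputs here are hypothetical placeholders.

def cost_per_million_tokens(gpu_hourly_usd: float,
                            gpus: int,
                            tokens_per_sec_per_gpu: float) -> float:
    """Dollar cost per 1M generated tokens at full utilization."""
    tokens_per_hour = tokens_per_sec_per_gpu * 3600 * gpus
    cluster_hourly_cost = gpu_hourly_usd * gpus
    return cluster_hourly_cost / tokens_per_hour * 1_000_000

def gpu_hours_per_day(gpus: int) -> float:
    """GPU-hours consumed per day by an always-on cluster."""
    return gpus * 24.0

# Example: hypothetical 8x H100 node at $2.50/GPU-hour,
# sustaining 1,500 tokens/s per GPU.
print(round(cost_per_million_tokens(2.50, 8, 1500.0), 3))  # $/1M tokens
print(gpu_hours_per_day(8))                                # GPU-hours/day
```

Note that at full utilization the GPU count cancels out of the per-token cost; it matters only for meeting the throughput target and for the daily GPU-hour bill.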