AI Infrastructure Map

Chips & Compute layer

Inference Cost Calculator

Per-million-tokens cost for self-hosted inference across H100 / H200 / B200 / MI300.

The engineer question
What does it cost to self-host a 70B model at 100k QPS?

Status · Coming soon

Inputs

  • Model size + variant
  • Throughput target (QPS)
  • Hardware mix

Outputs

  • $ / 1M tokens
  • GPU hours / day
  • Recommended cluster shape
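A minimal sketch of the arithmetic behind inputs and outputs like these. Every figure below (GPU hourly rate, per-GPU throughput, tokens per request) is a hypothetical placeholder, not a benchmark or a value from the calculator itself:

```python
import math

# Hypothetical inputs -- placeholders only, not measured figures.
GPU_HOURLY_USD = 2.50          # assumed cloud rate for one GPU
TOKENS_PER_SEC_PER_GPU = 1000  # assumed sustained decode throughput per GPU
TOKENS_PER_REQUEST = 500       # assumed average output length
TARGET_QPS = 100_000           # the throughput target from the question above

def cost_per_million_tokens(hourly_usd: float, tokens_per_sec: float) -> float:
    """Dollar cost per 1M generated tokens for one GPU at a sustained throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_usd / tokens_per_hour * 1_000_000

def gpus_for_qps(qps: float, tokens_per_request: float, tps_per_gpu: float) -> int:
    """Minimum GPU count to sustain a target request rate (ceiling division)."""
    return math.ceil(qps * tokens_per_request / tps_per_gpu)

gpus = gpus_for_qps(TARGET_QPS, TOKENS_PER_REQUEST, TOKENS_PER_SEC_PER_GPU)
print(f"$ / 1M tokens : {cost_per_million_tokens(GPU_HOURLY_USD, TOKENS_PER_SEC_PER_GPU):.2f}")
print(f"GPUs needed   : {gpus}")
print(f"GPU hours/day : {gpus * 24}")
```

A real calculator would additionally weight a mixed hardware fleet and account for utilization below 100%, but the core output, dollars per million tokens and GPU hours per day, reduces to this ratio of hourly cost to hourly token throughput.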

Related tools in the Chips & Compute layer