Chips & Compute layer
Memory Bandwidth Bottleneck Detector
Given a model and an accelerator, decide whether the workload is bandwidth-bound or compute-bound.
The engineer question
Is inference on my 70B-parameter model bandwidth-bound on an H100?
Status · Coming soon
Inputs
- Model size (parameter count) and batch size
- Accelerator
- Quantization scheme
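The quantization scheme matters mainly because it sets how many bytes each weight occupies, and therefore how many bytes must stream from memory. A minimal sketch of that conversion follows; the `BYTES_PER_PARAM` table and the `weight_bytes` helper are hypothetical names for illustration, and real schemes add some overhead for scales and zero points that is ignored here.

```python
# Hypothetical lookup: nominal storage per weight for common schemes.
# Assumption: per-group scales and zero points are ignored.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "bf16": 2.0,
    "fp8": 1.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_bytes(num_params: float, scheme: str) -> float:
    """Return the approximate size of the model weights in bytes."""
    if scheme not in BYTES_PER_PARAM:
        raise ValueError(f"unknown quantization scheme: {scheme}")
    return num_params * BYTES_PER_PARAM[scheme]


if __name__ == "__main__":
    for scheme in ("fp16", "int8", "int4"):
        gb = weight_bytes(70e9, scheme) / 1e9
        print(f"70B weights in {scheme}: about {gb:.0f} GB")
```

Under these nominal sizes, halving the bytes per weight halves the memory traffic per decode step, which is why quantization is a common first lever for a bandwidth-bound workload.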
Outputs
- Bandwidth utilization
- Compute utilization
- Bottleneck verdict + fix list
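One way such a detector could produce these outputs is a simple roofline comparison. This is a sketch under stated assumptions, not the tool's actual method. It models one decode step as roughly 2 FLOPs per parameter per sequence in the batch, with the weights read from memory once per step, and it ignores KV-cache traffic, activations, memory capacity, and multi-GPU sharding. The `Accelerator`, `Verdict`, and `detect_bottleneck` names are hypothetical, and the H100 figures are approximate peak numbers that should be checked against the vendor datasheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str
    peak_flops: float     # peak compute throughput, FLOP/s
    mem_bandwidth: float  # peak memory bandwidth, bytes/s


# Assumption: approximate H100 SXM peaks (dense BF16 compute, HBM3 bandwidth).
H100_SXM = Accelerator("H100 SXM", peak_flops=989e12, mem_bandwidth=3.35e12)


@dataclass(frozen=True)
class Verdict:
    bandwidth_utilization: float  # share of the step time spent on memory
    compute_utilization: float    # share of the step time spent on compute
    bottleneck: str               # "bandwidth-bound" or "compute-bound"


def detect_bottleneck(num_params: float, batch_size: int,
                      bytes_per_param: float, acc: Accelerator) -> Verdict:
    """Roofline-style estimate for one decode step (weights-only traffic)."""
    flops = 2.0 * num_params * batch_size     # one multiply-add per weight per sequence
    bytes_moved = num_params * bytes_per_param  # weights read once per step
    t_compute = flops / acc.peak_flops
    t_memory = bytes_moved / acc.mem_bandwidth
    t_step = max(t_compute, t_memory)         # the slower resource sets the step time
    bottleneck = "bandwidth-bound" if t_memory >= t_compute else "compute-bound"
    return Verdict(t_memory / t_step, t_compute / t_step, bottleneck)


if __name__ == "__main__":
    for batch in (1, 512):
        v = detect_bottleneck(70e9, batch, 2.0, H100_SXM)
        print(batch, v.bottleneck,
              f"bw={v.bandwidth_utilization:.2f}",
              f"compute={v.compute_utilization:.2f}")
```

Under these assumptions, a 70B model in 16-bit weights at batch size 1 comes out bandwidth-bound, and the verdict flips to compute-bound only once the batch is large enough that arithmetic intensity exceeds the accelerator's FLOP-per-byte ratio. Note that 16-bit weights for 70B parameters (about 140 GB) exceed the memory of a single 80 GB H100, so a real answer would also need to account for sharding or quantization.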