Large Language Models
LLM Inference's Power Lie: 99.8% Wasted on Data Hauling, Not Crunching Numbers
We all figured bandwidth or VRAM would cap LLMs. Nope. Power's the brick wall, and most of it gets burned shuffling weights around, not doing math.
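To see where a number like 99.8% can come from, here's a back-of-envelope sketch for one decoded token. All the constants are illustrative assumptions in the spirit of Horowitz-style per-operation energy estimates (a DRAM access costs on the order of hundreds of picojoules per 32-bit word, while a low-precision FLOP costs a fraction of a picojoule), and the 7B-parameter fp16 model is hypothetical, not any specific chip or deployment:

```python
# Back-of-envelope energy split for one decoded token.
# All constants below are rough, assumed order-of-magnitude figures,
# not measurements of any real accelerator.

PJ = 1e-12  # picojoules -> joules

E_DRAM_PER_BYTE = 160 * PJ  # assume ~640 pJ per 32-bit DRAM access / 4 bytes
E_FLOP = 0.5 * PJ           # assume ~0.5 pJ per low-precision FLOP

# Hypothetical 7B-parameter model served in fp16, batch size 1:
params = 7e9
bytes_per_token = params * 2   # every fp16 weight streamed once per token
flops_per_token = params * 2   # roughly 2 FLOPs per parameter per token

e_move = bytes_per_token * E_DRAM_PER_BYTE  # energy hauling weights
e_math = flops_per_token * E_FLOP           # energy doing the math

frac_moving = e_move / (e_move + e_math)
print(f"data movement: {e_move:.3f} J, compute: {e_math:.3f} J")
print(f"fraction of energy spent moving weights: {frac_moving:.1%}")
```

Under these assumptions the split comes out around 99.7% data movement, which is why batching helps so much: the same streamed weights get reused across many tokens, amortizing the hauling cost.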