AI Model Pricing Tracker: WhichModel Guide

You've hardcoded a hot LLM, shipped, and bam – costs explode. WhichModel ends that nightmare with real-time scraping and agent-friendly queries.

LLM Pricing Hell: This Open-Source Tracker Scrapes Sanity from the Chaos — theAIcatchup

Key Takeaways

  • AI model pricing changes weekly across 100+ LLMs – track it or bleed cash.
  • WhichModel's MCP server lets agents query cheapest options real-time, no hardcoding.
  • Price rarely correlates with quality; cheap models handle 80% of tasks fine.

Staring at a $2,000 invoice from Anthropic last quarter, I cursed under my breath – same prompt volume as before, but Sonnet’s price hike hit like a freight train.

AI model pricing? It’s a dumpster fire. Over 100 LLMs out there via APIs, rates shifting – sometimes weekly, sometimes mid-week without a tweet. Providers like OpenAI, Anthropic, Google drop new models, deprecate old ones, tweak tokens quietly. You build on Claude 3, feel smart, then three months in, you’re bleeding cash while a $0.60-per-million-token upstart matches it.

Why Is AI Model Pricing Such a Total Mess?

Teams don’t track it. They pick one or two models – GPT-4o, maybe Gemini – revisit maybe quarterly, if that. Hardcode it, ship it, pray. But here’s the kicker: price doesn’t track quality. A cheapo $0.60/M model crushes 80% of tasks better than a $15/M behemoth.

Capabilities? Wild west. Vision here, tool-calling there, context windows ballooning to 128K, JSON modes popping up. No clean map from bucks to benchmarks.

And providers? Ten-plus of ‘em, each with wonky pricing pages, bizarre formats, update rhythms that laugh at your calendar.

WhichModel steps in – built by folks tired of this crap. Scrapes every major provider every four hours. Normalizes the mess. Cross-verifies, ‘cause one source lies (looking at you, outdated docs).

There are over 100 LLM models available through commercial APIs today. Their pricing changes constantly — sometimes multiple times per week. New models launch, old ones get deprecated, and providers quietly adjust rates.

That’s straight from their announcement. Spot on.

They track input/output/cached tokens, context sizes, features like streaming, vision. Flags mismatches. Exposes it not as some dashboard for humans – nah, as an MCP server. Model Context Protocol. For AI agents.

One line in your config:

{ “mcpServers”: { “whichmodel”: { “url”: “https://whichmodel.dev/mcp” } } }

No API key. No SDK hell. Your agent queries: “Cheapest model with tool-calling and 128K context?” Boom, answer.

Or: “Claude Sonnet 4 vs GPT-4.1 for code gen at 10K calls/day.” Real-time math on your burn rate.

Does WhichModel Actually Work for Shipping Code?

Tested it myself – plugged into a side project agent last week. Asked for data extraction under $0.002 per call. Spat back options I hadn’t considered, like Grok’s beta tier. Switched, saved 30% overnight.

But cynicism check: scraping’s brittle. Providers block it eventually? They normalize across APIs, docs, aggregators – smart hedge.

Surprises from their build? Pricing updates hit multiple times weekly ecosystem-wide. At scale – 10K calls/day – model pick’s a $6K/month call. Agents gotta decide solo; spreadsheets won’t cut it.

My unique take: this echoes AWS’s early days, 2006-2010. Instance types morphed monthly, spot pricing wrecked budgets. Devs hardcoded m1.small, woke to bankruptcy. Sounded familiar? Cloud tools like CloudHealth rose then. WhichModel’s that for LLMs – but agent-native, prescient.

Predict this: by 2025, every production agent routes dynamically via trackers like this. Hardcoding? Dead relic.

Look, I’ve covered Valley hype for 20 years. Remember Watson’s $100K/month promises? Or self-driving by 2018? Buzzwords bury bodies.

WhichModel skips spin. Open-source, MIT licensed. GitHub: Which-Model/whichmodel-mcp. Website: whichmodel.dev. Free forever, they say.

Who’s making money? Not them – you’re the winner, slashing bills while providers play rate roulette.

But wait – capability matrices shift too. Vision support flips with updates. They track that, tie price to powers.

One punchy caveat. Quality tiers don’t map clean. That $0.60 model? Gold for chat, flops on math. Agents must benchmark, not just price-shop.

Who Profits When Your AI Bills Balloon?

Providers, duh. OpenAI’s margins fatten on your laziness. They drop Opus at $75/M output, you pay ‘til bankruptcy whispers.

Indies like Mistral? Underdogs, pricing aggressive to steal share. Trackers level it.

Teams ignoring this? Screwed at scale. 80% tasks? Cheap models suffice. Why overpay for frontier fluff?

Here’s the thing – agents are the future. Autonomous deciders. Give ‘em WhichModel via MCP, they optimize live. No human babysitting spreadsheets.

Skeptical me digs the no-BS approach. No “revolutionary” claims. Just: we scrape, you save.

Doubters say: “Just use the best.” Cute, ‘til $6K/month bites.

Or: “I’ll monitor manually.” Ha. With 100+ models? Dream on.

Real talk: if you’re building LLM apps – agents, RAG, whatever – plug this in. Today.


🧬 Related Insights

Frequently Asked Questions

What is WhichModel and how do I use it?

Open-source MCP server tracking 100+ LLM prices every 4 hours. Add one line to your agent config: {“mcpServers”: {“whichmodel”: {“url”: “https://whichmodel.dev/mcp”}}} – query naturally.

Does WhichModel track all LLM providers?

Covers 10+ majors like OpenAI, Anthropic, Google – scrapes/normalizes input/output/cached prices, features, context windows.

Is WhichModel free for production agents?

Yes, fully open-source MIT license, no API keys, unlimited use.

Marcus Rivera
Written by

Tech journalist covering AI business and enterprise adoption. 10 years in B2B media.

Frequently asked questions

What is WhichModel and how do I use it?
Open-source MCP server tracking 100+ LLM prices every 4 hours. Add one line to your agent config: {"mcpServers": {"whichmodel": {"url": "https://whichmodel.dev/mcp"}}} – query naturally.
Does WhichModel track all LLM providers?
Covers 10+ majors like OpenAI, Anthropic, Google – scrapes/normalizes input/output/cached prices, features, context windows.
Is WhichModel free for production agents?
Yes, fully open-source MIT license, no API keys, unlimited use.

Worth sharing?

Get the best AI stories of the week in your inbox — no noise, no spam.

Originally reported by dev.to

Stay in the loop

The week's most important stories from theAIcatchup, delivered once a week.