The 62x Problem: AI Model Costs Broke Your Pricing Page
Here's a fun number: GPT-5 Nano costs $0.40 per million output tokens. Claude Opus 4.6 costs $25. That's a 62x price difference between two models that give the exact same answer to "What's the capital of France?"
Now go look at your pricing page. You have one number on it. Maybe $49/month. Maybe $0.01 per API call. Whatever it is, it's pretending that your cost structure is stable. It is not. Your cost structure has a 62x variance depending on which model handles which request, and you're smoothing that into a single price like it's fine. It is not fine.
Your Margin Is a Roulette Wheel
The Afternoon team broke this down recently: per-seat pricing works when usage per user is consistent. Per-token works when your buyers are technical. Per-output works when your cost per output is stable. But here's the thing — none of those conditions hold when you're routing across models with a 62x cost spread.
A customer support bot handling 10K conversations a month (call it 10M output tokens)? That could cost you $4 on Nano or $250 on Opus, depending on complexity routing. You charged the customer the same $199/month either way. One of those months you made 98% gross margin. The other month you lost money. Same customer. Same product. Same pricing page.
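The arithmetic behind those numbers is worth seeing on one screen. A minimal sketch, assuming roughly 1,000 output tokens per conversation (an assumption implied by the figures above, not stated anywhere):

```python
# Gross margin swing for the same customer under two model choices.
# Assumes ~1,000 output tokens per conversation (hypothetical figure).
conversations = 10_000
tokens_per_conversation = 1_000
total_tokens = conversations * tokens_per_conversation  # 10M tokens

price_per_m = {"nano": 0.40, "opus": 25.00}  # $ per million output tokens
revenue = 199.00  # flat monthly subscription price

for model, rate in price_per_m.items():
    cost = total_tokens / 1_000_000 * rate
    margin = (revenue - cost) / revenue
    print(f"{model}: cost=${cost:.2f}, gross margin={margin:.0%}")
```

Same $199 of revenue, and the margin swings from roughly +98% to deeply negative purely on model choice.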
This is the part where someone in the meeting says "we'll just use the cheap model for everything." Great plan. Go explain to your enterprise customer why their complex financial analysis is getting GPT-5 Nano quality. I'll wait.
70% of Your Traffic Doesn't Need the Expensive Model
Here's the stat that should be on every AI product manager's wall: roughly 70% of typical AI API traffic is simple tasks — classification, extraction, translation, basic Q&A. Tasks where the $0.40/M model and the $25/M model produce functionally identical results.
Which means 70% of your AI spend is a pricing decision, not an engineering decision. You're not choosing a model for quality. You're choosing how much margin to throw in the trash. And if you're passing that cost through to customers via usage-based pricing, most of their requests are costing 62x more than they need to.
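One way to put a number on that 70%: compare the blended per-million-token cost of routing against sending everything to the expensive model. The 70/30 split and the two rates are from the article; the rest is illustrative:

```python
# Blended cost per million output tokens under simple/complex routing,
# using the article's 70% simple-traffic figure.
cheap_rate, expensive_rate = 0.40, 25.00  # $/M output tokens
simple_share = 0.70

all_expensive = expensive_rate
blended = simple_share * cheap_rate + (1 - simple_share) * expensive_rate
savings = 1 - blended / all_expensive

print(f"blended: ${blended:.2f}/M vs ${all_expensive:.2f}/M "
      f"({savings:.0%} cheaper)")
```

Even with 30% of traffic still going to the expensive model, the blended rate lands at a fraction of the all-expensive baseline.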
The companies figuring this out are building model routers: classifiers that send simple tasks to cheap models and complex tasks to expensive ones. But that creates a new problem: your unit economics now depend on your router's accuracy. Misclassify 10% of simple tasks as complex and you just blew your margin forecast. Your CFO will love that conversation.
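The margin sensitivity to router accuracy is easy to model. A sketch, using the same rates and 70/30 split as above and assuming a misrouted simple task simply gets the expensive model's rate (all parameters illustrative):

```python
# How router error rate moves the blended cost per million output tokens.
cheap, expensive = 0.40, 25.00   # $/M output tokens
simple_share = 0.70              # fraction of traffic that is simple

def blended_cost(error_rate: float) -> float:
    """Blended $/M when `error_rate` of simple tasks go to the expensive model."""
    simple_cost = simple_share * ((1 - error_rate) * cheap + error_rate * expensive)
    complex_cost = (1 - simple_share) * expensive
    return simple_cost + complex_cost

for err in (0.0, 0.10, 0.25):
    print(f"router error={err:.0%}: ${blended_cost(err):.2f}/M")
```

A 10% error rate on simple tasks pushes the blended cost up by over 20%, which is exactly the forecast-breaking behavior the paragraph above describes.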
The Pricing Page Has to Get Honest
The old SaaS playbook was simple: pick a price, maintain 80%+ gross margins, done. AI broke that. Per-seat pricing doesn't scale with AI workload — one power user can consume 100x the inference of a casual user. Per-token pricing terrifies non-technical buyers. Per-output pricing works until your cost-per-output distribution has a fat tail that eats your margin on the hard cases.
So what actually works? Hybrid. Base platform fee for predictability. Usage component tied to a value metric the customer understands (not tokens — nobody outside a model lab thinks in tokens). And critically: tiered model access. Let customers pick their quality level. "Standard" tier runs on efficient models. "Premium" tier gets the heavy hitters. Price accordingly.
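In config terms, the hybrid structure described above might look something like this. Every tier name, fee, and rate here is invented for illustration; the value metric is conversations, not tokens, per the point above:

```python
# Hypothetical hybrid pricing: base platform fee + usage priced on a
# customer-facing value metric (conversations) + tiered model access.
TIERS = {
    # tier: (monthly base fee, $ per conversation, model class)
    "standard": (49.00, 0.005, "efficient"),   # runs on cheap models
    "premium": (199.00, 0.03, "frontier"),     # gets the heavy hitters
}

def monthly_bill(tier: str, conversations: int) -> float:
    base, per_conversation, _model_class = TIERS[tier]
    return base + per_conversation * conversations

print(monthly_bill("standard", 10_000))
print(monthly_bill("premium", 10_000))
```

The base fee gives you predictability, the usage component scales with the customer's own notion of value, and the tier determines which end of the 62x spread their requests land on.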
Anthropic figured this out with their own pricing — they cut Opus 4.6 prices 67% from Opus 4.1 ($75 → $25 per million output tokens). Even the model providers know the spread was unsustainable. If they're admitting the price was wrong, maybe your pricing page should too.
The 62x spread isn't a bug. It's the market telling you that "AI" isn't one product with one cost — it's a spectrum. Price it like one and you'll either leave money on the table or lose it. Probably both, on alternate Tuesdays.
Sources
- Medium — Every AI Model's Real Cost in 2026: The Complete Developer Pricing Guide — 62x price spread data, model pricing table, 70% simple traffic stat, Opus 4.6 price cut
- Afternoon — Per-Seat vs Per-Token vs Per-Output: Financial Tradeoffs of AI Pricing Models — per-seat/token/output tradeoff analysis, margin inconsistency under seat pricing