The AI Gross Margin Problem Nobody Is Pricing For
There's a quiet ticking sound coming from a lot of AI product roadmaps right now. It's the sound of pricing pages written in 2024 meeting the actual cost structure of 2026.
Here's the deal: traditional SaaS runs at 80–90% gross margins. You write the code once, host it cheaply, serve millions of customers. The economics are basically a money printer once you hit scale. That's why VCs loved SaaS. That's why the comps were absurd. That's why "software is eating the world" printed on a thousand pitch decks.
AI has a different vibe. Bessemer Venture Partners' 2026 AI monetization playbook puts AI company gross margins at 50–60% — a full 30 percentage points below what SaaS investors built their models on. Why? Because every single query costs money. Compute, inference, "humans in the loop" for quality — the COGS line that SaaS basically deleted has made a loud comeback. The more customers use your AI product, the more you spend on GPUs. There's no leverage until you figure out your pricing.
The Math You Skipped in Product Discovery
Most AI founders and PMs did approximately zero margin math before shipping a pricing page. They looked at what OpenAI charges, divided it vaguely by something, added a markup, and called it a day. Cost-plus pricing. The coward's way out.
BVP is blunt about it: "If the math doesn't work at 10 customers, it won't at 1,000." And yet here we are, with dozens of AI copilots and workflow tools priced like their COGS is zero, watching their unit economics quietly deteriorate with every new enterprise rollout.
The brutal irony: the customers who get the most value from your AI product are also the ones costing you the most to serve. High-volume power users who run 500 queries a day are your best testimonials and your biggest margin drag. If you haven't built pricing that captures value from those users, you're subsidizing their success with your balance sheet.
The Seat Pricing Escape Hatch Is Closing
For a while, the fix was easy: charge per seat and pretend usage doesn't matter. Let finance worry about it later. But as PYMNTS reported in February 2026, AI agents are fracturing the per-seat model at the root. Autonomous agents draft contracts, reconcile invoices, and triage tickets without being tied to a named employee. The link between headcount and software spend is weakening fast.
You can't charge per seat when the thing doing the work doesn't have a LinkedIn. So vendors are scrambling — credit pools, consumption tiers, per-workflow charges, per-resolved-ticket fees. Intercom charges $0.99 per resolved support ticket. That's not an accident. That's a very deliberate answer to the margin problem: price is tied to value delivered, which is tied to cost incurred.
The 2025 Pilot Reckoning Is Here
There's one more bomb dropping right now that almost nobody is talking about loudly enough. All those 2025 AI pilots — the ones sold on vibes, demos, and "we'll figure out ROI later" — are hitting renewal cycles this year.
BVP calls it the "soft ROI problem": copilots that offer advice without closing the loop, assistants that "help" without generating measurable outcomes, AI tools that make users feel more productive without reducing headcount or shipping faster. Customers are now asking the real question: "Are we actually getting value?" And a lot of AI vendors don't have a clean answer.
This is a pricing problem masquerading as a product problem. If your pricing isn't connected to an outcome — a ticket resolved, a contract generated, a fraud case caught — you have no quantifiable proof of value at renewal. You're asking the CFO to re-approve a budget line based on vibes. CFOs hate vibes.
So What Do You Actually Do?
Three things. First: do the margin math now, not later. Figure out your fully-loaded cost per unit of value — per query, per task, per workflow run — and make sure your pricing captures it. Don't launch a product where heavy usage is a margin nightmare.
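The back-of-envelope version of that math fits in a few lines. Every cost number below is made up for illustration; the point is the shape of the curve, not the values:

```python
# Illustrative per-query costs (hypothetical numbers; substitute your own).
INFERENCE_COST_PER_QUERY = 0.014     # model/API spend per query (USD)
HUMAN_REVIEW_COST_PER_QUERY = 0.006  # amortized human-in-the-loop QA
INFRA_OVERHEAD_PER_QUERY = 0.002     # hosting, retrieval, logging

PRICE_PER_SEAT_MONTH = 30.00  # flat seat price

def gross_margin(queries_per_month: int) -> float:
    """Gross margin (%) for one seat at a given monthly usage level."""
    cogs_per_query = (INFERENCE_COST_PER_QUERY
                      + HUMAN_REVIEW_COST_PER_QUERY
                      + INFRA_OVERHEAD_PER_QUERY)
    cogs = cogs_per_query * queries_per_month
    return (PRICE_PER_SEAT_MONTH - cogs) / PRICE_PER_SEAT_MONTH * 100

for q in (100, 500, 2000):
    print(f"{q:>5} queries/mo -> {gross_margin(q):6.1f}% gross margin")
```

At 100 queries a month this looks like classic SaaS; at 500 you're already down around 60%; the power user running thousands of queries puts the seat underwater entirely.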
Second: tie your charge metric to something the customer measures. Not tokens (meaningless to business buyers). Not "AI credits" (abstract). Something like: resolved tickets, documents processed, tasks completed, hours saved. If your customer tracks it in their own reporting, you can anchor renewal conversations on real numbers.
Third: build a hybrid floor. Pure consumption pricing gives CFOs anxiety because the bill is unpredictable. A base subscription plus usage overage gives you a floor for your P&L and gives them a ceiling to budget against. BVP's research consistently points to hybrid models — base plus usage tiers — as the model that wins when early-stage teams are still figuring out their cost structure. It's not elegant. It works.
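The hybrid floor is mechanically simple. A sketch, with hypothetical tier numbers:

```python
def monthly_bill(units_used: int,
                 base_fee: float = 500.0,       # the floor for your P&L
                 included_units: int = 10_000,  # usage covered by the base
                 overage_price: float = 0.06    # per-unit price past the cap
                 ) -> float:
    """Base subscription plus metered overage (illustrative numbers)."""
    overage = max(0, units_used - included_units)
    return base_fee + overage * overage_price

monthly_bill(8_000)   # light month: customer pays the predictable base
monthly_bill(25_000)  # heavy month: overage kicks in, margin is protected
```

The vendor never collects less than the base, and the customer can budget against a known floor while modeling the ceiling from their own usage history.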
The 80% gross margin SaaS era was fun while it lasted. AI is a different business with different economics. The companies that price accordingly in 2026 are the ones still standing in 2028.
Who's Actually Getting This Right
The vibe-coding generation of AI tools made this concrete fast. v0, Lovable, Bolt, and Replit all hit the same wall — inference costs money every time a user clicks "generate," and flat subscriptions become a slow-motion margin disaster. Each one found a slightly different answer.
v0 (Vercel)
v0 uses credits. Each component generation costs a set number of credits. The Pro plan ships with 1,000/month; beyond that, you buy more. The credit unit is abstracted enough that users don't think about tokens but specific enough that Vercel can tune the cost-per-credit dial as model costs shift. It also solves the "superuser problem" elegantly — power users who generate 10× more aren't subsidized by casual users. The model leaks a little on simplicity (what's a credit worth?), but it protects the P&L.
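That cost-per-credit dial is worth making concrete. Here's a sketch of the conversion layer such a model needs — all names and numbers are hypothetical, not Vercel's internals. The credit charge is derived from blended token cost and a target margin, so when model costs drop, only one constant moves and the user-facing unit stays stable:

```python
import math

# Hypothetical credit-conversion layer; not v0's actual internals.
TOKEN_COST_MICROUSD = 15        # blended provider cost per token ($0.000015)
CREDIT_PRICE_MICROUSD = 20_000  # effective user price per credit ($0.02)
TARGET_GROSS_MARGIN = 0.60      # margin the credit charge must preserve

def credits_for(tokens_used: int) -> int:
    """Charge enough credits that revenue = cost / (1 - target margin)."""
    cost = tokens_used * TOKEN_COST_MICROUSD
    required_revenue = cost / (1 - TARGET_GROSS_MARGIN)
    # Round up: never undercharge on a single generation.
    return max(1, math.ceil(required_revenue / CREDIT_PRICE_MICROUSD))
```

When the model provider cuts token prices, `TOKEN_COST_MICROUSD` is the only dial that moves; users keep reasoning in credits.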
Lovable
Lovable prices by messages. Each AI turn in a conversation costs a message from your monthly allotment. It's honestly one of the cleaner implementations because the unit is legible: you always know roughly how many messages it takes to build a feature. They've been transparent about why — inference isn't free, and pretending otherwise meant burning through runway. The message model forces users to be intentional, which actually improves product quality and reduces their compute costs simultaneously.
Bolt.new (StackBlitz)
Bolt sold token packages outright. You buy a block of tokens, you spend them, you buy more. The transparency was a deliberate product choice — showing a token counter gave users agency and made cost visible. The downside: some users blew through large packages in a single session and felt burned. Bolt has since layered in subscription tiers with included tokens. The lesson: transparent consumption pricing is great in theory, but you need a reset cadence (monthly, not per-purchase) or the sticker shock on individual sessions creates churn.
Replit
Replit's Agent checkpoint model is the most sophisticated answer here. Replit Agent runs in "cycles" — units of compute spent on your codebase. But critically, the agent pauses at checkpoints and asks: "I've used X cycles. Should I keep going?" This does three things at once: caps Replit's margin exposure on any single session, forces the user to validate the agent is on the right track (reducing wasted compute), and creates a natural upsell moment when users want to keep going. It's outcome-aware pricing — the meter stops when the user decides the result is good enough. Every AI dev tool should be studying this model.
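The checkpoint mechanic is simple enough to sketch. Names, numbers, and the `confirm` hook below are hypothetical, not Replit's actual API; the point is the bounded-spend loop:

```python
# Sketch of a checkpoint-gated agent loop (hypothetical API, not Replit's).
def run_agent(step_fn, confirm, cycles_per_checkpoint=25, max_checkpoints=10):
    """Spend compute in bounded bursts, pausing for user sign-off.

    step_fn() performs one cycle of work and returns True when the task
    is done. confirm(cycles_so_far) is asked at each checkpoint whether
    to continue. Returns total cycles spent.
    """
    cycles = 0
    for _ in range(max_checkpoints):
        for _ in range(cycles_per_checkpoint):
            cycles += 1
            if step_fn():
                return cycles  # outcome reached: the meter stops here
        # Checkpoint: margin exposure is capped until the user opts back in.
        if not confirm(cycles):
            break
    return cycles
```

A task that finishes mid-burst stops the meter immediately; a task going nowhere costs at most one checkpoint's worth of cycles before the user gets to pull the plug.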
The common thread: none of them left consumption fully unmetered. The companies that tried — unlimited generations for a flat $20/month — quietly either died or walked it back. The ones still growing are the ones that found a unit of value, made it visible, and priced it honestly.
Sources
- Bessemer Venture Partners — The AI Pricing and Monetization Playbook (Feb 2026) — AI company gross margins 50–60% vs. 80–90% for SaaS; Intercom $0.99/resolved ticket; soft ROI renewal risk; hybrid model guidance
- PYMNTS — AI Pushes SaaS Toward Usage-Based Pricing (Feb 17, 2026) — per-seat model fracturing as AI agents operate without named user accounts; credit-based and transaction-based pricing models emerging