I know a lot of people at companies where the marching orders changed on a dime end of Q1/start of Q2. These are shops that were fully on the "use AI or die (because we will fire you)" train.
Now there's monitoring, reporting, alerting not just on overall cost but on "over-use" of best/priciest models based on total-or-percent tokens/dollars, etc. All of this comes with direct developer engagement & standardized management escalation for holding it wrong.
To me this customer behavior does not smell like a product you can 10x the pricing on to get profitable. We have exited the exploration phase and now ROI matters.
Anyone know what they are spending this on? Can't remember seeing one OpenAI ad. Is it just PR and influencers? Ads in the US?
Sure, you can use AI to potentially replace software engineers, but the F500 are also terrified of not having accountability or making mistakes. They won't be firing any engineers. In that scenario, there's just no room for AI usage. If you have to be responsible for all the code, then... AI has to either manage it completely autonomously (which even Fable can't) or... humans have to be in the loop which means they still have to understand the code. The best way to understand the code is to write the code yourself. So there's no productivity gain to be had.
I'm pro-AI, but I think we're due for a big crash next year.
There are ~1.6M software engineers in the US [0], earning a bit under 150k/year on average [1]. If AI companies captured all of that spend, that amounts to about 250B/year. The article assumed that they need around 300B/year to keep up with their debt.
At least based on Meta's recent behavior, forcing 30-50% of developers to switch to data labeling, it looks like that is actually their game plan.
[0] https://en.wikipedia.org/wiki/Software_engineering_demograph...
[1] https://www.indeed.com/career/software-engineer/salaries
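The back-of-the-envelope estimate in this comment can be checked with a short sketch. It uses only the figures quoted in the comment (1.6M engineers, 150k/year as an upper bound on average salary, 300B/year as the revenue the article is said to require); the function and variable names are illustrative. The product comes to 240B/year, close to the comment's rounded "about 250B", which still leaves a gap of roughly 60B/year.

```python
# Figures as quoted in the comment; the salary is "a bit under" 150k,
# so the result is an upper bound on total salary spend.
US_SOFTWARE_ENGINEERS = 1_600_000
AVG_SALARY_USD = 150_000
REQUIRED_REVENUE_USD = 300e9  # the ~300B/year figure the comment attributes to the article


def total_salary_spend(headcount: int, avg_salary: float) -> float:
    """Total yearly salary spend if every engineer earns the average."""
    return headcount * avg_salary


spend = total_salary_spend(US_SOFTWARE_ENGINEERS, AVG_SALARY_USD)
shortfall = REQUIRED_REVENUE_USD - spend

print(f"Total salary spend: ${spend / 1e9:.0f}B/year")
print(f"Gap to required revenue: ${shortfall / 1e9:.0f}B/year")
```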
That is worth a small multiple of the fully-loaded employee cost. So AI might be easily worth more than $200 per human-equivalent hour. With high utilization, that might be $8000-10000 a month.
With that kind of spend, AI provider financials look less frightening.
might as well be the other way around with non subscribed token being 50x overpriced, or any combination thereof
also uber was non profitable for the longest time, racking up 31b in losses, on the bet of capturing the market worldwide. scale here is different, but it's also 10 years later, with a lot more volatility and floating cash in the market (voo grew 327% over that period, not unreasonable that round size grew on the same trajectory)
The companies that did not yet jump on this bandwagon and are still evaluating will have a decision to make.
No matter what, the AI companies are going to change their pricing strategy and it’s going to become a lot more expensive to use. I am just hoping the price stays like this until I am done with my big chunk of work.
What makes AI so convenient is how good it is at doing red-team code reviews on my work. I used to need all this unnecessary communication just to get a review, but now I only have to reach out to the people I actually want to talk to.
Frontier models may eventually achieve super-intelligence (no opinion beyond mild skepticism) but super-intelligence isn't necessary for most practical day-to-day programming. The problems, as always, become communication, understanding what users really need, etc. that is, softer skills.
Chinese models and open model providers are, indeed, competing on price, and the difference shows.
The drug dealer analogy has a darker side to it, however.
Once you're dependent, they can drive up the price just because. It doesn't need to be for existential reasons.
Neither Anthropic nor OpenAI are subsidizing enterprise customers. Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan. Both organizations have moved to a "cheaper plan per user + API Pricing after that" (e.g. $20/mo + usage). The $100/$200/mo plans are for individuals only (of course, many individuals use these plans at work, but that's beside the point; they aren't selling this plan to enterprises).
> SemiAnalysis also analyzed the platform's gross margins, implausibly assuming that tokens were priced at 4 times the cost of generating them and: With the current subsidies, all it takes for a user to have a gross margin of at best negative 25% is for them to use as little as 25% of their rate limit.
The article's source for this claim is not SemiAnalysis; it's Zitron. But once you dig through his article, Zitron links to a SemiAnalysis tweet [1] where they, as the paragraph states, implausibly assume gross margins of 75% to come up with their weird analysis of the subscription plans. Citing this for anything is weird, because afaik that 75% number is a total shot in the dark. We have no clue what their margins are. My take is that the only reason that 75% number is implausible is because it may underestimate the inference margins of Ant/OAI's API pricing.
[1] https://x.com/SemiAnalysis_/status/2064815045767213400?ref=w...
Consider Google, Apple, Amazon, etc.
It's still early days...
If you think search ads are annoying, pre-roll YouTube ads are annoying, streaming ads are annoying, or basically ads-on-any-screen-anywhere-at-any-time are annoying, just wait until every stupid thing is powered by AI and is subtly trying to manipulate you to buy/watch/believe some crap all the time.
[1]: And this too is incorrect, should be "the number of jobs displaced would be around 32.5M" (the post says 32.5K)
This is going to be the new most misquoted/misunderstood data of the year, isn't it? The cost is mostly from a one-time accounting situation due to their pivot from a non-profit organization.[0] If we trust the leak [1] OpenAI is likely turning profitable this year.
[0]: $30Bn of it is the one-time cost. https://www.ft.com/content/e15b0d7e-ff6b-4f16-ba7a-4068feddb...
[1]: I suspect OpenAI itself leaked that financial report. It's almost unbelievably healthy.
You don't price based on cost, you price based on willingness-to-pay.
So maybe labs are "overcharging" enterprises on inference (because, up till now, enterprises have seemingly had unlimited budget for tokens) and "undercharging" individuals and SMBs (because they don't have an unlimited budget).
> [Ratio of per-token cost to subscription cost] means Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times
Actually, they could be subsidizing by more (if they are taking a loss on API), or not at all (if they are soaking API customers by a massive margin).
Separately, these subscriptions get sold to large groups with varying usage, so it's crazy to model assuming every subscription is maxed out. Banks, gyms, and many other businesses work this way, offering consumers flexible access to services that they will realistically use in bursts. It's not always worth the complexity to prevent overuse by a small minority. You can feel like this kind of business model isn't as transparent, but it's silly to pretend it can't work.
> OpenAI spent 44% of their revenue [$5.3B] on sales and marketing! The hype needed to keep the AI bubble inflated is incredibly expensive.
Over that same period (2025), OpenAI added $10B in realized revenue and $14B in run-rate. Sounds like they're getting >2X return within 12 months of those go-to-market dollars. Compare that to like, any other business.
> Thus in recent weeks the idea that Generative AI (LLMs for short) is too expensive has been all over mainstream business media.
Would it be smarter for these companies never to test customers' price tolerance? The quotes following this make it seem like the companies are getting important information about the nature of that price tolerance, and preparing to react. This is the work markets do on both sides to understand the value of a new product.
There are lots of good arguments about AI overinflation, but in order for them to be useful, they have to be rigorous and targeted.
Over the last month I have seen companies scrambling to measure deliverables against cost. Most of the back room talk is to the effect of giving devs a small allowance ($500 a month) and then making them prove their own productivity increases (again, based on deliverables, not LoC) before they either take it away or give them more.
Obviously this won’t be on an individual basis but some kind of unit.
Either way, with how much I see these companies cutting back I have no idea how the big AI companies are going to be profitable.
For a while it was every 2-3 years you'd start a hardware refresh. As companies moved into more and more training, this timeframe started to shrink. It went from 36 months to 24 months. From 24 months to around 16-18 months. When I last checked, last year, it was at 12 months. I think things may have slowed because of component availability, but otherwise whole data centers would be 6-12 months into full operations before they would start a refresh cycle.
Not to mention the massive increase in power density demand and cooling demand per rack that entails.
So no, "AI costs" have not gone down, in fact they are more expensive on training AND inference than ever.
This is why many are concerned about the heroin drip of api costs into orgs. For the companies that are public, look into their financials. It's gonna hit companies and high volume users like a ton of bricks.
And then remarks like this:
Anthropic, OpenAI and Microsoft have all now transitioned customers from subscriptions to token-based pricing.
Huh? I use OpenAI via a subscription, as is anyone else using GPT-5.5-Pro who isn't a multimillionaire.
The conversation in a lot of wealth management offices has shifted dramatically in the last few months from “how do I get in on this AI thing?” to “how do I protect my assets when this AI stuff blows up.”
There’s little question now if this will all implode, just when and who’s going to lose their shirt and be left without chairs when the music stops.
What’s playing out now is the scene from The Big Short where the banks wouldn’t mark down the value of bonds until they secured a short position. Once the big money has their helmets on it will stop providing fuel for the bubble and then look out below!
The only moat OpenAI and Anthropic have is regulation. If the Chinese really want to hammer us, they could release the full training data and pipeline.
Vendor lock-in is the current goal. Consumer prices are a drop in the bucket comparatively.
I work at a Fortune 200 company. At first, it was the Wild West. Need an LLM? You got it. Need to or want to build an army of agents? Done and done. We literally had everything at our fingertips for about 3 months. Teams were building their own internal tools; the team I work on canceled contracts with several software vendors because teams were building the same tools for what they thought was nothing.
Then they signed contracts with Anthropic and Google because I would assume they saw the token usage was through the roof. One month later? They completely cut off access to everybody for both Claude and Gemini. If you wanted access? Suddenly it was several forms, along with several approvals and a rock solid business case why you needed it. And before you got to the forms? You were added to a waiting list that was thousands of people long.
The entire company is now in damage control after trying to get the genie back in the bottle. I'm guessing someone saw how much we would be paying for the tokens we'd been using and decided to shut the party down so to speak.
The math doesn’t math.
I think you forgot what super-intelligence means…
Btw, some Chinese corporates have already seen this and increased their price. Zhipu AI & Tencent for example. Alibaba, Baidu, and Tencent also announced multiple price increases for their AI services.
- if AI costs go down you can ask how the companies will make profit and then suggest the bubble popping
- if AI costs go up you can ask how people will afford it and then suggest the bubble popping
- if companies actually do make profit then you can say the companies are getting too big and powerful so it’s a bad thing for consumers
Essentially you have left zero to a small narrow path where you are happy with the outcomes.
Likewise, the quality of what I can get from a local model like Qwen 3.6 on an RTX 5090 is light years ahead of what I could get a year ago on the same hardware.
Once a moat is achieved, you don't have to compete on price. Of course it'll be academic because the AI will probably destroy all of us.
And, even with the price increases, Z.ai and Tencent are still much cheaper than Anthropic or OpenAI models. I think there's an efficiency focus among the Chinese models that is absent at OpenAI and Anthropic, and in the end I suspect efficiency will be the winning feature. Google seems to understand that. Gemini 3.5 Flash is pretty competitive with the big guys, and it's small enough for Google to run it profitably (I assume) for a price that's much less than the frontier models. Gemma 4 models are showing off a bunch of efficiency techniques (MTP, QAT, the 12B encoder-less vision model that soundly outperforms much larger vision models, DiffusionGemma), and I assume they have several more techniques that aren't published.
If apparently the only way you can make money with your product this early is to dilute and adulterate it behind the scenes, it strongly suggests you want the customer to continue to believe they are getting value that you can't afford to supply.
More prosaically: if either of these firms could prove that they were even really close to profitable on inference, they would have bloomin' said so while they were trying to raise more money.
I would assume that when price hikes happen, either 1) fewer non-technical people will vibecode, as it doesn't impact their work that much, 2) people will use the cheaper Chinese models, or 3) we're jamming AI into everything because we're exploring, and we will just niche down into use cases that provide high ROI.
The only reason DeepSeek is so cheap is, well, I don't know, but actual pricing should be around their initial price, which was 4x. At that price you have a healthy 25-50% margin depending on occupancy, given that DeepSeek V4 is a very sparse MoE model.
GLM 5.2, for example, doesn't have more than 30-50% margins, and that's assuming old pricing for GPUs. With current inflated GPU pricing, I am certain the margins must be lower. Of course you can host for cheaper with quantization, and if you have very consistent capacity/utilization, which is not the norm with AI workloads.
Overall, for large models like GPT 5.5 or Opus there must be healthier margins of around 50-70%, assuming GPU pricing didn't increase for these companies. Even if it did, a 30-40% margin should be possible, even in the worst case where every GPU they have saw a jump in pricing.
For smaller models it's hard to say, I would guess 20% but these models might be much smaller than I suspect, then it might be double that.
Note the issue is less intelligent tokens don't linearly scale down in memory usage, which is the biggest pain point of serving models. Context sizes have fucked us all.
Also anyone claiming OAI makes less margins on APIs or stuff might be wrong given they are on much lower context size, 1M context definitely is a lot more expensive to serve especially with smaller models like sonnet.
> Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan.
they may not "allow" it, but i've seen first hand enterprises encourage employees to use these accounts personally and get reimbursed later to avoid pay-as-you-go w/limits pricing for users who do tokenmaxing as a cost control measure...
Please tell more :). Do you pay per token from bedrock / openrouter / somewhere else? How many tokens do you use over the month, and how many for each task? Which harnesses?
The big push for regulation and export controls is only going to ensure OpenAI & Anthropic are more like the automakers. Only in business because of protectionism, left to screw over US consumers meanwhile the rest of the world gets to enjoy cheap EVs
How do you know that the other models you are referring to aren't subsidized?
This is the crisis point for vibe-coders. A developer can go back to writing code by hand, as horrible as that might sound. Someone who hasn't learned to code but builds with AI can't go back. They either pay or they stop. That will be a painful choice whichever way you fall.
If true, then why is neither Anthropic nor OpenAI dropping their API pricing to gain market share when both are clearly doing all sorts of political and PR maneuvering to compete in a cutthroat market?
Since they aren't dropping the API usage prices (and are in fact raising them in a lot of subtle ways) then one of these options almost has to be true: they are still subsidizing inference, training costs are so ridiculously high that they need to make huge profits off inference or collapse in on themselves, or they are price fixing.
Here's a concrete example. Does some random AI company make operating profit on inference? I.e. if you only kept marginal costs, would you make a profit?
Well, depends what you account as your costs. If you're using hand-me-down hardware from previous generation's training, how much do you charge yourself internally for it? Maybe you show less, so investors take solace in profitable inference, even if you're losing money overall. How exactly are you accounting for electricity costs between training and inference? Is your army of SREs mostly servicing training new models (R&D expenditure) or inference (operating cost)?
This even has a name, and is called the "big bath" approach. If investors expect one part of your business to be a fiscal black hole, just shove all your costs there. They are accepting of it, and you make the rest of the business look better.
I'm not accusing AI companies of cooking the books, rather I'm trying to highlight you could see all the cash flows and still not know how much money is made or lost where.
Eventually the frontier labs will try to cut out the middle man once these models prove themselves and start doing partnerships with big firms in the domains, so they can take a % of the profits in perpetuity rather than just taking a one time payment. For example, after Anthropic Galen, they'll do a partnership with Pfizer to generate Ozempic-Superjacked and take 20% royalties on global sales.
Having grown up in the 90s, it is weird seeing companies share their technology secrets publicly.
If you zoom out to the year 2100, it becomes a little pimple on the economy that is ready to pop, but in the here and now it can cause a lot of damage to real people's wages and finances over the next 3 years.
But next year we could be in the middle of a massive $600B/yr capital-spending bubble deflating hard with unemployment accelerating towards 10% (or higher).
The internet never failed, but the telecom/dotcom collapse still happened in 2001.
I didn't get the sense this was LLM-written, but typo-signalling is... I donno a bit weird. Firefox is underlining some of the words as I write. I'm leaving "donno" unchanged even though it's flagging it as a misspelling but I suppose I'd still opt to fix something like "maiinstream" even at the risk of potentially seeming more LLM-ish!
Cheap, but gave them a massive user base they can claim is using AI
We have a pretty good idea of how much it costs to serve these models. You can pencil out the economics and guess at the model sizes and we know pretty decently how expensive the hardware is.
This is like claiming it's meaningless to guess the margins of a restaurant without going into their books and seeing the exact receipts and recipes.
They ain't doing dark arts in the back. You can guess at what goes into the food based on similar recipes and how much that costs based on what you pay at the grocery store.
The funniest comment here. Have you seen the prices of the technical shit for the past two years? Dang, GPUs are not getting any cheaper, but more expensive with each year.
As a localLLM evangelist, I am hopeful this will bring more attention to the joys of rolling your own sovereign AI.
Microsoft adding Deepseek support already as I recall?
That is - for any definition of "they are behind X months" then eventually they get to the point Claude was in January when the world freaked out, but at 1/10th the cost. A lot of firms are going to mandate that is good enough for their developers.
Do these knowledge jobs have a significant corpus of not only knowledge but discussion and problem solving, all conveniently labelled for the AI to train on? Probably not. Coding has stack overflow, what does, say, advertising use?
If you decided to boycott every company that replaced staff with automation, you would be forced to exit the economy. Every company does this to some degree and the customers who vote with their wallet do not seem to care about a reduction in force.
[1]: https://arstechnica.com/ai/2026/06/gm-installs-robots-at-fla...
Pay for OpenAI Pro directly, but I’m the only guy that uses Codex. $100 a month. My nontechnical partner likes to talk to ChatGPT 5.5 Pro for image related tasks (think generating interior decorating pics).
The nontechnical staff use a Gemini account on a Google family AI Pro sub. I use Antigravity when working on Android or Google Cloud API codebases.
Everyone gets OpenCode Go. The cost is trivial. $10 a month per person.
Pay for MiMo directly. We use it during Chinese off peak hours though. Total spend so far $25 in last month.
We run a few Qwen models locally and pretty much have them pegged all day. RTX 5090 on a PC and a Mac Studio.
There’s also Grok which is used for Imagine for artistic / graphic design related work. I also use the subscription for a vision model in my oh-my-pi harness.
We’re having discussions about how to pull in GLM-5.2 cost effectively. We compete with third world development shops so we can’t really pass on inference costs, but we can benefit from getting jobs done for customers faster. But ⅔ of our work is either internal or open source projects we can’t bill for.
The market for open weight model hosting gives you an idea of the profitable price floor, it's pretty clear there's markup baked into OAI/Anthropic's APIs.
They are? In the before times of 2025, Opus 4.1 was $75 per million tokens. Opus 4.8 is $25, and Fable is/was $50.
And it does, nowadays, give you a bit of a veneer of mere curiosity when you're being accused of massive theft.
This is a delusional take. Sorry, but anyone claiming this hasn't used Fable and compared it to the current best open source models. I see a lot of hype posting about GLM5.2. I see absolutely ZERO people using it in production compared to GPT 5.5 or Opus 4.8.
All depends on who is holding the bag, and how big the bag is.
I do hope that a day will come where you can buy the nvidia spark thingy for 5k that can run the equivalent of Opus 4.6 or 4.5 locally and that would be a massive thing.
There isn't one AI intelligence S curve, there are thousands of them, and they're mostly invisible in the major benchmarks, but for someone trying to do work in that specific area of capability, the progress is transformative.
Edit: to the commenter below. It was widely reported that these companies were unprofitable [1] as of last year. I am putting the question to this specific comment because they made a very specific claim about a part of the plan that's profitable, something only an insider would know.
[1] https://www.wsj.com/tech/ai/openai-anthropic-profitability-e...
Considering how much they spend on sales, marketing and R&D that doesn't sound that absurd
The people have a right to make and use whatever models they want, protected by the constitution. At a minimum, the models are described in research papers that are unquestionably protected speech. Skilled devs turn those into programs, also protected speech.
You are way too deep in the HN bubble.
It does remind me of the time a chef told me when he puts lemon juice over a dish, he would intentionally not remove any seeds that went on it because it was a signal of quality. I wonder if future slop chefs will intentionally place seeds on dishes that came from a box...
I'm actually curious if this works, haven't tried but I assume it would.
And of course the C-suite will have unlimited access to Mythos tier models, which they'll use to summarize reports, while passing down mandates to rank and file to increase usage of less expensive models.
Advertising has centuries of print ads, 100 years of radio advertising, 70 years of TV commercials, etc. And modern AI does not necessarily need labeling.
The same is not true for the software industry execs.
I worked at Verizon during their layoffs last year. Biggest layoffs in the USA.
As someone who’s been laid off before, I knew that it generally boosts the stock price.
I bought VZ because of that. It’s up 15% since the layoffs.
Microsoft, an AI stock, is down 30% in the same timeframe.
I believe this hasn't been confirmed yet but I think it speaks to a bigger problem for the AI companies which is, if you give capable developers a good reasoning LLM, they can make it work like it was a really expensive model.
I believe we are 100% at the stage of good enough for the vast majority of tech companies. Fable and others will be more valuable for non-traditional tech companies.
I read somewhere that the Chinese AI companies are sharing knowledge and it would not surprise me if the government is applying pressure by saying work together or else. If they work together, they can truly commoditize LLMs and with China ramping up hardware support for AI, I see the future being inference speed and hardware being the moat.
Maybe I should be aiming for something targeting 48gb of memory?
OpenRouter is the best guide to real costs.
That’s usually a sign that sales are not “just fine”.
Due to the fact that we’ve already done this before (Enron, Global Crossing) -
I’m willing to bet that there are contracts in place ALREADY, that define what happens in the event of a default.
In particular, I’ll bet that the buildings, the GPUs, the patents, etc…
All of these have probably been accounted for.
I worked at a data center that closed during the WorldCom era, and when they put the padlocks on the door, there were still websites “hosted” from the building.
I don’t know if they killed the power or what. I’d cleared out my desk long before they locked it all up. I wouldn’t be surprised to learn that these websites couldn’t get their own servers, since ownership was tied up in the courts.
In the Bay Area during that time, there were row upon row of empty office buildings.
A wealth transfer from the working class to a handful of billionaires bigger than any the world has ever seen (and the world has seen a lot of wealth transfer from the working class to billionaires).
If AI was around in the early 2000s Countrywide.ai would have been a thing.
I don't see how.
So depending on how literally we interpret Dario's comment, OpenAI & Anthropic need to get to Apple+Google+Meta revenue numbers in like single digit years?
The banks aren't as exposed this time as in 2008; most of it is tied up in private credit. It's more akin to the fiber buildout in the 90s.
I know because I see how people went over the 4o model. I can see opus behaving clearly differently enough that I pick it for certain tasks.
How?
* Moore's Law is almost over. The 5090 improves over the 4090 mostly because of quant improvements.
* even if the hardware improves, there’s a huge incentive to slow roll the next generation. Nobody wants to end up like Sun Microsystems. Sun’s used hardware was faster than its new hardware, once you considered price. Sun ended up competing with its own used equipment.
The most obvious place for improvement is RAM, network and storage.
If someone can bring more RAM onto the market, that will unstick things.
Maybe you're somehow legally allowed to distribute and download the weights, but most of us can't run GLM 5.2 at home.
And much more informative than the speculation and guessing in the article.
Myself and several other devs were laughing about the whole thing. The company was so amped about what AI could do they never even bothered collecting any analytics that would affirm or deny any of this had a positive impact. Even some of my team members were talking about the placebo effect AI has had on a lot of C-Suite folks.
https://carteakey.dev/blog/local-inference/local-llm-optimiz...
https://botmonster.com/ai/self-hosted-ai-agent-frameworks-20...
Personally I find myself swapping models depending if I am engaged in “trad-development” vs building agentic probes or apps involving imagery. Tailscale the LLM to your deployments and ta-da!
Which makes sense to me. Selling a chatbot interface/model access to the general public was never going to be a viable long term play. You still need developers to wrap the models into specialized tools. Cue the Jobs quote "It's a feature, not a product."
A year ago in The Back Of The AI Envelope I pointed out that the AI platforms were running the drug-dealer's algorithm, "the first one's free". By massively subsidizing the use of their products, they were generating overwhelming demand for them. They used this demand to justify massive investments, in the hope that, by the time they had to show a return on these investments, the users would be so addicted that they would pay the vastly higher prices needed to generate a return.
I have to confess that I was late to the party. The earliest skepticism I've been able to find was from Sequoia Capital's David Cahn in September 2023, entitled AI’s $200B Question. Only nine months later Cahn re-ran the same analysis in AI’s $600B Question. His estimate of the revenue gap had tripled. Cahn wasn't alone. Independent journalists such as Ed Zitron were flagging this problem long before I was.
I started to write this post a couple of months ago when the maiinstream business press began to notice companies complaining about the cost of the tokens their employees were burning. Since then the trickle has turned into a flood, which made finishing the post hard. Below the fold I throw up my hands and dump out a small sample from the flood.
One difficulty has been that estimates of the size of the subsidy have varied widely, typically in the range of costing the platforms $8 to $14 to generate $1 in revenue. Two recent posts from Ed Zitron have illuminated this issue.
First, in AI's Brokenomics Zitron reported that:
SemiAnalysis, an extremely pro-AI semiconductor analyst, ran a test made up of random long-horizon coding tasks until they maxed out the limit on OpenAI and Anthropic’s various subscription levels.
Their findings were shocking.
For $200 A Month, You Can Burn $8000 in Anthropic Tokens or $14,000 In OpenAI Tokens
That’s right. Anyone with a $200-a-month Anthropic subscription can burn $8000 in tokens, and with a $200-a-month ChatGPT subscription, you can burn $14,000 in tokens.
Zitron's numbers don't tell us the real cost of generating tokens but, subject to the assumption that the platforms are not subsidizing the token price, that means Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times. No wonder they are seeing massive demand! But, despite OpenAI's subsidy being 175% of Anthropic's, OpenAI's adoption by businesses has recently been flat while Anthropic's has soared.
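The multiples in this paragraph follow directly from the quoted figures. A minimal sketch, assuming (as the paragraph does) that the API token price is not itself subsidized, and using only the $200 subscription price and the $8000 and $14,000 token values reported above; the names are illustrative:

```python
SUBSCRIPTION_USD = 200
# Maximum token value burnable per month on the $200 plan, as reported by Zitron.
MAX_TOKEN_VALUE_USD = {"Anthropic": 8_000, "OpenAI": 14_000}


def subsidy_multiple(token_value: float, subscription: float) -> float:
    """Token value at API prices per dollar of subscription revenue."""
    return token_value / subscription


multiples = {
    name: subsidy_multiple(value, SUBSCRIPTION_USD)
    for name, value in MAX_TOKEN_VALUE_USD.items()
}
relative = multiples["OpenAI"] / multiples["Anthropic"]

for name, multiple in multiples.items():
    print(f"{name}: up to {multiple:.0f}x")
print(f"OpenAI's multiple is {relative:.0%} of Anthropic's")
```

These are upper bounds: they apply only to a subscriber who actually exhausts the rate limit.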
SemiAnalysis also analyzed the platform's gross margins, implausibly assuming that tokens were priced at 4 times the cost of generating them and:
With the current subsidies, all it takes for a user to have a gross margin of at best negative 25% is for them to use as little as 25% of their rate limit.
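The "4 times the cost" pricing assumption is the same thing as a 75% gross margin on tokens, which is how the figure is described elsewhere in the discussion. A small sketch of the conversion; the function names are illustrative, and nothing here attempts to reproduce SemiAnalysis's subscription calculation, whose inputs are not given:

```python
def gross_margin_from_markup(price_to_cost_ratio: float) -> float:
    """Gross margin as a fraction of price, given price / cost."""
    return 1 - 1 / price_to_cost_ratio


def markup_from_gross_margin(gross_margin: float) -> float:
    """Inverse conversion: price / cost, given gross margin as a fraction of price."""
    return 1 / (1 - gross_margin)


margin = gross_margin_from_markup(4)  # tokens priced at 4x their cost
print(f"Gross margin at 4x pricing: {margin:.0%}")
```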
Naturally, subsidizing your sales like this means you are feeding cash into the furnace. We have seen OpenAI and Anthropic raising vast sums in equity, but because they both have been private companies we haven't seen the details of their spending or revenue. On June 15th this changed when Zitron saw OpenAI's 2025 financials and posted OpenAI Losses Increased Nearly 8X in 2025, With Spending Hitting $34 Billion, revealing that:
OpenAI Had $13.07 Billion In Revenue, $34 Billion In Costs and Expenses, and $20.92 Billion In Losses, with a net loss attributable to the company of $38.53 Billion
The numbers are somewhat complicated because:
2025 was the year that OpenAI converted from a non-profit to a for-profit entity, leading to a $41.55 billion loss due to changes in fair value of convertible interests and warrant liability.
...
Ultimately, the net loss attributable to OpenAI in 2025 was $38.5 billion. At the end of the year, OpenAI had just over $50 billion in assets, with almost half of that in cash.
Perhaps the most striking of their truly awful numbers were:
- Revenue: $13.07 billion
- ...
- Sales and Marketing: $5.73 billion
That is, OpenAI spent 44% of their revenue on sales and marketing! The hype needed to keep the AI bubble inflated is incredibly expensive. Despite this lavish spending, business adoption has been flat.
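The ratios behind these figures can be checked in a few lines. This is a rough sketch using only the rounded numbers quoted above; because the $34 billion cost figure is rounded, the computed shortfall differs slightly from the reported $20.92 billion. The variable names are my own labels, not line items from the financials.

```python
# Rounded 2025 figures quoted above, in billions of US dollars.
REVENUE_BN = 13.07
COSTS_AND_EXPENSES_BN = 34.0  # rounded headline figure
SALES_AND_MARKETING_BN = 5.73

# Shortfall of revenue against costs and expenses.
shortfall_bn = COSTS_AND_EXPENSES_BN - REVENUE_BN

# Share of revenue spent on sales and marketing.
sales_marketing_share = SALES_AND_MARKETING_BN / REVENUE_BN

# Dollars spent for every dollar of revenue.
spend_per_revenue_dollar = COSTS_AND_EXPENSES_BN / REVENUE_BN

print(f"Shortfall: ${shortfall_bn:.2f}B")
print(f"Sales and marketing: {sales_marketing_share:.0%} of revenue")
print(f"Spend per $1 of revenue: ${spend_per_revenue_dollar:.2f}")
```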
US equity markets are facing three IPOs of AI companies, SpaceX, Anthropic and OpenAI, each led by a world-class bullshitter, each losing tens of billions of dollars a quarter, and all but SpaceX touting overwhelming demand for their products[1]. But, after they go public, they will need to charge enough to generate a return on their enormous capital investments. Ideally, they would have postponed the necessary swingeing price increases until the IPO money is in the bank.
Alas, their burn rate is so high that they have been forced to make some premature moves toward price sanity. Back in April Ed Zitron reported that Microsoft To Shift GitHub Copilot Users To Token-Based Billing, Tighten Rate Limits:
Leaked internal documents viewed by Where’s Your Ed At reveal that Microsoft intends to pause new signups for the student and paid individual tiers of AI coding product GitHub Copilot, tighter rate limits, and eventually move users to “token-based billing,” charging them based on what the actual cost of their token burn really is.
The document says that although token-based billing has been a top priority for Microsoft, it became more urgent in recent months, with the week-over-week cost of running GitHub Copilot nearly doubling since January.
The move to token-based billing will see GitHub users charged based on their usage of the platform, and how many tokens their prompts consume — and thus, how much compute they use.
Anthropic, OpenAI and Microsoft have all now transitioned customers from subscriptions to token-based pricing. For serious users, this is eye-wateringly expensive. Jamie John, Rafe Rosner-Uddin and Ryan McMorrow's ‘We created a monster’: companies rein in AI usage as costs strain budgets quotes a small company's CEO:
But the company got a shock when Anthropic switched it over to token-based pricing in May. “Our spend went up 7x the first day and I’m like, oh shit, we created a monster,” said Busse. “[Large language model] companies have been subsidising all of our usage and now no longer. User-based pricing shelters you.”
Thus in recent weeks the idea that Generative AI (LLMs for short) is too expensive has been all over mainstream business media. Examples include Bloomberg's video Major Companies Reconsider AI Costs, Scott Galloway's video AI May Not Be Worth The Cost — Here’s Why, Derek Thompson's The AI Boom Has Entered Its 'Wait, Is This Worth It?' Era, and Jowi Morales' AI cost crisis hits tech giants as employee 'tokenmaxxing' backfires, sparking corporate pullback at Microsoft, Meta, and Amazon — agentic AI eats up to 1000x more tokens than standard AI, who notes that:
it’s now apparent that using AI is more expensive than hiring people, especially since it offers only limited productivity gains at the moment.
Lest you think it is only the AI haters complaining about the cost, check out Bruno Ferreira's Nvidia exec says AI is more expensive than actual workers — yet some companies don't see the extra costs as a negative:
Bryan Catanzaro, Nvidia's VP of applied deep learning, recently told Axios that "For my team, the cost of compute is far beyond the costs of the employees", quite an interesting statement from the company selling the shovels for the gold rush.
That perspective is shared by Uber's CTO Praveen Naga, who "[went] back to the drawing board because the budget [he] thought [he] would need is blown away already" as of two weeks ago. Likewise, Swan AI's Amos Bar-Joseph posted a while back on LinkedIn about how proud he was about a $113k bill from Anthropic (makers of Claude) for a four-person team.
Oversimplified math pins that amount at $28k per person per month, which is likely more than each person's monthly wages. Jokes abound right now that "companies have discovered jobs again," and the humor is backed up by a 2024 MIT study stating that 77% of the time, it was preferable to have humans do the work.
The reason for the premature and impending price rises is that justifying the massive investment in building data centers, about 60% of which goes into rapidly depreciating hardware, requires implausibly astronomical revenues. Thierry Borgeat notes that:
even under "best case" assumptions — assuming zero costs, just revenue against capex — the Financial Times calculated the implied return on hyperscaler AI investment from 2025 to 2030.
Only one of them clears positive.
Implied return on AI investment (FT / Panmure Liberum)
– Microsoft: -9.2%
– Alphabet: -15.7%
– Amazon: +7.2%
– Meta: -28.8%
– Oracle: -35.6%
And remember: that's assuming zero costs. In reality, GPUs depreciate, power bills run, salaries get paid.
In The AI Industry Is Panicking, Will Lockett estimates that over the next few years the AI platforms will accumulate around $3T in debt. Assuming this is at 3% over 10 years, servicing the debt will take $309B/year:
This means that for the AI industry to service its debt, it needs to generate hundreds of billions of dollars in profit each year.
Even giant monopolies like Google don’t make enough profit to service that much debt. AI can’t just be a novelty industry; it needs to replace human labour on a colossal scale to service this debt. Let’s optimistically assume AI one day reaches a 10% profitability margin, a cost parity with human labour, and the ability to complete most jobs (none of which are currently the case). Well, the average US salary is roughly $66,000, so at a 10% profit, the AI company will make on average $6,600 per year per job it replaces. To generate the $309 billion needed to service their debt, the AI industry will need to replace 46.8 million jobs, equivalent to around 27% of the current number of jobs in the US.
While this is all very rough maths, it highlights the implicit bet created by the debt the AI industry has racked up. To simply not default on this debt, the AI industry has to rapidly displace human labour at a staggering scale, even if we are extremely optimistic about AI’s economics.
One caveat with Lockett's math is that the cost of employing a human is greater than just the salary. It includes the employer's Social Security tax, health insurance, office space and so on. Chatbots don't need any of these. According to the Bureau of Labor Statistics:
Wages and salaries averaged $32.60 per hour worked and accounted for 69.9 percent of employer costs, while benefit costs averaged $14.01 per hour worked and accounted for the remaining 30.1 percent.
So the average profit per job would be around $9.5K, and the number of jobs displaced would be around 32.5M.
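The two job counts can be reproduced with a short calculation. This is a sketch under the stated assumptions only: Lockett's $309B annual debt service, $66,000 average salary and 10% margin, plus the BLS wage share of 69.9% applied to that same salary to get a fully loaded employer cost. The names are illustrative.

```python
DEBT_SERVICE_USD = 309e9            # Lockett's annual debt-service estimate
AVG_SALARY_USD = 66_000             # Lockett's average US salary
PROFIT_MARGIN = 0.10                # Lockett's optimistic margin
WAGE_SHARE_OF_EMPLOYER_COST = 0.699 # BLS: wages as a share of employer cost


def jobs_needed(cost_per_job_usd):
    """Jobs that must be replaced so that profit covers the debt service."""
    profit_per_job = cost_per_job_usd * PROFIT_MARGIN
    return DEBT_SERVICE_USD / profit_per_job


# Lockett's version: revenue per job equals the salary alone.
salary_only_jobs = jobs_needed(AVG_SALARY_USD)

# Adjusted version: revenue per job equals salary plus benefits.
fully_loaded_cost = AVG_SALARY_USD / WAGE_SHARE_OF_EMPLOYER_COST
fully_loaded_jobs = jobs_needed(fully_loaded_cost)

print(f"Salary only:  {salary_only_jobs / 1e6:.1f}M jobs")
print(f"Fully loaded: {fully_loaded_jobs / 1e6:.1f}M jobs "
      f"(profit per job ${fully_loaded_cost * PROFIT_MARGIN:,.0f})")
```

Without intermediate rounding the adjusted figure comes out at about 32.7M; rounding the profit per job up to $9.5K first gives the 32.5M quoted above. Either way the conclusion is unchanged.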
How was the switch to token-based pricing received? We can guess from three pieces of recent news:
OpenAI's Sam Altman said that costs have become a "huge issue" for customers and the company is considering "drastic" price cuts to rein in rival Anthropic PBC’s lead in the corporate market.
Kyle Orland reported that Anthropic “pauses” token-based billing for its Claude Agent SDK:
Last month, Anthropic announced a billing change that would have substantially increased costs for heavy users of its automation-focused Claude Agent SDK, including many third-party apps. On Monday, though, Anthropic abruptly announced it had paused those pricing changes just as they were set to take effect, allowing Agent SDK users to continue drawing from the more generous usage limits in their existing Claude subscriptions.
Microsoft Plans June 30, 2026 Shift From Claude Code to Copilot CLI:
Microsoft is reportedly cancelling most Claude Code access for engineers in its Experiences and Devices division by June 30, 2026, shifting teams working on Windows, Microsoft 365, Outlook, Teams, and Surface toward GitHub Copilot CLI as the company tries to rein in internal AI coding costs. The decision is more than a procurement tweak. It is a rare glimpse into what happens when the world’s most aggressive AI software company runs into the same metered-billing problem now facing every large engineering organization.
Historically, companies wishing to IPO would be profitable. More recently they could have a successful IPO by showing a plausible path to profitability. Now, SpaceX has shown that even massive losses and a completely implausible claimed path to profitability are not a barrier to a successful IPO. But even with this example, one would think that the last thing two companies racing to IPO despite massive losses and implausible paths to profitability would want is to engage in a "drastic" price war.
There is significant room to make more specialized neural network accelerators with new compute-in-memory architectures.
If the brain can run 86 billion neurons on 30W it must be possible.
> but they do have the power to constrain commerce
It's an interesting idea; I'd like to see someone claim buying/selling as a form of speech... Otherwise I don't see the comparison.
If I'm intelligent enough to use a tool, but I don't have the tool, that doesn't mean anyone who does have the tool is automatically more intelligent than me.
Likewise, comparing my performance without the tool against someone's performance with the tool wouldn't be benchmarking their performance, only benchmarking them with the tool's performance. The fairer comparison would be against me also with the tool.
This is the video I watched that explained the shenanigans (from the guests' perspective, not illegal, obfuscated)
Certainly, the best models have gotten better since then, but I wouldn't consider DeepSeek V4 Pro or GLM 5.2 to be a big enough downgrade to be worse than coding by hand. I'm willing to spend a premium for the best model for coding because it wastes less of my time with dumb stuff, so I've got a Claude subscription. But, there is a limit to how much of a premium I'll pay. 10x over Chinese models? OK, fine. Opus saves me enough time to make it worth a couple hundred bucks a month. But, 100x, or more? Nah. I'll go a little slower, review the PRs a little more carefully.
And, open weights models do keep improving. DeepSeek V4 Pro is a notable improvement over earlier DeepSeek models, and the first DeepSeek model to cross the "better to work with it than without it" threshold into Opus 4.5 (or better) territory. GLM 5.2 is somewhere in the ballpark of Opus 4.6 (though without vision, a notable limitation for anything that requires a UI).
I can manage this budget with the Chinese models in AWS Bedrock. However, in my experience, they aren't as good as Claude today.
And.. now I feel the need to look again. Darn, there goes my afternoon
I built my career on Solaris and it got rugpulled by Linux.
That wasn’t because of software, it was because of hardware. Linux’s cost advantage existed because Sun hardware had huge margins, because their software was basically free.
AI will probably be a repeat of this. Whoever can come up with the hardware solution that minimizes the cost per token will win.
I believe the 5090 still holds this crown, but someone certainly knows better than I do.