It's the sort of messy job that agents excel at. Decisions need to be made on free text data, translations done into multiple languages, ambiguity handled.
I now need to recheck that it still works with another model, which involves a lot of manual verification, and potentially move to Claude Code and pay more money I can ill afford right now.
I'm not even clear from the post when this comes in; I'm guessing it's effective immediately.
This really hammers home for me the point that we should not be renting our tools.
My own dumb fault for trusting them, I will make sure to learn from this.
There are no good solutions for them.
If OpenAI is indeed overbuilt they will completely eliminate Claude.
The joke is on them, though (maybe) because this also means that there's literally no reason to keep that account active.
The per-request model was pretty insane.
Given that they've already had silent session + weekly rate limits for at least the past couple of weeks (I've hit them), I wonder if this change just makes them visible to the user, or if it actually tightens them too.
If it's the former, then I can say they're still significantly more generous than Claude Pro (on the Pro+ plan), so this might be okay. If it's the latter, and the new limits are similar to Claude Pro, then Copilot is going to be significantly less useful to me.
I subscribed two months ago, frustrated with Claude Code and their tight session limits.
The Copilot offer was unbeatable: 100 dollars for a 12-month plan, if I remember correctly.
It was pretty clear they were losing money, but hey, it's Microsoft and they need customers, so a competitive push on pricing is expected.
Let's see what these limits look like and I'll decide whether to cancel my subscription or not.
Still a terrible move from them.
I'm a paying customer and I did not receive ANY communication about this. Was using Opus this afternoon and then it disappeared.
Microsoft really can't stop being Microsoft. I don't dispute the need to charge more for those models, but there is a basic decency to do things and as usual the Big Tech fuckery and complete lack of morals makes them do this in a way that generates total mistrust where it could be just annoyance.
I'll see how Sonnet handles the most difficult problems, but I foresee a subscription cancellation soon.
Opus 4.6 had a 3x multiplier in Pro. Now the new Opus 4.7 model has 7.5x in Pro+, which offers 5x more requests, but costs 4x more than Pro. So now Opus is essentially 2x the price it used to be.
It’s likely that Sonnet 4.7 will be the new 3x model in Pro — https://github.blog/news-insights/company-news/changes-to-gi...
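To sanity-check that claim, here's the arithmetic spelled out. I'm assuming Pro at $10/month with 300 premium requests, and Pro+ at $39/month with the 1500 requests mentioned elsewhere in this thread (the post itself doesn't restate these figures):

```python
# Back-of-the-envelope cost per Opus prompt. Plan figures are assumptions
# (Pro: $10/mo, 300 premium requests; Pro+: $39/mo, 1500 premium requests).
pro_price, pro_requests = 10, 300
proplus_price, proplus_requests = 39, 1500

old = pro_price / (pro_requests / 3.0)          # Opus 4.6 at 3x on Pro
new = proplus_price / (proplus_requests / 7.5)  # Opus 4.7 at 7.5x on Pro+

print(old)        # ~0.10  -> about $0.10 per Opus prompt before
print(new)        # 0.195  -> about $0.20 per Opus prompt now
print(new / old)  # ~1.95  -> essentially 2x
```

So even after accounting for Pro+'s larger request allocation, the per-prompt price of Opus roughly doubles.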
This whole thing is a massive asshole move, and probably illegal in all countries with a minimum set of consumer protections.
I think this is really telling. The cost of AI has really been masked HUGELY to drive adoption. The true cost is likely to be unsustainable for the big complex tasks (agents running for hours+) that companies have been pushing.
I was skeptical, then quietly bullish on AI, but I'm now seeing signs that the market is cracking: availability is going to recede and costs are going to balloon.
I get the impression that the intersection of HN posters and Copilot users is quite small in practice; that Claude Code and Codex suck up all the oxygen in this room. But it seems plausible we’ll see similar “true costs greatly exceed our current subscription pricing” from Anthropic and OpenAI someday soon…
Opus 4.7 is available today for 7.5 credits per prompt.
They have also suspended new signups.
After testing all of the major IDEs/tools that integrate with LLMs over the last four weeks, I was happy to settle on Copilot. I, and others, now seem a lot less confident in that decision. Especially since there seems to be no refund path for people who prepaid for a year.
In my 30+ years online, I've never seen an industry change so much in terms of pricing, service levels, etc, as I have the last two months.
I'm really curious where all of this lands, and if AI coding tools will be something that only a small percentage can genuinely afford at a competitive level.
It's a shame because the VS Code Copilot experience is quite good out of the box compared to all of the other harnesses I've used. But with the typical lack of transparency and sudden, harsh changes... what are they thinking?
After the restrictive rate limiting they've already instituted, I'm simply cancelling and continuing by using providers directly.
I've been using the Pro+ with Opus 4.6 very successfully and being charged 3x rate was mostly acceptable.
But removing Opus 4.6 and replacing it with Opus 4.7 at a 7.5x rate is just insane!
It was clear (see the linked post from 70 days ago) that the current offering was unsustainable, but I'm a bit taken aback at how sharp the clawback is.
Looks like I'm ending my subscription. Good (likely too good; there's no way my account was even remotely within profitable range) access to Opus 4.6 was the only reason I used this at all.
Now it's going to cost me an upgrade to $39 Github Pro+ to keep using Opus, and even then it's with much higher multipliers. I don't fully understand the extent to which this reflects actual costs for Opus versus Microsoft leveraging network effects to discourage the usage of a competitor.
I didn't really want to wander outside of VSCode just yet because I was happy with VSCode/Copilot/Opus-4.5 and I don't want to spend all my time experimenting when stuff is changing so fast. But I guess my hand has been forced.
That's not how my creative energy works. When I have time to solve problems, I want to solve them. I don't want a cooldown timer applied to solving a problem. Not to mention the anxiety of realizing that while I sleep I could have been burning tokens.
I was incredibly disappointed when I sat down for my hobbyist programming time and realized Copilot had suddenly and dramatically changed in a deeply disheartening way.
Meter my token usage, DON'T tell me when I can use them! ARGH.
I guess overall it probably was a good decision.
But 7.5x as well as quota limits is pretty hard to swallow.
The annoying thing about the quota limits is they make it really awkward to actually fully utilize the 1500 premium requests you are paying for.
Like if you don’t plan your work around the daily and weekly quotas, you may not actually be able to use your full request allocation.
Claude has the same issue. Single session blows through the quota.
And you can then cancel it. I have no idea what a premium request is and it's all just too complicated to use.
Pricing per turn/request was/is an idiotic model and I'm glad they are paying for it. It just forces you into a workflow designed to work around the business model. Heck, the best laugh would be to create a plan outside VS Code with interactive CC/Codex, then copy-paste it into GH Copilot to do a single-session burn of a few M tokens.
Again ridiculous model.
They're all operating at a loss, enshittification is coming for us all.
> The value-add that Microsoft brings to Github Copilot is near zero
You are not their target audience. The value add is the GitHub integration. By far the best.
GH has cloud agents that can be kicked off from VS Code. You can apply enterprise policies on model access, MCP white lists, model behavior, etc. from GitHub enterprise and layered down to org and repo (multiple layers of controls for enterprises). It aggregates and collects metrics across the org.
It also has tight integration with Codespaces, which is pretty damn amazing. `gh codespace code` and it's an entire standalone full stack that runs our entire app on a unique URL, and GH credentials flow through into the Codespace so everything "just works". Basically full preview environments for the full application at a unique URL conveniently integrated into GH. But also a better alternative to git worktrees.
If you are a solo engineer, none of this is relevant and probably doesn't make sense (except Codespaces, which is pretty sweet in any case), but for orgs using the GH stack is a huge, huge value add because Microsoft is going to have a better understanding of enterprise controls.
Over here in the EU, we need to store sensitive data on EU servers. Anthropic only offers US-hosted versions of their models, while G-cloud and Azure have EU-based servers.
Now, it may be the right call to immediately give up and shut down after Opus 4.5, but models and subscriptions are in flux right now, so the right call is not at all obvious to me.
Agentic AI models could become commoditized: some models may excel in one area of SWE while others are good for another, local models may be at least good enough for 80% of work, cloud usage could fall to 20%, etc.
Staying in the market and providing multi-model and harness options (Claude and Codex usable in Copilot) is good for the market, even if you don't use it.
I was actually hoping they would change it to something that more closely tracks their actual costs, so that they wouldn't have to rug-pull this badly. In particular, what was really bad about it was that sending prompts to agents while they were working (to give them corrections) cost extra, so I stopped doing that (initially OpenCode didn't bill for that, until they became official).
I had the exact same issues with the latter - randomly stops working, wipes chat history, just generally seems to be totally broken. But the former works totally fine and still lets you select sonnet/opus. My experience was before this recent 4.6 -> 4.7 change though.
For what it's worth, I have been paying for Pro+ and I still got locked out of Opus. I only have access to Opus 4.7 at 7.5x.
This was my first thought too but apparently you can just use Claude Code within VSC: https://code.claude.com/docs/en/vs-code
Think GitHub will do that eventually, just like everyone else is. TFA ends with:
> The actions we are taking today enable us to provide the best possible experience for existing users while we develop a more sustainable solution.

Guess it’s time to rediscover the lost art of programming without an LLM.
From what I've been gathering, this split in success seems to depend a lot on the types of tasks, the domains / programming languages / frameworks used, and style of prompting.
I couldn't get 5.2 to follow instructions for the life of me, even when repeating multiple times to do / not do something. 5.3-codex was an improvement and 5.4 while _usually_ decent still regularly forgets, goes on unnecessary tangents, or otherwise repeatedly stops just to ask for continuation.
Sure, I'm paying 3x more per request, but I'm also doing 5x fewer requests.
Or well, used to. Still bummed about them dropping 4.6.
It felt like I constantly had to go back and either fix things or I just didn't like the results. The forward momentum/progress on my projects overall wasn't there over time. Even though it's cheaper, it just doesn't feel worth it, to the point that I start to feel negative emotions.
I'm actually a bit worried that I've somehow come to feel more negative emotions with agentic coding. I'm quicker to feel frustrated somehow when things aren't working.
But seeing that they are stopping new subscriptions, and the rumours/evidence that they plan to increase the coefficients of the remaining models, it seems they want us to see "the writing on the wall".
Copilot (before today) had one of the simplest & cheapest pricing models on the market.
It can bill to our Azure sub and I don't have to go through the internal bureaucracy of purchasing a new product/service from a new vendor.
It doesn't matter how competent the actual model is, or how long it's able to operate independently, if the harness can't handle it and drops responses. Made me wonder: are they even using their own harness?
At least Anthropic is obviously dogfooding on Claude Code which keeps it mostly functional.
I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
Folks are complaining because they lost unlimited access to a Ferrari, when a bicycle is fine for 95% of trips.
I have always just used the API, but I decided to give Copilot a go on the weekend because of the cheap price. And I am seeing weird behavior like I have never seen before... It will somehow fail to use the file editing tool and then spend an absolutely huge amount of time/tokens building a python script to apply the edit in a subprocess... And it will spin its wheels on stuff the API routinely gets right in one shot.
It was great while it lasted.
While I agree with the sentiment, I think that might have been initially driven by older models being nerfed and/or newer ones being better at tokens/$. And there is this notion that those labs don't constrain the model in the first days after its release.
Why would it be illegal in any country? Did you pay for a year upfront? Even if so, they're offering a pro-rated refund according to the linked blog post:
> If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and receive a refund for the time remaining on your current subscription by visiting your Billing settings before May 20
Not sure where the expectation that a business should continue serving you at a given price till the end of time no matter what came from.
I never understood the low visibility.
Expensive RAM is annoying. I don't look forward to expensive AI.
I just found out via other news sources, and was surprised I hadn't seen it on HN already.
Enterprise might stick around, but individually, I reckon developers will flock to OpenCode + open weights (Qwen/GLM/Codestral). The problem then is, if the open-weight models impress these new adopters, they will shout about it from rooftops (conferences, social media, blogs) in unison, which might result in an exodus. Especially troublesome considering developers are a major market for both frontier labs (Anthropic & OpenAI) and their IPO ambitions.
Speaking as someone whose only 'real' option at work is the Copilot plugin, though I also use the Copilot plugin at home...
This is a shitty shitty shitty move.
As a personal user, I can now only use Opus 4.7 at a 7.5x 'Introductory' multiplier if I upgrade to pro+, but at work I can still apparently do Opus 4.6 at a 3x Multiplier on my work 'enterprise' account.
Honestly, it strikes me as though someone at GitHub Copilot took Palantir's manifesto to heart: screw the individual, consolidate power to companies at every level.
From a business perspective, why would I start thinking about which model to use, when I could cheaply always use the best model?
Example zed issue https://github.com/zed-industries/zed/issues/54219?issue=zed...
As far as I can tell, the distinctive feature of my workflow is that I'm giving it small, contained single-commit-sized tasks and limited context. For instance: "For all controller `output()` functions under `Controller/Edit/` and `Controller/Report/`, ensure that they check `Auth::userCanManage`." Others seem to be taking bigger swings.
Oh wait, never mind.
Haiku is most definitely not fine for the code bases that I work on. Sonnet is probably fine for most daily tasks, but Opus is still needed to find that pesky bug you've been chasing, or to thoroughly review your PR.
> I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
You and I couldn't have more different experiences. Opus 4.7 on the max setting still gets lost and chokes on a lot of my tasks.
I switch to Sonnet for simpler tasks like refactoring where I can lay out all of the expectations in detail, but even with Opus 4.7 I can often go through my entire 5-hour credit limit just trying to get it to converge on a reasonable plan. This is in a medium size codebase.
For the people putting together simple web apps using Sonnet with a mix of Haiku might be fine, but we have a long way to go with LLMs before even the SOTA models are trustworthy for complex tasks.
- If you pay for unlimited trips, will you choose the Ferrari or the old VW? Both are waiting outside your door, ready to go.
- Providers that let you choose models don't really price much of a difference between lower-class models. On my grandfathered Cursor plan I pay 1x request to use Composer 2 or 2x request to use Opus 4.6. Until the pricing is differentiated enough that people can say "OK, yes, Opus is smarter, but paying 10x more when Haiku would do the same isn't worth it," it won't happen.
The change applies to existing subscriptions, some paid a year in advance.
Warning: baseless speculation/theorizing ahead.
This is the consequence of LLM inference being really expensive to run, and LLM inference companies being really attractive to VCs. The VC silly money means their costs are totally decoupled from revenue for a while, but I guess eventually people look at incomings vs outgoings and start asking questions.
Previous big trends like SaaS apps, NFTs, blockchain etc were similarly attractive to VCs (for a period of time at least for the last two, the first one is still pretty attractive to VCs), but nowhere near as expensive to run so the behaviour of the companies running them wasn't quite the same.
You give it 3 examples of the change you want, then ask it to do the other 87. You'll end up saving time and “money”.
I basically never just yolo large code changes, and use my taste and experience to guide the tools along. For this, Haiku is perfectly fine in nearly all circumstances.
I have never had the situation you describe, where Opus won’t come up with “a reasonable plan”, but your definition of “reasonable” might be very different than mine, and of course, running through your credit limit is an entirely tangential problem.
Obviously we’re a long way away from being able to rationally evaluate whether the value of X tokens in model Y is better than model Z, let alone better in terms of developer cost, but that’s kind of where we need to get to, otherwise the model providers are selling magic beans rated in ineffable units of magicalness. The only rational behavior in such a world is to gorge yourself.
From my simple checks - and from Microsoft's own blog - per token pricing isn't going to be realistic for agentic coding either.
> If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and you will not be charged for April usage. Please reach out to GitHub support between April 20 and May 20 for a refund.
So:
- DO use AIs to build tools for yourself faster. If the AI goes away, the dashboard and scripts you made will still work.
- DO NOT build your business on top of 3rd-party AI services with no way of swapping the backend easily (see the sketch below). The question isn't whether there's going to be a "rug-pull" but when it happens. It might be sudden like this one, or gradual, where they just pump up the price like boiling a frog.
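A minimal sketch of what a swappable backend can look like. The class and method names here are made up for illustration of the pattern, not a real client library:

```python
from abc import ABC, abstractmethod

class LLMBackend(ABC):
    """The only LLM interface the rest of the business logic sees."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class HostedBackend(LLMBackend):
    """Hypothetical wrapper around a hosted provider's API."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call the hosted provider's API here")

class LocalModelBackend(LLMBackend):
    """Hypothetical wrapper around a local open-weights model."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call a local open-weights model here")

def build_report(backend: LLMBackend, data: str) -> str:
    # Business logic depends only on the interface, so a rug-pull means
    # swapping one constructor at startup, not rewriting the app.
    return backend.complete(f"Summarize: {data}")
```

Then a provider change is a one-line swap at startup instead of a rewrite.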
The number of intermediaries that some customers, especially governmental agencies, go through to get just an Azure bill can be wild...
The Qwen models are cool, but if you're coming from Opus you will be somewhere between mildly to very disappointed depending on the complexity of your work.
Usage limits are/were higher in Copilot. They also charge per prompt, not per token.
Yeah, I hear that a lot, but it never comes with proof. Everyone is special.
I’m sure you’d find that Haiku is pretty functional if there were a constraint on your use.
Remember that it's not only the cost per token, but also speed. Some tasks are done faster with simpler/less-thinking models, so it might actually make sense to micromanage the model when you have deadlines.
It is basically token-based pricing, but you also get a limit on prompts (you can't just randomly ask the models questions; you have to optimize to make them do the most work for, e.g., an hour or more without you replying, or ask them to use the question tool).
You were the one who made the claim that Haiku is fine most of the time. To any reasonable person, the burden of proof is on you. Maybe you should share some high level details about your codebase, like its stack, size, problem domain, and so on? Maybe they are so generic that Haiku indeed does fine for you.
They removed this now without notice but Wayback Machine still has it: https://web.archive.org/web/20260420190656/https://github.bl...
Maybe, just maybe, the tool isn't suitable for all problem spaces.
I don't know how anyone could believe that Haiku is useful for most engineering tasks. I often try to have it take on small tasks in the codebase with well defined boundaries to try to conserve my plan limits, but half the time I end up disappointed and feeling like I wasted more time than I should have.
The differences between the models are vast. I'm not even sure how you could conclude that Haiku is usable for most work, unless you have a very different type of workload than what I work on.
But I'm not vibecoding, I don't let models do large work or refactorings, this is just for some small boring tasks I don't want to do.
Most importantly, define your acceptance criteria. What do you mean by “disappointed” - this word is doing most of the heavy lifting in your anecdote. (i.e. I know plenty of coders who are “disappointed” by any code that they didn’t personally write, and become reflexively snobby about LLM code quality. Not saying that’s you, but I can’t rule it out, either.)
The models are not the same, but Haiku is definitely not useless, and without a lot more detail, I just ignore anecdotal statements with this sort of hyperbole. Just to illustrate the larger point, I find something wrong with nearly everything Haiku writes, but then again, I don’t expect perfection. I’d probably get a “better” end result for most individual runs with the more expensive models, but at vastly higher cost that doesn’t justify the difference.
I’m not saying that. If anything, it really doesn’t matter much what model you use, and it’s only a case of “you’re holding it wrong” in the sense that you have to use your brain to write code, and that if you outsource your thinking to a machine, that’s the fundamental mistake.
In other words, it’s a tool, not a magic wand. So yeah, you do have to understand how to use it, but in a fairly deterministic way, not in a mysterious woo-woo way.
Today we’re making the following changes to GitHub Copilot’s Individual plans to protect the experience for existing customers: pausing new sign-ups, tightening usage limits, and adjusting model availability. We know these changes are disruptive, and we want to be clear about why we’re making them and how they will affect you.
Agentic workflows have fundamentally changed Copilot’s compute demands. Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support. As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability. Without further action, service quality degrades for everyone.
We’ve heard your frustrations about usage limits and model availability, and we need to do a better job communicating the guardrails we are adding—here’s what’s changing and why.
These changes are necessary to ensure we can serve existing customers with a predictable experience. If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and receive a refund for the time remaining on your current subscription by visiting your Billing settings before May 20.
GitHub Copilot has two usage limits today: session and weekly (7 day) limits. Both limits depend on two distinct factors—token consumption and the model’s multiplier.
The session limits exist primarily to ensure that the service is not overloaded during periods of peak usage. They’re set so most users shouldn’t be impacted. Over time, these limits will be adjusted to balance reliability and demand. If you do encounter a session limit, you must wait until the usage window resets to resume using Copilot.
Weekly limits represent a cap on the total number of tokens a user can consume during the week. We introduced weekly limits recently to control for parallelized, long-trajectory requests that often run for extended periods of time and result in prohibitively high costs.
The weekly limits for each plan are also set so that most users will not be impacted. If you hit a weekly limit and have premium requests remaining, you can continue to use Copilot with Auto model selection. Model choice will be reenabled when the weekly period resets. If you are a Pro user, you can upgrade to Pro+ to increase your weekly limits. Pro+ includes over 5X the limits of Pro.
Usage limits are separate from your premium request entitlements. Premium requests determine which models you can access and how many requests you can make. Usage limits, by contrast, are token-based guardrails that cap how many tokens you can consume within a given time window. You can have premium requests remaining and still hit a usage limit.
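A rough way to picture these two meters is as independent counters, where either one can block a request. The numbers in this sketch are placeholders for illustration, not the actual limits:

```python
# Two independent meters: premium requests (charged per prompt, scaled by the
# model multiplier) and a weekly token budget. Numbers are illustrative only.
premium_requests_left = 1500      # monthly entitlement (Pro+ figure)
weekly_token_budget = 5_000_000   # hypothetical weekly token cap

def send_prompt(model_multiplier: float, tokens_used: int) -> bool:
    """Charge both meters; either one can block the request."""
    global premium_requests_left, weekly_token_budget
    if premium_requests_left < model_multiplier:
        return False  # out of premium requests
    if weekly_token_budget < tokens_used:
        return False  # weekly usage limit hit, despite requests remaining
    premium_requests_left -= model_multiplier
    weekly_token_budget -= tokens_used
    return True

# A handful of long, parallelized agent runs drain the token budget
# while barely denting the request entitlement:
for _ in range(20):
    send_prompt(model_multiplier=7.5, tokens_used=300_000)

print(premium_requests_left)  # 1380.0 -> plenty of premium requests left
print(weekly_token_budget)    # 200000 -> too low for another big run
```

The takeaway: multipliers drain per prompt, while long agent sessions drain per token, so the two meters run out at very different rates.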
Starting today, VS Code and Copilot CLI both display your available usage when you’re approaching a limit. These changes are meant to help you avoid a surprise limit.

[Image: Usage limits in VS Code]

[Image: Usage limits in Copilot CLI]
If you are approaching a limit, there are a few things you can do to help reduce the chances of hitting it:
- `/fleet` will result in higher token consumption and should be used sparingly if you are nearing your limits.

We’ve seen usage intensify for all users as they realize the value of agents and subagents in tackling complex coding problems. These long-running, parallelized workflows can yield great value, but they have also challenged our infrastructure and pricing structure: it’s now common for a handful of requests to incur costs that exceed the plan price! These are our problems to solve. The actions we are taking today enable us to provide the best possible experience for existing users while we develop a more sustainable solution.
Editor’s note: Updated April 21, 2026, to clarify the refund policy.
VP of Product
This points toward a deeper issue, though. We'll probably see more individual offerings dry up over time. That means individuals will be stuck with hand coding while the hyper-productive, AI-assisted coders will all be at large organizations. If that happens, we'll enter a phase where computing is once more available exclusively to the elite few.