Having access to dozens of models through a single API key, tracking cost of each request, being able to run the same request on different models and comparing their results next to each other, separating usages through different API keys, adding your own presets, setting your routing rules...
And once you start using an account with multiple users, it's even more useful to have all those features!
Not relying on a subscription and having the right to do exactly what you want with your API key (using it with any tool/harness...) is also a big plus to me.
My backup has been Opencode + Kimi K2. It's definitely not as strong as even Sonnet but it's pretty fast and is serviceable for basic web app work like the above.
It's nice that it works for the author, though, and OpenRouter is pretty nice for trying out models or interacting with multiple ones through a unified platform!
It’s not just Zed: Copilot also reduces the capabilities and options available compared to using the models directly.
No thanks, definitely agree with the Open Router approach or native harness to keep full functionality.
I am in a situation where every sub-folder has its own language server settings, lint settings, etc. VSCode (and forks) can handle this by creating a workspace, adding each folder to the workspace, and having a separate .vscode per-folder. I haven't figured out how to do the same with Zed.
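In case it helps, VSCode's multi-root setup hangs off a `.code-workspace` file (JSONC, so comments are allowed), and each folder listed in it can then carry its own `.vscode/settings.json`. A minimal sketch with made-up folder names:

```json
{
  "folders": [
    { "path": "backend" },
    { "path": "frontend" }
  ],
  "settings": {
    // workspace-wide defaults; each folder's own
    // .vscode/settings.json overrides these per-folder
    "editor.formatOnSave": true
  }
}
```

Zed's project model doesn't map one-to-one onto this, which is presumably why the per-folder settings don't carry over.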
I would love to stop using VSCode forks
OpenCode picked up my CLAUDE.md files and skills straight away, and I got similar performance to Opus 4.6.
I might be paranoid, but I feel that access to models will become more constrained in the future as the industry gets more regulated.
Any insights / suggestions / best practices?
I switched to OpenCode Zen + GitHub Copilot. For some reason, Claude Code burns through my quota really quickly.
OpenCode Go has the simplest plan at the highest rate limits for any subscription plan with multiple model families, and it's $10/month ($5/month for first month). With the cheapest model in the plan (MiniMax M2.5), it is a 13x higher rate than Claude Max, at 1/10th the price. The most expensive model (GLM 5.1) gives you a rate of 880 per 5h, which is more than any other $10 plan. I don't expect this price to last, it makes no sense. OpenCode also has a very generous free tier with higher rates than some paid plans, but the free models do collect data.
The cheapest plan of all is free and unlimited - GitHub Copilot. They offer 3 models for free with (supposedly) no limit - GPT-4o, GPT-4.1, and GPT-5-mini. I would not suggest coding with them, but for really basic stuff, you can't get better than free. I would not recommend their paid plans, they actually have the lowest limits of any provider. They also have the most obtuse per-token pricing of any provider. (FYI, GitHub Copilot OAuth is officially supported in OpenCode)
The next cheapest unlimited plan is BlackBox Pro. Their $10/month Pro plan provides unlimited access to MiniMax M2.5. This model is good enough for coding, and the unlimited number of requests means you can keep churning with subagents long after other providers have hit a limit.
The next cheapest is MiniMax Max, a plan from the makers of MiniMax. For $50/month, you get 15,000 requests per 5-hours to MiniMax M2.7. This is not as cheap as OpenCode Go, which gives you 20,000 requests of MiniMax M2.5 for $10, but you are getting the newer model.
If you don't want to use MiniMax, the next cheapest is Chutes Pro. For $20/month, you get a monthly limit of 5,000 requests.
I'll be adding more of these as I find them to this spreadsheet: https://codeberg.org/mutablecc/calculate-ai-cost/src/branch/...
Note: This calculation is inaccurate, for multiple reasons. For one, it's entirely predicated on working 8 hours a day, 22 days a month; I'll recalculate at some point to find the cheapest option if you wanted to churn 24/7. For another, some providers (coughANTHROPIC) don't actually tell you what their limits are, so we have to guess and use an average. But based on my research, the calculations seem to match up with the per-request API cost reported at OpenRouter. Happy to take suggestions on improvements.
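As a rough illustration of the arithmetic behind those comparisons, here is the cost-per-request calculation under the same assumptions (plan prices and limits are the ones quoted above; treat them as snapshots, not current pricing):

```python
# Back-of-envelope cost-per-request comparison using the numbers quoted above.
# Assumes the stated per-5h limits hold and usage fills 8h/day, 22 days/month.

HOURS_PER_DAY = 8
DAYS_PER_MONTH = 22
windows_per_month = (HOURS_PER_DAY * DAYS_PER_MONTH) / 5  # 5-hour rate windows

plans = {
    # name: (price_usd_per_month, requests_per_5h_window)
    "OpenCode Go (GLM 5.1)": (10, 880),
    "MiniMax Max (M2.7)": (50, 15_000),
}

for name, (price, per_window) in plans.items():
    monthly_requests = per_window * windows_per_month
    print(f"{name}: {price / monthly_requests * 100:.4f} cents/request")
```

Switching the assumption to 24/7 churn just means replacing the 8x22 figure with 24x30, which is why the ranking can shift for always-on workloads.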
Because GH is accessing the API behind the scenes, you should face less degradation when using Sonnet/Opus models compared to a Claude subscription.
Keep a ChatGPT $20 subscription alongside for back-and-forth conversations and you'll get great bang for buck.
OpenRouter credit rollover is the real insight — credits that don't expire vs time windows that reset whether you used them or not. I'm surprised Anthropic hasn't offered a token pack option alongside the subscription.
Also, ditching Claude Code entirely is a mistake. It's quite a capable model, and still great value. I would keep it, even if it's just for code reviews and planning. Anthropic allows Pro plan use in Zed.
I have actually ditched PyCharm for the snappiness of Zed. But the paper cuts are really adding up.
well worth the 5% they take
For general use, I personally don’t see much justification as to why I would want to pay a per-token fee just to not create a few accounts with my trusted providers and add them to an instance for users. It is transparent to users beyond them having a single internal API key (or multiple if you want to track specific app usage) for all the models they have access to, with limits and logging. They wouldn’t even need to know what provider is hosting the model and the underlying provider could be swapped without users knowing.
It is certainly easier to pay a fee per token on a small scale and not have to run an instance, so less technical users could definitely find advantage in just sticking with OpenRouter.
I opened just one of the TypeScript projects inside VSCode and I see something like 1 GB of usage (combining the helper processes). I'm not using it actively, so no extra plugins and so on.
That's on mac, so I guess it may vary on other systems.
I don’t have any extensions installed and I’m basically leaving it open, idle, as a note scratch space. I do have projects open with many files but not many actual files are open
Anyway idk
Spent a couple of hours trying to make the Svelte extension ignore a particular type of false positive CSS error, failed, and returned to VS Code
Will definitely give it another chance when the extension system is more mature though!
Eg: Ctrl+P "Open Fol.." in Zed does not surface "Opening a Folder". Zed doesn't call them folders. You have to know that's called "Workspace". And even then, if you type "Open Work..." it doesn't surface! You have to purposefully start with "work..."
Just the floating and ephemeral "Search in files" modal in JetBrains IDEs would convince me to switch from any other IDE.
If anything I would consider switching to an OpenAI subscription (if I didn't despise them even more than Anthropic as a company), but converting to API use seems completely infeasible to me. I'd have to severely cut back on my use for not much benefit, other than maybe having an agent that's a little less janky than CC.
Now I'm happy with agents as the models and harnesses have improved significantly but the token usage comes at a cost.
It’s mad for sure, but I’d bet 99.9% of people spending money on AI aren’t spending their own hard earned sooo… “YOLO it’s a business expense/investment”…
Money is relative. I retired at less than the average professor salary (all ages) at a not-rich school. I would have made more in tech. I still have weeks where the market goes up 2000x my AI budget, just the retirement savings from my salary. Anyone who isn't living in a van and eating peanut butter if they must, to save the max toward retirement, isn't recognizing how profoundly our system is rigged to favor saving.
It easily pays for itself 10x over.
Not true in any non startup where there is an actual finance department
But if OpenRouter does better (even though it's the same sort of API layer) maybe it's worth it?
OpenCode Go gives about 14x the requests of Copilot Pro. I figured there must be something not right.
Then I compared the best model on OpenCode Go, GLM 5.1, against Anthropic Opus 4.6: yes, Opus is better on most benchmarks, but GLM 5.1 is not too far behind.
I'd like to give the new GLM models a try for personal stuff.
No need for database MCP, I use postgres and tell it to use psql.
Occasionally I use prettier to remove indentation - the LLM makes far fewer edit errors that way. Just add the indent back before you commit. Or tell pi to do it.
At first I thought I was going to build lots of extra plugins and commands, but what ended up working for me is:
- a simple command that pulls context from a Linear issue
- a simple review command
- project-specific skills for common tasks
It's designed to be a small simple core with a rich API which you can use for extensions (providing skills, tools, or just modifying/extending the agent's behaviour).
It's likely that you'll eventually need to find extensions for some extended functionality, but for each feature you can pick the one that fits your need exactly (or just use Pi to hack a new extension).
1. What do you use the hooks for?
2. Do you use an editor alongside the CLI to review code or only examine the diffs?
The new gimped Claude Code limits mean my Claude Code usage over the last month was worth $131 at API rates. It cost me $20, plus an additional $5 I spent on extra usage.
While VC's are setting fire to money I am going to warm my hands.
1. The LLM provider doesn't know it's you (unless you have personally identifiable information in your queries). If N people are accessing GPT-5.x using OpenRouter, OpenAI can't distinguish the people. It doesn't know if 1 person made all those requests, or N.
2. The ability to ensure your traffic is routed only to providers that claim not to log your inputs (not even for security purposes): https://openrouter.ai/docs/guides/routing/provider-selection...
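If I'm reading OpenRouter's routing docs correctly, point 2 is expressed per-request via a `provider` object in the chat-completions payload; a sketch (the model slug is a placeholder, and the actual POST is left as a comment so the snippet stands alone):

```python
import json

# Sketch of an OpenRouter chat request restricted to providers that
# claim not to retain prompts. The "provider" routing object and its
# "data_collection": "deny" option come from OpenRouter's provider
# routing docs; the model name and message are placeholders.
payload = {
    "model": "openai/gpt-5.2",  # placeholder model slug
    "messages": [{"role": "user", "content": "hello"}],
    "provider": {
        "data_collection": "deny",  # skip providers that log inputs
        "allow_fallbacks": True,    # still fall back among compliant hosts
    },
}
print(json.dumps(payload, indent=2))
# Send with: requests.post("https://openrouter.ai/api/v1/chat/completions",
#                          headers={"Authorization": f"Bearer {API_KEY}"},
#                          json=payload)
```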
It's been forever since I played with LiteLLM. Can I get these with it?
If you're only using flagship model providers then openrouter's value add is a lot more limited
Sadly Zed seems to add 10% so it's still more worthwhile to use OpenRouter.
And people keep claiming the token providers are running inference at a profit.
I am only doing single project workflows, but with Z.ai I feel like it opens a whole new door to parallel workflows without hitting usage limits.
Honesty as a marketing strategy is really undervalued in cases like this
Due to the quota changes, I actually find myself using Claude less and less
> Run <other harness> in tmux and interrogate it how feature X works, then build me the equivalent as a pi extension.
Maybe in a few years there will be obvious patterns with harnesses having built really optimal flows, but right now it works so much better to experiment and try new approaches and prompts and flows, and pi is the easiest one to tweak and make it your own.
With the anthropic billing change (not being able to use the max credits for pi) I think I have to cancel - as I'm whirring through credits now.
Going to move to the $250/mo OpenAI codex plan for now.
He went on an "OSS vacation", which is perfectly reasonable and said he'd be back on a certain date. I had a PR open for a trivial fix, someone asked when it would land. I shared he was still away. After his return I politely asked, "@badlogic hey, what can we do to progress this? Thanks x"
I then got what I would consider an abusive reply, because he confused me with someone else. In the meantime he extended his vacation. Didn't even think his shitty attitude was worthy of an apology, that HE confused me with someone else.
https://github.com/badlogic/pi-mono/discussions/1475#discuss...
And another thing I fixed got no attribution; he just landed it himself separately. https://github.com/badlogic/pi-mono/discussions/1080
and
https://github.com/badlogic/pi-mono/issues/1079#event-223896...
Now he's seemingly marked anything with my name on as a "clanker", despite all my changes being by hand.
I've been around open source enough to have a thick skin, but when I'm doing something "for fun" and someone treats me like that, I'd rather avoid it as far as possible. I certainly could not in good faith use this project for anything work related.
But their tab complete situation is abysmal, and Supermaven got macrophaged by Cursor
The liabilities are completely offset by prepayments from your customers though. Even better, you can earn interest on the deposits without paying any out.
If you just don't want the liabilities on the books, issue refunds. Expiring credits feels like a cash grab.
So yes, obviously you can do what you want as long as you abide by the terms of service, but the terms of service do NOT allow you to resell the API.
But at that point we are just min/maxing the details, and all I can say is if you are on a $100/$200 a month subscription to any of these services and not using them regularly then you shouldn't be on a $200 subscription any more than you should be on a $700 a month gym membership when you go every 3 months for 15 minutes.
If you're trying to minimize cost then having one of the inexpensive models do exploratory work and simple tasks while going back to Opus for the serious thinking and review is a good hybrid model. Having the $20/month Claude plan available is a good idea even if you're primarily using OpenRouter available models.
I think trying to use anything other than the best available SOTA model for important work is not a good tradeoff, though.
I'm pretty conservative when it comes to clearing the context, and I also tend to provide the right files to work on (or at least the right starting point).
I had seen prior to using the model that it starts producing much worse results when the context used is larger, so my usage style probably helps getting better results. I work like this with Claude Code anyway, so it wasn't a big change.
We are not the only ones. I found other people online experiencing the same issue. It is hard to tell how widespread this is, but it is strange to say the least.
85% discount is actually a bit lower than I remember. I think it used to be closer to 90-95%. They're getting stingy ;)
Extrapolating that out, the subscription pricing is HEAVILY subsidized. For similar work in Claude Code, I use a Pro plan for $20/month, and rarely bang up against the limits.
- context is aggressively trimmed compared to CC obviously for cost saving reasons, so the performance is worse
- the request pricing model forces me to adjust how I work
Just these two alone are not worth saving the $60/month for me. I like the VSCode integration, and the MCP/LSP usage sometimes surprised me compared to the dumb grep from CC. Ironically, VSCode is becoming my terminal emulator of choice for all the CLI agents - SSH/container access, the automatic port mapping, etc. - it's more convenient than tmux sessions for me. So Copilot would be ideal for me, but it's just tweaked to be a budget, broad-scope tool rather than a tool for professionals who would pay to get work done.
https://www.techradar.com/pro/bad-news-skeptics-github-says-...
The minus is that context caching only works moderately well at best, rendering nearly all the savings useless.
FWIW this is highly unlikely to be true.
It's true that the upstream provider won't know it's _you_ per se, but most LLM providers strongly encourage proxies like OpenRouter to distinguish between downstream clients for security and performance reasons.
For example:
- https://developers.openai.com/api/docs/guides/safety-best-pr...
- https://developers.openai.com/api/docs/guides/prompt-caching...
Many of us got the annual Lite plan when they had the $28 discount. But even at $120 I think it's a good deal.
if that wasn't the reason, hey that's actually a great way to launder money (not financial advice).
OpenRouter is a valuable service but I’ll probably try to run my own router going forward.
As someone else pointed out cooler heads and less passive aggressive responses would've resolved this issue easily.
That’s what really appeals to me. I’ve been fighting Claude Code’s attempts to put everything in memory lately (which is fine for personal preferences), when I prefer the repo to contain all the actual knowledge and learnings. Made me realise how these micro-improvements could ultimately, some day, lead to lock-in.
> Run <other harness> in tmux and interrogate it how feature X works, then build me the equivalent as a pi extension.
I’ll give it a try!
I deffo get more perceived value out of it than the $100 I pay. Could I get MORE value for the same $100? IMO only through OpenAI (no harness lock-in and more lenient limits), but I deeply dislike the way their company is evolving. Admittedly, recent launches from Anthropic like managed agents and Mythos Preview don't make me very hopeful the individual developer pricing is here to stay, but I'll use what I can get while I can get it.
Could I get my required value for less than $100? Mayyybe I could get by with, like, three Anthropic $20 plans? Or 2x $20 and an OAI $20? But this is so min-maxy that I just don't really want to bother. Pay-by-token would kill my workflow instantly. I'd have to add so many steps for model selection alone. I'll cross that bridge when Anthropic cuts me off.
I agree though most people on the $200 plans are either just not using them or in some deep AI psychosis. I'd like to exclude myself from these groups, but the pipeline to AI psychosis seems very wishy washy to begin with (the thread the other day about iTunes charts being AI dominated had a surprising amount of people defending AI music, imo).
Not everyone gets $1K of usage, and you don't know how fat the per-token margins are. It's like saying the local buffet place is losing money because you eat $100 worth of takeout for $30.
Honestly, it seems like both of you were feeling a bit "grumpy" at the moment, but sending passive aggressiveness towards the maintainer you are trying to get to merge your code (or not your code, someone else's code?) seems like a very bold strategy regardless.
Is OpenAI codex not also charging by usage instead of subscription when using pi?
If I just let opencode zen run claude opus to plan and execute, I'd spend $20 in like 5 minutes lol
OpenRouter recently started enforcing account-level regional restrictions for providers that enforce it (OpenAI, Anthropic, Google) - ie blocking accounts that look like they are being used by users in China. The regional restriction used to be based on the Cloudflare edge worker IP's geolocation and enforced upstream, so a proxy/server running inside of supported regions would get around the geoblocks, but now OpenRouter are using (unspecified) signals like your billing address to geoblock. People say "banned" because the error message says "Author <provider> is banned", which really should be read as "Unable to use models from provider due to upstream ban".
> TOS says: access the Site or Service for purposes of reselling API access to AI Models or otherwise developing a competing service;
I think what you meant is "you aren't allowed to expose the access to the API to end users", which is a fair condition IMHO.
You're still allowed to expose the functionality (ie. build a SaaS or AI assistant powered by OpenRouter API), just don't build a proxy.
I’ve been disappointed to find that I’m hitting Claude limits faster than before. For context, I use both Claude Code and the Claude desktop app for work and pay $100/month for the privilege of hitting limits. I’m not the only one (one such report came from AMD’s senior director of AI), with numerous other reports found across Reddit and Twitter.
My usage pattern is “bursty” so I’m not using the windows all the time throughout the day but find it incredibly frustrating to hit a limit mid-way through a coding session.
This article is how I’m reallocating that spend to other tools and models while getting more flexibility at the same time.
I like options and while Opus is undoubtedly the market leader for agentic coding, there are other models that I like to use to balance cost and speed depending on the complexity of the task in hand. I’m looking at how I can use different models with an Agent Harness.
An “agent harness” coordinates sending and receiving messages from LLMs, injecting tool definitions, calling the tools, and orchestrating all of this into workflows (including retrying failed tasks).
Claude Code is an example of such a system. It takes the user message, coordinates reading/writing files - among other things - and makes calls to the LLM.
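Stripped to its essentials, that loop can be sketched like this (a toy illustration only: `call_llm` is a stub standing in for a real API call, and real harnesses also track tool schemas, streaming, and retries):

```python
# Toy sketch of the loop an agent harness runs. call_llm() is stubbed so
# the control flow is visible without network access; the tool and
# messages are made up.

def word_count(text: str) -> int:
    """A 'tool' the harness exposes to the model."""
    return len(text.split())

TOOLS = {"word_count": word_count}

def call_llm(messages):
    """Stub: a real harness would POST messages + tool definitions to an API.
    First call: request a tool. Second call: produce a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "word_count", "args": {"text": "hello agent world"}}
    return {"answer": f"done ({messages[-1]['content']} words)"}

def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:                          # the harness loop
        reply = call_llm(messages)
        if "answer" in reply:            # model is finished
            return reply["answer"]
        tool = TOOLS[reply["tool"]]      # model requested a tool call
        try:
            result = tool(**reply["args"])
        except Exception as e:           # retry/failure handling would live here
            result = f"tool error: {e}"
        messages.append({"role": "tool", "content": str(result)})

print(run_agent("summarise this text"))
```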
Plans: $10 / month - pricing page
You don’t realise how slow and laggy VSCode and all of its forks are until you try out Zed. The built-in agent harness is basic but nice, with the ability to follow the agent around as it modifies files and to add new profiles to customise the agent’s behaviour. Like Cursor, it shows the context usage and the rules being applied to the current session. If you continue to use Claude Code or other tools like Mistral Vibe, Zed integrates them directly into the editor using the Agent Client Protocol (ACP) - see supported agents.
The biggest disadvantage is definitely the lack of extensions compared to VSCode but there are enough to cover common languages and common tasks.
Zed does offer usage-based pricing once you have used up the credits they provide; however, their token prices are higher than going directly to the API. This is why I prefer to use the OpenRouter integration in Zed instead. A nice side benefit is that you get the more native context window sizes: for some reason Zed limits the Gemini 3.1 context to 200k tokens in its native integration, whereas with OpenRouter you can make use of the full 1M. Their docs say this may be changed in the future.
Edit: It has been brought to my attention (thanks to hhthrowaway1230 on HackerNews) that OpenRouter does.
The largest option of models and providers that I know of is OpenRouter and it’s easy enough to sign up, pre-pay some credits and get an API key.
I don’t like that I have a set window of Anthropic credits. If I use it I have to wait for it to reset (or pay). But when I’m not using it I’m missing out on that window of opportunity. Instead I can top up my OpenRouter credits which expire after 365 days if unused. Then I can use the credits when I’m working and save them/roll-over when I’m not.
To minimise data exposure risk, I have chosen not to consent to OpenRouter being able to use inputs/outputs “to improve the product” (though you get a 1% discount if you do), and I have enabled the “Zero Data Retention (ZDR) Endpoints Only” in my Workspace Guardrail settings. You do lose out on some models here - for example, qwen/qwen3.6-plus which is only hosted on Alibaba Cloud - however that’s a small price I’m willing to pay.
Plans: $20 | $60 | $200 / month - pricing page
I originally switched from VSCode & Copilot to Cursor in 2025 after experiencing the magic of the Cursor “Tab” jumping around the editor preempting my next move.
As it moved from autocomplete-on-steroids to more agentic coding, I was thankful to have access to multiple models to experiment with (this is now also available in Copilot but in the beginning they were OpenAI only).
I mostly ignored Cursor 2.0 as they put more emphasis on the chat interface however with Cursor 3.0 as a complete rewrite (in Rust like Zed) and focused on Agent orchestration, I am curious to try it out.
Cursor was (or still semi-is) my preferred editor. As a VSCode fork, all extensions are available. They were an early adopter of the plan mode -> agent mode workflow and now support a new debug mode which is a more advanced print style debug that the agent can also interact with.
Cursor also supports different types of rule application, something I personally love and am surprised other agent harnesses haven’t adopted. Most agent harnesses take an “apply intelligently” approach, letting the AI decide when to include a rule based on its description. But Cursor also offers the ability to apply a rule only to specific files. I have rules that only apply to *.py files, or even **/models.py etc. I am able to make the most of my context window by explicitly setting those rules to be added only for certain filepath patterns. It guarantees their usage.
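As a sketch of what that looks like in practice: Cursor project rules live under `.cursor/rules/` as `.mdc` files with a frontmatter block, something like the following (field names from memory, and the rule body is invented; check Cursor's docs for the exact schema):

```
---
description: Conventions for Django model files
globs: **/models.py
alwaysApply: false
---

- Every model gets a __str__ method and a Meta.ordering.
- Use TextChoices for enum-like fields.
```

With `globs` set, the rule is injected whenever a matching file is in play, regardless of whether the model would have chosen to include it.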
Choosing Cursor you get API rate pricing above the included use in your plan (and you can limit this so your total spend is limited to $100) but you are still paying minimum $20/month which does not roll over to the next month.
Claude Code is optimized for Anthropic models and may not work correctly with other providers.
I know - I said I’m redirecting funds away from Anthropic, but it is possible to continue using the Claude Code agent harness with other models (or even Opus should you want to). We might want to do this because Claude Code is undeniably a great harness, however we need to configure Claude Code to use OpenRouter rather than the Anthropic API.
First, log out of Claude Code if you have been using it before:
claude
> /logout
Next, set some environment variables to configure the OpenRouter endpoints and which models you want to use for “Opus”, “Sonnet”, “Haiku” and “SubAgents” (I recommend setting these in your ~/.zshrc or ~/.bashrc file so they persist):
export OPENROUTER_API_KEY="<your-openrouter-api-key>"
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
export ANTHROPIC_API_KEY="" # Important: Must be explicitly empty
# Set these models to whichever model you would like to use on OpenRouter
export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4.6"
export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"
export CLAUDE_CODE_SUBAGENT_MODEL="anthropic/claude-opus-4.6"
Verify that Claude Code is using your new config (you may need to restart your terminal or source ~/.zshrc):
claude
> /status
Auth token: ANTHROPIC_AUTH_TOKEN
Anthropic base URL: https://openrouter.ai/api
There are a multitude of other coding Agent Harnesses that can be used from the command line with OpenRouter. I’ve tried a few but none have stuck; here’s the list for you to try, with my brief thoughts on each:

- TypeScript - The one I use the most. Good support for a lot of things. Very popular.
- Go - I want to like it. It has a distinct style choice (that I don’t mind). It’s performant. But it’s a pain to configure custom models (all manual), so it’s annoying when trying out new ones.

Even for popular tools that typically limit you to their own models, like Gemini CLI, there are often forks which attempt to make them OpenRouter compatible. This is worth checking if you are using and like a different harness but want to try other models.
I’m now a happy subscriber to Zed for the reasonable $10/month. I actually also maintain my Cursor subscription for $20/month as I want to see where they go with their new Cursor 3, agent orchestrator. The other $70 gets auto added to my OpenRouter credits each month which don’t get lost. They rollover, waiting for me to use them.
If you’re regularly hitting Claude limits and want to give other models a shot (but you can still use Opus when you need to), I highly recommend giving it a try. You can get started with Zed for free and load up OpenRouter with $20 worth of credits without any subscription.
It's obviously capital-subsidized and so I have zero expectation of that lasting, but it's pretty anti-competitive to Cursor and others that rely on API keys.
No parallel running; I would very consistently get tokens for over 3 hours then take a walk around the block and come back and be ready to go again.
The underlying provider can still rate-limit you. What OpenRouter provides is automatic switching between providers for the same model.
(I could be wrong.)
Well, we're going to find out sooner rather than later. Right now you don't know how thin (or negative) the margins are, either, after all.
All we know for certain is how much VC cash they got. Revenue, spend, profit, etc calculated according to GAAP are still a secret.
It turns it into a very good value for money, as far as I'm concerned.
GHCP at least is transparent about the pricing: hit enter on a prompt = one request. CC/Codex use some opaque quota scheme, where you never really know if a request will be 1, 2, or 10% of your hourly max, let alone your weekly max.
I've never seen much difference with context ostensibly being shorter in GHCP, all of the models (in any provider) lose the thread well before their window is full, and it seems that aggressive autocompaction is a pretty standard way to help with that, and CC/Codex do it frequently.
I have been wanting to subscribe but based on how awful the experience is for most people, I just can’t pull the trigger
For prompt caching, they already say they permit it, and do not consider it "logging" (i.e. if you have zero retention turned on, it will still go to providers who do prompt caching).
Or what are you really saying here? I don't understand how that's related to "you don't have the right to do what you want with the API Key", which is the FUD part.
But that doesn't negate the maintainer talking to people like that (and taking contributions without attribution).. and the net result is I don't want to use the software, and frankly they probably won't miss me.. so the end result is neutral.. I just find it sad.
It does talk about a competing service. If I build a service that offers all the image-gen models of OpenRouter and charges the user per token, am I allowed?
FWIW, I've never dealt with outages since I signed up over 3 months ago (Lite plan). It is slow - always has been. I can live with that.
At the same time, I'm not using it for work. It's for the occasional project once in a while. So maybe I just haven't hit any limits? I did use it for OpenClaw for 2-3 weeks. Never had connection issues.
Looking at https://docs.z.ai/devpack/faq
you can see the details of their limits. Seems GLM 5.1 has low thresholds, and will get lower starting May. On Reddit I see some people switching to GLM 5 and claiming they haven't hit limits - the site doesn't indicate the limits for that model.
They also say that those who subscribed before February have different limits - unsure if it's lower or higher!
GLM-4.7 is still a fairly capable model. Not as good as Opus, but for most personal projects it's been adequate. I see on Reddit plenty of people plan using GLM-5.1, and use 4.7 for implementation.
Then we've had wildly different results. Running CC and GH copilot with Opus 4.6 on same task and the results out of CC were just better, likewise for Codex and GPT 5.4. I have to assume it's the aggressive context compaction/limited context loading because tracking what copilot does it seems to read way less context and then misses out on stuff other agents pick up automatically.
Personally, I've had a lot of good results in my little personal projects with Kimi K2.5, GLM 5 and 5.1, and MiniMax M2.5.
I'm quite sure most adults (perhaps >99%) would consider this passive aggressive.
But yeah, I agree with you for the rest part. Why did Mario assume that bot is you...?
Quote from their own TOS: access the Site or Service for purposes of reselling API access to AI Models or otherwise developing a competing service;
When you say "you don't have the right to do what you want with the API Key" it makes it sound like specific use cases are disallowed, or something similar. "You don't have the right to go against the ToS, for some reason they block you then!" would have been very different, and of course it's like that.
Bit like complaining that Stripe is preventing you from accepting credit card payments for narcotics. Yes, just because you have an API key doesn't mean somehow you can do whatever you want.
Are we allowed, or not, to make a service that charges end users per token, like giving access to Kimi K2.5 to end users through OpenRouter on a pay-per-token basis?