We ought to come up with a term for this new discipline, e.g. "software engineering" or "programming".
This PR was created by the Claude Code Routine:
https://github.com/srid/claude-dump/pull/5
The original prompt: https://i.imgur.com/mWmkw5e.png
And because they use AI heavily, they produce a new product every week. So fast that I have no time to check whether it's worth it or not.
This one looks interesting. I have some custom commands that I execute manually every week, for monitoring, audits, summaries, reports. If it can send reports by email, or generate something that I can read in the morning with my coffee, or after I finish it ;) it might be a good tool.
The question is, do I really want to be that much more productive? I am already performing much better with AI, compared with the 'old school' way...
Everything is just getting too much for me.
Oh cool! vendor lock-in.
It was a bit buggy, but it seems to work better now. Some use cases that worked for me:
1. Go over a slack channel used for feedback for an internal tool, triage, open issues, fix obvious ones, reply with the PR link. Some devs liked it, some freaked out. I kept it.
2. Surprisingly non-code-related: give me a daily rundown (GitHub activity, Slack messages, emails). I tried it with non-Claude-Code scheduled tasks (CoWork); not as good, as it seems the GitHub connector only works in Claude Code. Really good correlation between threads that start on Slack and related email (Outlook), or even my personal Gmail.
I can share the markdowns if anyone is interested, but it's pretty basic.
Very useful (when it works).
I think to become really efficient they'll have to invent a new programming language to eliminate all the ambiguity and non-determinism. Call it "prompt language", with ai-subroutines, ai-labels, and ai-goto.
So who are they building these for?
This Routines feature notably works with the subscription, and it also has API callbacks. So if my Telegram bot calls that API... do I get my Anthropic account nuked or not?
They support many of the same triggers and come with many additional security controls out of the box.
The main bugs / missing features are:
1. It loses connection to its connectors, mostly the Slack connector. It does all the work, then says it can't connect to Slack. Then when you show it a screenshot of itself with the Slack connector, it says, oh yeah, the tools are now loaded, and does the rest of the routine.
2. The ability to connect it to GitHub Packages / Artifactory (private packages), or the dangerous route of allowing access to some sort of vault (with non-critical, dev-only secrets... although it's always a risk. But Cursor has it...).
3. The GitHub MCP not being able to do simple things such as updating release markdown (a super simple use case: creating automated release notes, for example).
You are so close, yet so far...
Am I needed anymore?
- No trust that they won't nerf the tool/model behind the feature
- No trust they won't sunset the feature (the graveyard of LLM-features is vast and growing quickly while they throw stuff at the wall to see what sticks)
- No trust in the company long-term. Both in them being around at all and them not rug-pulling. I don't want to build on their "platform". I'll use their harness and their models but I don't want more lock-in than that.
If Anthropic goes "bad" I want to pick up and move to another harness and/or model with minimal fuss. Buying in to things like this would make that much harder.
I'm not going to build my business or my development flows on things I can't replicate myself. Also, I imagine debugging any of this would be maddening. The value add is just not there IMHO.
EDIT: Put another way, LLM companies are trying to climb the ladder to be a platform. I have zero interest in that; I want a "dumb pipe", I want a commodity, I want a provider, not a platform. Claude Code is as far into the dragon's lair as I want to venture, and I'm only okay with that because I know I can jump to OpenCode/Codex/etc if/when Anthropic "goes bad".
The reason someone would use this vs. third-party alternatives is still the fact that the $200/mo subscription is markedly cheaper than per-token API billing.
Not sure how this works out in the long term when switching costs are virtually zero.
The new reality of coding took away one of the best things for me - that the computer always just does what it is told to do. If the results are wrong it means I'm wrong, I made a bug and I can debug it. Here.. I'm not a hater, it's a powerful tool, but.. it's different.
The report that their code is 90% AI-generated seems more likely the more I attempt to use their products.
Feature delivery rate by Anthropic is basically a fast takeoff in miniature. Pushing out multiple features each week that used to take enterprises quarters to deliver.
EDIT: This comment is apparently [dead] and idk why.
I bet anthropic wants to be there already but doesn't have the compute to support it yet.
Sorry, but I just have to ask. Why is u/minimaxir's comment dead? Is this somehow an error, an attack, or what?
This is a respected user, with a sane question, no?
I vouched, but not enough.
edit: His comment has arisen now. Leaving this up for reference.
I also clearly see the lock-in/moat strategy playing out here, and I don't like it. It's classic SV tactics. I've been burned too many times to let it happen again if I can help it.
We're watching a speed run of growthism, folks.
Oh uh... ok then.
- SDK that allows you to use OAuth authentication!
- Docs updated to say DO NOT USE OAUTH authentication unless authorized! [0]
- Anthropic employee Tweeting "That's not what we meant! It's fine for personal use!" [1]
- An email sent out to everyone saying it's NOT fine do NOT use it [2]
Sigh.
[0] https://code.claude.com/docs/en/agent-sdk/overview#get-start...
[1] https://www.reddit.com/r/ClaudeAI/comments/1r8et0d/update_fr...
This happens in all their UIs, including, say, Claude in Excel, as well.
You can still use OpenClaw on their API pricing tier as much as you want. What they did is not allow subscriptions to be used to power automated third-party workloads, including OpenClaw.
Now, is their messaging around this confusing? Absolutely. The whole thing has been handled shambolically. Everyone knows that they lack the compute to keep up, and likely have lower margins on subscriptions than API; but they cannot just say that because investors may be skittish.
...Except now you sorta-kinda can: now they auto-detect 3rd party stuff and bill you per-token for it?
If I'm reading it right:
To the contrary, they've proven again and again and again they'll absolutely do that the first chance they get.
I see people reaching similar conclusions about various LLM providers. I suspect in the end it'll shake out about the same way: the providers will become practically non-interoperable with each other, whether due to inconvenience, cost, or whatever. So I've not wasted much of my time thinking about it.
The Chilling Effect of this is real and it gets more and more frustrating that they can't or won't clarify.
Another comparison would be "unlimited storage", where "unlimited" means some people will abuse it and the company will soon limit the "unlimited."
edit: And specifically, I'm making an IDE and trying to get Claude Code into it. I frankly have no clue when Claude usage is simply part of an IDE and "okay", and when it becomes a third-party harness...
Too bad we've now managed to turn programming into the same annoying guesswork.
but you can replicate these yourself! I'm happy that ant/oai are experimenting to find pmf for "llm for dev-tools". After they figure out the proper stickiness (or if they go away, or nerf, or raise prices, etc.) you can always take the off-ramp and implement your own llm/agent using the existing open-source models. The cost of building dev-tools is near zero; it is not like codegen, where you need frontier performance.
All these not-really-helpful but vendor-specific "bonuses" sound like a way to try to lock people in, to raise the switching cost.
I'm using, on purpose, a simple process so that at any time I can switch AI provider.
But yeah, there's some annoying overlap here with Cowork, which also has scheduled tasks; in Cowork the tasks can use your desktop, browser, and accounts, which is pretty useful - a big difference from these Claude Code Routines.
I like to just check the release notes from time to time:
https://github.com/anthropics/claude-code/releases
and the equally frenetic openclaw:
https://github.com/openclaw/openclaw/releases
GPT-4.1 was released a year ago today. Sonnet 4 is ~11 months old. The claude-code cli was released last Feb. Gas Town is 3 months old.
This is a chart that simply counts the bullet points in the release notes of claude code since inception:
This is as bad and as slow as it's going to be.
I'd say that counts as yes.
(For clarity: neither are powered by Claude Code Routines. Rather, Claude Code coded them and they're simple cron jobs themselves.)
Yes, I expect that is very much the point here. A bunch of product guys got on a whiteboard and said: okay, the thing is in wide use, but the main moat is that our competitors are even more distrusted in the market than we are; other than that it's completely undifferentiated and can be swapped out in a heartbeat for multiple other offerings. How do we persuade our investors we have a locked-in customer base that won't just up stakes in favour of other options, or just run open source models themselves?
I actually trust that they will.
... until this week! Opus has been struggling worse than Sonnet these last two weeks.
The bell curve up and then back down has been so jarring that I am pivoting to fully diversifying my use of all models to ensure that no one org has me by the horns.
I think the real issue stems from the 1M-token context window change. They did not anticipate the amount of load it would create. Those first few days after they released the new token window, I was making amazing things in one single session, from nothing to something (a new .NET-based programming language inspired by Python, and a Virtual Actor framework in Rust). I think since then they've been trying to tweak too many things, while irritating their users.
They even added a new "Max" thinking mode, and made "High" the old medium, which is ridiculous because you think you're using "High" but really you're not. There's a hidden config file to change their terrible defaults to let Claude be smarter still, and apparently you can toggle off the 1M tokens.
I think the real fix, and I'm surprised nobody there has done this yet, is to let the user trim down their context window.
Think about it: you used to have what, 350k tokens or so? Now Claude will keep sending your prompt from 30 minutes ago, which is completely irrelevant, to the back-end, whereas 3 months ago it would have been compacted by now.
Others have noted that similar prompting for some ungodly reason adds tens of thousands of extra garbage tokens (not sure why).
Edit: looks like someone figured out that if you downgrade your version of Claude Code and change one single setting, it unruins Claude:
airgramming plusgramming programming maxgramming studiogramming
and recently the brand new way of working: Neogramming !
Personally I stick for now with the "Programming" tier. Maybe I'll upgrade to "Maxgramming" later this year...
For example, this demo (https://github.com/barnum-circus/barnum/tree/master/demos/co...) converts a folder of files from JS to TS. It's something an LLM could (probably) do a decent job of, but 1. not necessarily reliably, and 2. you can write a much more complicated workflow (e.g. retry logic, timeout logic, adding additional checks like "don't use as casts", etc), 3. you can be much more token efficient, and 4. you can be LLM agnostic.
So, IMO, in the presence of tools like that, you shouldn't bother using /loop, code routines, etc.
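A rough shell sketch of the retry/timeout style of workflow described above. This is not the linked demo's actual API; claude -p, the prompt text, and the verification step are all illustrative placeholders:

```bash
#!/usr/bin/env bash
# Wrap a headless LLM conversion step in explicit retry and timeout logic,
# with a deterministic check after each attempt.
set -uo pipefail

for attempt in 1 2 3; do
  # Give each attempt at most 10 minutes (placeholder prompt and tool).
  if timeout 600 claude -p "Convert src/ from JS to TS; do not use 'as' casts"; then
    # Deterministic verification: accept only if the TypeScript build passes.
    if npx tsc --noEmit; then
      echo "conversion verified on attempt $attempt"
      exit 0
    fi
  fi
  echo "attempt $attempt failed, retrying" >&2
done
exit 1
```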
It says in the prohibited use section:
> Except when you are accessing our Services via an Anthropic API Key or where we otherwise explicitly permit it, to access the Services through automated or non-human means, whether through a bot, script, or otherwise.
So it seems like using a harness or your own tools to call claude -p is fine, AS LONG AS A HUMAN TRIGGERS IT. They don’t want you using the subscription to automate things calling claude -p… unless you do it through their automation tools I guess? But what if you use their automation tool to call your harness that calls claude -p? I don’t actually know. Does it matter if your tool loops to call claude -p? Or if your automation just makes repeated calls to a routine that uses your harness to make one claude -p call?
It is not nearly as clear as I thought 10 minutes ago.
Edit: Well, I was just checking my usage page and noticed the new 'Daily included routine runs' section, where it says you get 15 free routine runs with your subscription (at least with my max one), and then it switches to extra usage after that. So I guess that answers some of the questions... by using their routine functionality they are able to limit your automation potential (at least somewhat) in terms of maxing out your subscription usage.
Cursor has that too by the way (issue -> remote coding session -> PR -> update slack)
What grinds my gears is how Anthropic is actively avoiding standards. Like being the only harness that doesn't read AGENTS.md. I work on AI infra and use different models all the time, Opus is really good, but the competition is very close. There's just enough friction to testing those out though, and that's the point.
I guess I'm one of the people who disagree, specifically about AWS. I think a lot of companies just watch their bill go up because they don't have the appetite to unwind their previous decision to go all-in on AWS.
Ignoring egress fees, migrating storage and compute isn't hard, it's all the auxiliary stuff that's locked in, the IAM, Cognito, CloudFormation, EventBridge, etc... Good luck digging out of that hole. That's not to say that AWS doesn't work well, but unless you have a light footprint and avoided most of their extra services, the lock-in feels pretty real.
That's what it feels like Anthropic is doing here. You could have a cron job under your control, or you could outsource that to a Claude Routine. At some point the outsourced provider has so many hooks into your operations that it's too painful to extract yourself, so you just keep the status quo, even if there's pain.
Chinese models (GLM, MiniMax) are better.
I downgraded my $200/mo sub to $20 this past week and I’m going to try out Codex’s Pro plans. Between the cache TTL (does it even affect me? No idea), changes in the rate limit, 429 rate limit HTTP status code during business hours, adaptive thinking (literally the worst decision they’ve ever made, as far as my line of work is concerned), dumb agent behavior silently creating batshit insane fallthroughs, clearly vibe coded harness/infrastructure, and their total lack of transparency, I think I’m done. It was fun while it lasted but I’m tired of paying for their mistakes in capacity planning and I feel like the big rug pull (from all three SOTA providers) is coming like a freight train.
There's utility in LLMs for coding, but having literally the entire platform vibe-coded is too much for me. At this point, I might genuinely believe they're not intentionally watering anything down, because it's incredibly believable that they just have no clue how any of it works anymore.
But this week I've lost count of the times I've had to say something along the lines of: "Can you check our plan/instructions, I'm pretty sure I said we need to do [this thing] but you've done [that thing]..."
And get hit with a "You're absolutely right...", which virtually never happened for me. I think maybe once since Opus 4-6.
A routine is a saved Claude Code configuration: a prompt, one or more repositories, and a set of connectors, packaged once and run automatically. Routines execute on Anthropic-managed cloud infrastructure, so they keep working when your laptop is closed. Each routine can have one or more triggers attached to it: a schedule trigger that runs on a recurring cadence, an API trigger that starts a run from an authenticated HTTP request, or a GitHub trigger that reacts to repository events.
A single routine can combine triggers. For example, a PR review routine can run nightly, trigger from a deploy script, and also react to every new PR. Routines are available on Pro, Max, Team, and Enterprise plans with Claude Code on the web enabled. Create and manage them at claude.ai/code/routines, or from the CLI with /schedule. This page covers creating a routine, configuring each trigger type, managing runs, and how usage limits apply.
Each example pairs a trigger type with the kind of work routines are suited to: unattended, repeatable, and tied to a clear outcome.

- **Backlog maintenance.** A schedule trigger runs every weeknight against your issue tracker via a connector. The routine reads issues opened since the last run, applies labels, assigns owners based on the area of code referenced, and posts a summary to Slack so the team starts the day with a groomed queue.
- **Alert triage.** Your monitoring tool calls the routine's API endpoint when an error threshold is crossed, passing the alert body as text. The routine pulls the stack trace, correlates it with recent commits in the repository, and opens a draft pull request with a proposed fix and a link back to the alert. On-call reviews the PR instead of starting from a blank terminal.
- **Bespoke code review.** A GitHub trigger runs on pull_request.opened. The routine applies your team's own review checklist, leaves inline comments for security, performance, and style issues, and adds a summary comment so human reviewers can focus on design instead of mechanical checks.
- **Deploy verification.** Your CD pipeline calls the routine's API endpoint after each production deploy. The routine runs smoke checks against the new build, scans error logs for regressions, and posts a go or no-go to the release channel before the deploy window closes.
- **Docs drift.** A schedule trigger runs weekly. The routine scans merged PRs since the last run, flags documentation that references changed APIs, and opens update PRs against the docs repository for an editor to review.
- **Library port.** A GitHub trigger runs on pull_request.closed filtered to merged PRs in one SDK repository. The routine ports the change to a parallel SDK in another language and opens a matching PR, keeping the two libraries in step without a human re-implementing each change.

The sections below walk through creating a routine and configuring each of these trigger types.
Create a routine from the web, the Desktop app, or the CLI. All three surfaces write to the same cloud account, so a routine you create in the CLI shows up at claude.ai/code/routines immediately. In the Desktop app, click New task and choose New remote task; choosing New local task instead creates a local Desktop scheduled task, which runs on your machine and is not a routine. The creation form sets up the routine's prompt, repositories, environment, connectors, and triggers.

Routines run autonomously as full Claude Code cloud sessions: there is no permission-mode picker and no approval prompts during a run. The session can run shell commands, use skills committed to the cloned repository, and call any connectors you include. What a routine can reach is determined by the repositories you select and their branch-push setting, the environment's network access and variables, and the connectors you include. Scope each of those to what the routine actually needs.

Routines belong to your individual claude.ai account. They are not shared with teammates, and they count against your account's daily run allowance. Anything a routine does through your connected GitHub identity or connectors appears as you: commits and pull requests carry your GitHub user, and Slack messages, Linear tickets, or other connector actions use your linked accounts for those services.
Run /schedule in any session to create a scheduled routine conversationally. You can also pass a description directly, as in /schedule daily PR review at 9am. Claude walks through the same information the web form collects, then saves the routine to your account. /schedule in the CLI creates scheduled routines only. To add an API or GitHub trigger, edit the routine on the web at claude.ai/code/routines. The CLI also supports managing existing routines. Run /schedule list to see all routines, /schedule update to change one, or /schedule run to trigger it immediately.
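Collected in one place, the /schedule invocations described above look like this (illustrative; run them from any Claude Code session):

```
# Create a scheduled routine conversationally
/schedule

# Or pass a description directly
/schedule daily PR review at 9am

# Manage existing routines
/schedule list      # see all routines
/schedule update    # change one (including setting a cron expression)
/schedule run       # trigger a routine immediately
```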
Open the Schedule page in the Desktop app, click New task, and choose New remote task. The Desktop app shows both local scheduled tasks and routines in the same grid. See Desktop scheduled tasks for details on the local option.
A routine starts when one of its triggers matches. You can attach any combination of schedule, API, and GitHub triggers to the same routine, and add or remove them at any time from the Select a trigger section of the routine’s edit form.
A schedule trigger runs the routine on a recurring cadence. Pick a preset frequency in the Select a trigger section: hourly, daily, weekdays, or weekly. Times are entered in your local zone and converted automatically, so the routine runs at that wall-clock time regardless of where the cloud infrastructure is located. Runs may start a few minutes after the scheduled time due to stagger. The offset is consistent for each routine. For a custom interval such as every two hours or the first of each month, pick the closest preset in the form, then run /schedule update in the CLI to set a specific cron expression. The minimum interval is one hour; expressions that run more frequently are rejected.
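For a concrete picture, the two custom intervals mentioned above map to standard five-field cron expressions, which could be supplied via /schedule update (the exact input format the CLI expects is an assumption here):

```
# minute hour day-of-month month day-of-week
0 */2 * * *    # every two hours, on the hour (respects the one-hour minimum)
0 9 1 * *      # 09:00 on the first day of each month
```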
An API trigger gives a routine a dedicated HTTP endpoint. POSTing to the endpoint with the routine’s bearer token starts a new session and returns a session URL. Use this to wire Claude Code into alerting systems, deploy pipelines, internal tools, or anywhere you can make an authenticated HTTP request. API triggers are added to an existing routine from the web. The CLI cannot currently create or revoke tokens.
Each routine has its own token, scoped to triggering that routine only. To rotate or revoke it, return to the same modal and click Regenerate or Revoke.
Send a POST request to the /fire endpoint with the bearer token in the Authorization header. The request body accepts an optional text field that’s appended to the routine’s configured prompt as a one-shot user turn. Use text to pass context like an alert body or a failing log. The example below triggers a routine from a shell:
curl -X POST https://api.anthropic.com/v1/claude_code/routines/trig_01ABCDEFGHJKLMNOPQRSTUVW/fire \
-H "Authorization: Bearer sk-ant-oat01-xxxxx" \
-H "anthropic-beta: experimental-cc-routine-2026-04-01" \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{"text": "Sentry alert SEN-4521 fired in prod. Stack trace attached."}'
A successful request returns a JSON body with the new session ID and URL:
{
"type": "routine_fire",
"claude_code_session_id": "session_01HJKLMNOPQRSTUVWXYZ",
"claude_code_session_url": "https://claude.ai/code/session_01HJKLMNOPQRSTUVWXYZ"
}
Open the session URL in a browser to watch the run in real time, review changes, or continue the conversation manually.
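Putting the request and response together: since the /fire call returns claude_code_session_url, a deploy script can capture it for its own logs. A minimal sketch reusing the documented endpoint and fields (the trigger ID is the placeholder from the example above; the ROUTINE_TOKEN variable and jq usage are illustrative):

```bash
#!/usr/bin/env bash
# Fire the routine after a deploy and record where to watch the run.
# ROUTINE_TOKEN holds the bearer token from the routine's API trigger settings.
SESSION_URL=$(curl -sS -X POST \
  "https://api.anthropic.com/v1/claude_code/routines/trig_01ABCDEFGHJKLMNOPQRSTUVW/fire" \
  -H "Authorization: Bearer $ROUTINE_TOKEN" \
  -H "anthropic-beta: experimental-cc-routine-2026-04-01" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"text": "Deploy finished; run smoke checks against the new build."}' \
  | jq -r '.claude_code_session_url')

echo "Routine session: $SESSION_URL"
```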
For the full API reference, including all error responses, validation rules, and field limits, see Trigger a routine via API in the Claude Platform documentation. The /fire endpoint is available to claude.ai users only and is not part of the Claude Platform API surface.
A GitHub trigger starts a new session automatically when a matching event occurs on a connected repository. Each matching event starts its own session.
GitHub triggers are configured from the web UI only.
GitHub triggers can subscribe to any of the following event categories. Within each category you can pick a specific action, such as pull_request.opened, or react to all actions in the category.
| Event | Triggers when |
|---|---|
| Pull request | A PR is opened, closed, assigned, labeled, synchronized, or otherwise updated |
| Pull request review | A PR review is submitted, edited, or dismissed |
| Pull request review comment | A comment on a PR diff is created, edited, or deleted |
| Push | Commits are pushed to a branch |
| Release | A release is created, published, edited, or deleted |
| Issues | An issue is opened, edited, closed, labeled, or otherwise updated |
| Issue comment | A comment on an issue or PR is created, edited, or deleted |
| Sub issues | A sub-issue or parent issue is added or removed |
| Commit comment | A commit or diff is commented on |
| Discussion | A discussion is created, edited, answered, or otherwise updated |
| Discussion comment | A discussion comment is created, edited, or deleted |
| Check run | A check run is created, requested, rerequested, or completed |
| Check suite | A check suite completes or is requested |
| Merge queue entry | A PR enters or leaves the merge queue |
| Workflow run | A GitHub Actions workflow run starts or completes |
| Workflow job | A GitHub Actions job is queued or completes |
| Workflow dispatch | A workflow is manually triggered |
| Repository dispatch | A custom repository_dispatch event is sent |
Use filters to narrow which pull requests start a new session. All filter conditions must match for the routine to trigger. The available filter fields are:
| Filter | Matches |
|---|---|
| Author | PR author’s GitHub username |
| Title | PR title text |
| Body | PR description text |
| Base branch | Branch the PR targets |
| Head branch | Branch the PR comes from |
| Labels | Labels applied to the PR |
| Is draft | Whether the PR is in draft state |
| Is merged | Whether the PR has been merged |
| From fork | Whether the PR comes from a fork |
A few example filter combinations:
- Base branch is main, head branch contains auth-provider: sends any PR that touches authentication to a focused reviewer.
- From fork is true: routes every fork-based PR through an extra security and style review before a human looks at it.
- Is draft is false: skips drafts so the routine only runs when the PR is ready for review.
- Labels contains needs-backport: triggers a port-to-another-branch routine only when a maintainer tags the PR.

Each matching GitHub event starts a new session. Session reuse across events is not available for GitHub-triggered routines, so two pushes or two PR updates produce two independent sessions.
Click a routine in the list to open its detail page. The detail page shows the routine’s repositories, connectors, prompt, schedule, API tokens, GitHub triggers, and a list of past runs.
Click any run to open it as a full session. From there you can see what Claude did, review changes, create a pull request, or continue the conversation. Each run session works like any other session: use the dropdown menu next to the session title to rename, archive, or delete it.
From the routine detail page you can:
Routines need GitHub access to clone repositories. When you create a routine from the CLI with /schedule, Claude checks whether your account has GitHub connected and prompts you to run /web-setup if it doesn’t. See GitHub authentication options for the two ways to grant access. Each repository you add is cloned on every run. Claude starts from the repository’s default branch unless your prompt specifies otherwise. By default, Claude can only push to branches prefixed with claude/. This prevents routines from accidentally modifying protected or long-lived branches. To remove this restriction for a specific repository, enable Allow unrestricted branch pushes for that repository when creating or editing the routine.
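For example, with the default restriction in place, pushes from a run must target branches under the claude/ prefix; in plain git terms (branch names purely illustrative):

```bash
# Allowed by default: the target branch carries the claude/ prefix
git push origin HEAD:claude/nightly-backlog-triage

# Rejected unless "Allow unrestricted branch pushes" is enabled for the repo
git push origin HEAD:main
```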
Routines can use your connected MCP connectors to read from and write to external services during each run. For example, a routine that triages support requests might read from a Slack channel and create issues in Linear. When you create a routine, all of your currently connected connectors are included by default. Remove any that aren’t needed to limit which tools Claude has access to during the run. You can also add connectors directly from the routine form. To manage or add connectors outside of the routine form, visit Settings > Connectors on claude.ai or use /schedule update in the CLI.
Each routine runs in a cloud environment that controls network access, environment variables, and setup scripts. Configure environments before creating a routine to give Claude access to APIs, install dependencies, or restrict network scope. See cloud environment for the full setup guide.
Routines draw down subscription usage the same way interactive sessions do. In addition to the standard subscription limits, routines have a daily cap on how many runs can start per account. See your current consumption and remaining daily routine runs at claude.ai/code/routines or claude.ai/settings/usage. When a routine hits the daily cap or your subscription usage limit, organizations with extra usage enabled can keep running routines on metered overage. Without extra usage, additional runs are rejected until the window resets. Enable extra usage from Settings > Billing on claude.ai.
/loop and in-session scheduling: schedule local tasks within an open CLI session

(Amazon + Anthropic does seem like a much more compelling enterprise collaboration / acquisition than Microsoft + OpenAI ever did.)
Ironically, they are now playing against their own models that can relatively easily build wrappers around any API shape into any other API shape.
1. Anthropic realized their models weren't enough of a moat.
2. They built tools so they could expand their moat.
3. People don't want to use their tools; they want their models, and they use other, better tools.
4. Anthropic bans the use of better tools, taking advantage of their model superiority to try to lock people into subpar tools.
"I don't have enough of a moat so I'll use my little moat and pretend it's a big one" doesn't sound like a great strategy. All they're doing with this anticonsumer behaviour is making sure that I'll leave the moment another model works for me as well as Claude does.
Your own, personal, Jevons.
A bit annoying, but not the end of the world.
My point was, I don't think it mattered much, and it feels like an ok comparison - cloud offerings are mostly the exact same things, at least at their core, but the ecosystem around them is the moat, and how expensive it is to migrate off of them. I would not be surprised at all if frontier AI model providers go much the same way. I'm pretty much there already with how much I prefer claude code CLI, even if half the time I'm using it as a harness for OpenAI calls.
Your experience just hasn't been my experience, I guess. The more managed the service you use, the more you're going to pay; for a very long time I've gotten by paying for compute, network, and storage on the barebones services. If you want convenience, you will pay for it.
One area that was a little shitty but has changed a lot is egress costs, though we've mostly shifted to engineering around it. I've never minded all that much, and AWS support is so good at enterprise tiers that they'll literally help you do it.
Claude Code routines sounds useful, but at the same time, under AI-codepocalypse, my guess is it would take an afternoon to have codex reimplement it using some existing freemium SaaS Cron platform, assuming I didn't want to roll my own (because of the maintenance overhead vs paying someone else to deal with that).
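The minimal roll-your-own version really is just cron plus a headless prompt. A sketch, assuming the claude -p headless invocation discussed elsewhere in this thread (schedule, path, and prompt are placeholders, and automating a subscription this way is subject to Anthropic's usage policy):

```bash
# crontab entry: weeknights at 21:00, run a triage prompt against a repo
0 21 * * 1-5  cd /path/to/repo && claude -p "Triage issues opened today and write a summary to triage.md" >> "$HOME/routine.log" 2>&1
```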
* make sure the model maxes out all benchmarks
* release it
* after some time, nerf it
* repeat the same with the next model
However, the net sum is positive: in general, models from 2026 are better than those from 2024.
1) that AI will be more advanced in the future
2) that the AI I am using will be worse in the future
In a way, it's true: if China has superior AI, then its dominance over the US will materialize. But it's not hard to see how this scenario is being used to essentially lie and scam the way into trillions of debt.
It's interesting how the cutthroat space of big tech has manifested into an insidious hyper-capitalist system where disrupting a system is its primary function. The system in this case is the world order and Western governments.
It would be absurd to me if the same application were somehow allowed via ACP but not via the official SDK. Though perhaps the official SDK offers data/features that they don't want you to use for certain scenarios? If that were the case, though, it would be nice if they actually published a per-SDK-API restrictions list.
That we're having to guess at this feels painful.
edit: Hah, hilariously you're still using the SDK even if you use ACP, since Claude doesn't have ACP support, I believe? https://github.com/agentclientprotocol/claude-agent-acp
It changes a number of things. Not all tasks require very high intelligence, but a lot of data may be sensitive enough to avoid sharing it with a third party.
I didn't even know what opencode was prior to that drama, yet now here I am using opencode and a ton of crafted OpenAI agents in my projects. Would love to have some Claude agents in that mix, but I guess I'm stuck in Claude Code if I want to even touch their models... I'd love to go back to just Claude, as I "trust" them more in a sorta less-evil-vibe manner, but if they're going to prevent subscription usage for something people use to give themselves more freedom, they've got to close that gap with their own tools rather than pumping out stuff like this, which scares me off given the past couple of months.
I totally understand why they are cutting off 3pa access to stuff like openclaw, where the avg user is just a power user in comparison to avg claude user or whatever. I haven't kept up a ton with their opencode issues, but I just know i can't get behind a company actively trying to make my potential usage of tokens less optimized to keep me locked into their ecosystem.
Really just kinda hoping local models kill it all for devs after a few years, I'm not interested in perma relying on data centers for my workflow.
# Note: This is inefficient, but deterministic and predictable. Previous
# attempts at improvements led to hard-to-predict bugs and were
# scrapped. TODO: improve this function when AI gets better
I don't love it or even like it, but it is realistic.

I never asked for a 1M context window; then I got it and it was nice, and now it's as if it was gone again... no biggie, but if they had advertised it as a free trial (which it feels like) I wouldn't have opted in.
Anyways, seems I'm just ranting. I still like Claude, yes, but nonetheless it still feels like the game you described above.
I'm thinking they should go back to all their old settings, cap you as a user at their old token limit, and ask you if you want to compact at your "soft" limit or burst for a little longer to finish a task.
Luckily you can turn it off pretty easily, but I don't know why it's on by default to begin with. I guess it's a holdover from when people used it with a $20 subscription and didn't care.
I've got a setup where GPT5-mini (Free on GH) talks to you to refine and draw the outline of your feature, calls a single Opus subagent to plan it in depth, then calls a single sonnet subagent to implement it.
GitHub will only charge you for one Opus request (with a 3x multiplier) and one Sonnet request, whether they consume 50 or 500,000 tokens. I'm running this setup for 9 hours a day at work and I've barely consumed 40% of my monthly allowance.
https://x.com/lydiahallie/status/2039800718371307603
--- start quote ---
Digging into reports, most of the fastest burn came down to a few token-heavy patterns. Some tips:
• Sonnet 4.6 is the better default on Pro. Opus burns roughly twice as fast. Switch at session start.
• Lower the effort level or turn off extended thinking when you don't need deep reasoning. Switch at session start.
• Start fresh instead of resuming large sessions that have been idle ~1h
• Cap your context window; long sessions cost more: CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000
--- end quote ---
https://x.com/bcherny/status/2043163965648515234
--- start quote ---
We defaulted to medium [reasoning] as a result of user feedback about Claude using too many tokens. When we made the change, we (1) included it in the changelog and (2) showed a dialog when you opened Claude Code so you could choose to opt out. Literally nothing sneaky about it — this was us addressing user feedback in an obvious and explicit way.
--- end quote ---
It's very light on token usage in general, as well.
Sometimes you have to keep starting new sessions until it works. I have a feeling they route prompts to older models that have a system prompt saying "I am Opus 4.6", but really it's something older and more basic. So by starting new sessions you might get lucky and land on the real latest model.
> My point is that they must apply these restrictions.
I fully understand and respect they need restrictions on how you can use your subscription (or any of their offerings). My issue is not there there _are_ restrictions but that the restrictions themselves are unclear which leads to people being unsure where the line is (that they are trying not to cross).
Put simply: At what point is `claude -p` usage not allowed on a subscription:
- Running `claude -p` from the CLI?
- Running `claude -p` on a Cron?
- Running `claude -p` as a response to some external event? (GH action, webhook, etc?)
- Running `claude -p` when I receive a Telegram/Discord/etc message (from myself)?
Different people will draw the line in different places and Anthropic is not forthcoming about what is or is not allowed. Essentially, there is a spectrum between "Running claude by hand on the command line" and "OpenClaw" [0] and we don't know where they draw the line. Because of that, and because the banning process is draconian and final with no appeals, it leads to a lot of frustration.
[0] I do not use OpenClaw nor am I arguing it should be allowed on the subscription. It would be nice if it was but I'm not saying it should be. I'm just saying that OpenClaw clearly is _not_ allowed but `claude -p` wouldn't be usable at all with a subscription if it was completely banned so what can it (safely) be used for?
Not just that, but there’s really no way to come to an objective consensus of how well the model is performing in the first place. See: literally every thread discussing a Claude outage or change of some kind. “Opus is absolutely incredible, it’s one shotting work that would take me months” immediately followed by “no it’s totally nerfed now, it can’t even implement bubble sort for me.”
Suuuuuuure it was.
That said, I had way better experiences with old (but contemporary) Apple hardware than any other kind of old hardware.