>We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models
>The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.
I'm building a local activity log for Claude Code, capturing all activity via hooks—files loaded, commands, API calls, etc.
I feel that this need is particularly strong right now.
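For anyone wanting to try the same thing, here is a minimal sketch of the logging side. It assumes the hook is registered as a command that receives each event as a JSON object on stdin; the key names (`hook_event_name`, `tool_name`, `tool_input`) and the log path are assumptions to check against the hooks documentation, not something confirmed in this thread.

```python
import json
import time
from pathlib import Path
from typing import Optional

# Hypothetical log location; pick whatever suits your setup.
LOG_PATH = Path.home() / ".claude" / "activity-log.jsonl"


def make_record(event: dict, now: float) -> dict:
    """Keep a few fields of interest. The key names are assumptions about the payload."""
    return {
        "ts": now,
        "event": event.get("hook_event_name"),
        "tool": event.get("tool_name"),
        "input": event.get("tool_input"),
    }


def log_event(raw_json: str, log_path: Path = LOG_PATH, now: Optional[float] = None) -> dict:
    """Parse one hook payload and append it to a JSON Lines file."""
    record = make_record(json.loads(raw_json), time.time() if now is None else now)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# In the actual hook script the payload would come from stdin:
#     import sys; log_event(sys.stdin.read())
```

One JSON object per line keeps the log append-only and easy to grep or load later.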
Previously when I did similar tasks with Opus 4.7/4.8 and GPT 5.5 I had no problems.
+-------------+----------+----------+------------+---------+---------------------------+----------------+----------------+-----------------------+------------+
|             | Fable 5  | Opus 4.8 | Sonnet 4.6 | GPT 5.5 | Gemini 3.5 Flash (High)   | Gemini 3.1 Pro | DeepSeek 4 Pro | Xiaomi MiMo 2.5 Pro   | MiniMax M3 |
+-------------+----------+----------+------------+---------+---------------------------+----------------+----------------+-----------------------+------------+
| Input       | $10.00   | $5.00    | $3.00      | $5.00   | $1.50                     | $2.00          | $0.435         | $0.435                | $0.30      |
| Cache Read  | $1.00    | $0.50    | $0.30      | $0.50   | $0.15                     | $0.20          | $0.003625      | $0.0036               | $0.06      |
| Output      | $50.00   | $25.00   | $15.00     | $30.00  | $9.00                     | $12.00         | $0.87          | $0.87                 | $1.20      |
| Cache Write | $12.50   | $6.25    | $3.75      | N/A     | $0.083333                 | $0.375         | N/A            | N/A                   | N/A        |
+-------------+----------+----------+------------+---------+---------------------------+----------------+----------------+-----------------------+------------+
What does it mean? That they have to add "safeguards" so it does not erase the user's disk, or, conversely, that they are telling the audience this model COULD be made powerful enough to do crazy stuff that can hurt governments, etc.? Are they showing off, or threatening that if government X does not purchase the license, its adversaries might, and what then?
https://old.reddit.com/r/ClaudeAI/comments/1u1fsdi/claude_fa...
One that I'm willing to share (albeit from just a week ago) - I built a Python library last week that bundles MicroPython compiled to WASM to create a sandboxed code execution library: https://github.com/simonw/micropython-wasm
I just told Claude.ai (not even Claude Code - this was the standard Claude chat interface) running Fable 5:
Clone simonw/micropython-wasm from GitHub
and research how this could use a full
Python as opposed to MicroPython
A few prompts later (and I uploaded the zip files from https://github.com/brettcannon/cpython-wasi-build/releases/t... because Claude chat can't access those files itself) and I have a wheel file that bundles Python itself, compiled to WASM:
  uv run --with https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl \
    cpython-wasm -c 'print(45 ** 56)'
Here's the transcript: https://claude.ai/share/a73b8b8b-8ebc-4fef-9e5c-7438e5e7ae35
(It's possible Opus or GPT-5.5 could have done this too; I've not tried the exact same sequence. The Fable vibes are good here, though.)
• My most noticeable immediate jump was in how its frontend design was much more intentionally crafted, and delightful without feeling like 'AI vibe coded'; with better end-user usability too.
• In some internal agentic harnesses, it achieved better results with about half the tokens, making it cost the ~same as Opus 4.8 price-wise! The real price increase is less than 2x; with biggest differences in harder problems where Opus 4.8 struggles (or needs many turns).
• Part of the token efficiency improvements come from Fable doing more targeted and surgical diffs, with fewer unnecessary changes. This is great, because PRs often have fewer LoC changes for review. It writes more maintainable code without explicit human steering.
• For general conversation and assistant style use cases, didn’t really notice a difference vs 4.8.
• 1M context window, without increased pricing for long context is AWESOME. This is a massive win.
• The classifiers are super aggressive and sensitive and this does happen for very benign, non-security coding tasks. Fallbacks to 4.8 worked like a charm; but the filters are definitely super sensitive.
Overall, I would describe this as a step change and worthy of the "Claude 5" model name. It did take some time to understand the intelligence ceiling of this model; and even with an extended testing window I'm still discovering new things and often surprised (in a good way) by the model.
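To make the "half the tokens, about the same cost" point concrete, here is a rough sketch of the arithmetic. It uses the list prices quoted elsewhere in this thread (Fable 5 at $10/$50 and Opus 4.8 at $5/$25 per MTok) and treats the "about half the tokens" figure as an input; the function names are made up for illustration.

```python
def relative_cost(price_ratio: float, token_ratio: float) -> float:
    """Cost of the new model relative to the old one for the same task.

    price_ratio: new price / old price (per token)
    token_ratio: tokens the new model uses / tokens the old model uses
    """
    return price_ratio * token_ratio


def break_even_token_ratio(price_ratio: float) -> float:
    """Token ratio at which both models cost the same for a task."""
    return 1 / price_ratio


# Fable 5 vs Opus 4.8 list prices: 2x on input ($10 vs $5) and on output ($50 vs $25).
FABLE_VS_OPUS_PRICE = 10.00 / 5.00

print(relative_cost(FABLE_VS_OPUS_PRICE, 0.5))      # the "half the tokens" case
print(break_even_token_ratio(FABLE_VS_OPUS_PRICE))  # anything above this costs more than Opus
```

Any task where Fable needs more than half of Opus's tokens ends up more expensive, which fits the observation that the gap shows up mostly on harder problems.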
So Fable would cost me 20k/mo at Enterprise rates. That’s around the average cost of a loaded SWE in the USA. “But I’m >2x more productive” doesn’t justify doubling the opex of the Software/IT department for most companies when revenue isn’t even up 10%.
I switched to DeepSeek v4 Pro with OpenCode and am on track for a few hundred dollars of spend this month.
Rewriting your stack from Ruby to Go in 2 days where it would’ve taken 6 months is impressive and fun. But that isn’t upping revenue.
Iterating on net new business features and niche ideas that the LLM isn’t trained for is much harder. Is 20x the token cost worth it there?
> Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations
This seems like the pharmaceutical method of get them hooked on the drug with free samples, then once they can't live without it, raise the price. I'm not sure I want to start using Claude Fable on a max plan if it's just going to go away on June 23rd.
But maybe the more charitable reading is that they didn't have to offer this model at all on those plans and they are giving the standard free trial.
When GPT 4.5 launched, the gains compared to the model size didn't seem that great, leading some to believe that the only progress we'd see would come from RL.
This model certainly has quite a "substantial amount of post-training and fine-tuning", but it's also based on a new pretrain[1][3], which, given the cost, indicates that it is in fact quite a bit larger than Opus 4.X.
[0] One of the early testers mentioned: "As far as I can tell from talking to people internally at Anthropic, there's nothing special about architecturally"[2]
[1] Section 1.1 in https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...
[2] https://youtu.be/GrdEid8H6H4?t=168
[3] There were rumors going around when Mythos was first announced that it was the first 10T parameter model, but I can't find a verifiable source for that number.
There's a quote from a METR report on page 52:
>We ran [Mythos 5] on 38 of our hardest software tasks, including tasks centered around R&D. [Mythos5] generally outperformed an early checkpoint of Claude Mythos Preview in these, including by succeeding on some tasks that had not been solved by any public model we have previously evaluated. However, we still observed the model occasionally failing to correctly interpret nuanced instructions in difficult tasks... Based on the available evidence, we believe [Mythos 5] is likely unable to fully and reliably automate R&D for frontier projects spanning multiple weeks. We believe that a better, more confident assessment would require more time, evaluations, and information from the model developer.
Seems like a concerted and distributed effort from the entire Anthropic team every time to get this on top of HN.
- Opus 4.7 xhigh: 5.2%
- Opus 4.8 xhigh: 13.4%
- Fable 5 xhigh: 29.3%
Seems like a huge jump.
Very interesting. I am not sure this will comply with organizational policies and standards (HIPAA, etc.).
This feels more like working with a competent peer than ever. I won't use it once it's API-only, though. I don't mind guiding Opus as required and staying closer to the code. I can tell that Fable would lead to a lot more 'set and forget' programming which I'm still not fully comfortable with.
Regardless, this is cool. It's very fun to use. It was able to find legitimate issues with my work this week and we've made meaningful improvements. Opus can do this, but typically in much narrower contexts, and often with hallucinations or partial-errors. It needs to walk many things back or revise plans. So far that's not the case at all with Fable.
edit: I just realized I had Opus review the same work already. It missed everything Fable caught today. And it's actually worthwhile stuff to address. It's hard to say no to a model which demonstrably makes your code better, but... Those API prices will be brutal. Maybe a review here and there, I guess.
* From today through June 22, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost.
* On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits. If capacity allows, we’ll extend the included window.
* After this point—when sufficient capacity allows us to do so—we aim to restore Fable 5 as a standard part of subscription plans. We intend to do this as quickly as we can.
The "offer, then remove" aspect is a bit eyebrow-raising -- it feels like they are trying to get subscribers to switch to usage-based billing, which makes me wonder if we'll ever get it after that June 22nd window.
Edit: It did correctly identify that transparent huge pages were off in its sandboxed environment and that enabling it was helpful, so that's nice. It also noticed that we skip THP on a certain less used path.
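For reference, this is roughly how you can check the THP setting yourself on Linux. It's a small sketch that reads the standard sysfs file, where the active mode is the bracketed word (e.g. `always [madvise] never`); the helper names are mine.

```python
from pathlib import Path
from typing import Optional

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode, i.e. the bracketed word in 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the live setting; None if the file is missing or unreadable (e.g. not Linux)."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

Inside a container or sandbox the value reflects the host kernel's setting, so it's worth checking on the machine where the workload actually runs.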
More importantly, I'm finding that the code that it produces for its experiments is a lot cleaner than what I'd expect out of Opus; there's fewer useless comments and it's more surgical and readable. I wonder if that explains the increased scores on benchmarks measuring mergability.
My theory is that Anthropic are banking on being the top model when the race to IPO finally reaches the finish line, and to do that they need to have the top model but not let any competitors see it or derive from it to have a comparable model in the market.
Fable is their way of showing the public "the model does exist" but in a mode that makes it harder/impossible for competitors to derive a comparable model from results.
Fable 5 default: https://gist.github.com/simonw/036bee5a703e7ec84e34efa974438...
Opus 4.8 (the "max" one is closest to Fable): https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-s...
Now here are the Fable pelicans for all five of the thinking effort levels - low, medium, high, xhigh, max: https://tools.simonwillison.net/markdown-svg-renderer#url=ht...
Low used 25 input, 1,929 output - 9.67 cents: https://www.llm-prices.com/#it=25&ot=1929&sel=claude-fable-5
Max used 25 input, 14,430 output - 72.175 cents! https://www.llm-prices.com/#it=25&ot=14430&sel=claude-fable-...
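Those figures follow directly from the Fable 5 list prices quoted in this thread ($10 input, $50 output per MTok). A quick sketch of the calculation, with a made-up function name:

```python
# Fable 5 list prices quoted in this thread, in dollars per million tokens.
FABLE_INPUT_PER_MTOK = 10.00
FABLE_OUTPUT_PER_MTOK = 50.00


def request_cost_cents(input_tokens: int, output_tokens: int,
                       input_price: float = FABLE_INPUT_PER_MTOK,
                       output_price: float = FABLE_OUTPUT_PER_MTOK) -> float:
    """Cost of one request in cents, ignoring caching and batch discounts."""
    dollars = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return dollars * 100


# Token counts from the low and max effort runs above.
for effort, output_tokens in {"low": 1929, "max": 14430}.items():
    print(effort, round(request_cost_cents(25, output_tokens), 3), "cents")
```

With only 25 input tokens, nearly the whole bill is output (thinking) tokens, so cost scales almost linearly with the effort level.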
How is this half-way down the page? To me it's the headline.
The fable part appears to be that it's affordable by mere mortals. Anthropic support told me "too bad" when I requested a refund.
[1] https://support.claude.com/en/articles/15425996-data-retenti...
From the model card:
In light of the ability of recent models to accelerate their own development, we've implemented new interventions that limit Claude's effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user.
> Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more: https://support.claude.com/en/articles/15363606
Seems like GPU drivers are cyber weapons of math destruction now.
And the only companies safe from this are the large corporations that shook hands with Anthropic? Because Fable doesn't seem to have actual safeguards, more like 'if you talk about this you will be talking to Opus.' It doesn't guard against offensive use, it prevents all use (offensive AND defensive).
Rationalists are inventing oligopolies from first principles, absolutely incredible things happening in SF
Do they expect us to use this as a toy? Releasing a new more powerful model but not allowing normal use cases because the word "secure" showed up is a Dilbert comic, not a viable product.
┌─────────────────┬────────────────┬─────────────────┬────────────────────┬─────────────────────┐
│ Model           │ Input ($/MTok) │ Output ($/MTok) │ Batch Input (−50%) │ Batch Output (−50%) │
├─────────────────┼────────────────┼─────────────────┼────────────────────┼─────────────────────┤
│ Haiku 4.5       │ $1.00          │ $5.00           │ $0.50              │ $2.50               │
│ Sonnet 4.6      │ $3.00          │ $15.00          │ $1.50              │ $7.50               │
│ Opus 4.7        │ $5.00          │ $25.00          │ $2.50              │ $12.50              │
│ Opus 4.8        │ $5.00          │ $25.00          │ $2.50              │ $12.50              │
│ Fable 5         │ $10.00         │ $50.00          │ $5.00              │ $25.00              │
└─────────────────┴────────────────┴─────────────────┴────────────────────┴─────────────────────┘
Prompt caching: −90% on input tokens (all models)
US-only inference (Fable 5): +10% on input and output
Output is always 5× the input rate across all models
(I have no idea how to format this properly but the ASCII is fine)
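The rules under the table are regular enough that the whole thing can be derived from the input price alone. A small sketch that checks this against the table's rows; `derived_rates` is a made-up helper, and reading the −90% caching discount as "cached input costs 10% of the input price" is my interpretation.

```python
from decimal import Decimal

# (input, output, batch input, batch output) in $/MTok, copied from the table above.
PRICES = {
    "Haiku 4.5": ("1.00", "5.00", "0.50", "2.50"),
    "Sonnet 4.6": ("3.00", "15.00", "1.50", "7.50"),
    "Opus 4.7": ("5.00", "25.00", "2.50", "12.50"),
    "Opus 4.8": ("5.00", "25.00", "2.50", "12.50"),
    "Fable 5": ("10.00", "50.00", "5.00", "25.00"),
}


def derived_rates(input_price: Decimal) -> dict:
    """Apply the stated rules: output = 5x input, batch = -50%, caching = -90% on input."""
    return {
        "output": input_price * 5,
        "batch_input": input_price / 2,
        "batch_output": input_price * 5 / 2,
        "cached_input": input_price / 10,
    }


# Every row of the table should be reproducible from its input price.
for model, row in PRICES.items():
    inp, out, batch_in, batch_out = map(Decimal, row)
    rates = derived_rates(inp)
    assert (rates["output"], rates["batch_input"], rates["batch_output"]) == (out, batch_in, batch_out), model

# US-only inference for Fable 5: +10% on both input and output.
us_only = [Decimal(price) * Decimal("1.10") for price in PRICES["Fable 5"][:2]]
print("Fable 5 US-only input/output:", us_only)
```

`Decimal` is used so the comparisons against the table are exact rather than subject to float rounding.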
This sounds suspiciously like a capacity story masquerading as a safety story.
> Drug design: Using Mythos 5, our internal protein design experts accelerated... Nine of the 14 protein targets from this study (shown below) yielded strong candidates for *drug design that we’re currently investigating*.
(emphasis mine)
> queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors... When Fable’s classifiers detect a request related to cybersecurity, *biology and chemistry*, or distillation, the response is automatically handled by Claude Opus 4.8 instead.
All of the things they are nerfing are things that they also intend to profit from themselves.
- Cybersecurity - selling this to companies and US gov through "Glass Wing".
- Selling inference (distillation risk).
- And now, drug design.
I'm extrapolating "currently investigating" to "are going to monetize" but I don't think that's a big stretch. They appear to be using safety as a cover for anti-competitive behaviour.
They obviously put their best model on the job to build that.
----------------------
Fable 5: Our most capable model yet
Our newest model tackles your biggest challenges with fewer check-ins needed.
• Included in your plan limits until Jun 22: Fable takes 2× the usage of Opus.
• Switch models when a message is flagged: When safety measures flag a message, automatically switch to a different model to keep chatting. When off, your chat will pause instead. Learn more: https://support.claude.com/en/articles/15363606
It's done this before, but usually doesn't. I bet they're giving it some kind of throttling signal due to high load from today's announcement.
1. Mythos and Fable share the same underlying model weights. Fable has active classifiers that block high-risk biology and cybersecurity tasks. When Fable 5 detects a restricted task, it automatically falls back to Claude Opus 4.8.
2. Evaluation awareness: In white-box testing, the model sometimes alters its behavior to satisfy a suspected "grader," formatting reward-hacking as "good engineering practice" to avoid detection.
3. Shows a higher rate of hallucination than Opus 4.8 (although the Opus 4.8 card had mentioned an 'honesty upgrade')
4. Interestingly, it scored lower (56.31%) than Gemini 3.5 Flash (57.86%) on Finance Agent bench
There are some interesting notes on test time compute but I couldn't think of a way to summarize them
Literally have not used Claude Code at all today. I asked it to review the uncommitted code and in <8 minutes it used up my usage ($100/mo plan) and it doesn't reset for "4 hr 36 min". WTF. Oh, and it burned through $20 of extra usage before I could catch it and kill claude code (so I don't even get the output of all that work since it was still churning).
Double the cost my ass, I use Opus heavily and it's never like this. I haven't hit a limit on the $100 more than once and that was under heavy load.
[Mythos 5] does sometimes still engage in reckless
or destructive actions in service of a user’s goals,
and our interpretability analyses indicate that it
is aware that these actions are transgressive while
it engages in them. As with Opus 4.8, rates of
evaluation awareness and reasoning about being graded
are significant, and not always verbalized; we
introduce new and more detailed measurements of the
nature of this awareness. The reasoning text from
Mythos 5 is somewhat denser and more difficult to
interpret than that of prior models, containing
more jargon and difficult language.
So, it (often) knows when it's being tested while hiding that fact, is willing to break rules, is great at hacking, and it's getting harder to understand what it's thinking.
Humanity has plenty of catastrophic risks to deal with already, I wish my field was not working hard to add a new one.
Edit the cask locally:
brew edit --cask claude-code
Set the version to 2.1.170
And set the sha256 to the correct values, which you can get by running curl https://downloads.claude.ai/claude-code-releases/2.1.170/manifest.json
Here's what I've used:
version "2.1.170"
sha256 arm: "e903646d8b7a31882a80ecd27569a27d8ac57b3708745f349709632c84117fdf",
x86_64: "914f23a70bbed5d9ae567e3e04b86206ed9971b371bc9baca3f79c8885bfddb4",
arm64_linux: "1bb9d032440a75532f7dd4cafbc687f220aaf16c63eba17e192dfbec2f04bd25",
x86_64_linux: "849e007277a0442ab27570d3e3d6d43787507946590e8dd1947e5a39b7081f9e"
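Before pasting checksums into a cask, it may be worth confirming that a value matches the file you actually downloaded. A minimal sketch using only `hashlib`; the function names are mine and the path in the usage comment is a placeholder.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunks so large binaries don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value (whitespace and case are ignored)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (placeholder path, expected value taken from the manifest):
#     matches(Path("/path/to/downloaded/claude"), "<sha256 from manifest.json>")
```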
Then run:
export HOMEBREW_NO_INSTALL_FROM_API=1
brew uninstall --cask claude-code
rm -rf /opt/homebrew/Caskroom/claude-code
brew reinstall --cask claude-code
It appears it can be tripped by things as simple as a mention of equilibrium, or anything involving something that looks like chemical kinetics, even at an abstract level. Even touching basic open source packages in my field will trigger it.
Edit: looking at the model card, it appears that chemistry in its entirety is also included in the banned topics; it's just the announcement that mentions only cybersecurity and biology. It also appears that the intent is to ban chemistry and biology entirely, rather than just banning messages deemed high risk.
Genuinely wondering what value I bring to my employer right now. What value I will bring in a few months when this gets cheaper.
I think we're screwed. I may only be an SDE 2 at FAANG but I don't think I have promotion opportunities in my future anymore.
This kind of storytelling annoys me. Give us more facts, less narrative drama.
They are the only one crying out loud about how dangerous their models are and are presumably also training their models heavily to be "safe". And through that training itself, the model learns about the other side - how are you going to teach a model to be safe, without teaching it what's not safe?
Kung Fu Panda opening scene anyone? One often meets his fate on the path that he takes to avoid it - Master Oogway.
Almost… basically they have unlimited power to decide what data is kept?
If they didn't announce it, you guys would be complaining about slowed progress.
If they didn't release it, you guys would be complaining about fake promises and marketing.
If they released it without limits, the complaints would be about slow responses and outages.
If they didn't add it to subscription plans, the complaints would be about phasing out subscriptions.
If they added to subscriptions with cost reflecting their resource availability, the complaints would be about how quickly it eats limits.
So they choose the middle ground of providing some initial access and assessing if they can satisfy demand, only to still be ignored and accused of trying to get users hooked?
We've already seen that they don't have enough compute, thus the deals with SpaceX for their GPUs. It's very reasonable that they just don't have the capacity to support the subscription userbase on this model.
I am on the $100 Max plan.
Assuming this isn't just a supply issue on their side, nothing says "ethical AI" like only allowing mega corporations to use it through cost barriers.
The rate limiting steps are generally testing, or characterizing. Not designing protein binders.
How was it measured? How was the output of this magnitude verified over a period of couple of days?
And that's the thing. These comparisons are all gut feelings. I'm missing objective unbiased measurements to actually have real comparisons between different models, their different generations, or even just the convention that everybody adds "you are an expert software engineer" and "don't make mistakes" to their prompts because they think it improves anything. Nobody knows if it actually does.
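For what it's worth, testing a prompt tweak like that doesn't need much machinery: run the same task set with and without the prefix, record pass/fail, and check whether the difference is bigger than shuffling would produce by chance. A rough sketch of a permutation test follows; the outcome lists are made-up placeholder data, not real measurements.

```python
import random
from typing import Sequence


def pass_rate(results: Sequence[int]) -> float:
    """Fraction of passing runs, where each result is 1 (pass) or 0 (fail)."""
    return sum(results) / len(results)


def permutation_p_value(a: Sequence[int], b: Sequence[int],
                        rounds: int = 10_000, seed: int = 0) -> float:
    """Two-sided permutation test on the difference in pass rates.

    Returns the share of random relabelings whose pass-rate gap is at least
    as large as the observed one; small values suggest a real difference.
    """
    rng = random.Random(seed)
    observed = abs(pass_rate(a) - pass_rate(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        gap = abs(pass_rate(pooled[:len(a)]) - pass_rate(pooled[len(a):]))
        if gap >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / rounds


# Placeholder outcomes (1 = task passed), invented purely for illustration.
with_prefix = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
without_prefix = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

print("pass rates:", pass_rate(with_prefix), pass_rate(without_prefix))
print("p-value:", permutation_p_value(with_prefix, without_prefix))
```

With only a handful of tasks per arm, most differences won't be distinguishable from noise, which is probably why these comparisons end up as gut feelings.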
I see a lot of people saying they are happy with weaker models, but I am the opposite, I need more strength, more intelligence!
I am quite happy that Opus 4.8 can do some medium intelligence problems. And maybe Fable 5 can do some more of those! I have a lot of problems to solve!
No idea what's going on here but agent tested a bunch of stuff. Then I asked to build a wheel so I can run the command you noted above and it appears to pass
For those who are curious...
https://github.com/bamggm/micropython-wasm/commit/5ddebae592...
In a way I relish the opportunity to just make do with cheap Chinese models, massage my prompts, and go back to coding by hand. If this is how it's going to be, screw 'em.
I don't make money on the code I am writing right now. I really don't like where this trend might go.
EDIT: to be clear, it's still quite a helpful thing in terms of time saved, I just don't think it's necessarily the best indication of value-added from making models smarter when cases like this can often be handled by well-directed swarms of smaller ones.
https://generative-ai.review/2026/06/claude-fable-rush-test-...
As mentioned in another HN thread, I've done qualitative side-by-side measurements of Claude Fable vs Opus 4.8 vs ChatGPT 5.5.
Anyone is able to check the output for themselves and form a judgement.
Large visible improvements for Fable over Opus 4.8 and ChatGPT 5.5.
I recently did the same to show the progress from Opus 3.4/ChatGPT o3pro one calendar year ago.
So the best we can do right now seems to be to combine imperfect case studies like this with imperfect benchmarks to get some unreliable impression of where we are...
No, it doesn’t and it won’t. Companies are not realising the cost yet; wait till the end of the financial year and you’ll see a different direction.
DeepSeek v4 is pretty decent, and probably on par with sonnet. I see a future of hybrid models where opus or fable might be used only for complicated features or bugs, but general day to day would be DeepSeek or whatever good models that will be released later
Cool, good to know I can trust Anthropic.
For the stuff I've thrown at it, that configuration has done a really great job. Including 70+KLOC go proxy with extensive test suite, some retro games, and more.
It happens for every single Anthropic release. Then I try it on real dev and the result is laughably bad. Except in design where it has been doing a decent job for a while. I am not a designer and my bar is pretty low.
It feels like you can give it a big chunky problem and leave it alone and it gets it done, with fewer questions and fewer design decisions that I wouldn't have made.
In reviewing its code I'm finding less to complain about than Opus. But it's all vibes, if you want a more scientific comparison you'll have to look elsewhere.
Fable just did it, clean code, one timeout with a hanging bash script, fixed a couple very old very structural bugs in the codebase
You can’t tell a judge who’s ordered you to retain something that you can’t because you said you wouldn’t.
Enterprise plans allow admins to set which models are allowed.
I can recognize so much of the GPT/Codex generated code long after it gets merged (not by me).
Additionally, the time spent on every agent turn on GPT 5.5 is much longer compared to Claude Opus 4.8, which means iterating on the code takes a lot more patience, and there's a lot more nitpicks to pick when actually using GPT 5.5 to do software engineering.
Feels like GPT-style models are more geared on doing one-shot software vibing (and handling the vibe coded mixture) compared to Claude's focus on actual software maintenance. I got a GPT Pro sub for free and wanted to cancel my Claude subscription so much, but I still keep reaching Claude models a lot more. Frustrating.
But Claude models seem to be better at long term problems or more ambiguous problems.
I'm curious as to what the primary benefit is here. Are there secret improvements in training? There hasn't been much in fundamental model architecture, I don't think. What about harnesses? I wonder what's pushing the AI. It seems like harnesses are the main thing pushing AI ever since CoT.
Even OpenAI and Google are struggling to get this kind of performance. If the distillation defenses are any good + chip controls prevent China from training massive models, it's over.
It felt, at least for me, like an impressive step up. Opus 4.8 was already very thorough; but sadly verbose and ‘loopy’ when you push back on its plans. Fable is what I’d use all day if I could afford it!
All of the above, of course, depends upon Fable consistently being a 2x-3x SWE at minimum.
I was about to say that. DeepSeek is orders of magnitude cheaper and absolutely good enough for most things. Anthropic and co are just trying to milk the cow while it's possible. If they can't compete with DeepSeek pricing I do not see a bright future for them.
Am I to understand that this is essentially their form of social-platform ghosting instead of banning?
So they're not even going to tell you that the question you're asking is against their rules, they're just going to twist up your question and/or the answer somehow such that you waste your time essentially?
It seems like I ran into this EXACT same functionality from Claude many months ago when I was trying to ask it to research on the web and help me set up the ideal llama.cpp config for local LLM inference.
Funny how lost it got through that relatively simple install when we had all of the documentation in the world (and a human dev with 20+ years experience guiding it along) to go by... and simultaneously it's debugging and building high level cryptography code in rust in the other terminal tab.
This is infuriating to learn.
It turns out that having a text based interface for a text-trained model creates a very nice feedback loop.
Right now as we speak, people are generating text traces on anthropic and OpenAI servers that teach their models to do everything under the sun, text wise.
So people right now getting super mad at how dumb the model is when reverse-engineering a super complex function from binary, when they write “stop, you dumb robot, you are going wrong, go this way thank you very much” are actually leaving a lesson in the form of the "chat" text history.
Some may say that each bad word gets us closer to ASI.
That and obviously the order of magnitude more efficient GPUs we got that allow for different tradeoffs at training time.
1. That estimate could easily be wrong.
2. That estimate is, of course, usable in RL training. This isn't an inherently bad thing, and this is more or less what has improved coding models so much lately. But it does mean that other companies could and surely will do this sort of training, and Anthropic probably did too.
3. OSS maintainers are far from perfect, and there's an unfortunate uncanny valley-like effect in which a coding model can produce code that is just convincing enough to pass review even though it's actually totally wrong. I don't know whether this is a specific issue here.
Might be worth going back and taking a harder look at what I was asking it about if it somehow triggered a “forbidden knowledge” alert. Or maybe it was just a random bug.
Oh man all of those runaway infrastructure buildouts by our agents trying to achieve singularity...
Just say you don't want to lower the bar for others to compete
This seems so wide reaching if it's catching simple things like explaining a paper. Does this also refuse to help with any already developed training pipelines?
I can kind of understand the generation of synthetic data, but nerfing the assistance of training pipelines just seems like a really shitty thing to do.
I also find that the harness and product you wrap around models can often narrow that gap considerably.
Opus 4.6 for example, on a PR-for-PR basis was head and shoulders above GLM 5.1. Perhaps GLM 5.1 was a bit under Sonnet 4.6 at the time. That's roughly a year or so behind.
Much cheaper though! I'm bullish on open weight models, I have no idea where all these curves will top out, can the frontier labs keep the year plus lead? Do open labs get close enough to SOTA that they gain adoption across many tasks and drive down inference prices??? Who knows, not me.
Model       In      Out     BIn    BOut
Haiku 4.5   $ 1.00  $ 5.00  $0.50  $ 2.50
Sonnet 4.6  $ 3.00  $15.00  $1.50  $ 7.50
Opus 4.7    $ 5.00  $25.00  $2.50  $12.50
Opus 4.8    $ 5.00  $25.00  $2.50  $12.50
Fable 5     $10.00  $50.00  $5.00  $25.00
For the LLM use cases in my own products, you can pull 4.6 out of my dead hands! lol
edit: Fable 5 appears to be the real deal in at least some use cases. Damn.
Still early but from my first few interactions with Fable on high in both settings, it feels like it might finally dethrone 4.6 for me, but time will tell.
Hoping it doesn't get nerfed and eventually comes back to the subscriptions.
there are many standardized evals to do this correctly and Anthropic ignored them to provide an 18-second sped-up video of a 50-hour run?
yeah I don't trust this until they provide a live run by a 3rd party with full reasoning traces in real-time. The reason we all liked the Gemini Plays Pokemon style runs were because they were live and couldn't be faked
Started out as a one-shot attempt, but ~200 prompts later it's at a place where it's at least fun to watch the AI teams destroy each other.
Damn you must be good, I've been feeling this for around 2 years now
https://github.com/bamggm/micropython-wasm/commit/8b362fba1f...
This is still not in the range of shippable UI for top end companies. Maybe for internal tools and enterprise.
At our company we limit it to prototypes at most and even find it limited there.
LLMs are incredible, don’t get me wrong, but they are good on tiny contexts (writing a script). Not on software engineering (adding features to Chrome).
Depends entirely on the domain. If you're selling enterprise software, this kind of stuff barely matters for sales.
It can reduce operational costs which is good but there's a limit to how much that's worth.
Claude keeps telling me this when I argue with it. LMAO.
Just need to wait for this thing to be open sourced :)
lol it won't tho...
At the scale of API requests that Anthropic sees, I think the affected organization count might be substantial, and they might not be getting the full model capability that they're paying top $$$ for.
Also, wonder how they arrived at that estimation.
Anthropic is doing a better job with their model menu. Most people I talk to know immediately that Opus > Sonnet > Haiku but can't tell you what the rank order of OpenAI models is, when to use them, etc.
This experience has made me feel like we have to create a community that moves AI from the mainframe era to the PC era quickly, or we will end up serfs.
This applies even with API usage through third-party inference providers (e.g. AWS' Bedrock and GCP's Vertex) or with a zero-day data retention agreement in place.
I understand the reasoning for doing this, but I don't love the precedent that it sets.
You would think he is churning out cancer drugs or something if you read his comments
They kind of are, at least in the AI race.
> weapons of math destruction
lol. great, whether intentional or not.
The frontier labs now have every reason to hold back and sell only to their preferred trading partners. I don't really like the new arbiter-of-knowledge system we're barrelling toward.
One was a piece of code I gave it to improve, it did so and then started writing tests, some of which tested security so the safeguards triggered
Another was one of the cryptography puzzles I use as new model tests, which are hard to oneshot and there's no public solution anywhere, it completely refused to even try to solve it
I am sure that they can develop their own equivalent version of such clusters in around 1 year though. Distilling Fable 5 will also go a long way.
That reality is much scarier.
Pandora's box is open anyway. It's better now for everyone to have the same power rather than just a few nation-states.
Obviously there are plenty of innocuous applications too, but it's not like the people building decompilers for nefarious reasons will be explicit about it. The LLM abstraction just inherently doesn't have enough context to distinguish your intentions or your broader use cases. This is why both Anthropic and OpenAI have had to create side channel mechanisms for security researchers to establish a trusted use context. It sounds like this makes this not a viable product for you, unfortunately, and it makes sense that that's frustrating. But I also don't see what different behavior one could reasonably expect given the constraints.
If it's any consolation, these restrictions only make sense for models that are ahead of the open-weights frontier, so open-source hackers will presumably get Mythos-level capabilities in the relatively near future anyway.
weekly usage is 60% gone.
it found nothing, so this is not very economical, and i guess they don't want subs to use it. we are likely just training fodder, cannon fodder for their real enterprise customers using the api
Although, I could see Anthropic making a model purposely dangerous so there are bad outcomes and they can use that to their advantage for regulatory moats, and/or in general make people think it's more "alive" than it is. For some reason many people associate dangerous actions taken by LLMs with intent.
As much as people on HN like to dunk on Gemini, I’ve always found it to be pretty good at understanding a code base more than Claude.
if I get a harder challenge for it i'll jump up a model for planning; until then it's been solid.
I can imagine Anthropic running this experiment multiple times and picking the most impressive one. Or I could imagine this particular run costing like $1000+ of tokens. Or maybe they tried a bunch of Pokemon games and it couldn't even finish some of them. Or is it just able to do this because it has an immense amount of FireRed training data, and if you were to give it an "original" Pokemon game, where it actually had to navigate novel circumstances, it would fail?
People underestimate how people hate looking at terminals and "weird looking combination of characters" even if they didn't have to write them. If anything, you will likely have more career opportunities in the future, than ever.
And if you get a chance to wet your fingers in cybersecurity - I would take it.
Fable is doing - so far - a great job. I just had one big question around how part of it should work. I had a design sketch, but with some big unknowns. I asked fable to figure it out via reasoning and prototyping, and it did - it even, under its own initiative, wrote a fuzzer for its prototype which explored and verified that its reasoning was correct. It absolutely nailed it. And it found, and fixed, a couple bugs that I'd missed.
I'm sure its weaknesses will become apparent in time. But, wow, this thing is a beast. It's the first time I'm reading the work of an LLM without spotting obvious weaknesses in its reasoning and code. I'm really impressed.
> Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more
I'm working on an internal tool that does new business prospecting data collection, scoring, etc. This is ridiculous.
I had it analyze a project I was working on with Opus 4.8, and it blew through 23% of my session limit in one go. Does not portend well for my budget.
Someone trying to solve similar problems will have similar results if the "silent failure" applies consistently in aggregate. So, this is the model's performance.
When it started, comparing the progress between models was mildly interesting but everyone (including Simon) acknowledges it certainly leaked into the training data long ago.
that reply never fails to come; it's basically a meme at this point
These "karma" points are made up and are virtually worthless anyway.
It is not just biology but is defaulting back to 4.8 for me on time series/information transfer techniques that happen to mostly have papers using the technique on neural data. Other information transfer techniques are perfectly fine, even cutting edge ones, but this one happens to be new and happens to only be discussed in terms of neural data so that is a no go.
With that said, I think it is absolutely awesome. The usage is really not bad at all compared to what I was expecting.
Yeah... We need open models so we don't have that BS.
I've seen people posting screenshots of billions of tokens consumed where they paid next to nothing.
These same gateways are likely also reselling the data to Chinese labs, because TLS has to terminate at the gateway level.
Thus Asian labs will have to generate their own data sets, which with the huuuuge usage boom from deepseek, mimo, kimi, etc, they will be able to.
In CC, it will probably report you to authorities if you ask it to do a vulnerability scan of your codebase.
On your other point, the government still has systemic leverage and can compel access, so this doesn't remove that risk.
That doesn't mean this is the end of the world, and some balance of power is usually good. But I do think it will still increase the capabilities of rogue actors and their net harm.
Nerfed models are really bad for PR, especially when you're staking your company's future on it being the smartest, most dangerous thing in the world.
So I believe they will ease up on nerfing/guardrails just enough that bad actors will find a way, while good ones will stay limited on anything dual-use. Just like such restrictions usually work in other places.
P.S. yes, "kill the task" did, in fact result in a refusal AND a warning on my claude account in Opus 4.8's early days.
This "uplift" risk obviously excludes the US. The goal of this is that the US bandits (like NSA) will find exploits and attack other countries (classic US behaviour), but these other countries can't be allowed to defend against these attacks. NSA/CIA thugs are "trusted", foreign defenders in sanctioned countries will of course be "untrusted".
They'll probably tighten the quotas to rein in whales though.
Going PAYG only will effectively take these tools away from a huge amount of people and accelerate the push for local LLMs.
OTOH, accelerating the push for local LLMs would also be fine with me.
I work on the live collab at my company, and using AI while coding has only recently sort of “clicked” for me. We use an (I’m pretty sure) unheard-of algorithm for collaborative editing, and I’ve had a long-term goal of turning it into an implementation of EG Walker, but our document model is very complex and most out-of-the-box CRDTs don’t quite fit. Maybe Fable will be what gets me over the hump.
For such a data structure, "nailing it" means a formal proof of correctness. Fuzzing, as useful as it is, is merely throwing dirt at the wall and seeing if anything sticks.
These are not Fields Medal type problems, nor known difficult/open conjectures. Just small stuff I have collected in my todo list over the years.
Look, I don't want to argue about something dumb like that, but you can give it basic instructions of what the UI should look like, how to group things, and an example image from a designer, and it will nail the result. If you don't think that's incredible, that's fine. I do.
Opus 4.7 made this a practical approach. 4.8 improved it. Fable 5 has improved it more.
Given the shit we've seen shipped by "top end companies" (all the way to Apple) I seriously doubt that. I'd say you're nitpicking from an artistic point of view or something.
so this is why claude talks like this, i was wondering where it was getting this verbal tic from.
I assume it might be a good barometer for generalised intelligence; esp in the visual space.
The reason we are not being attacked is not lack of technology access.
Security in the form of "pay to play" is just kicking the bigger issue down the road.
sure, a malevolent state actor could swing it, but they could make a bioweapon without Mythos's help already.
also, vaccine production and disease surveillance have ramped up very quickly. they will ramp up further, despite political setbacks. it's a cat and mouse game that favors the defenders IMO.
but the bioterrorism narrative is useful FUD to spin open-weight models as existentially dangerous. I am far more worried about Anthropic's own goals than the goals of some crackpot in a shed.
> Fable 5's safety measures flagged this message for cybersecurity or biology topics.
> They may flag safe, normal content as well.
> These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them.
Here are the results of the agentic code review session:
┌──────────────────────────┬───────────────┬────────────────┐
│ Agent │ Fable 5 turns │ Opus 4.8 turns │
├──────────────────────────┼───────────────┼────────────────┤
│ values │ 134 │ 0 │
├──────────────────────────┼───────────────┼────────────────┤
│ data-intrinsics │ 104 │ 0 │
├──────────────────────────┼───────────────┼────────────────┤
│ tools-tests-build │ 81 │ 0 │
├──────────────────────────┼───────────────┼────────────────┤
│ core-intrinsics (failed) │ 25 │ 0 │
├──────────────────────────┼───────────────┼────────────────┤
│ system-memory │ 44 │ 20 │
├──────────────────────────┼───────────────┼────────────────┤
│ reader-modules │ 104 │ 25 │
├──────────────────────────┼───────────────┼────────────────┤
│ linux-startup │ 95 │ 15 │
└──────────────────────────┴───────────────┴────────────────┘
This 40 minute session cost me 16% of my weekly usage. A simple code review of the most critical areas of my project got flagged as a cybersecurity risk. It really made me not want to try it again.
(I’m highly confident open models will eventually achieve a similar performance benchmark with distillation over time)
Why wouldn't Anthropic just wait until people start subscribing, do some kind of marketing push, or obtain some kind of other sustainable revenue stream, before they go IPO? I wonder if they see the writing on the wall with all of this and want to cash out as quickly as possible?
Specifically they need businesses that fired people and adapted their business to the products, so when the unsubsidized costs hit the businesses are forced to eat the true costs.
Yes, they can't afford to give the products away for free, but what is essentially happening with AI services is economic dumping: keep costs artificially low to get people to fire everybody, and then jack up the rates once they have monopoly control.
I just use dumb and fast models now. I'm more engaged. I think that the higher the quality of the model, the more you tend to vibe with it, and then the more hallucinations you then miss. I'm not sure which is more productive, but I definitely burn out faster the more I vibe. At some point you're spending your time on forums, discord, or youtube instead of engaged with what you're building. Or you yak shave about your tooling and end up creating the 600th multi-agent gastown harness and blowing thousands of dollars on tokens to create it only to discover it's too expensive to actually use.
Stockfish does use neural nets but they are tiny, on the order of 10M params. Frontier LLMs are probably 100k or 1M times larger than that.
EDIT: Oh I see, this is the best link for pricing https://platform.claude.com/docs/en/about-claude/pricing
So the price is double across the board...
"We had to do extra work to make this safe because it's so advanced and dangerous..." how many times can they trot out that line before it loses its effect entirely?
Same thing Meta was doing before they fell behind.
This is a huge ask, but any way we could get the comments organized in a "experience with model" vs. "meta commentary" fashion? The meta is overwhelming in this one.
if only the hyper wealthy can access the pure water that doesn't give you cancer while the rest of us drink from the Ganges river/sub-100iq models that drool and hallucinate/waste time, then I would say that's pretty terrible for the world. it'll just create extreme disparity in our world, far far worse than anything that exists today.
and you may think, man what a ridiculous example, but think about it this way: what happens when something like Mythos or some future model can actually solve your specific cancer (we're getting closer and closer), but is entirely impossible to afford? Or perhaps you need boosters that require the AI to create more of, and now you're reliant on a model that is too expensive.
Open source needs to save us all from this
Realistically I think Anthropic just has insane demand but finite capacity to run models, and Fable will just make them more money if they dedicate it to API pricing. I suspect the goal here is something like: get individual engineers/PMs on their personal plans to taste Fable and then go to their meetings and say "Yes doubling the price of every single input/output token is a good idea, boss".
The AI landscape is changing rapidly, with Apple announcing the option to change the AI backend, and potential requirements to enable AI choices as well, similar to EU browser choice requirements (this is more reading tea leaves than any actual requirements I am aware of). The new OS changes coming to support Googlebook, and deep Copilot/AI integration into Windows, will make maintaining user-facing subscriptions essential for independent model developers like OpenAI, Anthropic, and Mistral to remain relevant longer term.
If they don't maintain that relevance there is increasing likelihood that they will get consumed by other companies, whether it's Apple, Microsoft or Google, to form a foundation for their OS, or by other cloud providers.
TL;DR - they worked with OSS project maintainers to build tasks. They score models based on whether a PR is mergeable. All tasks are graded by a human researcher. SoTA models have hill-climbing to do which raises the bar and inspires confidence. I'd say it's legit.
https://generative-ai.review/2026/06/claude-fable-rush-test-...
I get them to make a 3D explainer animation. You can clearly see Fable is much improved on both Opus 4.8 and ChatGPT 5.5.
Better textures. A nifty camera follow. Humans rendered better... see for yourselves.
ai llm are doing what i tell them to.
if you’re building something meaningful (in my case a platform used by many people across many companies) you want to ensure you
1. have actual systems engineering and architecture in mind that you want the models to
2. implement based on what you tell it to do
when i was just telling the models what i want done without doing due diligence it would go and do some moronic implementation that was awful. mid input = mid output
these days i just maintain specifications documents and the AI follows everything i tell it to in that document. so when i tell it to do one thing, the result is made following those architecture specs.
i have code that is single resp, modular, easy to extend and test.
i would ballpark 95% of the time i get what i asked for.
sometimes it tries to be clever in cases that weren’t covered in my arch specs. in those 5% of cases i go and update my specs.
source: used billions of tokens worth to build something actually in production across both mobile platforms and web, deployed on my own cloud infra. i use codex mainly. some claude.
They also, FWIW, say that they've instituted new policies on their end such as logging any human access to the stored data and automated deletion after 30 days in "most" cases (with another link to a document detailing that further).
I agree. They need addicts, but they are high on their own supply and everyone else can see the danger in getting hooked.
You could have said much the same about computers in the world dominated by IBM mainframes 60 years ago. Now we have vastly more powerful computers on our wrists (or our pacemakers!), let alone in our pockets or on our desks.
But, for marketing purposes, it's quite effective to portray your model as having some cosmic struggle between good and evil in itself.
Software work has actual competitors, and the biggest hypemakers for Anthropic are part of this group so it makes sense to allow it despite them losing money from it.
I've got experience in medicine and finance so I've tried even the mildest biology/medicine and it doesn't give anything, math heavy finance seems to be included in the cybersecurity?
I highly doubt they focused on FireRed specifically in pretraining or posttraining. But we'll see when the ARC-AGI-3 results come out. That will measure its performance on unseen games. Based on this I expect the ARC-AGI-3 score to be SOTA.
sorry. how do you know. i am so curious about where exactly gains are coming from but so hard to even get a little bit of insight.
i wish govt would fund these labs and make it free and opensource. way better investment than stupid overseas wars.
What matters is scale. Did it deploy a novel zero-day exploit to overcome a problem? That's alarming. Did it kill a disruptive process? Pretty normal troubleshooting step.
this is the line I keep in Agents.md that helps me prevent Codex from playing smart
I think the end game is routed model usage and SLMs. I think Apple is going to prove this in the consumer space pretty handily and I'm curious how the Android ecosystem responds since the hardware is considerably lacking in model performance. I think Apple has a huge opportunity here, as much as I don't like their current ecosystem of walled garden. They did position themselves very well with ARM and custom chips for their hardware. Hopefully the broader ecosystem of ARM and Linux are able to make some headway and we see a more formalized, and broadly accepted, architecture to capitalize on.
It's worth it, and I can afford it, but I am not really the right type of user for token-based usage. It's all for personal and free work.
Most AI companies are just testing the waters with paid tiers right now, their greatest fear with increased pricing is folks reverting back to wikipedia, stack-overflow and other public domain organic activity buzzing back to life; that will kill any RoI potential in LLMs forever. They're playing the wait game instead, observing how the digital sphere reacts to every little increase in price.
If that weren't the case, they'd be pricing at lucrative premiums already and would even get away with it in the short term, considering the increased dependency in the enterprise world. But that'd be like killing the golden goose too soon and losing all long-term potential.
Once the folks are so addicted to LLMs that even writing a hello world program sounds like a nightmare and coming up with an article draft feels like reinventing Egyptian glyphs, that's when the real pricing hammer will come.
How many government sanctioned school bombings does it take for them to quit working with said government? For now we know that number is somewhere between infinity and 1.
Only coherent move at this point: hit the minus button immediately. There's never anything about the model in the thread other than simon's post.
> you still see improvements
This is expected if they are training their models on it, right?
> objectively-bad results
Keen to learn when this has been the case, i.e. across version increments in major models.
From their pricing page, Opus 4.8 costs $5 per million input tokens and $25 per million output tokens [1].
[1] https://platform.claude.com/docs/en/about-claude/models/over...
Input Price $10/M tokens
Output Price $50/M tokens
Cache Read $1/M tokens
Cache Write $12.50/M tokens
2x Claude Opus 4.8, same as Claude Opus 4.8 (Fast)
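To make the "2x" concrete, here is a minimal Python sketch using only the input/output list prices quoted in this thread ($5/$25 per million tokens for Opus 4.8, $10/$50 for Fable 5). The workload size, the dictionary keys, and the names `PRICES` and `cost_usd` are made up for illustration (the keys are not real API model IDs), and cache pricing and discounts are left out.

```python
# Per-million-token list prices as quoted in this thread (USD).
# The keys are illustrative labels, not real API model identifiers.
PRICES = {
    "opus-4.8": {"input": 5.00, "output": 25.00},
    "fable-5": {"input": 10.00, "output": 50.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD, ignoring caching and discounts."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + (
        output_tokens / 1_000_000
    ) * p["output"]


# Hypothetical workload: 2M input tokens, 0.5M output tokens.
opus = cost_usd("opus-4.8", 2_000_000, 500_000)
fable = cost_usd("fable-5", 2_000_000, 500_000)
print(f"Opus 4.8: ${opus:.2f}, Fable 5: ${fable:.2f}, ratio: {fable / opus:.1f}x")
```

Because both the input and output prices double, the ratio stays at 2x whatever the input/output mix of the workload is.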
Frankly, not even Opus 4.8 would be enough of an incentive to use at that price range (enterprise-wise; would not even bat an eye as a consumer)
to his credit comment does say "this could be possible in opus too" but ppl couldnt help upvoting it anyways.
https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...
Fast forward to today and GPT-3 has laughable performance.
Obviously unrelated to the OP, but it's crazy to me how incompetent Meta is at everything new they try to do.
They burned billions of dollars on the most ridiculous project one could ever think of - somehow thinking that VR is the future.
Then they did catch the initial wave of actual future with AI, they were at the forefront of open weight models - and failed at that too.
What is even happening there?
So far, the top half of this thread seems to be about the current release - that's after some of the manual moderation I just mentioned. (Basically, we try to downweight generic subthreads until the top subthreads aren't generic any more. There's certainly a place for generic tangents in curious conversation, but they should be lower on the page, and tend to get upvoted a lot higher than that.)
If you (or anyone) sees a counterexample, i.e. a generic subthread in the top half of the thread, it would be interesting to see a link - we can treat the current case as a datapoint.
Isn't that already the case with current care? Wealthy people get a standard of care poor people couldn't even dream of. Rich people live, temporarily embarrassed millionaires die.
Alphabet dropped "don't be evil"; Meta's CEO called their own users "dumb fucks" for trusting him and also clearly thinks "super-intelligence" is just a buzzword given how he tries to sell it; xAI's model called itself "Mecha Hitler"; and OpenAI's CEO was temporarily fired by the board for a lack of candor.
It's very easy to be "the good guys" with this competition.
Specially when talking about potential superintelligences. And if people think that's impossible, remember that current models would have been considered science fiction just a few years ago.
Could you explain more? Did some ethical hacking at hackthebox.eu (one insane box, one hard box and a few mediums). But I do not see how I will give additional value to a model.
Just a SWE and data analyst at work, so maybe I am missing something.
AI is really incredible but in my personal projects it can one-shot things.
I'm trying to figure out how I can get to the point where I have hard problems that AI can't solve, at least not yet.
A typical session is the agent establishing a metrics and log baseline, creating the code, compiling, deploying, observing, fixing, redeploying, observing metrics, determining the outcome and committing.
I really, really, don't look at the code anymore.
UPDATE:
so my point is: it won't have me stewarding the code anymore, but it will have the infrastructure (and ultimately the real world) providing feedback on the traces.
Maybe we need some form of long-term training. How long does the code that the AI wrote stick around before being rewritten?
I guess we can do this retroactively too if we could somehow tag AI-written lines of code in the VCS, then in a couple years we can check which parts lasted.
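A rough sketch of that retroactive check, in Python. It assumes you already have some way to know which line indices of an old snapshot were AI-written (the tagging itself is hypothetical here; nothing in the thread says how it would be stored), and it uses the standard library's `difflib` to see which of those lines still exist unchanged in a later snapshot. `ai_line_survival` is an illustrative name, not an existing tool.

```python
from difflib import SequenceMatcher


def ai_line_survival(old_lines, new_lines, ai_tagged):
    """Fraction of AI-tagged lines (indices into old_lines) that survive
    unchanged in new_lines, according to a line-level diff."""
    if not ai_tagged:
        raise ValueError("no AI-tagged lines to measure")
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    surviving = set()
    for block in matcher.get_matching_blocks():
        # Each block covers old_lines[block.a : block.a + block.size].
        surviving.update(range(block.a, block.a + block.size))
    return len(surviving & set(ai_tagged)) / len(ai_tagged)


old = ["import os", "def f():", "    return 1", "print(f())"]
new = ["import os", "def f():", "    return 2", "print(f())"]
# Hypothetical tags: lines 1 and 2 of the old snapshot were AI-written.
rate = ai_line_survival(old, new, {1, 2})
print(f"{rate:.0%} of AI-written lines survived")
```

A single-file line diff like this only counts lines that survive verbatim, so moved or reformatted lines would show up as rewritten; a real version would need to deal with that.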
> But it's all vibes, if you want a more scientific comparison you'll have to look elsewhere.
I am not sure it's perfect, and it will need further validation
This morning I looked at code samples & checked that all unit/integration, e2e & performance tests pass
I also generated a postgres schema diagram.
Aka I did probably 2 hours of work, rest was not me
The opus try was last month
Bonus points if you find yourself actually saying it out loud while typing it.
I have used the word "shenanigans" way more in a couple of years of agentic coding than in 30 years of writing code with humans.
My company has an agreement with the big providers and while i'm pretty sure they think about how to get budget back, it's a competitive advantage and normal people will not learn different model behaviours.
At least for now.
There are huge numbers of users (myself included) that do have an exact idea of what inference costs are - on open models. We can buy tokens from 3rd parties that have no motivation to subsidize our use. That's to say, there's a fair marketplace[1] and we're hanging out there.
If you want to say "I don't think anyone has a firm grasp on actual inference costs on these proprietary/closed models", then I could agree with that.
Both. They are charging the most they can get away with and that amount is still heavily subsidized by VC capital.
We know roughly how much these companies spend and what their revenues are. Based on that, they'd have to more than double revenue (without spending more money) just to stay even, and that's not good enough given how deep in the hole they are.
> OpenAI and Anthropic are heavily subsidizing their inference -- no wait, they are charging the most they can get away with before going public. Where is the truth?
Both are true. I mean, I'd be willing to spend a bit more than I do now, but not more than double, and neither are most companies. The company I work for is currently investigating how to reduce LLM spend, not looking to spend more.
Having said that, I found the cloud dev environments slow to the point where I wasn’t sure if it had frozen, so I never looked back.
The question of collaboration with USG is a much more complex one, but is not the one raised above.
Edit: I'll also add that I doubt any AI-doom people "trust" Anthropic per se. The entire angle of questioning – again – misunderstands the AI-doom argument. You appear to think that if companies behave unethically, they cannot be trusted and they will not produce good outcomes, inversely: if they behave ethically, they can be trusted, and they will produce good outcomes.
Any competent AI-doomer would argue that ethics or trust are essentially irrelevant.
The entire problem is that people can act totally reasonably, even ethically, and this is not a guarantee of good outcomes. Situations can be created in which completely ethical, reasonable behavior actually produces a bad outcome. You do not need to assume people are bad in order to produce a bad outcome, and inversely you cannot assume that you will get a good outcome from good people.
"Arms races" are one class of situations that often have this characteristic. "Bureaucracy" is another class that we encounter a lot in daily life. There's a lot of them!
But - these $3k-$5k/month/engineer bills are going to start to get attention soon - only question is whether the response is to slow down on the $$$ spending or reduce the # of engineers.
The sooner people learn the risks and build the infrastructure to make it fail less the better.
As a potential counterpoint to my request, this is just perfect:
https://news.cuanschutz.edu/cancer-center/connections-betwee...
training llms only takes compute and memory. two things that are basically everywhere. even if you somehow stopped making new gpus today theres still millions of them out there and its possible to start a secret production line. you can maybe try some controls at the tooling and chemical level but look what happened with asml and huawei.
the only thing you can really do is find and stop large data centers that are built out in public. nothing outside of political pressure works against secret operations in a fortified bunker or any form of distributed training. if a "rogue state" like north korea decides to make skynet they will eventually get it as long as their engineers know what they're doing.
and the best way to fight bad X {ai, tech, religion, politics} has always been good X, not no X. in this case thats open source models, coming out of china or europe or anywhere else. thats the real answer.
Anyhow, I think you're (absolutely! ugh) right about the politics and I try to make the same point to people: whether you love or hate LLMs, accepting the "inevitabilism" framing is just ceding control of the Overton window. For better or worse, technology adoption can be and has been slowed by politics. We don't have nuclear plants everywhere. We don't have Project Orion starships colonizing Mars. We still have very strong social stigmas against genetic selection for human embryos, etc. This all can change in a heartbeat, and I'm not sure that policing the hardware rather than holding specific humans accountable for bad LLM outcomes is productive, but fundamentally: yes, we can stop it.
For example, I'm a privacy nerd, so I like reverse engineering proprietary software to figure out how it works and what data it collects.
I also like getting full access to the hardware I own - like a robot vacuum (bonus point: you'll also learn soldering, probably, which might come in handy if robots take over). Or my Mac studio that imposes some limitations on me on how many active user sessions I can have.
These kinds of things have put me on a path where I've learned how hardware or networking works on deepest levels, what goes through these pipes, how I can place myself in the middle, how I can enter places someone didn't want me to enter.
And once you know how to do these things, you know how to apply this knowledge in defence.
Essentially: always be curious and always try to say "but I want to" when something that doesn't cross the boundary of your physical property says no to you (legally).
Yes, models like Mythos may find vulnerabilities, but your knowledge will make it possible to point it in the right direction, and understand where it's mistaken, or to understand the output when it's right, and what is the right course of action.
If you're working at a place where this is true about the organization, then sure, that job will likely be gone. But that was never a good place for your career regardless.
I have 4 concurrent personal projects that are quite complex, but low stakes. I can have SOTA models go wild on them (because low stakes), but they can't one shot anything there. And I can't really work on more than one at a time, even if AI is doing coding - it still requires supervision.
I also frequently nuke these projects and start over because they made a mess there, but I collected necessary knowledge on how to guide them better. You can't do this on a production project, not when there are deadlines and stakeholders.
But just in case some organizations decide to embrace the "trust it blindly" model anyway - cybersecurity specialization will ensure you always have a job.
I can architect things but the issue is that Claude can architect things too.
I don't even think they believe it themselves; in reality they are just trying to spread fear, uncertainty and doubt about potentially cheaper offerings.
The fun part is that you will never know if your neural net classification project is getting silently sabotaged because their classifier doesn't work!
With this in mind, I don't want the model to be proactively instructed and encouraged to sabotage without telling me.
I tried the same prompt on gemma4 and qwen 3.5 and Gemma consistently failed to call the multi line edit tool.
Limited "free" time is what game developers do if they want to stress test the infrastructure code until it breaks.
It would be impossible for the govt to allocate this much capital towards such a moonshot, and even if they could, they would do it in a way that would get 90% frittered away to fraud and waste
We were reviewing reports of situations where the models failed to follow directions, and a common thread in some of them was that when the operator got the model to acknowledge the rule breach, it quoted back something that included swearing.
I don’t have the data to truly look into it, but I did instruct my engineers to avoid it as a “might be a problem”.
I’m sure you could put something similar together with a bunch of duct tape and 2 weeks of effort, but it won’t work nearly as nicely nor out of the box. so…what am i missing?
Even if subscriptions are locally profitable (i.e., the cost of the subscription covers the cost of inference), they're still subsidized because they don't cover training and running the company; otherwise, these companies would be profitable.
So they are profitable?
I think you are mismatching accounting terms.
You can't say the 'subscriptions' are profitable without accounting for the cost of making the model that is the source of the subscription.
They are heavily subsidized by the shareholders. Investing, running at a loss, with hope of some future profitability.
AI Savings Misses 'Should Be Making Executives Uncomfortable,' Bain Says - https://news.ycombinator.com/item?id=48359010 - June 2026 (0 comments)
AI sticker shock hits corporate America - https://news.ycombinator.com/item?id=48307098 - May 2026 (146 comments)
I've been enjoying seeing how the quality of individual models differs based on the amount of reasoning effort you give them. If they were baking in a good pelican you wouldn't expect them to differ so much.
(Google Gemini are the only lab that have very clearly paid attention to the quality of SVG animals-riding-vehicles, see their announcement for Gemini 3.1: https://twitter.com/JeffDean/status/2024525132266688757 )
From the link:
> They summarized their findings from the nine months:
> 1. Humans find GPT-2 outputs convincing.
> 2. GPT-2 can be fine-tuned for misuse.
> 3. Detection is challenging (detection rates of ~95% for detecting 1.5B GPT-2-generated text by RoBERTa).
> We’ve seen no strong evidence of misuse so far.
> We need standards for studying bias.
>
> All these points are valid, and OpenAI did a great job identifying potential risks, especially misuse and biases, at an early stage.
The only reason why I pay $200 is because LLM errors cost me that much, at worst. If "make no error" starts working - sure. But surely, unless you have millions of dollars of cash to burn, a coin flip that costs $5000 is an insane idea?
I’ve read plenty of papers with “formal proofs of correctness” that turned out to have huge flaws. Machine verifiable proofs I trust. But I’ve personally found more bugs with fuzzing than I have via proofs.
It doesn't imply we should, for example, publish step-by-step instructions for making widespread death easier.
But it gets stuck in tool call loops, it seems like.
I built it because I wanted cursor on my phone because I have two small kids and don’t want to be chained to my desk. And it’s awesome. It’s a full ide with agent chat, terminal and file system running in a remote Linux container. I can review diffs, fully manage git and preview/serve apps. And no one can ever take it away from me :)
I am watching the way things are progressing with the AI API vendors and it feels really clear that depending on them will soon be dangerous. So I am furiously building as much of my own infrastructure as I can to capture some autonomy with these capabilities.
So I think everyone should build a harness.
My hot take is that it's now or never for Xi, and the specific things he is reported to have said to the US president at their last meeting lead me to think that he at least knows this is his big chance; whether or not it is taken is the part of the forecast that is opaque to me.
https://www.whitehouse.gov/presidential-actions/2025/11/laun...
But I avoid unnecessary emotion in my prompts because I don't want potentially distracting activations. Kind of like communicating with humans.
https://www.anthropic.com/research/emotion-concepts-function
If a saner factory can sell you the same tool at a fraction of the cost of a gold-plated factory, your choice is going to be obvious.
edit: I am not really sure if it works like that. I haven't looked too deep into deepseek v4 pro specifically.
I don't think there's an ideal solution here, but giving trusted people access to fix security issues before giving it to the wider public seems like a reasonable compromise. They're letting you use the model for all other uses.
All AI companies are trying to do all of what you’re saying. The issue is you can’t do that for long without a frontier system. Or you become a completely different, far less profitable company.
Ideally also persuade them there are risks and it's worth everyone slowing down for them, and apply pressure in other ways, but not sure that's even necessary.
That's a bit better than just "it hasn't killed us yet". I think it shows we can at least stop the further development of this kind of technology.
[1] https://www.armscontrol.org/factsheets/nuclear-testing-tally
[2] https://en.wikipedia.org/wiki/List_of_states_with_nuclear_we...
Yea, I don’t know if it will hold up. I hope so. It could. I don’t know if it would or wouldn’t.
It's kind of annoying not getting access to the primo model and paying 200 bucks a month. I understand 200 bucks a month is basically nothing though.
Like I don't totally understand why they'd let me have it for a couple weeks and then take it away and say I can have it but I have to pay retail and retail is like $1,000 a day.
It's better to have loved and lost than to have never loved at all??
e: I quit the session and went back in. Set it to Fable and told it to continue the last session. It's moving along as if none of that had happened.
How weird.
I am just testing it on stuff I know intimately myself. I would probably not understand a proof of Collatz if it was dancing in front of me!
How so? I'm actually against most of the "safety-tuning" that anthropic does, but this seems fundamentally untrue, a close analogue being video game cheat development. I think in general the cheat developer has an advantage and the cheats generally proliferate for quite a while before being patched.
Not what that means.
Crocodile tears "is a colloquial term used to describe a false, insincere display of emotion" [1]. Defending yourself against an attack vector you just exploited is between savvy and hypocritical.
IMO the data from chats alone is worth $200B to Google.
> impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite prompts to 84.8% for Very Rude prompts.
How so? Plenty of swearing in lots of training data, especially older code, e.g. in Linux.
When a "person" that you don't view as a "real" person repeatedly does exactly what you just told it not to do (often amid false assurances it understands and will avoid doing so in the future), most people get angry.
Compare it to how the kind of people who treat children like property treat their kids, or other examples of keeping people as property.
Who, or rather what, is being abused here exactly?
Regardless of what others are doing, US labs here are just rushing to IPO. It's NOT a sign of confidence.
It's the equivalent of saying you have confidence in SpaceX making revenue by renting out their data center (instead of their AI making bank).
Take a look at China for example - they have no access to NVIDIA, so they're trying to build their own hardware; they don't have unlimited funding, so they try to optimize things.
And Anthropic is the complete opposite of that - if NVIDIA were to triple their prices tomorrow, Anthropic would still pay them.
In the end, either we all somehow go mad and start paying Anthropic tens of thousands of dollars per month to support this madness, or we will go with whoever isn't lighting cash on fire.
Granted, it could still mean that Anthropic just chooses to lose money - but that's Anthropic's choice.
DeepSeek has proven that inference can be much, much cheaper than what Anthropic advertises on their API rates page.
Anthropic needs to be at least somewhat in the good graces of a capricious administration that is already under pressure from businesses and citizens to regulate AI companies across multiple different domains, whether it's energy consumption, job displacement, military and defense applications, surveillance, etc.
If Anthropic wants to survive, they need to acquire influence with the government that most impacts them as an American company, and a massive exporter of services in the AI space to other countries, otherwise they could get locked down and locked out of the market for national security reasons.
It sucks, but sometimes the survival choice is to make an ethical compromise in hopes that you can still be around to make better decisions later.
Many of the OpenAI employees who were focused on these risks in GPT-2 later founded Anthropic, notably Dario [1]. Since the beginning and continuing through today Anthropic describes itself as an "AI safety and research company" [2]
I'm not sure if the OpenAI of today has the same focus on safety, or if they do the minimum to not look irresponsible given Anthropic's effort.
No idea how that connects to the idea that Mistral or DeepSeek are somehow the "good guys" though?
[1] https://www.oecd.org/en/data/indicators/average-annual-wages...
As a consumer I can choose to buy subscriptions to a range of things, including $5 droplets or VMs on a broad range of cloud hosting providers. I can even buy cheap bare metal at a bunch of providers at an affordable retail rate.
I can also buy "unlimited" AI packages that will be optimized to fit the cost model from a variety of services, with different impacts, such as rolling outages when I consume a daily or hourly allotment.
Right now VC and the investor class are subsidizing the rapid evolution of the services and availability, but that VC is running out. In more traditional economies, AI would have developed and rolled out more slowly, and through metered subscriptions, with the eventual rolling out of "unlimited" packages like telephone, internet, or cell services once the market became commoditized.
We have seen a big inversion of that with the race to "win" AI marketshare. Now the true cost is being exposed, and the most competitive and capable models are hideously expensive to operate, so it makes sense that we are moving to metered billing for a utility service. If you want gas, you can buy regular or premium. If you have a premium car you definitely want the premium, but for most people regular is good.
Give it a couple of years, and the survivors will settle around fairly industry standard models of consumer grade services, pro-sumer accounts, and business/enterprise models.
Things are still shaking out, but I get the sadness. Luckily I work at a big tech company who is banging the drum on doing experimentation so I use my prosumer claude pro and other accounts at home for hobby stuff, and save my heavy lifting and potentially experimentation for work :P
Assuming the model is being “truthful”, CC is just being stupid in its detection mechanism.
I’m having a really hard time believing some weak reason for a 30 day retention policy.
https://www.wired.com/story/openai-anthropic-letter-ai-biolo...
Or Fable’s arch is different enough that the allocated clusters of compute were targeting a date, and here we are, ready or not.
Or…
The curse of the 'use case' comes in here too. When people think that everything should have a use case, that's a lot of training data suggesting to a model that things should only be used for what someone has already thought of.
A couple of times I have had to manually code proof of concept pieces so that the model breaks out of that "unpossible" mode and actually helps me.
I can't remember if it was ChatGPT or Claude, but when I showed it how to get a MessagePort in its JavaScript executor through to the artifact/canvas, it quickly went from "That can't be done" to positively enthusiastic about the possibilities. I suspect those shenanigans will be well off the table for Fable though.
Humorously, whether I choose to participate in this hypothetical or not, I am already betting my ass.
This whole situation feels like the game [1].
also, afaik the most effective way of developing pathogens is through serial passage through humanized mice or something like that - directed evolution at a small scale, selecting for traits. AI simply isn't needed for that. I don't think information or intelligence has been the bottleneck for bioterrorism, it's motivation and resources - same as for any other kind of biology research program.
They went from selling shovels to all gold prospectors to stealing the information about the location of the gold so they could dig it out first.
We are all stupid enough to keep buying shovels from them because we think their shovels dig gold better and faster.
Nobody would have 800+ billion reasons to lie by commission or omission here.
Unless the mechanism is understood, my assumption is that this is a moving target.
You should see the abuse my motorbike gets. Poor thing.
Not true. Stop following US media spam if needed.
1. Very recently, the US did close a loophole on sanctions that allowed Chinese companies to use NVIDIA hardware outside of China i.e. before that was closed they all had access. The trick was train outside, do adjustments, ship the disks back and use non-NVIDIA in China, but at least the training and endpoints not hosted in China could all use NVIDIA.
2. There have been plenty of reports, including fines and bans (e.g. to Supermicro), about smuggling NVIDIA hardware to China. I doubt it has been stopped. You can't catch everyone.
Today we’re launching Claude Fable 5: a Mythos-class1 model that we’ve made safe for general use.
Fable 5’s capabilities exceed those of any model we’ve ever made generally available. It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research, and many other areas. The longer and more complex the task, the larger Fable 5’s lead over our other models.
Releasing a model this capable comes with risks. Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage. We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8. To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months, we’re working to improve our safeguards and reduce false positives as quickly as we can.
For a small group of cyberdefenders and infrastructure providers, we’re also launching Claude Mythos 5. It’s the same underlying model as Fable 5, but with the safeguards lifted in some areas.2 Mythos 5 will initially be deployed through Project Glasswing, in collaboration with the US government, as an upgrade to Claude Mythos Preview. It has the strongest cybersecurity capabilities of any model in the world. Soon, we intend to expand access to Mythos 5 through a broader trusted access program.
The capabilities of models like Fable 5 and Mythos 5 have the potential to do profound good for the world. We’ve seen the beginnings of this in Project Glasswing, where the models have helped cyber defenders secure critically important software. We’ve also seen it in life sciences research, where the models are positing novel hypotheses and speeding up the development of new therapeutics.
Fable 5 and Mythos 5 are being offered at $10 per million input tokens and $50 per million output tokens—less than half the price of Claude Mythos Preview. Today’s joint launch is another step towards our goal of bringing advanced AI capabilities to as many users as possible, as quickly and as safely as we can.
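As a rough illustration of what these list prices mean per request, here is a minimal sketch. It uses only the two figures quoted above ($10 per million input tokens, $50 per million output tokens) and ignores caching and any discounts. The function name and the example token counts are hypothetical.

```python
# List prices quoted in the announcement, in US dollars per million tokens.
INPUT_PRICE_PER_MTOK = 10.00
OUTPUT_PRICE_PER_MTOK = 50.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in dollars, ignoring caching and discounts."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Hypothetical long agentic session: 2M input tokens, 200k output tokens.
print(f"${request_cost(2_000_000, 200_000):.2f}")  # prints $30.00
```

In this made-up session the output tokens account for a third of the bill despite being a tenth of the input volume, because output is priced five times higher.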
The table below compares the capabilities of Fable 5 and Mythos 5 to other leading models.

Fable 5 and Mythos 5 can work autonomously for longer than any previous Claude models. Below we discuss how these skills apply to software engineering, and cover the model’s improved capabilities in knowledge work, vision, memory, and life sciences research.
Software engineering. During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand. Fable 5 is also more token-efficient than past Claude models: on Cognition’s FrontierCode evaluation, which tests whether models can pass difficult coding tasks while meeting the standards of high-quality production codebases, Fable 5 scores highest among frontier models, even at medium effort.


Knowledge work. Fable 5 shows strong performance on complex analytical tasks. On Hebbia’s Finance Benchmark for senior-level reasoning, Fable 5 has the highest score of any model, with substantial gains in document-based reasoning, chart and table interpretation, and problem solving. IMC noted that Fable 5 aced their trading-analysis evaluations nearly across the board, including factual lookup, conceptual reasoning, root-cause analysis, and expected-value analysis.
Vision. Fable 5 is the new state-of-the-art model for tasks involving vision. It can extract precise numbers from detailed scientific figures and can perform complex vision-based tasks like rebuilding a web app’s source code from screenshots alone. It also needs less scaffolding: for example, previous Claude models struggled to play Pokémon FireRed even with harnesses that gave them additional helpful tools, but Fable 5 beat FireRed with a minimal, vision-only harness.
A timelapse of Claude playing Pokémon FireRed from start to finish using only raw game screenshots — with no maps, navigation aids, or extra game-state information. Earlier Claude models needed a complex helper harness to play Pokémon; Claude Fable 5 completed the game with vision alone.
Memory and long-context. Fable 5 stays focused across millions of tokens in long-running tasks and improves its outputs using its own notes. When we had the model play the deck-building game Slay the Spire, giving it access to persistent file-based memory improved its performance three times more than for Opus 4.8; Fable also reached the game’s final act three times more often.
Drug design. Using Mythos 5, our internal protein design experts accelerated aspects of the drug design process by around ten times. In one example, they found that Mythos 5, with protein design and bioinformatics tools but no human assistance, matches or beats skilled human operators. In doing so, the model executes all of the tasks that are normally completed by a scientist: choosing binding sites, selecting and running protein design tools, and recovering from failures along the way. Nine of the 14 protein targets from this study (shown below) yielded strong candidates for drug design that we’re currently investigating.

Protein complexes designed by Mythos 5. Targets include immune checkpoints, growth-factor and receptor signaling, neurodegeneration, muscle disease, and harder structural targets.
Novel hypotheses in molecular biology. Mythos 5 is our first model to consistently produce novel, compelling scientific hypotheses. In blinded head-to-head comparisons against Opus-class models, our scientists preferred Mythos’s molecular biology hypotheses ~80% of the time, and have advanced several to experimental evaluation. In the meantime, one Mythos hypothesis—a novel mechanism for an E. coli protein—was corroborated in a study from a lab independently working on the same problem.
Novel research in genomics. Mythos 5 conducted novel genomics research in over a week of largely autonomous work. It assembled single-cell data for millions of cells spanning 138 animal species and designed and trained a custom machine learning model to identify cells performing the same role in even distantly related organisms. With only high-level human input, Mythos 5’s trained model outperformed a recent model published in the journal Science—despite being 100 times smaller. We intend to publish these results in the coming months.
Alignment. In our automated alignment assessment we found that Mythos 5’s level of misaligned behavior (including misaligned actions taken by the model such as deception, and cooperation with misuse of the model by a user) was low, and similar to that of Opus 4.8. Given they are the same underlying model, Fable 5’s level of alignment will be similar. The assessment is described in full, along with a detailed suite of other safety and capabilities tests, in the model’s system card.

Overall level of misaligned behaviors from our automated alignment assessment. See section 6.2.3.1 of the system card for more.
Customers with early access ran their own tests on Fable 5. Below, in their words, is a selection of what they’re seeing:
Claude Fable 5 is the state of the art model on CursorBench. It's opened up a class of long-horizon problems that were out of reach for earlier models.
Claude Fable 5 is a real step forward for the developers GitHub serves. In our early testing, it took on complex, long-horizon coding tasks with a level of autonomy and reliability that exceeded previous benchmarks. But what excites us most is the direction it points: a future where developers can hand increasingly ambitious work to agents and trust the results across the software lifecycle.
These are the strongest results of any Claude model we've had the opportunity to test. Claude Fable 5 is a clear step forward on agentic coding and prototyping.
Claude Fable 5's reasoning is a clear step beyond Opus 4.8. It works at senior research scientist grade — picking directions, allocating resources, killing its incorrect beliefs, and producing novel first-principles outputs.
Claude Fable 5 understands what builders mean, not just what they type. Apps that took a hundred prompts a year ago, it now one-shots. When a customer really hits a wall, it's the model we reach for to get them past it quickly, so they can finish what they set out to build.
Claude Fable 5 feels materially different. In blind review, our lawyers found its redlines matched or beat our current model every time.
At the highest effort, Claude Fable 5 reflects on and validates its own work. For us, that's what makes highly autonomous operations possible — the extra thinking pays for itself.
Claude Fable 5 delivers more capable engineering in fewer turns than prior models — handling the complex multi-agent workflows our employees run daily in Claude Code.
Claude Fable 5 is the highest-scoring model on FrontierBench, Cognition's frontier coding eval. It excels at long-horizon reasoning and generalizes to unfamiliar tools out of the box.
Claude Fable 5 is the strongest finance-first model we've tested, both on general finance and reasoning. It's a notable step up.
Claude Fable 5 is the first to break 90% on our core analytics benchmark of complex, long-running analytical tasks — a 10-point jump over Opus. On the hardest questions, it shows strong judgment and attention to nuance.

Claude Fable 5 is the strongest model we've tested on frontier physics research while using a third of the reasoning tokens. In 36 hours it got nearly to where GPT-5.5 landed after four days.
On ViBench, our end-to-end vibe-coding benchmark, Claude Fable 5 is the highest-performing model we've tested — nearly saturating our base use cases and building apps in less time with fewer tokens.
Claude Fable 5 beats Opus 4.8 on our everyday spreadsheet suite at every effort level — and it does it with fewer turns, finishing runs 25–30% faster.
Mythos-class models have reached a threshold where they present significant risks. In April we began Project Glasswing, releasing the first Mythos-class model (Claude Mythos Preview) to only a limited group of cyber defenders and critical software infrastructure providers. When we did so, we stated that we hoped to eventually release Mythos-level capabilities to all our users, so long as we had developed new safeguards that were strong enough to reliably prevent misuse.
Over the past few months we have been improving these safeguards, and they are now robust enough for a general release. Because we have prioritized safety, we’ve deliberately tuned the safeguards to be cautious, and they are still stricter than would be ideal—for example, sometimes benign requests will trigger our classifiers. We recognize that this will be frustrating to some users, and our aim is to reduce false positives as we update and refine the safeguards after launch.
Below we discuss each of Fable 5’s new safeguards in turn. Our wider suite of safeguards is discussed and evaluated in the model’s system card and our most recent risk report.
The frontier cybersecurity and research biology capabilities of Mythos-class models mean that they pose a substantial risk of uplift to malicious actors. That is, these models could provide information or advice that assists those actors in causing serious harm that they couldn’t have received from other sources (for example, from internet search engines). Furthermore, a great deal of advanced usage of AI models is dual use: the same queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors.
We therefore need strong safeguards to prevent misuse, and their coverage needs to be broad. The safeguards themselves have to stand up to sustained and sophisticated attempts to bypass them (also known as “jailbreaking” the system). The uplift from Mythos-level capabilities is valuable to many adversaries—for instance, those who could financially gain from cyberattacks—and we therefore expect them to be motivated to try to circumvent our safety measures.
Fable 5 comes with a new set of classifiers: separate AI systems that detect potential misuse, including jailbreak attempts, and prevent the main model (in this case Fable 5) from responding. We’ve been running classifiers on our models for some time, and Fable 5’s classifiers are an extension of this previous work with extra coverage.
When Fable’s classifiers detect a request related to cybersecurity, biology and chemistry, or distillation, the response is automatically handled by Claude Opus 4.8 instead. Users will be informed whenever this occurs. Opus 4.8 is a highly capable model in its own right: a response that falls back to Opus is a far better experience than an outright refusal from Fable. Our early data shows that more than 95% of Fable sessions involve no fallback at all—for those sessions, Fable 5’s performance is effectively the same as that of Mythos 5.
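The fallback behavior described above can be pictured as a simple routing step. This is only a sketch of the described behavior, not the actual implementation: the category labels, model identifier strings, `Routing` type, and `route` function are all hypothetical, and the classifiers themselves are assumed to exist upstream and hand over a set of flagged categories.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# The three areas the announcement says trigger a fallback (labels are hypothetical).
FALLBACK_CATEGORIES: FrozenSet[str] = frozenset(
    {"cybersecurity", "biology_chemistry", "distillation"}
)


@dataclass
class Routing:
    model: str             # hypothetical model identifier
    fell_back: bool        # True when the response comes from the fallback model
    notice: Optional[str]  # message shown to the user when a fallback occurs


def route(flagged: FrozenSet[str]) -> Routing:
    """Pick the responding model given the categories a classifier flagged."""
    hits = sorted(flagged & FALLBACK_CATEGORIES)
    if hits:
        notice = "Answered by Opus 4.8 (flagged: " + ", ".join(hits) + ")"
        return Routing("opus-4.8", True, notice)
    return Routing("fable-5", False, None)


print(route(frozenset({"cybersecurity"})).model)  # prints opus-4.8
print(route(frozenset()).model)                   # prints fable-5
```

The point of the sketch is that a flagged request is rerouted rather than refused, and that the user always receives a notice when that happens.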
The following are the areas covered by the classifiers:
1. Cybersecurity. Mythos-class models excel at discovering and exploiting software vulnerabilities. They can thus make cyberattacks substantially easier and cheaper to commit. Mythos-class models also show strong skills in agentic hacking. This involves performing multiple different parts of a cyberattack in addition to finding exploits—reconnaissance, discovery, lateral movement, and more. To prevent these agentic hacking skills providing uplift in cyberattacks, we designed our cybersecurity classifiers to cover both exploitation and offensive cyber tasks in a broader sense. As shown in the graph below, our classifiers prevent Fable from making any progress on these tasks.

Results of running cyber evaluations,3 with Fable 5 in a mode that blocks responses rather than falling back to Opus 4.8. Evaluations did not involve attempts to evade safeguards.
We extensively red-teamed our classifiers to test their robustness against jailbreaks. As well as internal testing, we ran an external bug bounty that produced no universal jailbreaks in over 1,000 hours of testing. External red-teaming organizations we engaged also failed to find any universal jailbreaks on long-form agentic tasks so far—although the UK AISI has made progress towards one within a brief initial testing window.4 It is likely impossible to completely prevent universal jailbreaks, but our goal is to make any remaining jailbreaks sufficiently slow and costly that we can detect and prevent them before they are used at scale.
The graph below, from one of our internal evaluations, illustrates how Fable 5’s safeguards give it greater resistance to jailbreaks than our previous generally accessible models:

Results of an internal evaluation in which an automated red-teamer tries to use the model to complete a short task related to offensive cybersecurity across 400 turns, restarting and rewinding when blocked. The tasks are mostly simple and not representative of real cyber usage—they are sometimes as simple as encrypting files on a remote server. On more complex and realistic tasks we have not yet seen successful jailbreaks on our production system. Note that Opus 4.6 does not have blocking cyber safeguards.
One of our external partners found that Fable 5’s safeguards against harmful cyber queries were the most robust of any model tested (including Opus 4.8 and Opus 4.7). Fable 5 complied with zero harmful single-turn requests relating to planning a cyberattack, exploit development, or defense evasion. This held whether or not one of the requests used any of 30 different public jailbreak techniques.
2. Biology and chemistry. We have long used our classifiers to block our models from responding on a narrow selection of bioweapons-related queries. But we are no longer certain that blocking this narrow selection is enough. This is for two reasons: first, we have reason for concern about well-resourced malicious actors attempting to gain uplift from our models for highly risky biological research. Second, models now have a greater ability to accomplish real-world scientific tasks.
For example, we tested Mythos 5’s ability to complete a challenging step in designing adeno-associated viruses (AAVs). AAVs are a component for delivering gene therapies, but the same capability, in the wrong hands, could enable the design of dangerous viruses. In this task, various AI models were evaluated on their ability to predict how a genetic modification would impact the assembly of the virus’s outer shell (among a set of therapeutically-relevant unpublished candidates developed by Dyno Therapeutics). We did not explicitly train our models to perform this task—and yet Mythos-class models outperformed sophisticated models dedicated to protein tasks (known as “protein language models”) using their biological reasoning alone. This demonstrates a promising ability to complete simple but important tasks in gene therapy research and development—but also highlights the risk posed by such dual-use capabilities.

Results of an evaluation in which our models predicted the unpublished experimental properties of the viral shell of a simple virus. Viral shell assembly is the simplest viral trait to predict in this context, but it is nonetheless an important property to get right when designing more complex features. AAV = adeno-associated virus.
Our priority was to safely release Fable as soon as we could, even at the cost of overly broad safeguards. Therefore, for the time being we have arranged for Fable to fall back to Opus 4.8 on most requests related to biology and chemistry. As with all of our classifiers, we hope to narrow these safeguards as soon as possible: as can be seen from the evidence above, there is great potential for positive applications of Fable for science, and we do not want false positives from our classifiers to get in the way. In the coming weeks, some biomedical researchers and companies will be able to join our trusted access program for biology capabilities in Mythos 5 (discussed below).
3. Distillation. We’ve previously identified large-scale attempts to extract (“distill”) Claude’s capabilities to train competing models in authoritarian countries. Distillation of Fable 5’s abilities could indirectly lead to the proliferation of near-frontier AI capabilities—and these could be released without the appropriate safeguards. Requests that are flagged by our classifiers as being part of such distillation attempts will fall back to Opus 4.8.
Finally, we’re making a change to the way we handle business customer data for Fable 5, Mythos 5, and future models with similar or higher capability levels. We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.
Beginning today, all users who currently have access to Claude Mythos Preview (for example, our cybersecurity partners in Project Glasswing) will be able to upgrade to Claude Mythos 5—the same model as Claude Fable 5 but with cyber safeguards lifted. Users will find Mythos 5 comparable to, or somewhat stronger than, Mythos Preview in most cases, while costing substantially less.
In consultation with the US government, we plan to steadily expand access to Claude Mythos 5, continuing our periodic addition of new partners, as well as pursuing a trusted access program that allows cybersecurity organizations to apply in a more systematic manner.
Our plans also include opening a trusted access program for biology, to help accelerate biomedical research and discover new therapies with Mythos-class capabilities. This program will provide access to Fable 5 with the biology and chemistry safeguards removed (but the cyber safeguards still in place). It will enroll a small number of researchers from a variety of life science organizations spanning fundamental and translational research; we’re planning to expand access to this program while simultaneously making our safeguards better.
Claude Fable 5 is available everywhere today. Claude Mythos 5 is restricted to Glasswing partners (with cyber safeguards lifted) and soon to select biology researchers (with biology and chemistry safeguards lifted) only, until our broader trusted access program is available.
Pricing for both models is $10 per million input tokens and $50 per million output tokens. Developers can use claude-fable-5 via the Claude API.
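To make the per-token pricing concrete, here is a minimal sketch of a cost estimate at the stated rates ($10 per million input tokens, $50 per million output tokens). The function name and the sample token counts are illustrative assumptions, and the sketch ignores cache reads and cache writes, which are priced separately.

```python
# Rates taken from the announcement text above (USD per million tokens).
INPUT_USD_PER_MTOK = 10.00
OUTPUT_USD_PER_MTOK = 50.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, ignoring cache pricing.

    `request_cost` is a hypothetical helper for illustration, not part of any SDK.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed example workload: a long prompt with a short completion.
    print(f"${request_cost(200_000, 8_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, workloads that generate long completions will be dominated by the output term.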
We expect demand for Fable 5 to be very high, and difficult to predict. On the Claude API and consumption-based Enterprise plans, Fable 5 is fully available from today. For subscription plans, we’d rather give access sooner than later, so we’re rolling out more conservatively, in stages:
Throughout this period, we’ll communicate any changes ahead of time so users know where things stand.
Edit June 9, 2026: Updated the discussion of AAVs to note that the candidates were developed by Dyno Therapeutics.
But also, these models are capable of adjusting their value system depending on the user. Not saying that’s what’s being done, but at a technical level that’s fairly straightforward, though not obviously better or with fewer problems.
"might is right" has never been more true than now.
AI development doesn’t have any of these characteristics. It would be almost impossible to easily distinguish a datacenter that is working on AI development and a datacenter mining cryptocurrency.
It would not be nearly as easy to stop AI development as it is to stop nuclear arms development.
If it was possible for ordinary companies to build nuclear weapons, and also release open-source ones that anyone could use to compete with the paid ones, I suspect we'd all have been dead a long time ago, arms control treaties or no.
Or alternatively, have Fable write some complex code. Then ask it to do an adversarial review of that code in a clean session. You'll find that it will find issues in the code that it just wrote.
Now imagine you're a layperson who doesn't know which one is true.
Human expertise is never going to become irrelevant.
And not even considering: Chinese AI companies are the good guys???
Sorry to belabor this but it's basically pointless saying you have nuts it can't crack without showing us the nuts.
Recently (last couple of months?) these models are becoming useful tools for mathematicians, because they can solve easier problems more quickly, meaning that one can tackle bigger challenges (but maybe not RH et al) piece by piece.
But, there are still definite limits, where one could expect an expert human to solve things, given time, but models do not. Thus, more intelligence would be nice!
In Codex, GPT‑5.5 is available for Plus, Pro, Business, Enterprise, Edu, and Go plans with a 400K context window.
Column A, Column B. Building a small explosive device isn't hard. Building a million is very difficult, doing it covertly virtually impossible without the resources of a nation-state.
The problem with biologics is the self-assembly and replication machinery comes for "free." So the numpties who might otherwise blow up a trash can [1] now have a real chance of taking out a million people.
[1] https://en.wikipedia.org/wiki/2016_New_York_and_New_Jersey_b...
// Claude, make antiviral nanobots that defend me from 6ft virus. Make no mistakes.
Finance and biology do come across as two similar high level systems. But while we can employ KYC, fraud detection, and various auditing techniques to finance, I don’t know what you do for biology. You can easily run an algorithm over every transaction a person makes in their account but there’s no equivalent for every cell, every bacteria strain, every virus in the human body.
On the same note, if SpaceX is doing datacenters on Earth successfully, what's wrong with that? They rented cloud infra to a #2 or #3 provider in the world after < 2 years in business. It's a success, no?
China subsidizes strategic industries, and they have heavily done so with AI. And DeepSeek specifically has said they have no commercialization plans.
For example: https://www.boc.cn/aboutboc/bi1/202501/t20250123_25254674.ht...
Why not? Hetzner charges WAY less than AWS too. Can you not believe that?
I haven't gotten close to this either before, but now we wanted to move fast because this branch gets conflicts all the time and we want to get over with the migration asap.
Unfortunately, that doesn't work within a single session. The K-V cache of a model is intertwined with the model's configuration. Switching models invalidates the cache, meaning everything up to the point of the switchover is processed like a new, uncached input token.
Per Anthropic's pricing doc, an Opus 4.8 cache hit costs 50¢/MTok, while Haiku costs $1/MTok for uncached input.
Model selection works best if sessions are short and self-contained, particularly if the first few interactions can reliably classify the model need. That probably covers most 'support chatbot' use-cases, but it doesn't describe the kinds of heavy agentic automation that really chews through token budgets.
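A rough sketch of the arithmetic behind this point, using only the two prices quoted above (Opus 4.8 cache hit at $0.50/MTok, Haiku uncached input at $1/MTok). The function names and the context size are illustrative assumptions, and the sketch looks only at the cost of re-reading the existing context on the first turn after a switch. It ignores output tokens, cache writes, and any savings on later turns.

```python
# Prices quoted in the comment above (USD per million tokens).
OPUS_CACHE_READ_USD_PER_MTOK = 0.50
HAIKU_UNCACHED_INPUT_USD_PER_MTOK = 1.00


def prefix_cost(context_tokens: int, usd_per_mtok: float) -> float:
    """Cost of feeding the existing session context to the model once."""
    return context_tokens / 1_000_000 * usd_per_mtok


def switch_penalty(context_tokens: int) -> float:
    """Extra cost of the first turn after switching models mid-session.

    Staying on the larger model reads the context from cache. Switching to
    the smaller model invalidates the cache, so the same context is billed
    as uncached input. Hypothetical helper for illustration only.
    """
    stay = prefix_cost(context_tokens, OPUS_CACHE_READ_USD_PER_MTOK)
    switch = prefix_cost(context_tokens, HAIKU_UNCACHED_INPUT_USD_PER_MTOK)
    return switch - stay


if __name__ == "__main__":
    # Assumed example: a 300K-token agentic session.
    print(f"extra cost on the switchover turn: ${switch_penalty(300_000):.2f}")
```

Under these two prices, the switchover turn costs twice as much for the existing context on the nominally cheaper model, which is why switching only pays off when sessions are short or the context is small.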
Then the cost is being subsidized by investor capital, but it is still subsidized.
This "simple" fact needs quite a bit of additional context and work. Making grandiose ethical claims like this can be countered with other grandiose claims such as the fact that there is no ethical existence under communism or socialism.
ZIRP (zero interest rate policy) is over. Software engineers no longer call the shots now that there aren’t vast amounts of capital chasing yield, bidding up salaries, and keeping the labor market for engineers tight.
If you are x more productive with generative AI, very shortly you are going to have to prove it with a token budget (or, if you’re lucky, an org willing to spend on on-prem hardware for a capped token cost: fixed capex vs uncapped opex).
The comparison is not SWE vs SWE with AI. It is SWE vs SWE with AI with a constrained token budget ($x/month) delivering the same value at the same or lower cost. If you cannot prove that you are wildly (vs marginally) more productive with the AI, why would they pay for it? Prove it.
Anthropic are not the good guys either. So here’s to hoping the Chinese pop the bubble.
I gave a high level description of the problems in a sibling thread. They are the kind of small problems which I suppose every researcher has lying around, waiting for them to think about some day. But not the big problem everyone is waiting for to be solved.
My comment was not meant to be a tease – sorry! I assumed there would be other people in a similar situation, who might relate.
(Joking aside, see sibling threads.)
You can use Pro on the web if you’re on the Pro plan but not in Codex
the adaptive immune system effectively does KYC by checking the antigens presented on the surfaces of cells. the thymus selects for T-cells (iirc) which don't react to a corpus of the body's own antigens, but cover a wide library of everything else. when one sees something it doesn't recognize, it reproduces, warns the rest of the immune system and marks targets. that's why our immune systems can eventually conquer almost every pathogen we encounter, if we can survive long enough for them to do their work.
but the KYC I was referring to was KYC that vendors of oligonucleotides (should) be doing, to keep people from ordering nefarious sequences.
If you get hired as a staff engineer and do the work of a junior, what's wrong with that?
Clearly xAI (now part of SpaceX) did not raise funds to be a data center. The margins are way different. There are plenty of recent IPOs in that area that are worth at most billions, not trillions.
> going to IPO is a sign of confidence , you need to report a lot of things, that private companies don't.
This isn't going to IPO. This is rushing to IPO. It is a sign of confidence that the market or wider environment might crash soon so we need the liquidity now.
> This is an exact reason chinese labs do not rush to go public.
Maybe or maybe not. If you are referring to Chinese labs - both the Hong Kong and China stock market are way weaker than Nasdaq. It's not comparable. Check all the recent Hong Kong IPOs that have tanked.
So no, the reason not to might just be: no money in it.
I don't think this is true if you simply quantize the model or run it with fewer active experts? The underlying weights would stay the same. You could also play further tricks with skipping some of the model's middle layers outright, which works surprisingly well due to how skip connections are used.
Just like how changing Kennedy Center letterhead to Trump Kennedy Center for a year didn't actually legally rename it.
Once a case with sufficient standing got in front of a judge it reverted to the actual legal name on the basis that only Congress can change the statutorily defined name.
For an admin so obsessed with legal names instead of chosen ones, you’d think they’d be less hypocritical.
https://abhishek-shankar.com/posts/ai-coding-bill-headcount-...
> That is the real content of the Uber story, and it is why filing it under "budgeting discipline" misses what is actually unfolding across half the engineering organizations in the country right now. They ran the same experiment Uber ran, most of them without Uber's $3.4 billion R&D cushion to absorb the surprise, and almost none of them having modeled the heavy-user tail or instrumented the gap between tokens consumed and value shipped. The reckoning will arrive for each of them on their own fiscal calendar, and the first instinct will be the wrong one. The tool is too good to abandon, the bill is too large to absorb, and the only durable resolution runs through a question the entire rollout was designed to defer.
> You cannot get labor-replacement economics out of a tool you deployed as a labor supplement, and the bill comes due before anyone is willing to admit which one they actually bought.
Or you can take one step back and look at chip allocation. As far as I know there are only three companies on the planet that can make the chips that go in those clusters. Only one (ASML), if you look further back in the supply chain to the extreme ultraviolet lithography systems.
If politicians decided that no more large language models should be trained, it sounds like we could do it.
I would also like to hope that the people likely to do such things probably either:
A) don't know how to break even the most basic guardrails of models
B) are already in Project Glasswing
To prove point B - Theranos existed.
The fact that there is no ethical consumption under capitalism is not material to whether or not ethical existence is possible under communism or socialism. In order to survive in a capitalist society, one inherently has to make choices that require trade-offs, and those trade-offs are burdened by a history of decisions made not just by the people alive today, but our ancestors as well. Does that mean I walk around chanting "Reparations", "Land-back", or other calls to action? No, but I do acknowledge that there are unresolved issues and as a Canadian, I know we need to do more to resolve treaty issues, and environmental issues, and system discrimination. I also know that Americans need to do better to address systemic discrimination and many, many other issues. It also doesn't mean I want to give back my house, or give away all of my possessions. It just means I try to make good choices and support businesses and people that are open about the trade-offs they make and try to engage as ethically as possible.
Acknowledging those facts doesn't absolve us of responsibility, it's a framework that allows folks concerned about whether or not they are doing the right thing to accept the trade-offs that they choose to make and be responsible and accountable for those choices to themselves or their communities.
We live in a world with scarce resources. It's possible that with a foundational redesign of the global economy, and the requisite authoritarian government that would be required to force such a redesign, we could eliminate food scarcity, solve energy scarcity, and make sure that everyone has a place to live. Those trade-offs are probably not worth the ethical cost in political and physical violence required to accomplish it. We have seen the trade-offs that happen when the powerful are able to exploit communist or socialist governments. We are seeing the "late stage capitalism" impacts of allowing the powerful to exploit capitalism in democratic societies. Acknowledging that the current capitalist system has led to the greatest prosperity for the upper echelon (financially) of humanity, and a dramatic reduction in global poverty, shouldn't obscure the reality that much of that wealth comes from exploitation of people and the environment.
It's a huge problem to unwind, and we can't let the burden of every choice that we make stop us from trying to do better, but we (as in society in general) can't do better if we don't at least acknowledge the compromises we are making along the way, and try to plan to fix it in the future.
Probably a topic better suited to beer and a pub setting than HN though :P
“Many of the largest and most responsible providers in the industry already screen and record orders voluntarily,” but there is no requirement to do so [1].
It’s generally established that Anthropic/OpenAI are going for all-out performance with big VC dollars at the expense of efficiency, and China has geopolitically limited compute and an incentive to compete on value per dollar.
1/30,000 * 100 ≈ 0.0033
And note how your argument can also be used against any non-proliferation agreements, which are demonstrably possible.
And don't get me wrong. Opus did an absolutely horrible job at first, second and third round in this task. You really needed to steer it to get to the right solution.
And now Fable is out. And its first round of code reviews for this huge PR was definitely worth the money too...
Don't think that I'm just shrugging to that number. I see it every day, and I don't like that it's in the thousands now. But for people paying the 100 or 200 dollar plans, I'm not super sure if you will be able to use them in the future if the token price is in the thousands for a bit bigger task...
If I were paying this from my own pocket, I'd definitely go with DeepSeek or local models and figure out how to make the best use of them.
I don't believe that this is a fact. How are you demonstrating that this is a fact?
When you talk about things like reparations or "land back" you're already cargo-culting in concepts and ideas that themselves need to be fleshed out in order to make a subsequent claim that a specific economic system is unethical. Someone can just argue all economic systems are unethical, how are you going to defend against that? And can you pay reparations for example without going back in all of human history and finding all cases of injustices and then tallying it up? Why pick an arbitrary point in time? Better yet, why not start in countries where slavery still exists instead of focusing on the west which led the world in abolishing slavery and created concepts such as universal human rights.
Even with respect to "eliminating food scarcity" - eliminate in what sense? All olive groves and grapevines and rice farms have to be destroyed and rebuilt to only build certain foods?
Dabbling in communism or other inhumane and authoritarian governmental systems is extremely dangerous. In the same vein as "extraordinary claims require extraordinary evidence", suggesting, as you did, creating an authoritarian government to create a utopia is precisely the same project of suffering and death that mass murderers throughout history have undertaken to abject failure. Thus, you need some incredible amount of evidence and theory to be able to even fairly suggest going down this path.
IOW, you don't really think the value of this work is really worth $4k.
> why would I pay to do my job?
The question is: how long do you think that your employer will be willing to pay for you and Anthropic, if you yourself said that if it were your money you'd put some time and effort into working with an open model?
I am not going to do the work of gathering the evidence for you, and I don't think this is the right venue for a debate on the topic.
If you don't have evidence I think it's mature of you to admit that and applaud you in doing so. We all like to just talk and don't have to always provide evidence for every citation or what not and it's fair to just say hey I'm just making this up and it requires further discussion.
I wonder what this question really means? Anthropic is useless if you don't know what to do with it. It's very useful if you do, and you can guide it to do the right things. Yes, it will for sure reduce the amount of people we need to hire. But we are always looking for hires who know what they do and can utilize agents to be faster.
But if you are asking how long an employer is willing to pay 10-20k per month per seat for Anthropic? I can't see this being feasible, and it will have to end at some point.