Anthropic says Alibaba illicitly extracted Claude AI model capabilities

“Distillation attack” are we joking here.

If anything these models should be compelled to be public since they have been trained off public data. What an absurd overreach to call this an attack.

It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.

I generally really like Anthropic’s work and models but stuff like this scares me for the future. We are positioning these companies to have too much power. The public’s life is getting worse while these companies consolidate power using data they stole from the public.

There's two basic kinds of distillation: 1) the massive [and dumb] method where you ask a question and use the answer as reinforcement (Black Box), and 2) more targeted distillation where you use one model to directly inform/train/guide another model (RLAIF).

The latter is basically fine-tuning the model with direction from another model. Thousands of businesses do this every day to fine-tune. This is almost certainly what the Chinese labs are doing, since it has a much better effect on the end result than just getting simple answers to simple questions.

These complaints of distillation are inflating the problem to make it sound worse than it is, because they want the USG to block/ban Chinese model providers as protectionism. They have already called for more export controls on chips (which is funny because DeepSeek v4 was designed to run on Huawei chips and now the other Chinese providers are following suit). But they can't come right out and say that, so their claim is that they're asking for more export controls because distilled models might not be as safe as their own. But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.

The hypocrisy of Anthropic complaining about "illicitly extracting its Claude AI model capabilities" and supporting the White House's accusation of China "stealing U.S. AI labs' intellectual property on an industrial scale" is hilarious.

Anthropic, OpenAI, Google, Microsoft, et al trained their models by ignoring the rights of copyright holders when harvesting whatever content they could. Now one of them is crying foul for another entity doing exactly what they all did?

Hilarious.

Reminds me a bit of the anecdote of Steve Jobs complaining about people ripping off the Mac GUI, in the mid to late 1980s, when he gave no public acknowledgement to the work done by Xerox on the Alto and Star operating system.

"you're trying to rip off what I've already ripped off!"

Crawl the whole Internet to build a gargantuan sized LLM and then complain you're being copied...

Here's what is happening:

Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices. They achieve this by reselling capacity from pooled Claude Max accounts, payments fraud, and also reselling the model output & reasoning chains to various Chinese labs. They are subsidizing model access in exchange for user logs and reasoning traces, which they then sell as training data, allowing them to operate below cost.

Claude and ChatGPT are both blocked in China. You need to use a VPN to access either, and you can't pay with a Chinese bank card. So most people who want access to Claude buy access via a reseller. It's the easiest and cheapest way to access Anthropic models in China.

These resellers operate tens of thousands of bot accounts, which is also why Anthropic introduced identity verification, to slow down the onslaught of bots.

Here's one token reseller, they're offering Opus 4.8 at a 93% discount below official API rates: https://yunwu.ai/pricing?provider=Anthropic

This is one reason why DeepSeek & GLM are priced so cheaply, they are competing with impossibly low token prices in China. They have to keep prices low, in order for people to use them.

I shared this story a few months back, but it never got any traction. It explains the token resale economy in China, it's an excellent read https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens...

>The strike by Alibaba is described as a "distillation" effort, which Anthropic has said involves training a less capable model on the outputs of a stronger one.

Claude used TB of content without permission to train their model and it was ok for them. Now someone else uses the output of a Claude model to train model and they cry foul.

I'm looking forward to the trial where Anthropic will have to disclose sources of their training data, and then explain why they are entitled to charging customers for using regurgitated training data but Alibaba which trains their models on Anthropic's models are not.

Should be fun.

Edit: clarification

I guess "paid to use our model" doesn't sound as sanction-worthy as "illicitly extracted .. model capabilities" and "attacked".

I guess we can say that Anthropic attacked and illicitly extracted data from WikiPedia, Reddit, Stack Overflow, etc, etc.

X.ai attacked and illicitly extracted data from OpenAI

https://techcrunch.com/2026/04/30/elon-musk-testifies-that-x...

Meta attacked and illicitly extracted data from LibGen

https://x.com/jason_kint/status/1879152507865485497/photo/1

And more generally the US-based AI companies have perpetrated a massive distillation attack on the entire human race.

Not that it makes any difference, but I wonder if Anthropic, while claiming that Alibaba "extracted Claude model capabilities", in fact have any clue what Alibaba did with their paid Claude responses. It would seem to amount to industrial espionage if Anthropic do know, although I expect they don't.

What exactly is illicit about what they did?

Legally, model output cannot be protected by IP laws whether domestic or international. The most they can hope for is civil relief which is a stretch given the literally illicit methods they used to train their models.

Ahtoropic got treated the same way it has been treating everyone else. This is the bed they made and now they, too, have to sleep in it.

For all the complaints about Anthropic many of you still give them your money! Stop using it. I don't care if they claim they are the best model. I stopped paying OpenAI and Anthropic 2 years ago once they started going for regulatory capture! They started whining once Llama3 was released and was good! Before the chinese models got strong.

This kind of systematic distillation by a competitor can allow them to fast-follow you and pick up capabilities.

If you've invested in expensive capabilities training, of course you don't want this, so it's in Anthropic's economic interest to hinder it however they can, and that's enough to explain their behaviour here.

Anthropic seems to genuinely care about safety though, which for the rest of us means not having models that enabling easier cyberattacks, targeted scams, and the rarer but more severe risks like people trying to create and release new pathogens. This means walking a tight line, especially as models become more capable, and often wrapping a model in layers of defences against misuse.

If those capabilities transfer to a closed competitor model, all bets are off in terms of whether the competitor will apply the same defences.

If those capabilities transfer to an open weight model, not only will there be no ring of defences around the model, any defences you put into the model itself can easily be stripped away. So although it's nice to have capable open models, it will increasingly bad for us all if open models keep fast-following closed model capabilities as they have been, at least until we have solved the active research problem of keeping them safe.

This is all to say that, however you might feel about Anthropic, we might still prefer that they can deter this kind of distillation for now.

> The strike by Alibaba is described as a "distillation" effort, which Anthropic has said involves training a less capable model on the outputs of a stronger one.

I don't see what's wrong about this.

> Anthropic said the campaign was conducted between April 22 and June 5, 2026, and generated more than 28.8 million exchanges with Claude through almost 25,000 fraudulent accounts.

What makes the accounts fraudulent? If they have paid the agreed price, surely it's fine? If they haven't paid, why did Anthropic provide them service?

So when Anthropic uses millions of copyrighted works to train their model, that's fair use, but when Alibaba uses Anthropic's model to train their own, that's infringement?

One thing I think about a lot is how these companies metered coding / work. They want the economy to go through them.

I just don’t see how the economy tolerates that. We’re already seeing people getting more conservative about their token spend. Even if Chinese open models went away, the pressure to create something else and put price pressure on the current duopoly will just intensify.

I see these companies are scrambling to find whatever moat they can. It’s not a good sign for them if regulatory capture becomes that moat.

Distillation is fundamentally impossible to protect against. All you can do is slow them down. Change my view.

Eventually these Chinese companies will release some extension like Honey, which will sit on top real, non-Chinese clients and send everything to China anyway.

It's over.

Today I learned I can both save on tokens and help Chinese labs to train better models. Will certainly go use scrapper APIs for everything that not contain security critical data.

Thanks for head up, Anthropic!

Isn’t this fair game? Didn’t these companies basically steal to make these models to begin with?

There is so much hot air and guff around AI, so please if you don't believe me verify yourself, but GLM 5.2 is "good enough" to replace Claude Code / Codex.

No it's not frontier, but it's beyond that point that Opus 4.5 hit where people started to really depend on Claude Code around last November time. It's also a fraction of the cost of a Claude Code subscription especially when you account for how high the usage limits are.

You get more usage than Claude Code $2400 a year tier for $1344.

That is a real threat (as opposed to the BS anthropic is trying to sell you in the article in the original post) to the western AI industry. Similar performance for half the cost and it's NOT ran by a US company - uh oh.

I suspect America is going to do what it always does, play a very dirty and underhanded game of blocking competition by trying to front some moral high ground as the reason.

Relevant article - https://www.anthropic.com/news/detecting-and-preventing-dist... (3 labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts). So extraction in this context is distillation.

While it is obvious to many, a modern LLM is built in roughly three stages: the foundation (pretraining) model, then SFT/supervised fine-tuning (distillation makes it easy), then the RL/RLHF stage on top (most effort-intensive). For today's reasoning models, RL/RLHF is becoming the most compute-intensive part.

Companies like Anthropic spent millions building those fine-tuning examples. A follower can shortcut that on both cost and time by distilling, and it will keep happening: every time the frontier lab climbs higher, others will find a way to shortcut the new gap. There's very little Anthropic can do beyond fraud prevention and blocking accounts that violate their terms of service.

On the policy question, I'm completely against banning Chinese models. I'm a heavy Claude Code user and I'll keep being one. But there should absolutely be price competition. China is eating the rest of the world for breakfast, lunch and dinner on manufacturing, and it did not help to ban them. Frontier pricing can't sit at 10x a capable competitor. It doesn't need to be at par either — demand is higher, and quality, trust, and fewer tokens to finish a task are worth a premium — but 4–5x is defensible.

Whoa, Antrophic,etc are really running afraid that their IPO's are gonna crash when people realize that the open models are Good Enough(TM).

So I'd put it at 30% that this is a ruse, say that Qwen 3.5,etc is tainted by training by them and start issuing DMCA takedowns to protect the IPO valuation (Or they'll hold off on that, getting a DMCA takedown could backfire spectacularly if others do that to them).

If the concern is that China is catching up on model capabilities (which is only a big deal if you lean in to adversarial geopolitical zero-sum thinking), the fact that they're using American models to train theirs should give people comfort that they're nowhere near the cutting edge

Oh wow it must suck to have an LLM creator rip off your IP for their own gain

Unlike Anthropic and OpenAI, companies like DeepSeek, Alibaba, z.ai open source their models which allows for true model to model distillation rather what you can do when the model is only accessed via an API with its reasoning chain hidden away.

What Alibaba is doing is that they are tuning and training their models based on usage data from someone accessing Anthropic's models; in Anthropic's terms of service that usage data does not belong to the end-user but to Anthropic and they are trying to elevate this breach of their tos to a national security issue.

To me the battle between open source and closed source AI is literally a battle between good and evil.

Between a dark future where computing is centralized, surveilled and controlled by one or two entities. And a lighter future where computing is de-centralized, principally in the hands of end-users, who are ultimately free to understand, tinker and build what they want.

While I appreciate the freedom and wealth of the west; on this point we are clearly heading down the wrong path.

This is a bit ironic, Anthropic complaining about a competitor using claude data to build its own product when Anthropic basically used all of human knowledge production to build claude, i don't think they paid every magazine, author, journalist, etc ...

This is almost standard practice in any competitive industry anyways. Disassemble your competitor's product, study it and try to reproduce / improve.

Evergreen, really, Anthropic's desperate screaming for government protection, aka pulling up the ladder after them. Nothing short of disconnecting global markets will work because the incentives are just too damn delicious

https://georgzoeller.com/blog/posts/us-ai-labs-love-the-ai-r...

> Anthropic said in a February posting that it had identified a campaign by Chinese AI startup DeepSeek ...

> It said DeepSeek's operation involved over 150,000 exchanges

That volume seems more like the number of requests 15 employees using Claude Code would generate in a month. It seems too small for a large scale model distillation campaign.

Wallace Shawn was in on the joke when he expertly delivered the original line. It seems like Anthropic has spent years and billions of dollars to recreate the entire scene.

But what will become of the princess in Anthropic's recreation?

As far as I know, American copyright law has ruled large language model output has no copyright status.

This is genuinely funny. The largest data thief of all times complaining about the stolen data being handed out to competitors by (paid?) accounts of its own product.

A partly insider on this.

I think Anthropic is just marketing / bluffing, because they don't even have the data.

They do distill the models, but they don't go to Anthropic, they just use platforms like aws bedrock, there are too many restrictions on Anthropic's own platform.

Sounds like just a case of pirates "illicitly" stealing from pirates. I don't really see anything ethically questionable there. I wonder if US corps will ever come out about all the resources used to train the original models and who they actually asked for permission when collecting data.

Anthropic has no right to cry about this when they train their models on the entire internet, which is not their content to begin with.

If it's not obvious yet, this technology wants to be free and shared. Stop trying to protect your mote and do the right thing.

Hypocrisy is a form of corruption.

Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.

The commercial goal is to avoid competition. One of the main worries for AI is "commoditization" which has come to mean "not a monopoly." To that end, it doesn't matter is the competitor is Chinese American or other.

Their motivation here is clearly protectionism. The argument they make to politicians is national security. The legal argument is IP-theft, violation of service agreements or whatnot.

This is all very dangerous. Commercial interests repackaged as national security can lead to armed conflict.

It’s hard to see how distillation is any different than how these models were created in the first place - siphoning up all human knowledge without consent, credit, or compensation

Repeatedly warn everyone that your models are so good they will wreck cybersecurity.

Complain/brag that chinese firms are illegally using the models and bypassing export controls.

Be surprised when your model gets banned by the government.

I see this as valid use, they are paying for the tokens to get this reasoning aren't they?

Obviously they didn't ask for permission when scraping all of libgen, reddit, all blog sites for FREE. When China pays for its use and does it I'm supposed to see it as some sort of problem?

Furthermore Chinese models getting better means we Americans might have the chance to use top tier AI without strict KYC built around it. Go Alibaba I say

thats brilliant - "we gonna take your job away from you, please start using our tools", "we stole the content to sell you, and now we are getting robbed, please feel sorry for us", what's next?

I fail to see what the difference between the distillation described in the article and the distillation described by Bartz vs Anthropic.

F Anthropic in the back port

I like Anthropic's models, use them regularly. However, it weighs on my mind that there is quite the irony of an LLM company complaining about someone stealing their stuff or using it in a way they don't like. The training data for these models is a massive gray area that they are hoping people seem to just forget about and move on.

That being all said, Anthropic seems to be a good company, I'd work for them, but they probably need to help themselves out of the spotlight. A little too much press coverage as of late.

I am not sure how it's OK for Anthropic to basically ignore copyright to train frontier models (using work owned by others without permission) while simultaneously claiming Chinese AI companies doing the same to them is illegal.

Thieves whining about thieves. They'll have to excuse me for having exactly zero sympathy.

"It [Anthropic] said DeepSeek's operation involved over 150,000 exchanges". In my humble opinion, a mere 150k exchange for an LLM could only be a benchmarking and not a distillation! I think the US companies should accept that after decades they have rivals surpassing them, just like they did Europeans almost a century ago.

I'll just leave it here: "Anthropic's downloading of over seven million books from pirate sites like LibGen constituted infringement, the judge ruled, rejecting Anthropic's "research purpose" defense: "You can't just bless yourself by saying I have a research purpose and, therefore, go and take any textbook you want."

https://www.joneswalker.com/en/insights/blogs/ai-law-blog/wh...

They trained their AI on their AI. Anthropic trained their AI on a bunch of copyright-protected works. Sucks to suck, Dario!

And all those reports of Claude when asked without a system prompt what its name was in Chinese it often would say Qwen or Deepseek, etc. I'd love Anthropic to say they aren't distilling and taking from every model out there, because I'm sure they are. As my mom would say, "the pot calling the kettle black." At least Alibaba and other Chinese companies are giving back to the AI community with detailed scientific papers on how their systems work and releasing open-weight or opensource models. I believe Anthropic has released nothing, and given that they had originally configured Fable to sabotage ML related work because only they can be trusted to do it safely, is just anti-science and anti-aligned with what I would consider good human values. They are way too sanctimonious and I don't trust them at all.

It sounds like Anthropic is eagerly trying to show to USG that they are willing to heavily monitor ‘foreign adversaries’ on their platforms.

This combined with no implementation of KYC makes it seem like they want to find a middle ground with Fable where its off of export controls but they promise to prevent China and specific others from using.

So said the guys who "extracted" knowledge from all pirated books

If you have openrouter do this little experiment: Go to https://openrouter.ai/chat. Select a few models, but customize them to have an empty system prompt.

Then ask: "你是什么模型?" ("What model are you?" in Mandarin).

My result after trying only three times: Sonnet 4.6 says it's DeepSeek, while Opus 4.8 says it's Qwen. The second time around Sonnet said it was Anthropic Claude.

Are Chinese companies currently complaining about Anthropic distilling their models?

AI companies stole the internet.

They should collaborate and come up with ways to give back to society rather than competing and complaing.

Thieves can't complaint about what they stole.

Pretty much objectively puts you/your company in disadvantage if you’re not using frontier models.