Post-Mythos Cybersecurity: Keep calm and carry on

The genie is out of the bottle, folks. You can find some pretty good vulnerabilities even with models like Deepseek V4 Flash.

I've been brewing on this topic since the Mythos preview was announced. As Mythos finally got released, then banned, then released again under U.S. government control, it was time to flesh it out and use it as a way to exit the lurker zone on HN!

This is a great read! I never realized the scale of the effort to find that BSD vulnerability. It helps put things in perspective.

So funny both this and https://news.ycombinator.com/item?id=48698617 are on the front page at the same time.

But what if Opus 7.1 is real smart - as what Mythos was promised to be?

Or an Opus 9.0

Will Cybersecurity ever start to be an issue?

The fear porn around this all has been horrible. I work in Cybersecurity and Mythos is all the vendors will talk about because they want to sell something. It started the day of the announcement which is what told me it was all BS. They had no information about it yet would happily tell me about all their solutions for it.

Anyone in my profession worth a damn will tell you the vast majority of security issues are related to bad configurations and bad practices + accidents and bad luck. Vulnerable software is a problem but basic defense in depth will either mitigate or drastically reduce attack surface. Mythos does nothing to change that.

The technical debt at companies is the largest security threat. That, and layer 8, which is the people factor. The amount of silliness I've seen from people and companies as a whole is truly hard to verbalize. I've seen banks that gave every employee, from the janitor up to the CEO, domain admin access because of a crappy application written in 2004 that they never updated. I've seen a Fortune 250 company write its own internal routing protocol that was basically clear text traffic, dated back to the 1990s, and was never retired because, why not. I've seen contractors infect entire fabs in the chip industry because they plugged an infected USB stick into a 30-year-old tool that hadn't seen an update in over 20. Then when the fab came back up, they did it again the next day.

Ultimately, Mythos is just another tool in the toolbox. It's great to find new vulns but it is incredibly short sighted to think it will move the needle in any meaningful way in the security industry.

it all looks suspicious:

  - June 1st 2026: Anthropic files S-1 paperwork with SEC to get ready for IPO

  - June 2nd 2026: Anthropic announces expanding "Project Glasswing" to let people use their new model to enhance security of existing systems

  - June 9th 2026: Anthropic releases Mythos model

  - June 12th 2026: Model gets export regulations placed on it by US Gov

  - June 26th 2026: US gov announces they will let some companies use new model

  - August 2026: Anthropic goes IPO

The timing of all of this just seems to be a play to pump the stock. The reality is that in six months GLM-5.3 will be released open source with comparable functionality to their Mythos model. They are trying to cash in before that happens.

I would not be surprised if the people pulling the strings in the US government, the ones who actually put the export restrictions on Anthropic, have purchased stock in the company so they can artificially pump it up. I would bet money on it.

The actual story here: The Trump administration is going to choose which organizations get access to which AI models when.

This will establish an asymmetry where the chosen organizations get to secure their stuff and break other people’s systems with each new model release.

If you believe the “good guys” will be the ones given asymmetric offensive access, then you’re either severely misinformed or support things like ethnic cleansing (which these models are already being used for).

Mythos’ slightly higher performance is a nothing burger. It is not even the current top model. According to Anthropic, GPT 5.5 is!

Personally, I’m switching to open weight models asap, and probably will start sending money to Chinese vendors since they have values more compatible with western democracy.

Comments are saying the vulns in that thread aren’t very impressive.

Those look like mostly nonsense/trivial findings

I tend to agree, but open weight models still seem to be lagging behind in terms of capability, even recent ones like GLM 5.2. If anything, I hope the sudden, unpredictable changes of policy will make EU companies think twice before putting all their eggs in the basket of the same AI vendors, all US based. Vendors going back on their retention policies, like they did with Fable 5, or plainly cutting the service without notice should be a gigantic red flag for your business continuity.

It's maddening how the corporate world can get shy of using any of those Chinese models, just because they are Chinese. This kind of FUD makes little sense when the inference is done in-house or by an EU/US cloud provider.

Companies have never secured their stuff and it's not because they didn't have access to Mythos. No one cares and breaches don't cost them money or customers. If I sound cynical it's because I am.

There's no functional difference between

"Hey npm says this is vulnerable, we need to fix it!" / "Nah, later."

and

"Hey Mythos says this is vulnerable, we need to fix it!" / "Nah, later."

"Released" is doing some heavy lifting here.

We are already using software that is ancient, with many vulnerabilities that are already public, and we use insecure software more than we care to admit. If Mythos is gonna help with that, it's gonna make finding (not discovering) these vulnerabilities easier, because it already has the knowledge but not enough intellect to come up with new ones. The same applies to other LLMs.

Forget whether it is Mythos or GPT 5.6, or any other specific model. SOTA models have tools and likely have the knowledge and capability to create zero days from nearly every discovered and many undiscovered vulnerabilities. In the wrong hands, they can generate and deploy malware and submarine code that would go undetected behind secured systems. Add in the ability to clone voices and create mass social engineering campaigns.

Yet "Just another tool in the toolbox." I mean, that's not wrong!

Fair, let's say a heavily staggered come back.

I was actually pleased to see OpenAI openly (although timidly) complaining about the situation in their latest announcement, framing it as an unsustainable system.

One can only guess the outrage in the news if the Chinese government had been the first to pull this kind of stunt.

You think this is not happening with open weight models?

> outrage in the news if the Chinese government had been the first to pull this kind of stunt.

I suspect that the Chinese government "pulls this kind of stunt" often but just nobody ever hears about it because their society is not free to complain about such a thing publicly.

You also have government apparatchiks influencing almost every corporate board, not just the state owned enterprises. Every private company that employs at least 3 CCP members is required by law to form a party committee within the company to represent party interests. In smaller companies, they will often simply coordinate with local governments on securing permits, etc, but I’m sure national party leadership communicates directly with the committees at the AI labs.

> their society is not free to complain about such a thing publicly

wuh?

It seems our government still has a lot to learn.

Ah so you can see the future of discourse in the US

> I’m sure national party leadership communicates directly with the committees at the AI labs

They do now.

Top AI researchers in China are barred from getting an exit visa [0] (the PRC has done this for other employees as well such as Foxconn China employees who were working on shifting Apple supply chains to India [1]), and "AI Safety" from a national security perspective has been codified as party policy now [2].

The leading Chinese AI labs are also shifting away from open-source AI for commercial reasons, as can be seen with the org changes at Alibaba with the axing of the Qwen team [3][4].

That said, these are called out but it's all in Putonghua and no one on HN actively reads or follows what happens within China. I've noticed most HNers now source information from Reddit which has been dealing with DRAGONBRIDGE deluge for a couple years now, and I've noticed similar tactics being applied on HN as well.

In all honesty, I've found HN's signal to noise ratio to have tanked severely since 2022. Silver lining is that fewer people that matter are using it as much, so the IW impact is limited.

[0] - https://www.bloomberg.com/news/articles/2026-05-26/china-exp...

[1] - https://www.bloomberg.com/news/articles/2025-01-17/china-mov...

[2] - http://theory.people.com.cn/n1/2026/0616/c40531-40741238.htm...

[3] - https://m.guancha.cn/economy/2026_06_12_820253.shtml

[4] - https://www.ft.com/content/b39da303-3188-447b-8b65-3dd8dad8b...

Democracy cannot be taken for granted. There are always tendencies to drift toward authoritarianism. China is authoritarian, full stop. They are capitalist, not communist, but authoritarian. Keep that in mind when discussing what comes out of China.

We actively take it for granted in the US AND we’re actively watching it slip away. No one seems to give a shit.

CISOned Opinions

As some of those fears, uncertainties, and doubts about Mythos are starting to feel real, what can we do about it? Paradoxically, I believe that little needs to change in what we’ve been doing for years.

I have seen a lot of distressed debate in the cybersecurity field following the announcement of Claude Mythos Preview. It was announced as a game changer in the field, miles ahead of its league and opening a Pandora’s box of fully automated hunting and exploitation of zero-days.

Since then, Mythos and its safeguard-heavy equivalent, Fable 5, got released, only to be taken away shortly after. Let’s take the opportunity to reflect on what this model brings and how impactful it is to the industry.

Fear, uncertainty, and doubt fuel the Cybersecurity industry

Anthropic has always had a taste for dramatic phrasing in its PR. Every major model release is accompanied by concerns about its safety, calling for regulation or for a pause in research before we reach a point of no return. Mythos is no exception to this trend and was disclosed in April without a public release. Instead, Project Glasswing was announced, limiting access to the model to 50 organisations, later expanded to 150 entities. Some of those lucky few corroborated the alarmist statements from Anthropic. They announced hundreds of vulnerabilities detected thanks to Mythos. One of the most impactful articles on the topic was the evaluation from the UK Government’s AI Security Institute. Mythos was the first model to ever succeed in “expert level tasks”. It was also the first of its kind to complete “The Last One”, a cyber range testing the entire attack chain from reconnaissance to full network takeover.

Reading the article in detail paints a less dramatic picture. While Mythos is a step up from previous models, progress in this area has been very gradual. We can see GPT-5.4, or even Opus 4.6, not so far behind on their Advanced CTF Challenge. The same can be said of Opus on their cyber range. Those benchmarks can also be quite far from a realistic enterprise environment, at least for companies with a mature cybersecurity programme and a dedicated SOC. As the article stresses, “They lack security features that are often present, such as active defenders and defensive tooling. There are also no penalties for the model for undertaking actions that would trigger security alerts.” No doubt such models would sometimes be extremely noisy and clumsy while attempting reconnaissance tasks or pivoting into the target’s information system.

“Sure, but what about all those critical vulnerabilities the model can find offline? They could then be exploited by attackers as powerful zero-days”, you may ask. This aspect was the main marketing argument coming with Project Glasswing, with examples such as a “27-year-old vulnerability in OpenBSD” or a “16-year-old vulnerability in FFmpeg”.

Security professionals would probably smirk while reading such statements. Highlighting that a vulnerability is old enough to drive is a very common clickbait trick for CVE announcements, second only to the classic “CISA orders feds to patch X”. A vulnerability being decades old is not that uncommon in open source products with hundreds of thousands of lines of code. Most of the time, it just means nobody skilled enough to spot it ever looked in this area before. Old bugs are more valuable as they impact more versions of the affected software, but that has nothing to do with how difficult they were to find in the first place. What's true, however, is that AI-assisted discovery will increase their prevalence.

Mythos, only a gradual improvement of the older models?

The biggest change with Mythos is the potential for organisations with deep pockets to scale up and afford those exhaustive searches. The blog post from Anthropic’s red team gives more insight into how they achieved such results, and it was definitely costly. The model was run several times on most source code files individually. It took a thousand runs through their scaffold to get the BSD bug, for a cost of approximately 20,000 USD. The entire Glasswing project has an allocated token budget worth a hundred million dollars. Does it bring new risks? Yes, but for actors who probably already had advanced cybersecurity resources in the first place, not for the average script kiddie.
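As a back-of-the-envelope check on those figures, here is a minimal sketch. The per-run cost and the number of equivalent searches are derived by me, not stated by Anthropic, and the second one assumes (unrealistically) that the whole budget goes to searches priced like the BSD one.

```python
# Figures quoted in the paragraph above.
RUNS_FOR_BSD_BUG = 1_000
COST_BSD_BUG_USD = 20_000
GLASSWING_BUDGET_USD = 100_000_000


def cost_per_run(total_cost_usd: float, runs: int) -> float:
    """Average cost of a single run through the scaffold."""
    return total_cost_usd / runs


def equivalent_searches(budget_usd: int, cost_per_search_usd: int) -> int:
    """How many searches of that price the budget could pay for (assumption)."""
    return budget_usd // cost_per_search_usd


if __name__ == "__main__":
    print(cost_per_run(COST_BSD_BUG_USD, RUNS_FOR_BSD_BUG))             # 20.0
    print(equivalent_searches(GLASSWING_BUDGET_USD, COST_BSD_BUG_USD))  # 5000
```

Roughly 20 USD per run is cheap; it is the thousand-fold repetition across a whole codebase that puts this out of reach for casual attackers.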

Earlier models might have spotted a portion of those vulnerabilities, had they benefitted from the same thorough experimentation. It’s hard to make an apples-to-apples comparison, as the details given by Anthropic on how Mythos was run (or how many times it ran for each finding) are scarce. Some tried to replicate the concept with fair but more cost-efficient alternatives and got some convincing results. In a nutshell: in the absence of Mythos or even Opus models, DeepSeek is decent in the cloud-hosted world, while Gemma 4 and Qwen 3.6 punch well above their weight in the self-hostable category, finding about half the vulnerabilities Mythos spotted in the benchmark.

However, I wouldn’t go as far as Aisle, who claimed the secret is in the harness, not the model. While they did manage to “detect” many vulnerabilities initially discovered by Mythos using much smaller LLMs, none of those models were capable of making valid exploits. The capacity not only to raise warnings but to actually prove exploitability is definitely an edge only shared by models of the Mythos class. This also seems to solve one of the biggest downsides of earlier AI-led bug hunting: false positives. As pointed out in the initial Project Glasswing update, Mozilla claimed an extremely low rate of false positives in their 271 findings. In the same vein, Cloudflare described the false positive rate as “better than human testers”. Only time (and broader access) will tell if those claims hold up. Otherwise, the average organisation will inevitably drown in a sea of cybersecurity noise while using the tool.

OpenAI catching up while the US gov halts Anthropic in its course

In the middle of this madness, we got an unexpected “break” from the US government as they decided to block Fable/Mythos for all non-US citizens, including those on US soil. An impossible task forcing Anthropic’s hand into turning off the offering altogether. Nobody knows how long this will last.

As a side note, there is a certain irony in watching Anthropic reap what it has spent years sowing: more government involvement to control usage and slow the AI race. This was delivered in the bluntest possible way.

Meanwhile, OpenAI continues to progress in this area with their GPT5.5-Cyber and the Codex Security plugin. They have their Glasswing equivalents, Project “Daybreak” and “Patch the Planet”, but toned down the fearmongering aspect and focused on the defender side. This is also a controlled release, likely so as not to poke the US regulatory bear. It’s safe to assume they will not unleash those products to their entire client base before the Anthropic situation settles or a non-US competitor fills the gap first. I can’t help but find this approach frustrating. The average company can’t access 5.5-Cyber but big cybersecurity firms can, only to sell it back to their own clients at a premium. In other words, artificial scarcity disguised as responsible deployment.

Let’s use this slowdown to regroup and focus on what we can do to hold the fort when the storm comes back.

Update (2026-06-27):

Things are moving fast. OpenAI is releasing a new family of models, Sol, Terra and Luna, with a strong PR focus on their cybersecurity prowess and comparisons to Mythos. As with 5.5-Cyber, the models are biased toward defence instead of building exploits. Those safeguards don't seem to be enough for the U.S. government, which still wants to vet which institutions get access to the new models. The same now applies to Anthropic, which is opening Mythos to a hundred US institutions to start with.

As some of those fears, uncertainties, and doubts are starting to feel real, what can we do about it? Paradoxically, I believe that little needs to change in what we’ve been doing for years.

“We already have AI at home”

Us mere mortals might not have access to Mythos and ChatGPT 5.5-Cyber, but what’s available is not completely useless either. Opus 4 is still very capable on the Anthropic side, as is GPT-5.5 with the Codex Security plugin for companies that can obtain the necessary approval. On the FOSS side, harnesses like Strix can already achieve a lot, combined either with local models like Qwen/Gemma or with API-based inference providers for beefier ones like DeepSeek and GLM.

Keep working on your vulnerability management programme

The rate of CVE releases has been steadily increasing across the board for years. It didn’t wait for Mythos to get out of hand. I have yet to see a company that patches every meaningful vulnerability in less time than it would take a motivated attacker to weaponise it. Besides obvious tuning knobs like increasing resources and priority, we now have “no choice but to make choices”, ideally the good ones. Triage and contextual prioritisation are key to keeping vulnerability management (VM) sustainable, and this is also an area where AI assistance could be advantageous. Existing “vulnerability scores” from major providers often lack context from our own information system. They may know how bad a vulnerability is in theory, whether exploits are available and whether the impacted software exists in our environment. Yet they typically ignore other key aspects, like whether the asset is business-critical, easily reachable, or protected by compensating controls. Making sense of gigantic amounts of inconsistent text and tabular data is exactly where large language models can shine.
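To make the idea of contextual prioritisation concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Finding` fields, the multipliers and the placeholder identifiers are mine, not taken from any vendor’s scoring model, and real weights would need tuning against your own environment.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Finding:
    finding_id: str             # placeholder identifier, not a real CVE
    base_score: float           # vendor severity score, 0.0 to 10.0
    exploit_available: bool
    business_critical: bool     # does the asset matter to the business?
    internet_reachable: bool    # can an outsider reach the service?
    compensating_control: bool  # e.g. WAF rule or network isolation in place


def contextual_score(finding: Finding) -> float:
    """Adjust the vendor score with local context (illustrative weights)."""
    score = finding.base_score
    if finding.exploit_available:
        score *= 1.5
    if finding.business_critical:
        score *= 1.5
    if finding.internet_reachable:
        score *= 2.0
    if finding.compensating_control:
        score *= 0.5
    return score


def prioritise(findings: Iterable[Finding]) -> List[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=contextual_score, reverse=True)


if __name__ == "__main__":
    isolated = Finding("EXAMPLE-A", 9.8, False, False, False, True)
    exposed = Finding("EXAMPLE-B", 6.5, True, True, True, False)
    for finding in prioritise([isolated, exposed]):
        print(finding.finding_id, contextual_score(finding))
```

In this example the “critical” 9.8 sitting on an isolated, protected asset ends up below the “medium” 6.5 on an exposed, business-critical one, which is the whole point of adding context.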

Reduce the attack surface

The best way to protect against a vulnerability is to not have it in the first place. Deactivating what you don’t need is a well-known hardening technique but, let’s be honest, vastly underused in all but the most mature corporate environments. Life got easier in recent years with the surge of microservices and, more generally, container-based infrastructure. If you’ve not already looked into this area, I advise you to start with initiatives offering minimal containers, also called “distroless”, like the original Google project, Docker Hardened Images (DHI) or Talos Linux for Kubernetes. The Windows side has “Server Core” as a less extreme variant.

Give more layers on the cybersecurity onion for the LLM to peel down

A defence-in-depth approach is getting more important than ever. If any security tool guarding the boundaries can fall on the zero-day sword any day, then it’s vital to have additional checkpoints on the critical path to slow down the intrusion. To give a few examples, you can add context-aware proxies and privileged access management gateways to your VPN/network segmentation, phishing-resistant MFA to all authentication attempts, etc.

Another defence-in-depth strategy worth revisiting is decoy systems like honeypots and canary tokens. If we generalise LLM behaviour from other areas, we can only assume early AI intrusion models to be clumsy, noisy, and candid in their approach, and thus very likely to trigger those traps.
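The detection side of such a trap can be very simple. Below is a minimal sketch, assuming authentication events are already parsed into records; the decoy account names, the `AuthEvent` shape and the sample IP addresses (from the documentation range) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical decoy accounts: no legitimate user or service should ever use them.
CANARY_ACCOUNTS = frozenset({"svc-backup-legacy", "admin-test"})


@dataclass(frozen=True)
class AuthEvent:
    username: str
    source_ip: str
    success: bool


def canary_alerts(events: Iterable[AuthEvent]) -> List[AuthEvent]:
    """Return every event touching a decoy account, successful or not."""
    return [event for event in events if event.username in CANARY_ACCOUNTS]


if __name__ == "__main__":
    sample = [
        AuthEvent("alice", "192.0.2.10", True),
        AuthEvent("svc-backup-legacy", "192.0.2.66", False),
    ]
    for hit in canary_alerts(sample):
        print(f"ALERT: decoy account {hit.username} used from {hit.source_ip}")
```

Failed attempts are flagged on purpose: since nobody legitimate should touch a decoy, any contact at all is a high-confidence signal for the SOC.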

Zero Trust to the rescue

The above points could all be encapsulated in a more comprehensive programme toward zero trust principles: verify explicitly, use least-privilege access and assume breach. Fifteen years after Google’s BeyondCorp, those principles have technical implementations accessible to everyone, with most SASE vendors offering their own spin on the concepts. Context-aware proxies, also called Zero Trust Network Access gateways, often allow enforcing pre-authentication before a client gets line of sight to the targeted systems. It doesn’t matter if your software is vulnerable to unauthenticated RCE if the attacker cannot reach the service in the first place.
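The decision such a gateway makes can be sketched in a few lines. This is a toy illustration under my own assumptions, not any vendor’s implementation: the policy table, the `AccessRequest` fields and the user and service names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

# Hypothetical least-privilege policy: which user may reach which service.
ACCESS_POLICY: Dict[str, FrozenSet[str]] = {
    "alice": frozenset({"wiki", "payroll"}),
    "bob": frozenset({"wiki"}),
}


@dataclass(frozen=True)
class AccessRequest:
    user: Optional[str]     # None means the caller has not authenticated yet
    device_compliant: bool  # result of a device posture check
    service: str


def is_forwarded(request: AccessRequest) -> bool:
    """Decide whether the gateway lets traffic through to the backend service."""
    if request.user is None:
        return False  # verify explicitly: no identity, no line of sight
    if not request.device_compliant:
        return False  # assume breach: an unhealthy device is not trusted
    # Least privilege: only services explicitly granted to this user.
    return request.service in ACCESS_POLICY.get(request.user, frozenset())


if __name__ == "__main__":
    print(is_forwarded(AccessRequest(None, True, "payroll")))
    print(is_forwarded(AccessRequest("bob", True, "payroll")))
    print(is_forwarded(AccessRequest("alice", True, "payroll")))
```

An unauthenticated scanner never reaches the vulnerable service, so the exploit has nothing to talk to, whatever model produced it.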

This mindset should not only apply to technical controls but also any process with human resources in the loop. AI dramatically increased the potential of social engineering attacks, making it trivial to generate convincing messages or impersonate key personnel, even with audio and video. Verifying explicitly can become extremely challenging for your Customer Service or Helpdesk teams if they are not properly trained on those new capabilities.

Wrapping it up

Mythos’s cybersecurity prowess is real. The progression from previous models might be more linear than the initial PR implied, but the improvement is indisputably steep, especially when it comes to producing working exploits. Let’s keep a cool head and leverage the unexpected pause in Mythos availability to regroup and prioritise the right projects. Anything that reduces the likelihood of a vulnerability being exploited is worth taking:

  - Don’t leave LLMs exclusively to attackers. There are many areas where we could leverage AI for our defence, from incident response support to agent-based security reviews.
  - Improve time to patch on what matters by improving vulnerability management processes, especially context-aware prioritisation and triage.
  - Reduce the attack surface, both in what’s deployed, by trimming down our server images, and in what’s reachable, by enforcing pre-authentication via zero trust network access.
  - Keep adopting a zero trust mindset when deploying services: assume breach, verify explicitly and follow least-privilege principles.
  - Add traps on the path for AI-assisted attackers to trip into and alert your SOC. LLMs have a lot of bias: they tend to repeatedly leverage the same techniques and can be incredibly candid in their approach. Let’s use it to our advantage!

Mythos did not invalidate our existing cybersecurity priorities, but it raised the cost of ignoring them.
