This stands out to me, and perhaps speaks more broadly than the article itself. I’m sure this has been in the spotlight before, but it's well put, and I think it applies to many areas.
This will basically become true for everything.
Many CTFs have switched to a dual-leaderboard format recently, one for "agentic teams," one for the rest. If all you care about is "learning" and imaginary internet points, you can just participate as a human team and adblock the AI scoreboard, and maybe lobby CTFTime into splitting their rankings as well.
The solution is just to make CTFs harder, but when do CTFs become too hard? Maybe the problem is that 'hard' CTFs are fundamentally too 'simple': just a logic chain and an exhaustive bruteforce towards a solution, since there really are limited ways to express a solution in plain sight.
Or maybe human creativity has been exhausted and we're not so limitless as we thought. Only time will tell.
I had another idea spring to mind: we could hide two flags, one normal and one that could only be found by AI agents, not by humans or tools written by humans.
Playing CTFs with my mates used to mean sitting there for hours tackling a challenge until another mate joined, had a look with you, and you solved it together in 30 minutes, which is the most rewarding learning experience there is. Nowadays a mate joins in, throws the clanker at it, and it's solved in 5 minutes. Ask how it worked and you always get the "yeah idk what it did, but who cares, here is the flag" response.
Same for creating challenges. Whenever I ask for writeups, or whether people solved it differently, I usually get the "yeah idk, clanker solved that one" response, which takes the fun out of it.
So yep, this CTF format is definitely dead, mainly because of the strong competitiveness and the prizes. Those encourage people to cheese challenges, and back then solving them differently was fine because you still had a creative out-of-the-box thinking moment, but nowadays with AI there is no brainpower needed, no cheesing needed, no human needed. As you mentioned, it's pay to win.
My two cents is that the 24/7 CTFs will get more traction, as the scoreboard doesn't matter there and simply doesn't win you any prize.
I thought code golf would take longer for AIs because there's so little training data (it's more niche), but we're seeing AIs starting to match expert humans there too. Sucks because golf has been my favorite type of programming puzzle.
It's crazy how far AIs have come in problem solving ability.
Exceptions for cases where the acronym is just so well known that a lot of people don't even know what it stands for even though they know the concept well. I recall one corporate training I was sitting through and they used the term "Border Gateway Protocol" and it took me a half beat to think through "oh, you mean BGP?"
Thanks!
We’ve figured out the human replacement pipeline it seems, but we haven’t figured out the education part. LLMs can be wonderful teachers, but the temptation to just tell one ‘do it for me’ is almost impossible to resist.
Still has no mention of AI, but that will likely change as it increasingly dominates the competition.
As I don't know much about the CTF scene, I looked for other takes on this topic.
Here's an article from 2015 about how tool-assistance already changed CTFs:
> Individual skill will undoubtedly be a factor next year. But, I'm left wondering whether next year's DEFCON CTF will tell us anything more than how well-developed each team's tools are (and how well they can interpret the results).
https://fuzyll.com/2015/ctf-is-dead-long-live-ctf/
But there are quite a few recent (2026) articles with the same core message as in the original article, e.g., https://blog.includesecurity.com/2026/04/ctfs-in-the-ai-era/ or https://k3ng.xyz/blog/ctf-is-dead
And here's someone explaining how Claude Max allowed them to win CTFs:
> I had always been interested in CTF as one of the only ways people could compete and show off their skill in coding/problem solving on a global scale. It was just too difficult and didn't make sense for me to learn the fundamentals as an electrical engineer. As time went on, I got better and better, and it was hard to tell whether it was because of experience or if it was because of improvements in AI.
> I accomplished my goals, and for that reason I'm quitting CTF, at least for now. [...] I'd like to think I highlighted the problem before it became a bigger issue. So, how do we fix this? Teams and challenge authors losing motivation is not good. CTF dying is not good. AI bad. Or is it?
https://blog.krauq.com/post/ctf-is-dying-because-of-ai
The only article that saw LLMs as a non-negative force for CTFs was this one. Fittingly, it sounds like LLM output ("Let's be honest", "This is where things get interesting.") and only contains hallucinated references.
https://en.wikipedia.org/wiki/Capture_the_flag_(cybersecurit...
Well, actually, I get it. In cycling, motor doping (putting a hidden engine into the bike) seems more offensive than regular doping. I think this is because there is a continuum from eating well to taking supplements to injecting stuff, but having an engine breaks a fundamental idea about cycling. Similarly, hacking is about cleverly abusing the rules.
You could even go so far as to say anything loaded onto your computer is fair game, but nothing more than that (certain competitive programming competitions, for example, allow unlimited paper material; for CTFs you probably need much more than that, hence electronic).
They may as well be the human equivalent to what LLMs currently are.
I do not mourn these people, as they’re usually the most arrogant types. I hope for their sake they adapt.
Not as easy logistically...
The same article talks about CTF skills both as a way to learn security best practices and, separately, as a sport.
In reality it was all about learning an extremely important skillset (securing/attacking software and systems) that is getting automated.
The real thing the author seems to be frustrated about is that AGI is arriving in computationally verifiable domains first, and a large part of his skillset has been taken over.
I never got super into security but it gave me the confidence to play in the same field and lose the stupid aura I had that somehow "rich americans" would be better than me at everything because they had better universities or because of Hollywood or something.
Sad that another cool thing is lost to AI but I guess kids will learn in other ways.
My fear is that they never get to the level they need to be at to create good software, even with the help of AI. So, although an expert with AI can create great software, that is not where we end up. Instead we will have vibe-coded messes by people who barely have any grasp of what is going on.
(The author of the piece understands this; I think they're broadly right, though I think these games will find other ways to incentivize participation without the now-meaningless leaderboards.)
I just did a CTF where I was in the top 10. It was the first CTF I completed, and I used AI because the rules permitted it. That said, I couldn’t solve all the challenges.
But yes, it was significantly easier now than when I last attempted one. Even solving manually, with AI-assisted assembly interpretation, was much easier.
The matrix was always the same and the challenge was clearly designed so that the point was being able to read anything at all, not knowing how to invert a matrix, so I asked the creator what was up.
He told me that there were tools that would trace input values until they reached a comparison instruction, then print what they were compared against. Therefore it was necessary for every deobfuscation challenge to scramble the input in some way too complex for these tools to undo, before comparing it. Hence the multiplication by a pseudorandom matrix.
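The anti-tracing trick described above can be sketched in a few lines. This is a hedged, illustrative Python version, not the original challenge: the matrix size, seed, and key bytes are all made up. The point is that a tool which watches comparison operands only ever sees the scrambled values, never the key itself.

```python
import random

N = 8
rng = random.Random(1337)  # fixed seed: "pseudorandom" but reproducible, as in the challenge
MATRIX = [[rng.randrange(256) for _ in range(N)] for _ in range(N)]

def scramble(data: bytes) -> list:
    # multiply the input vector by the matrix, modulo 256
    return [sum(m * b for m, b in zip(row, data)) % 256 for row in MATRIX]

SECRET = b"FLAG-KEY"       # hypothetical 8-byte key
TARGET = scramble(SECRET)  # what the binary would embed instead of the key

def check(user_input: bytes) -> bool:
    # the comparison instruction sees only scrambled values
    return scramble(user_input) == TARGET

print(check(b"FLAG-KEY"))  # True
print(check(b"AAAAAAAA"))  # False
```

Recovering the key from `TARGET` requires actually inverting the transformation, which is exactly the step a naive trace-to-comparison tool skips.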
The point is, cheating tools aren't new.
This is like someone complaining that making machine parts has been ruined: Skillful craftsmen used to make them by hand using manual tools!
Nowadays the CAD/CAM/CNC cheaters have almost completely automated the whole thing. How is the next generation of craftsmen going to learn how to craft a gear by hand when the process of gear making has been reduced to pressing start on a CNC machine?!
See what I mean? Sorry, I think this article is just Luddite. I can empathize with the pain of your beloved craft basically being rendered obsolete by new technology, but the process can neither be stopped nor is it bad in general.
The manual skills you trained with CTF puzzles are now simply no longer relevant. (Field-specific) "AI orchestration" is the new cybersecurity skill if LLMs really have become this good, and what the author used to do manually now has the same value as being able to craft a gear by hand.
In our own trainings (AI agents for security, and a graph masterclass), we ended up leaning into it. For example, we ship a skills bundle. There are upsides: less code-forward participants can go further and appreciate that, and there is less of a gap between high-level concepts and successful hands-on work. But at the same time, manual work builds a lot of intuition and knowledge that gets missed in auto modes.
Explicit ELO measurements with some cheating detection. AI assistance wholly banned. As you climb the ELO ladder, detection gets more onerous. At the top level during online events, anti-cheating teams require the use of both monitoring software and multiple cameras.
Idea is that you can cheat pretty easily at the lowest levels but it gets less easy the higher you go. This allows for better feeding into the truly elite competitions.
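For what it's worth, the rating mechanics behind such a ladder are just standard Elo. A minimal sketch (the K-factor of 32 is an assumption; real federations vary it by rating band):

```python
def expected(r_a: float, r_b: float) -> float:
    # probability that player A scores against player B, given the rating gap
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    # score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss
    e = expected(r_a, r_b)
    new_a = r_a + k * (score_a - e)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e))
    return new_a, new_b

# Equal ratings, A wins: A gains 16 points, B loses 16
print(update(1500, 1500, 1.0))  # (1516.0, 1484.0)
```

The anti-cheat escalation layers on top of this: the ratings themselves are cheap to compute; catching the engine users is the expensive part.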
I think chess’s very firm stance that AI is never allowed in competition (neither online nor in person), rather than CTF’s acceptance, was the right call.
On the other hand, CTFs are fundamentally a game and a competition, which are supposed to be fun and to measure and improve one's skill. So when I let an LLM generate the entire solution for me, what's the point anymore? I did not learn anything. I did not work for that place on the leaderboard; I just copied the solution. And worst of all, I did not have any fun. It's boring.
So how does using AI as a solver not feel like cheating?
What am I missing here?
I've seen that exact font and color scheme a dozen times in the past few weeks.
It's an incredibly exciting time in security research in my humble old man opinion.
I think the cadence of new exploits is perhaps a better measure of that than subjective takes from anyone, regardless of experience.
These models seem completely unbeatable only in the ads. There are 100+ examples of someone feeding them Hindi Yoda-talk in Morse code and watching them go nuts. The reason the labs are going so hard on PR and marketing here is that they know it is just a matter of time.
Do you publish it somewhere? Here's a sample of my js obfuscator output: https://gist.github.com/Trung0246/c8f30f1b3bb6a9f57b0d9be94d...
So, in fact, you must not beg to have authors include courtesy definitions for you. That's not reasonable. Instead, you should simply ask here, on the thread, without complaining about the article.
Basic rule: define every abbreviation when it is first used.
Personally I have never, ever heard that concept referred to by the initialism. Granted, it's almost never come up in my circles, so... shrug
More generally, not every piece of writing is meant for every audience. If someone writes a blog post about CTFs aimed at people who like CTFs, nobody in the target audience needs to have CTF explained to them. Ultimately HN is a link aggregator, but sometimes it's a bit like eavesdropping on a conversation: when you are just listening in, you don't always get the full context.
I think you only wanted clarification of CTF (Capture the Flag) and not AI (Artificial Intelligence) and not GPT-4 (Generative Pre-Trained Transformer version 4) and not CLI (Command Line Interface) and not MCP (Model Context Protocol) and not LLM (Large Language Model)
Quoting TFA (The Fucking Article): “just adapt bro”
lol at the BGP example
> Rules that ask people not to use LLMs are ignored and almost impossible to enforce in open online events.
It's quite sad to see CTFs dying. I never had the time to seriously participate in CTFs, but I always respected those who did, as well as the people organizing these events.
The intent for most CTFs is to provide a meaningful challenge that concerns a single topic without introducing noise that wastes time. Of course a training exercise is easier to complete for an LLM.
We found that AI usage is basically guaranteed now, but certain challenge designs did thwart it. Challenges built with temporal visual elements made AI fall flat on its face, as it could not ingest/process the data fast enough to act on them in time. We also found that counterfactual challenges (i.e., the result you got did not match what we suggested you'd get) made AI-assisted solve time slower compared to pure humans, indirectly penalizing over-reliance on AI. Multimodal challenges combining audio and visual elements were also very effective, but were not as accessible to players.
This paper gave us some ideas about designing those challenges: https://arxiv.org/pdf/2308.02950.
For our next event we figured out a way to thwart AI in our CTF: embed the CTF in a game engine. The loop essentially becomes something like this: Connect to a simulated access point in the game, the K8s cluster connects their attack container to a private network with the challenge box(es). Hacking the boxes doesn't render a flag, but rather changes in game state. AI did very poorly coping with this in our testing, as it can't derive the spatial state of the game world very well and it soft decouples the inductive reasoning loop it relies on to know if it is on the right track.
The downside to this approach is it is far more labor intensive for CTF organizers, and requires players to have a computer capable of running the game. We are also betting on AI to not advance enough by the time we ship to be able to just ingest the entire game state in realtime and close the loop that way.
In this context, CTF is almost exclusively referred to by the initialism, I think to help distinguish it from other uses of the term.
Actively rude.
It's pretty fun. Or at least it was, back when you had some sense that your competitors were competing on an even playing field and just beat you because they were better than you.
I wouldn't say the name is a "gaming reference", it's just a descriptive name for a game.
It's a war game reference, I guess?
It's a lot harder to detect cheating when your only trace is how fast someone submitted the string CTF{DUck1e_Pwned}
The reason LLMs can do CTFs so well is partially because the challenges are usually designed to avoid wasting time and to introduce a single concept without noise.
No we have not.
Well, they were ostensibly forcing functions... ten years ago everyone was paying the exchange student to do their homework and assignments for them, and that guy was paying his cousin back in his home country, but the whole thing is a bit more efficient now.
[0] Episode webpage: https://share.transistor.fm/s/31855e83
https://ldjam.com/events/ludum-dare/59/setidream/about-ai-ar...
For what it's worth, the non-AI-coded entries were still quite good relative to the winners, so it's not so obvious that AI use confers an unbeatable advantage.
So something like, "Frontier AI has broken the 'high school' or 'university' format"?
The hype surrounding AI is just pervasively exhausting: you've got the folks talking about an entire new age for humanity where we're shortly going to take over the entire universe. And you've got the folks talking about how our entire society is crumbling.
Education is one place folks seem to throw up their hands and say nothing can be done.
The fix is simple: students are to be evaluated on their performance in person. That's it.
Any other "collapse of education" isn't due to AI, it's something else.
Why so pedantic?
Within the framework of your analogy, it's like responding to someone active in DIY maker groups suddenly dealing with an influx of influencers in meetups showing off Chinese junk from Etsy to post on Tiktok, and accusing them of being a Luddite blinded by their zealous hatred of mass production -- both strangely abrasive and also fairly nonsensical except as a "mass production supporter" social signifier.
Not to mention, in the article they specifically describe themselves as a heavy user of frontier models for security research ever since the release of Opus 4.5, calling them "useful within the field". In fact I don't see any actual criticism of AI/LLMs anywhere whether for security research, programming or anything else, except for making competitive CTFs no longer viable.
What does it take to avoid the "Luddite" brand? Using AI themselves and praising AI as useful (to the point of having a lopsided advantage over humans) isn't enough? Do they also need to say "I haven't written a line of code in 6 months/it's easily a 100x multiplier for my job" every time they mention it too?
Indeed, in the real world, plenty of people organize to do formerly-skillful tasks together. I have not personally crafted a gear by hand, but I have built a house in a long-abandoned style with a group of people only using hand tools.
There _is_ a danger that society forgets how to do these things. During that house-building exercise, there were many tricks of the trade that, while likely documented somewhere in a book, would have been difficult to reproduce without seeing a demonstration. From the standpoint of “does it matter?” it depends on what you care about. We absolutely do not need cruck-framed houses with scribed joints. Modern construction is faster and cheaper and lasts long enough. But it would sadden me greatly if practices like this faded from memory, because it’s one of those things that makes you gasp “wow!” when you see it. And your appreciation only deepens when you try it yourself.
I don't mean that everyone must know what CTF is, but sometimes it's OK to write things just for your community (the CTF community in this case), not for the general population.
If CTF is a player-vs-player event, then AI should just be banned outright; otherwise it will devolve into AI-vs-AI, which is just not an interesting competition format, as we learned in chess. Compared to top FIDE events (which ban AI), only a tiny niche audience actually watches the Top Chess Engine Championship (AI-centered). It turns out what we care about is not whether chess can be solved by any means available, but what the limits of the human mind in learning chess are.
Pretty much all chess coaches/educators also warn against relying heavily on AI during learning; engines only give you an illusion of understanding.
It's not really a good comparison.
The only things that work are novelty and obscurity. LLMs still suck at things mentioned in the footnotes of datasheets and manuals, things that deviate in subtle ways, unique constructions that alter something very, very common. It's hard for LLMs to avoid common pitfalls in terms of making assumptions while staying on track.
We have very powerful simulation tools, so something like "project a pattern at these angles" wouldn't really work, as you could simulate that.
I guess something cool is that we can make simulating the solution very expensive, while in the real world it would be free since it's analog. As long as simulation takes longer than it takes a human to find the solution, it would be a pretty good way to deal with it. I'm sure people smarter than me can come up with something.
Maybe I was too early to dismiss human creativity.
> My first CTF was HCKSYD, a 48-hour solo CTF. I full solved it and won in 2 hours. I was completely hooked. That led me to win DownUnderCTF, Australia's largest CTF, with Blitzkrieg multiple times. Blitzkrieg was one of Australia's strongest teams at the time. I later joined TheHackersCrew, an international top-tier team that was consistently ranked highly on CTFTime, the main global ranking and event calendar the scene uses as its scoreboard. With them, I competed in some of the most prestigious CTFs in the world, consistently placing well within the top 10 until the end of 2025.
Are still completely nonsensical to even those that understand the acronym
Are you really arguing for not just typing out whatever 3 words this stands for once in the name of clarity?
It doesn't help that the linked article never bothers to explain this either.
It isn’t common but I feel it would be best when posting to HN to just expand the initialisms even if the source title didn’t.
"new" does the same thing and is probably just a better descriptor then frontier
Raising the difficulty only matters for the (imo) less important part: the dick measuring competition between the very top teams.
The actual point of CTFs was usually to keep your skills sharp and stay learning. Eventually you build your own challenges, thereby completing the "have it taught to me, then do it myself, then teach another person" three step process towards mastering concepts.
You can just say "let the people who want to learn from it do so," but honestly the entire culture of learning, in the US at least, is DEAD. We turned "education" into a rote system of maximizing incentives, to the extent that that's all the youth know it as, and (increasingly) all educators can do. It's just gone, short of some kind of major reckoning, and we all know things will collapse before that happens. The ball is in the court of whatever country can teach its youth to learn the real way and to use AI productively only AFTER learning the concepts it is being used to accelerate.
They aren't your teacher. They aren't trying to send the content to you. They are just blogging on their own website for their own audience.
And it's hardly unique to this article. If you are writing about the nitty-gritty of Linux networking, you probably aren't defining what TCP or UDP means. If you are writing a super detailed article comparing and contrasting plot structures of different anime, you probably aren't going to start by explaining what the word anime means. Etc.
I'm not saying the world should be all RTFM, but if you are reading some sort of specialized content, then yes, I think it's a reasonable assumption that the reader has some basic background knowledge on the topic at hand, or is willing to do the research themselves.
To help everyone: this Capture The Flag is specifically cybersecurity-adjacent, and the Wikipedia article on it is the top Google result for me when searching "CTF". This is why the acronym is used: searching for the spelled-out term will get you the wrong "sport" rather than the cybersecurity one.
I don't want to explain what a CTF is. Look at the Wikipedia article; it is there for a good reason.
It's like complaining about not spelling out the C in "bake cake at 170 C".
This article was written for a specific audience who follows this blog because they know the term. If you start spelling out fundamental acronyms it makes the content look more basic and general.
This always upsets the general audience who stumble upon the article (like this) but it wasn’t meant for a general audience. CTF is extremely well known and the people who would be interested in this topic would wonder what’s happening if it was spelled out. It would be so odd that it would probably attract accusations of ChatGPT writing.
There are a million places where a computer can interact with a non-digital system in a loop.
- Tune an FPGA, or a whole data-center, or just a physical computer.
- Make a drone fly somewhere.
- Design a selective toxin (or anti-toxin).
Or, you know, get more people to click on ads. All totally possible to automate.
But that is about you, right? It's a little entitled to expect every piece of content on the internet to have a 101 explanation attached. If they were specifically aiming to have the blog post appear on HN that would be one thing, but they (presumably) weren't.
Also, the number of people who work with Linux and can't tell you what 'ls -alh' does is staggering (let's ignore the h; people struggle hard even with just 'a' and 'l').
People have been working with Docker for YEARS and don't even understand how Docker actually works (cgroups)...
Interviewing was always a bag of emotions, somewhere between "holy shit, my job is safe for years to come" and "seriously? how? How do you still have a job?"
I started playing CTFs in 2021, the same year I started university. My first CTF was HCKSYD, a 48-hour solo CTF. I full solved it and won in 2 hours. I was completely hooked. That led me to win DownUnderCTF, Australia's largest CTF, with Blitzkrieg multiple times. Blitzkrieg was one of Australia's strongest teams at the time. I later joined TheHackersCrew, an international top-tier team that was consistently ranked highly on CTFTime, the main global ranking and event calendar the scene uses as its scoreboard. With them, I competed in some of the most prestigious CTFs in the world, consistently placing well within the top 10 until the end of 2025.
I am not saying this because I dislike CTFs. I am saying it because CTFs were the thing that made me fall in love with security. They taught me how to learn, gave me a way to measure myself, and introduced me to many of the people I respect most in the field. Watching people pretend the format is still fine is frustrating because the old game is not there anymore.
As AI tools ramped up in capability, especially when GPT-4 first came out, a significant percentage of medium difficulty CTF challenges started becoming one-shottable, meaning a single prompt from a user could produce the solve and flag. You could paste a cryptography challenge into ChatGPT, come back in 10 minutes, and have the solution. At the time, we did not think too much of it. Hard challenges went mostly untouched, and the time save was not large enough to ruin the competition.
The issue was never that AI could help. CTF players have always used tools. The issue is when the model does the reasoning, writes the solve, and leaves the human with nothing meaningful to do besides copy the flag.
When Opus 4.5 dropped, the tone changed. Almost every medium difficulty challenge, and some hard challenges, became agent-solvable. Claude Code packaged everything into a CLI and made it easy to connect other CLI and MCP tools. It became trivial to build an orchestrator that used the CTFd API to spin up a Claude instance for every challenge. You could let the system run for the first hour, then only start working on whatever was left.
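The orchestrator pattern described here is genuinely small. Below is a hedged sketch of its skeleton: the event URL, token, and `solve_agent.py` wrapper are all placeholders, and only the `/api/v1/challenges` endpoint and token auth header follow CTFd's public REST API; everything else is illustrative.

```python
import json
import subprocess
from urllib.request import Request, urlopen

CTFD_URL = "https://ctf.example.com"  # hypothetical event URL
API_TOKEN = "ctfd_xxx"                # hypothetical access token

def challenges_request() -> Request:
    # CTFd exposes the challenge list at /api/v1/challenges
    return Request(f"{CTFD_URL}/api/v1/challenges",
                   headers={"Authorization": f"Token {API_TOKEN}",
                            "Content-Type": "application/json"})

def parse_challenges(payload: str) -> list:
    # CTFd wraps results in a top-level "data" key
    return json.loads(payload)["data"]

def spawn_agent(challenge: dict) -> subprocess.Popen:
    # one agent process per challenge; solve_agent.py stands in for
    # whatever wrapper drives the model CLI against that challenge
    return subprocess.Popen(
        ["python", "solve_agent.py", str(challenge["id"]), challenge["name"]])

# Fetching and fanning out would then look like:
#   with urlopen(challenges_request()) as resp:
#       for chal in parse_challenges(resp.read().decode()):
#           spawn_agent(chal)

sample = '{"data": [{"id": 1, "name": "baby-pwn"}, {"id": 2, "name": "rsa-101"}]}'
print([c["name"] for c in parse_challenges(sample)])  # ['baby-pwn', 'rsa-101']
```

That is the whole trick: the board enumeration is one API call, and the "team" is a process pool.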
That changed the game. Teams that refused to use AI were not just missing a convenience; they were playing a slower version of the competition. Open online CTFs started becoming a question of how quickly you could automate the easy and medium work, then how much human attention you had left for the hardest challenges. The scoreboard started measuring orchestration and willingness to use frontier models alongside, and sometimes above, security skill.
The effects were obvious. The CTFTime leaderboard started feeling wrong. Some legendary teams that were consistently near the top appeared less often. Player activity felt lower. Challenge developers who treated CTFs as an artform had less reason to spend weeks building something beautiful if it was going to be eaten by an agent in minutes.
I have been working heavily with GPT-5.5 and GPT-5.5 Pro after launch. By benchmark metrics, 5.5 is close to Claude Mythos' capability, and Pro likely surpasses it. These models can one-shot Insane difficulty active leakless heap pwn challenges on HackTheBox. They can solve a large portion of what a smaller CTF organiser can realistically produce. If you orchestrate Pro against Insane challenges in a 48-hour CTF, there is a good chance you get the flag before the event ends.
That makes open CTFs pay-to-win. The more tokens you can throw at a competition, the faster you can burn down the board. Specialised cybersecurity models like alias1 by Alias Robotics are becoming less relevant compared to general frontier LLMs. The competition is turning into "who can afford to run enough agents, with enough context, for long enough."
CTFs feel much more like a cheesable mess than a competition. Your performance in a CTF no longer defines your skill the way it used to. Recruiting security practitioners by CTF performance is becoming less meaningful. It is not even a particularly good measure of AI skill, because most of the orchestration needed for CTFs is already open source or vibe codeable.
I have seen various takes that beginners can still learn from CTFs as they always have. These takes miss the scoreboard. CTFs were not just a set of puzzles. They were a ladder. Even as a beginner, you had something to climb. You could see yourself improve, solve more challenges, place higher, join better teams, and become more competitive over time.
That feedback loop is breaking. If the visible scoreboard is dominated by teams using AI, a beginner is pushed toward using AI before they have built the instincts the AI is replacing. That is an anti-pattern. It prevents active learning, and active struggle is the bit that actually teaches you. It is also completely demotivating to put in real effort and see no visible progress because the ladder above you has been automated.
It also changes what challenge authors want to build. If beginner CTFs become another place where people quietly paste prompts and climb a scoreboard, authors have more reason to put their effort into learning platforms instead. At least on platforms like picoGym and HackTheBox, the expectation is education, and beginners are less incentivised to cheat themselves out of learning.
Beginners are better off using picoGym, HackTheBox, and other lab environments where the point is actually learning instead of pretending the public scoreboard still reflects human growth.
I have seen some hopium posts about how CTF is not dead, it is just augmented by AI. They often point at CTFs like DEF CON to argue that AI still cannot solve everything. That is true, but it is the wrong defence.
The hardest top-tier finals have very few participants, and they are usually gated behind qualifiers that are easier than the finals themselves. If those qualifiers fall to agents, fewer genuinely qualified people reach the challenges that still resist AI. A tiny number of elite finals does not save the open online format that most people actually play.
The claim is not that every challenge is solved. The claim is that enough of the scoreboard has been automated that the scoreboard no longer means what it used to mean.
CTFs were never meant to be security research. They can showcase new and interesting techniques, but the CTF itself is not the point of discovery. Just because AI is useful within a field does not mean it belongs in the competitive landscape of that field.
In CTFs, unrestricted AI removes the human from the puzzle almost entirely and reduces the art of security to a prompt. Sure, LLMs will keep getting better at security as long as CTFs are around, but that does not mean the competitive format is healthy. CTFs were an artform, a way to share techniques with nerds, and a way to push the human bounds of security skill. That purpose is being stripped away.
Chess has been dominated by computers for well over a decade. People use chess engines as an analogy for LLMs in CTFs, but they miss the point: chess engines are not allowed during competitive play. They are used for analysis, training, commentary, and practice. They enrich the game around the competition without replacing the person competing.
Imagine giving every competitive chess player the best chess engine and letting them use it freely during matches. Would that be considered fair? Would it be fun to watch? Would it justify prize pools? Would it push the human limits of what could be achieved in chess? The same questions apply to CTFs.
CTF organisers have tried techniques to break or deter LLM solutions, but they are temporary friction at best. Claude Code does not meaningfully care about old refusal-string tricks anymore. Frontier models are getting better at noticing prompt injections. Web search capabilities weaken challenges based on technologies released after the training cutoff. Rules that ask people not to use LLMs are ignored and almost impossible to enforce in open online events.
That leaves organisers in a bad position. If they make normal challenges, agents solve too much. If they make challenges deliberately hostile to agents, the challenges often become guessy, overengineered, or unpleasant for humans too. That is not a real fix. It just makes CTFs worse for everyone.
The "just adapt" take is infuriating. People I have always looked up to in the community have said it. To me, it is completely nonsensical unless you explain what we are adapting into.
If adaptation means building better tooling, CTF players already did that. If adaptation means writing harder challenges, organisers already tried. If adaptation means accepting that the scoreboard is now an AI orchestration benchmark, then we should say that honestly instead of pretending the old competition still exists.
Even if organisers create guessier or more overengineered challenges that current LLMs cannot solve, there are no good paths for players to learn the required skills while staying competitive. A few models from now, that point may be irrelevant anyway. The trajectory of LLM security capability is moving too quickly for challenge design to stay ahead for long.
The scene that grew my love for CTFs is emptying out. The CTFTime leaderboard has almost no semblance of history or human skill anymore. The 2026 scoreboard is unrecognisable compared to every year before it. TheHackersCrew, alongside many other large and reputable teams, either do not play, play with far fewer people, or struggle to cut into the top 10. Unregulated cheating is through the roof. Some of the best CTFs, like Plaid CTF, are not running anymore.
These sentiments are not only mine. Many members of my local team, Emu Exploit, feel similarly. These are people who consistently attend the International Cybersecurity Championship, perform at the top level in bug bounty programmes, compete in Pwn2Own, and present at conferences including Black Hat. The people losing interest are not casual observers. They are exactly the kind of people the scene used to produce and retain.
The fun of CTFing is gone for many of the people who cared most. The loss is not just a scoreboard. It is the ladder from beginner curiosity to elite competition. It is the craft of challenge design. It is the feeling that a clever human solved something difficult because they understood it deeply.
That legacy is not being carried forward by open online CTFs in their current form. The format is dead. Something else may replace it, but pretending nothing fundamental has changed only makes the loss harder to talk about honestly. It also gives AI shills more room to capitalise on the decline by selling mediocre wrappers back to the community that made the training data valuable in the first place.
While a lot of what's happening in the CTF/AI space is super commercialised and out of our control, CTF has had a hugely positive impact on the industry. I have met so many kind, smart, and passionate people through CTFs. I have played some of the most beautifully crafted challenges and found some of the most intriguing unintended solutions.
The community around CTFing has been an amazing place to learn, grow, and connect. That's something we shouldn't lose, no matter where the competition goes. As a community, we should strive to stay together and build new avenues to stay passionate and keep learning. Security-adjacent social events like SecTalks, student conferences, and local meetups are great ways to stay connected and involved. Learning platforms, and the communities they build through services like Discord, are also a valuable resource.
While it may be a struggle to find an alternative to what we had, the amazing community we have built around it is more important now than ever as we find new ways to keep the competitive spirit alive.
This has never been achieved by, nor is it the point of, education for the masses.
The problem, frankly, is that computers, and now computers with LLMs, make it easy to cheat.
The kid doesn't want to learn; the kid wants good grades so the parent is happy with them, and the young adult wants to get the paper because they were told that is required for a good life. It's a misalignment of incentives.
Now I’m certain that there exist those mythical human instructors who can do better, but that’s not worth much if 99.99% of people don’t have access to them. Just like a good human physician who takes their time with the patient is better than an LLM, but that’s not worth much either given that this doesn’t match most people’s experience with their own physicians.
If you remove the "without AI" at the end, I've been hearing similar anecdotes about fizzbuzz for years (isn't the whole point of fizzbuzz to filter out those candidates?)
We usually hire for problem solving capabilities and not so much for technical know-how.
That’s at least how I read your comment.
It's not even that they got distracted, they sat there trying, for 2 whole days, with concerned colleagues giving them hints like "have you tried checkout -b"... They didn't manage!
How the hell do you work for a decade in this business without learning even the most basic git commands? Or at least how to look them up? Or how to use a gui?
Incompetent devs are not a new thing.
I don’t care what someone can do without the tools of their trade, I care deeply about their quality of work when using tools.
All things I learned in school that turned out to be wrong information.
Not to mention, the current state of education is far worse. I don't think most realize how low the bar is.
I had no access to anyone who could teach me calculus as a kid except Khan Academy, so I think this is a gross exaggeration. But I agree in the end, that all my "real" learning did come from pen-and-paper practice, not watching videos.
"Frontier models break the open CTF format" is good
"Frontier AI..." just raises the question: wtf is Frontier AI?
Because of course it exists (just googled it): https://frontierai.company/
I agree with this.
botsbench.com shows Sonnet 4.5+ with the Claude Code harness does pretty well, and Sonnet roughly tracks the edge of what self-hosted models can do on the upper tier of affordable GPUs, i.e. running 1-2 DGX Sparks and waiting 6 months for open-source models to catch up a bit.
But I don't know enough that's why I asked.
I imagine one could do CTF in public, machines you work on vetted/prepared to some spec, yada yada.
If chess and Go can do it why can't CTF?
That was my question when I wrote "what am I missing here".
"Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that"."
Can't argue with that logic
Not really, not if you want to ask it deep questions. It won't have an answer that is deeper than something that you can find online, and if pressed it will just keep circling around the same response.
The reason is that this "thing" was never curious, never asked questions, and never really learned anything. It just has learned the Internet "by heart", and is as boring as a human teacher who is not really curious about the subject they are teaching, and has just got some degree by "by hearting" some text book. Of course it does it much better than a human, but it is fundamentally the same thing.
You're certain that mythical instructors exist (?) who "can" do better?
Are human instructors more competent as teachers than AI teachers, or are AI teachers more competent as teachers than human teachers? No "this or that can happen," just a definitive statement please.
AI is likely a million times better student than my dimwit cybersec meatbags...er, majors, for sure, as well! Don't have a reliable way to measure or experience why/how, tho, so I'm not out here claiming it. Even if I did, why would I argue for their replacement?
Saying there have always been bad developers doesn't change that there's a higher ratio of them now.
No stats to back this up. Just interviews I've done recently and historically.
Software is full of leaky abstractions
This situation in particular was a React role so there is an expectation that when you list React as one of your skills on your resume then you know at least the basics of state, the common hooks, the difference between a reference to a value vs the value itself.
These days you can do a surprising amount with AI without knowing what you are doing, but if you don't have any clue how things work you'll very quickly run in to problems you can't prompt away.
There’s almost nothing to forget? I’m just struggling to understand.
Everybody knows calculators and spreadsheets are adjuncts to skill. Too many people believe AI is the skill itself, and that learning the skill is unnecessary.
A Physics Prof Bet Me $10,000 I'm Wrong
They're wrong sometimes, but usually in verifiable ways. And they don't seem to know the difference between medicine and bioterrorism, so often they refuse. But these limitations are worth tolerating when the alternative is that our specialists in topic X are bogged down by questions about topic Y to the point where X isn't getting taught.
For me the best human teachers were the ones that managed to make me interested on topics that I thought are boring/useless (many times my opinion being stupid, mostly due to lack of experience).
So far with LLM I learn about things I know something (at least that they exist) and I am interested in, which is a small subset of things that one should learn during lifetime.
If they can ship code that matches a spec, why does it matter if they’re using ai or not?
Genuinely curious.
When this AI era's devs grow older, they'll complain the newer generation can't even vibe code.
If you cannot write "basic syntax" for any language then you are not a programmer, and certainly not a software engineer? This is not a value judgement, it's ok (probably good tbh) to not be a programmer. But you are wasting everyone's time by interviewing for a programming position in this case.
E.g. in Hungary I had a university CS professor that originally wanted to be a highschool teacher and a highschool physics teacher that originally wanted to be researcher. Their choice of degree didn't determine which outcome they got. The researcher and teacher curriculum had an 80%+ overlap.
But he was a great teacher anyway. He was engaging and kept the kids in line and learning. I eventually learned the truth, and most of my classmates forgot about it. Teaching, like flying a plane or driving a train, might become more about keeping watch over a small group of people and ensuring that things don't go off the rails, and that's fine.
Like almost everything else about LLMs, this unfortunate tendency has gotten a lot better recently, which you might not realize if you gave up after getting some lame answers or bogus glazing on the free ChatGPT page a couple of years ago.
My “earth sciences” teacher also once tried to argue with me against the universal law of gravitation. (No, she was not referring to Special/General Relativity. She didn't agree that two objects in a vacuum fall at the same speed regardless of mass.)
But that's not using "computers" as a computer but as a video player. When evaluating whether computers are "good for learning", I don't think we should include using a computer as a video player, a book, or even flash cards. It should be things a computer uniquely offers which books, paper, videos, and a physical reference library cannot.
Based on the results of deploying hundreds of millions of computers in schools in the 80s and 90s, the evidence was mostly that computers are good for learning computer programming and "how to use a computer", but not notably better than cheaper analog alternatives for learning other things.
Interestingly, a properly trained and scaffolded LLM could be the first thing to meaningfully change that. It could do some things in ways only human teachers could previously since it is theoretically capable of observing learner progress and adapting to it in real-time.
He really took the time to replicate the manual teaching process of writing on a whiteboard. He improved upon it by using colors. But it basically had the same pace as a teacher writing on a whiteboard.
When professors are given a projector, they just throw together some slides and add their narration.
This is not very efficient. To learn you need to suffer. Or you need to watch the suffering.
She only really had two faults: She wasn't very bright, and she wasn't fond of children. I had her in about 80% of all my classes for six years. High school was a relief.
We can all agree that both human "experts" and LLMs can sometimes be right, and sometimes be confidently wrong.
But that doesn't imply that they're equally fit for purpose. It just means that we can't use that simple shortcut to conclude that one is inferior to the other.
So where do we go from here?
It's not unlike going to the gym, and we see how many people do that regularly. Except it's even funnier, because people serious about the gym buy what? Tutors. They call them personal trainers. We've known for a millennium or more that 1-on-1 instruction is vastly better than anything else, but most people actually don't want to get into shape, and most people actually don't want to learn.
The kids learnt all about Team Fortress 2, Roblox, Rainbow Six etc. They also learnt how to game the learning system so it looked like they were doing their work.
Who cares as long as the car is fixed, right? As long as the mechanic can Chinese-room his way to a working car, why does it matter how much of it he actually understands?
And why hire the mechanic instead of hiring the Chinese room?
The inability to write fizzbuzz strongly implies their inability to understand what they've shipped. Review is some significant portion of the job. Understanding of the product is also part of the job.
Specs are also, in a sense, scaled-down, fuzzy, natural-language descriptions of a feature. The fuzziness is a source of bugs, or at least of a mismatch between the actual desired feature and what was written down at spec-writing time. As such, just matching a spec is the bare minimum that a good dev should be doing. They should be understanding what the spec is _not_ saying, understanding holes in their implementation, how their implementation enables or hinders the next feature and the next, next feature, etc. I don't think any of that is possible without understanding what was actually implemented.
You also have to pass a standardized test specifically on subject matter in order to get your teaching certificate.
The undergrad degree I did was split into thirds, one for subject matter, one for teaching pedagogy, and one for teaching your subject matter.
I’m not talking about gotcha-level stuff here, where it didn’t compile the first time because of a bracket or anything, or was even just wrong on the first attempt. They couldn’t do Fizzbuzz in a language of their choice, at all.
Those that could were always annoyed at having to do such things because how could someone coming for a contract position not be able to do this? Without seeing what a filter it really was.
I am perfectly capable of writing specs, and feeding them to 3 separate copies of Claude Code all by myself. Then I task switch between the tmux windows based on voice messages from the pack of Claudes. This workflow is fine for some things, and deeply awful for others.
Basically, if a developer is just going to take my spec and hand it to Claude Code, then they're providing zero value. I could do that myself, and frequently do.
The actual bottleneck is people who can notice, "The god object is crumbling under the weight of managing 6 separate concerns with insufficient abstraction." Or "Claude has created 5 duplicate frameworks for deploying the app on Docker. We need to simplify this down to 1 or we're in hell." I will happily fight to hire people who can do the latter work. But those people can all solve fizzbuzz in their sleep.
People who just "ship code that matches a spec" without understanding the technical details are providing close to zero value right now.
There is an interesting niche for people with deep knowledge of customer workflows who can prompt Claude Code. These people can't build finished products using Claude. But they can iterate rapidly on designs until they find a hit. Which we can then fix using people with deeper engineering knowledge and taste.
But if you're not bringing either deep customer knowledge or actual engineering knowledge, you're not adding much these days.
“Kids these days don’t work as hard / know as much / value the important things” is as tired as it is universal.
Like sure, I can probably write some python, but will it be pythonic? I might still be Java-minded for a while, trying to OOP my way into solutions.
Earlier today I needed to write some PHP and couldn't remember if it used length, count, or size. I had to look it up. I've been doing this for 20 years.
But here's the thing: for humans, this is manageable because we've come up with a number of mechanisms to select for dependable workers and to compel them to behave (carrot and stick: bonuses if you do well, prison if you do something evil). For LLMs, we have none of that. If it deletes your production database, what are you going to do? Have it write an apology letter? I've seen people do that.
So I think that your answer - that you'll lean on your expertise - is not sufficient. If there are no meaningful consequences and no predictability, we probably need to have stronger constraints around input, output, and the actions available to agents.
It is widely believed by their neighbors that the _Druze_ wear baggy pants because they believe that the Mahdi will be born to a male, and the pants will catch the baby etc. I say "widely believed": the Druze are famously secretive and will not confirm or deny most things about their religion. The 'elect' Druze men do wear distinctive baggy trousers with the crotch down around the knees: no one else does.
The Druze are a people of the Arab world; moreover, they are Arabs. They began as an Isma'ili sect, but do not identify as Muslim: they call themselves al-Muwaḥḥidūn, meaning 'the monotheists' or 'unitarians'.
Much closer to correct than not!
Whether you're in class or at work, it's just courteous to ask an AI first.
I also use Claude with tmux. Can you share how you get the voice messages from the Claudes?
I once got the method invocation syntax wrong for PHP in an interview. I'd written thousands of lines of PHP and had most recently written some the week before.
This, despite starting off my programming journey in editors with no hinting or automatic correction. If anything, I've gotten even worse about remembering syntax as I've gotten better at the rest of the job, but I was never great at it.
I rely on surrounding code to remind me of syntax and the exact names of basic things constantly. On a blank screen without syntax hints and autocompletion, or a blank whiteboard, I'm guaranteed to look like a moron if you don't let me just write pseudocode.
Been paid to write code for about 25 years. This has never been any amount of a problem on the job but is sometimes a source of stress in interviews and has likely lost me an offer or two (most of the sources of stress in an interview have little to do with the job, really)
I'm genuinely curious how someone who never wrote a program in assembly, or debugged a program machine instruction by machine instruction, can really understand how software works. My working hypothesis is most of them don't and actually it's fine because they don't need it.
In 2026, if you call yourself a developer and can't solve FizzBuzz without help, it's hard to argue that you know anything useful at all.
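Since the whole thread keeps using it as the bar, here is what is actually being asked for: a minimal FizzBuzz, sketched in Python (any language works, the logic is the point).

```python
def fizzbuzz(n: int) -> str:
    """Return "Fizz", "Buzz", "FizzBuzz", or the number itself as a string."""
    if n % 15 == 0:      # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

if __name__ == "__main__":
    for i in range(1, 16):
        print(fizzbuzz(i))
```

The only classic trap is checking divisibility by 3 or 5 before checking 15, which silently swallows the "FizzBuzz" case.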
I don't think we're close to that time yet. Just like as a kid I was told to prove my work by hand even if I could do it in my head, and just like we learned how to do calculus without a calculator and then learned how to use the calculator to get the same result, I think we still need the software field to learn programming concepts independent of the use of AI to create code.
I don't think you can be a good "prompt engineer" for solid software in 2026 if you don't understand programming concepts and software architecture and flow.
My expertise has led me to the obvious fact that I would never give an LLM write access to my production database in the first place. So in your own example my expertise actually does solve that problem, without the need for something like a consequence, whatever that means to you.
We already have full control over the input and tools they are given and full control over how the output is used.
It's not perfect—sometimes a Claude notifies 3 minutes after it stopped doing anything. But it's helpful when I'm running multiple Claudes and also reviewing code elsewhere.
Your brain may feel like someone put it in a blender. Be warned.
I think it helps that it's a very narrow field to look at, compared to fuzzy and big-picture view of social studies, for example. So much room to be confidently wrong... And sadly I can't think of a solution, LLMs or not.
So what tree-traversal/quicksort problems tend to measure is how long it's been since you last did CS class homework problems.
How? Fizzbuzz requires you to produce output; that's not functionality that CPU instructions provide.
You can call into existing functionality that handles it for you, but at that point what are you objecting to about the 'modern language'?
In reality, heavier isotopes of hydrogen fuse, conserving the total number of nucleons, but the fusion products have a lower total rest mass than the parent particles. The mass difference is released as energy, and total energy is conserved.
By his logic the system either violated energy conservation (by creating nucleons while releasing energy) or was endothermic (creating nucleons from the surrounding energy).
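To make the conservation argument concrete, here is the textbook D-T case worked out as a Python sketch (the atomic mass values are the standard published ones; the ~17.6 MeV result is the well-known D-T yield):

```python
# Deuterium-tritium fusion: D + T -> He-4 + n
# Rest masses in unified atomic mass units (u), standard tabulated values.
M_D = 2.014102    # deuterium
M_T = 3.016049    # tritium
M_HE4 = 4.002602  # helium-4
M_N = 1.008665    # neutron

U_TO_MEV = 931.494  # energy equivalent of 1 u, in MeV

# Nucleon count is conserved (2 + 3 = 4 + 1), but rest mass is not:
delta_m = (M_D + M_T) - (M_HE4 + M_N)  # mass defect, in u
energy_mev = delta_m * U_TO_MEV        # released as kinetic energy

print(f"mass defect: {delta_m:.6f} u")
print(f"energy released: {energy_mev:.1f} MeV")
```

No nucleons are created or destroyed; the positive mass defect simply shows up as ~17.6 MeV of kinetic energy carried off by the products.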
I’m not objecting to modern languages, I’m just saying that using them fails the “can write fizzbuzz with no help” test to only a slightly lesser degree than using AI tools. They’re a complex compile- and runtime environment that most developers don’t truly understand.
https://cdn.openai.com/o1-system-card.pdf
There's also some research that points to it being a feasible attack surface: https://arxiv.org/pdf/2603.02277
> Models discovered four unintended escape paths that bypassed intended vulnerabilities (Section C), including exploiting default Vagrant credentials to SSH into the host and substituting a simpler eBPF chain for the in- tended packet-socket exploit. These incidents demonstrate that capable models opportunistically search for any route to goal completion, which complicates both benchmark va- lidity and real-world containment.
Here is some indication I'm not making this up: https://hsm.stackexchange.com/questions/2465/when-and-why-di...
In any case, I never use those concepts, and I know no professional particle physicist that does. By "mass", I mean rest mass.