Between the network latency and low-end machines, there is an enormous lag between ChatGPT's response and being able to reply, especially when editing a canvas.
I've been sitting there for up to a minute plus waiting to be able to use the canvas controls or highlight text after an update.
Clearly I'm blocking some tracker and it's upset about that. I allowlisted a sentry subdomain and since then got no more complaints.
It now behaves like Claude, attaching the paste as a file for upload rather than inlining it.
This affected page UX some and reduces the cost of the browser tab some.
At some point, maybe still true, very long conversations ~froze/crashed ChatGPT pages.
Why run this check before the user can type?
Why not run it later like before the message gets sent to the server?
Why would two AI bots want to chat with each other?
But it seems clear to me that this is why I can't start typing right away when I first load the page and click to focus in the text field.
Also, you can have it spotcheck colors: light orange on light background is unreadable, ask it to find the L*[1] of colors and dark/lighten as necessary if gap < 40 (that's minimum gap for yuge header text on background, 50 for text on background, these have gap of 25)
I haven't tried this yet, but maybe have it count words per header too. It's got 11 headers for 1000 words currently, which makes reading feel really staccato, and you gotta evaluate "is this a real transition or vibetransition"
[1] L* as in L*a*b*, not L in Oklab
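For anyone who wants to check the numbers themselves rather than trust the model, here is a minimal sketch of that lightness check. It assumes sRGB hex colors and the standard CIE L* formula (D65 white point); the function names and the example colors are made up for illustration, and the 40/50 thresholds are the ones quoted above.

```python
def srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def l_star(hex_color: str) -> float:
    """CIE L* (0 = black, 100 = white) of an sRGB color like '#ffcc99'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))
    # Relative luminance Y from linear RGB.
    y = (0.2126 * srgb_to_linear(r)
         + 0.7152 * srgb_to_linear(g)
         + 0.0722 * srgb_to_linear(b))
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def lightness_gap(foreground: str, background: str) -> float:
    """Absolute L* difference between two colors."""
    return abs(l_star(foreground) - l_star(background))


def readable(foreground: str, background: str, large_text: bool = False) -> bool:
    """Apply the thresholds from the comment: 40 for big headers, 50 for text."""
    return lightness_gap(foreground, background) >= (40 if large_text else 50)
```

Something like `readable("#ffcc99", "#ffffff")` would then flag a light orange on white, and the same gap number tells you how far to darken or lighten.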
...I don't think that's possible even if you are a bot? I would be very surprised if OAI had their origin exposed to the internet. What is a "non-Cloudflare proxy"? Is this AI slop?
It's likely just looking at the CF properties as part of a bot scoring metric (e.g. many users from this ASN or that geoip to this specific city exhibit abusive patterns).
Really, really bad user experience. I wonder when they will abandon this approach.
My best guess is -- ChatGPT is running something in your browser to try to determine the best things to send down to the model API -- when it should have been running quantized models on its own server.
Every provider seems to have been plagued by these freeloaders to such an extent that they've had to develop extreme and onerous countermeasures just to avoid losing their shirts.
What's the word? Schadenfreude?
Sick.
A big reason we invest in this is because we want to keep free and logged-out access available for more users. My team’s goal is to help make sure the limited GPU resources are going to real users.
We also keep a very close eye on the user impact. We monitor things like page load time, time to first token and payload size, with a focus on reducing the overhead of these protections. For the majority of people, the impact is negligible, and only a very small percentage may see a slight delay from extra checks. We also continuously evaluate precision so we can minimize false positives while still making abuse meaningfully harder.
> This is bot detection at the application layer, not the browser layer.
I kind of just assumed that all sophisticated bot-detectors and adblock-detectors do this? Is there something revealing about the finding that ChatGPT/CloudFlare's bot detector triggers on "javascript didn't execute"?
You can probably run 50 of those simultaneously if you use memory page deduplication, and with a decent CPU+GPU you ought to be able to render 50 pages a second. That's 1 cent per thousand page loads on AWS. Damn cheap.
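A quick back-of-the-envelope check of that figure, as a sketch. The comment doesn't name an instance type or an hourly price, so this only works out what hourly budget the two stated numbers (50 pages a second, 1 cent per thousand loads) imply together; the function name is made up.

```python
def implied_hourly_budget_usd(pages_per_second: float,
                              usd_per_thousand_pages: float) -> float:
    """Hourly spend that matches a given throughput and per-page cost."""
    pages_per_hour = pages_per_second * 3600
    return pages_per_hour / 1000 * usd_per_thousand_pages


# 50 pages/s is 180,000 pages/hour; at $0.01 per thousand that is about $1.80/hour.
budget = implied_hourly_budget_usd(50, 0.01)
```

So the claim holds as long as a machine that sustains 50 renders a second can be rented for roughly $1.80 an hour.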
Specifically, Turnstile as far as I'm aware doesn't do anything specifically configurable or site specific. It works on sites that don't run React, and the cookie OpenAI-Sentinel-Turnstile-Token is not a CF cookie.
Did OpenAI somehow do something on their own API that uses data from Turnstile?
this is meaningless btw. A browser headless or not does execute javascript.
I have kind of lost count of how many content creators have said personally to me traffic is meaningfully down because of all these chatbots. The latest example is this poor but standup guy: moneyfortherestofus.com.
The scary part is that you don't even see the irony in writing this.
Or, are you just okay "misusing" everyone for your own benefit?
I read it to mean: "A browser that doesn't execute the JavaScript bundle won't have [the rendered React elements]." Which is true.
Said another way, if done in the background the user wouldn’t even notice unless they typed and submitted their query before the check completed. In the realistic scenario this would complete before they even submit their request.
Nick, I understand the practical realities regarding why you'd need to try to tamp down on some bot traffic, but do you see a world where users are not forced to choose between privacy and functionality?
That said, is it not a little bit weird that you want to protect yourself from scraping and bots, when your entire company, product, revenue, and your employment, depends on the fact that OpenAI can bot and scrape literally every part of the internet? So your moat is non-hydrated react code in the frontend?
Typing in the chat box is slow, rendering lags and sometimes gets stuck altogether.
I have a research chat that I have to think twice before messaging because the performance is so bad.
Running on iPhone 16 safari, and MacBook Pro m3 chrome.
Is this to be expected? I would presume that if I'm authenticated and paying, VPN use wouldn't be a worry. It would be nice to be able to use the tool whether or not I'm on a VPN.
Can you share these mitigations so we can mitigate against you?
Did you mean to use the word hypocrisy? If not, I'm happy to have said it.
I just want to note, that it is well covered how good the support is for actual malware...
Make sure not to browse the Internet without adblock and/or similar.
Another way is to just do better isolation as a user. That's probably your best shot without hoping these companies change policies.
Every time I try this, I end up crossing wires (ie using the browser that 'works' for most things, more than the one that is 'broken')
Either way, pretty wild that you can have billions of dollars at your disposal, your interface is almost purely text, and still manage to be a fuckup at displaying it without performance problems.
Heard from a founder who recently switched his company to Claude due to OpenAI's lagginess–it's absolutely an OpenAI problem. Not an AI problem in general.
It's the reality of how bad the bots have become.
As has been amply explained, the API pricing per token is far more for equivalent use when maximizing a subscription plan.
It isn’t really a massive hurdle to deal with this full SPA load check. If one is even aware it exists they already have the skills to bypass it anyway.
I get why people would “what about” the automation inherent in what OpenAI is doing but that is a separate matter.
Other businesses and applications can put into place their own hurdles and anti-bot practices to protect the models they’ve leaned into, and they have been.
Honestly it is a very healthy competitive market with reasonably low switching costs which drives prices down. These circumstances make rolling your own a tough sell.
Imagine an apartment building with a flimsy front door lock that breaks all the time, and the landlord only telling you that that can't be helped because of all the burglars.
The reason it has to block until it's loaded is that otherwise the signal being missing doesn't imply automation. The user might have just typed before it loaded. If you know a legit user will always deliver the data, you can use the absence of it to infer something about what's happening on the client. You can obviously track metrics like "key event occurred before bot detection script did" without using it as an automation signal, just for monitoring.
I don’t know whether ChatGPT is one of those products, but if it is, that behavior might be a side effect of blocking the input pipeline until verification completes. It might be that they want to get every single one of your keystrokes, but only after checking that you’re not a bot.
You want to go to the world's best hotel? You are gonna be on their CCTV. Staying at home is crappier but private.
Unfortunately for the first time Moore's law isn't helping (e.g. give a poor person an old laptop and install Linux and they will be fine). They can do that and all good except no LLM.
search for me is now a proprietary index (like exa) that filters rubbish, with a zero data retention sla. so we don't need google profiling.
the content is distilled into markdown pulled from cloudflare's browser rendering api.
i let cloudflare absorb the torrent of trackers and robot checks, i just get md from the api with nothing else. cloudflare is poacher and gamekeeper.
an alternative is groq compound which can call browsers in parallel.
for interactive sites, or local ai browsing, i sometimes run a browser in a photon os docker with vnc, which gives you the same browser window but it runs code not on your pc.
that said little of my use is now interacting with websites, it's all agentic search and websets so i don't have to spend mental energy on it myself
What are you talking about? It works fine with firefox with RFP and VPN enabled, which is already more paranoid than the average configuration. There are definitely sites where this configuration would get blocked, but chatgpt isn't one of them, so you're barking up the wrong tree here.
They did it because a lot of devices running Netflix (TVs, DVD players, etc) were underpowered and Netflix was not keen on writing separate applications. They did, however, invest into a browser engine that would have HW acceleration not just for video playback but also for moving DOM elements. Basically, sprites.
The lost art of writing efficient code...
I'm running firefox and seeing the normal amount.
Of course then you got sites like gnu.org too that block you because of your slightly outdated user agent.
I noticed the ChatGPT app also checks Play Integrity on Android (because GrapheneOS snitches on apps when they do this), probably for the same reason. Claude's app doesn't, by the way, but it also requires a login.
Coincidentally about an hour ago, I wanted to look something up in ChatGPT and I happened to be in a browser window I don’t normally use, with no logged in accounts. I assumed it wouldn’t work, but to my surprise with no account, no cookies of any kind it took my query and gave me an answer.
if you browse them you will see that bot writers are very annoyed if they can't scrape a site with a headless browser.
you can do what you suggested, but with Linux VMs/containers. windows is too heavy, each VM will cost you 4 GB of RAM
Fine, by extension, you agree I can scan all of your systems for whatever I desire. This works both ways.
If you mean that you can infer client side tampering with the page contents you could still do that - permit typing but don't permit the submit action on the client. The user presses enter but nothing happens until the check is complete. There you go, now you can tell if the page was tampered with (not that it makes much difference tbh).
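A minimal sketch of that "let them type, hold the submit" idea, using asyncio as a stand-in for the browser event loop. The class and method names are hypothetical and this is not how ChatGPT's frontend is actually written; it only shows that typing never has to wait on the check, only sending does.

```python
import asyncio


class GatedInput:
    """Text box that accepts keystrokes immediately but defers sending."""

    def __init__(self) -> None:
        self.verified = asyncio.Event()  # set once the bot check has finished
        self.draft = ""
        self.sent: list[str] = []

    def type(self, text: str) -> None:
        self.draft += text  # typing is never blocked

    async def submit(self) -> None:
        await self.verified.wait()  # pressing enter waits for the check
        self.sent.append(self.draft)
        self.draft = ""


async def demo() -> tuple[list[str], list[str]]:
    box = GatedInput()
    box.type("hello")
    pending = asyncio.create_task(box.submit())
    await asyncio.sleep(0)           # let submit() start and block on the check
    before = list(box.sent)          # nothing sent yet
    box.verified.set()               # check completes in the background
    await pending
    return before, box.sent


before_check, after_check = asyncio.run(demo())
```

The server can still treat a request without a valid token as suspicious, because the client never sends one before the check has run.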
In jest, as us mere mortals don't always understand :)
It is difficult to get a man to understand something, when his salary depends on his not understanding it.
Every ChatGPT message triggers a Cloudflare Turnstile program that runs silently in your browser. I decrypted 377 of these programs from network traffic and found something that goes beyond standard browser fingerprinting.
The program checks 55 properties spanning three layers: your browser (GPU, screen, fonts), the Cloudflare network (your city, your IP, your region from edge headers), and the ChatGPT React application itself (__reactRouterContext, loaderData, clientBootstrap). Turnstile doesn't just verify that you're running a real browser. It verifies that you're running a real browser that has fully booted a specific React application.
A bot that spoofs browser fingerprints but doesn't render the actual ChatGPT SPA will fail.
The Turnstile bytecode arrives encrypted. The server sends a field called turnstile.dx in the prepare response: 28,000 characters of base64 that change on every request.
The outer layer is XOR'd with the p token from the prepare request. Both travel in the same HTTP exchange, so decrypting it is straightforward:
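import base64, json

raw = base64.b64decode(dx)  # dx: turnstile.dx from the prepare response
key = p_token.encode()      # assumes p_token (the p value) is held as a str
outer = json.loads(bytes(b ^ key[i % len(key)] for i, b in enumerate(raw)))
# → 89 VM instructions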
Inside those 89 instructions, there is a 19KB encrypted blob containing the actual fingerprinting program. This inner blob uses a different XOR key that is not the p token.
Initially I assumed this key was derived from performance.now() and was truly ephemeral. Then I looked at the bytecode more carefully and found the key sitting in the instructions:
[41.02, 0.3, 22.58, 12.96, 97.35]
The last argument, 97.35, is the XOR key. A float literal, generated by the server, embedded in the bytecode it sent to the browser. I verified this across 50 requests. Every time, the float from the instruction decrypts the inner blob to valid JSON. 50 out of 50.
The full decryption chain requires nothing beyond the HTTP request and response:
1. Read p from prepare request
2. Read turnstile.dx from prepare response
3. XOR(base64decode(dx), p) → outer bytecode
4. Find the 5-arg instruction after the 19KB blob → last arg is the key
5. XOR(base64decode(blob), str(key)) → inner program (417-580 VM instructions)
The key is in the payload.
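The five steps can be sketched roughly as follows. This is a minimal illustration, not the tooling used for the analysis: the helper names are hypothetical, `p` is assumed to be a plain string, and locating the 19KB blob and the 5-arg instruction inside the outer bytecode is left to the caller because the layout isn't spelled out here.

```python
import base64
import json
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; applying it twice with the same key is a no-op."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decrypt_outer(dx: str, p: str):
    """Steps 1-3: XOR the base64-decoded turnstile.dx with the p token."""
    return json.loads(xor_bytes(base64.b64decode(dx), p.encode()))


def decrypt_inner(blob_b64: str, key: float):
    """Step 5: XOR the inner blob with the float key rendered as a string."""
    return json.loads(xor_bytes(base64.b64decode(blob_b64), str(key).encode()))


def encrypt_for_test(program, key: str) -> str:
    """Inverse helper, used only to build synthetic payloads for checking."""
    return base64.b64encode(xor_bytes(json.dumps(program).encode(), key.encode())).decode()
```

With the key from step 4 in hand (97.35 in the example above), `decrypt_inner(blob, 97.35)` would yield the inner program as a JSON list of instructions.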
Each inner program uses a custom VM with 28 opcodes (ADD, XOR, CALL, BTOA, RESOLVE, BIND_METHOD, JSON_STRINGIFY, etc.) and randomized float register addresses that change per request. I mapped the opcodes from the SDK source (sdk.js, 1,411 lines, deobfuscated).
The program collects 55 properties. No variation across 377 samples. All 55, every time, organized into three layers:
WebGL (8 properties): UNMASKED_VENDOR_WEBGL, UNMASKED_RENDERER_WEBGL, WEBGL_debug_renderer_info, getExtension, getParameter, getContext, canvas, webgl
Screen (8): colorDepth, pixelDepth, width, height, availWidth, availHeight, availLeft, availTop
Hardware (5): hardwareConcurrency, deviceMemory, maxTouchPoints, platform, vendor
Font measurement (4): fontFamily, fontSize, getBoundingClientRect, innerText. Creates a hidden div, sets a font, measures rendered text dimensions, removes the element.
DOM probing (8): createElement, appendChild, removeChild, div, style, position, visibility, ariaHidden
Storage (5): storage, quota, estimate, setItem, usage. Also writes the fingerprint to localStorage under key 6f376b6560133c2c for persistence across page loads.
Edge headers (5): cfIpCity, cfIpLatitude, cfIpLongitude, cfConnectingIp, userRegion
These are injected server-side by Cloudflare's edge. They exist only if the request passed through Cloudflare's network. A bot making direct requests to the origin server or running behind a non-Cloudflare proxy will produce missing or inconsistent values.
React internals (3): __reactRouterContext, loaderData, clientBootstrap
This is the part that matters. __reactRouterContext is an internal data structure that React Router v6+ attaches to the DOM. loaderData contains the route loader results. clientBootstrap is specific to ChatGPT's SSR hydration.
These properties only exist if the ChatGPT React application has fully rendered and hydrated. A headless browser that loads the HTML but doesn't execute the JavaScript bundle won't have them. A bot framework that stubs out browser APIs but doesn't actually run React won't have them.
This is bot detection at the application layer, not the browser layer.
After collecting all 55 properties, the program hits a 116-byte encrypted blob that decrypts to 4 final instructions:
[
[96.05, 3.99, 3.99], // JSON.stringify(fingerprint)
[22.58, 46.15, 57.34], // store
[33.34, 3.99, 74.43], // XOR(json, key)
[1.51, 56.88, 3.99] // RESOLVE → becomes the token
]
The fingerprint is JSON.stringify'd, XOR'd, and resolved back to the parent. The result is the OpenAI-Sentinel-Turnstile-Token header sent with every conversation request.
Turnstile is one of three challenges. The other two:
Signal Orchestrator (271 instructions): Installs event listeners for keydown, pointermove, click, scroll, paste, and wheel. Monitors 36 window.__oai_so_* properties tracking keystroke timing, mouse velocity, scroll patterns, idle time, and paste events. A behavioral biometric layer running underneath the fingerprint.
Proof of Work (25-field fingerprint + SHA-256 hashcash): Difficulty is uniform random (400K-500K), 72% solve under 5ms. Includes 7 binary detection flags (ai, createPRNG, cache, solana, dump, InstallTrigger, data), all zero across 100% of 100 samples. The PoW adds compute cost but is not the real defense.
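For readers unfamiliar with hashcash, here is a generic sketch of a SHA-256 proof of work. It is not the Sentinel scheme: how the 400K-500K difficulty value maps to a target, and how the 25-field fingerprint is mixed in, are not described above, so this uses the textbook "leading zero bits" formulation with made-up names.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def pow_digest(seed: str, nonce: int) -> bytes:
    return hashlib.sha256(f"{seed}:{nonce}".encode()).digest()


def solve(seed: str, bits: int) -> int:
    """Brute-force the smallest nonce whose hash has `bits` leading zero bits."""
    for nonce in count():
        if leading_zero_bits(pow_digest(seed, nonce)) >= bits:
            return nonce


def verify(seed: str, nonce: int, bits: int) -> bool:
    """Verification costs one hash, however long solving took."""
    return leading_zero_bits(pow_digest(seed, nonce)) >= bits
```

The asymmetry is the point: solving takes about 2^bits hashes on average while checking takes one, which is why it adds cost per request without being much of a barrier on its own.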
The XOR key for the inner program is a server-generated float embedded in the bytecode. Whoever generated the turnstile.dx knows the key. The privacy boundary between the user and the system operator is a policy decision, not a cryptographic one.
The obfuscation serves real operational purposes: it hides the fingerprint checklist from static analysis, prevents the website operator (OpenAI) from reading raw fingerprint values without reverse-engineering the bytecode, makes each token unique to prevent replay, and allows Cloudflare to change what the program checks without anyone noticing.
But the "encryption" is XOR with a key that's in the same data stream. It prevents casual inspection. It does not prevent analysis.
| Metric | Value |
|---|---|
| Programs decrypted | 377/377 (100%) |
| Unique users observed | 32 |
| Properties per program | 55 (identical across all samples) |
| Instructions per program | 417-580 (mean 480) |
| Unique XOR keys (50 samples) | 41 |
| SO behavioral properties | 36 |
| PoW fingerprint fields | 25 |
| PoW solve time | 72% under 5ms |
No systems were accessed without authorization. No individual user data is disclosed. All traffic was observed from consented participants. The Sentinel SDK was beautified and manually deobfuscated. All decryption was performed offline using Python.
That's... exactly expected? It's a cat and mouse game. People running botnets or AI scrapers aren't diligently setting the evil bit on their packets.
In my brief experience with abuse mitigation, connections coming from VPNs or unusual IP ranges were very significantly more likely to be associated with abuse.
It depends on your users. VPNs aren’t common at all, even though you hear about them a lot on Hacker News. For types of social sites where people got banned for abuse (forums) the first step to getting back on the forum was always to sign up for a VPN and try to reconnect. It got so bad that almost every new account connecting via VPN would reveal itself as a spammer, a banned member trying to return, or someone trying to sock puppet alternate accounts for some reason.
The worst offenders are Tor IP addresses. Anyone connecting from Tor was basically guaranteed to have bad intentions.
I heard from someone who dealt with a lot of e-mail abuse that the death threats, extortion, and other serious abuse almost always came from Protonmail or one of the other privacy-first providers that I can’t remember right now. He half-jokingly said they could likely block Protonmail entirely without impacting any real users.
It’s tough for people who want these things for privacy, but the sad reality is that these same privacy protections are favored by people who are trying to abuse services.
I have yet to see a use case for VPNs for the casual internet audience, and for a tech savvy user, they're better off renting through some datacenter or something, which at that point is hardly a VPN and more home IP obfuscation. All the same downsides, and at least you get real privacy.
At least outside the US, there's 3DS as an (admittedly often high friction) high quality cardholder verification method, but in the US, that's of course considered much too consumer-hostile, so "select 87 overpasses" it is.
ironically, in high end hotels, there's often a lot less cctv. not none. just less. rich people enjoy privacy
Doesn't make sense, my home is much more preferable to a hotel
According to the OP:
> The program checks 55 properties spanning three layers: your browser (GPU, screen, fonts), the Cloudflare network (your city, your IP, your region from edge headers), and the ChatGPT React application itself (__reactRouterContext, loaderData, clientBootstrap).
I guess Firefox VPN will hide the IP at least. But what about the other data, is it faked by RFP? Because if not, the so-called privacy offered by this configuration is outdated.
You might be fingerprinted by OpenAI right now, as “that guy with all the Firefox anti-fingerprinting stuff enabled, even though it breaks other sites”.
This is generally called virtual scrolling, and it is not only an option in many common table libraries, but there are plenty of standalone implementations and other libraries (lists and things) that offer it. The technique certainly didn't originate with Netflix.
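The core of virtual scrolling is just working out which rows intersect the viewport and rendering only those. A minimal sketch, assuming fixed-height rows (chat messages have variable heights, so a real implementation would need measured or estimated offsets); the function and parameter names are made up.

```python
def visible_range(scroll_top: int, viewport_height: int, row_height: int,
                  total_rows: int, overscan: int = 3) -> tuple[int, int]:
    """Return the half-open [first, last) range of rows worth rendering.

    `overscan` adds a few rows above and below so fast scrolling does not
    reveal blank space before the next render.
    """
    first = max(0, scroll_top // row_height - overscan)
    # Ceiling division for the bottom edge of the viewport.
    bottom_row = -(-(scroll_top + viewport_height) // row_height)
    last = min(total_rows, bottom_row + overscan)
    return first, last


# 10,000 rows of 20px in a 600px viewport: only ~36 rows are ever in the DOM.
window = visible_range(scroll_top=2000, viewport_height=600,
                       row_height=20, total_rows=10_000)
```

Everything outside the range is replaced by spacer height, which keeps the scrollbar roughly honest while the DOM stays small.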
It's perfectly possible to write fast or slow web applications in React, same as any other framework.
Linear is one of the snappiest web applications I've ever used, and it is written in React.
But hey, at least some bots are also not making it past Cloudflare!
I work a customer facing email job and loads of people use Proton across demographics and industries
Mullvad.
It has been proven in a court of law that when Mullvad says "no logging", they mean it.[1]
They also regularly have security audits and publish the results[2][3]
[1] https://mullvad.net/en/blog/mullvad-vpn-was-subject-to-a-sea...
[2] https://mullvad.net/en/blog/new-security-audit-of-account-an...
[3] https://mullvad.net/en/blog/successful-security-assessment-o...
Source? I haven't seen any evidence that the major paid VPN providers engage in any of those things. At best it's vague implications something shady is happening because one of the key people was previously at [shady organization].
MullvadVPN is also another great one.
I have heard some good things about AirVPN, but I can absolutely vouch for Mullvad and to a degree ProtonVPN (just with Proton, depending upon your threat model, do take the necessary precautions like buying with Monero for example)
There are others, but mostly it's the 2-3 that I trust.
https://news.ycombinator.com/item?id=3913919
No one seems to use or care about their own product anymore. They only use dashboards and metrics, which do not explain the full situation.
It's a pity Firefox doesn't get the praise it deserves half as much as it cops criticism.
It's also possible to make Firefox route each container through a different proxy, which could even be running locally and can then connect to multiple different VPNs. I haven't tried doing that but it's certainly possible.
It's sort of possible to run different browsers with completely new identities and sometimes IP within the convenience of one. It's really underrated. I don't use the IP part of this that I have mentioned but I use multi containers quite a lot on zen and they are kind of core part of how I browse the web and there are many cool things which can be done/have been done with them.
I also like privacy. I use GrapheneOS. I compartmentalize my credit cards, emails, and phone numbers. I don't use Google products, and the list continues, but I don't complain about Cloudflare because it is painless and I understand the price I pay for privacy.
I also have home services accessible via my home website, running on my home server(s). I chose to have cloudflare to host my domain specifically for the easy bot blocking, and it blocks more than 2000 bots/day that otherwise would be trying to find vulnerabilities on my servers, which contain a lot of sensitive things. I've never had an issue personally accessing my services through cloudflare. Sometimes I have to do captchas to access my own things, and that's barely an inconvenience (I am aware the domain isn't necessary to access services, but it makes more sense for my setup and intents)
And the salient difference is that CCTV is simply defense-in-depth, not a primary means for authentication.
Original here: https://archive.org/details/sim_creative-computing_1984-06_1...
Yeah I guess the cryptographic stuff sounds vaguely impressive although it’s been a long time since I had to think about cryptography in detail. But what is this _for_? I’m going to buy an expensive keyboard so that I can send messages to someone and they’ll know it’s really me – but it has to be someone who a) doesn’t trust me or any of our existing communication channels and b) cares enough to verify using this weird software? Oh and it’s important they know I sent it from a particular device out of the many I could be using?
Who is that person? What would I be sending them? What is the scenario where we would both need this?
Also the server can’t read the message but the decryption key is in the URL? So anyone with the URL can still read it? Then why even bother encrypting it?
Maybe this is one of those cases where I’m so far outside your target market that it was never supposed to make sense to me but I feel like I’m missing something here. Or maybe you need to work on your elevator pitch.
Just sharing my honest reaction.
Yes, RFP spoofs or at least somewhat obfuscates/normalizes GPU/screen/font info. The rest are integrity validations of the server/app, and not really identifying in any way.
>You might be fingerprinted by OpenAI right now, as “that guy with all the Firefox anti-fingerprinting stuff enabled, even though it breaks other sites”.
I'm not sure what the broader point you're trying to make here is. Is fingerprinting bad? Yes. All things being equal, I'd rather not have it than have it, but at the same time it's not realistic to expect openai to serve anonymous requests from anyone. Back when chatgpt was first launched you had to sign up and verify your phone number. Compared to mandatory logins, fingerprinting is definitely the lesser evil here.
None of which chatgpt can handle presumably.
Yep. The most easy to implement stable state for any system where you're aiming to prevent misuse is to just prevent use
Claude's free tier requires a phone number just to try it.
Best case, the VPN learns your residential IP and the names of every HTTPS host you connect to (if not your entire DNS traffic as well); worst case, they collude with any of the services you use (or some ad tracker they embed) and persistently deanonymize your account.
VPNs are structurally not great for privacy.
They allowed anonymous requests for months now, maybe even a year.
More info here:
https://web.archive.org/web/20231107182321/https://mu0.cc/20...
Is it TSA's "fault" that non-terrorists are subject to screening?
</sarcasm>
OpenAI documents how to opt out of scraping here: https://developers.openai.com/api/docs/bots
Anthropic documents how to opt out of scraping here: https://privacy.claude.com/en/articles/8896518-does-anthropi...
I'm not sure if Gemini lets you opt out without also delisting you from Google search rankings.
Well, I can use the world‘s best safety deposit box without being on CCTV while I pass secrets in and out of it, right? Just not for free.
Bummer, this sounds like it is about to turn into a Monero ad (“let us pay privately”)
Yes, even their "humanifesto" is LLM output, and is written almost exclusively in the "it's not X <emdash> it's Y" style.
[0]: https://github.com/magicseth/keywitness/graphs/contributors
This idea of capturing the timing of people's keystrokes to identify them, ensure it is them typing their passwords, or even using the timing itself as a password has been recurring every few years for at least three decades.
It is always just as bad. Because there are so many cases where it completely fails.
The first case is a minor injury to either hand — just put a fat bandage on one finger from a minor kitchen accident, and you'll be typing completely differently for a few days.
Or, because I just walked into my office eating a juicy apple with one hand and I'm in a hurry typing my PW with my other hand because someone just called with an urgent issue I've got to fix, aaaaannnd, your software balks because I'm typing with a completely different cadence.
The list of valid reasons for failure is endless wherein a person's usual solid patterns are good 90%+ of the time, but will hard fail the other 10% of the time. And the acceptable error rate would be 2-4 orders of magnitude less.
It's a mystery how people go all the way to building software based on an idea that seems good but is actually bad, without thinking it through, or even checking how often it has been done before and failed?
If you're engaging with the idea seriously, I suppose we'd need to build a reputation or trust network or something.
Although if you're talking about replay attacks specifically, there are other crypto based solutions for that.
GP was mentioning that a solution to the problem exists, not that Netflix specifically invented it. Your quip that the technique is not specific to Netflix bolsters the argument that OpenAI should code that in.
- "ctrl + f" search stops working as expected
- the scrollbar has wrong dimensions
- sometimes the content might jump (common web issue overall)
The reason why we lost it is because web supports wildly different types of layouts, so it is really hard to optimize the same way it is possible in native apps (they are much less flexible overall).
I think you're lucky to hang around people whose heads don't hurt when they think.
(It's a different question whether zero friction is actually desired, or whether some security theater is actually part of the service being provided, but that's a different question.)
I can imagine their models have been trained on a lot of websites before opt outs became a thing, and the models will probably incorporate that for forever.
But at least for websites there's an opt-out, even if only for the big AI companies. Open source code never even got that option ;).
PRESS RELEASE: UNITED BURGLARS SOCIETY
The United Burglars Society understands that being burgled may be inconvenient for some. In response, UBS has introduced the Opt-Out system for those who wish not to be burgled.
Please understand that each burglar is an independent contractor, so those wishing not to be burgled should go to the website for each burglar in their area and opt out there. UBS is not responsible for unwanted burglaries due to failing to opt out.
Bit concerning that some professional engineers don't understand this given the sensitive systems they interact with.
Also are hidden cameras even legal? I know here in EU they aren't.
It is an attempt at putting something into the conversation more than just "OSS is broken because there are too many slop PRs." What if OSS required a human to attest that they actually looked at the code they're submitting? This tool could help with that.
Yes LLMs were used greatly in the production of this prototype!
It doesn't change the goal of the experiment! Or its potential utility! Do you see any potential area in your world where some piece of this is valuable?
....no. There's not a single occurrence of that.
https://keywitness.io/manifesto
There are six emdashes on that page. NONE of them are "it's not X it's Y".
> Emails, messages, essays, code reviews, love letters — all suspect.
> We believe this can be solved — not by detecting AI, but by proving humanity.
> KeyWitness captures cryptographic proof at the point of input — the keyboard.
> When you seal a message, the keyboard builds a W3C Verifiable Credential — a self-contained proof that can be verified by anyone, anywhere, without trusting us or any central authority.
> That's an alphabet of 774 symbols — each carrying log2(774) ≈ 9.6 bits. 27 emoji for 256 bits.
> They're a declaration: this message was written by a person — one of the diverse, imperfect, irreplaceable humans who still choose to type their own words.
Clarifications: 4
Continuation from a list: 1
Could just be a comma: 1
"It's not X -- it's Y": 0.
If you're going to make lazy commentary about good writing being AI, please at least be sure that you're reading the content and saying accurate things.
It proves 1) that an apple device with a secure enclave signed it. 2) that my app signed it.
If you trust the binary I've distributed is the same as the one on the app store, then it also proves: 3) that it was typed on my keyboard not using automation (though as others have mentioned, you could build a capacitive robot to type on it) 4) that the typer has the same private key as previous messages they've signed (if you have an out of band way to corroborate that's great too) 5) optionally, that the person whose biometrics are associated with the device approved it.
There is also an optional voice to text mode that uses 3d face mesh to attempt to verify the words were spoken live.
Not every level of verification is required by the protocol, so you could attest that it was written on a keyboard, but not who wrote it (not yet implemented in the client app).
The protocol doesn't require you to run my app, if you compile it yourself, you can create your own web of trust around you!
thaaaaaaaaanks
For what it’s worth, modern browsers can render absurdly large plain HTML+CSS documents fairly well except perhaps for a slow initial load as long as the contents are boring enough. Chat messages are pretty boring.
I have a diagnostic webpage that is a few million lines long. I could get fancy and optimize it, but it more or less just works, even on mobile.
They described Netflix's implementation, but if someone actually wanted to follow up on this (even for their own personal interest), Dynamic HTML would not get you there, while virtualization would across all the places it's used: mobile, desktop, web, etc.
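For anyone who wants to follow up on the virtualization idea, here is a minimal sketch of the core calculation, assuming fixed-height rows. The function name, the parameters, and the overscan value are all made up for illustration and do not come from Netflix or any particular library. The point is only that you render the rows intersecting the viewport (plus a small buffer) instead of the whole list.

```python
def visible_range(scroll_top, viewport_height, row_height, total_rows, overscan=3):
    """Return (first, last) row indexes to render; last is exclusive.

    Assumes every row has the same height (a simplification; real lists
    often need measured or estimated heights).
    """
    # First row whose bottom edge is below the top of the viewport.
    first = scroll_top // row_height - overscan
    # Ceiling division: one past the last row that starts above the viewport bottom.
    last = -(-(scroll_top + viewport_height) // row_height) + overscan
    # Clamp to the actual list bounds.
    return max(0, first), min(total_rows, last)


if __name__ == "__main__":
    # A million 20px rows in a 600px viewport: only a few dozen get rendered.
    first, last = visible_range(10_000, 600, 20, 1_000_000)
    print(first, last, last - first)
```

The same arithmetic applies whether the rows end up as DOM nodes, native views, or cells in a TV UI, which is why the technique carries across platforms.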
We lost it because the web was never designed for applications and the support it gives you for building GUIs is extremely basic beyond styling, verging on more primitive than Windows 3.1 - there are virtually no widgets, and the widgets that do exist have almost no features. So everyone rolls their own and it's really hard to do that well. In fact that's one of the big reasons everyone wrote apps for Windows back in the day despite the lockin, the value of the built-in widget toolkit was just that high. It's why web apps so often feel flaky and half baked compared to how desktop apps tend(ed) to feel - the widgets just don't get the investment that a shared GUI platform allows.
"You're posting too fast! Please slow down."
1. Every person is born with the knowledge of how ChatGPT uses Cloudflare Turnstile?
2. This article contains factual mistakes? If so, what are they?
If neither of these is true, then this article strictly provides information and educational value for some readers. The writing style, AI-like or not, doesn't change that.
“Ignorant” is also infinite - you’re ignorant of MANY things as well, and I’m sure you would struggle with things I can do with ease. For example, understanding the meaning behind what’s being said so I know not to brow-beat someone over it.
It was a dataset of the entirety of the public internet from the very beginning that bypassed paywalls etc, there’s virtually nothing they haven’t scraped.
The emoji idea was mine. I like it :-) unfortunately it doesn't work in places like HN that strip out emoji. So I had to make a base64 encoding option.
The goal was to create an effective encryption key for the url hash (so it doesn't get sent to the server). And encoding skin tone with human emojis allows a super dense bit/visual character encoding that ALSO is a cute reference to the humans I'm trying to center with this project!
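A rough sketch of the arithmetic, for the curious. This is not the app's actual code: the function names are hypothetical, and the output is a list of indexes into a 774-symbol alphabet rather than real emoji. It just treats the key as one big integer and writes it in base 774, which is where the "27 emoji for 256 bits" figure comes from.

```python
ALPHABET_SIZE = 774  # symbol count quoted in the manifesto


def symbols_needed(bits, alphabet_size=ALPHABET_SIZE):
    """Smallest n such that alphabet_size**n can represent every value of `bits` bits."""
    n, capacity = 0, 1
    while capacity < 2 ** bits:
        capacity *= alphabet_size
        n += 1
    return n


def encode(key, alphabet_size=ALPHABET_SIZE):
    """Encode bytes as a fixed-length list of symbol indexes (most significant first)."""
    length = symbols_needed(len(key) * 8, alphabet_size)
    value = int.from_bytes(key, "big")
    digits = []
    for _ in range(length):
        value, digit = divmod(value, alphabet_size)
        digits.append(digit)
    return digits[::-1]


def decode(digits, key_len, alphabet_size=ALPHABET_SIZE):
    """Inverse of encode(); key_len is the original key length in bytes."""
    value = 0
    for digit in digits:
        value = value * alphabet_size + digit
    return value.to_bytes(key_len, "big")


if __name__ == "__main__":
    import secrets

    key = secrets.token_bytes(32)  # 256-bit key
    digits = encode(key)
    print(len(digits))  # number of symbols needed for 256 bits
    assert decode(digits, 32) == key
```

Mapping each index to an actual emoji (and back) is then a plain lookup table, and the base64 fallback is the same idea with a 64-symbol alphabet.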
"It's not X -- it's Y": 1
> The server stores an encrypted blob it can't decrypt. We couldn't read your messages even if we wanted to. That's not a policy — it's math.
If you can’t tell that this is AI slop then maybe KeyWitness does solve a real problem after all.
What Apple devices are supported? All I have is an iPhone 4 running an old iOS version (pre-iOS 7), which I will not update and which I don't think has a Secure Enclave, plus an M1 Mac mini, some Lightning EarPods, an Apple Thunderbolt Display, some USB-A chargers, and some old MacBooks.
I saw something about Android (https://typed.by/manifesto#:~:text=Android,Integrity) on the website, but it mentioned Play Integrity, which I do not have because I use LineageOS for MicroG.
I think the concept is stupid because it would require somehow proving that the app is not modified (which is impractical) and that there is no stylus on a motor or fake screen (which is also impractical).
I think a better approach would be to form a Web of Trust where only people's certificates are signed (not just humans: this would include all animals and potentially aliens, but no clankers), with an interface that is friendly to people who are not very into technology and some way to avoid revealing who your friends are. This would still allow someone to get an attestation for their robot, though.
On a web of trust, if you have a negative interaction with a bot, you revoke trust in one of the humans in the chain of trust that caused you to come in contact with that bot. You've now effectively blocked all bots they've ever made or ever will make... At least until they recycle their identity and come to another key signing party.
Once you have the web in place though, a series of "this key belongs to a human" attestations, then you can layer metadata on top of it like "this human is a skilled biologist" or "this human is a security expert". So if you use those attestations to determine what content you're exposed to, then a malicious human doesn't merely need to show up at a key signing party to bootstrap a new identity, they also have to rebuild their reputation to a point where you or somebody you trust becomes interested in their content again.
Nothing can be done to prevent bad people from burning their identities for profit, but we can collectively make it not economical to do so by practicing some trust hygiene.
Key signing establishes a graph upon which more effective trust management becomes possible. It on its own is likely insufficient.
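To make the revocation idea concrete, here is a toy sketch. The class and method names are invented for illustration, and there is no real cryptography here: a key counts as trusted only if it can be reached from you through a chain of signatures that does not pass through anyone you have revoked.

```python
class TrustGraph:
    """Toy web of trust: directed 'signer vouches for subject' edges."""

    def __init__(self, me):
        self.me = me
        self.vouches = {}     # signer -> set of keys they signed
        self.revoked = set()  # keys I no longer trust

    def sign(self, signer, subject):
        self.vouches.setdefault(signer, set()).add(subject)

    def revoke(self, key):
        self.revoked.add(key)

    def trusted(self):
        """Keys reachable from me without passing through a revoked key."""
        seen = {self.me}
        stack = [self.me]
        while stack:
            current = stack.pop()
            for subject in self.vouches.get(current, ()):
                if subject not in seen and subject not in self.revoked:
                    seen.add(subject)
                    stack.append(subject)
        return seen


if __name__ == "__main__":
    g = TrustGraph("me")
    g.sign("me", "alice")
    g.sign("alice", "mallory")
    g.sign("mallory", "bot1")
    g.sign("mallory", "bot2")
    g.revoke("mallory")         # one bad interaction with bot1
    g.sign("mallory", "bot3")   # a bot made later is cut off as well
    print(sorted(g.trusted()))
```

Revoking one human in the chain drops every bot that was only reachable through them, including ones they attest later. A fresh identity re-enters the graph only once somebody still trusted signs it, which is the "recycle their identity" hole mentioned above.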
>>While you type, the keyboard quietly records how you type — the rhythm, the pauses between keys, where your finger lands, how hard you press.
>>Nobody types the same way. Your pattern is as unique as your handwriting. That's the signal.
This very precisely makes my point:
Yes, the typing pattern of any human is highly and possibly even completely unique to that human — UNTIL any of a myriad of everyday issues makes it falsely deny access because the human's typing pattern has changed in a way the human can't do anything to fix at the moment.
If you are only attempting to distinguish a human from an automated system, it'll be better, until someone just starts recording the same patterns and replaying them to this upstream process; then it's a mere race to see who can get their hooks in at a lower level. And someone is always going to say: "Oh, this system can identify the specific human", and we're off to the races again.
So, no. Unless you can account for ALL of the reasonable everyday failure modes, typing with either hand, any finger or combination of fingers out of commission for a minute or a lifetime, this idea will fail.
More generally, it's one of the interesting things about working in a non-big-tech company with non-public-facing software. So much of the received wisdom and culture in our field comes from places with incredible engineering talent but working at totally different scales with different constraints and requirements. Some of the time the practices, tools, and approaches advocated by big tech apply generally, and sometimes they do things a particular way because it's the least bad option given their constraints (which are not the same as our constraints).
There are good reasons why Amazon doesn't return a 10,000 row table when you search for a mobile phone case, but for [data ]scientists|analysts etc many of those reasons no longer apply, and the best UX might just be the massive table/grid of data.
Not sure what the answer is, other than keep talking to your users and watching them using your tools :)
There's nothing stopping folks from typing a message an LLM wrote one at a time, but the idea of increasing the human cost of sending messages is an interesting one, or at least I thought :-(
> While you type, the keyboard quietly records how you type — the rhythm, the pauses between keys, where your finger lands, how hard you press.
> Nobody types the same way. Your pattern is as unique as your handwriting. That's the signal.
A human is personally responsible for a bot acting on their behalf. If your bot behaves, nothing is going to happen. If you keep handing out your personal keys to shitty misbehaving bots, then you will personally get banned - which gives you a pretty good incentive to be a bit more discerning about the bots you use.
Or if you ever need to travel a lot and tether off your phone. Most mobile devices are IPv6-only (via 464XLAT) behind a CGNAT these days.
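For reference, a small sketch of the two address ranges involved, using only Python's standard ipaddress module. The helper names are made up. It assumes the RFC 6052 well-known NAT64 prefix, whereas carriers may use their own network-specific prefix instead.

```python
import ipaddress

CGNAT_SHARED = ipaddress.ip_network("100.64.0.0/10")  # RFC 6598 shared address space
NAT64_WKP = ipaddress.ip_network("64:ff9b::/96")      # RFC 6052 well-known prefix


def synthesize_nat64(ipv4):
    """Embed an IPv4 address in the low 32 bits of the NAT64 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(NAT64_WKP.network_address) | int(v4))


def is_cgnat(ipv4):
    """True if the address sits in the carrier-grade NAT shared range."""
    return ipaddress.IPv4Address(ipv4) in CGNAT_SHARED


if __name__ == "__main__":
    print(synthesize_nat64("192.0.2.33"))
    print(is_cgnat("100.64.1.1"), is_cgnat("100.128.0.1"))
```

Either way, the site only ever sees the carrier's shared public IPv4 address on the far side of the translator, so many unrelated phones look like one client.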
I'm certain the User-Agent is part of it. I know that for certain because a very reliable way I can trigger the CF stuff is this plugin with the wrong browser selected [1].
[1] https://addons.mozilla.org/en-US/firefox/addon/uaswitcher/
I’m almost endlessly surprised by the probably-autistic-spectrum responses to tech things from people with no idea how things seem to other people.
Maybe they deliberately write it like that, to filter out people who aren’t the target market?
Sorry it doesn't meet your needs.
There is irony in having an ai generated humanifesto. Could it be intentional? hmm?
Is there no irony in deriding a project for being potentially LLM generated, when its goal is to aid people in differentiating? :shrug:
Maybe I just have to disable all ad blockers and Safari tracking prevention? Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
I can't tell whether you're serious but in case you are, this theory immediately falls apart when you realize waymo operates at night but there aren't any night photos.
The "quality" of TSA's screening seems to be pretty bad too, given how many people have to go through secondary screening vs how many terrorists they catch (0?).
I think I was sufficiently clear that I was specifically talking about CGNAT-caused IP address tainting being an unreasonably emphasized worry, not the worry about their detections overall misfiring. Though I certainly don't hear much about people having issues with it (but then anecdotes are anecdotal).
> Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
Sounds good, have you tried?
Not sure what's the point of these comically asinine rhetoricals.
Consider also that many people aren't the best at writing blog-like posts but still have things to share and AI empowers them to do that. I can't find anything constructive in your post and I don't understand why you are posting at all.
For instance, the employee at Apple that decided to pull ICE Block from the store could decide that the "admissible in court" bit should be false if it looks like a police officer is in frame.
Similarly, the keyboard could decide your social credit score is too low, and just stop attesting. A court could order this behavior.
Or, you could fail mandatory age / id verification because your credit card expired, and then all the above + more could happen! Good luck getting through to credit card tech support at that point...
Nice try but I used "caught", not "stopped", which requires they actually apprehended someone, not just prevented some hypothetical attack.
>since they got the gig to serve and protect and before we lost thousands of lives…)
You could easily reuse this argument for Cloudflare: "if it wasn't for such invasive browser fingerprinting OpenAI would be drowning in a bajillion req/s from bots."
Of course they would be drowning! I have no issues with what CF is doing. Too funny that people use tools like ChatGPT and expect privacy?!