Cloudflare Turnstile requiring fingerprintable WebGL

Cloudflare is known to use fingerprinting to detect scrapers. For example, they use JA3 fingerprints and match them against the UA to block stuff like cURL while allowing OkHttp (Android clients) - but this can easily be spoofed with packages such as CycleTLS [1].

I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Cromite, a privacy-conscious fork of Chromium for Android, constantly has issues with Cloudflare Turnstile [2] because they (Cloudflare) try to fingerprint it in multiple ways before it can pass the challenge. The only way to get it to work would be to join the Cloudflare Browser Developer program - which requires signing an NDA. Rightfully so, the project maintainer didn't want to do it.

If you want to see the extent of what Cloudflare does to fingerprint browsers, just have a look at the issue [2] and see which flags need to be disabled in order for the browser to pass the Cloudflare challenge.

I understand both sides, but at least Cloudflare could be flexible enough to fall back to PoW instead of just blocking people from sending forms or accessing websites...

[1]: https://github.com/Danny-Dasilva/CycleTLS

[2]: https://github.com/uazo/cromite/issues/2365
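For anyone unfamiliar with the JA3 fingerprints mentioned above: a JA3 hash is the MD5 of five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats) joined with commas, with the values inside each field joined by dashes. A minimal sketch, assuming `md5sum` from coreutils is installed; the field values are illustrative only and not captured from a real client, and the helper name `ja3_hash` is made up for this example. How Cloudflare compares the hash with the UA is not shown here.

```shell
#!/bin/sh
# ja3_hash VERSION CIPHERS EXTENSIONS CURVES POINT_FORMATS
# Each argument is a dash-separated list of decimal values from the ClientHello.
ja3_hash() {
    ja3_string="$1,$2,$3,$4,$5"
    # md5sum prints "<hash>  -"; keep only the hash itself
    printf '%s' "$ja3_string" | md5sum | cut -d' ' -f1
}

# Illustrative values only (hypothetical client)
ja3_hash 769 "47-53-5-10-49161-49162-49171-49172-50-56-19-4" "0-10-11" "23-24-25" 0
```

Changing a single cipher or extension changes the whole hash, which is why a client such as cURL looks different from OkHttp, and also why replaying another client's ClientHello is enough to spoof it.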

I'm maintaining a minority browser[0] and for a couple of weeks now this has been affecting several of our users[1]. While I'm currently not considering this a browser bug (one could be involved, of course), more eyes are better and any help or ideas on improving or mitigating the situation would be appreciated.

[0]: https://konform-browser.codeberg.page/

[1]: Most? All? Without any telemetry, relying on user reports and our own testing here.

> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

For good reason. I've run that setting for ages but I kept having to disable it and add workarounds because websites would break in weird ways. Timezones in scheduling websites being messed up nearly made me miss a couple of appointments. There's no way to tell the user Firefox isn't broken without displaying a permanent banner like "if websites are broken in any way or you see weird glitches or your computer's time is wrong or fonts look weird or videos don't always work right, click here to disable fingerprinting protection".

Interestingly, Turnstile breaks with resistfingerprinting but works with fingerprintingProtection, I guess the latter takes this crap into account.

Is there a deal between Google and Cloudflare to make non-Chrome browsers harder to use? The pressure to use Chrome keeps increasing, and the amount of ad filtering you can do in Chrome keeps decreasing.

Thanks, I did not know about `privacy.resistfingerprinting`.

I'll make sure to fail all cloudflare turnshit in the future.

So if you need to prevent bot abuse, but also don't want an ugly captcha every time someone goes to sign up, is there a better option?

In other words, Cloudflare requires you to substantially increase your browser’s attack surface in order to visit websites.

I tested this extension that I've been using for a long time on the turnstile page and it got through, fwiw. I think it's a bit more subtle than how resistfingerprinting works but not sure what the privacy tradeoff is.

https://github.com/kkapsner/CanvasBlocker

> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

That pref is there for the Tor Browser.

Doesn't this mean we just need to make the WebGL fingerprint resistance implementation smarter? Instead of explicitly rejecting WebGL access or responding with dummy data, respond with data that is random within a space of N common and reproducible patterns. E.g. emulate the WebGL implementation of some low-spec but actually popular devices.

I did warmups in Grub Crawler to fight this: https://deepbluedynamics.com/grub

"This makes your browser appear suspicious because it looks like you're trying to hide your identity."

Yeah, this needs to be burned to the ground.

Adding noise to a canvas element is a mistake anyway. It means you can't develop a proper paint program using web technologies because your browser will mess with the image.

This company makes the internet unusable if you value privacy and use VPNs or whatever. Evil.

As turnstile users on several of our sites, I think we need to revisit that decision.

...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

Obviously this is terrible, but I think there's a possibility it's the least terrible option? Another option is IP reputation, which I think is worse. Or scanning a code with a non-rooted phone, which I think is even worse than that!

They use all kinds of obscure APIs, which you'll learn if you're privacy/security conscious and disable random web APIs that are of no use to YOU as a web user, but only can ever serve the people who serve you stuff or want to hack you or track you.

Normally websites feature test and just skip using obscure disabled APIs, or more likely, websites don't use those APIs at all or only tracking scripts use it, which are already optional usually.

Problem with CF is that if you want increased security they'll prevent you from gaining it everywhere, even on sites they don't protect, or prevent you from accessing services even the ones you paid for. Browsers don't allow disabling APIs per domain, so you're either at risk everywhere or you're blocked from accessing a lot of things for no particular reason.

CF can't be bothered to feature test.

Don't like it, but it's a reality due to bots.

This blog post is filled with false assumptions.

>Turns out it's because Cloudflare wants to have a fingerprint of your device via WebGL, the only reason for doing this would be tracking.

> So Cloudflare just banned all WebKitGTK browsers as I guess they put an exception for Safari.

This is false. I ran firefox with:

* hardware acceleration disabled (so software renderer, nothing to fingerprint)

* resistfingerprinting enabled, including letterboxing with default window size

* webgl disabled

* VPN enabled

* In a Windows VM

By all accounts this should be the most suspicious fingerprint ever, but turnstile happily lets me through. If they want to track people, they're doing a pretty bad job. My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

> Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".

Say no to malware - say no to Cloudflare

Firefox has so much built-in tracking it seems they want to push me to build my own browser. For example every time you open the settings there are several ways they are sending out pings to certain extensions.

Also by default addons.mozilla.org is a privileged site so of course they include google tracking in it and they get the proper fingerprint no matter what you have configured.

I wondered about that too. So they allege that bots require that everyone now has to ID to the big service providers. Very dystopian situation. Skynet is currently winning the war.

Please, anyone from the EU (US is doomed rofl) create a petition to ban browser fingerprinting in the EU, across all existing browsers.

I'm not good at creating petitions but can happily sign it. Same as with Stop Killing Games and anti-Chat Control.

I can imagine this can get traction if it's explained in a YouTube video to "normal" people.

What? Big tech company is evil? No way! I thought cloudflare were good guys...

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection"

They also gate away a good many people with their "bot protection". I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

They're also anti free speech.

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Bot protection with fingerprinting is just an illusion. Any client-side signal like this can be spoofed by an above-average person. Fingerprinting is just a way to consolidate the market for the advertising business. Assigning reputation to residential IP addresses and commercial blocks is another approach to achieve the desired result. Providers would be a lot more careful about allowing their IP addresses to be misused; however, it turns out that this would bring down the DDoS business on both sides, attackers and protectors.

Ironically, more often than not it's the same companies that invest in building their own bots and finding ways to stop bots from other companies.

it's all for nothing, because Cloudflare's scraping protection works about as well as a $5 padlock - good enough to dissuade bored teens, not good enough to dissuade even an amateur burglar. if someone wants to scrape your publicly visible data, they will. there's nothing you can do.

> but unless you do PoW (which is also ecologically a nightmare)

Can you expand? I don't see a problem with some napkin math. A 5W load for 2 seconds is about 0.0028Wh (we have to let smartphones pass, and not by doing PoW for tens of seconds). 8 billion checks a day for a year ≈ 8GWh.
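The napkin math is easy to reproduce. A small sketch using only the figures from the comment (5 W, 2 s, 8 billion checks per day); `awk` is assumed for the floating-point arithmetic and the function name `pow_energy` is made up for this example.

```shell
#!/bin/sh
# pow_energy WATTS SECONDS CHECKS_PER_DAY
# Prints the energy per check (Wh) and the yearly total (GWh).
pow_energy() {
    awk -v w="$1" -v s="$2" -v n="$3" 'BEGIN {
        wh_per_check = w * s / 3600                        # watt-seconds -> watt-hours
        gwh_per_year = wh_per_check * n * 365 / 1000000000 # Wh -> GWh over a year
        printf "%.4f Wh per check, %.1f GWh per year\n", wh_per_check, gwh_per_year
    }'
}

pow_energy 5 2 8000000000
# prints: 0.0028 Wh per check, 8.1 GWh per year
```

Doubling the solve time or the wattage doubles the total, so the result stays in the single-digit to low double-digit GWh range for any challenge a phone can finish in a few seconds.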

This is why I have two separate browsers. If you want to do official stuff like paying for things you need to get through cloudflare.

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection"

I can no longer access any website that's "protected" by Cloudflare. As soon as a website enables that stuff… "Shoot, another one bites the dust." I wonder if the website owners realise at all how many actual users they lose by this sort of "protection."

They sometimes have to comply with legal requests (which I understand), but at the same time they have a huge market share - which means that the internet is becoming less and less decentralized and more in their control. We've seen the effects of that in previous outages...

It's just one more facet of the enshittocene, the era where actual product quality is completely irrelevant. Put it in the same bucket as websites that lag when you scroll, apps that refuse to show you video without a huge play/pause button overlaid in the middle of it that never goes away, and the movie Melania. My hypothesis is that billion-dollar businesses no longer exist to sell things to customers, but only to impress other billionaires to get their investment money.

>I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

I think the Web is on its last legs, anyway. Generative AI and LLM-instead-of-search have destroyed what little value remained.

I have it enabled and turnstile works fine.

Looks cool. And I wonder why I'd run this over JSshelter. It appears to do the same thing, no?

Thanks for the report, I've been running this for a long time.

I think your comment is also making plenty of assumptions.

Official Firefox can be leaky unless you build it yourself with some build-time changes or use a fork with such[0]. Am I guessing right that you still have Webcompat, RemoteSettings, and Nimbus enabled? How do you know a compatibility intervention isn't causing your browser to open the kimono just enough to "unbreak the page"?

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

My guess is a different flavor of the same: Not matching an expected fingerprint (simplified: whitelist vs blacklist approach) combined with other factors.

[0]: I'm currently aware of Tor Browser, Konform Browser (am dev), Mullvad Browser, and to a certain extent Waterfox, LibreWolf, and r3df0x doing that.

Enabling resistfingerprinting on my Android phone shows me the same error screen. It's not just webkit.

fingerprintingProtection works fine on the other hand, but then again that's intentionally less intrusive.

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

So why is Cloudflare saying the author got blocked because of WebGL?

> > Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

> This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".

While I don't have an iDevice to try, the assumption that they are special cased is fair... because they are: https://blog.cloudflare.com/eliminating-captchas-on-iphones-...

(Yes, this is basically WEI in a shinier package.)

Yep. Cloudflare and cloudflare's customers don't care about blocking people that use non-standard browsers (or accessible browsers, or feed readers, or whatever). Using cloudflare defaults is basically saying, "Only major corporate browsers released in the last year or two can access this site."

If you are this motivated (I am!), how about joining forces on Konform Browser? Radio silence and remote third-party integrations disabled by default and generally sane and conservative defaults respecting old-fashioned notions like individual consent and data-protection regulations.

Aside from general dev, could use a hand in bringing it to more platforms (mobile and flatpak are frequently asked) and taking a closer look at fingerprinting protections and what's currently tripping up the turnstile.

https://codeberg.org/konform-browser/source

> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

That pref is there for the Tor Browser.

It's enabled by default in Tor Browser and I'm not sure it can even be disabled?

Also enabled by default for Konform Browser and Mullvad Browser, which borrow many of the privacy- and security-related patches from Tor Browser.

Maybe a good reason for not enabling it by default but a bad reason for not enabling it for strict settings.

I somewhat expect breaking sites with strict settings, I don’t expect a still wide-open tracking path.

That’s deceiving.

I would wager it's one of the natural consequences of Chrome being the most popular browser on the web. Most legit traffic will be from Chrome.

The last screenshot in the OP article mentions that "a browser extension... adding random noise to canvas data" can be detected. Which isn't to say this perfectly detects all such randomization, but it's certainly an active part of the arms race.

All of those advanced features should be enabled on a per-website basis but unfortunately even browsers whose marketing focuses on privacy don't allow you to do that. Same with TLS root CA certificates, there is no way to configure that a certain CA can only create certificates for certain domains.

"This makes your browser appear suspicious because it looks like you're trying to hide your identity."

Yeah, this needs to be burned to the ground.

Bad optics aside, it doesn't actually reflect reality. See my other comment. You can enable basically all the privacy settings and still pass turnstile. Tor browser in a VM passes it, of all things.

https://litter.catbox.moe/gaizpk692bhhs6b7.png

So if you need to prevent bot abuse, but also don't want an ugly captcha every time someone goes to sign up, is there a better option?

Behavioral signals are the usual answer: risk-scored, invisible challenges; proof-of-work (cost without identity, though it taxes mobile); and signup-velocity/rate limits that stop cheap abuse before any challenge fires. The reason fingerprinting wins anyway is that it requires less operator effort, not that it is the only thing that works.

Use proof-of-work captchas, many are private by default. Look into Private Captcha or Cap captcha.

The tool "Anubis" uses proof of work instead
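For readers wondering what "proof of work" means here, a hashcash-style sketch: the server hands out a challenge, the client has to find a nonce so that the SHA-256 of challenge and nonce starts with a number of zero hex digits, and the server checks it with a single hash. This is a generic illustration and not how Anubis, Private Captcha or Cap actually implement it; `sha256sum` (coreutils) is assumed and all function names are made up for this example.

```shell
#!/bin/sh
# Hashcash-style proof of work: find a nonce such that
# sha256("<challenge>:<nonce>") starts with DIFFICULTY zero hex digits.

pow_hash() {
    printf '%s:%s' "$1" "$2" | sha256sum | cut -d' ' -f1
}

# pow_verify CHALLENGE NONCE DIFFICULTY -> exit status 0 if valid (one hash, cheap)
pow_verify() {
    # Build a string of DIFFICULTY zeros to compare against the hash prefix
    prefix=$(printf "%${3}s" '' | tr ' ' '0')
    case "$(pow_hash "$1" "$2")" in
        "$prefix"*) return 0 ;;
        *) return 1 ;;
    esac
}

# pow_solve CHALLENGE DIFFICULTY -> prints the first valid nonce (many hashes, costly)
pow_solve() {
    nonce=0
    until pow_verify "$1" "$nonce" "$2"; do
        nonce=$((nonce + 1))
    done
    echo "$nonce"
}

# Example: difficulty 1 needs about 16 attempts on average
solution=$(pow_solve "example-challenge" 1)
echo "nonce=$solution hash=$(pow_hash "example-challenge" "$solution")"
```

Each extra zero digit multiplies the expected work by 16, while verification stays at one hash. No identity or device data is involved, which is the privacy argument; the cost is paid in CPU time instead.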

Adding noise to a canvas element is a mistake anyway. It means you can't develop a proper paint program using web technologies because your browser will mess with the image.

You can still do that, but it may not be rendered correctly in a screenshot.

As turnstile users on several of our sites, I think we need to revisit that decision.

Out of curiosity, why did you have it on in the first place?

> ...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

There isn't one, and pretending otherwise is nonsense because humans will always provide their credentials to something to act on their behalf.

In the limit you end up with Chinese phone farms.

The only solution is regulation. If all content created by anyone has a copyright, how does an implicit opt-in (which is what happens if you don't create a robots.txt file for your website) for scraping make any sense? Moreover, even if you have a robots.txt, AI (or whatever) bots often don't respect it (or use workarounds - they outsource scraping of such "restricted" sites to unethical third parties to get the data; Meta has even resorted to piracy, openly!). So clearly, the logic and the "honour system" have failed.

Cloudflare, Google Captcha, HCaptcha etc. are all shitty technical solutions because, as we are all discovering, they come at the cost of our privacy (i.e. these services may monetise our personal data) and / or our computing resources and time. If current copyright laws aren't sufficient to prevent this, we have to acknowledge the system is broken. The answer could be enhancing it with some kind of Digital Millennium Copyright Act (DMCA)-like laws, but in favour of the creators against BigTech or rogue actors.

- Web-scraping and copyright law - https://www.neudata.co/blog/web-scraping-and-copyright-law

- Why DMCA Claims Against Web Scrapers Face Long Odds - https://capstonedc.com/insights/why-dmca-claims-against-web-...

Remote attestation should still be possible with a rooted phone if phone manufacturers weren't so shit. If the attestation happens at hardware level, it doesn't matter what programs or kernels you're running.

And identifying a bot that is acting on my behalf. "Claude, go search this topic" is basically the same as Googling something and clicking on the results. Human-driven AI searching needs to be in a different box than AI scraping for training data.

Which sounds extremely difficult to differentiate

Or maybe we can actually start paying for all the things we use on the Web, making it prohibitively expensive to deploy fleets of bots.

You don't need a non-rooted phone to pass captcha checks, I have a rooted phone and can pass the captchas that ask you to scan a qr code. But I doubt phones without google services would manage.

Private invite only internets

They are not a problem unless you "believe" they are a problem. I estimate around 20-25K hits to my website from bots per day and I have all Cloudflare protections disabled. Any decently optimized server should be able to easily handle that (it's roughly 1 request every 3 to 4 seconds).

web environment integrity

> keeping out bot

You can forget about it. It is not possible. Simple as that.

What gave you the impression that Cloudflare were the good guys?

Big tech companies are always visited first by the G-men who need something done.

Cloudflare will just tell them that the 70% traffic drop is because 70% of their traffic was bots, and everything is working fine, and hey, don't you want to upgrade to a paid plan to block 50% of the remainder? Think about how many bots will be blocked with that upgrade!

A better solution would be to make webgl, webgpu and (especially) webrtc have some sort of prompt before they can be in any way used in that fashion, but this will absolutely destroy web ux Windows Vista style.

Fingerprinting is just an implementation, banning it will just drive these companies to invent new tricks. That's why the GDPR doesn't specify any technical tracking methods, whether you're using cookies or fingerprinting or a camera drone looking at the user's screen, tracking without consent or good reason is banned.

I doubt politicians care much about fingerprinting, though. They're more afraid of actual businesses getting attacked by bots than they are about Linux users with weird setups not being able to access some websites.

a. Accept All

b. Accept Only Necessary Fingerprinting

At the same time: it sure works well enough to annoy anyone with a "bad ASN" IP with 80 captchas a day.

Exactly. I’m constantly amazed at how little you actually need to bypass CF, Amazon, Azure WAFs and so on (Incapsula springs to mind too). When you look at the code you’ve come up with, it’s actually quite small and compact.

More to the point, these systems actually help scraping because proof of work unlocks essentially unlimited scraping, in my experience.

That said - from my experience on the other side, sure you can’t stop people like me or you, but you can stop 99% of the others. That’s more than worth it operationally.

I stand corrected. It's not a nightmare scenario (as with Bitcoin) - but I'm still of the opinion that "useless" computations should be avoided (as we should avoid having 10MB websites).

In any case, according to some napkin math done by Kimi 2.6 (which by itself is probably already consuming more than all of my PoW challenges for the upcoming 5 years) - the situation looks incredibly in favor of PoW: https://www.kimi.com/share/19e7ef40-a432-8912-8000-0000b4a71...

Which makes me wonder why CloudFlare isn't switching to this already

> Bot protection with fingerprinting is just an illusion. Any signals like this which is on client side can be spoofed by an above average person.

At the upper bound, fraud can always be committed by paying real people with real accounts to perform the desired action in a way that is 100% truly indistinguishable from organic. There's fundamentally no actual prevention technique at the limit.

So the entire game is only "increasing the costs until it's not viable ROI", not "holistically prevent", which is why fingerprinting is a relevant technique here.

You can use Firefox with different profiles and configure it to launch a particular profile directly, without launching the default profile and using about:profiles.

A non-default Firefox profile can be created like this:

  ./firefox -CreateProfile "profile-name /home/user/.mozilla/firefox/profile-dir/"
  # For, say, cloudflare that would be:
  ./firefox -CreateProfile "cloudflare /home/user/.mozilla/firefox/cloudflare/"

And you can launch it like that:

  ./firefox -profile "/home/user/.mozilla/firefox/profile-dir/"
  # For cloudflare that would be:
  ./firefox -profile "/home/user/.mozilla/firefox/cloudflare/"

So, given that /usr/bin/firefox is just a shell script, you can

    - create a copy of it, say, /usr/bin/firefox-cloudflare
    - adjust the relevant line, adding the -profile argument

If you use an icon to run firefox (say, /usr/share/applications/firefox.desktop), you'll need to copy and adjust the launcher file for the icon as well.

Of course, "./firefox" from the examples above should be replaced with the actual path to the executable. For a default installation of Firefox, the path can be found in the /usr/bin/firefox script.

So, you can have a separate profile for something sensitive/invasive (linkedin, cloudflare, shops, banks, etc.) and then a separate profile for everything else.

And each profile can have its own set of extensions.
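Instead of copying and editing /usr/bin/firefox, a standalone wrapper does the same job. A rough sketch, assuming the binary lives at /usr/bin/firefox and the profile directory from the example above already exists; the script name `firefox-cloudflare`, the variables `FIREFOX_BIN`, `PROFILE_DIR` and `DRY_RUN`, and the function name are all made up for this example.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as ~/bin/firefox-cloudflare.
# FIREFOX_BIN and PROFILE_DIR are assumptions; adjust them to your installation.
FIREFOX_BIN="${FIREFOX_BIN:-/usr/bin/firefox}"
PROFILE_DIR="${PROFILE_DIR:-$HOME/.mozilla/firefox/cloudflare}"

launch_profile() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of starting the browser
        echo "$FIREFOX_BIN" -profile "$PROFILE_DIR" "$@"
        return 0
    fi
    exec "$FIREFOX_BIN" -profile "$PROFILE_DIR" "$@"
}

# Launch only when invoked under the wrapper's own name, so the file can
# also be sourced (or tested) without starting a browser.
case "${0##*/}" in
    firefox-cloudflare) launch_profile "$@" ;;
esac
```

Make it executable and run `DRY_RUN=1 firefox-cloudflare https://example.com` first to check the command line before launching for real.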

Firefox added profile switching recently. Works well.

(That said, I still keep separate machines. One for doing "official" things, the other for everything else)

I have it enabled and turnstile works fine.

It breaks Turnstile for me on Android. Had to restart the browser for it to take effect of course.

>Official Firefox can be leaky unless you build it yourself with some build-time changes or use a fork with such[0]. Am I guessing right that you still have Webcompat, RemoteSettings, and Nimbus enabled still? How do you know a compatibility intervention isn't causing your browser to open the kimono just enough to "unbreak the page"?

See my other comment, tor browser works fine too: https://news.ycombinator.com/item?id=48346659

> keeping out bot

You can forget about it. It is not possible. Simple as that.

Let's say I'm selling concert tickets. How do I prevent bots from buying up all the tickets and scalping them?

What gave you the impression that Cloudflare were the good guys?

Probably everyone on HN singing their praises for the past 10 years.

And then the gatekeepers like Cloudflare will say "please hit accept in order to verify your browser and access this site".

You mean the "Accept Cookies" banner that has become a complete joke? Pass

At the same time: it sure works well enough to annoy anyone with a "bad ASN" IP with 80 captchas a day.

exactly that's what I was thinking... like the day they provided a solution to the issue they posed

I stand corrected. It's not a nightmare scenario (as with Bitcoin), but I'm still of the opinion that "useless" computations should be avoided (just as we should avoid having 10MB websites).

Which makes me wonder why CloudFlare isn't switching to this already

Because it doesn’t solve the problem of residential botnets.

You can use Firefox with different profiles and configure it to launch particular profile directly, without launching default profile and using about:profiles.

A non-default Firefox profile can be created like this:

  ./firefox -CreateProfile "profile-name /home/user/.mozilla/firefox/profile-dir/"
  # For, say, cloudflare that would be:
  ./firefox -CreateProfile "cloudflare /home/user/.mozilla/firefox/cloudflare/"

And you can launch it like this:

  ./firefox -profile "/home/user/.mozilla/firefox/profile-dir/"
  # For cloudflare that would be:
  ./firefox -profile "/home/user/.mozilla/firefox/cloudflare/"

So, given that /usr/bin/firefox is just a shell script, you can

    - create a copy of it, say, /usr/bin/firefox-cloudflare
    - adjust the relevant line, adding the -profile argument

If you use an icon to run Firefox (say, /usr/share/applications/firefox.desktop), you'll need to copy that file and adjust the relevant line for the icon as well.

Of course, "./firefox" in the examples above should be replaced with the actual path to the executable. For a default installation of Firefox, that path can be found in the /usr/bin/firefox script.

So, you can have separate profiles for anything sensitive/invasive (linkedin, cloudflare, shops, banks, etc.) and then another profile for everything else.

And each profile can have its own set of extensions.
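
As an alternative to copying and editing the /usr/bin/firefox script, the same effect can be sketched as a small wrapper function. This is only an illustration: `launch_profile` and the `FIREFOX_BIN` override variable are made-up names, and the only Firefox flag used is the `-profile` argument shown above.

```shell
#!/bin/sh
# launch_profile PROFILE_DIR [ARGS...]
# Starts Firefox with the given profile directory and passes any
# remaining arguments (for example a URL) through unchanged.
# FIREFOX_BIN is a hypothetical override; it defaults to /usr/bin/firefox.
launch_profile() {
  profile_dir=$1
  shift
  "${FIREFOX_BIN:-/usr/bin/firefox}" -profile "$profile_dir" "$@"
}

# Example, using the "cloudflare" profile directory from above:
# launch_profile "$HOME/.mozilla/firefox/cloudflare" https://example.com
```

Putting the function in a script of its own (say, a hypothetical ~/bin/firefox-cloudflare that ends with a call to `launch_profile`) avoids touching files owned by the package manager.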

They're blocking Firefox quite often. Stripe does something that makes Firefox hang. I use Chrome for those sites and then go back to Firefox...

You can now do this from the `Profiles` menu too, without going down the CLI path. It's extremely simple now.

Except that fingerprinting means that both profiles are actually tied together by cloudflare (and other tech companies)

Firefox added profile switching recently. Works good.

(That said, I still keep separate machines. One for doing "official" things, the other for everything else)

> Firefox added profile switching recently.

I think this was as recent as 25 years ago?

Recently they added some new UI. There was and still is (I think) the classic Profile Manager UI, which you can launch with

  ./firefox -ProfileManager

or access the UI in about:profiles.

But you don't have to use any of those anyway - see my comment above (a response to parent).

Odd - they've had that for years, but only on the command line. Wonder if it's different under the hood? They also have Firefox containers, which also never quite became a first-class feature (you have to install an extension).

>Works good.

does it? same binary, same machine, same display, same 781 other heuristics.

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

So why is Cloudflare saying the author got blocked because of WebGL?

While I don't have an iDevice to try, the assumption that they are special cased is fair... because they are: https://blog.cloudflare.com/eliminating-captchas-on-iphones-...

(Yes, this is basically WEI in a shinier package.)

>So why is Cloudflare saying the author got blocked because of WebGL?

No idea. I can't even reproduce the error OP got with webgl disabled.

https://litter.catbox.moe/y42l22k97tgv96nx.png

Use proof-of-work captchas, many are private by default. Look into Private Captcha or Cap captcha.

Bad optics aside, it doesn't actually reflect reality. See my other comment. You can enable basically all the privacy settings and still pass turnstile. Tor browser in a VM passes it, of all things.

https://litter.catbox.moe/gaizpk692bhhs6b7.png

The tool "Anubis" uses proof of work instead

> ...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

There isn't one, and pretending otherwise is nonsense because humans will always provide their credentials to something to act on their behalf.

In the limit you end up with Chinese phone farms.

- Web-scraping and copyright law - https://www.neudata.co/blog/web-scraping-and-copyright-law

- Why DMCA Claims Against Web Scrapers Face Long Odds - https://capstonedc.com/insights/why-dmca-claims-against-web-...

Which sounds extremely difficult to differentiate

You don't need a non-rooted phone to pass captcha checks, I have a rooted phone and can pass the captchas that ask you to scan a qr code. But I doubt phones without google services would manage.

Speaking from the scraper’s perspective, I like proof of work; a ten year old 96-core server will cost a couple of quid to run for a few hours and will grab an absurd number of pages thanks to the access granted by repeatedly solving proofs of work. Small slick codebases too!

How does proof of work stop bots?
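
For anyone unfamiliar with the mechanism, here is a minimal hashcash-style sketch. It is a generic illustration, not the actual scheme used by Anubis or any other product mentioned in this thread. The function names and the challenge string are made up, and it assumes the coreutils `sha256sum` command is installed. The point it shows is that proof of work doesn't identify bots at all: it only makes every request cost some CPU time, while checking the answer takes a single hash.

```shell
#!/bin/sh
# Hashcash-style proof of work: find a nonce such that
# sha256(challenge + nonce) starts with a given number of hex zeros.
# Assumes the coreutils `sha256sum` command is available.

# pow_hash CHALLENGE NONCE: print the hex SHA-256 of CHALLENGE followed by NONCE.
pow_hash() {
  printf '%s%s' "$1" "$2" | sha256sum | cut -d' ' -f1
}

# pow_solve CHALLENGE ZEROS: print the first nonce (counting up from 0)
# whose hash starts with ZEROS hex zeros. This is the expensive part.
pow_solve() {
  challenge=$1
  zeros=$2
  prefix=$(printf "%${zeros}s" '' | tr ' ' '0')
  nonce=0
  while :; do
    case $(pow_hash "$challenge" "$nonce") in
      "$prefix"*) printf '%s\n' "$nonce"; return 0 ;;
    esac
    nonce=$((nonce + 1))
  done
}

# pow_verify CHALLENGE NONCE ZEROS: exit status 0 if the nonce is valid.
# This is the cheap part: one hash, no search.
pow_verify() {
  prefix=$(printf "%${3}s" '' | tr ' ' '0')
  case $(pow_hash "$1" "$2") in
    "$prefix"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: nonce=$(pow_solve "example-challenge" 2)
#          pow_verify "example-challenge" "$nonce" 2 && echo accepted
```

Each extra hex zero multiplies the expected search by 16, which is the knob a site would turn. A scraper with plenty of cheap cores just pays that price, so the scheme raises costs without keeping anyone out.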

Any idea what the difference is between your setup and the one in the article that failed with fingerprint-resistance enabled?

With a tuned cool-down period this isn't a problem, especially if you frequent the sites. OpenWRT uses Anubis, and usually when I need to peruse their site I'm on a very low-end device. I much prefer waiting over finding Waldos.

But in principle I agree that there's no good answer to this. Scraping _is_ useful, and I bet most of us here have scraped something. It is AI companies and their use of people's material for training, without consent or anything in return, that led us here. (I know bots have existed on forums for as long as forums have existed, but that was easily handled by human moderators and keyword filters.)

Anubis often takes more than 60 seconds to complete on low-end devices (especially old smartphones). It seems like there's no good solution.

How does Anubis stop bots?

Right. Botnet operators love cloudflare because they make so much money renting out compromised machines to pass their tests.

Or you could let information be free, at least the stuff that’s on the public net.

As for issues like bots overloading websites or using too many resources, scaling laws will take care of it quickly; it's not like you can't serve thousands of RPS from a Raspberry Pi these days.

I don't think regulation will stop web scraping, not least because it can be done from locations outside the jurisdiction of the regulations.

> we have to acknowledge the system is broken

The system is broken. It probably takes, what, 10 seconds or less to use a residential or foreign proxy, 6+ months to internationally track and prosecute a single offender? So like a million times more effort going the regulatory route.

Hopefully it stays that way; "a bot acting on my behalf" is still a bot. At least it's often a well-behaved bot and uses a user-agent that can be detected and blocked.

How does scanning a QR code prove any kind of captcha?

Let's say I'm selling concert tickets. How do I prevent bots from buying up all the tickets and scalping them?

Do it like plane tickets do, tie a ticket to an identity + buyback up to a week or so before the concert in case someone wants to cancel (or authorize the transfer and capture only a week before). Ask for ID and ticket at the entrance.

Sell them via a Dutch auction. Eliminate the arbitrage opportunity for scalpers and make more money in the process.
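
To make the mechanism concrete, here is a tiny sketch of a descending-price (Dutch) auction. The function name and all numbers are invented for illustration. The price starts high and drops by a fixed step at each tick until it reaches a floor, so buyers who value the ticket most pay close to what a scalper would have charged them.

```shell
#!/bin/sh
# dutch_price START FLOOR STEP TICKS: print the asking price after TICKS
# price drops of STEP each, never going below FLOOR.
# All arguments are whole numbers in the same currency unit.
dutch_price() {
  price=$(( $1 - $3 * $4 ))
  if [ "$price" -lt "$2" ]; then
    price=$2
  fi
  printf '%s\n' "$price"
}

# Example: start at 500, floor at 80, drop 20 per tick.
# dutch_price 500 80 20 3    # three ticks in
# dutch_price 500 80 20 30   # long after the floor has been reached
```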

I'd simply check the form-filling speed. Even with the browser's autocomplete, humans are slow because they still need to click submit.

Then, while the order is "processing", handle orders in bulk and prioritize slower users. There's a huge opportunity to do bot checks after checkout without affecting the user experience.

Also, on product launches you could add a unique field that requires user input, for example. That way bots can't prepare for launches.
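
A rough sketch of the fill-speed idea, assuming the checkout backend can dump pending orders as lines of `<order_id> <fill_ms>`. That format, the function names, and the threshold in the example are all made up for illustration. One function flags orders submitted implausibly fast, the other orders the batch so the slowest fills are processed first.

```shell
#!/bin/sh
# Input on stdin: one pending order per line, "<order_id> <fill_ms>",
# where fill_ms is the time between page load and submit (hypothetical fields).

# flag_fast MIN_MS: print the ids of orders filled in under MIN_MS milliseconds.
flag_fast() {
  awk -v min="$1" '$2 + 0 < min + 0 { print $1 }'
}

# prioritize_slow: print order ids, slowest form fill first.
prioritize_slow() {
  sort -k2,2nr | awk '{ print $1 }'
}

# Example:
# printf '%s\n' 'a1 5400' 'b2 120' 'c3 9800' | flag_fast 500
# printf '%s\n' 'a1 5400' 'b2 120' 'c3 9800' | prioritize_slow
```

A bot can of course sleep before submitting, so this only raises the cost a little; it fits the "increase the costs" framing from earlier in the thread rather than being a real barrier.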

Tie them to the buyer's identity, offer at-value buy-backs until X weeks before event, disallow resale.

And my og comment getting downvoted on this very intellectual forum that definitely isn't an echo chamber

You mean the "Accept Cookies" banner that has become a complete joke? Pass

I think he means browser permissions. For example, when a site wants to send notifications or record your mic, there's a permission check; something similar could exist for WebGL.

It's about explicitly deciding to allow certain capabilities on a per-website basis. No major browser allows defense-in-depth via fine-grained website permissions.

Even simply changing the user agent was sabotaged in Firefox, and choosing one user agent per domain is wishful thinking.

This is actually illegal under GDPR.

See my other comment, tor browser works fine too: https://news.ycombinator.com/item?id=48346659
