How does this "identifier" work with JavaScript disabled?
Why is this global keyed only by the database name string in the first place?
The post mentions a generated UUID, why not use that instead, and have a per-origin mapping of database names to UUID somewhere? Or even just have separate hash-tables for each origin? Seems like a cleaner fix to me compared to sorting (imo, though admittedly, more of a complex fix with architectural changes)
Seems to me that having a global hashtable that shares information from all origins is asking for trouble, though I'm sure there is a good explanation for this (performance, historical reasons, some benefits of this architecture I'm not aware of, etc.).
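To make the comparison between the two fixes concrete, here is a rough sketch in Python. It is a toy model, not Firefox's actual code: `leaky_order`, `sorted_order`, `per_origin_order` and the `process_secret` parameter are all made-up names, and a keyed SHA-256 stands in for whatever really decides iteration order in the global table.

```python
import hashlib


def leaky_order(process_secret: bytes, names):
    """Model of the bug: order depends only on a process-wide secret and the name."""
    return sorted(names, key=lambda n: hashlib.sha256(process_secret + n.encode()).digest())


def sorted_order(process_secret: bytes, names):
    """Fix A (sorting): the process secret never influences the output."""
    return sorted(names)


def per_origin_order(process_secret: bytes, origin: str, names):
    """Fix B (per-origin keying): each origin gets its own unrelated ordering."""
    salt = hashlib.sha256(process_secret + origin.encode()).digest()
    return sorted(names, key=lambda n: hashlib.sha256(salt + n.encode()).digest())


names = [f"db{i}" for i in range(16)]  # hypothetical database names
print(leaky_order(b"process-1", names) == leaky_order(b"process-2", names))
print(sorted_order(b"process-1", names) == sorted_order(b"process-2", names))
```

One thing the sketch makes visible: in this model, per-origin keying stops two different origins from comparing notes, but a single origin would still see a stable ordering for the life of the process, so sorting removes more of the signal.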
namespace mozilla {
namespace dom::indexedDB {
using namespace mozilla::dom::quota;
using namespace mozilla::ipc;
using mozilla::dom::quota::Client;
> For security and product stakeholders, the key point is simple: even an API that appears harmless can become a cross-site tracking vector if it leaks stable process-level state.
This reads almost LLM-ish. The article on the whole does not appear so, but parts of it do.
I was expecting an ad for their product somewhere towards the end, but it wasn't there!
I do wonder though: why would this company report this vulnerability to Mozilla if their product is fingerprinting?
Isn't it better for the business (albeit unethical) to keep the vulnerability private, to differentiate from the competitors? For example, I don't see many threat actors burning their zero days through responsible disclosure!
Don't get your opsec advice from HN. Check whonix, qubes, grapheneos, kicksecure forums/wikis. Nihilist opsec, Privacyguides.
Make sure to exit Tor Browser at the end of a session. Make sure not to mix two uses in one session.
Whether they care is entirely separate.
Why don't browsers make it like phones where the server (app) has to be granted permission to access stuff?
Also, does anyone know of any researchers in the academic world focusing on this issue? We are aware that EFF has a project that used to be named after a pedophile on this subject, but we are more looking for professors at universities or pure research labs ala MSR or PARC than activists working for NGOs, however pure their praxis :-)
As privacy geeks, we have become fascinated with the topic -- it seems that while we can achieve security through extensions like noscript or ublock origin or firefox containers (our personal "holy trinity"), anonymity slips through our fingers due to fingerprinting issues. (Especially if we lump stylometry in the big bucket of "fingerprinting".)
[1] https://web.archive.org/web/20260422190706/https://fingerpri...
Hmm, I'm a little confused, since in 2021 Mozilla released experimental one-process-per-site:
> This fundamental redesign of Firefox’s Security architecture extends current security mechanisms by creating operating system process-level boundaries for all sites loaded in Firefox for Desktop
https://blog.mozilla.org/security/2021/05/18/introducing-sit...
Perhaps that is not fully released?
Or perhaps it is, but IndexedDB happens to live outside of that isolation?
That's why expansion of web standards is wrong. Browser should provide minimal APIs for interacting with device and features like IndexedDB can be implemented as WebAssembly library, leaking no valuable data.
For example, if canvas provided only access to picture buffer, and no drawing routines calling into platform-specific libraries, it would become useless for fingerprinting.
The IndexedDB UUID is "shared across all origins", so why not use the contents of the database to identify browsers, rather than the ordering?
Seriously, I am saddened that Chromium dominates the browser market as much as it does, but at this point the herd-immunity of Chromium is necessary to keep users safe.
And all browser devs should be required to actively fight against fingerprinting.
There is no legitimate need for fingerprinting in browsers.
Maybe because it is not as serious as they and their title made it out to be? Did you read it fully?
The identifier described is only process-lifetime stable; it is not machine stable, profile stable, or installation stable. The article itself says it resets on a full browser restart...
So this is not a magic forever ID and not some hardware tied supercookie. Now what should we do with that title, and the authors of it?
The key vulnerability here is that, for the lifetime of that Firefox process, any website that makes that set of databases is going to see the exact same output ordering, no matter what the contents of those databases are. That makes this a fingerprint: it's a stable, high-entropy identifier that persists across time, even if the contents of those databases are not preserved. It is shared even across origins (where the contents would not be), and preserved after website data is deleted -- all a website has to do to re-acquire the fingerprint is recreate the databases with the same names and observe their ordering.
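A toy simulation of that argument, in Python. Everything here is an assumption for illustration: `BrowserProcess`, `list_databases`, the 16 probe names and the keyed hash are stand-ins, not the real browser internals. The point is only that an ordering derived from per-process state, and nothing else, comes out the same for every origin and carries a lot of bits.

```python
import hashlib
import math
import os

NAMES = [f"probe-{i}" for i in range(16)]  # hypothetical names a site would create


class BrowserProcess:
    """Toy stand-in for one browser process (not the real data structure)."""

    def __init__(self):
        self._secret = os.urandom(16)  # fixed until the process exits

    def list_databases(self, origin, names):
        # The origin is deliberately ignored: this models state shared by all origins.
        return sorted(names, key=lambda n: hashlib.sha256(self._secret + n.encode()).digest())


def fingerprint(process, origin):
    """Hash the observed ordering into a short identifier."""
    order = process.list_databases(origin, NAMES)
    return hashlib.sha256(",".join(order).encode()).hexdigest()[:16]


proc = BrowserProcess()
same_across_origins = fingerprint(proc, "https://a.example") == fingerprint(proc, "https://b.example")
after_restart = BrowserProcess()  # a full restart means a new process and new state
changed_on_restart = fingerprint(proc, "https://a.example") != fingerprint(after_restart, "https://a.example")
bits = math.log2(math.factorial(len(NAMES)))  # upper bound on entropy: log2(16!), roughly 44 bits
print(same_across_origins, changed_on_restart, round(bits, 1))
```

Note that database contents never appear in the model, which is why deleting website data does not help: recreating the same names reproduces the same ordering.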
Because it's an isolated remote browser, you also get a lot of flexibility. You can run BrowserBox itself as an onion hidden service connected to the clearnet, or connect BrowserBox to browse over Tor, or even do both at the same time. Since this Firefox IndexedDB vulnerability relies on persisting state, you can completely avoid it by running BrowserBox (based on Chromium), and doing it ephemerally. There's actually a new GitHub action [0] that makes spinning up a purely ephemeral, disposable session incredibly easy and would be immune to this kind of process-level state tracking.
The action runs BrowserBox on a GitHub Actions runner; you can specify whether you want a Cloudflare tunnel or a Tor tunnel (which comes with torweb access). There's also a convenience script you can run from the command line, which does the setup and then prints your login link.
All you need is a BrowserBox license (not free), but then you can use it.
I would consider this a lightweight Tor-proxied browser, not a replacement for Tor Browser, at this time, as there are likely edges and leaks that the official Tor Browser has long patched. However, as cases like this IDB bug demonstrate, no security is perfect. If you simply want a way to access Tor and add an extra "ephemeral" hop on a runner, itself over Tor, and are not trying to do anything especially sensitive or life-threatening, it's probably good.
Dump the rendered window pixels out to a simple viewer. Mouse movement is still a pain to deal with, but I would default to spoofing it as moving between clicks, with some image parsing logic to identify menu traversal.
Then it should reboot the browser process regularly.
I've been waiting for someone to make a packaged 'VPC in a box' incorporating networking and linked VMs.
Just use a network namespace; individual pieces of software are way too easy to misconfigure.
This is dangerously incomplete and bad advice.
Qubes OS does not work the way you seem to think it does.
Creating a new identity in the Tor Browser inside a disposable VM does not automatically stop that VM and start a new disposable VM. That initial disposable VM launches the new identity from the existing process and therefore remains vulnerable, the same as any bare metal computer running Tor Browser would.
Virtualization is not magic.
A Qubes OS user needs to spin up a new disposable Whonix VM to sidestep this attack. Creating a new identity alone is ineffective in this threat model.
If you care about these projects as much as you say you do, please stop giving harmful advice. You do it in various places on the Internet and in every thread which gives you half a chance to do so, and these projects would be better off if you either took any of the extensive well-reasoned correction many people offer you, or opted to stop making such claims. The former would be ideal, the latter still vastly preferable to the existing state of affairs.
Did you even read the article at all? Ah my children did bad in school, time to replace them with new children and a different spouse. This is what you're suggesting essentially. A browser is not just something you simply make out of thin air. There's decades of nuance to browser engines, and I'm only thinking of the HTML nuances, not the CSS or JS nuances.
With all due respect, and acknowledging that your work is technically excellent…
Isn't everything that you do an exploitation of vulnerabilities? https://news.ycombinator.com/from?site=fingerprint.com
Fingerprinting is all about extracting information about a site's visitors which those users didn't explicitly intend to reveal.
Apps have access to inconceivable amounts of identifiers and device characteristics, even on the well protected systems without Google Play services.
And since browsers rival OSes for complexity (they are basically OSes in their own right already), any part of the system can be inadvertently exposed and exploited.
1. Website fingerprints the browser, stores a cookie with an ID and a fingerprint.
2. During the next session, it fingerprints again and compares with the cookie. If fingerprint changed, notify server about old and new fingerprint.
If so, cool!
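For what it's worth, the two steps above could be sketched like this in Python. This is only my reading of the scheme, with made-up names (`LinkServer`, `visit`, the cookie as a plain dict); it is not taken from any real fingerprinting product.

```python
class LinkServer:
    """Hypothetical server that chains fingerprints reported under one cookie ID."""

    def __init__(self):
        self.history = {}  # cookie_id -> fingerprints seen, oldest first

    def report(self, cookie_id, old_fp, new_fp):
        chain = self.history.setdefault(cookie_id, [old_fp])
        if chain[-1] != new_fp:
            chain.append(new_fp)


def visit(cookie, current_fp, server, new_id):
    """One session on the client side. Returns the cookie to store afterwards."""
    if cookie is None:
        # Step 1: first visit, store an ID together with the fingerprint.
        return {"id": new_id, "fp": current_fp}
    if cookie["fp"] != current_fp:
        # Step 2: fingerprint changed, tell the server about old and new.
        server.report(cookie["id"], cookie["fp"], current_fp)
        return {"id": cookie["id"], "fp": current_fp}
    return cookie


server = LinkServer()
cookie = visit(None, "fp-A", server, "u1")
cookie = visit(cookie, "fp-A", server, "u1")  # unchanged, nothing reported
cookie = visit(cookie, "fp-B", server, "u1")  # changed, old and new get linked
print(server.history)
```

The scheme only works while the cookie survives, so it links fingerprints across a change; it does not by itself survive a cleared cookie jar.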
Or just open dev tools
So it persists between anonymous sessions. So you could connect User A, who logged out and reset the identity, to User B, who believed they were using a fresh anonymous session and logged in afterwards.
It's more than a browser restart, it's a complete system wipe every time.
Tails is made on the premise that exactly this kind of trick will occur. Sometimes even persisting between browser restart. For that reason even the persistent storage is very limited. But that's optional and cautioned against for maximum anonymity.
What would be worrying with tails would be if there was some way for some hardware identifier to be exposed. Like a serial number or MAC address. But this kind of thing is exactly what it's made to protect against.
https://www.ndss-symposium.org/wp-content/uploads/ndss2021_1...
Says that Firefox has a bug that prevents favicons from being loaded from cache, which inadvertently protects against this technique. They filed a bug report on it in 2020 but nothing has happened with it yet: https://bugzilla.mozilla.org/show_bug.cgi?id=1618257
connects Chrome to a Tor SOCKS proxy and wraps all other browsing-related network calls over torsocks. It prevents local fingerprinting leaks (like this IndexedDB ordering bug) because the browser isn't running locally at all. You can host the BrowserBox instance as an onion hidden service, use it to browse over Tor, or both.
If you want to try an ephemeral "VPC in a box" style setup where the environment is destroyed after you're done, you can easily spin it up using this new GitHub action: https://github.com/marketplace/actions/browserbox (but you need a license key, obtainable at https://browserbox.io)
This is my attempt to make it easy to spin up bbx on ephemeral infrastructure that's mostly free (GitHub Actions runners are perfect).
Joanna Rutkowska's understandable preference for older kernels had its advantages, but the current team is much more likely to ship somewhat newer kernels and I've been surprised by what hardware 4.3 has worked well on.
Beyond that, I'm currently running a kernel from late Feb/early Mar (6.19.5).
Driver support can still be an issue, and a Wi-Fi card that doesn't play nice with Linux in general is going to be no different on Qubes OS.
A Qubes OS user needs to start a new disposable Whonix workstation VM to sidestep this attack, NOT create a new identity in the same disposable VM's browser, which is exactly what this attack targets.
No software wants to be fingerprinted. If it did, it would offer an API with a stable identifier. All fingerprinting is exploiting unintended behavior of the target software or hardware.
A user agent that says the browser's version? Reasonable enough.
Being able to ask for fonts, if the system has them? Difficult to have font support without that.
Getting the user's timezone, language and keyboard layout? Reasonable.
The size of the screen, and the size of the browser window? Difficult to lay things out without that.
Of course a video or audio player needs to know which video formats your browser supports - how else to provide the right video?
Obviously javascript can get the time, and it's trivial to figure out the system's clock error by comparing that to the time on a server.
Before you know it, almost every browser is uniquely identifiable.
Like Android phones perhaps? Unfortunately, Apple gives very little granular control.
> In Firefox Private Browsing mode, the identifier can also persist after all private windows are closed, as long as the Firefox process remains running. In Tor Browser, the stable identifier persists even through the "New Identity" feature, which is designed to be a full reset that clears cookies and browser history and uses new Tor circuits.
You bring this up like it's a well known incident, but my googling can find no evidence of it? The only reason not to say the name of the project would be if it's common knowledge, but it's not?
ChatGPT research reckons you're making it up, and I'd be curious if you have evidence to the contrary?
JS also dramatically improves security. TBB is stuck in a 90s mindset about privacy, as if Firefox exploits were not a dime a dozen. Especially with AI making FF exploits more available, we can expect many Tor sites to be actively attacking their visitors.
Tor Browser also doesn't spoof navigator.platform at all for some reason, so sites can still see when you use Linux, even if the User-Agent is spoofing Windows.
User agents as a concept are rather poorly thought out across the board and not all that useful but persist because that's just how technical cruft is.
Fonts should be provided by the website; if not provided the choice should take the form of a spec sent by the website including line height, serifs or not, monospace or not, etc. There's little to no excuse for the current font situation IMO beyond poor design decisions that became heavily entrenched.
Timezone and other obviously private metadata should never be shared without the user explicitly granting permission on a case by case basis. The status quo here is completely inexcusable as is the continued failure to fix the problem.
Size of the physical screen should never be exposed under any circumstances. The current size of the browser window is reasonable on its face but now that fingerprinting is understood to be an issue should always be heavily letterboxed unless the user consents to sharing the exact value.
Video formats should be provided by the website as a list of offerings and the browser should respond with a choice; the user could optionally intervene. There's no reason to expose the full capabilities to a remote service.
Querying the current time should be gated behind an explicit permission. There's almost never a need for it. However from a fingerprinting perspective you also have to worry about correlating the rate of clock skew across clients. That can be solved by gating access to high resolution time counters behind an explicit permission as (once again) the vast majority of services have no legitimate use for such functionality.
Now we have actual criminal organizations and other real bad actors.
I'm sure we can come up with something better than advertise our whole local computing platform on every HTTP request.
No applications. No mail. No need for cookies.
I can use a "regular" browser for more enhanced stuff. But for simple content consumption, we can just have a "dumb" browser that can't do much.
> A user agent that says the browser's version? Reasonable enough.
No user agent. I'm guessing it will need it for JavaScript or HTML features, and dynamically update if using an old browser, but let's just not supply a user agent and let it be the reader's burden to have a reasonably decent browser.
> Being able to ask for fonts, if the system has them? Difficult to have font support without that.
What's the fallback if the system doesn't have them?
> Getting the user's timezone, language and keyboard layout? Reasonable.
Keyboard layout is irrelevant for viewing content. For timezone and language: Yeah, I can see the use cases, but these are in a small minority. Let there be a popup when requested, and the user can specify the timezone/language as requested.
> The size of the screen, and the size of the browser window? Difficult to lay things out without that.
Let's let this new browser return only from a (small) discrete set of sizes. It will pick the size closest to the actual browser window size and send that.
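A minimal sketch of that bucketing idea in Python. The bucket list and the function name `reported_size` are invented for illustration, and "closest" is taken to mean squared Euclidean distance, which is just one possible choice.

```python
# Hypothetical small set of sizes the browser is willing to report.
ALLOWED_SIZES = [(800, 600), (1000, 700), (1200, 800), (1400, 900), (1600, 1000)]


def reported_size(width, height):
    """Return the allowed size nearest to the real window size."""
    return min(ALLOWED_SIZES, key=lambda s: (s[0] - width) ** 2 + (s[1] - height) ** 2)


print(reported_size(1366, 768))  # (1400, 900)
print(reported_size(1371, 771))  # same bucket, so these two windows look identical
```

One design wrinkle: picking the nearest bucket can report a size larger than the real window, so a browser would probably want to round down instead and pad the page with margins.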
> Of course a video or audio player needs to know which video formats your browser supports - how else to provide the right video?
Same answer as user agent. Either let the user pick from a selection of video formats, or just hard code a reasonable one and put the onus on the user to have a browser that supports it.
> Obviously javascript can get the time, and it's trivial to figure out the system's clock error by comparing that to the time on a server.
This hypothetical browser could just not send the time :-) For 99% of content consumption, this function is not needed.
What I'm describing should be part of "Private mode". Or browsers should have an "Ultra-private" mode that is the above. If it's too complex/risky maintaining it all in one codebase ... fine. Just have a separate browser.
Right now, if I built such a browser, I'm sure a lot of sites meant for content would break. But in my fantasy world, using "Ultra-private" would be the default, and people who make sites will target them first.
I think much of the complexity in making a web browser is all the "other" stuff. Being able to run apps, cookie/privacy management, etc.
But most ROMs don't allow controls for WiFi, Cell data, Phone ID, Phone number, User ID, local storage, etc...
Assume the same.
>The idea is to amass as much information as possible
Reminded, from 2012: https://www.wired.com/2012/03/ff-nsadatacenter/
So what happened here is basically... AI told you that something that made you suspicious because you have zero subject matter expertise is suspect?
I'm not really sure how to react to someone who has a robot affirm their anxieties other than to stand by my previous statements and give a polite pointer at some terms to look up on Wikipedia rather than feed into a clanker.
i also like anonbib as a central repo for interesting work.
Tor endpoints are pretty easy to identify; there are plenty of handy databases for that, so using Tor to begin with increases your uniqueness. If noscript was set to strictly disallow javascript by default, that decreases the degree to which it increases your signature relative to the baseline of using tor.
Then we have to account for the simple fact that many, many fingerprinting techniques rely on javascript, so taking them out of the picture reduces the unique identity that can be gleaned.
Are we absolutely, positively sure that the tradeoff is worth it? Without a strict repeatable measurement, I think I'm highly skeptical about whether or not a default of "allow" is a net boon to hiding your identity. I remember the rationale about the switch mostly being directed towards "most of the web is broken otherwise and that's bad."
I've heard a handful of people say this, but are there examples of what I imagine would have to be server-side fingerprinting, and of its granularity? Most fingerprinting I'm aware of is client-side, running via JS. I'd expect server-side checks to be limited to things like which resources haven't been loaded by a particular user and anything else normally available via server logs, which could limit the pool, but I wonder how effective that is for tracking uniqueness across sites.
We're talking about users of the Tor browser, and I'd be very surprised if this was the case (that a majority keep JS turned on)
Basically every Tor guide (heh) tells you to turn it off because it's a huge vector for all types of attacks. Most onion sites have captcha systems that work without JS too which would indicate that they expect a majority to have it disabled.
For those who want an ephemeral setup but prefer the Chromium engine over Firefox, you can achieve a similar "destroy after use" workflow using BrowserBox. It has a tor-run function that connects Chrome to a Tor SOCKS proxy and wraps all auxiliary network calls over torsocks.
You can easily spin up a purely ephemeral session using a GitHub action [0] so that absolutely no state persists once you close it. As a bonus, you can also run the BrowserBox instance itself as an onion hidden service while browsing over Tor.
This is technically incorrect information and could get people in trouble if followed literally.
On Qubes OS, if a user creates a new identity inside a Whonix workstation disposable VM via the browser's new identity functionality, the new identity spawns within the same disposable VM. I just tested this on Qubes OS 4.3.
That, I assume, would expose one to OP's vulnerability, as it's still running in the same VM. I would be glad to learn that I'm incorrect in my unverified assumption.
Even Qubes OS users still need to be mindful to launch a new disposable VM when keeping identities separate to sidestep this attack.
All these things should be opt-in and like blocked by GDPR.
No way!
I don’t ever use any font provided by the website. I don’t even let websites choose which fonts get used. Instead I choose a set of fonts (monospaced and proportional) that are readable and everything uses those.
If you want to see what that looks like, go into the Firefox settings, find the Fonts section, click Advanced, and then uncheck “Allow pages to choose their own fonts, instead of your selections above”. Be sure to adjust the “Minimum font size” while you’re here so that nobody uses text sizes that you cannot read.
No support for forms. The browser is meant for content consumption. Not for interaction/creation.
One could argue that any JS capabilities to do network requests (including dynamically rendering content) would be disallowed.
Yes, I know, this is going pre-Web 2.0.
Yes, of course, most current sites won't work in that model. But I'll also say: Most current content sites don't need these capabilities. They have them because they know the browser supports them.
Again - a fantasy. I know only a few people will use it. I know that won't be enough to change web behavior. It would be nice, though, if sites carried a badge to indicate they conform to all of the above.
When I use Resist Fingerprinting, my main issue is the timezone being set to UTC. Most of the other stuff it does never causes issues. I guess sometimes sites need to read the canvas, but there's a permission box that allows that when needed. I wish there was a similar permission box for timezone.
The only other drawback to the "resist fingerprinting" option is that you will encounter Cloudflare's CAPTCHA checkbox everywhere, all of the time :(
An example that comes to mind that I've seen is an anonymous app that allows for blocking users; you can programmatically block users, query all posts, and diff the sets to identify stable identities. However, the ability to block users is desired by the app developers; they just may not have intended this behavior, but there's no immediate solution to this. This is different than 'user_id' simply being returned in the API for no reason, which is a vulnerability. Then there's maybe a case of the user_id being returned in the API for some reason that MIGHT be important too, but that could be implemented another way more sensibly; this leans more towards vulnerability.
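A rough Python sketch of the block-and-diff trick, against an entirely hypothetical app (`AnonApp` and its methods are invented; no real service is being described). Posts expose no author field, yet blocking the author of one post and diffing the visible set reveals every other post by the same person.

```python
class AnonApp:
    """Hypothetical anonymous forum: no author field, but blocking hides an author's posts."""

    def __init__(self, posts):
        self._posts = posts  # post_id -> hidden author id (never returned to clients)
        self._blocked = {}   # viewer -> set of hidden author ids

    def block_author_of(self, viewer, post_id):
        self._blocked.setdefault(viewer, set()).add(self._posts[post_id])

    def unblock_all(self, viewer):
        self._blocked.pop(viewer, None)

    def visible_posts(self, viewer):
        blocked = self._blocked.get(viewer, set())
        return {pid for pid, author in self._posts.items() if author not in blocked}


def cluster_by_author(app, viewer):
    """Group post IDs by author using only block + list + diff."""
    clusters = []
    remaining = app.visible_posts(viewer)
    while remaining:
        probe = min(remaining)
        before = app.visible_posts(viewer)
        app.block_author_of(viewer, probe)
        after = app.visible_posts(viewer)
        group = before - after  # everything that vanished shares the probe's author
        clusters.append(group)
        remaining -= group
        app.unblock_all(viewer)
    return clusters


app = AnonApp({1: "x", 2: "y", 3: "x", 4: "z"})
print(cluster_by_author(app, "attacker"))  # [{1, 3}, {2}, {4}]
```

Both features behave exactly as designed here, which is the point: the leak comes from combining them.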
Ultimately most fingerprinting technologies use features that are intended behavior; Canvas/font rendering is useful for some web features (and the web target means you have to support a LOT of use cases), IP address/cookies/useragent obviously are useful, etc (though there's some case to be made about Google's pushing for these features as an advertising company!).
Unintended identification is less than ideal but frankly is just the nature of doing business and any number of niceties are lost by aggressively avoiding fingerprinting.
In software intentionally optimized to avoid any fingerprinting however it is a vulnerability.
The distinction being that fingerprinting in general is a less than ideal side effect that gives you a minor loss in privacy but in something like Tor Browser that fingerprinting can be life or death for a whistleblower, etc. It's the distinction between an annoyance and an execution.
Yeah, because I love it when every website I go to downloads 10 megs of fonts to my computer before it starts rendering the page. Fonts should be suggested by the website, and a bog-standard "every computer has this" font should be listed as the fallback.
> Timezone and other obviously private metadata should never be shared without the user explicitly granting permission on a case by case basis
100% agree.
> Size of the physical screen should never be exposed under any circumstances
I mostly agree, but with the understanding that this would cause issues with "modern" web pages having very difficult to format layouts. Responsive design requires a response, after all.
> Video formats should be provided by the website as a list of offerings and the browser should respond with a choice
You're still getting the same feedback with this, that the browser chose to use X format, so you're not increasing privacy with this, only difficulty.
> Querying the current time should be gated behind an explicit permission
100% agree. If there is no active local processing of information that the server relies on, in the format of a game or some other interactivity, then there is no reason why the server needs to know your local time.
thankfully i think traditional web surfing is probably going to die out in the next 10 years, and progressively decline a lot much sooner than that as people start to interact with AI rather than browsers (or any software for that matter).
my feed of hackernews is going to be my AI agent giving it to me in plain text very soon, and soon after that i will probably never visit the internet again because it will be impossible to know what's real and fake
as a millennial it will be interesting to experience the full cycle of being born when nothing was online, to everything being online, to then again being entirely offline by the time i'm older
What you want exists, have at it
You want fingerprinting to identify low risk users to skip the inconvenient security checks.
In what way is collecting a record of a person's browsing history a "minor loss" of privacy? For many people, tracking everywhere they go online would easily expose the most sensitive personal information they have.
That's why I said that a spec mechanism should also be provided. The issue is that sites can perform measurements regarding the layout that change based on the font used. So the browser should only ever provide a few fallbacks, nothing more, and anything else needs to come from the site itself.
> screen size
I think maybe you're confusing the physical screen with the current size of the browser window?
> video formats
The issue at present is that a site can programmatically test a long list of formats against your setup to see what happens. What I'm describing increases privacy because the site can no longer directly query for the entire list of supported formats and the user can optionally control the process. Obviously it's still possible to botch the implementation on the browser's end but the point is to make it possible to do the right thing.
"Web applications use it for offline support, caching, session state, and other local storage needs"
This use case is completely orthogonal to what my browser is meant to do. My browser would not have a concept of local storage.
The premise of starting with a modern browser and stripping away features to get privacy is flawed - it's always vulnerable to these types of things. I'm going the opposite route: Only add features if they cannot be exploited for monitoring.
Wait for the advent of local agents running on local models (for privacy) followed by techniques to fingerprint agents, followed by techniques to infer query parameters based on agent behavior. I wish I was joking but it seems all too plausible.
You said it was “named after a pedophile”, that is wrong
>>The word panopticon derives from the Greek word for "all seeing" – panoptes.
The concept was invented by Jeremy Bentham, who died before Foucault was born.
Interesting that you named your HN account after a famous homophobe.
If TBB changed to js off by default that signal would be less evident, and also, fingerprinting would be harder.
https://fingerprint.com/blog/disabling-javascript-wont-stop-...
There is also a method of fingerprinting using the favicon: https://github.com/jonasstrehle/supercookie
>Physical isolation is a given safeguard that the digital world lacks
…
>In our digital lives, the situation is quite different: All of our activities typically happen on a single device. This causes us to worry about whether it’s safe to click on a link or install an app, since being hacked imperils our entire digital existence.
>Qubes eliminates this concern by allowing us to divide a device into many compartments, much as we divide a physical building into many rooms. …
Sold
Strong disagree.
> IP address/cookies/useragent obviously are useful
Cookies are an intended tracking behavior. IP Address, as a routing address, is debatable.
> Canvas/font rendering is useful for some web features
These two are actually wonderful examples of taking web features and using them as a _side channel_ in an unintended way to derive information that can be used to track people. A better argument would be things like Language and Timezone which you could argue "The browser clearly makes these available and intends to provide this information without restriction." Using side channels to determine what fonts a user has installed... well there's an API for doing just that[0] and we (Firefox) haven't implemented it for a reason.
n.b. I am Firefox's tech lead on anti-fingerprinting so I'm kind of biased =)
[0] https://developer.mozilla.org/en-US/docs/Web/API/Local_Font_...
If 100 people are using that browser, how will they know which one is me?
> How browsers render SVGs can be used for fingerprinting (even the underlying OS affects this, and I assume you'll want to see those)
Can you provide details on this? And how will they know which OS I'm using (through SVG rendering...)? The UserAgent definitely should not send the OS.
> combine with ISP from IP address
That's already provided whether I use Private mode or not, correct? I can always use a VPN.
Browsers do not "generate" fingerprints. They expose data that can be used to fingerprint users. You cannot "randomize" this; even if you were to return random values for, say, user screen size, with various visual side effects, it would just be another signal to fingerprint: "Oh, your browser is returning random values? Must be a Tor browser user".
A user would have to manually start a new disposable VM for each identity.
For remote browser tools I use neko https://github.com/m1k1o/neko
But with Tor I like to have more safeguards. So I prefer to run tails in an isolated environment.
By your reasoning, Qubes doesn't provide more protection than the underlying operating systems. I've seen this myth on HN multiple times.
The thing is, technology is either enabling something or not. The exploration space might be huge, but once an exploit is found, the exploitation code / strategy / plan can trivially proceed and be shared worldwide. So you have to deal with this when you design and patch systems.
Example: preserving paths in URLs. Safari ITP aggressively removes “utm_” and other well-known querystring parameters even in links clicked from email. Well, it is trivial to embed it in a path instead, so that first-party websites can track attribution, e.g. for campaign performance or email verification links etc. In theory, Apple and Mozilla could actually play a cat-and-mouse game with links across all their users and actually remove high-entropy path segments or confuse websites so much that they give up on all attribution. Browser makers or email client makers or messenger makers could argue that users don’t want to have attribution of their link clicks tracked silently without their permission. They could then say if users really wanted, they could manually enter a code (assisted by the OS or browser) into a website, or simply provide interactive permission of being tracked after clicking a link, otherwise the website will receive some dummy results and break. Where is the line after all?
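To illustrate why the path trick defeats parameter stripping, here is a toy stripper in Python. It is my own sketch, not how Safari or any browser actually does it; `strip_tracking_params` and the example URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking_params(url, prefixes=("utm_",)):
    """Drop query parameters whose names start with a known tracking prefix."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(prefixes)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# The query-string campaign tag is removed...
print(strip_tracking_params("https://shop.example/sale?utm_source=mail&id=7"))
# ...but the same tag moved into the path passes through untouched.
print(strip_tracking_params("https://shop.example/c/mail-abc123/sale?id=7"))
```

A stripper like this only knows parameter names; telling a campaign code in a path apart from a legitimate slug would need the high-entropy guessing game described above.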
That's perfectly fine! As long as they can't tell which tor user you are they can't track your browsing activity or associate it to any one tor user. That's the goal. Currently tor browser sticks out like a sore thumb by trying to appear identical no matter who uses it, which is fragile because any one data point unaccounted for unmasks everyone.
You'd have to fingerprint the browser first to determine that the "random values" were indeed coming from it.
I see Neko brought up a lot, but honestly when I tried it a couple years ago it felt pretty clunky. It seems designed more for anime watch parties than serious security or remote isolation, IMO.
I totally get the Tails/Firefox preference, tho. If you want absolute baremetal isolation on your own hardware and have the discipline for it, a fresh Tails USB is definitely the right move. BrowserBox is just a different architecture -- it's mainly for when you specifically want an ephemeral Chromium setup on ... well ... anything, need some policy controls or programmability. And don't want to fiddle with config yourself.
Having said that, fsflover exhibits a poor grasp of how this stuff works and all should be aware that even in Qubes OS, one would need to spawn new disposable VMs for each identity; relying on the Tor Browser's new identity creation within the same disposable VM would be little different from running Tor Browser on a traditional OS.
Also, please stop grossly misreading the comments of others. You consistently do it to numerous people here.
I joke that I'm a no-app person, because I install very few apps and I use anti-tracking tech on my phone that's even hard to explain or recommend to non-technical friends. I use Firefox with uMatrix and uBlock Origin and Blockada. uMatrix is effective but breaks so many sites unless one invests time in playing with the matrix. Blockada breaks many important apps (banking) unless one understands whitelisting.
This is by design how everyone should always be using Qubes OS for any task, according to its documentation and approach to security.
> relying on the Tor Browser's new identity creation within the same disposable VM would be little different from running Tor Browser on a traditional OS
Yes, if you use a single VM on Qubes OS for everything, then all security you get is from the OS running in this VM. This is not how you use Qubes, https://doc.qubes-os.org/en/r4.3/introduction/faq.html#how-d...
I run Qubes as a daily driver according to the docs, and my workflow was not vulnerable to the discussed attack.
> Most users seem to not care about ad tech/tracking
I don't think this is true. Most people don't understand that they're being tracked. The ones that do generally don't understand to what extent.
You tend to get one of two responses: surprise or apathy. When people say "what are you going to do?" They don't mean "I don't care" they mean "I feel powerless to do anything about it, so I'll convince myself to not care or think about it". Honestly, the interpretation is fairly similar for when people say "but my data isn't useful" or "so what, they sell me ads (I use an ad blocker)". Those responses are mental defenses to reduce cognitive overload.
If you don't buy my belief then reframe the question to make things more apparent. Instead of asking people how they feel about Google or Meta tracking them, ask how they feel about the government or some random person. "Would you be okay if I hired a PI to follow you around all day? They'll record who you talk to, when, how long, where you go, what you do, what you say, when you sleep, and everything down to what you ate for breakfast." The number of people that are going to be okay with that will plummet as soon as you change it from "Meta" to "some guy named Mark". You'll still get nervous jokes of "you're wasting money, I'm boring" but do you think they wouldn't get upset if you actually hired a PI to do that?
The problem is people don't actually understand what's being recorded and what can be done with that information. If they did they'd be outraged because we're well beyond what 1984 proposed. In 1984 the government wasn't always watching. The premise was more about a country wide Panopticon. The government could be watching at any time. We're well past that. Not only can the government and corporations do that but they can look up historical records and some data is always being recorded.
So the reason I don't buy the argument is because 1984 is so well known. If people didn't care, no one would know about that book. The problem is people still think we're headed towards 1984 and don't realize we're 20 years into that world.
On what basis do you claim that software developers, who did not establish a means for third parties to get a stable identifier, nevertheless intended that fingerprinting techniques should work?
Part of the problem is the misconception that the data being collected is only being used to determine which ads to show them. Companies love to frame it that way because ultimately people don't actually care that much about which ads they get shown. The more people get educated on the real world/offline uses of the data they're handing over the more they'll start to care about the tracking being done.
The saying about assumptions is as true as ever, unfortunately for both of us.
This is exactly what I was saying - if you look at the polls, people actually tend to support things like the UK's Online Safety Act. Explaining it more does not usually result in a change of that. The difference with a PI is you're asking about them individually instead of everyone - of course they trust themselves, they just want everyone surveilled for that same feeling of confidence.
Also, the degree to which some are more comfortable with the personal privacy/'feeling of personal safety' tradeoff notwithstanding, the examples that do get media traction are predictably extremes that the average person doesn't feel applies to them.
Ah but I'd want to run it myself anyway. I wouldn't want it hosted. Especially for browsing, I don't want someone else's systems looking over my shoulder.
I avoid cloud stuff as much as possible in my personal life. When you mentioned github actions I thought it was something you could self-host too, I didn't realise it was a service only. I was looking for a docker or something but as it's not free and (less importantly) foss it won't work for me.
And yes neko is not a polished corporate solution, but it works for me as a home user. It's very flexible to build other stuff with. I have several instances here in different environments (and I don't expose them to the clear internet)
But for work yeah I know there's different options, at work we have zscaler remote browser.
Yes and no, because people still will think that when it's done at scale it's different from some stalker following YOU explicitly, and not just following everybody. Also, the mental model is "they just want to sell me something, but I can just ignore and don't buy if I'm not really interested". And especially going down this second rabbit-hole opens a whole world about consumerism that not many people are comfortable with. At the same time there are people that are totally against consumerism that should be more informed and care more about tracking and privacy; with those people it's probably easier to have that conversation.
Yet again, please stop grossly misreading the comments of others. You consistently do it to numerous people here.
People don't care. This is demonstrably true.
1) wanting functionality that isn't provided and working around that
and
2) restoring such functionality in the face of countermeasures
The absence of functionality isn't a clear signal of intent, while countermeasures against said functionality are.
And then there is the distinction between the intent of the software publisher and the intent of the user. There is a big ethical difference between "Mozilla doesn't want advertisers tracking their users" and "those users don't want to be tracked". If these guys want to draw the line at "if there is a signal from the user that they want privacy, we won't track them", I think that's reasonable.
TBF the idea that any and all fingerprinting falls under the umbrella of exploiting a vulnerability was also presented as an assertion. At least personally I think it's a rather absurd notion.
Certainly you can exploit what I would consider a vulnerability to obtain information useful for fingerprinting. But you can also assemble readily available information and I don't think that doing so is an exploit though in most cases it probably qualifies as an unfortunate oversight on the part of the software developer.
I'm not so sure that counterpoint in particular holds. I think to say the "number of people that are going to be okay with that will [still] plummet" is an understatement. I'd go so far as to say no one, at least no rational person, would be okay with a "record [of] who you talk to, when, how long, where you go, what you do, what you say, when you sleep", etc., just because of the scale.
As to cloud - indeed, why would you want to trust a cloud provider with sensitive internal browsing? Also, providing a SaaS is a hassle, but I feel I must do it to serve that side and enable those uses, some of which are cool.
I see some success by telling people "what if it was our government doing the same thing to us, even by extorting private companies? what if that same government, or the next one, just hates you for whatever reason?"
> IMO you need to actually work around a technical measure intended to stop you for it to qualify as an exploit.
Even well-known vulnerabilities like SQL injection don't qualify under this definition?
The point isn't my precise wording but the underlying concept that making use of freely provided information isn't exploiting anything even if both the user and the developer are unhappy about the end result. Security boundaries are not defined post hoc by regret.
We recently discovered a privacy vulnerability affecting all Firefox-based browsers. The issue allows websites to derive a unique, deterministic, and stable process-lifetime identifier from the order of entries returned by IndexedDB, even in contexts where users expect stronger isolation.
This means a website can create a set of IndexedDB databases, inspect the returned ordering, and use that ordering as a fingerprint for the running browser process. Because the behavior is process-scoped rather than origin-scoped, unrelated websites can independently observe the same identifier and link activity across origins during the same browser runtime. In Firefox Private Browsing mode, the identifier can also persist after all private windows are closed, as long as the Firefox process remains running. In Tor Browser, the stable identifier persists even through the "New Identity" feature, which is designed to be a full reset that clears cookies and browser history and uses new Tor circuits. The feature is described as being for users who "want to prevent [their] subsequent browser activity from being linkable to what [they] were doing before." This vulnerability effectively defeats the isolation guarantees users rely on for unlinkability.
We responsibly disclosed the issue to Mozilla and to the Tor Project. Mozilla has quickly released the fix in Firefox 150 and ESR 140.10.0, and the patch is tracked in Mozilla Bug 2024220. The underlying root cause is inherited by Tor Browser through Gecko’s IndexedDB implementation, so the issue is relevant to both products and to all Firefox-based browsers.
The fix is straightforward in principle: the browser should not expose internal storage ordering that reflects process-scoped state. Canonicalizing or sorting results before returning them removes the entropy and prevents this API from acting as a stable identifier.
Private browsing modes and privacy-focused browsers are designed to reduce websites' ability to identify users across contexts. Users generally expect two things:
First, unrelated websites should not be able to tell they are interacting with the same browser instance unless a shared storage or explicit identity mechanism is involved.
Second, when a private session ends, the state associated with that session should disappear.
This issue breaks both expectations. A website does not need cookies, localStorage, or any explicit cross-site channel. Instead, it can rely on the browser’s own internal storage behavior to derive a high-capacity identifier from the ordering of database names returned by an API.
For developers, this is a useful reminder that privacy bugs do not always come from direct access to identifying data. Sometimes they come from deterministic exposure of internal implementation details.
For security and product stakeholders, the key point is simple: even an API that appears harmless can become a cross-site tracking vector if it leaks stable process-level state.
What does indexedDB.databases() do?
IndexedDB is a browser API for storing structured data on the client side. Web applications use it for offline support, caching, session state, and other local storage needs. Each origin can create one or more named databases, which can hold object stores and large amounts of data.
The indexedDB.databases() API returns metadata about the databases visible to the current origin. In practice, developers might use it to inspect existing databases, debug storage usage, or manage application state.
Under normal privacy expectations, the order of results returned by this API should not, in itself, carry identifying information. It should simply reflect a neutral, canonical, or otherwise non-sensitive presentation of database metadata.
The issue we found comes from the fact that, in all Firefox-based browsers, the returned order was not neutral at all.
How indexedDB.databases() became a stable identifier
In Firefox Private Browsing mode, indexedDB.databases() returns database metadata in an order derived from internal storage structures rather than from database creation order.
The relevant implementation is in dom/indexedDB/ActorsParent.cpp.
In Private Browsing mode, database names are not used directly as on-disk identifiers. Instead, they are mapped to UUID-based filename bases via a global hash table:
using StorageDatabaseNameHashtable = nsTHashMap<nsString, nsString>;
StaticAutoPtr<StorageDatabaseNameHashtable> gStorageDatabaseNameHashtable;
The mapping is performed inside GetDatabaseFilenameBase() called within OpenDatabaseOp::DoDatabaseWork().
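When aIsPrivate is true, the website-provided database name is replaced with a generated UUID and stored in the global StorageDatabaseNameHashtable. This mapping is keyed only by the database name string, is process-wide rather than origin-scoped, and persists for the lifetime of the process.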
Later, when indexedDB.databases() is invoked, Firefox gathers database filenames via QuotaClient::GetDatabaseFilenames(...) called in GetDatabasesOp::DoDatabaseWork().
Database base names are inserted into an nsTHashSet.
No sorting is performed before iteration. The final result order is determined by iteration over the hash set’s internal bucket layout.
Because UUID mappings are stable for the lifetime of the Firefox process, and hash table structure and iteration order are deterministic for a given internal layout, the returned ordering becomes a deterministic function of the generated UUID values, hash function behavior, and hash table capacity and insertion history. This ordering persists across tabs and private windows, resetting only upon a full Firefox restart. Crucially, the UUID mapping and hash set iteration are not origin-scoped. They are process-scoped.
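The mechanism described above can be illustrated with a toy model. This is a minimal sketch, not Gecko's code: the SimulatedProcess class, the SHA-256 bucket function, and the capacity of 64 are all assumptions chosen for illustration. The only properties carried over from the description are that each name gets a random UUID once per process, and that the listing order follows hash bucket position rather than creation order.

```python
import hashlib
import uuid


class SimulatedProcess:
    """Toy stand-in for one browser process (hypothetical, not Gecko's structures)."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        # Process-wide map from database name to UUID, shared by every origin.
        self.name_to_uuid = {}

    def filename_base(self, name):
        # A UUID is generated once per name and reused until the "process" ends.
        if name not in self.name_to_uuid:
            self.name_to_uuid[name] = uuid.uuid4().hex
        return self.name_to_uuid[name]

    def list_databases(self, names):
        # Emulate walking hash buckets: names come back ordered by the bucket
        # their UUID lands in, not by creation order. sorted() here only
        # models bucket traversal; it is not the mitigation.
        def bucket(name):
            base = self.filename_base(name)
            digest = hashlib.sha256(base.encode()).digest()
            return (int.from_bytes(digest[:8], "big") % self.capacity, base)

        return sorted(names, key=bucket)


names = list("abcdefghijklmnop")
process = SimulatedProcess()
origin_a = process.list_databases(names)           # first site
origin_b = process.list_databases(names)           # unrelated site, same process
restarted = SimulatedProcess().list_databases(names)  # after a full restart

print("same process, two origins match:", origin_a == origin_b)
print("listed:   ", ",".join(origin_a))
print("restarted:", ",".join(restarted))
```

Within one SimulatedProcess the order is identical for every caller, because the UUIDs are fixed. A new instance draws new UUIDs and so almost certainly produces a different permutation.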
A simple proof of concept is enough to demonstrate the behavior. Two different origins host the same script. Each script creates a fixed set of named databases, calls indexedDB.databases(), and records the order of the returned names.
In affected Firefox Private Browsing and Tor Browser builds, both origins observe the same permutation during the lifetime of the same browser process. Restarting the browser changes the permutation.
Conceptually, the output looks like this:
created:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
listed:
g,c,p,a,l,f,n,d,j,b,o,h,e,m,i,k
The important point is not the exact order itself, but rather that the order is not the original creation order, that the same order appears across unrelated origins, and that it persists across reloads and new private windows, even after all private windows are closed. Only a full browser restart yields a new one. That is exactly what you do not want from a privacy perspective.
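To make the "ordering as identifier" idea concrete, here is one way a script could turn an observed permutation into a single number. This is a sketch under assumptions: the article does not say how an identifier would be encoded, and permutation_rank is a hypothetical helper that uses the standard lexicographic rank (Lehmer code) of a permutation.

```python
from math import factorial


def permutation_rank(created, listed):
    """Return the lexicographic rank of `listed` relative to `created`.

    The rank is an integer in [0, N!), so every distinct ordering maps to a
    distinct identifier.
    """
    if sorted(created) != sorted(listed):
        raise ValueError("listed must be a permutation of created")
    remaining = list(created)
    rank = 0
    for name in listed:
        index = remaining.index(name)  # position among names not yet consumed
        rank += index * factorial(len(remaining) - 1)
        remaining.pop(index)
    return rank


created = "a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p".split(",")
listed = "g,c,p,a,l,f,n,d,j,b,o,h,e,m,i,k".split(",")

identifier = permutation_rank(created, listed)
print(f"identifier: {identifier:x}")
```

Two origins that observe the same listed order compute the same integer without exchanging any data, which is what makes the ordering usable for cross-site linking.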
This issue enables both cross-origin and same-origin tracking within a single browser runtime.
Unrelated websites can independently derive the same identifier and infer that they are interacting with the same running Firefox or Tor Browser process. That lets them link activity across domains without cookies or other shared storage.
In Firefox Private Browsing mode, the identifier can persist even after all private windows are closed, provided the Firefox process itself is still running. That means a site can recognize a later visit in what appears to be a fresh private session. In Tor Browser, the stable identifier effectively defeats Tor Browser’s “New Identity” isolation within a running browser process, allowing websites to link sessions that are expected to be fully isolated from one another.
Tor Browser is specifically designed to reduce cross-site linkability and minimize browser-instance-level identity. A stable process-lifetime identifier cuts directly against that design goal. Even if it only survives until a full process restart, that is still enough to weaken unlinkability during active use.
The signal is not just stable. It also has high capacity.
If a site controls N database names, then the number of possible observable permutations is N!, with theoretical entropy of log2(N!). With 16 controlled names, the theoretical space is about 44 bits. That is far more than enough to distinguish realistic numbers of concurrent browser instances in practice.
The exact number of reachable permutations may be somewhat lower because of internal hash table behavior, but that does not materially change the security story. The exposed ordering still provides more than enough entropy to act as a strong identifier.
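The capacity figure can be checked directly. The short sketch below computes log2(N!) for a few values of N. The function name is illustrative, and the result is the theoretical upper bound only; as noted above, the number of orderings actually reachable may be lower.

```python
from math import factorial, log2


def permutation_entropy_bits(n):
    """Upper bound on identifier entropy, in bits, for n controlled names."""
    return log2(factorial(n))


for n in (4, 8, 16, 32):
    print(f"N={n:2d}  log2(N!) = {permutation_entropy_bits(n):.2f} bits")
```

For N = 16 this gives a little over 44 bits, matching the estimate in the text.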
The right fix is to stop exposing entropy derived from the internal storage layout.
The cleanest mitigation is to return results in a canonical order, such as lexicographic sorting. That preserves the API's usefulness for developers while removing the fingerprinting signal. Randomizing output per call could also hide the stable ordering, but sorting is simpler, more predictable, and easier for developers to reason about.
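The effect of canonical ordering can be shown in a few lines. This is an illustration of the idea only, not Mozilla's patch: canonicalize and shuffled_per_call are hypothetical helpers, and the two input lists stand in for what two different browser processes might produce internally.

```python
import random


def canonicalize(names_in_internal_order):
    """Return names in lexicographic order, independent of the input order."""
    return sorted(names_in_internal_order)


def shuffled_per_call(names_in_internal_order):
    """Alternative: a fresh random order on every call (harder to reason about)."""
    return random.sample(names_in_internal_order, k=len(names_in_internal_order))


# Two hypothetical processes whose internal layouts yield different orders.
process_one = ["g", "c", "p", "a", "l", "f"]
process_two = ["f", "a", "g", "p", "c", "l"]

print(canonicalize(process_one) == canonicalize(process_two))
print(shuffled_per_call(process_one))
```

After sorting, both processes return the same list, so the output no longer depends on process state. The per-call shuffle also hides the stable ordering, but callers then see a different order on every call.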
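From a security engineering standpoint, an ideal fix removes the fingerprinting signal at its source while preserving the API's usefulness for developers.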
We responsibly disclosed the issue to Mozilla and to the Tor Project. Mozilla has released the fix in Firefox 150 and ESR 140.10.0, and the patch is tracked in Mozilla Bug 2024220. Because the behavior originates from Gecko’s IndexedDB implementation, downstream Gecko-based browsers, including Tor Browser, are also affected unless they apply their own mitigation.
This vulnerability shows how a small implementation detail can create a meaningful privacy problem. The impact is significant. Unrelated websites can link activity across origins during the same browser runtime, and private-session boundaries are weakened because the identifier survives longer than users would expect.
The good news is that the fix is simple and effective. By canonicalizing the output before returning it, browsers can eliminate this source of entropy and restore the expected privacy boundary. This is exactly the kind of issue worth paying attention to: subtle, easy to miss, and highly instructive for anyone building privacy-sensitive browser features.