Like this post didn't even mention presentational attributes, like how cursor attribute can contain a url that gets loaded. Or any of the other tricky parts of svg sanitization, like using dtd to bypass things.
The infamous you can't parse (X)HTML with regex¹ meme is from 2009, yet this fix was done in 2019. I guess the SO answer never mentioned SVG.
Tag names, attributes, attribute values, event callback default-cancelers... so many ways to declare that this node and its children shouldn't parse/evaluate scripts.
As Jay-Z said: "I've got 99 solutions, fixing a problem ain't one"
This version 3 could have the version number changed to 2 in order to do cool SVG things, so full-fat SVG as version 2 is now. But you could just flip to 2 to a 3 on upload, so any embedded URLs are harmless.
This could be useful for the creator too, as it is helpful to have layers of source images in bitmap format to work with, and you can easily export such things accidentally.
A useful thing I learned recently is that, while CSP headers are usually set using HTTP headers, you can also reliably set them directly in HTML - for example for HTML generated directly on a page where HTTP headers don't come into play:
<iframe sandbox="allow-scripts" srcdoc="
<meta http-equiv='Content-Security-Policy'
content='default-src none; script-src unsafe-inline; style-src unsafe-inline;'>
<!-- untrusted content here -->
"></iframe>
It feels like this shouldn't work, because JavaScript in the untrusted content could use the DOM to delete or alter that meta tag... but it turns out all modern browsers specifically lock that down, treating those CSP rules as permanent as soon as that meta tag has loaded before any malicious code has the chance to subvert them.I had Claude Code run some experiments to help demonstrate this a few weeks ago: https://github.com/simonw/research/tree/main/test-csp-iframe...
I do feel that's there's two distinct types of svg - "bunch of paths with fills" and "clever dangerous stuff" where most real SVGs are of the former type.
Fully expect this to be shot down by someone that's thought about this problem for longer than the 120 seconds I just spent. :)
https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...
https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...
I'm sure it'd just open up a whole other can of worms though... not to mention having to wait for browsers to actually support it.
The real solution here is definitely CSP + basic sanitisation though.
(This isn't a comment on the challenges in proper sanitization fwiw, as I've needed to do various of the same things myself)
Is it because the SVG parser/renderer being used is an entire library, and it would be prohibitive to write your own SVG parser/renderer or insert your own code into the existing one?
It is not, and never was, an image format. It's a markup language.
> Example from Scratch's test suite:
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg">
<circle cx="250" cy="250" r="50" fill="red" />
<script type="text/javascript"><![CDATA[
alert('from the svg!')
]]></script>
</svg>
Is this really an issue? This is the method that the chrome teams polyfill to replace XSLT suggests you do. https://github.com/mfreed7/xslt_polyfill/tree/main#usageIt sounds like the linked post was about someone using a blacklist instead of a whitelist. It doesnt matter how tiny your subset is if you allow through stuff you don't recognize.
For the most part svg is safe. The dangerous parts are pretty obvious - script tag, image tag, feImage tag, attributes starting with on, embedding html in <foreignObject>, DTD tricks, namespace tricks, CSS that loads external stuff (keep in mind also presentational attributes. Its not just style attribute/tag).
The rest of it is pretty safe.
This should be in addition to heavily restricting CSP on user content. (Hmm, surely all images should be served with the CSP header set.)
I can't imagine the cumulative number of man hours wasted on this problem when the vast majority of users were just looking for a way to make their logos look sharp.
I'd love to see an agreed standard like OpenGL vs OpenGL ES for SVG. SVG-ES. Everyone agrees on the static, non-scripted elements that should work.
A better approach here would be to just serve svg with Content-security-policy: script-src 'none'; sandbox
If someone formalizes this as a new format, please give it a new name! tvg tiny vector graphics? savg safe vector graphics?
And keep the scope as simple as possible so it actually ships! Don’t try implementing a binary format or something.
Sanitization-wise it's already possible to strip scripting from SVGs and anything else you want, it's just that a library like DOMPurify to avoid ballooning in size doesn't include say a preset to handle the extra parsing necessary to make them behave like browsers treat IMG embeds, so it's up to devs to add their own.
But yeah, a world where a simple attribute to achieve the same effect as an IMG embed but for inlined SVGs would be nice.
There is iframe srcdoc if you want to do this.
I suppose an actual exception is Content-Disposition. If you want the user to save a file, you need to serve it with dest == document as far as I know.
By turning it into a document boundary when you use the sandbox attribute, kinda similar to loading an svg file inside of an <img> tag.
and yeah you could get 90% of the way there with an iframe srcdoc, but I was imagining some kind of cross between an <iframe> sandboxed into its own origin, and an <img> where it still has its own intrinsic size.
but it was mainly just a throw away thought, I've not really thought it through much deeper than that.
Even worse, OP's latest post "Every version of Scratch is vulnerable to arbitrary code execution" just tells you how exactly to exploit something similar today in the current version with no mention of responsible disclosure except a plug to say, "hey, check out my project, this one doesn't have RCE!" This is so irresponsible it borders on malicious.
Think of prior technologies like display postscript and .doc, where a data format ended up a with big problems from its embedded "exec" type features.
You could change the default behavior to the “safer” behavior. And then add some sort of “danger mode” attribute. But… devs are usually hesitant to do something that would break legitimate code, such as changing the default behavior would do.
[1]: https://developer.mozilla.org/en-US/docs/Web/API/SVGGraphics...
But for the most part, I 100% agree, and I've been considering making a format for my own use-cases. I think the biggest issue is in agreeing as to what subset is necessary; plus, of course, getting any level of adoption (though the latter isn't a factor for my own use ... except in the sense that there are no tools to help).
For example, do we need animations? Gradients? If so on the latter, what kind?
Though I think it's still a draft, it does appear to be a requirement for BIMI - https://en.wikipedia.org/wiki/Brand_Indicators_for_Message_I...
I would even go further: HTML should never have supported scripting.
It's like my ticket to add data url support to sheets. It just gets punted year after year.
I'm don't remember precisely but I don't think you could script it from the DOM, I don't see how that could work if it's a plugin.
But also so that setting up a CSS transform: scale(10000) can't take over the entire viewport, it'd be constrained to an iframe-like boundary (exactly like an <img>) but still remain as an inline SVG, sort of like an <iframe srcdoc>. So scripts on the parent/host HTML document can still manipulate it like the rest of the DOM, but the inner <svg> elements are all "inert" for want of a better word.
Actually I don't know off the top of my head what happens with an SVG file inside of a <img> when it references external images (either cross-domain or not.) I know scripts and animations get disabled, so I'd take a guess and say some CSS gets blocked too.
Again I've not really thought terribly hard about it, or if it's actually useful at all, and I'm betting it'd be filled with even more foot-guns than there are right now. I'm just thinking out loud.
Like opening a PNG in a new tab is harmless but opening an SVG in a new tab is opening a pretty substantial can of worms.
Scratch has a long history of SVG-related vulnerabilities. The source of these is that Scratch parses user-generated (ie. attacker-controlled) content into an <svg> element and appends it into the main document for various operations (eg. measuring SVG bounding box in a more reliable way than viewbox or width/height).
No matter how briefly the SVG remains in the main document, this is an inherently unsafe operation. Scratch's approach to making this safe has been to build increasingly complex infrastructure around parsing the SVG and the markup within to remove dangerous parts.
I think Scratch's approach to SVG sanitization is doomed. To explain, we have to take a trip through the history of SVG sanitization in Scratch to see how well it has worked so far.