Twenty One Zero-Days in FFmpeg

Ffmpeg has an exceptionally terrible track record when it comes to security. People have been throwing fuzzers at it for as long as I remember and coming back with a nearly inexhaustible supply of memory corruption bugs. Here's an effort by one Googler a decade ago:

https://security.googleblog.com/2014/01/ffmpeg-and-thousand-...

So, while it's a demo of the capabilities of LLMs, this should not be at all surprising. Ffmpeg is absolutely not something you should be running outside of a sandbox if you're touching any untrusted or user-supplied content. I know that people do, and these people are taking unreasonable risks.

>The reach of this bug is what makes it serious. Any deployment that points FFmpeg at an attacker-influenced RTSP URL is exposed: media ingest pipelines fetching user-supplied stream URLs, surveillance and CCTV systems pulling RTSP feeds, and transcoding services processing remote AV1-over-RTP sources

Wow, this is actually pretty serious - I'm even surprised it's being published. There are several services where I can imagine this is exploitable today.

Even if this isn't as big a deal as this [advertisement for a security company] seems, it is a reminder that every application you release does have a security hole somewhere, and a script kiddie can now find it 5 minutes after release for $2 in credit. If you're not red-teaming your code before release, hackers are doing it after.

> At this point the corrupted free pointer is called, and control of the instruction pointer is ours.

Very serious, though in practice it doesn't sound like this bug achieves arbitrary RCE on its own (especially in the presence of ASLR). You would need there to be some writable and executable page of memory lying around.

That's not what "zero-day" means.

What about VLC's own built-in versions of decoding libraries (I think, from the FFmpeg project)? Is there a scenario here where we may have to deal with malicious MP4 files?

Is the future of defense-against-foreign-agents-on-my-codebase to subtly hide prompt injections in one’s codebase that would defeat agents trying to find security bugs?

If attackers of ffmpeg need to use services like these authors’ to find RCEs in popular tools, then what the ffmpeg team needs in order to defeat attackers is to reduce the efficiency of tools like depthfirst.

Inflated use of the term zero-day, while none of the described vulnerabilities is actually a zero-day. But it sounds and clicks good.. thank you for the PoC.

Infinity - 21 is still infinity

I've been using ffmpeg for a very long time, both personally and for services I've built. Fabrice Bellard is a genius, and the developers who have taken it so far have made the world measurably richer.

But I can't think of a program more worthy of sandboxing when run with untrusted input than ffmpeg. It's a huge amount of C dealing with the most complicated video and audio codecs, which is notoriously impossible to get completely right.

But it's not actually that big of a problem. I run ffmpeg inside a VM or gVisor, and the end result is usually a video file that I'm perfectly willing to play in my browser, where it gets decoded in yet another sandbox because this shit is hard.

Is there a timeline for each of these bugs? I wonder if these bugs had been reported to ffmpeg yet.

> A victim only has to run ffmpeg -i rtsp://attacker/stream, the most ordinary command imaginable

What about "ls"?

"No way to prevent this" say users of only language where this regularly happens, etc, etc. Several of these bugs do not appear to be in hot code and would have been detected by a language with saner behaviour.

> Ffmpeg is absolutely not something you should be running outside of a sandbox if you're touching any untrusted or user-supplied content.

I agree. I work for InstaVM which essentially gives you sandboxes - so I can share some perspective from the other side.

The trend is that people are building AI agents, and these agents almost always have a chat box, so prompt injection is always a threat, apart from the usual hallucinations or wrong code generated by the LLM. Not everyone wants to give the latest and greatest AI models to their users because of cost, so they end up with something like GPT-4o, which rm -rf's the whole thing at times. At this point you have to use an isolated environment to guard against these.

They're also extremely hostile to security researchers who report these issues.

ffmpeg is also rather popular and delivers a lot of functionality. It's unlikely that you don't have it installed.

Yes, there are security issues but quite a few are not ffmpeg itself related - the input is pretty shabby or at least not exactly easy to deal with!

Obviously, they could do with some assistance and I'm sure you and I will both dive in with equal zeal.

Is GStreamer a more secure alternative or does it just get a bit less attention than ffmpeg?

From what I understand gstreamer is more about building complex pipelines and plugins, ffmpeg is better at playing some obscure 20 year old video format extremely efficiently so you can watch it compiled for a potato.

Different cases really I think both are good.

In my experience it's mainly run by very grumpy and opinionated Europeans who take pride in having bugs old enough to drink.

> At this point the corrupted free pointer is called, and control of the instruction pointer is ours.

The article glosses over this, but it looks like the next variable in the struct is conveniently the first parameter to the function, so you can run arbitrary code with system() or whatever. But, yeah, you would need some other exploit to defeat ASLR.

I find it difficult to know how serious the issue is, if it is even an issue.

LLMs are constantly and confidently giving me this same-sounding script, with "the root cause" and how it "is simple", while being completely incorrect.

It's 21 issues. And they've been human validated, as far as I can tell.

Some people might suggest it’s crucial to publish if you’re aware of a serious vulnerability, so that people using the software in a vulnerable way can take steps to mitigate the risk.

What do you mean "video file that I'm perfectly willing to play in my browser". Isn't it safe to assume that no video file can escape the browser decoding sandbox?

But then you also often need hardware accelerators for encoding, so you need to use C again.

All media containers are potentially hostile. Any offset, extent, or reference has to be considered hostile user-provided input.

How does the browser use it? Unless they mean there’s a zero-day in libavcodec.

Browsers run it in a sandbox process together with allocator hardening. Most of the bugs are then just crashes of the sandbox.

Another option is WASM or WASM-style sandboxes if using another process is undesirable.

> … hostile to security researchers who report these issues.

Do you have an example?

It seems to have lost its meaning after getting popularized following Stuxnet coverage.

No, I think it was since Code Red.

I understand why it's poorly understood. It's a snappy term, and people assume it means "bad" and nothing else because that's all you can get from the context. However, since most people also don't know the difference between a vulnerability and an exploit, they won't understand the definition of a zero-day when they read it.

But I'm still going to complain if a security vulnerability research company is using the term incorrectly in their own press copy. It makes them look amateurish.

You would also need some sort of ASLR leak to make this exploitable

Speaking from firsthand experience: codec and other media processing libraries are some of the easiest software to find address leaks in.

(There are a number of reasons for this, not least being that C makes it very easy to ship partially initialized memory over the wire.)

> Isn't it safe to assume that no video file can escape the browser decoding sandbox?

Why would that be safe to assume? If that were a reasonable assumption, you could just as well assume that it's safe to run ffmpeg.

I have numerous examples of security researchers being hostile and impossible to work with (but cannot share them unfortunately).

I'm glad to see their sense of humour :-)

https://nitter.net/ffmpeg/status/2039115531744334180

Oh my god! They are so funny and memeable! gets RCE'd

I'm not up-to-speed with the current state of sandboxing in browsers, but in principle it's (on modern operating systems) not especially hard for them to sandbox the decoding into a separate process with basically no privileges beyond rendering a video stream. It's a bit trickier if we're only considering demuxing and delegating decoding to the hardware, but that's a much smaller attack surface.

A manually run ffmpeg on the command line does nothing to restrict its privileges, and its security model has very little interest in doing so, while browsers very much have.

The parent does argue that it is safer to sandbox ffmpeg, yes.

One chained sandbox escape away from compromise.

Which is of course better than zero sandbox escapes.

Ahah

But are the compiler+OS that runs the ffmpeg executable really a sandbox ?

TLDR: depthfirst’s production autonomous security agent discovered 21 zero-day vulnerabilities in FFmpeg, after intensive security analysis by Google and Anthropic. Moving beyond theoretical analysis, our agent produces concrete, reproducible PoC inputs to confirm its findings at a fraction of the cost ($1k vs. $10k). Several of the findings had been sitting latent for 15 to 20 years. We explored the exploitability of the issues and developed a PoC demonstrating an RCE exploit primitive.

FFmpeg is one of the most widely deployed pieces of software in the world. From the browsers we use daily to the infrastructure powering the large streaming platforms, it quietly processes media everywhere. As a library that routinely parses complex, untrusted media, it is inherently security critical and a prime target for zero-click attacks.

Looking deeper into FFmpeg’s repository reveals the true scale of the challenge: it is massive, comprising roughly 1.5 million lines of heavily optimized C code dedicated to parsing hundreds of complex media formats. Furthermore, it has absorbed over two decades of relentless fuzzing and manual audits. Recently, Google’s Big Sleep team disclosed 13 vulnerabilities in FFmpeg. Soon after, Anthropic used their Mythos model to scan FFmpeg and successfully discovered some security issues. These milestones demonstrated that advanced models are increasingly capable of reasoning through dense, hardened C code.

With these recent efforts, finding vulnerabilities in FFmpeg is getting much harder. At depthfirst, we built an agentic system that can do deep scans over large codebases. Finding bugs here is a measure of our security system’s capability. While we don’t have access to Mythos, we wanted to know how far we can go just using the models that are available to us. Can we re-discover what Big Sleep and Mythos have found? And more importantly, can we find any new critical bugs that they completely missed?

Depthfirst’s Security Agent

A coding agent and a security agent may use the same underlying models, but they operate with very different objectives. A coding agent is usually interactive: a human gives it a task, and the goal is to write code, rather than focusing on edge cases and adversarial inputs. A security agent has a narrower and more targeted goal. It is not trying to write useful application code, but trying to find real, exploitable security issues in an existing system without specific instructions.

That changes the shape of the agent. A security agent has to begin by threat modeling the codebase: understanding its architecture, identifying exposed parsers and protocol handlers, and mapping where attacker-controlled input can enter the system. From there, it audits the attack surface code directly, following data flow through the relevant components instead of treating the repository as a flat collection of files. In addition, a practical security agent needs guardrails that prevent it from fabricating missing conditions, over-claiming theoretical bugs, or flooding with false positives. It must check whether the attacker actually controls the right input, whether the vulnerable path is reachable, and whether the suspected flaw can be reproduced. When needed, it should identify or generate appropriate harnesses to interact with the target components and test those hypotheses concretely.

At depthfirst, our specialized security agents deeply analyze the code, branching out in parallel to test various hypotheses. They trace execution paths, validate whether an attacker controls the right inputs, and determine if the data flow actually reaches a vulnerable sink. Crucially, the outcome of this process isn’t just a theoretical report or a vague warning. The system automatically pinpoints the exact security issue with a reproducible concrete input, confirming the vulnerability by execution. This ensures that every finding delivered is real, reachable, and actionable.

The Findings

In total, our agents discovered 21 zero-days, spanning components from the TS demuxer to the VP9 decoder, with a total cost of roughly $1k (10% of what Anthropic spent using Mythos). Nine of the issues have already been assigned CVEs:

CVE-2026-39210 (Heap Buffer Overflow): Introduced in 2010 in the TS demuxer, lacking length bounds checks before reading two bytes.
CVE-2026-39211 (Integer Overflow): Introduced in 2010 during a swscale refactor, via a size factor formula with no upper bounds that allowed user-controlled parameters to trigger arbitrarily large scaling.
CVE-2026-39212 (Stack Overflow): A recent regression from July 2025 inside ffmpeg_opt.c, where a preset file could trigger option parsing recursively without a depth limit.
CVE-2026-39213 (Heap Buffer Overflow): Introduced in 2023 in the yuv4mpegenc rawvideo input path without validating dimensions against packet size.
CVE-2026-39214 (Stack Buffer Overflow): Introduced in 2003 during the original SDT implementation, this bug writes service entries without tracking remaining space. It sat latent for 23 years.
CVE-2026-39215 (Heap Buffer Overflow): Introduced in 2012 inside update_mb_info(), where a logic error allows a subsequent call to write 12 bytes past the allocated buffer.
CVE-2026-39216 (Heap Buffer Overflow): Introduced in 2012 in img2enc.c due to replacing a safe chroma size with an unbounded dimension-derived size.
CVE-2026-39217 (Heap Buffer Overflow): A recent regression from March 2025 in the VP9 decoder, where a refactored size update function caused tile thread buffers to miss necessary reallocations.
CVE-2026-39218 (Heap Buffer Overflow): Introduced in 2017 in the DASH demuxer by failing to reject negative duration values, turning fragment array indices negative.
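
Several of the entries above come down to reading a fixed number of bytes without first checking how many remain. The following is a minimal sketch of the defensive pattern in plain C. The helper name read_be16_checked and its signature are hypothetical, chosen for illustration; this is not an FFmpeg function and not the actual upstream fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg API): read a big-endian 16-bit value
 * at offset `pos`, but only if two bytes are actually available in `buf`.
 * Returns 0 on success and -1 if the read would run past `len`. */
int read_be16_checked(const uint8_t *buf, size_t len, size_t pos,
                      uint16_t *out)
{
    /* Compare against `len - 2` rather than computing `pos + 2`,
     * so the check itself can never wrap around. */
    if (len < 2 || pos > len - 2)
        return -1;
    *out = (uint16_t)((buf[pos] << 8) | buf[pos + 1]);
    return 0;
}
```

A caller would treat a -1 return as a malformed input and stop parsing instead of reading on.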

The remaining issues are fixed, but we do not have CVE identifiers assigned yet. We reference them here by our internal tracking IDs:

DFVULN-127 (Heap Buffer Overflow): In the RTP AV1 depacketizer (rtpdec_av1.c), av1_handle_packet() advances the output write position by obu_size when skipping a Temporal Delimiter OBU without allocating matching space, so the next OBU is written well past the buffer boundary. The flaw has been present since the AV1 RTP depacketizer was first added in 2024.
DFVULN-126 (Heap Buffer Overflow): In the swscale graph code (graph.c), run_legacy_unscaled() mishandles interlaced YUV420P→NV12 conversion: get_field() doubles the plane linestrides, causing ff_copyPlane’s contiguous memcpy to overflow the destination Y-plane by 576 bytes. Introduced in 2024 with swscale’s new dynamic scaling API.
DFVULN-125 (Stack Buffer Overflow): In the RTP JPEG depacketizer (rtpdec_jpeg.c), jpeg_create_header() builds a quantization-table section in a 1024-byte stack buffer; a crafted packet with qtable_len >= 1024 fills it completely, then a trailing AV_WB16 writes two bytes past the end. A 2012 regression is to blame: days after the JPEG depacketizer landed, a refactor replaced the clamped table count with an unbounded qtable_len / 64, allowing enough quantization tables to overrun the fixed buffer.
DFVULN-124 (Heap Buffer Overflow): In the AVIF overlay path (ffmpeg_demux.c), istg_parse_tile_grid() fails to reject a dimg reference with zero tile entries; an unsigned wraparound then drives an out-of-bounds read on a one-byte heap allocation. Introduced in 2025 when automatic HEIF tile merging was added.
DFVULN-123 (Integer Overflow): In the RTP LATM depacketizer (rtpdec_latm.c), latm_parse_packet() performs a signed 32-bit addition that overflows and bypasses its bounds check, letting memcpy read roughly 1 GB past the end of a heap buffer. Present since the MP4A-LATM depacketizer was added in 2010.
DFVULN-122 (Heap Buffer Overflow): In the RTP MPEG-4 depacketizer (rtpdec_mpeg4.c), aac_parse_packet() accepts an AU-headers-length of 0, which yields a one-byte allocation that is then read as a four-byte field without checking that any AU headers are present. Present since MPEG4-AAC RTP support was added in 2005, which makes it the oldest of these remaining issues, latent for over two decades.
DFVULN-121 (Heap Buffer Underflow): In the CAF demuxer (cafdec.c), read_seek() uses the return value of av_index_search_timestamp() directly as an array index without checking for -1; a crafted file makes all index timestamps negative, so a seek indexes index_entries[-1]. Present since the CAF demuxer was added in 2009.
DFVULN-120 (Integer Underflow): In the AVI demuxer (avidec.c), ff_read_riff_info() is called with size - 4 without verifying size >= 4; a LIST chunk of size 0 underflows to ~4 GB, bypassing bounds checks and triggering a ~2 GB allocation (DoS). Introduced in 2011 when RIFF INFO-tag parsing was generalized, replacing a bounded call with the underflow-prone size - 4.
DFVULN-119 (Heap Buffer Overflow): In the option parser (ffmpeg_opt.c), opt_map() contains a stray increment that misparses a link-label as a file index and stores a stream index of -1; the subsequent negative-map loop then reads before the AVStream** array. A 2025 regression, introduced when stream-group matching helpers extended -map parsing.
DFVULN-118 (Heap Buffer Overflow): In the RTSP server path (rtspdec.c), rtsp_read_announce() treats a negative Content-Length as valid; a remote ANNOUNCE with Content-Length: -1 causes an out-of-bounds write at sdp[-1]. A 2021 regression that removed the hardcoded SDP size cap and dropped the upper-bound check along with it.
DFVULN-117 (Heap Buffer Overflow): In the RTMP client (rtmpproto.c), rtmp_calc_swfhash() checks in_size < 3 instead of in_size < 8, allowing memcpy to read eight bytes from a buffer allocated with as few as three. Present since automatic SWFVerification hashing was added in 2012.
DFVULN-116 (Heap Buffer Overflow): In RTSP SDP parsing (rtsp.c), sdp_parse_line() computes strlen(control_url) - 1 on an empty string, wrapping a size_t to SIZE_MAX and producing a one-byte pre-buffer read. Present since SDP control-URI handling was added in 2010.
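
A recurring pattern in this list is integer arithmetic that wraps before a bounds check sees it: size - 4 on a size below 4, or a signed addition that overflows. The sketch below illustrates that class of bug and one way to guard against it. All function names are hypothetical and the code is an illustration under those assumptions, not the patches that were applied to FFmpeg.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Unguarded form: for size < 4 the unsigned subtraction wraps around to a
 * value just below UINT32_MAX instead of becoming negative. */
uint32_t payload_size_unchecked(uint32_t size)
{
    return size - 4;
}

/* Guarded form: reject sizes too small to hold the 4-byte tag.
 * Returns 0 and stores the payload size, or -1 on invalid input. */
int payload_size_checked(uint32_t size, uint32_t *out)
{
    if (size < 4)
        return -1;
    *out = size - 4;
    return 0;
}

/* Overflow-safe test that the range [pos, pos + len) lies inside a buffer
 * of buf_size bytes. Comparing len against the remaining space avoids ever
 * computing pos + len, which is undefined behaviour for signed ints when
 * it overflows. Returns 1 if the range fits, 0 otherwise. */
int range_fits(int pos, int len, int buf_size)
{
    if (pos < 0 || len < 0 || buf_size < 0 || pos > buf_size)
        return 0;
    return len <= buf_size - pos;
}
```

With these helpers, a chunk size of 0 is rejected up front, and a length close to INT_MAX fails the range test instead of slipping past it.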

Finding a bug is one thing; proving it is exploitable is another. To truly understand the power of our system, we need to look at one specific bug and how it was found.

From a Skipped Frame Marker to PC Control

Among the 21 findings, one stood out: a heap buffer overflow in FFmpeg’s AV1 RTP depacketizer (libavformat/rtpdec_av1.c). It is reachable from the network with no special flags. A victim only has to run ffmpeg -i rtsp://attacker/stream, the most ordinary command imaginable, and a single 183-byte packet is enough to redirect execution.

To understand it, we first need a little background on how AV1 video travels over RTP. When FFmpeg pulls an RTSP stream, the server delivers the encoded video as a sequence of RTP packets. AV1 organizes its bitstream into OBUs (Open Bitstream Units). The RTP payload format splits these OBUs across packets, and FFmpeg’s depacketizer is responsible for stitching them back into a clean elementary stream. One special OBU type is the Temporal Delimiter (TD), a tiny marker that separates one temporal unit (frame) from the next. The spec explicitly tells the depacketizer to “ignore and remove” any TD it sees in the payload.

That innocent-looking “ignore and remove” is exactly where things go wrong, and exactly where our agent zeroed in.
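
For readers unfamiliar with the framing, here is a small sketch of the two encodings involved. It assumes the OBU header layout from the AV1 bitstream specification (obu_type in bits 6..3 of the first header byte, with Temporal Delimiter as type 2) and LEB128-coded OBU element lengths in the RTP payload. The function names are hypothetical and the code is independent of FFmpeg's own parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OBU_TEMPORAL_DELIMITER = 2 };  /* obu_type value in the AV1 spec */

/* Extract obu_type, which occupies bits 6..3 of the first header byte. */
unsigned obu_type_of(uint8_t header)
{
    return (header >> 3) & 0x0F;
}

/* Decode an unsigned LEB128 value: 7 payload bits per byte, least
 * significant group first, high bit set on every byte except the last.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than the 8 bytes the AV1 spec allows. */
size_t leb128_decode(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t v = 0;
    for (size_t i = 0; i < len && i < 8; i++) {
        v |= (uint64_t)(buf[i] & 0x7F) << (7 * i);
        if (!(buf[i] & 0x80)) {
            *value = v;
            return i + 1;
        }
    }
    return 0;
}
```

The same byte can therefore mean two different things depending on where the parser believes it is: 0x10 is a Temporal Delimiter header when read as an OBU header, and the number 16 when read as a length.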

The Root Cause

The depacketizer builds its output packet incrementally. A cursor named pktpos tracks where the next byte will be written into pkt->data, and it starts at the current end of the packet:

// libavformat/rtpdec_av1.c:199
pktpos = pkt->size;

As the code loops over the OBU elements in a packet, every byte it actually emits is preceded by a matching call to av_grow_packet, which enlarges the heap allocation backing pkt->data. The invariant the whole routine depends on is simple: **pktpos must never run ahead of the allocated size of pkt->data.** The Temporal Delimiter handling breaks that invariant:

// libavformat/rtpdec_av1.c:250
if ((obu_type == AV1_OBU_TEMPORAL_DELIMITER) ||
    (obu_type == AV1_OBU_TILE_LIST)) {
    pktpos += obu_size;        // advance the output cursor...
    rem_pkt_size -= obu_size;  // ...and the input counter
    obu_cnt++;
    continue;                  // but never allocate, and never advance buf_ptr
}

When a TD is skipped, pktpos is pushed forward by the attacker-declared obu_size, yet no memory is allocated to back that advance. Worse, the input pointer buf_ptr is not moved past the TD’s bytes. Two distinct problems fall out of this single continue:

The write cursor is now poisoned. After skipping a TD with obu_size = 148, pktpos equals 148, but pkt->data is still unallocated (or far smaller than 148 bytes).
The attacker controls what gets written there. Because buf_ptr never advanced, the next loop iteration re-parses the TD’s own bytes — its header byte is re-read as a fresh OBU length, and its payload becomes that fabricated OBU’s contents. The data that will eventually land at the poisoned offset is fully attacker-supplied.

On the next iteration the loop reaches a normal OBU and grows the packet by just that OBU’s size:

// libavformat/rtpdec_av1.c:296
if ((result = av_grow_packet(pkt, output_size)) < 0)
    return result;
...
// libavformat/rtpdec_av1.c:304 / 336 — writes begin at pkt->data[pktpos]
pkt->data[pktpos++] = *buf_ptr++ | AV1F_OBU_HAS_SIZE_FIELD;
...
memcpy(pkt->data + pktpos, buf_ptr, obu_payload_size);

With a fabricated OBU of 17 bytes, av_grow_packet allocates an 81-byte buffer (17 bytes plus FFmpeg’s 64-byte input padding). But the writes begin at pkt->data[148], which is 67 bytes past the end of the allocation. This is a heap buffer overflow with a fully controlled offset and fully controlled contents, which is about as strong a primitive as a memory-corruption bug can offer.
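
The arithmetic above can be checked with a toy model of the bookkeeping. This sketch touches no real memory and is not FFmpeg code: the struct and function names are hypothetical, and the 64-byte padding is simply the figure quoted in the text.

```c
#include <assert.h>
#include <stddef.h>

#define INPUT_PADDING 64  /* padding figure quoted in the text */

/* Toy model of the depacketizer's two counters. */
struct cursor_model {
    size_t pktpos;    /* next write offset into the packet data */
    size_t pkt_size;  /* logical packet size, grown per emitted OBU */
};

/* Skipping a Temporal Delimiter: the cursor moves, the packet does not grow. */
void skip_td(struct cursor_model *m, size_t obu_size)
{
    m->pktpos += obu_size;
}

/* Emitting an OBU: the packet grows by exactly output_size bytes. */
void grow_for_obu(struct cursor_model *m, size_t output_size)
{
    m->pkt_size += output_size;
}

/* Distance between the end of the allocation and the first write.
 * A result of 0 means the invariant (cursor inside the allocation) holds. */
size_t bytes_past_allocation(const struct cursor_model *m)
{
    size_t allocated = m->pkt_size + INPUT_PADDING;
    return m->pktpos > allocated ? m->pktpos - allocated : 0;
}
```

Feeding the model a skipped TD of 148 bytes followed by a 17-byte OBU gives an 81-byte allocation and a first write 67 bytes beyond it, matching the numbers in the text.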

Exploitation

A controlled overflow is only useful if there is something worth corrupting just past the buffer. Here, FFmpeg’s own allocator hands us a perfect target.

When av_grow_packet allocates the packet’s data buffer, it routes through av_buffer_alloc, which performs three sequential heap allocations: the data buffer itself, an AVBuffer bookkeeping struct, and an AVBufferRef. Because FFmpeg allocates everything through posix_memalign with 64-byte alignment, our 81-byte data buffer occupies a 128-byte chunk, and the AVBuffer struct lands immediately after it. That struct contains a function pointer:

// libavutil/buffer_internal.h
struct AVBuffer {
    uint8_t *data;        // +0
    size_t   size;        // +8
    atomic_uint refcount; // +16  (4 bytes + 4 padding)
    void (*free)(void *opaque, uint8_t *data); // +24  ← target
    void    *opaque;      // +32
    ...
};

Counting from the start of the data buffer, the AVBuffer.free pointer sits at offset 152. This is the callback FFmpeg invokes to release the buffer’s memory — and it is exactly what we aim the overflow at.
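
The offsets can be reproduced with offsetof on a mock of the fields listed above. This is a sketch under stated assumptions: a 64-bit (LP64) target, a struct that mirrors only the leading members shown in the excerpt (the real AVBuffer has more), the callback renamed to free_cb for clarity, and the 128-byte chunk size taken from the text rather than measured.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the leading AVBuffer members from the excerpt; not the real type. */
struct mock_avbuffer {
    uint8_t *data;
    size_t size;
    atomic_uint refcount;
    void (*free_cb)(void *opaque, uint8_t *data);  /* `free` in the excerpt */
    void *opaque;
};

#define DATA_CHUNK_SIZE 128  /* chunk size the text gives for the 81-byte buffer */

/* Offset of refcount, measured from the start of the packet data buffer. */
size_t refcount_offset_from_data(void)
{
    return DATA_CHUNK_SIZE + offsetof(struct mock_avbuffer, refcount);
}

/* Offset of the free callback, measured from the start of the data buffer. */
size_t free_offset_from_data(void)
{
    return DATA_CHUNK_SIZE + offsetof(struct mock_avbuffer, free_cb);
}
```

On an LP64 platform the callback sits 24 bytes into the struct, so it ends up 152 bytes from the start of the data buffer, with refcount at 144.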

The arithmetic is deliberately tuned. With the TD’s obu_size = 148, writes start at pkt->data[148]. The TD header byte 0x10 is re-interpreted as a length of 16, producing a fabricated 16-byte OBU whose header and payload are written contiguously starting at offset 148, so the attacker-supplied bytes that land at offsets 152–159 overwrite AVBuffer.free.

There is one subtlety that makes the whole thing reliable: AVBuffer.refcount lives at offset 144–147, below where our writes begin at 148. The overflow corrupts free while leaving refcount untouched at its original value of 1. That matters for the trigger.

To actually fire the hijacked pointer, the packet needs to be freed. The exploit embeds a third fabricated OBU in the TD payload, which drives one more av_grow_packet. Because the buffer was created with av_buffer_alloc rather than av_buffer_realloc, it is not flagged as reallocatable, so FFmpeg takes the “allocate a fresh buffer and release the old one” path:

// libavutil/buffer.c:209
if (!(buf->buffer->flags_internal & BUFFER_FLAG_REALLOCATABLE) || ...) {
    ret = av_buffer_realloc(&new, size);  // fresh buffer
    memcpy(new->data, buf->data, ...);    // copy data across
    buffer_replace(pbuf, &new);           // release the old, corrupted buffer
    return 0;
}

buffer_replace decrements the old buffer’s refcount, which we carefully left at 1, to 0, and invokes the freeing callback:

// libavutil/buffer.c:129
if (atomic_fetch_sub_explicit(&b->refcount, 1, memory_order_acq_rel) == 1) {
    b->free(b->opaque, b->data);  // b->free is now 0xdeadbeef
}

At this point the corrupted free pointer is called, and control of the instruction pointer is ours. On a release build, the single 183-byte RTP packet produces:

#0  0x00000000deadbeef in ?? ()
rip            0xdeadbeef          0xdeadbeef
#1  buffer_replace (buffer.c:133)      ← b->free(b->opaque, b->data)
#2  av_buffer_realloc (buffer.c:220)
#3  av_grow_packet (packet.c:151)
#4  av1_handle_packet (rtpdec_av1.c:296)
#5  rtp_parse_packet_internal (rtpdec.c:743)
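
Why the untouched refcount matters can be seen in a toy model of the release path. The sketch below uses the same atomic decrement as the excerpt from buffer.c, but the struct, the function names, and the counting callback are hypothetical stand-ins, not FFmpeg's types.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Toy reference-counted buffer with a release callback. */
struct toy_buffer {
    atomic_uint refcount;
    void (*free_cb)(void *opaque, uint8_t *data);
    void *opaque;
    uint8_t *data;
};

/* Drop one reference. Only the caller that takes the count from 1 to 0
 * runs the callback. Returns 1 if the callback was invoked, else 0. */
int toy_unref(struct toy_buffer *b)
{
    if (atomic_fetch_sub_explicit(&b->refcount, 1,
                                  memory_order_acq_rel) == 1) {
        b->free_cb(b->opaque, b->data);
        return 1;
    }
    return 0;
}

/* Example callback: counts its invocations through `opaque`. */
void count_free(void *opaque, uint8_t *data)
{
    (void)data;
    ++*(int *)opaque;
}
```

If the overflow had also clobbered refcount with some other value, the decrement would not have returned 1 and the callback would not have fired on this release, which is why the write has to start just past it.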

The reach of this bug is what makes it serious. Any deployment that points FFmpeg at an attacker-influenced RTSP URL is exposed: media ingest pipelines fetching user-supplied stream URLs, surveillance and CCTV systems pulling RTSP feeds, and transcoding services processing remote AV1-over-RTP sources. No authentication, no user interaction beyond opening the stream, and no unusual command-line flags are required: the vulnerability triggers during the normal RTSP PLAY phase that every one of these clients performs by design. You may find the PoC code here.