Ancient way of coding helps boost popular video encoder by 100x — but is it too good to be true?


  • FFmpeg’s biggest speedup yet affects only one function few people will have heard of
  • Handwritten Assembly makes a comeback in a niche filter that most users will never even touch
  • AVX512 gives FFmpeg an absurd 100x gain - but only if your CPU supports it

The FFmpeg project, known for powering some of the most widely used video editing software and media tools, is making headlines again.

Developers claim to have achieved what they call “the biggest speedup so far,” delivering a 100x performance gain in a recent update.

The catch? It only applies to a single, obscure function, and the means of achieving it is raising eyebrows - handwritten Assembly code, a technique largely seen as outdated by most of today’s developers.

Assembly coding sparks both nostalgia and skepticism

Assembly language, once essential for getting the most out of limited hardware in the 1980s and 1990s, has become a niche practice.

Yet FFmpeg developers continue to rely on it for extreme optimization, calling themselves “assembly evangelists.”

In their latest patch, they hand-wrote a function called rangedetect8_avx512 using AVX512 instructions, part of a modern SIMD (Single Instruction, Multiple Data) toolkit that lets a CPU apply one instruction to many data elements at once.
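To make the SIMD idea concrete, here is a minimal sketch in portable C. It is not FFmpeg's code: it assumes, purely for illustration, that a range-detection routine scans 8-bit samples for their minimum and maximum, and the names range8, range_scalar and range_lanes are hypothetical. The second function keeps 64 independent running results, which mimics how a 512-bit register could hold 64 one-byte samples and update all of them with a single instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 64 /* assumption: 512 bits / 8 bits per sample */

typedef struct { uint8_t min, max; } range8; /* hypothetical result type */

/* Scalar reference: one sample per loop iteration. */
static range8 range_scalar(const uint8_t *px, size_t n)
{
    assert(n > 0);
    range8 r = { px[0], px[0] };
    for (size_t i = 1; i < n; i++) {
        if (px[i] < r.min) r.min = px[i];
        if (px[i] > r.max) r.max = px[i];
    }
    return r;
}

/* Lane-wise version: 64 running minima and maxima, updated block by block.
 * Real SIMD code would replace the inner loop with one vector min and one
 * vector max instruction; here it stays plain C so it runs anywhere. */
static range8 range_lanes(const uint8_t *px, size_t n)
{
    assert(n > 0);
    uint8_t lo[LANES], hi[LANES];
    for (int l = 0; l < LANES; l++) { lo[l] = 255; hi[l] = 0; }

    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        for (int l = 0; l < LANES; l++) {
            uint8_t v = px[i + l];
            if (v < lo[l]) lo[l] = v;
            if (v > hi[l]) hi[l] = v;
        }
    }

    /* Horizontal reduction: fold the 64 lanes into a single result. */
    range8 r = { 255, 0 };
    for (int l = 0; l < LANES; l++) {
        if (lo[l] < r.min) r.min = lo[l];
        if (hi[l] > r.max) r.max = hi[l];
    }

    /* Tail: samples left over when n is not a multiple of LANES. */
    for (; i < n; i++) {
        if (px[i] < r.min) r.min = px[i];
        if (px[i] > r.max) r.max = px[i];
    }
    return r;
}
```

Both functions return the same answer for any non-empty buffer. The lane-wise layout simply exposes the parallelism that handwritten assembly (or a good auto-vectorizer) can map onto wide registers, which is where gains of this kind would come from.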

On systems without AVX512 support, the AVX2 variant still delivers a 65.63x speedup.

As the team points out, “It’s a single function that’s now 100x faster, not the whole of FFmpeg.”

This news follows a similar boost reported in November 2024, where another patch brought certain operations up to 94x faster.

In that case, part of the earlier performance gap stemmed from mismatched filter complexity: the generic C version used an 8-tap convolution, while the SIMD version used a simpler 6-tap approach.

Even compiling the C version in release mode with a better compiler like Clang could close over 50% of the gap, suggesting that some of the claimed speed gains may have been exaggerated by comparing worst-case with best-case conditions.

“Register allocator sucks on compilers,” the devs quipped on social media, highlighting compiler inefficiencies.

Despite the caveats, this renewed focus on low-level coding has sparked fresh conversations around performance optimization.

FFmpeg powers everything from VLC Media Player to countless YouTube downloader tools, so even small improvements in isolated filters can ripple through widely used software.

However, it’s worth noting that such results are often difficult to replicate and apply across broader parts of the codebase.

While these kinds of deep optimizations are impressive, they may not translate into real-world improvements for everyday users of video editing software.

Unless other core functions receive similar treatment, the promise of a faster FFmpeg might remain limited to technical benchmarks.

Via TomsHardware


Efosa Udinmwen
Freelance Journalist

Efosa has been writing about technology for over 7 years, initially driven by curiosity but now fueled by a strong passion for the field. He holds both a Master's and a PhD in sciences, which provided him with a solid foundation in analytical thinking. Efosa developed a keen interest in technology policy, specifically exploring the intersection of privacy, security, and politics. His research delves into how technological advancements influence regulatory frameworks and societal norms, particularly concerning data protection and cybersecurity. Upon joining TechRadar Pro, in addition to privacy and technology policy, he is also focused on B2B security products. Efosa can be contacted at this email: udinmwenefosa@gmail.com
