Differences between revisions 33 and 34
Revision 33 as of 2014-06-13 14:10:33
Size: 8410
Editor: lu_zero
Revision 34 as of 2014-06-13 14:13:27
Size: 8299
Editor: lu_zero

Catching elusive bugs

Libav contains bugs: many have already been fixed, some remain, and a few might appear again. Complex code has plenty of corner cases, and many of them can lead to memory corruption, crashes, infinite loops, and memory leaks. Fortunately, a variety of useful tools is available to catch them. Consider using Libav in a sandbox.

The Libav build system provides built-in support for most of the instrumentation tools described below. Patches to support additional tools or suggestions regarding new useful tools are always welcome.

Where to start

FATE has a number of instances running instrumented builds to catch regressions on the normal code paths, which gives us some confidence that normal decoding works as expected.

Corner cases, non-standard samples, and corrupted samples, on the other hand, are bound to exercise code paths that are not checked routinely: this is where most of the remaining bugs hide.

Dynamic analysis


Valgrind

Valgrind is a suite of tools for detecting memory-usage errors. Memcheck and Massif usually give good, precise results. Unfortunately, Helgrind has problems tracking our non-standard threading system. Valgrind works well on Linux and Mac OS X. See the Valgrind documentation for more information.


Memcheck

Memcheck is a memory error detector that catches illegal reads and writes, use of uninitialized values, illegal frees, and some memory leaks, among other issues. See the Memcheck manual for more details.

Using Memcheck

valgrind --tool=memcheck avplay yourfile.wav
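
For unattended runs it can be more convenient to decode to the null muxer with avconv, as done elsewhere on this page, and to ask Memcheck for more detail. The following is only a sketch: memcheck_decode is a hypothetical helper name, and it assumes valgrind is installed and ./avconv sits in the current directory. The three extra options are standard Valgrind options.

```shell
# Hypothetical helper (not part of Libav): decode one sample under Memcheck.
# Assumes valgrind is installed and ./avconv is in the current directory.
memcheck_decode() {
    # --leak-check=full     list each leaked block with its allocation stack
    # --track-origins=yes   report where uninitialized values came from
    # --error-exitcode=1    exit with status 1 if Memcheck reported errors
    valgrind --tool=memcheck --leak-check=full --track-origins=yes \
        --error-exitcode=1 ./avconv -i "$1" -f null -
}
```

Call it as memcheck_decode yourfile.wav. The non-zero exit status makes it easy to use in a loop over many samples.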


Massif

Massif is a heap profiler and can catch some leaks that Memcheck cannot. See http://valgrind.org/docs/manual/ms-manual.html for details.

Using Massif

valgrind --tool=massif avplay yourfile.wav
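
Massif writes its measurements to a file rather than to the terminal, and the ms_print tool shipped with Valgrind turns that file into a readable report. A minimal sketch, assuming both tools are installed; massif_decode and the output file name massif.out.sample are made up for illustration.

```shell
# Hypothetical helper (not part of Libav): profile the heap usage of one
# decode and print the report. Assumes valgrind and ms_print are installed.
massif_decode() {
    out="massif.out.sample"   # assumed output file name
    # --massif-out-file overrides the default massif.out.<pid> name
    valgrind --tool=massif --massif-out-file="$out" \
        ./avconv -i "$1" -f null - &&
        ms_print "$out"
}
```

Call it as massif_decode yourfile.wav. A heap that keeps growing until exit is a hint that something is never freed.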

Dr. Memory

Dr. Memory is similar to Memcheck feature-wise but faster, and it is known to work on Linux, Windows, and Mac OS X. It is less mature than Memcheck.

Compiler specific instrumentation

The following tools work with a specific compiler and instrument the binary produced.

AddressSanitizer (gcc, clang)

AddressSanitizer is compiler instrumentation that detects faulty memory accesses. It is somewhat like a faster Memcheck, without the memory bookkeeping, so you still need Memcheck and Massif to track memory leaks.

The run-time overhead is bearable and the compile-time overhead is negligible.

The simplest way to use asan is to pass --toolchain=clang-asan or --toolchain=gcc-asan to configure.

Using AddressSanitizer with GDB

AddressSanitizer integrates nicely with gdb:

  • It is possible to call some of its internal functions to introspect pointers.

define p_a
    print __asan_describe_address($arg0)
end

  • It is possible to set breakpoints on its entry points and inspect the state using the normal commands (e.g. frame and bt).

define b_a
    br __asan_report_load1
    br __asan_report_load2
    br __asan_report_load4
    br __asan_report_load8
    br __asan_report_load16
    br __asan_report_store1
    br __asan_report_store2
    br __asan_report_store4
    br __asan_report_store8
    br __asan_report_store16
end
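
One way to use these definitions is to keep them in a gdb command file and load it when starting the debugger. The sketch below assumes they were saved as asan.gdb (an arbitrary file name) and that ./avconv was built with asan; gdb_asan is a hypothetical wrapper name.

```shell
# Hypothetical wrapper (not part of Libav). asan.gdb is an assumed file name
# for a gdb command file holding the p_a and b_a definitions.
gdb_asan() {
    # -x loads the command file; --args hands the rest of the line to avconv
    gdb -x asan.gdb --args ./avconv -i "$1" -f null -
}
```

Inside gdb, b_a sets the breakpoints and run starts the decode. Once a breakpoint is hit, bt shows the stack and p_a followed by an address describes a suspicious pointer.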

MemorySanitizer (clang)

MemorySanitizer is an experimental tool, which tries to closely mimic Valgrind's Memcheck. It is currently much harder to use, because it needs to have every library instrumented, including libc.

UndefinedBehaviourSanitizer (gcc, clang)

Undefined behaviour is a particularly tricky kind of fault to track, since you might or might not get the expected behaviour depending on the architecture, compiler version, optimization level, etc. Luckily, once found, issues of this kind are among the easiest to fix.

The simplest way to use ubsan is to pass --toolchain=clang-usan or --toolchain=gcc-usan to configure. For historical reasons, the toolchain arguments refer to usan, not ubsan. Unlike asan, ubsan does not halt execution by default. Pass -fno-sanitize-recover via --extra-cflags= to get runtime asserts that stop execution.
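
Put together, a halting ubsan build could be configured roughly as follows. This is a sketch only: build_ubsan is a hypothetical helper, it assumes clang is installed, and it takes the path to a Libav source tree as its argument (defaulting to the current directory). The configure options are the ones named above.

```shell
# Hypothetical helper (not part of Libav): configure and build a tree with
# ubsan so that the first report stops execution.
# $1 is the path to a Libav source tree (defaults to the current directory).
build_ubsan() {
    (
        cd "${1:-.}" || exit 1
        ./configure --toolchain=clang-usan \
            --extra-cflags=-fno-sanitize-recover && make
    )
}
```

Dropping the --extra-cflags option gives the default behaviour, where execution continues after each report.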

This compiler instrumentation also makes such issues quite easy to spot.

The undefined behaviour sanitizer adds significant overhead and makes compilation much slower.

As with AddressSanitizer, it is possible to integrate ubsan with gdb and break on (some) undefined code. Combined with the fact that ubsan does not abort execution by default, this comes in handy when evaluating a group of issues in a single session.

GDB breakpoints for ubsan

  • Breakpoints

define b_u
    br __ubsan_handle_add_overflow
    br __ubsan_handle_mul_overflow
    br __ubsan_handle_negate_overflow
    br __ubsan_handle_builtin_unreachable
    br __ubsan_handle_divrem_overflow
    br __ubsan_handle_out_of_bounds
    br __ubsan_handle_float_cast_overflow
    br __ubsan_handle_shift_out_of_bounds
    br __ubsan_handle_function_type_mismatch
    br __ubsan_handle_sub_overflow
    br __ubsan_handle_load_invalid_value
    br __ubsan_handle_type_mismatch
    br __ubsan_handle_missing_return
    br __ubsan_handle_vla_bound_not_positive
end

Using the UndefinedBehaviourSanitizer

After configuring with ubsan and building, run commands as usual; avconv is particularly useful.

./avconv -i yourfile.mp4 -f null -

In addition to the normal output, there is information printed about undefined behavior:

/path/to/libav/libavcodec/h264_mb.c:304:27: runtime error: left shift of negative value -1
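
A long decode can print the same report many times. Since the reports go to stderr together with the usual avconv messages, one option is to save stderr to a file and count the distinct reports afterwards. A minimal sketch; summarize_ubsan and the log file name ubsan.log are made up for illustration.

```shell
# Hypothetical helper: count distinct ubsan reports in a saved log.
# $1 is a file holding the stderr of an instrumented run, e.g. from
#   ./avconv -i yourfile.mp4 -f null - 2> ubsan.log
summarize_ubsan() {
    # keep only ubsan reports, then print each distinct report with its
    # number of occurrences, most frequent first
    grep 'runtime error:' "$1" | sort | uniq -c | sort -rn
}
```

Call it as summarize_ubsan ubsan.log. Each output line is an occurrence count followed by the report, so the most frequently hit locations come first.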


Fuzzing

The simplest type of fuzzing consists of generating a large number of samples with some bits flipped at random. We then check whether they trigger uncaught issues using the other tools on this page, since invalid input can exercise branches in Libav that would otherwise never run and can provoke incorrect memory accesses.


zzuf

zzuf is among the easiest and fastest fuzzers to use.

How to use zzuf

while true; do
    SEED=$RANDOM
    for file in SAMPLES; do
        zzuf -M -1 -q -U 60 -s "$SEED" ./avconv -i "$file" -f null - || echo "$SEED $file" >> fuzz
    done
done

  • -M sets the max memory to use (1MB)
  • -q hides the output
  • -U kills the process after a given time (60s), which is useful for getting out of infinite loops

Leave this running for a while and magic will happen. When your application crashes, zzuf prints the seed and ratio parameters you need to reproduce the crash. For example,

zzuf[s=5115,r=0.004]: signal 11 (SIGSEGV)

means that the application crashed because of a segfault and by calling zzuf -s 5115 -r 0.004 you will make it crash again.

If you want to debug the application, you cannot run it under zzuf directly. Instead, fuzz the file, dump the result, and feed it to avconv under your favourite debugger. Using the data from the example above:

zzuf -s 5115 -r 0.004 cat working_input.file > fuzzed_output.file
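
The two steps can be chained so that the debugger starts on the regenerated file. This is a sketch under the same assumptions as the rest of this section (zzuf and gdb installed, ./avconv in the current directory); debug_fuzzed is a hypothetical helper name.

```shell
# Hypothetical helper: regenerate a crashing input and open it in gdb.
# $1 is the seed and $2 the ratio, both as printed by zzuf;
# $3 is the original, unfuzzed sample.
debug_fuzzed() {
    zzuf -s "$1" -r "$2" cat "$3" > fuzzed_output.file &&
        gdb --args ./avconv -i fuzzed_output.file -f null -
}
```

For the crash above this would be debug_fuzzed 5115 0.004 working_input.file.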

Note that sometimes invalid reads/writes do not cause a crash during debugging, so Valgrind might be a good alternative too.

Static analysis

Some errors can be detected by analyzing source code without running it. Static analysis tools are a way to find some bugs, though they suffer from false positives and cannot catch every problem.

Clang static analyzer

Clang offers scan-build to easily analyze projects by adding an extra phase in the normal build process.

It generates a descriptive HTML report. However, the number of false positives is sometimes high enough that the report is not always useful.
