Fuzzing bpftrace

This document is for bpftrace developers.

Introduction

Fuzzing is a method for finding bugs in a program automatically. A fuzzer generates input, feeds it to the program, and observes whether the program crashes. The most commonly used approach is gray-box fuzzing, which uses coverage information (which parts of the program were executed) to generate inputs efficiently.

Fuzzers can be divided into two types according to their target: those that fuzz an entire program, such as AFL, and those that fuzz a specific function, such as libFuzzer. In the former case, the fuzzer generates and supplies the program's input, so you don't need to modify the program. This is not always efficient for large programs, although in practice AFL has found many bugs in many programs. The latter is efficient for the function under test because the fuzzer calls it directly, but we need to write some glue code to connect the fuzzer to the function.

bpftrace options for fuzzing

bpftrace has several options useful for fuzzing.

BPFTRACE_MAX_AST_NODES environment variable

When fuzzing, it is important to limit the number of AST nodes; otherwise, a fuzzer might keep generating very long programs that cause a stack overflow. The BPFTRACE_MAX_AST_NODES environment variable controls the maximum number of AST nodes.
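
As a minimal sketch, the variable can be set for a single invocation rather than exported for the whole shell. The helper name run_limited is hypothetical, and 200 is simply the value used in the AFL example in this document, not a recommended limit. The snippet uses env as a stand-in command so that it runs without a bpftrace build.

```shell
# Illustrative value only; pick a limit that suits your fuzzing setup.
MAX_NODES=200

# run_limited is a hypothetical helper: it runs the given command with
# BPFTRACE_MAX_AST_NODES set for that one invocation only.
run_limited() {
    BPFTRACE_MAX_AST_NODES="$MAX_NODES" "$@"
}

# `env` stands in for the fuzz binary and shows what the child process sees.
run_limited env | grep '^BPFTRACE_MAX_AST_NODES='

# With a fuzz build available (path assumed from the build steps):
#   run_limited ./src/bpftrace_fuzz script.bt
```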

Fuzzing with AFL

Here, I briefly describe how to fuzz bpftrace with AFL. I highly recommend reading the documentation in AFL's repository for further information.

Install AFL (or AFLPlusPlus)

Please install AFL or AFLPlusPlus according to its instructions. I use AFLPlusPlus because it works well in my environment. AFLPlusPlus is a fork of AFL, created during a period when AFL was not being updated. AFL is now hosted under Google's GitHub organization, and its development continues. AFL and AFLPlusPlus have almost the same interface.

Compile for fuzzing

To use AFL, we need to compile the program with AFL's compiler, a wrapper around gcc/clang that adds instrumentation for measuring coverage. Below is an example build configuration.

CC=/path/to/AFLplusplus/afl-clang-fast \
CXX=/path/to/AFLplusplus/afl-clang-fast++ \
AFL_USE_ASAN=1 \
cmake .. \
-DBUILD_FUZZ=1 \
-DFUZZ_TARGET=codegen \
-DCMAKE_BUILD_TYPE=Debug \
-DBUILD_TESTING=0 \
-DBUILD_ASAN=1

Then, build:

AFL_USE_ASAN=1 make -j$(nproc)

Important points:

  • The -DBUILD_FUZZ option is required to build bpftrace for fuzzing. It adds -DFUZZ to the compile options.
  • -DFUZZ_TARGET makes the program stop right after the specified stage. The supported values are "semantic" and "codegen". For example, with -DFUZZ_TARGET=semantic, the program stops after semantic analysis.
  • AddressSanitizer can use a lot of memory. If you want to fuzz without it, remove AFL_USE_ASAN and -DBUILD_ASAN.
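
The options above can be combined in a small helper. The following is only a sketch: fuzz_configure_cmd is a hypothetical name, AFL_DIR is an assumed variable holding the placeholder AFL path, and the function prints the configure command instead of running it, so it can be inspected first.

```shell
# fuzz_configure_cmd (hypothetical helper) prints the cmake command line
# for an AFL fuzz build without executing it.
#   $1: fuzz target, "semantic" or "codegen"
#   $2: 1 to build with AddressSanitizer (default), 0 to build without it
fuzz_configure_cmd() {
    fuzz_target=$1
    use_asan=${2:-1}
    # AFL_DIR is an assumed variable; the default is a placeholder path.
    afl_dir=${AFL_DIR:-/path/to/AFLplusplus}

    case $fuzz_target in
        semantic|codegen) ;;
        *) echo "unsupported FUZZ_TARGET: $fuzz_target" >&2; return 1 ;;
    esac

    cmd="CC=$afl_dir/afl-clang-fast CXX=$afl_dir/afl-clang-fast++"
    if [ "$use_asan" = 1 ]; then
        cmd="$cmd AFL_USE_ASAN=1"
    fi
    cmd="$cmd cmake .. -DBUILD_FUZZ=1 -DFUZZ_TARGET=$fuzz_target"
    cmd="$cmd -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTING=0"
    if [ "$use_asan" = 1 ]; then
        cmd="$cmd -DBUILD_ASAN=1"
    fi
    printf '%s\n' "$cmd"
}

# Example: a semantic-analysis build without AddressSanitizer.
fuzz_configure_cmd semantic 0
```

Printing the command keeps the sketch free of side effects; the printed line can be pasted into a shell from the build directory once it looks right.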

Let's fuzz

First, AFL requires some settings for efficient fuzzing.

echo core | sudo tee -a /proc/sys/kernel/core_pattern
cd /sys/devices/system/cpu
echo performance | sudo tee cpu*/cpufreq/scaling_governor
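
Before changing the setting, it can help to check what is currently configured. afl-fuzz checks at startup whether core dumps are piped to an external handler, which is the case when core_pattern begins with "|". The sketch below performs the same kind of check; check_core_pattern is a hypothetical helper name, and the optional file argument exists only so that the function can be tried on an ordinary file.

```shell
# check_core_pattern (hypothetical helper) reports whether core dumps are
# piped to an external handler.
#   $1: file to inspect (defaults to the kernel's core_pattern file)
# Returns 0 if the pattern is fine, 1 if it is piped, 2 if unreadable.
check_core_pattern() {
    pattern_file=${1:-/proc/sys/kernel/core_pattern}
    pattern=$(cat "$pattern_file") || return 2
    case $pattern in
        '|'*)
            echo "core dumps are piped to a handler: $pattern"
            return 1
            ;;
        *)
            echo "core_pattern is fine: $pattern"
            ;;
    esac
}

# Only run the check where the file exists (Linux).
if [ -r /proc/sys/kernel/core_pattern ]; then
    check_core_pattern || echo "update core_pattern before starting afl-fuzz"
fi
```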

Then, start fuzzing! AFL and AddressSanitizer have many settings, so please read their documentation for details. A sample invocation looks like this:

CPU=0
FUZZER=/path/to/AFLplusplus/afl-fuzz

sudo AFL_NO_AFFINITY=1 \
     ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:symbolize=0 \
     BPFTRACE_MAX_AST_NODES=200 \
     taskset -c ${CPU} \
     $FUZZER -M 0 -m none -i ./input -o ./output -t 3000 -- \
     ./src/bpftrace_fuzz @@

Several points are worth noting:

  • bpftrace_fuzz is the fuzzing binary that we built. It is a slightly modified version of bpftrace, and its first argument is the script file name. If no argument is given, it reads a script from stdin.
  • -i is the input directory, and -o is the output directory. The input directory must contain at least one file to start fuzzing. The simplest example is echo a > input/a. More sophisticated inputs can be created by extracting bpftrace programs from the source tree, especially from the tests directory (see the "Input corpus creation" section below for details). Inputs that crash the program are saved in a crashes directory under the output directory.
  • bpftrace is known to have several memory leaks, so ASAN_OPTIONS=detect_leaks=0 is needed. Otherwise, the fuzzer treats each memory leak as a crash and reports it. abort_on_error=1:symbolize=0 is also required for fuzzing.
  • -t 3000 is the timeout for each execution, in milliseconds. Because processing (especially codegen) sometimes takes a long time, it is important to use a longer timeout. Otherwise, AFL would stop fuzzing.
  • @@ is replaced by the path of the input file generated by the fuzzer.
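
The directory handling described above can be sketched as follows. Both helper names are hypothetical, the second seed is an illustrative one-liner rather than a script from the test suite, and the snippet works in a temporary directory so that it does not touch a real fuzzing run.

```shell
# prepare_seed_dir (hypothetical helper) creates a seed directory for
# `afl-fuzz -i` containing two small inputs.
prepare_seed_dir() {
    seed_dir=$1
    mkdir -p "$seed_dir"
    # The minimal seed mentioned in the text.
    printf '%s\n' 'a' > "$seed_dir/a"
    # An illustrative bpftrace one-liner (not taken from the test suite).
    printf '%s\n' 'BEGIN { printf("hello\n"); exit(); }' > "$seed_dir/begin"
}

# count_crashes (hypothetical helper) counts saved crashing inputs in any
# `crashes` directory below the output directory, skipping the README.txt
# that AFL writes there.
count_crashes() {
    find "$1" -type f -path '*/crashes/*' ! -name 'README.txt' | wc -l
}

workdir=$(mktemp -d)
prepare_seed_dir "$workdir/input"
ls "$workdir/input"
mkdir -p "$workdir/output"
count_crashes "$workdir/output"
rm -rf "$workdir"
```

Searching for any crashes directory below the output directory keeps the count correct whether AFL writes directly into the output directory or into a per-instance subdirectory.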

Fuzzing with libFuzzer

libFuzzer is a coverage-guided fuzzer developed as part of the LLVM project, and bpftrace can be fuzzed with it.

Compile with libFuzzer

To use libFuzzer, compile bpftrace with clang as follows.

CC=clang-10 CXX=clang++-10 cmake .. -DBUILD_ASAN=1 -DBUILD_FUZZ=1 -DUSE_LIBFUZZER=1 -DBUILD_TESTING=0

-DBUILD_FUZZ=1 and -DUSE_LIBFUZZER=1 are required.

Fuzzing with libFuzzer

The compiled binary itself is a fuzzer, and fuzzing can be performed with commands like the following.

sudo ASAN_OPTIONS=detect_leaks=0 ./src/bpftrace_fuzz -max_len=1024 input

"input" is the corpus directory, just as with AFL. Unlike AFL, libFuzzer stops at the first crash.

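When libFuzzer stops on a failure, it normally writes the offending input to a file named crash-<hash> in the current directory (timeout-, oom-, and leak- prefixes are used for other failure kinds), and its -artifact_prefix option changes the location. A minimal sketch for listing such files follows; list_artifacts is a hypothetical helper name, and the reproduction command in the comment assumes bpftrace_fuzz behaves like a standard libFuzzer binary.

```shell
# list_artifacts (hypothetical helper) lists libFuzzer artifact files in a
# directory (default: the current directory), sorted by name.
list_artifacts() {
    find "${1:-.}" -maxdepth 1 -type f \
        \( -name 'crash-*' -o -name 'timeout-*' \
           -o -name 'oom-*' -o -name 'leak-*' \) | sort
}

list_artifacts .

# A libFuzzer binary re-runs a single input when given a file instead of
# a corpus directory, for example:
#   sudo ASAN_OPTIONS=detect_leaks=0 ./src/bpftrace_fuzz crash-<hash>
```
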
Input corpus creation

Here are some examples of creating an input corpus.

  • Extract scripts from runtime tests (only the ones enclosed in single quotes)
cat ./tests/runtime/scripts/* | grep RUN | grep -- "-e"| \
sed "s/[^']*'\([^']*\)'.*/\1/" | parallel -N1 "echo {} > input/{#}"
  • Use bpftrace tools but remove comments and blank lines
find ./tools -name "*.bt" | \
parallel -N1 "sed -e '/^#\!/d' -e '/\/\*.*/d' -e '/^\s\*.*/d' -e '/\/\/.*/d' -e 's/^\s\+//g' -e '/^$/d' {} > input/{#}"
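
Extraction commands like the ones above can leave empty or duplicated files in the corpus. The following sketch removes them; prune_corpus is a hypothetical helper, and duplicates are detected by cksum output (checksum and size), which is an approximation rather than a byte-for-byte comparison. AFL also ships afl-cmin for coverage-based corpus minimization, which this sketch does not replace.

```shell
# prune_corpus (hypothetical helper) removes empty files and files whose
# checksum and size match an earlier file in the same directory.
prune_corpus() {
    corpus_dir=$1
    # Drop empty files, which give the fuzzer nothing to mutate.
    find "$corpus_dir" -type f -size 0 -exec rm -f {} +
    # Remember each checksum seen so far; later matches are removed.
    seen=$(mktemp)
    for f in "$corpus_dir"/*; do
        [ -f "$f" ] || continue
        sum=$(cksum < "$f")
        if grep -qxF "$sum" "$seen"; then
            rm -f "$f"
        else
            printf '%s\n' "$sum" >> "$seen"
        fi
    done
    rm -f "$seen"
}

# Demonstration in a temporary directory: two.bt duplicates one.bt and
# three.bt is empty, so only one.bt is left after pruning.
demo=$(mktemp -d)
printf '%s\n' 'BEGIN { exit(); }' > "$demo/one.bt"
printf '%s\n' 'BEGIN { exit(); }' > "$demo/two.bt"
: > "$demo/three.bt"
prune_corpus "$demo"
ls "$demo"
rm -rf "$demo"
```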

Found bugs

AFL

libFuzzer