CrashMonkey is a file-system agnostic testing framework for file-system consistency. It is meant to explore many crash states that are possible when a computer crashes in the middle of a write operation. CrashMonkey is made up of 3 main parts:
- file-system agnostic kernel module for bio logging and disk snapshotting
- user space test harness which coordinates everything
- user space, user defined test cases which specify the workload to be tested and, optionally, data consistency tests to run on each generated crash state.
The HotStorage'17 paper "CrashMonkey: A Framework to Automatically Test File-System Crash Consistency" has a more detailed explanation of the internals of CrashMonkey.
CrashMonkey also makes use of common Linux file system checker and repair programs like `fsck`.
The easiest (and recommended) way to start working on (or using) CrashMonkey is to set up a virtual machine and run everything in the VM. This is partly so that any bugs in the kernel module don't bring down your whole system and partly because I just find it easier. In the future I may try to get a Docker container running with all the needed packages and files so that things are easy to set up and get running. In the meantime, you should spin up an Ubuntu 14.04 LTS or Ubuntu 16.04.2 LTS VM and work in there. CrashMonkey is known to work on kernel versions 3.13.0-121-generic and 4.4.0-62-generic. The Ubuntu VM that you create will also need the following packages to properly build and run CrashMonkey:
- make
- git
- gcc
- g++
- linux kernel headers
  - install with `sudo apt-get install linux-headers-$(uname -r)`
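As a quick sanity check before building, something like the following sketch could report which of the tools listed above are missing inside the VM. The `missing_tools` helper is a hypothetical name made up for this example, and the `/usr/src/linux-headers-$(uname -r)` path is an assumption about where Ubuntu's headers package installs its files.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is not found on PATH,
# one per line. Prints nothing if every tool is present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools named in the package list above.
missing=$(missing_tools make git gcc g++)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing tools:" $missing
else
    echo "all build tools found"
fi

# The kernel headers are a directory rather than a command. The path below
# is an assumption about where Ubuntu's linux-headers package puts them.
if [ -d "/usr/src/linux-headers-$(uname -r)" ]; then
    echo "kernel headers found"
else
    echo "kernel headers missing; try: sudo apt-get install linux-headers-$(uname -r)"
fi
```

Anything reported as missing can be installed with `apt-get` before running `make`.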
Furthermore, the VM should have enough disk space to build CrashMonkey as well as enough RAM to run any tests you want. I mention RAM because CrashMonkey uses a RAM block device during its tests, so you will need to give the VM at least as much RAM as the largest test you plan on running. For small tests, a 20 GB hard drive (enough for the Ubuntu install and all other files; I'm lazy and don't feel like trimming it down more than that) and 2-4 GB of RAM should be more than enough.
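To make the RAM requirement concrete, here is a minimal sketch that compares the VM's total memory against the size of the RAM block device you plan to request. The `mem_total_kb` and `ram_fits` helpers are hypothetical names for this example, and the check is only a lower bound because the OS itself also needs memory. The 10240 KB figure is the size passed with `-e` in the background-mode smoke test in this README.

```shell
#!/bin/sh
# Hypothetical helper: print MemTotal in KB from a meminfo-style file
# (defaults to /proc/meminfo).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: succeed if total RAM is at least <size_kb>.
# Usage: ram_fits <size_kb> [meminfo_file]
ram_fits() {
    [ "$(mem_total_kb "${2:-}")" -ge "$1" ]
}

# Example: check against a 10240 KB RAM block device.
if ram_fits 10240; then
    echo "total RAM is at least as large as the RAM block device"
else
    echo "not enough RAM for the requested RAM block device"
fi
```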
If you are new to building and running VMs and would like to try something other than VirtualBox, I would recommend using `vmbuilder`, `libvirt`, `qemu`, `kvm`, and the `vmbuilder` script in the repo to get everything set up (script generously provided by Ian Neal).
To get everything working:
- Follow steps 1-3 on this random website about setting up kvm on Ubuntu 16.04 LTS
- To fix some odd permission issue with libvirt, run:
  - `sudo apt-get install apparmor-profiles apparmor-utils`
  - `sudo aa-complain /usr/lib/libvirt/virt-aa-helper`
- `git clone` the CrashMonkey repo into a directory of your choosing
- edit `setup/create_vm.sh` to point to the directory you want the VM disk in, add any additional packages you may want, and change user names
- run `setup/create_vm.sh <VM name> <VM IP>` to create a new VM and register it with `libvirt`
  - Note that you may have to comment out line 153 in `/usr/lib/python2.7/dist-packages/VMBuilder/plugins/ubuntu/dapper.py` of the `vmbuilder` python code in order to get it to run properly. Otherwise, it may have an issue with copying over sudo templates.
  - Sit back and drink some coffee as this process may take a little while
- run `virsh edit <VM name>` and fix the disk that is passed into the VM as the boot drive. It defaults to the random alphanumeric name that `vmbuilder` generates, but the last few lines of the script move it to the name of the VM itself.
  - Note that you may also have to edit the name of the bridge by running `virsh edit <VM name>` depending on your system.
- Fire up the newly created VM and `ssh` into it
- `git clone --recursive` the CrashMonkey repo into a directory of your choosing
  - sorry, I'll edit the `vmbuilder` script at some point to just copy these files over
CrashMonkey can be built simply by running `make` in the root directory of the repository. This will build all the kernel modules, tests, and test harness components needed to run CrashMonkey.
User defined tests reside in the `code/tests` directory. They can be compiled into shared objects (`.so` files) with `make tests`.
Some tests for CrashMonkey reside in the `test` directory of the repo. Tests leverage `googletest` and are used to ensure the correctness and functionality of some of the user space portions of CrashMonkey (e.g. the descendants of the `Permuter` class). Right now you'll have to examine the names of the output binaries to determine what each binary tests. In the future, the build system will be updated to run the tests after compiling them.
CrashMonkey can be run either as a standalone program or as a background program. When in standalone mode, CrashMonkey will automatically load and run the user defined C++ setup and workload methods. In both modes, CrashMonkey will look for user defined data consistency tests in the `.so` test file provided to CrashMonkey on the command line. When CrashMonkey is run as a background process, the user can run setup and workload methods outside of CrashMonkey and use a series of simple stub programs to communicate with CrashMonkey. In both modes of operation, command line flags have the same meaning.
Before running any tests with CrashMonkey, you will have to create a directory at `/mnt/snapshot` for the test harness to mount test devices at. If you would like to run CrashMonkey by hand, you must run the `c_harness` binary and at least provide the following:
- `-f` - block device to copy device queue flags from. This controls what flags (FUA, flush, etc) will be allowed to propagate to the device wrapper. Something like `/dev/vda` should work for this
- `-t` - file system type, right now CrashMonkey is only tested on ext4
- `-d` - device to run tests on. Currently the only valid option is `/dev/cow_ram0`. This flag should hopefully go away soon.
To run CrashMonkey yourself, use: `../build/c_harness <flags> <user defined workload>`
A full listing of flags for CrashMonkey can be found in `code/harness/c_harness.c`.
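Putting the three required flags together, a standalone invocation might look like the sketch below. It only prints the commands rather than running them. The `standalone_cmds` helper is a hypothetical name for this example, the use of `sudo` is assumed by analogy with the background-mode example in this README, and the workload path assumes compiled tests end up under `build/tests`.

```shell
#!/bin/sh
# Hypothetical helper: print (rather than run) the commands for a
# standalone CrashMonkey run.
# Usage: standalone_cmds <flags_device> <fs_type> <workload.so>
standalone_cmds() {
    # The harness mounts test devices at /mnt/snapshot, so create it first.
    echo "sudo mkdir -p /mnt/snapshot"
    # -d is hard-coded: /dev/cow_ram0 is currently the only valid option.
    echo "sudo ../build/c_harness -f $1 -t $2 -d /dev/cow_ram0 $3"
}

# Example values: /dev/vda and ext4 come from the flag descriptions above;
# the workload path is an assumption.
standalone_cmds /dev/vda ext4 ../build/tests/echo_sub_dir.so
```

Review the printed commands and adjust the device and workload path for your VM before running them by hand.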
There are currently no scripts or pre-defined `make` rules for running CrashMonkey as a background process. However, an example of how to run a simple CrashMonkey smoke test in background mode is shown below. Before running this test, you will have to create a directory at `/mnt/snapshot` for the test harness to mount test devices at.
- open 2 shells in your virtual machine and `cd` into the root directory of the repository
- shell 1: `make`
- shell 1: `cd build`
- shell 2: `cd build`
- shell 1: `sudo ./c_harness -f /dev/sda -t ext4 -m barrier -d /dev/cow_ram0 -e 10240 -b tests/echo_sub_dir.so`
  - `-e` specifies the RAM block device size to use in KB
  - `-f` specifies another block device to copy IO scheduler flags from
- shell 2: `sudo mkdir /mnt/snapshot/test_dir`
- shell 2: `sudo user_tools/begin_log`
- shell 2: `sudo touch /mnt/snapshot/test_dir/test_file`
- shell 2: `sudo chmod 0777 /mnt/snapshot/test_dir/test_file`
- shell 2: `echo "hello great big world out there" | sudo tee /mnt/snapshot/test_dir/test_file`
- shell 2: `sudo user_tools/end_log`
- shell 2: `sudo user_tools/begin_tests`
Again, a full list of flags for CrashMonkey can be found in `code/harness/c_harness.c`.
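If you repeat the smoke test often, the shell 2 steps above could be collected into a small script along these lines. This is only a sketch: the `run` wrapper, the `DRY_RUN` variable, and the `smoke_test_workload` function are hypothetical names invented for this example, and the script assumes it is run from the `build` directory after `c_harness` has already been started in shell 1.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN is set,
# otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Shell 2 side of the background-mode smoke test, in the same order as
# the steps listed above.
smoke_test_workload() {
    dir=/mnt/snapshot/test_dir
    run sudo mkdir "$dir"
    run sudo user_tools/begin_log
    run sudo touch "$dir/test_file"
    run sudo chmod 0777 "$dir/test_file"
    # Same write as the manual step, wrapped in sh -c so it fits the wrapper.
    run sudo sh -c "echo 'hello great big world out there' | tee $dir/test_file"
    run sudo user_tools/end_log
    run sudo user_tools/begin_tests
}
```

Setting `DRY_RUN=1` before calling `smoke_test_workload` prints the commands without touching the system, which is a cheap way to double check the sequence first.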
- Contributed code should follow Google's C++ Style Guide (the current code loosely follows that already).
- Contributed user defined tests can currently use any method to write to the file system under test. This can include using C/C++ to open/write to files or using the `system` call in C++ to call a shell function.
- All user defined tests must adhere to the interface defined in `code/tests/BaseTestCase.h` and must inherit from this class
- All user defined tests must include `test_case_get_instance()` and `test_case_delete_instance()` method implementations (see `code/tests/echo_sub_dir.cpp` for an example)
  - In the future this will become a macro that is added at the end of the file
  - This is used by the test harness to create and destroy tests on the fly without recompiling the entire harness
- All user defined permuters must adhere to the interface defined in `code/permuters/Permuter.h` and must inherit from this class
- All user defined permuters must include `permuter_get_instance()` and `permuter_delete_instance()` method implementations (see `code/permuter/RandomPermuter.cpp` for an example)
  - In the future this will become a macro that is added at the end of the file
  - This is used by the test harness to create and destroy permuters on the fly without recompiling the entire harness
If you run into system crashes etc. from a buggy CrashMonkey kernel module, you may want to try using `stap` to help place print statements in arbitrary places in the kernel. Alternatively, you could put `printk`s in the kernel module itself.
- Rework scripts to set up the VM, install packages, etc
- Switch to `CMake` or `Bazel` instead of plain, poorly written `Makefiles`
- Use `gflags` to parse command line flags
  - I need to test if `gflags` can properly pick up and parse flags from dynamically loaded shared objects
- Rework the following portions of the test harness
  - Make a class to manage disks, partitions, and formatting disks
  - Make a class to manage kernel modules
- Make running test cases multi-threaded
- Make the `disk_wrapper` work on volumes that span multiple block devices
- Clean up the interface for generating crash states
I can be reached at [email protected]. Please don't spam this email and please begin your subject line with `CrashMonkey:` because I do filter my messages.