Marcel edited this page Mar 15, 2019 · 3 revisions

Infrastructure Tutorial

In this tutorial we give a short introduction to software-engineering tools and a brief overview of the infrastructure used in SeqAn.

Source Code Management (Git)

A source code management system (SCM) keeps multiple states of a file system and its files and makes it easy to access any of these states at any time. In essence, every SCM manages complete backups that serve as restore-points of a file system, and it stores and manages them in a time- and space-efficient way.

The rationale is simple: every developer on a project should be able to have the same state of the source code as a co-worker.

We could also compress the complete file system and share it with others, but an SCM makes it easier to combine different states of the file system.

We use Git, a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

For further reading we would recommend https://git-scm.com/.

Install git

Configure your username and email

> git config --global user.name "My Name"
> git config --global user.email "you@example.com"

Create a repository

First let's create a project folder.

mkdir seqan3-infrastructure-tutorial
cd seqan3-infrastructure-tutorial

Initialise a repository

And then make this folder a git repository.

> git init
Initialized empty Git repository in /seqan3-infrastructure-tutorial/.git/

Add the first file to the repository

Write a program that outputs "Hello World", call it hello.cpp, compile it and execute it.

solution:
> cat > hello.cpp << 'EOF'
#include <iostream>

int main()
{
    std::cout << "Hello World!\n";
    return 0;
}
EOF

> g++ hello.cpp -o hello

> ./hello
Hello World!

A file can be in one of four states:

  • untracked: the repository does not know this file yet,
  • unstaged: the file is tracked and was modified, but the changes are not yet marked for the next commit,
  • staged: the changes are marked for inclusion in the next commit, and
  • committed: the changes were recorded in the repository.
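The transitions between these states can be sketched as a small state machine. This is only an illustrative model: the enum and the functions add, commit and modify are hypothetical names and not part of Git, and the model ignores that a real file can carry staged and unstaged changes at the same time.

```cpp
// Hypothetical model of the four file states; Git exposes no such enum.
enum class FileState { untracked, unstaged, staged, committed };

// State after `git add <file>`: new or modified content becomes staged.
inline FileState add(FileState state)
{
    return state == FileState::committed ? FileState::committed : FileState::staged;
}

// State after `git commit`: only staged changes are recorded.
inline FileState commit(FileState state)
{
    return state == FileState::staged ? FileState::committed : state;
}

// State after editing the file: a tracked file now has unstaged changes.
inline FileState modify(FileState state)
{
    return state == FileState::untracked ? FileState::untracked : FileState::unstaged;
}
```

In this model a new file only reaches the committed state via add followed by commit, which mirrors the order of the commands used in this tutorial.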

See with git status which state your hello.cpp has.

solution:
> git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

    hello.cpp
    hello

As you can see the file hello.cpp is not tracked yet.

Tip: You should run git status after each operation. You could also use a GUI like gitk to see the status.

We now stage the file.

> git add hello.cpp

See again which state your hello.cpp has.

solution:
> git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   hello.cpp

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    hello

We will now commit the changes, which basically means that we create a restore-point.

> git commit -m 'Add hello world example'
[master (root-commit) 2687bd1] Add hello world example
 1 file changed, 7 insertions(+)
 create mode 100644 hello.cpp

A commit is a collection of changes from the previous restore-point to the newly created restore-point. This means that we could have also created multiple files (or changes to existing ones) and committed them in one batch.

Add a second file to the repository and modify the first

Create a second hello world program, named hello2.cpp and change the phrase of hello.cpp to "Hi Folks!".

solution:
> cp hello{,2}.cpp
> cat > hello.cpp << 'EOF'
#include <iostream>

int main()
{
    std::cout << "Hi Folks!\n";
    return 0;
}
EOF

Stage the file and then commit the changes. (See with git status how the status changes.)

solution:
> git add hello.cpp hello2.cpp
> git commit -m 'Add second hello world example and change first hello world example'
[master 566333c] Add second hello world example and change first hello world example
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 hello2.cpp

Remove the second file

Remove the hello2.cpp file with git rm and commit the changes.

solution:
> git rm hello2.cpp
rm 'hello2.cpp'
> git commit -m 'Remove second hello world example'
[master 12ee27c] Remove second hello world example
 1 file changed, 7 deletions(-)
 delete mode 100644 hello2.cpp

Work on a different branch

Branches are a method to switch between different states of the file system.

The default branch is called master.

> git branch new-feature

This creates a new branch called new-feature.

> git checkout new-feature
Switched to branch 'new-feature'

This switches from the default branch to the newly created new-feature branch.

Revert an older commit

Sometimes you want to undo a commit. In this case we want to undo the last commit.

> git revert HEAD
[new-feature 5fe4492] Revert "Remove second hello world example"
 1 file changed, 7 insertions(+)
 create mode 100644 hello2.cpp

Instead of HEAD we could also have written master or new-feature, because before the revert all of these names pointed to the same commit. To undo the second-to-last commit instead, we would write HEAD~1.
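How names such as master, new-feature and HEAD~1 resolve to commits can be sketched with a simplified model. Everything here (Commit, make_commit, ancestor, Branches) is a hypothetical illustration and not Git's actual data structure: real commits also store a snapshot and an author, and can have several parents.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, heavily simplified commit: a message and a link to its parent.
struct Commit
{
    std::string message;
    std::shared_ptr<Commit const> parent; // nullptr for the root commit
};

using CommitPtr = std::shared_ptr<Commit const>;

// In this model a branch is nothing more than a name that points to a commit.
using Branches = std::map<std::string, CommitPtr>;

// Creates a new commit on top of `parent`.
inline CommitPtr make_commit(std::string message, CommitPtr parent)
{
    return std::make_shared<Commit>(Commit{std::move(message), std::move(parent)});
}

// Resolves `start~n` by following the parent link n times.
// Returns nullptr if the history is shorter than n commits.
inline CommitPtr ancestor(CommitPtr start, unsigned n)
{
    while (start && n-- > 0)
        start = start->parent;
    return start;
}
```

With three commits on master, ancestor(head, 1) plays the role of HEAD~1. Creating a branch adds a second entry to Branches that points to the same commit, which is why creating a branch is cheap.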

Work on a different branch

First switch to the default branch (master). Notice that hello2.cpp is now missing.

Then create a new branch new-feature2 and create a new file hello3.cpp. After that commit everything.

solution:
> git checkout master
Switched to branch 'master'

> git checkout -b new-feature2 # is a shortcut for create branch + checkout
Switched to a new branch 'new-feature2'

> cp hello{,3}.cpp

> git add hello3.cpp
> git commit -m "Add third hello world example"
[new-feature2 213ebc2] Add third hello world example
 1 file changed, 7 insertions(+)
 create mode 100644 hello3.cpp

Get changes from new-feature and new-feature2 into master

We will now apply the changes from new-feature and new-feature2 into master. For this we will use git merge.

First check out master and then merge both feature branches.

solution:
> git checkout master
Switched to branch 'master'

> git merge new-feature new-feature2 # You can also merge them separately
Fast-forwarding to: new-feature
Trying simple merge with new-feature2
Merge made by the 'octopus' strategy.
 hello2.cpp | 7 +++++++
 hello3.cpp | 8 ++++++++
 2 files changed, 15 insertions(+)
 create mode 100644 hello2.cpp
 create mode 100644 hello3.cpp

Delete old branches

Use git branch -d to delete old branches.

solution:
> git branch -d new-feature new-feature2
Deleted branch new-feature (was 5fe4492).
Deleted branch new-feature2 (was ec0eeed).

Fork, Pull and Push to remote repository

Go to https://github.com/marehr/seqan3-infrastructure-tutorial and fork it (you need a GitHub account).

After that, clone it (note that <name> is your GitHub user name).

> git clone git@github.com:<name>/seqan3-infrastructure-tutorial.git
Cloning into 'seqan3-infrastructure-tutorial'...
remote: Enumerating objects: 16, done.
remote: Counting objects: 100% (16/16), done.
remote: Compressing objects: 100% (10/10), done.
Unpacking objects: 100% (16/16), done.
remote: Total 16 (delta 4), reused 16 (delta 4), pack-reused 0

Be creative and commit some changes on a new branch called remote.

solution:
> git checkout -b remote
Switched to a new branch 'remote'

> git rm hello2.cpp hello3.cpp
rm 'hello2.cpp'
rm 'hello3.cpp'

> git commit -m 'revert changes'
[remote 865e0c0] revert changes
 2 files changed, 15 deletions(-)
 delete mode 100644 hello2.cpp
 delete mode 100644 hello3.cpp

After that publish your changes to your remote repository.

> git push origin remote
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 229 bytes | 229.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To github.com:<name>/seqan3-infrastructure-tutorial.git
   04ca7ed..9048277  remote -> remote

You can update your local repository by pulling changes from your remote repository. Here we bring the local master branch up to date with the master branch of the remote.

> git checkout master
Switched to branch 'master'

> git pull origin master
From github.com:<name>/seqan3-infrastructure-tutorial
 * branch            master     -> FETCH_HEAD
Already up to date.

Managing the build process with CMake

CMake controls the software compilation process using simple platform- and compiler-independent configuration files. From these it generates native makefiles and workspaces that can be used in the compiler environment of your choice.

Install CMake

Create a CMake configuration file

Create a file "CMakeLists.txt" and put the following content into it.

cmake_minimum_required (VERSION 3.2) # require at least version 3.2 of cmake
project (seqan3_infrastructure_tutorial CXX) # this is a c++-only project

add_executable(hello hello.cpp) # build an executable from the hello.cpp source

This is a minimal example of a working CMake file. You can query your cmake version by calling cmake --version.

Generate makefiles

Create a folder named build and generate the makefiles by using cmake <source-dir>.

solution: If you are stuck, you can check out solution01.
> mkdir build
> cd build
> cmake ..
-- The CXX compiler identification is GNU 8.2.1
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /seqan3-infrastructure-tutorial/build

Then build the executable hello and execute it.

solution:
> make
Scanning dependencies of target hello
[ 50%] Building CXX object CMakeFiles/hello.dir/hello.cpp.o
[100%] Linking CXX executable hello
[100%] Built target hello
> ./hello
Hi Folks!

We suggest reading https://cmake.org/cmake/help/latest/ to find the available commands and how to use them.

Factorial function

Write a factorial function (i.e. n!) in a header file called factorial.hpp, and a main file factorial.cpp that reads user input from std::cin and prints "The factorial of <input> is <result>.". Use CMake to build the executable.

The function signature should be

long factorial(long n); // compute n!

Please only look at the solution if you are stuck.

solution: You can also check out solution02.
// factorial.hpp
#pragma once

// Computes n!; returns -1 for negative n, for which n! is undefined.
// inline lets several source files include this header.
inline long factorial(long n)
{
    if (n < 0)
        return -1;
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// factorial.cpp
#include <iostream>
#include "factorial.hpp"

int main()
{
    long n;
    std::cout << "n is ";
    std::cin >> n;

    long fac = factorial(n);
    std::cout << "The factorial of " << n << " is " << fac << ".\n";
}

# CMakeLists.txt
cmake_minimum_required (VERSION 3.2)
project (seqan3_infrastructure_tutorial CXX)

add_executable(hello hello.cpp)
add_executable(factorial factorial.cpp)
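The signature long factorial(long n) cannot represent large results: on a typical 64-bit platform n! already exceeds the range of long for n > 20, and signed overflow is undefined behaviour in C++. The following sketch shows one possible way to detect this. The name checked_factorial and the choice to report errors as -1 are assumptions made for this illustration and are not part of the tutorial's solution.

```cpp
#include <limits>

// Hypothetical variant of factorial(): returns -1 if n is negative or if
// n! does not fit into a long, instead of silently overflowing.
inline long checked_factorial(long n)
{
    if (n < 0)
        return -1;

    long result = 1;
    for (long i = 2; i <= n; ++i)
    {
        // result * i would exceed the maximum exactly if result > max / i.
        if (result > std::numeric_limits<long>::max() / i)
            return -1;
        result *= i;
    }
    return result;
}
```

The check divides instead of multiplying, so the overflow is detected before it can happen. An iterative loop is used here only to keep the check in one place; the recursive form would work as well.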

Unit tests

You probably tested your factorial function by trying out some values. Unit testing follows the same idea, but you write your inputs and expected outputs into a cpp file. This gives you reproducible test cases and a record of which cases you actually covered.

Simple unit test

We will first write a conceptual unit test.

Create a test case named factorial_test.cpp and add it to CMake.

#include <cassert>
#include "factorial.hpp"

int main()
{
    // normal cases
    assert(factorial(1) == 1);
    assert(factorial(2) == 2);
    assert(factorial(3) == 6);
    assert(factorial(4) == 24);
    assert(factorial(10) == 3628800);

    // edge case zero
    assert(factorial(0) == 1);

    // edge case negative numbers
    assert(factorial(-1) == -1);
    return 0;
}

And test if your code is correct :)
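A drawback of plain assert is that the program stops at the first failing check, so later cases are never evaluated. The sketch below shows the idea behind a table of test cases that reports every mismatch. TestCase and count_failures are hypothetical names for this illustration. A unit testing framework provides the same idea with much better reporting.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// One test case: an input and the result we expect for it.
struct TestCase
{
    long input;
    long expected;
};

// Runs `function` on every case, prints each mismatch to std::cerr and
// returns the number of failed cases.
template <typename Function>
std::size_t count_failures(Function function, std::vector<TestCase> const & cases)
{
    std::size_t failures = 0;
    for (TestCase const & test_case : cases)
    {
        long const actual = function(test_case.input);
        if (actual != test_case.expected)
        {
            std::cerr << "f(" << test_case.input << ") returned " << actual
                      << ", expected " << test_case.expected << '\n';
            ++failures;
        }
    }
    return failures;
}
```

For the factorial function one could call count_failures(factorial, {{1, 1}, {2, 2}, {3, 6}, {0, 1}, {-1, -1}}) from main and return a non-zero exit code if the result is not zero, so that a failure is visible to the caller.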

Use CTest from CMake

You can define which executables are test cases with CMake and CMake will generate a nice overview of which tests failed or succeeded.

To enable this feature, add the following lines to your CMakeLists.txt.

enable_testing()
add_executable(factorial_test factorial_test.cpp)
add_test(factorial factorial_test)

Important: You have to first build the binaries before you can test them.

> cd build
> make
-- Configuring done
-- Generating done
-- Build files have been written to: /seqan3-infrastructure-tutorial/build
[ 33%] Built target hello
Scanning dependencies of target factorial_test
[ 50%] Building CXX object CMakeFiles/factorial_test.dir/factorial_test.cpp.o
[ 66%] Linking CXX executable factorial_test
[ 66%] Built target factorial_test
Scanning dependencies of target factorial
[ 83%] Building CXX object CMakeFiles/factorial.dir/factorial.cpp.o
[100%] Linking CXX executable factorial
[100%] Built target factorial

> ctest
Test project /seqan3-infrastructure-tutorial/build
    Start 1: factorial
1/1 Test #1: factorial ........................   Passed    0.00 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.00 sec

Use a proper unit testing framework

First clone https://github.com/google/googletest into your seqan3-infrastructure-tutorial folder.

solution:
> git clone https://github.com/google/googletest

Change your CMakeLists.txt like so:

cmake_minimum_required (VERSION 3.2)
project (seqan3_infrastructure_tutorial CXX)

add_executable(hello hello.cpp)
add_executable(factorial factorial.cpp)

add_subdirectory (googletest)

enable_testing()

add_executable(factorial_test factorial_test.cpp)
target_link_libraries(factorial_test gtest_main)
add_test(factorial factorial_test)

Use Google Test in factorial_test.cpp, starting from this scaffold.

#include "gtest/gtest.h"
#include "factorial.hpp"

TEST(factorial, normal)
{
    EXPECT_EQ(factorial(1), 1);
    // ...
}

// more test cases

// note that main is missing; gtest_main provides it

solution:
#include "gtest/gtest.h"
#include "factorial.hpp"

TEST(factorial, normal)
{
    EXPECT_EQ(factorial(1), 1);
    EXPECT_EQ(factorial(2), 2);
    EXPECT_EQ(factorial(3), 6);
    EXPECT_EQ(factorial(4), 24);
    EXPECT_EQ(factorial(10), 3628800);
}

TEST(factorial, zero)
{
    EXPECT_EQ(factorial(0), 1);
}

TEST(factorial, negative_numbers)
{
    EXPECT_EQ(factorial(-1), -1);
}

Build and execute your test case.

SeqAn test cases

See https://github.com/seqan/seqan3/tree/master/test/unit for examples.

A small example of how to build SeqAn's unit tests:

git clone git@github.com:seqan/seqan3.git --recursive
mkdir -p seqan3/build/unit
cd seqan3/build/unit
cmake -DCMAKE_CXX_COMPILER=g++7 ../../test/unit
make -j 8

Code coverage

Code coverage is a different kind of test.

The main idea is to make visible which functions were executed and which were not. This mainly helps to see if your test cases cover all important use cases. For an example see https://codecov.io/gh/seqan/seqan3.
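What a coverage tool records can be illustrated by instrumenting a function by hand. This is only a conceptual sketch: real coverage tools insert equivalent counters automatically when compiling, and the names Branch, hit_counts, instrumented_factorial and branch_coverage are hypothetical. The function mirrors the factorial edge cases tested earlier in this tutorial.

```cpp
#include <array>
#include <cstddef>

// Hypothetical identifiers for the three paths through the function below.
enum Branch : std::size_t { negative_input, base_case, recursive_case, branch_count };

// One counter per path; this is the kind of data a coverage tool collects.
inline std::array<unsigned, branch_count> & hit_counts()
{
    static std::array<unsigned, branch_count> counts{};
    return counts;
}

// Factorial with a hand-written counter increment on every path.
inline long instrumented_factorial(long n)
{
    if (n < 0)
    {
        ++hit_counts()[negative_input];
        return -1;
    }
    if (n <= 1)
    {
        ++hit_counts()[base_case];
        return 1;
    }
    ++hit_counts()[recursive_case];
    return n * instrumented_factorial(n - 1);
}

// Fraction of paths that were executed at least once.
inline double branch_coverage()
{
    std::size_t covered = 0;
    for (unsigned count : hit_counts())
        covered += (count > 0);
    return static_cast<double>(covered) / branch_count;
}
```

A test suite that only calls the function with positive values would reach two of the three paths. The untouched counter for negative_input is the signal that a test case for negative numbers is missing.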

Benchmarks

Benchmarks are a way to obtain reproducible performance measurements.

In SeqAn we use Google Benchmark.

See https://github.com/seqan/seqan3/tree/master/test/performance for examples.
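The basic idea of a benchmark can be sketched with the standard library alone: run the code several times and measure each run with a monotonic clock. The helper fastest_run is a hypothetical name for this illustration and is not how Google Benchmark is implemented. A benchmarking library additionally chooses the number of iterations itself and offers ways to stop the compiler from optimising the measured code away.

```cpp
#include <algorithm>
#include <chrono>

// Runs `function` `repetitions` times and returns the duration of the
// fastest run. Taking the minimum reduces noise caused by other processes.
// If `repetitions` is not positive, nanoseconds::max() is returned.
template <typename Function>
std::chrono::nanoseconds fastest_run(Function function, int repetitions)
{
    auto best = std::chrono::nanoseconds::max();
    for (int i = 0; i < repetitions; ++i)
    {
        // steady_clock never jumps backwards, unlike the wall clock.
        auto const start = std::chrono::steady_clock::now();
        function();
        auto const stop = std::chrono::steady_clock::now();
        best = std::min(best, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start));
    }
    return best;
}
```

A call such as fastest_run([] { factorial(20); }, 100).count() would give the number of nanoseconds of the fastest of 100 runs, although for such a tiny function the result is dominated by the cost of reading the clock.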