Installation

Plant Pants edited this page Dec 22, 2024 · 18 revisions

System Requirements

Before installing kakapo, ensure that your system meets the following requirements:

  • Operating System: Linux, macOS, or Windows with Windows Subsystem for Linux installed
  • Python: 3.9 or later
  • RAM: Minimum 16 GB (32 GB or more recommended)
  • Disk Space: kakapo downloads and processes the files you specify. They are stored in compressed form, but depending on the number of samples in your analysis you may need hundreds of gigabytes of free disk space
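
As a quick check of the disk space requirement, the sketch below reports the free space on the filesystem that holds your home directory. Using ${HOME} is an assumption made for illustration; replace it with the directory where you plan to keep your kakapo project data.

```shell
# Show free space on the filesystem that holds the chosen directory
# (${HOME} is only an example location).
df -h "${HOME}"

# Free space in whole gigabytes, read from POSIX-format output
# (1024-byte blocks, "Available" is the fourth column).
free_gb="$(df -Pk "${HOME}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }')"
echo "Free space under ${HOME}: ${free_gb} GB"
```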

kakapo was designed for machines running macOS or Linux, including the Windows Subsystem for Linux. If you choose to run kakapo on the Windows Subsystem for Linux, it is recommended that you use the latest Ubuntu distribution available in the Microsoft Store.

kakapo supports Python 3 and does not work with Python 2. Follow the sequence of pip commands listed below to install kakapo. If you have both Python 2 and Python 3 on your system, or if you are not certain whether you do, check whether your pip belongs to Python 3 by running the command below:

pip -V

This should print the version of pip you have and whether pip is using Python 3. If the output lists Python version 3.9 or later, you are set.

pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

Otherwise you may try:

pip3 -V

If the pip3 command works but pip does not, replace pip with pip3 in the installation steps.

In case none of the above commands work, you may not have pip installed. Follow the steps described here to install it. Alternatively, one easy way to install pip is by installing Conda or Miniconda.
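
The checks above can be combined into one short script. This is a minimal sketch, not part of kakapo: it reports which Python each pip command is bound to and then tests the python3 interpreter against the 3.9 minimum listed under System Requirements. The variable name py_ok is chosen here for illustration.

```shell
# Report which Python interpreter each pip command is bound to.
for cmd in pip pip3; do
    if command -v "$cmd" >/dev/null 2>&1; then
        "$cmd" -V || echo "$cmd: found, but failed to run"
    else
        echo "$cmd: not found"
    fi
done

# The python3 one-liner exits with status 0 only for version 3.9 or later.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 9) else 1)'; then
    py_ok=yes
else
    py_ok=no
fi
echo "Python 3.9 or later: ${py_ok}"
```

If the last line reports "no", either python3 is missing or it is older than 3.9.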

Dependencies

kakapo can download most of the required dependencies on its own. However, you will have to install a few build tools, Java (for Trimmomatic), and Perl (for Rcorrector) before running kakapo. Additionally, gzip is required, but it will probably be installed on your system already. I highly recommend installing pigz as well, as it will speed up compression steps by utilizing multiple CPU threads.
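
To see which of these tools are already present, a loop like the one below may help. It is only a sketch: it checks whether each command is on your PATH, not whether it runs correctly. On macOS in particular, a java command may exist as a placeholder even when no Java runtime is installed, so running java -version is the more reliable test there.

```shell
# Report whether each tool named above is on the PATH.
# pigz is recommended rather than required.
missing=""
for tool in java perl gzip pigz; do
    if tool_path="$(command -v "$tool")"; then
        echo "found:   $tool ($tool_path)"
    else
        echo "missing: $tool"
        missing="${missing} ${tool}"
    fi
done
echo "missing tools:${missing:- none}"
```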

macOS

On macOS you will first have to install "Command Line Developer Tools":

xcode-select --install

A window will pop up; follow the instructions to complete the installation.

Additionally, on Macs with Apple M1/M2 CPUs, you will have to install Rosetta 2:

/usr/sbin/softwareupdate --install-rosetta --agree-to-license

You can install Java by downloading the installer directly from java.com or by using Homebrew:

brew install openjdk

Linux

On an Ubuntu system these dependencies can be installed by running:

sudo apt update
sudo apt install git python3-pip python3-venv default-jre pigz cmake libbz2-dev build-essential

Installation Steps

  1. Open a terminal on your system.

  2. Install kakapo directly from the GitHub repository by executing the following command:

    pip install --user --upgrade git+https://github.com/karolisr/kakapo

    The same command will also work to upgrade kakapo to the latest version in the future.

  3. Check that kakapo was installed and that it is visible to the system by running:

    kakapo

    You should see the kakapo version and other information printed to the screen:

    Kakapo version: 0.9.6
    Python version: 3.10.12
    Operating system: Ubuntu 22.04
    System info: 32 physical and 64 logical cores, 503.71 GB RAM (x86_64)
    
    Configuration file was not provided. Nothing to do.
    
    usage: kakapo --cfg project_configuration_file --ss search_strategies_file
    
    options:
      --cfg path           Path to a kakapo project configuration file.
      --ss path            Path to a kakapo search strategies file.
      --ncpu count         Number of CPUs to use.
      --stop-after-filter  Stop kakapo after Kraken2/Bowtie2 filtering step.
      --force-deps         Force the use of kakapo-installed dependencies,
                           even if they are already available on the system.
      --install-deps       Install kakapo dependencies and quit.
      --dnld-kraken-dbs    Download Kraken2 databases and quit.
      --clean-data-dir     Remove cached NCBI taxonomy data and all software
                           dependencies downloaded by kakapo.
      -v, --version        Print kakapo version.
      -h, --help           Print kakapo help information.
    

    If you get an error ("command not found"), try running this:

    ${HOME}/.local/bin/kakapo

    If this version of the command works, you will need to append the following line to your ${HOME}/.bashrc (for BASH) or ${HOME}/.zshrc (for ZSH) file:

    export PATH="$PATH:${HOME}/.local/bin"

    On macOS there is an additional place where pip3 may place installed programs. The Python 3 interpreter (and pip3) shipped with macOS use this directory:

    ${HOME}/Library/Python/3.X/bin

    You have to replace the X in 3.X with your version (shipped with macOS). If kakapo was installed there, you also have to add the line below to your ${HOME}/.bashrc (for BASH) or ${HOME}/.zshrc (for ZSH) file:

    export PATH="$PATH:${HOME}/Library/Python/3.X/bin"

    Alternative installation: Using a Python virtual environment

    KKPVENV="${HOME}/venv-kakapo"
    
    python3 -m venv "${KKPVENV}"
    
    "${KKPVENV}"/bin/pip install wheel
    "${KKPVENV}"/bin/pip install --upgrade git+https://github.com/karolisr/kakapo
    
    "${KKPVENV}"/bin/kakapo

  4. kakapo is now installed, but not quite ready to use without its dependencies. It can install these additional dependencies on its own. They are installed in the directory ${HOME}/.local/share/kakapo/dependencies and should not interfere with other software on your computer.

    To only install those dependencies that cannot be currently found on your system, run this command:

    kakapo --install-deps

    For reproducibility, you may want kakapo to install all dependencies, including those that you already have. (This is useful and does not take much space, but it may duplicate some packages that are already on your system; the copies are installed in a local kakapo folder.)

    kakapo --force-deps --install-deps

    Note that the --force-deps option can also be used when running (as opposed to installing) kakapo later, to ensure that kakapo uses only the dependencies it installed itself. You can rerun the kakapo --install-deps command at any point, with or without the --force-deps option, to see which programs (their versions and paths) kakapo "sees" and will use. For example, on my system I see this output:

    gzip is available: gzip 1.10 /usr/bin/gzip
    pigz is available: pigz 2.6 /usr/bin/pigz
    Seqtk is available: 1.3-r117-dirty /home/karolis/.local/share/kakapo/dependencies/seqtk-master/seqtk
    Trimmomatic is available: 0.39 /home/karolis/.local/share/kakapo/dependencies/Trimmomatic-0.39/trimmomatic-0.39.jar
    fasterq-dump is available: 2.11.3 /home/karolis/.local/share/kakapo/dependencies/sratoolkit.2.11.3-ubuntu64/bin/fasterq-dump
    makeblastdb is available: 2.12.0 /usr/bin/makeblastdb
    blastn is available: 2.12.0 /usr/bin/blastn
    tblastn is available: 2.12.0 /usr/bin/tblastn
    Vsearch is available: 2.21.1 /usr/bin/vsearch
    SPAdes is available: 3.15.4 /home/karolis/.local/share/kakapo/dependencies/SPAdes-3.15.4-Linux/bin/spades.py
    bowtie2 is available: 2.4.4 /usr/bin/bowtie2
    bowtie2-build is available: 2.4.4 /usr/bin/bowtie2-build
    Rcorrector is available: 1.0.5 /home/karolis/.local/share/kakapo/dependencies/Rcorrector-master/run_rcorrector.pl
    kraken2 is available: 2.1.2 /home/karolis/.local/share/kakapo/dependencies/kraken2-master/bin/kraken2
    kraken2-build is available: 2.1.2 /home/karolis/.local/share/kakapo/dependencies/kraken2-master/bin/kraken2-build
    kakapolib is available: /home/karolis/SyncThing/python/kakapo/kakapo/utils/c/lib/kakapolib_linux_x86_64.so
    

    Important: Make sure to check that the version is printed for fasterq-dump. It is part of SRA-Toolkit. If this is the first time any of the SRA-Toolkit programs has been executed on your system, it will expect a configuration file to exist. kakapo tries to create one, but this can sometimes fail. To resolve the issue, if the version is not being printed, copy the parent path listed next to fasterq-dump, append vdb-config --interactive, and run the resulting command. On my system, based on the output above, I would copy /home/karolis/.local/share/kakapo/dependencies/sratoolkit.2.11.3-ubuntu64/bin/ and append vdb-config --interactive:

    /home/karolis/.local/share/kakapo/dependencies/sratoolkit.2.11.3-ubuntu64/bin/vdb-config --interactive

    You should see a blue configuration screen appear. Press Tab then Return to exit.

    Rerun the kakapo --install-deps command and see if the version is being printed now.

  5. kakapo is now installed and ready to use! If you experience any problems during the installation process, please open an issue on GitHub, so I can investigate and fix the problem for everyone! (Thanks!)
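
For the "command not found" case in step 3, the sketch below asks Python where pip install --user places command-line scripts, instead of guessing between ${HOME}/.local/bin and ${HOME}/Library/Python/3.X/bin. It assumes that python3 is the same interpreter your pip command uses; the variable names user_bin and on_path are chosen here for illustration.

```shell
# Directory where `pip install --user` places command-line scripts.
# `python3 -m site --user-base` still prints the path but exits non-zero
# when the user site directory is disabled (for example inside a
# virtual environment), hence the `|| true`.
user_bin="$(python3 -m site --user-base || true)/bin"
echo "pip user scripts directory: ${user_bin}"

# Is that directory already on the PATH?
case ":${PATH}:" in
    *":${user_bin}:"*) on_path=yes ;;
    *)                 on_path=no ;;
esac
echo "on PATH: ${on_path}"

# Was kakapo installed there?
if [ -x "${user_bin}/kakapo" ]; then
    echo "kakapo found at ${user_bin}/kakapo"
else
    echo "kakapo not found in ${user_bin}"
fi
```

If kakapo is found there but the directory is not on your PATH, add the printed directory to the export PATH line described in step 3.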