This workshop demonstrates a typical SimpleVM user workflow. Your goal is to identify pathogenic bacteria that the World Health Organisation (WHO) classified in 2017 as the "greatest threat to human health": https://www.who.int/news/item/27-02-2017-who-publishes-list-of-bacteria-for-which-new-antibiotics-are-urgently-needed
You will search for these microbes in publicly available metagenomic datasets stored in the Sequence Read Archive (SRA). In metagenomics, microbial genetic material is extracted from environmental samples such as the human gut, soil, freshwater, or biogas plants in order to investigate the functions and interactions of the microbial community.
To find these microbes, you will interact with the de.NBI Cloud via SimpleVM. This workshop is divided into five parts.
In the first part you will learn the basic concept of virtual machines and how to configure them.
In the second part you will test whether SimpleVM correctly provisioned your VM with all of your tools installed on it.
In the third part you will learn about object storage and the Sequence Read Archive. You will also run your analysis pipeline.
In the fourth part you will start a virtual machine with an RStudio research environment installed.
In the fifth part you will use a SimpleVM Cluster to distribute your analysis across multiple machines instead of just one.