This is the second fundamental training module. It is designed for participants with little or no prior experience of sequence data analysis. We recommend that you complete Fundamental 1 (F1), or have experience using the tools covered in F1, before starting this module. In F2, you will use online tools to analyse genome sequence data. Most of these are drag-and-drop web tools that take an input file (fastq or fasta) and analyse the genomic data. The output files can then be saved and used as input for visualisation tools such as Microreact and Phandango. The tools and methods described here may not be suitable for large-scale data analysis, which will be addressed in the upcoming advanced-level bioinformatics courses A1 and A2.
We begin by introducing you to fastq format data, the default output format of most sequencers. Next, we show you how to assess the quality of sequence data, interpret base quality scores and spot contamination in sequence reads. You will then use the sequence reads to generate an assembly, which serves as input for the downstream data analysis.
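For readers who would like to see what a base quality score looks like in practice, here is a minimal Python sketch. It assumes the Phred+33 encoding used by current Illumina instruments, in which each character of the fastq quality line encodes a score Q = ASCII code - 33, and Q corresponds to an error probability of 10^(-Q/10). The read shown is a made-up example, and the function names are illustrative only; they are not part of any tool used in this module.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from four-line fastq records."""
    lines = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, sequence, _plus, quality = lines[i:i + 4]
        yield header[1:], sequence, quality  # drop the leading '@'


def phred_scores(quality, offset=33):
    """Convert a quality string to Phred scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


def error_probability(q):
    """Probability that a base with Phred score q was called incorrectly."""
    return 10 ** (-q / 10)


# Hypothetical single-read fastq record, for illustration only.
example = [
    "@read1",
    "GATTACA",
    "+",
    "IIII5!#",
]

for read_id, sequence, quality in parse_fastq(example):
    scores = phred_scores(quality)
    print(read_id, scores)  # read1 [40, 40, 40, 40, 20, 0, 2]
    print(f"mean quality: {sum(scores) / len(scores):.1f}")
```

A score of 20 means a 1 in 100 chance that the base call is wrong, and a score of 40 means 1 in 10,000, which is why quality that drops towards the end of a read (as in this toy example) is worth checking before assembly.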
We will then introduce you to genotyping, which involves characterising bacterial strains based on their DNA sequence, including the presence of genes that encode phenotypes of interest such as antimicrobial resistance. You will then use a free online tool to genotype some example isolates, including species identification, multilocus sequence typing (MLST), antimicrobial resistance (AMR) profiling and lineage assignment using a clustering method. In the Fundamental 1 module we covered the basics of phylogenetics and the interpretation of phylogenetic trees. In this module we will build on that knowledge, and you will visualise the genotype data that you generate alongside a phylogeny in Microreact.
Approximately 3 hours.
Please find the required files for this module here.
There is a Slack channel available for you to ask questions and discuss your thoughts. From 4th April to 27th April, members of the core GPS and Juno project teams will be available to answer your questions. Access to the Slack channel is only available to GPS and Juno project partners. If you are not involved in either project, you are welcome to use the training materials; however, no support will be provided. We will hold a ‘wrap-up’ webinar on 27th April to address any outstanding questions; please fill in the short questionnaire at the end of the module to let us know of any questions or topics that you would like to hear more about.
Educators
Narender Kumar, Kate Mellor, Stephanie W. Lo, Victoria Carr, Uzma Khan, Jolynne Mokaya, Ana Ferreira, Gemma Murray.
Contributors
Narender Kumar, Kate Mellor, Christine Boinett, Victoria Carr, Nil Shchelov, Jolynne Mokaya, Ana Ferreira, Gareth Peat, John Lees, Stephanie W. Lo, Dorota Jamrozy, Neil McAlisdair, and Stephen Bentley.
Funding
The training is provided as part of the Juno and GPS2 projects, funded by the Bill & Melinda Gates Foundation.