This lecture will continue using Python to perform biological sequence analyses. In particular, we will use Biopython to read in sequencing data. We will also briefly use pandas for handling data frames, and plotnine for plotting.
After this lecture, you should be able to write Python functions to:
- Parse barcodes from a real deep-sequencing experiment.
- Convert data into a data frame.
- Plot basic data.
- The content for this lecture is contained in the Jupyter notebook lecture14.ipynb, located in this directory. The content from lecture 13 is duplicated here for your convenience.
- If you have difficulty performing a `git pull` to obtain the materials for this class, it is likely because you have a conflict between your local copy of `lecture13.ipynb` and the version in the public GitHub repo. You can resolve this by making a copy of that notebook (naming it something different, like `my_lecture13.ipynb`) and then discarding changes to the original notebook (you can do this in the GitHub Desktop app by going to the "Changes" tab, right-clicking on this notebook, and selecting "Discard changes"). You should then be able to pull without issue.
- Please also install plotnine prior to class by opening Terminal (Mac) or Anaconda Prompt (Windows) and executing the following code: `conda install -c conda-forge plotnine`. Alternatively (on either platform), open Anaconda Navigator, go to "Environments" and click "not installed". Search for plotnine and click the box to install.
- Homework 6 (Practical Python with biopython) is due on Tuesday, November 19 at noon, and includes material from lectures 13 and 14. Please see this issue for discussion of general questions about the class materials, and this issue for specific questions about homework 6. Kate will hold office hours for homework 6 on Monday, November 18 from 9-11 am in Arnold Atrium.
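To confirm before class that the plotnine installation described above worked, one option is a quick check from Python. This is only a sketch: `missing_packages` is a hypothetical helper, not part of the course materials, and the list of import names assumes the three packages mentioned for this lecture (Biopython is imported under the name `Bio`).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages used in this lecture (assumed list).
required = ["Bio", "pandas", "plotnine"]
missing = missing_packages(required)

if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All lecture packages are installed.")
```

If plotnine is reported as not installed, check that you ran the `conda install` command in the same environment that Jupyter uses.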