Applied Bioinformatics

Welcome to the Applied Bioinformatics course offered at The Scripps Research Institute.
Course materials from previous years are available here.

Instructors: Dr. Andrew I Su (@andrewsu) and Dr. Sabah Ul-Hasan (@sabahzero)
Teaching Assistants (TAs): Huitian Yolanda Diao (@Huitian), Karthik Gangavarapu (@gkarthik), Shang-Fu Chen (@ShaunFChen)

This course is offered in two parts and operates under the Computational Biology & Bioinformatics (CBB) core track:

  • Unit A: Fundamentals of Scientific Computing (FSC), 4 weeks (1 credit)
  1. Learn and utilize the Bash (Unix shell) for file manipulation and navigation of the file system
  2. Learn and utilize R code to perform exploratory data analysis of data in files
  3. Learn and utilize Jupyter Notebook for R code
  4. Learn and utilize Git and GitHub for code versioning (tracking changes of source code)
  • Units B-C: Applied Bioinformatics and Computational Biology (ABCB), 8 weeks (2 credits)
  1. Learn the fundamentals of RNA-Seq and its application in the larger biological research schema.
  2. Apply R to the analysis of RNA-Seq data, from raw data to publishable statistics and figures.
  3. Practice and present the R skills learned, using published data, in a Capstone project.
  4. Understand and practice peer review through self-evaluation and evaluation of peers.

The course is split this way so that individuals can either "learn from scratch" and then go more in depth, or take only one path (Unit A alone or Units B-C alone), depending on their needs.

Prerequisites

  • An enthusiasm for learning, whatever your level of experience
  • A Windows 10 or macOS laptop (inform instructors if you need access to one of these)
  • Software installation prior to arrival (by Sep 4th @ 5 PM PST):
    Command line (for Unix shell), R, and Jupyter Notebook (using Anaconda, includes Python 3) here
  • Expectations: Individuals following this course, whether on their own or for credit, are expected to behave professionally and considerately, as are TAs and Instructors. Individuals can typically expect feedback within 48 hours during business hours.

Schedule at a Glance

Each week consists of two 90-minute classes, held from 8:15 AM to 9:45 AM PST, from Sep 8th through Dec 10th. Classes are paired with one homework assignment per week during weeks 1-9 (8 assignments in total). Each class contains two 45-minute sessions, each consisting of a ~15-min lecture and ~30 min of hands-on exercises, with a brief recap at the end of the 90-minute period.

  • Unit A (4 wks):
    Fundamentals of Scientific Computing (FSC), or STBIO 400
    • Week 1: Course Introduction + Jupyter Notebook + Bash Basics
    • Week 2: Bash in-depth
    • Week 3: Intro to R
    • Week 4: Data Analysis and Plotting in R
  • Unit B (5 wks):
    Understanding and Exploration of RNA-Seq, or STBIO 440i
    • Week 5: Advanced R and Pertinence to RNA-Seq, Introduction to Capstone Project
    • Week 6: RNA-Seq Raw Data Output + RNA-Seq Data Pre-processing
    • Week 7: RNA-Seq Data Pre-processing cont'd
    • Week 8: RNA-Seq Data Post-processing + DESeq2
    • Week 9: Special Topics: R Packages of Interest, Git (Version Control), and the HPC
  • Unit C (3 wks):
    Capstone Projects and Overview of the Bioinformatics Data Workflow Spectrum, or STBIO 440ii
    • Week 10: Capstone Project Workshop
    • Week 11: An Overview of Additional Bioinformatics Workflows (Metagenomics, Proteomics, and others)
    • Week 12: Capstone Project Presentations

Course Materials

Unit A: Jupyter, Bash, and R

  • A.1a (Sep 8): Intro and Bash Basics slides, HW1 (jupyterhub)
  • A.1b (Sep 10): Bash cont'd slides
  • A.2a (Sep 15): Bash in-depth slides, HW1 Review and HW2
    • HW1 Due @ 8 AM PST
  • A.2b (Sep 17): Loops slides
  • A.3a (Sep 22): Introduction to R slides, HW2 Review and HW3
    • HW2 Due @ 8 AM PST
  • A.3b (Sep 24): R Objects and Operations slides
  • A.4a (Sep 29): Data Analysis and Function in R slides, HW3 Review and HW4
    • HW3 Due @ 8 AM PST
  • A.4b (Oct 1): Plotting in R slides

Unit B: Exploration of RNA-Seq via Utilizing R

  • B.5a (Oct 6): Advanced plotting in R slides, HW4 Review and HW5
    • HW4 Due @ 8 AM PST
  • B.5b (Oct 8): R and DESeq2 in relation to RNA-Seq slides, Introduction to Capstone
  • B.6a (Oct 13): Introduction to RNA-Seq and FASTQC slides, HW5 Review and HW6
    • HW5 Due @ 8 AM PST
  • B.6b (Oct 15): Raw RNA-Seq Data Output and Alignment (HISAT2 and SAM) slides
  • B.7a (Oct 20): CLASS CANCELLED, HW7 RNA-Seq Data Pre-Processing, HW6 Review
    • HW6 Due @ 8 AM PST
  • B.7b (Oct 22): RNA-Seq Mapping and Read Counting slides
  • B.8a (Oct 27): RNA-Seq Expression Analysis slides, HW7 Review and HW8
  • B.8b (Oct 29): RNA-Seq DESeq2 and Enrichment Analysis slides
    • HW7 Due @ 8 AM PST
  • B.9a (Nov 3): R Analyses of Interest (PCA and PCoA) slides
  • B.9b (Nov 5): Git and utilizing the TSRI HPC slides and associated repository

Unit C: Capstone Project, and Overview of Pipelines

  • C.10a (Nov 10, slides) - b (Nov 12, slides): Workshop Time
    • HW8 Due Nov 10th @ 8 AM PST
  • C.11a (Nov 17): Invited talks on applied bioinformatics research and career journeys slides
    • Dr. Joel Babdor (@joelBabdor): Translational, Immunology, Immunotherapy
    • Dr. Chiranjit Mukherjee (@cm0109): Amplicon Sequencing, Human Microbiome, Microbial Ecology
    • Dr. Sally Chang (@esallychang): Non-Model Organisms, Comparative Genomics, Genomics of Disease
    • Dr. Mario Banuelos (@MBanuelos): Genomic Variation, Deep Learning, Optimization
  • C.11b (Nov 19): Panel slides
  • C.12a (Dec 1) - b (Dec 3): Presentations
    • cont'd Dec 8 and Dec 10 (if necessary)

How to Get Help?

  1. Could the answer to my question be available online? If you find it, post the original question and the answer you found as an Issue with the 'question' label, then close the Issue once the answer (including relevant resource links) is posted. You will need a GitHub account to do this.
  2. If the answer is not available online, post it as a question-labeled Issue and wait for a response from the Instructors or TAs. Who knows, perhaps another student has encountered the same issue and can answer it too!
  3. If the obstacle is something more personal, such as a specific installation issue unique to your computer alone, reach out to the instructors. Any question is better than no question at all, and often we all learn from them (not just students)!