Welcome pack for new starters
Welcome to Imperial's Neurogenomics lab!
To-do:
Send me your username so I can give you access to the computing cluster
Create a GitHub account if you don't already have one, and send me your username so I can add you to the group's project space. Try to do all your work within a GitHub repository.
Send me a profile photo and a short bio paragraph to go on the lab website
I recommend using Evernote to keep track of your work and notes. Try it if you haven't before. Keep notes on all computer errors etc. in there.
Join the journal club's mailing list: https://groups.google.com/forum/#!forum/london-genomics-journal-club
Join the lab's mailing list: https://groups.google.com/forum/#!forum/neurogenomics-lab
Register for Imperial's High Performance Computing course
Learning to use the computing cluster:
Imperial regularly runs a beginner's guide to high performance computing course. If you have not previously used HPC you'll want to register for this as soon as possible. This can be done through their website: https://www.imperial.ac.uk/admin-services/ict/self-service/research-support/rcs/training/
Imperial also runs a course on software carpentry. If you are unfamiliar with Git and Linux then you should take this course. It can be booked using the above link.
Combiz wrote useful notes on using the HPC (how to login etc): https://gist.github.com/combiz/0939ea23366805de0f2dcf6f43763660
A version of RStudio is installed on the computing cluster and can be accessed here: https://rstudio.rcs.imperial.ac.uk/ This gives you access to a 24 core machine and is probably better than programming on your laptop.
If you'll be using R on the cluster you might want to get used to using the ClusterMQ R package. This lets you submit jobs to the Imperial cluster from within R, much the same as with a normal loop. Create a .PBStemplate file in your home directory on the server, containing the code below, to enable it to run.
#PBS -N {{ job_name }}
#PBS -l select=1:ncpus={{ cores | 1 }}:mem=1gb
#PBS -l walltime={{ walltime | 0:05:00 }}

source activate monocle
ulimit -v $(( 1024 * {{ memory | 4096 }} ))
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'

#NOTPBS -o {{ log_file | /rds/general/user/nskene/home/logs/ }}
#NOTPBS -j oe
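As a rough illustration of the "much the same as a normal loop" point, here is a minimal sketch in R. It assumes the clustermq package is installed and that the template above is saved as ~/.PBStemplate. The function square, the inputs, and the values given for n_jobs, cores, walltime and memory are made up for the example; the last three simply fill in the matching placeholders in the template. The cluster part only runs if clustermq and the PBS qsub command are both available, so the script is safe to try on a laptop.

```r
# Hypothetical per-item function; replace with your own analysis step.
square <- function(x) x^2
inputs <- 1:10

# Ordinary approach: run every call in the current R session.
local_results <- lapply(inputs, square)

# Cluster approach: only attempted when clustermq and qsub are both available.
on_cluster <- requireNamespace("clustermq", quietly = TRUE) &&
  nzchar(Sys.which("qsub"))

if (on_cluster) {
  # Point clustermq at the PBS scheduler and the template file in your home directory.
  options(
    clustermq.scheduler = "PBS",
    clustermq.template = "~/.PBStemplate"
  )

  # Same computation as the lapply() call, but each call is sent to a worker job.
  # The values in `template` fill the {{ cores }}, {{ walltime }} and {{ memory }}
  # placeholders; they are illustrative, not recommended settings.
  cluster_results <- clustermq::Q(
    square,
    x = inputs,
    n_jobs = 2,
    template = list(cores = 1, walltime = "0:10:00", memory = 2048)
  )
}
```

Both approaches return a list with one element per input, so code written around lapply() can usually be moved over with few changes.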
Learning to programme in R:
Here’s a tutorial written by a (quite famous) R developer called Hadley Wickham:
https://r4ds.had.co.nz/explore-intro.html
It’s a good intro to data visualisation using R. The ggplot2 package is probably the best data vis tool in any programming language. The tutorial does teach a style of code (‘tidyverse’) that I don’t use much, but it is popular.
I’ve seen a number of people recommend these tutorials:
https://swirlstats.com/students.html
This cheatsheet explains many basic functions using the two main styles of R:
https://atrebas.github.io/post/2019-03-03-datatable-dplyr/#introduction
Consider installing Sublime Text. A good text editor is always useful.
Learning core lab approaches:
Run through the tutorials for EWCE and MAGMA Celltyping:
https://github.com/nathanskene/EWCE
https://github.com/NathanSkene/MAGMA_Celltyping
Recommended background reading:
Consider getting these books out of the library, or mention them to me and I'll get you a copy. They are all easy reading and are intended to give you a general grounding in the current understanding of molecular biology, evolution, and human genetics, which underpins the work the lab is doing.
"Arrival of the Fittest: Solving Evolution's Greatest Puzzle" by Andreas Wagner
"The Beak Of The Finch: Story of Evolution in Our Time" by Jonathan Weiner
"Who We Are and How We Got Here: Ancient DNA and the new science of the human past" by David Reich
"A Life Decoded: My Genome: My Life" by Craig Venter
These papers are worth reading:
Boyle, Evan A., Yang I. Li, and Jonathan K. Pritchard. "An expanded view of complex traits: from polygenic to omnigenic."
Wray, Naomi R., et al. "Common disease is more complex than implied by the core gene omnigenic model."
Visscher, Peter M., and Michael E. Goddard. "From RA Fisher’s 1918 Paper to GWAS a Century Later." Genetics 211.4 (2019): 1125-1130.
van Rheenen, Wouter, et al. "Genetic correlations of polygenic disease traits: from theory to practice."
Watanabe, Kyoko, et al. "A global overview of pleiotropy and genetic architecture in complex traits."
Zeisel, Amit, et al. "Molecular architecture of the mouse nervous system."
Sullivan, Patrick F., and Daniel H. Geschwind. "Defining the genetic, genomic, cellular, and diagnostic architectures of psychiatric disorders."
Reshef, Y. A. et al. Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk. Nat. Genet. 50, 1483–1493 (2018).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228 (2015).
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. (2019).
Soskic, B. et al. Chromatin activity at GWAS loci identifies T cell states driving complex immune diseases. bioRxiv 566810 (2019). doi:10.1101/566810
Colantuoni, C. et al. Temporal dynamics and genetic control of transcription in the human prefrontal cortex. Nature 478, 519–523 (2011).