-
Notifications
You must be signed in to change notification settings - Fork 1
/
000-prep-for-coalescent.Rmd
105 lines (80 loc) · 3.33 KB
/
000-prep-for-coalescent.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
---
title: "Preparing for the Coalescent"
output:
html_notebook:
toc: no
html_document:
df_print: paged
toc: no
bibliography: references.bib
---
We will be studying the coalescent process, a mathematical framework
that is central to understanding the genetic variation that occurs in a genetic
sample that you might collect from a population.
We will spend some time simulating data from the coalescent process using
Dick Hudson's program `ms` [@hudsonGeneratingSamplesWright2002]. I have written
a little R package called 'GSImulator' that has pre-compiled versions of `ms` for
windows and Mac. It is easy to install 'GSImulator' using the 'devtools' package.
To read in some trees from text files we will use the 'ape' package [@ape-package], and to plot
them we will use the 'ggtree' package [@ggtree-package].
So, there are some steps to prepare your computer for these excercises.
You should have installed:
1. A recent version of R (> 4.0, say)
2. RStudio (you don't have to use this, but it is a great IDE)
Then, within R, you need to get some packages.
**Important: when you are installing packages, and you are asked if you would
like to update existing packages to later versions, then you should answer "No!"
at least for the first time. Chances are everything will work fine without
spending a long time updating everything. **
```{r, eval=FALSE}
# if you don't already have the tidyverse, install it:
if(!("tidyverse" %in% rownames(installed.packages()))) {
install.packages("tidyverse")
}
# if you don't already have the remotes package, install it:
if(!("remotes" %in% rownames(installed.packages()))) {
install.packages("remotes")
}
# now, get GSImulator from my GitHub page
remotes::install_github("eriqande/GSImulator", dependencies = FALSE)
# then get ape
if(!("ape" %in% rownames(installed.packages()))) {
install.packages("ape")
}
# and finally install ggtree from bioconductor
if(!("ggtree" %in% rownames(installed.packages()))) {
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("ggtree")
}
# when it says: Update all/some/none? [a/s/n], your answer should be n
```
Once that is done, you can test that you have everything is working by running a few lines of code.
First, load the libraries and make sure there are no glaring errors or failures:
```{r, message=FALSE, warning=FALSE}
library(tidyverse)
library(ape)
library(ggtree)
library(GSImulator)
```
Then test that we have `ms` working.
If we invoke `ms` with no arguments it gives us a message about what kinds
of arguments one can use. When you give the following commands, you should
get the output below:
```{r}
# ms_binary() gives the path to ms on your system
system2(command = ms_binary(), args = "")
```
And try executing the following few lines of code that should produce a
plot of a tree somewhat like the one below
(though it may be different since you could be starting from different
random seeds).
```{r}
# note that we add stderr = FALSE to the following command because of a bug in R 3.5.1
system2(command = ms_binary(), args = "10 1 -T", stdout = "out.tree", stderr = FALSE)
ctree <- read.tree("out.tree")
ggtree(ctree, layout = "rectangular") +
geom_tiplab(size = 2) + coord_flip() + scale_x_reverse()
```
If you are having problems getting this working, please talk to Eric.
## References