Simplify README
jamesmbaazam committed Oct 9, 2023
1 parent dd1f734 commit ddd2361
Showing 1 changed file with 31 additions and 237 deletions.
268 changes: 31 additions & 237 deletions README.Rmd
@@ -41,13 +41,13 @@ models are often used in infectious disease epidemiology, where the chains repre
transmission, and the offspring distribution represents the distribution of
secondary infections caused by an infected individual.

_{{ packagename }}_ re-implements [bpmodels]("https://github.com/epiverse-trace/bpmodels/")
_{{ packagename }}_ re-implements [epichains]("https://github.com/epiverse-trace/epichains/")
by providing bespoke functions and data structures that allow easy
manipulation and interoperability with other Epiverse packages, for example, [superspreading]("https://github.com/epiverse-trace/superspreading/") and [epiparameter]("https://github.com/epiverse-trace/epiparameter/"), and potentially some existing packages for handling transmission chains, for example, [epicontacts](https://github.com/reconhub/epicontacts).

_{{ packagename }}_ is developed at the [Centre for the Mathematical Modelling of Infectious Diseases](https://www.lshtm.ac.uk/research/centres/centre-mathematical-modelling-infectious-diseases) at the London School of Hygiene and Tropical Medicine as part of the [Epiverse Initiative](https://data.org/initiatives/epiverse/).

# Installation
## Installation

The latest development version of the _{{ packagename }}_ package can be installed via

@@ -63,251 +63,45 @@ To load the package, use
library("epichains")
```

# Quick start
## Quick start

_{{ packagename }}_ provides functionalities for estimating the likelihood of observing a given transmission chain, `likelihood()`, and functions for simulating transmission chains: `simulate_tree()`, `simulate_tree_from_pop()`, and `simulate_summary()`.
_{{ packagename }}_ provides four main functions:

The objects returned by these functions play nicely with `summary()` and `aggregate()`. Aggregated results also play nicely with `plot()`.
Each functionality is briefly demonstrated below.
* `simulate_tree()`: simulates transmission chains using an initial number of
cases and information on the offspring distribution. This function returns
an object with columns that track information on who infected whom, the
generation of infection, and optionally, the time of infection.

## Chain likelihoods
* `simulate_summary()`: simulates a vector of transmission chain sizes or
lengths using an initial number of cases and information on the offspring
distribution. This function only returns a vector of realized chain size or
length.

### [`likelihood()`](https://epiverse-trace.github.io/epichains/reference/likelihood.html)
* `simulate_tree_from_pop()`: simulates transmission chains given an initial
population size and information on the offspring distribution. You can also
specify a given level of pre-existing immunity. This function returns
an object with columns that track information on who infected whom, the
generation of infection, and the time of infection.

This function calculates the likelihood/loglikelihood of observing a vector of outbreak summaries obtained from transmission chains. Summaries here refer to transmission chain sizes or lengths/durations.
* `likelihood()`: calculates the loglikelihood (or likelihood, depending
on the value of `log`) of observing a vector of transmission chain sizes or
lengths.
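
To make the idea behind the simulation functions listed above concrete, here is a minimal base R sketch of a branching process that returns chain sizes. It does not call the package: `sim_chain_size()` is a hypothetical helper written for illustration, it assumes a Poisson offspring distribution and a single initial case per chain, and its `stat_max` argument simply truncates large chains, which may differ from how the package treats `stat_max`.

```r
# Hypothetical helper, not part of the package: simulate the size of one
# transmission chain under a Poisson offspring distribution.
sim_chain_size <- function(lambda, stat_max = Inf) {
  size <- 1L    # the chain starts with a single initial case
  active <- 1L  # number of cases in the current generation
  while (active > 0L && size < stat_max) {
    # every active case draws its number of secondary infections
    offspring <- sum(rpois(active, lambda))
    size <- size + offspring
    active <- offspring
  }
  # assumption: chains that reach the cap are truncated at stat_max
  min(size, stat_max)
}

set.seed(123)
# sizes of 10 independent chains, each capped at 10 cases
chain_sizes <- vapply(
  seq_len(10),
  function(i) sim_chain_size(lambda = 0.9, stat_max = 10),
  numeric(1)
)
chain_sizes
```

Each chain ends either when a generation produces no new cases or when the cap is reached, so every returned size lies between 1 and `stat_max`.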

`likelihood()` requires a vector of chain summaries (sizes or lengths),
`chains`, the corresponding statistic to calculate, `statistic`, and the offspring distribution,
`offspring_dist`, together with its associated parameters. It also requires `nsim_obs`, the number of simulations to run when the likelihood has no closed-form solution and must be simulated. This argument is explained further in the ["Getting Started"](https://epiverse-trace.github.io/epichains/articles/epichains.html) vignette.
The objects returned by the `simulate_*()` functions can be summarised with
`summary()` and aggregated into a `<data.frame>` of cases per time or generation
with `aggregate()`. Aggregated results can also be passed on to `plot()` with
its own arguments to customize the resulting plots.

Let's look at the following example where we estimate the loglikelihood of observing `chain_sizes`.
```{r}
set.seed(121)
# example of observed chain sizes
# randomly generate 20 chains of size between 1 to 10
chain_sizes <- sample(1:10, 20, replace = TRUE)
```

```{r}
# estimate loglikelihood of the observed chain sizes
likelihood_eg <- likelihood(
chains = chain_sizes,
statistic = "size",
offspring_dist = "pois",
nsim_obs = 100,
lambda = 0.5
)
# Print the estimate
likelihood_eg
```
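
As an illustration of what a closed-form solution can look like, the sketch below evaluates the loglikelihood of chain sizes by hand for a Poisson offspring distribution. With one initial case per chain and mean `lambda` no greater than 1, the chain size follows a Borel distribution. `borel_loglik()` is a hypothetical helper written for this example, and whether `likelihood()` uses this form internally is an assumption, not something stated here.

```r
# Hypothetical helper, not part of the package: loglikelihood of observed
# chain sizes under a Poisson(lambda) offspring distribution, assuming one
# initial case per chain (Borel distribution).
borel_loglik <- function(sizes, lambda) {
  # log P(size = n) = -lambda * n + (n - 1) * log(lambda * n) - log(n!)
  sum(-lambda * sizes + (sizes - 1) * log(lambda * sizes) - lfactorial(sizes))
}

set.seed(121)
# same illustrative data as above: 20 chain sizes between 1 and 10
chain_sizes <- sample(1:10, 20, replace = TRUE)
borel_loglik(chain_sizes, lambda = 0.5)
```

Because the expression is exact, no simulation is needed in this case, so an argument such as `nsim_obs` would only matter for offspring distributions that lack a formula of this kind.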

## Chain simulation

There are three simulation functions, herein referred to collectively as the `simulate_*()` functions.

### [`simulate_tree()`](https://epiverse-trace.github.io/epichains/reference/simulate_tree.html)

`simulate_tree()` simulates an outbreak from a given number of infections.
It retains and returns information on infectors (ancestors), infectees, the generation of infection, and the time, if a serial distribution is specified.

Let's look at an example where we simulate the transmission trees of $10$ initial infections/chains. We
assume a poisson offspring distribution with mean, $\text{lambda} = 0.9$, and a serial interval of $3$ days:
```{r}
set.seed(123)
sim_tree_eg <- simulate_tree(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
stat_max = 10,
serials_dist = function(x) 3,
lambda = 0.9
)
head(sim_tree_eg)
```

### [`simulate_summary()`](https://epiverse-trace.github.io/epichains/reference/simulate_summary.html)

`simulate_summary()` is basically `simulate_tree()` except that it does not retain
information on each infector and infectee. It returns the eventual size or length/duration of each transmission chain.

Here is an example to simulate the previous examples without intervention,
returning the size of each of the $10$ chains. It assumes a poisson offspring distribution with
mean of $0.9$.
```{r}
set.seed(123)
simulate_summary_eg <- simulate_summary(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
stat_max = 10,
lambda = 0.9
)
# Print the results
simulate_summary_eg
```

### [`simulate_tree_from_pop()`](https://epiverse-trace.github.io/epichains/reference/simulate_tree_from_pop.html)

`simulate_tree_from_pop()` simulates outbreaks based on a specified population size and pre-existing immunity until the susceptible pool runs out.

Here is a quick example where we simulate an outbreak in a population of size $1000$. We assume individuals have a poisson offspring distribution with mean, $\text{lambda} = 1$, and serial interval of $3$:
```{r}
set.seed(7)
sim_tree_from_pop_eg <- simulate_tree_from_pop(
pop = 1000,
offspring_dist = "pois",
lambda = 1,
serials_dist = function(x) {3}
)
head(sim_tree_from_pop_eg)
```

#### Simulating interventions

All the `simulate_*()` functions can model interventions that reduce the $R_0$,
using the `intvn_mean_reduction` argument. In general, these can be
interpreted as population-level interventions.

To illustrate this, we will use the previous examples for each function and specify
a population-level intervention that reduces $R_0$ by $50\%$.

Using `simulate_tree()`, we can specify an initial number of cases
and a population level intervention, `intvn_mean_reduction`, that reduces $R_0$ by $50\%$.

```{r}
set.seed(123)
sim_tree_intvn_eg <- simulate_tree(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
intvn_mean_reduction = 0.5,
stat_max = 10,
serials_dist = function(x) 3,
lambda = 0.9
)
head(sim_tree_intvn_eg)
```
Each of the listed functionalities is demonstrated in detail
in the ["Getting Started" vignette](https://epiverse-trace.github.io/epichains/articles/epichains.html).

Here is an example with `simulate_summary()`, modelling an intervention that reduces $R_0$ by $50\%$.
```{r}
simulate_summary_intvn_eg <- simulate_summary(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
intvn_mean_reduction = 0.5,
stat_max = 10,
lambda = 0.9
)
# Print the results
simulate_summary_intvn_eg
```

Finally, let's use `simulate_tree_from_pop()`.
```{r}
set.seed(7)
sim_tree_from_pop_intvn_eg <- simulate_tree_from_pop(
pop = 1000,
offspring_dist = "pois",
intvn_mean_reduction = 0.5,
lambda = 1,
serials_dist = function(x) {3}
)
head(sim_tree_from_pop_intvn_eg)
```

## Other functionalities

### Summarising

You can run `summary()` on `<epichains>` objects to get useful summaries.
```{r include=TRUE,echo=TRUE}
# Example with simulate_tree()
set.seed(123)
sim_tree_eg <- simulate_tree(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
stat_max = 10,
serials_dist = function(x) 3,
lambda = 0.9
)
summary(sim_tree_eg)
# Example with simulate_summary()
set.seed(123)
simulate_summary_eg <- simulate_summary(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
stat_max = 10,
lambda = 0.9
)
# Get summaries
summary(simulate_summary_eg)
```

### Aggregating

You can aggregate `<epichains>` objects returned by the `simulate_*()` functions into a time series, which is a `<data.frame>` with columns "cases" and either "generation" or "time", depending on the value of `grouping_var`.

To aggregate over "time", you must have specified a serial interval distribution in the simulation step.
```{r include=TRUE,echo=TRUE}
# Example with simulate_tree()
set.seed(123)
sim_tree_eg <- simulate_tree(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
stat_max = 10,
serials_dist = function(x) 3,
lambda = 0.9
)
aggregate(sim_tree_eg, grouping_var = "time")
```

### Plotting

Aggregated `<epichains>` objects can easily be plotted using base R or `ggplot2` with little to no data manipulation.

Here is an end-to-end example from simulation through aggregation to plotting.
```{r}
# Run simulation with simulate_tree()
set.seed(123)
sim_tree_eg <- simulate_tree(
nchains = 10,
statistic = "size",
offspring_dist = "pois",
stat_max = 10,
serials_dist = function(x) 3,
lambda = 0.9
)
# Aggregate cases over time
sim_aggreg <- aggregate(sim_tree_eg, grouping_var = "time")
## Package vignettes

# Plot cases over time
plot(sim_aggreg, type = "b")
```
The theory behind the models provided here can be
found in the [theory vignette](https://epiverse-trace.github.io/epichains/articles/theoretical_background.html).

## Package vignettes
We have also collated a bibliography of branching process applications in
epidemiology. These can be found in the [literature vignette](https://epiverse-trace.github.io/epichains/articles/branching_process_literature.html).

Specific use cases of _{{ packagename }}_ can be found in
the [online documentation as package vignettes](https://epiverse-trace.github.io/epichains/), under "Articles".