Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add network model simulation #60

Merged
merged 50 commits into from
Feb 7, 2024
Merged
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
50 commits
Select commit Hold shift + click to select a range
4674431
add WIP .sim_network_bp function
joshwlambert Jan 12, 2024
c01e77e
fix sampling contacts and bookkeeping ancestors
joshwlambert Jan 15, 2024
69e99c9
fixed sampling contact distribution with replacement
joshwlambert Jan 16, 2024
8a1d3b5
increase size of allocated vectors
joshwlambert Jan 16, 2024
4907363
fixed issue counting chain length
joshwlambert Jan 16, 2024
6122f9b
simplified vector indexing in .sim_network_bp
joshwlambert Jan 16, 2024
36da388
simplified next_gen_size calculation in .sim_network_bp
joshwlambert Jan 16, 2024
aabab0e
remove preallocation of id vec and add vector growing to .sim_network_bp
joshwlambert Jan 16, 2024
5dc439c
remove input checking from .sim_network_bp as it is an internal function
joshwlambert Jan 16, 2024
a401eef
updated .check_sim_input with new model arguments
joshwlambert Jan 16, 2024
4f6fd61
updated .sim_contacts_tbl for new simulation model
joshwlambert Jan 16, 2024
ae79da1
updated exported simulation function to use new model arguments
joshwlambert Jan 16, 2024
049a6ec
updated .sim_bp_linelist to use .sim_network_bp
joshwlambert Jan 16, 2024
fa35b39
updated add_cols function to sample infected individuals
joshwlambert Jan 16, 2024
0b629ef
removed contact_distribution from functions that no longer use it
joshwlambert Jan 17, 2024
30e7bfb
removed documentation for old arguments and added documentation for n…
joshwlambert Jan 17, 2024
42b5496
added details to .sim_network_bp documentation
joshwlambert Jan 17, 2024
1f0d2c8
updated man files for updated functions
joshwlambert Jan 17, 2024
6de6d13
fix typo converting contact_interval to function in sim_contacts
joshwlambert Jan 17, 2024
02abe5f
fixed examples and inherit doc in .sim_network_bp
joshwlambert Jan 17, 2024
bdcbff7
added tests for .sim_network_bp
joshwlambert Jan 17, 2024
4b1ca9d
added linelist row filtering to sim_outbreak
joshwlambert Jan 17, 2024
8d09db7
reduced maximum iterations for simulation conditioning
joshwlambert Jan 17, 2024
8f6dd78
updated tests to pass with new arguments
joshwlambert Jan 17, 2024
735ec13
pass add_names argument to .sim_network_bp
joshwlambert Jan 18, 2024
c9f8bc5
update simulation functions to use new arguments in vignettes
joshwlambert Jan 18, 2024
dfae23d
remove mention of bpmodels from the package
joshwlambert Jan 18, 2024
2106172
Update CITATION.cff
actions-user Jan 18, 2024
e19b496
update README to use new simulation function arguments
joshwlambert Jan 18, 2024
9970476
Automatic readme update
actions-user Jan 18, 2024
0d2cf1d
update mean_contacts to contact_distribution
joshwlambert Jan 19, 2024
2156fcb
update tests to use contact_distribution instead of mean_contacts
joshwlambert Jan 19, 2024
3888aeb
update vignettes and README to use contact_distribution instead of me…
joshwlambert Jan 19, 2024
282057a
Automatic readme update
actions-user Jan 19, 2024
b428d1c
linting
joshwlambert Jan 19, 2024
bb141f8
increase excess degree distribution density upper bound
joshwlambert Jan 31, 2024
cd5fe64
update seed in .sim_network_bp test to pass
joshwlambert Jan 31, 2024
4315035
added .sample_names function
joshwlambert Feb 5, 2024
d767163
use .sample_names in .add_names function
joshwlambert Feb 5, 2024
69ff03f
removed warning and break for large line lists with names in .sim_net…
joshwlambert Feb 5, 2024
fb5e435
linting
joshwlambert Feb 5, 2024
e60bdf7
added network option to create_config
joshwlambert Feb 6, 2024
2985b39
updated create_config tests
joshwlambert Feb 6, 2024
173a855
add option to turn off network effect in .sim_network_bp and remove a…
joshwlambert Feb 6, 2024
be49848
added test for .sim_network_bp without network effect and update othe…
joshwlambert Feb 6, 2024
baeb283
removed add_names argument from .sim_network_bp call in sim_* functions
joshwlambert Feb 6, 2024
91f77ff
removed add_names argument from .sim_bp_linelist
joshwlambert Feb 6, 2024
5d83edc
updated sim_contacts and sim_linelist tests
joshwlambert Feb 6, 2024
297d713
added info box to simulist vignette on network effect in model
joshwlambert Feb 6, 2024
bdac6de
use snapshot tests for regression testing .sim_network_bp
joshwlambert Feb 6, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 0 additions & 20 deletions CITATION.cff
Original file line number Diff line number Diff line change
Expand Up @@ -104,26 +104,6 @@ references:
email: [email protected]
orcid: https://orcid.org/0000-0001-8814-9421
year: '2024'
- type: software
title: bpmodels
abstract: 'bpmodels: Simulating and Analysing Transmission Chain Statistics using
Branching Process Models'
notes: Imports
url: https://epiverse-trace.github.io/bpmodels/
authors:
- family-names: Azam
given-names: James M.
email: [email protected]
orcid: https://orcid.org/0000-0001-5782-7330
- family-names: Finger
given-names: Flavio
email: [email protected]
orcid: https://orcid.org/0000-0002-8613-5170
- family-names: Funk
given-names: Sebastian
email: [email protected]
orcid: https://orcid.org/0000-0002-2842-3406
year: '2024'
- type: software
title: randomNames
abstract: 'randomNames: Generate Random Given and Surnames'
Expand Down
4 changes: 1 addition & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,6 @@ Imports:
stats,
checkmate,
epiparameter,
bpmodels,
randomNames,
rlang
Suggests:
Expand All @@ -40,8 +39,7 @@ Suggests:
spelling,
testthat (>= 3.0.0)
Remotes:
epiverse-trace/epiparameter,
epiverse-trace/bpmodels
epiverse-trace/epiparameter
Config/testthat/edition: 3
Config/Needs/website:
epiverse-trace/epiversetheme
Expand Down
20 changes: 14 additions & 6 deletions R/add_cols.R
Original file line number Diff line number Diff line change
Expand Up @@ -81,13 +81,17 @@ NULL
.add_hospitalisation <- function(.data,
onset_to_hosp,
hosp_risk) {
.data$hospitalisation <- .data$time + onset_to_hosp(nrow(.data))
infected_idx <- .data$infected == "infected"
num_infected <- sum(infected_idx)
.data$hospitalisation <- NA_real_
.data$hospitalisation[infected_idx] <- .data$time[infected_idx] +
onset_to_hosp(num_infected)

if (is.numeric(hosp_risk)) {
pop_sample <- sample(
seq_len(nrow(.data)),
which(infected_idx),
replace = FALSE,
size = (1 - hosp_risk) * nrow(.data)
size = (1 - hosp_risk) * num_infected
)
.data$hospitalisation[pop_sample] <- NA
} else {
Expand All @@ -113,14 +117,18 @@ NULL
onset_to_death,
hosp_death_risk,
non_hosp_death_risk) {
.data$deaths <- .data$time + onset_to_death(nrow(.data))
infected_idx <- .data$infected == "infected"
num_infected <- sum(infected_idx)
.data$deaths <- NA_real_
.data$deaths[infected_idx] <- .data$time[infected_idx] +
onset_to_death(num_infected)

apply_death_risk <- function(.data, risk, hosp = TRUE) {
if (is.numeric(risk)) {
pop_sample <- sample(
seq_len(nrow(.data)),
which(infected_idx),
replace = FALSE,
size = (1 - risk) * nrow(.data)
size = (1 - risk) * num_infected
)
.data$deaths[pop_sample] <- NA
} else {
Expand Down
12 changes: 6 additions & 6 deletions R/checkers.R
Original file line number Diff line number Diff line change
Expand Up @@ -114,13 +114,13 @@
#' called for its side-effects, which will error if the input is invalid.
#' @keywords internal
.check_sim_input <- function(sim_type = c("linelist", "contacts", "outbreak"), # nolint cyclocomp_linter
R,
serial_interval,
contact_distribution,
contact_interval,
prob_infect,
outbreak_start_date,
min_outbreak_size,
onset_to_hosp = NULL,
onset_to_death = NULL,
contact_distribution = NULL,
add_names = NULL,
add_ct = NULL,
case_type_probs = NULL,
Expand All @@ -131,8 +131,9 @@
population_age = NULL) {
sim_type <- match.arg(sim_type)

checkmate::assert_number(R, lower = 0)
.check_func_req_args(serial_interval)
checkmate::assert_number(prob_infect, lower = 0, upper = 1)
.check_func_req_args(contact_distribution)
.check_func_req_args(contact_interval)
checkmate::assert_date(outbreak_start_date)
checkmate::assert_integerish(min_outbreak_size, lower = 1)

Expand Down Expand Up @@ -169,7 +170,6 @@
}

if (sim_type == "contacts" || sim_type == "outbreak") {
.check_func_req_args(contact_distribution)
checkmate::assert_numeric(contact_tracing_status_probs, len = 3)
checkmate::assert_names(
names(contact_tracing_status_probs),
Expand Down
57 changes: 25 additions & 32 deletions R/sim_contacts.R
Original file line number Diff line number Diff line change
@@ -1,9 +1,6 @@
#' Simulate contacts for an infectious disease outbreak
#'
#' @inheritParams sim_linelist
#' @param contact_distribution An `<epidist>` object or anonymous function for
#' the contact distribution. This is similar to the offspring distribution,
#' but describes the heterogeneity of contacts that are not infected.
#' @param contact_tracing_status_probs A named `numeric` vector with the
#' probability of each contact tracing status. The names of the vector must
#' be `"under_followup"`, `"lost_to_followup"`, `"unknown"`. Values of each
Expand All @@ -16,28 +13,28 @@
#'
#' @examples
#' # load data required to simulate contacts
#' serial_interval <- epiparameter::epidist(
#' contact_distribution <- epiparameter::epidist(
#' disease = "COVID-19",
#' epi_dist = "serial interval",
#' prob_distribution = "gamma",
#' prob_distribution_params = c(shape = 1, scale = 1)
#' epi_dist = "contact distribution",
#' prob_distribution = "pois",
#' prob_distribution_params = c(mean = 2)
#' )
#'
#' contact_distribution <- epiparameter::epidist(
#' contact_interval <- epiparameter::epidist(
#' disease = "COVID-19",
#' epi_dist = "contact_distribution",
#' prob_distribution = "pois",
#' prob_distribution_params = c(l = 5)
#' epi_dist = "contact interval",
#' prob_distribution = "gamma",
#' prob_distribution_params = c(shape = 1, scale = 1)
#' )
#'
#' contacts <- sim_contacts(
#' R = 1.1,
#' serial_interval = serial_interval,
#' contact_distribution = contact_distribution
#' contact_distribution = contact_distribution,
#' contact_interval = contact_interval,
#' prob_infect = 0.5
#' )
sim_contacts <- function(R,
serial_interval,
contact_distribution,
sim_contacts <- function(contact_distribution,
contact_interval,
prob_infect,
outbreak_start_date = as.Date("2023-01-01"),
min_outbreak_size = 10,
population_age = c(1, 90),
Expand All @@ -50,44 +47,40 @@ sim_contacts <- function(R,
# check and convert distribution to func if needed before .check_sim_input()
stopifnot(
"Input delay distributions need to be either functions or <epidist>" =
inherits(serial_interval, c("function", "epidist")) &&
inherits(contact_distribution, c("function", "epidist"))
inherits(contact_interval, c("function", "epidist"))
)
serial_interval <- as.function(serial_interval, func_type = "generate")
contact_distribution <- as.function(
contact_distribution,
func_type = "generate"
contact_distribution, func_type = "density"
)
contact_interval <- as.function(contact_interval, func_type = "generate")

.check_sim_input(
sim_type = "contacts",
R = R,
serial_interval = serial_interval,
contact_distribution = contact_distribution,
contact_interval = contact_interval,
prob_infect = prob_infect,
outbreak_start_date = outbreak_start_date,
min_outbreak_size = min_outbreak_size,
contact_distribution = contact_distribution,
contact_tracing_status_probs = contact_tracing_status_probs,
population_age = population_age
)

chain <- .sim_bp_linelist(
R = R,
serial_interval = serial_interval,
contact_distribution = contact_distribution,
contact_interval = contact_interval,
prob_infect = prob_infect,
outbreak_start_date = outbreak_start_date,
min_outbreak_size = min_outbreak_size,
population_age = population_age,
add_names = TRUE,
config = config
)

chain <- .add_names(.data = chain)

contacts <- .sim_contacts_tbl(
.data = chain,
outbreak_start_date = outbreak_start_date,
contact_distribution = contact_distribution,
population_age = population_age,
contact_tracing_status_probs = contact_tracing_status_probs,
config = config
contact_tracing_status_probs = contact_tracing_status_probs
)

# return line list
Expand Down
117 changes: 15 additions & 102 deletions R/sim_contacts_tbl.R
Original file line number Diff line number Diff line change
Expand Up @@ -6,128 +6,41 @@
#' @return A contacts `<data.frame>`
#' @keywords internal
.sim_contacts_tbl <- function(.data,
outbreak_start_date,
contact_distribution,
population_age,
contact_tracing_status_probs,
config) {
contact_tracing_status_probs) {
if (!"infector_name" %in% colnames(.data)) {
.data <- .add_names(.data = .data)
}

contact_investigation <- subset(
contacts_tbl <- subset(
.data,
select = c(
"infector", "infector_name", "case_name", "age", "gender",
"infector_name", "case_name", "age", "gender",
"date_first_contact", "date_last_contact"
)
)
colnames(contact_investigation) <- c(
"infector", "from", "to", "cnt_age", "cnt_gender",
"date_first_contact", "date_last_contact"
colnames(contacts_tbl) <- c(
"from", "to", "age", "gender", "date_first_contact",
"date_last_contact"
)
contact_investigation <- contact_investigation[-1, ]
contact_investigation$was_case <- "Y"

other_contacts <- subset(.data, select = c("infector", "infector_time"))
other_contacts <- other_contacts[-1, ]
other_contacts$from <- contact_investigation$from
other_contacts <- subset(
other_contacts,
subset = !duplicated(other_contacts$infector)
)

# sample contact distribution for non-infected contacts
contact_freq <- contact_distribution(nrow(other_contacts))

# Multiply row by times specified in frequency column V
other_contacts <-
other_contacts[rep(row.names(other_contacts), contact_freq), ]

# add random age and gender
if (is.data.frame(population_age)) {
age_groups <- apply(population_age, MARGIN = 1, function(x) x[1]:x[2])
sample_weight <- rep(population_age$proportion, times = lengths(age_groups))
# normalise for vector length
sample_weight <- sample_weight /
rep(lengths(age_groups), times = lengths(age_groups))
other_contacts$cnt_age <- sample(
unlist(age_groups),
size = nrow(other_contacts),
replace = TRUE,
prob = sample_weight
)
} else {
other_contacts$cnt_age <- sample(
population_age[1]:population_age[2],
size = nrow(other_contacts),
replace = TRUE
)
}
other_contacts$cnt_gender <- sample(
c("m", "f"),
size = nrow(other_contacts),
replace = TRUE
)

# Add corresponding names acc to the gender V
other_contacts$to <- randomNames::randomNames(
which.names = "both",
name.sep = " ",
name.order = "first.last",
gender = other_contacts$cnt_gender,
sample.with.replacement = FALSE
contacts_tbl$was_case <- ifelse(
test = .data$infected == "infected",
yes = "Y",
no = "N"
)

# add contact dates
other_contacts <- .add_date_contact(
.data = other_contacts,
contact_type = "last",
distribution = config$last_contact_distribution,
config$last_contact_distribution_params,
outbreak_start_date = outbreak_start_date
)
other_contacts <- .add_date_contact(
.data = other_contacts,
contact_type = "first",
distribution = config$first_contact_distribution,
config$first_contact_distribution_params
)

other_contacts <- subset(
other_contacts,
select = c(
"infector", "from", "to", "cnt_age", "cnt_gender",
"date_first_contact", "date_last_contact"
)
)
other_contacts$was_case <- "N"

# merge both datasets and keep all the values from both
contact_investigation <- rbind(contact_investigation, other_contacts)
contact_investigation <-
contact_investigation[order(contact_investigation$infector), ]
row.names(contact_investigation) <- NULL

# add contact tracing status
pick_N <- which(contact_investigation$was_case == "N")
pick_N <- which(contacts_tbl$was_case == "N")
status_type <- sample(
names(contact_tracing_status_probs),
size = length(pick_N),
replace = TRUE,
prob = contact_tracing_status_probs
)
contact_investigation$status <- NA
contact_investigation$status[pick_N] <- status_type
contact_investigation$status <- ifelse(
test = is.na(contact_investigation$status),
yes = "case",
no = contact_investigation$status
)
contacts_tbl$status <- "case"
contacts_tbl$status[pick_N] <- status_type

# remove infector col
contact_investigation$infector <- NULL
contacts_tbl <- contacts_tbl[-1, ]

# return contacts
contact_investigation
contacts_tbl
}
Loading