In using {EpiNow2}
, users will often be faced with two
decision points that will guide their choice of an appropriate model:
(i) use case: retrospective vs real-time analysis, and (ii) limited
computing resources. {EpiNow2}
provides a range of
customisations of the default model to suit these decision points.
The aim of this vignette is to show how these model customisations
affect the speed and accuracy of the model. We will benchmark four (4)
{EpiNow2}
model options chosen to cover a range of use
cases. We will benchmark the default model, the model with the 7-day
random walk prior on \(R_t\), the
non-mechanistic model, which has no explicit prior on \(R_t\), and the non-residual \(R_t\) model, which assumes a stationary
prior on \(R_t\).
We will analyse the estimation and forecasting performance of these
models when solved with MCMC sampling. {EpiNow2}
also
provides approximate sampling methods to solve these models (variational
inference, pathfinder
method, and laplace
sampling). These algorithms are, however, not recommended for use in
pipelines, so we will not emphasise their use. We will, however,
highlight how the models perform when solved with these approximate
sampling methods and provide an overview of when they may be
appropriate.
For the benchmarking exercise, we will compare the models based on their runtimes and the accuracy of estimates on complete and partial data, taken at 3 time points of a simulated epidemic: (i) the growth phase, (ii) the peak, and (iii) the decline phase.
We will start by setting up the “true” dataset with known
trajectories of \(R_t\) and infections
using {EpiNow2}
’s forecast_infections()
function.
forecast_infections()
requires a fitted estimates object
from epinow()
with output
set to “fit”, the
trajectory of the reproduction number, R
, and the number of
samples to simulate. So, we will set these up first.
To obtain the estimates
object, we will run the
epinow()
function using real-world observed data and delay
distributions to recover realistic parameter values. For the
data
, we will use the first \(60\) observations of the
example_confirmed
data set. We will use the
example_generation_time
for the generation time, and the
example_incubation_period
, and
example_reporting_delay
to specify and delays. These come
with the package. For the rt
model, we will use a 14-day
random walk prior, with a mean of \(2\)
and standard deviation of \(0.1\). As
we only want to generate estimates, we will turn off forecasting by
setting horizon = 0
.
Throughout this vignette, several argument values, including the observation model options, the rt model prior will be maintained, so we will define them here.
Now, we can generate the estimates
object.
estimates <- epinow(
data = example_confirmed[1:60],
generation_time = generation_time_opts(example_generation_time),
delays = delay_opts(example_incubation_period + example_reporting_delay),
rt = rt_opts(prior = rt_prior_default, rw = 14),
gp = NULL,
obs = obs,
forecast = forecast_opts(horizon = 0), # no forecasting
output = "fit"
)
#> WARN [2025-01-29 15:07:06] epinow: The `filter_leading_zeros` argument of `estimate_infections()` is deprecated as of EpiNow2 1.7.0. -
#> WARN [2025-01-29 15:07:06] epinow: The `zero_threshold` argument of `estimate_infections()` is deprecated as of EpiNow2 1.7.0. -
#> WARN [2025-01-29 15:07:06] epinow: partial match of 'length' to 'lengths' - $, consecutive, length
For the R
data, we will set up an arbitrary trajectory
and add some Gaussian noise.
# Arbitrary reproduction number trajectory
R <- c(
rep(1.50, 20),
rep(1.25, 10),
rep(1, 10),
rep(0.5, 10),
rep(1, 10),
1 + 0.04 * 1:20,
rep(1.4, 5),
1.4 - 0.02 * 1:20,
rep(1.4, 10),
rep(0.8, 50),
0.8 + 0.02 * 1:20
)
# Add Gaussian noise
R_noisy <- R * rnorm(length(R), 1.1, 0.05)
# Plot
true_rt <- ggplot(data = data.frame(R = R_noisy)) +
geom_line(aes(x = seq_along(R_noisy), y = R)) +
labs(x = "Time")
plot(true_rt)
Let’s proceed to simulate the true infections and \(R_t\) data by sampling from \(10\) posterior samples.
Let’s now extract the true data for benchmarking: -
R_true
: the median of the simulated \(R_t\) values, -
infections_true
: the infections by date of infection, and -
reported_cases_true
: the reported cases by date of
report.
R_true <- forecast$summarised[variable == "R"]$median
# Get the posterior samples from which to extract the simulated infections and reported cases
posterior_sample <- forecast$samples[sample == 1]
# Extract the simulated infections
infections_true <- posterior_sample[variable == "infections"]$value
# Extract the simulated reported cases and rename the "value" column to "confirm" (to match EpiNow2 requirements)
reported_cases_true <- posterior_sample[
variable == "reported_cases", .(date, confirm = value)
]
Let’s see what the cases plot looks like
cases_traj <- ggplot(data = reported_cases_true) +
geom_line(aes(x = date, y = confirm)) +
scale_y_continuous(label = scales::label_comma()) +
scale_x_date(date_labels = "%b %d", date_breaks = "1 month") +
labs(y = "Reported cases", x = "Date")
#> Warning in scale_y_continuous(label = scales::label_comma()): partial argument
#> match of 'label' to 'labels'
cases_traj
We will now proceed to define and run the different model options.
Below we describe each model.
model | model_basename | description |
---|---|---|
default_mcmc | default | Default model (non-stationary prior on \(R_t\)); fitting with mcmc |
default_vb | default | Default model (non-stationary prior on \(R_t\)); fitting with variational bayes |
default_pathfinder | default | Default model (non-stationary prior on \(R_t\)); fitting with pathfinder algorithm |
default_laplace | default | Default model (non-stationary prior on \(R_t\)); fitting with laplace approximation |
non_mechanistic_mcmc | non_mechanistic | no mechanistic prior on \(R_t\); fitting with mcmc |
non_mechanistic_vb | non_mechanistic | no mechanistic prior on \(R_t\); fitting with variational bayes |
non_mechanistic_pathfinder | non_mechanistic | no mechanistic prior on \(R_t\); fitting with pathfinder algorithm |
non_mechanistic_laplace | non_mechanistic | no mechanistic prior on \(R_t\); fitting with laplace approximation |
rw7_mcmc | rw7 | 7-day random walk prior on \(R_t\); fitting with mcmc |
rw7_vb | rw7 | 7-day random walk prior on \(R_t\); fitting with variational bayes |
rw7_pathfinder | rw7 | 7-day random walk prior on \(R_t\); fitting with pathfinder algorithm |
rw7_laplace | rw7 | 7-day random walk prior on \(R_t\); fitting with laplace approximation |
non_residual_mcmc | non_residual | Stationary prior on \(R_t\); fitting with mcmc |
non_residual_vb | non_residual | Stationary prior on \(R_t\); fitting with variational bayes |
non_residual_pathfinder | non_residual | Stationary prior on \(R_t\); fitting with pathfinder algorithm |
non_residual_laplace | non_residual | Stationary prior on \(R_t\); fitting with laplace algorithm |
These are the components of each model.
model | rt_gp_prior | fitting | package |
---|---|---|---|
default_mcmc | non_stationary | mcmc | rstan |
default_vb | non_stationary | variational_bayes | rstan |
default_pathfinder | non_stationary | pathfinder | cmdstanr |
default_laplace | non_stationary | laplace | cmdstanr |
non_mechanistic_mcmc | none | mcmc | rstan |
non_mechanistic_vb | none | variational_bayes | rstan |
non_mechanistic_pathfinder | none | pathfinder | cmdstanr |
non_mechanistic_laplace | none | laplace | cmdstanr |
rw7_mcmc | non_stationary | mcmc | rstan |
rw7_vb | non_stationary | variational_bayes | rstan |
rw7_pathfinder | non_stationary | pathfinder | cmdstanr |
rw7_laplace | non_stationary | laplace | cmdstanr |
non_residual_mcmc | stationary | mcmc | rstan |
non_residual_vb | stationary | variational_bayes | rstan |
non_residual_pathfinder | stationary | pathfinder | rstan |
non_residual_laplace | stationary | laplace | cmdstanr |
We will now define the {EpiNow2}
configurations for each
model, which are modifications of the default model.
model_configs <- list(
# The default model with MCMC fitting
default_mcmc = list(
rt = rt_opts()
),
# The default model with variational bayes fitting
default_vb = list(
rt = rt_opts(),
stan = stan_opts(method = "vb", backend = "rstan")
),
# The default model with pathfinder fitting
default_pathfinder = list(
rt = rt_opts(),
stan = stan_opts(method = "pathfinder", backend = "cmdstanr")
),
# The default model with laplace fitting
default_laplace = list(
rt = rt_opts(),
stan = stan_opts(method = "laplace", backend = "cmdstanr")
),
# The non-mechanistic model with MCMC fitting
non_mechanistic_mcmc = list(
rt = NULL
),
# The non-mechanistic model with variational bayes fitting
non_mechanistic_vb = list(
rt = NULL,
stan = stan_opts(method = "vb", backend = "rstan")
),
# The non-mechanistic model with pathfinder fitting
non_mechanistic_pathfinder = list(
rt = NULL,
stan = stan_opts(method = "pathfinder", backend = "cmdstanr")
),
# The non-mechanistic model with laplace fitting
non_mechanistic_laplace = list(
rt = NULL,
stan = stan_opts(method = "laplace", backend = "cmdstanr")
),
# The 7-day RW Rt model with MCMC fitting
rw7_mcmc = list(
rt = rt_opts(
prior = rt_prior_default,
rw = 7
)
),
# The 7-day RW Rt model with variational bayes fitting
rw7_vb = list(
rt = rt_opts(
prior = rt_prior_default,
rw = 7
),
stan = stan_opts(method = "vb", backend = "rstan")
),
# The 7-day RW Rt model with pathfinder fitting
rw7_pathfinder = list(
rt = rt_opts(
prior = rt_prior_default,
rw = 7
),
stan = stan_opts(method = "pathfinder", backend = "cmdstanr")
),
# The 7-day RW Rt model with laplace fitting
rw7_laplace = list(
rt = rt_opts(
prior = rt_prior_default,
rw = 7
),
stan = stan_opts(method = "laplace", backend = "cmdstanr")
),
# The non_residual model with MCMC fitting
non_residual_mcmc = list(
rt = rt_opts(
prior = rt_prior_default,
gp_on = "R0"
)
),
# The non_residual model with variational bayes fitting
non_residual_vb = list(
rt = rt_opts(
prior = rt_prior_default,
gp_on = "R0"
),
stan = stan_opts(method = "vb", backend = "rstan")
),
# The non_residual model with pathfinder fitting
non_residual_pathfinder = list(
rt = rt_opts(
prior = rt_prior_default,
gp_on = "R0"
),
stan = stan_opts(method = "pathfinder", backend = "cmdstanr")
),
# The non_residual model with laplace fitting
non_residual_laplace = list(
rt = rt_opts(
prior = rt_prior_default,
gp_on = "R0"
),
stan = stan_opts(method = "pathfinder", backend = "cmdstanr")
)
)
All the models will share the configuration for the generation time, incubation period, reporting delay, and the forecast horizon, so we will define them once and pass them to the models.
# Generation time
generation_time <- Gamma(
shape = Normal(1.3, 0.3),
rate = Normal(0.37, 0.09),
max = 14
)
# Incubation period
incubation_period <- LogNormal(
meanlog = Normal(1.6, 0.05),
sdlog = Normal(0.5, 0.05),
max = 14
)
# Reporting delay
reporting_delay <- LogNormal(
meanlog = 0.5,
sdlog = 0.5,
max = 10
)
# Combine the incubation period and reporting delay into one delay
delay <- incubation_period + reporting_delay
# 7-day forecast window
horizon <- 7
# Combine the shared model inputs into a list for use across all the models
model_inputs <- list(
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
obs = obs,
forecast = forecast_opts(horizon = horizon),
verbose = FALSE
)
Additionally, from the benchmarking data, we choose the following dates to represent the periods of growth, peak, and decline.
Using the chosen dates, let’s create the data snapshots for fitting the models.
Now, we’re ready to run the models. We will use the
epinow()
function and return useful outputs like the timing
of model runs. We obtain forecasts for the data excluding the forecast
horizon and then compare the forecasts to the data including the horizon
in the evaluations. This is often called an out-of-sample
evaluation.
# Create a version of epinow() that works like base::try() and works even if some models fail.
safe_epinow <- purrr::safely(epinow)
# Run the models over the different dates
results <- lapply(
data_snaps, function(data) {
lapply(
model_configs,
function(model) {
# Use subset of the data
data_to_fit <- data[1:(.N - model_inputs$forecast$horizon)] # exclude the forecast horizon
model_inputs <- c(model_inputs, data = list(data_to_fit))
do.call(
safe_epinow,
c(
model_inputs,
model
)
)
}
)
}
)
We’ll begin by setting up the following functions:
extract_results()
: extract a subset of variables of
interest (“timing”, “Rt”, and “infections”) from an
epinow()
run.get_model_results()
: apply
extract_results()
to a nested list of model runs per
snapshot date.add_fit_type()
: re-categorise the four(4) fitting
methods into MCMC and approximate.make_cols_factors
convert all columns into factor,
except the columns in the except
column.add_epidemic_phase_levels()
: add factor levels to the
epidemic_phase
column to allow for easy ordering.calc_crps()
: calculate the CRPS using the scoringutils R
package.process_crps()
: calculate CRPS estimates for the nested
list of model runs per snapshot date and flatten into a simple
list.add_model_details()
: Add the model components and
descriptions to the results of process_crps()
.calculate_total_crps()
: calculate total CRPS for each
model.plot_total_crps()
: plot the total CRPS for each
model.plot_crps_over_time()
: # plot CRPS over time per
model.# Function to extract the "timing", "Rt", "infections", and "reports" variables from an
# epinow() run. It expects a model run, x, which contains a "results" or "error" component.
# If all went well, "error" should be NULL.
extract_results <- function(x, variable) {
stopifnot(
"variable must be one of c(\"timing\", \"R\", \"infections\", \"reports\")" =
variable %in% c("timing", "R", "infections", "reports")
)
# Return NA if there's an error
if (!is.null(x$error)) {
return(NA)
}
if (variable == "timing") {
return(round(as.duration(x$result$timing), 1))
} else {
obj <- x$result$estimates$fit
}
# Extracting "Rt", "infections", and "reports" is different based on the object's class and
# other settings
if (inherits(obj, "stanfit")) {
# Depending on rt_opts(use_rt = TRUE/FALSE), R shows up as R or gen_R
if (variable == "R") {
# The non-mechanistic model returns "gen_R" where as the others sample "R".
if ("R[1]" %in% names(obj)) {
return(extract(obj, "R")$R)
} else {
return(extract(obj, "gen_R")$gen_R)
}
} else {
return(extract(obj, variable)[[variable]])
}
} else {
obj_mat <- as_draws_matrix(obj)
# Extracting R depends on the value of rt_opts(use_rt = )
if (variable == "R") {
if ("R[1]" %in% variables(obj_mat)) {
return(subset_draws(obj_mat, "R"))
} else {
return(subset_draws(obj_mat, "gen_R"))
}
} else {
return(subset_draws(obj_mat, variable))
}
}
}
# Wrapper for extracting the results and making them into a data.table
get_model_results <- function(results_by_snapshot, variable) {
# Get model results list
purrr::map_depth(results_by_snapshot, 2, extract_results, variable)
}
# Function to categorise fitting methods into mcmc and approximate
add_fit_type <- function(data){
data[, fit_type := ifelse(fitting == "mcmc", "mcmc", "approximate")]
data[]
}
# Function to convert all columns to factor except the specified cols in `except` (which should be DateTime)
make_cols_factors <- function(data, except){
# Make all columns except timing a factor
data[
,
(setdiff(names(data), except)) :=
lapply(.SD, as.factor),
.SDcols = setdiff(names(data), except)
]
data[]
}
# Add factor levels
add_epidemic_phase_levels <- function(data){
# Add epidemic phase as a factor
data[, epidemic_phase := factor(epidemic_phase, levels = c("growth", "peak", "decline"))]
data[]
}
# Function to calculate the CRPS of log-transformed values
calc_crps <- function(estimates, truth) {
# if the object is not a matrix, then it's an NA (failed run)
if (!inherits(estimates, c("matrix"))) return(rep(NA_real_, length(truth)))
# Assumes that the estimates object is structured with the samples as rows
shortest_obs_length <- min(ncol(estimates), length(truth))
reduced_truth <- tail(truth, shortest_obs_length)
estimates_transposed <- t(estimates) # transpose to have samples as columns
reduced_estimates <- tail(estimates_transposed, shortest_obs_length)
return(
crps_sample(
log10(reduced_truth),
log10(reduced_estimates)
)
)
}
# Function to calculate CRPS estimates and flatten the nested list of model values
process_crps <- function(results, variable, truth) {
# Extract values
results_by_snapshot <- get_model_results(results, variable = variable)
# Apply function to calculate CRPS
crps_by_snapshot <- purrr::map_depth(
results_by_snapshot,
2,
~ calc_crps(estimates = ., truth = truth)
)
# Add dates column based on snapshot length
crps_with_dates <- purrr::map_depth(
crps_by_snapshot,
2,
~ data.table(crps = .x)[,
date := min(reported_cases_true$date) + 0: (.N - 1)
]
)
# Flatten the results into one dt
crps_flat <- lapply(
crps_with_dates,
function(snapshot_results) {
rbindlist(snapshot_results, idcol = "model")
}) |>
rbindlist(idcol = "snapshot_date")
# Replace the snapshot dates with their description
crps_flat[, epidemic_phase := names(snapshot_date_labels)[
match(snapshot_date, snapshot_date_labels)
]]
return(crps_flat)
}
# Add model details (components and descriptions useful for summarising)
add_model_details <- function(crps_by_model){
# Add model configurations for facetting
with_model_components <- merge.data.table(
crps_by_model,
model_components,
by = "model"
)
# Add model descriptions
crps_dt <- merge.data.table(
with_model_components,
model_descriptions,
by = "model"
)
crps_dt[]
}
# Define a function to calculate total CRPS stratified by the "by" vector
calculate_total_crps <- function(data, by) {
data[, .(total_crps = sum(crps, na.rm = TRUE)), by = by]
}
# Define a function to plot total CRPS
plot_total_crps <- function(data, title) {
plot <- ggplot(data = data) +
geom_point(
aes(
x = epidemic_phase,
y = total_crps,
shape = model
),
size = 2.5
) +
labs(
x = "Epidemic phase",
y = "Total CRPS",
title = title,
caption = "Where a model is not shown, it means it failed to run"
) +
theme(plot.title = element_text(size = 18),
strip.text = element_text(size = 13),
axis.title = element_text(size = 13),
axis.text = element_text(size = 11)
)
return(plot)
}
# Define a function to plot CRPS over time
plot_crps_over_time <- function(data, title) {
plot <- ggplot(data = data[!is.na(crps)]) + # remove failed models
geom_line(
aes(x = date,
y = crps,
color = model
),
linewidth = 1.1
) +
labs(
x = "Time",
y = "CRPS",
title = title,
caption = "Where a model is not shown, it means it failed to run"
) +
ggplot2::theme_minimal() +
guides(color = guide_legend(title = "Model type")) +
theme(legend.position = "bottom",
plot.title = element_text(size = 18),
strip.text = element_text(size = 13),
axis.title = element_text(size = 13),
axis.text = element_text(size = 11)
)
return(plot)
}
Let’s now see how long each model took to run. Here, we show the run times by fitting method because they approximate methods are significantly faster. Note the difference in scale in the y-axis.
# Extract the run times and reshape to dt
runtimes_by_snapshot <- get_model_results(results, "timing")
# Flatten the results
runtimes_dt <- lapply(runtimes_by_snapshot, function(x) as.data.table(x)) |>
rbindlist(idcol = "snapshot_date", ignore.attr = TRUE)
# Reshape
runtimes_dt_long <- melt(
runtimes_dt,
id.vars = "snapshot_date", # Column to keep as an identifier
measure.vars = model_descriptions$model, # Dynamically select model columns by pattern
variable.name = "model", # Name for the 'model' column
value.name = "timing" # Name for the 'timing' column
)
#> Warning in melt.data.table(runtimes_dt, id.vars = "snapshot_date", measure.vars
#> = model_descriptions$model, : 'measure.vars' [default_mcmc, default_vb,
#> default_pathfinder, default_laplace, ...] are not all of the same type. By
#> order of hierarchy, the molten data value column will be of type 'double'. All
#> measure variables not of type 'double' will be coerced too. Check DETAILS in
#> ?melt.data.table for more on coercion.
# Add model configurations
runtimes_dt_detailed <- merge(
runtimes_dt_long,
model_components,
by = "model"
)
# snapshot dates dictionary
snapshot_date_labels <- c(growth = "2020-05-15", peak = "2020-06-21", decline = "2020-06-28")
# Replace snapshot_date based on the dictionary
runtimes_dt_detailed[, epidemic_phase := names(snapshot_date_labels)[match(snapshot_date, snapshot_date_labels)]]
# Add model descriptions
runtimes_dt_detailed <- merge(
runtimes_dt_detailed,
model_descriptions,
by = "model"
)
# Re-categorise the factor levels
runtimes_dt_detailed <- add_fit_type(runtimes_dt_detailed)
# Make all columns except timing a factor
runtimes_dt_detailed <- make_cols_factors(runtimes_dt_detailed, except = "timing")
# Add epidemic_phase factor levels to c("growth", "peak", "decline"))
runtimes_dt_detailed <- add_epidemic_phase_levels(runtimes_dt_detailed)
# Plot the timing
timing_plot <- ggplot(data = runtimes_dt_detailed) +
geom_point(aes(x = epidemic_phase,
y = timing,
shape = model_basename,
color = fitting
),
position = position_dodge(width = 0.4),
size = 3,
stroke = 1.2
) +
labs(x = "Epidemic phase",
y = "Runtime (secs)",
shape = "Model",
fill = "Fitting method",
title = "Runtimes per model and fitting method",
caption = "Where a model is not shown, it means it failed to run"
) +
theme_minimal() +
scale_color_brewer(palette = "Dark2") +
scale_shape_manual(values = c(1, 0, 6, 8)) +
facet_wrap(~fit_type, scales = "free_y", nrow = 2, strip.position = "top") +
theme(plot.title = element_text(size = 18),
strip.text = element_text(size = 13),
axis.title = element_text(size = 13),
axis.text = element_text(size = 11)
)
timing_plot
#> Warning: Removed 7 rows containing missing values or values outside the scale range
#> (`geom_point()`).
We can see that across the board, the non-mechanistic model and non-residuals models were the fastest whereas the default model was among the slowest models for all data scenarios.
Now, we will evaluate the performance of the models using the continuous ranked probability score (CRPS). The CRPS is a proper scoring rule that measures the accuracy of probabilistic forecasts. When comparing models, the smaller the CRPS, the better.
We will focus on evaluating the results of fitting with MCMC. We will
also show the results for the approximate sampling methods but will not
delve deeply into them as we do not recommended for full inference due
to their experimental nature currently, which makes them unstable. In a
latter section, we will provide ideas on how the approximate methods can
be used in {EpiNow2}
workflows.
As part of post-processing, we will add a “type” column to the output of the estimates to indicate which subsets are based on partial and complete data. We will get the “type” column from the default model (the same values across model outputs) and append them to the rest of the model results.
# Get date and fit type from the default model (the same across model outputs)
results_by_model <- list_transpose(results)
estimate_types <- lapply(
results_by_model$non_mechanistic_mcmc,
function(results_by_snapshot) {
results_by_snapshot$result$estimates$summarised[
variable == "reported_cases"][
,
c("date", "type")
]
}
) |>
rbindlist(idcol = "snapshot_date")
# add the "estimate_type" column.
rt_crps_dt <- merge(
rt_crps_full,
estimate_types,
by = c("date", "snapshot_date")
)
# Add the "estimate_type" column
infections_crps_dt <- merge(
infections_crps_full,
estimate_types,
by = c("date", "snapshot_date")
)
# Re-categorise fit_type column and convert to factor
# Rt
rt_crps_dt <- add_fit_type(rt_crps_dt)
rt_crps_dt <- make_cols_factors(rt_crps_dt, except = c("date", "crps"))
rt_crps_dt <- add_epidemic_phase_levels(rt_crps_dt)
# Simplify the model name for downstream use
rt_crps_dt_final <- rt_crps_dt[, model := gsub("_[^_]*$", "", model)]
# Infections
infections_crps_dt <- add_fit_type(infections_crps_dt)
infections_crps_dt <- make_cols_factors(infections_crps_dt, except = c("date", "crps"))
infections_crps_dt <- add_epidemic_phase_levels(infections_crps_dt)
# Simplify the model name for downstream use
infections_crps_dt_final <- infections_crps_dt[, model := gsub("_[^_]*$", "", model)]
Let’s see how the \(R_t\) and
infections CRPS changed over time using the function
plot_crps_over_time()
. We’ll start with the models fitted
with MCMC.
# Plot CRPS over time for infections
infections_crps_mcmc_dt <- infections_crps_dt_final[fitting == "mcmc"]
infections_crps_mcmc_plot <- plot_crps_over_time(infections_crps_mcmc_dt, "Time-varying model performance in estimating infections (mcmc)")
infections_crps_mcmc_plot +
facet_wrap(~epidemic_phase, ncol = 1)
Let’s calculate the overall/aggregated performance of the models in terms of the total CRPS for \(R_t\) and infections for downstream visualisations.
# Rt
rt_total_crps <- calculate_total_crps(rt_crps_dt_final, by = c("model", "epidemic_phase", "type", "fitting"))
# infections
infections_total_crps <- calculate_total_crps(infections_crps_dt_final, by = c("model", "epidemic_phase", "type", "fitting"))
# We are not interested in the "estimates", just "estimaes based on partial data" and "forecasts"
rt_total_crps <- rt_total_crps[type != "estimate"]
infections_total_crps <- infections_total_crps[type != "estimate"]
Now, let’s visualise the results for MCMC fitting.
# Calculate
infections_total_crps_mcmc <- infections_total_crps[fitting == "mcmc"]
# Process and plot
infections_total_crps_plot <- plot_total_crps(infections_total_crps_mcmc, "Aggregated model performance in estimating infections (mcmc)")
infections_total_crps_plot +
scale_shape_manual(values = c(0, 1, 17, 8)) +
facet_wrap(~type)
We’ll now show the performance of the approximate methods. Note that we do not recommend using them in real-world inference and analytics pipelines. We provide alternative use cases in the following sections.
Let’s first look at the time varying \(R_t\) and infections estimates.
# Plot CRPS over time for Rt
rt_crps_approx <- rt_crps_dt_final[fitting != "mcmc"]
# Process plot data and plot
rt_tv_crps_plot_approx <- plot_crps_over_time(rt_crps_approx, "Time-varying model performance in estimating Rt (approximate methods)")
rt_tv_crps_plot_approx +
facet_wrap(fitting~epidemic_phase)
# Plot CRPS over time for Rt
infections_crps_approx <- infections_crps_dt_final[fitting != "mcmc"]
# Process plot data and plot
infections_tv_crps_plot_approx <- plot_crps_over_time(infections_crps_approx, "Time-varying model performance in estimating infections (approximate methods)")
infections_tv_crps_plot_approx +
facet_wrap(fitting~epidemic_phase, scales = "free_y") +
labs(caption = "Where a model is not shown, it means it failed to run")
Now, let’s look at the total CRPS for \(R_t\) and infections.
# Calculate total CRPS
rt_total_crps_approx <- rt_total_crps[fitting != "mcmc"]
# Process plot data and plot
rt_total_crps_approx_plot <- plot_total_crps(rt_total_crps_approx, "Aggregated model performance in estimating Rt (approximate methods)")
rt_total_crps_approx_plot +
scale_shape_manual(values = c(0, 1, 17, 8)) +
facet_wrap(~ type + fitting) +
labs(caption = "Where a model is not shown, it means it failed to run")
# Calculate total CRPS
infections_total_crps_approx <- infections_total_crps[fitting != "mcmc"]
# Process plot data and plot
infections_total_crps_approx_plot <- plot_total_crps(infections_total_crps_approx, "Aggregated model performance in estimating infections (approximate methods)")
infections_total_crps_approx_plot +
scale_shape_manual(values = c(0, 1, 17, 8)) +
facet_wrap( ~ type + fitting) +
labs(caption = "Where a model is not shown, it means it failed to run")
Overall, the non-mechanistic model was always the fastest and had the best performance in estimating \(R_t\). The default model was among the slowest models in most cases but often had the best time-varying and aggregated performance in estimating infections. The non-residual and 7-day random walk models showed mixed results. This suggests a trade-off between run times/speed and estimation/forecasting performance, here measured with the CRPS. These results show that choosing an appropriate model for a task requires carefully considering the use case and appropriate trade-offs. Below are a few considerations.
Estimation in {EpiNow2}
using the semi-mechanistic
approaches (putting a prior on \(R_t\))
is often much slower than the non-mechanistic approach. The mechanistic
model is slower because it models aspects of the processes and
mechanisms that drive \(R_t\) estimates
using the renewal equation. The non-mechanistic model, on the other
hand, runs much faster but does not use the renewal equation to generate
infections. Because of this none of the options defining the behaviour
of the reproduction number are available in this case, limiting its
flexibility.
It’s worth noting that the non-mechanistic model in
{EpiNow2}
is equivalent to that used in the {EpiEstim}
R package as they both estimate \(R_t\)
from case time series and the generation interval distribution.
The default sampling method, set through stan_opts()
,
performs MCMC
sampling using {rstan}
.
The MCMC sampling method is accurate but is often slow. The Laplace,
pathfinder, and variational inference methods are faster because they
are approximate (See, for example, a detailed explanation for automatic variational inference
in Stan).
In {EpiNow2}
, you can use variational inference with an
{rstan}
or {cmdstanr}
backend
but you must install the latter to access its functionalities.
Additionally, {EpiNow2}
supports using the Laplace
and Pathfinder
approximate samplers through the {cmdstanr}
R package but
these two methods are currently experimental in {cmdstanr}
and have not been well tested. The approximate methods may not be as
reliable as the default MCMC sampling method and we do not recommend
using them in real-world inference.
The approximate methods can be used in various ways. First, you can initialise the MCMC sampling algorithm with the fit object returned by methods such as pathfinder. More details can be found in the original pathfinder paper. This approach speeds up the initialisation phase of the MCMC algorithm. Second, the approximate methods are also great for prototyping. For example, if you are testing out a pipeline setup, it might be more practical to switch to a method like variational bayes and only use MCMC when the pipeline is up and running.
The random walk method reduces smoothness/granularity of the estimates, compared to the other methods.
We generated the data using an arbitrary R
trajectory.
This represents only one of many data scenarios that the models can be
benchmarked against. The data used here represents abrupt rises and
falls and could favour one model type or solver over another.
The run times measured here use a crude method that compares the
start and end times of each simulation. It only measures the time taken
for one model run and may not be accurate. For more accurate run time
measurements, we recommend using a more sophisticated approach like
those provided by packages like {bench}
and {microbenchmark}
.
Lastly, we used 6 cores for the simulations and so using more or
fewer cores might change the run time results. We, however, expect the
relative rankings to be the same or similar. To speed up the model runs,
we recommend checking the number of cores available on your machine
using parallel::detectCores()
and passing a high enough
number of cores to mc.cores
through the
options()
function (See the benchmarking data setup chunk
above for an example of how to do this).