library(predictNMB)
library(parallel)
library(ggplot2)
library(flextable)
set.seed(42)
This vignette is purely about how to use the autoplot() method and summary() to visualise and summarise the simulations made using {predictNMB}. For an introduction to {predictNMB}, please see the introductory vignette.
Firstly, as an example case, we will run screen_simulation_inputs().
# NMB sampler used when selecting cutpoints (samples TP and FN with uncertainty)
get_nmb_sampler_training <- function() {
  c(
    "TP" = rnorm(n = 1, mean = -80, sd = 5),
    "TN" = 0,
    "FP" = -20,
    "FN" = rnorm(n = 1, mean = -100, sd = 10)
  )
}
# NMB sampler used when evaluating cutpoints (fixed values)
get_nmb_sampler_evaluation <- function() {
  c(
    "TP" = -80,
    "TN" = 0,
    "FP" = -20,
    "FN" = -100
  )
}
# Run the simulations in parallel on two workers
cl <- makeCluster(2)
sim_screen_obj <- screen_simulation_inputs(
  n_sims = 500,
  n_valid = 10000,
  sim_auc = seq(0.7, 0.95, 0.05),
  event_rate = c(0.1, 0.2),
  fx_nmb_training = get_nmb_sampler_training,
  fx_nmb_evaluation = get_nmb_sampler_evaluation,
  cutpoint_methods = c("all", "none", "youden", "value_optimising"),
  cl = cl
)
stopCluster(cl)
screen_simulation_inputs()
In this simulation screen, we vary both the event rate and the model discrimination (sim_auc). There are many ways that we could visualise the data. The autoplot() function allows us to make some basic plots to compare the impact of different cutpoint methods on Net Monetary Benefit (NMB) and another variable of our choice.
In this case, we can visualise the impact on NMB for different methods across varying levels of sim_auc or event_rate. We control this with the x_axis_var argument.
autoplot(sim_screen_obj, x_axis_var = "sim_auc")
#>
#>
#> Varying simulation inputs, other than sim_auc, are being held constant:
#> event_rate: 0.1
(To avoid the overlap of points in this second plot, we can specify the dodge_width to be non-zero.)
autoplot(sim_screen_obj, x_axis_var = "event_rate", dodge_width = 0.002)
#>
#>
#> Varying simulation inputs, other than event_rate, are being held constant:
#> sim_auc: 0.7
For these plots, one of the screened inputs will be the x-axis variable, but the other will only be displayed at a single level. The default setting will assume the first level, so when we visualise the change in NMB specifying sim_auc as the x-axis variable, we only observe this for the case where event_rate = 0.1. We can choose to select another level with the constants argument. This argument expects a named list containing the values to keep for the screened inputs which are not shown on the x-axis.
autoplot(sim_screen_obj, x_axis_var = "sim_auc", constants = list(event_rate = 0.1))
#>
#>
#> Varying simulation inputs, other than sim_auc, are being held constant:
#> event_rate: 0.1
autoplot(sim_screen_obj, x_axis_var = "sim_auc", constants = list(event_rate = 0.2))
#>
#>
#> Varying simulation inputs, other than sim_auc, are being held constant:
#> event_rate: 0.2
We see both a change to the plot as well as the message produced when the plot is made.
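To make the role of constants more concrete, the following base R sketch mimics how one screened input is held fixed while the other varies along the x-axis. The grid data frame is a hypothetical stand-in for the screened inputs, not the package's internal representation, so treat this as an illustration of the idea only.

```r
# Hypothetical stand-in for the screened inputs (not the package's internal object)
grid <- expand.grid(
  sim_auc = seq(0.7, 0.95, 0.05),
  event_rate = c(0.1, 0.2)
)

# Hold event_rate at a chosen level, in the spirit of constants = list(event_rate = 0.2)
constants <- list(event_rate = 0.2)

# Compare with a tolerance because the values are floating point
keep <- abs(grid$event_rate - constants$event_rate) < 1e-8
shown <- grid[keep, ]

# One row remains for each level of sim_auc
nrow(shown)
```

Only the rows matching the constant remain, which is why the message printed by autoplot() changes along with the plot.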
There are three options for the y-axis. The default is the NMB, but you can also visualise the Incremental Net Monetary Benefit (INB) and the selected cutpoints. These are controlled by the what argument, which can be any of c("nmb", "inb", "cutpoints"). If a vector is used, only the first value will be selected. If you choose to visualise the INB, you must specify your chosen reference strategy for the calculation in the inb_ref_col argument. In this case, we use treat all ("all").
autoplot(sim_screen_obj, what = "nmb")
autoplot(sim_screen_obj, what = "inb", inb_ref_col = "all")
autoplot(sim_screen_obj, what = "cutpoints")
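As a rough illustration of what the INB represents, the sketch below assumes that, within each simulation, the INB of a strategy is its NMB minus the NMB of the reference strategy. The data frame and its values are hypothetical, and the package's own calculation may differ in detail.

```r
set.seed(1)

# Hypothetical per-simulation NMB values for three strategies (illustrative only)
nmb <- data.frame(
  all = rnorm(5, mean = -26, sd = 0.5),
  none = rnorm(5, mean = -10, sd = 1),
  youden = rnorm(5, mean = -15, sd = 2)
)

# The reference strategy, playing the role of inb_ref_col
inb_ref_col <- "all"
others <- setdiff(names(nmb), inb_ref_col)

# Assumed definition: INB = NMB of the strategy minus NMB of the reference
inb <- as.data.frame(
  lapply(nmb[others], function(x) x - nmb[[inb_ref_col]])
)

inb
```

Under this assumption the reference strategy has no column of its own in the result, because its INB against itself would be zero in every simulation.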
The plots show the median (the dot), the 95% confidence interval (thick vertical lines), the range (thin vertical lines), and the lines between the points by default. These can each be shown or hidden independently, and the width of the confidence interval can be controlled using the plot_conf_level argument.
autoplot(sim_screen_obj)
autoplot(sim_screen_obj, plot_range = FALSE)
autoplot(sim_screen_obj, plot_conf_level = FALSE)
autoplot(sim_screen_obj, plot_conf_level = FALSE, plot_range = FALSE)
autoplot(sim_screen_obj, plot_conf_level = FALSE, plot_range = FALSE, plot_line = FALSE)
Currently, the lines and dots overlap. We can use dodge_width to apply a horizontal dodge for all layers.
autoplot(sim_screen_obj)
autoplot(sim_screen_obj, dodge_width = 0.01)
The cutpoint methods can be renamed or removed. To rename them, pass a named vector to the rename_vector argument. The names of the vector are the new names and the values are the names you wish to replace.
autoplot(sim_screen_obj)
autoplot(
  sim_screen_obj,
  rename_vector = c(
    "Treat All" = "all",
    "Treat None" = "none",
    "Youden Index" = "youden",
    "Value Optimisation" = "value_optimising"
  )
)
You can reorder the methods by passing the order as the methods_order argument. Note that this also removes any methods which aren't included, and that the ordering is applied AFTER the methods have been renamed. So, if you are both renaming and re-ordering, you must provide the updated names when you order them:
autoplot(sim_screen_obj)
autoplot(sim_screen_obj, methods_order = c("all", "none"))
autoplot(
  sim_screen_obj,
  # Assign new names to the two methods of interest
  rename_vector = c("Treat All" = "all", "Treat None" = "none"),
  # Call the methods by their new names
  methods_order = c("Treat All", "Treat None")
)
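The order of operations described above (rename first, then filter and order) can be sketched in base R. The methods vector here is a hypothetical example, and this is an illustration of the logic rather than the package's implementation.

```r
# Method labels as they might appear in the results (hypothetical example vector)
methods <- c("all", "none", "youden", "value_optimising", "all", "none")

# rename_vector: names are the new labels, values are the existing ones
rename_vector <- c("Treat All" = "all", "Treat None" = "none")
methods_order <- c("Treat All", "Treat None")

# Step 1: rename by looking each existing label up in the values of rename_vector
idx <- match(methods, rename_vector)
renamed <- ifelse(is.na(idx), methods, names(rename_vector)[idx])

# Step 2: keep only the listed methods, then order them as a factor
kept <- renamed[renamed %in% methods_order]
result <- factor(kept, levels = methods_order)

levels(result)
```

Because the filtering happens after renaming, passing the old names ("all", "none") to methods_order in this sketch would match nothing, which mirrors why the updated names are required.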
The transparency of all layers can be controlled with plot_alpha.
autoplot(sim_screen_obj)
autoplot(sim_screen_obj, plot_alpha = 0.2)
autoplot(sim_screen_obj, plot_alpha = 1)
do_nmb_sim()
Many of the same arguments that we used above can be used with the object returned from do_nmb_sim().
do_nmb_sim_obj <- do_nmb_sim(
  n_sims = 500,
  n_valid = 10000,
  sim_auc = 0.7,
  event_rate = 0.1,
  fx_nmb_training = get_nmb_sampler_training,
  fx_nmb_evaluation = get_nmb_sampler_evaluation,
  cutpoint_methods = c("all", "none", "youden", "value_optimising")
)
The plots here show the results of a single simulation and compare the available cutpoints.
autoplot(do_nmb_sim_obj) + theme_sim()
The y-axis variable and names and orders of methods can be controlled in the same way as previously:
autoplot(do_nmb_sim_obj, what = "nmb") + theme_sim()
autoplot(
  do_nmb_sim_obj,
  what = "inb",
  inb_ref_col = "all",
  rename_vector = c(
    "Value-Optimising" = "value_optimising",
    "Treat-None" = "none",
    "Youden Index" = "youden"
  )
) + theme_sim()
autoplot(
  do_nmb_sim_obj,
  what = "cutpoints",
  methods_order = c("all", "none", "youden", "value optimising")
) + theme_sim()
These plots display the median as the solid bar. The grey parts of the distributions are the outer 5% of the simulated values, and the light blue region is the 95% CI. For the methods that select thresholds based on the values in the 2x2 table, including the value-optimising thresholds, this may look a little strange as the cutpoints are highly variable. This can be stabilised with more simulations. The fill colours of the histogram are controlled with fill_cols and the colour of the median line is controlled with median_line_col. The thickness of the median line is controlled with median_line_size and its transparency with median_line_alpha.
autoplot(
  do_nmb_sim_obj,
  fill_cols = c("red", "blue"),
  median_line_col = "yellow",
  median_line_alpha = 1,
  median_line_size = 0.9
) + theme_sim()
The n_bins argument controls the number of bins used for the histograms and the label_wrap_width is the number of characters above which to start a new line for the facet labels. This can be handy when using detailed names for the methods when the font of the label is relatively large compared to the plot, though a space is needed to determine where to split the label. The width of the confidence intervals can also be controlled by the conf.level argument in this autoplot() call.
autoplot(
  do_nmb_sim_obj,
  n_bins = 15,
  rename_vector = c(
    "Value- Optimising" = "value_optimising",
    "Treat- None" = "none",
    "Treat- All" = "all",
    "Youden Index" = "youden"
  ),
  label_wrap_width = 5,
  conf.level = 0.8
) + theme_sim()
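The need for a space in the label can be illustrated with base R's strwrap(), which only breaks text at whitespace. Whether the package wraps its facet labels with this exact function is an assumption here, so this sketch demonstrates the general behaviour and not the package internals.

```r
# One label with a space and one without
with_space <- "Value- Optimising"
without_space <- "value_optimising"

# strwrap() breaks only at whitespace, so a single long word is never split
wrapped <- strwrap(with_space, width = 5)
unwrapped <- strwrap(without_space, width = 5)

length(wrapped)   # number of lines for the label containing a space
length(unwrapped) # number of lines for the label without a space
```

This is why the renamed labels in the example above include a space after the hyphen: it gives the wrapping step somewhere to break.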
To make tables from the same objects as we used for the plots, we instead use summary(). This can be applied to either type of results (screen_simulation_inputs() or do_nmb_sim()). Using the %>% operator, we can pass it straight to flextable() from the {flextable} package that we have already loaded.
summary(sim_screen_obj)
sim_auc | event_rate | all_median | all_95% CI | none_median | none_95% CI | youden_median | youden_95% CI | value optimising_median | value optimising_95% CI |
---|---|---|---|---|---|---|---|---|---|
0.70 | 0.1 | -25.97 | -27.2 to -25.1 | -10.00 | -12.1 to -7.9 | -15.08 | -20.2 to -11 | -10.04 | -12.1 to -8 |
0.70 | 0.2 | -32.03 | -34 to -30.1 | -19.91 | -24 to -15.9 | -23.16 | -27 to -19.9 | -19.86 | -23.8 to -16.1 |
0.75 | 0.1 | -25.99 | -27 to -24.9 | -10.00 | -12 to -8 | -14.02 | -19.6 to -10.5 | -10.06 | -12 to -8.1 |
0.75 | 0.2 | -31.91 | -33.8 to -30 | -20.02 | -24.1 to -16.1 | -22.34 | -25.8 to -19.4 | -19.86 | -23.4 to -16.5 |
0.80 | 0.1 | -25.96 | -27 to -24.9 | -9.99 | -12 to -8.1 | -13.02 | -17.5 to -10.3 | -10.03 | -11.7 to -8.3 |
0.80 | 0.2 | -32.04 | -34 to -30.1 | -20.08 | -23.6 to -15.9 | -21.51 | -24.4 to -18.5 | -19.62 | -22.4 to -16.4 |
0.85 | 0.1 | -26.00 | -27 to -24.9 | -10.01 | -12 to -8.1 | -12.33 | -16.3 to -9.9 | -9.86 | -11.5 to -8.2 |
0.85 | 0.2 | -32.01 | -34.2 to -30.2 | -20.12 | -24.1 to -16.2 | -20.61 | -23.4 to -18.3 | -19.23 | -21.8 to -16.6 |
0.90 | 0.1 | -25.96 | -27.1 to -25 | -10.17 | -12.1 to -7.9 | -11.30 | -14.7 to -9.3 | -9.72 | -11.3 to -8.3 |
0.90 | 0.2 | -31.88 | -34.1 to -29.9 | -20.13 | -24 to -16.3 | -19.57 | -22.1 to -17.4 | -18.59 | -21 to -16.3 |
0.95 | 0.1 | -26.00 | -27 to -25 | -10.02 | -12 to -8 | -10.16 | -12.9 to -8.7 | -9.20 | -10.4 to -8.1 |
0.95 | 0.2 | -31.95 | -34.1 to -29.8 | -20.14 | -23.6 to -16.5 | -18.47 | -20.7 to -16.2 | -17.85 | -19.8 to -16 |
summary(do_nmb_sim_obj)
method | median | 95% CI |
---|---|---|
all | -25.97 | -27.2 to -25.1 |
none | -10.00 | -12.1 to -7.9 |
value optimising | -10.04 | -12.1 to -8 |
youden | -15.08 | -20.2 to -11 |
By default, the methods are aggregated by the median and the 95% confidence intervals (rounded to 2 and 1 decimal places, respectively). These are the default list of functions passed to summary() as the agg_functions argument. They can be changed to any functions that aggregate a numeric vector.
summary(
  do_nmb_sim_obj,
  agg_functions = list(
    "mean" = function(x) round(mean(x), digits = 2),
    "min" = min,
    "max" = max
  )
)
method | mean | min | max |
---|---|---|---|
all | -26.01 | -27.60814 | -24.551688 |
none | -10.06 | -13.91717 | -7.256865 |
value optimising | -10.17 | -25.80009 | -7.473393 |
youden | -15.18 | -23.07328 | -9.794308 |
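The only requirement on an aggregation function is that it reduces a numeric vector to a single value. The sketch below applies a named list of such functions to a hypothetical vector of simulated values. The percentile-based interval is an assumption about how an interval could be formatted as text, not a statement of how the package computes its default.

```r
set.seed(42)

# Hypothetical simulated NMB values for a single method
x <- rnorm(500, mean = -10, sd = 1)

# Each function takes a numeric vector and returns one value
agg_functions <- list(
  "median" = function(x) round(median(x), digits = 2),
  "IQR" = function(x) round(IQR(x), digits = 2),
  "95% interval" = function(x) {
    # Assumed percentile-based interval, formatted as a single string
    q <- round(quantile(x, probs = c(0.025, 0.975)), digits = 1)
    paste(q[[1]], "to", q[[2]])
  }
)

# Apply every function to the same vector: one summary value per function
res <- lapply(agg_functions, function(f) f(x))
res
```

The names of the list become the column headings, as with "mean", "min" and "max" in the table above.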
The what and rename_vector arguments work in the same way as they did when using autoplot().
summary(
  do_nmb_sim_obj,
  what = "inb",
  inb_ref_col = "all",
  rename_vector = c(
    "Value-Optimising" = "value_optimising",
    "Treat-None" = "none",
    "Youden Index" = "youden"
  )
)
method | median | 95% CI |
---|---|---|
Treat-None | 15.97 | 13.7 to 18.2 |
Value-Optimising | 15.92 | 13.7 to 18 |
Youden Index | 11.04 | 5.6 to 15 |
The summary table contains the same outputs for both do_nmb_sim() and screen_simulation_inputs(), but they are arranged slightly differently. Each row of the summary for a screen_simulation_inputs() object is a unique set of inputs. By default, only the inputs that vary in our function call (here, sim_auc and event_rate) are shown. This is controlled by the show_full_inputs argument: we can set show_full_inputs = TRUE to see all of the inputs.
summary(sim_screen_obj, show_full_inputs = TRUE)
In the table below, we merge repeated values using merge_v() and add theme_box() to make it a bit easier to read. (See the {flextable} documentation for more about making tables.)
| n_sims | n_valid | fx_nmb_training | fx_nmb_evaluation | sample_size | sim_auc | event_rate | min_events | meet_min_events | .sim_id | all_median | all_95% CI | none_median | none_95% CI | youden_median | youden_95% CI | value optimising_median | value optimising_95% CI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 500 | 10,000 | unnamed-nmb-function-1 | unnamed-nmb-function-1 | 189 | 0.70 | 0.1 | 19 | TRUE | 1 | -25.97 | -27.2 to -25.1 | -10.00 | -12.1 to -7.9 | -15.08 | -20.2 to -11 | -10.04 | -12.1 to -8 |
|  |  |  |  | 246 |  | 0.2 | 49 |  | 2 | -32.03 | -34 to -30.1 | -19.91 | -24 to -15.9 | -23.16 | -27 to -19.9 | -19.86 | -23.8 to -16.1 |
|  |  |  |  | 139 | 0.75 | 0.1 | 14 |  | 3 | -25.99 | -27 to -24.9 | -10.00 | -12 to -8 | -14.02 | -19.6 to -10.5 | -10.06 | -12 to -8.1 |
|  |  |  |  | 246 |  | 0.2 | 49 |  | 4 | -31.91 | -33.8 to -30 | -20.02 | -24.1 to -16.1 | -22.34 | -25.8 to -19.4 | -19.86 | -23.4 to -16.5 |
|  |  |  |  | 139 | 0.80 | 0.1 | 14 |  | 5 | -25.96 | -27 to -24.9 | -9.99 | -12 to -8.1 | -13.02 | -17.5 to -10.3 | -10.03 | -11.7 to -8.3 |
|  |  |  |  | 246 |  | 0.2 | 49 |  | 6 | -32.04 | -34 to -30.1 | -20.08 | -23.6 to -15.9 | -21.51 | -24.4 to -18.5 | -19.62 | -22.4 to -16.4 |
|  |  |  |  | 139 | 0.85 | 0.1 | 14 |  | 7 | -26.00 | -27 to -24.9 | -10.01 | -12 to -8.1 | -12.33 | -16.3 to -9.9 | -9.86 | -11.5 to -8.2 |
|  |  |  |  | 246 |  | 0.2 | 49 |  | 8 | -32.01 | -34.2 to -30.2 | -20.12 | -24.1 to -16.2 | -20.61 | -23.4 to -18.3 | -19.23 | -21.8 to -16.6 |
|  |  |  |  | 139 | 0.90 | 0.1 | 14 |  | 9 | -25.96 | -27.1 to -25 | -10.17 | -12.1 to -7.9 | -11.30 | -14.7 to -9.3 | -9.72 | -11.3 to -8.3 |
|  |  |  |  | 246 |  | 0.2 | 49 |  | 10 | -31.88 | -34.1 to -29.9 | -20.13 | -24 to -16.3 | -19.57 | -22.1 to -17.4 | -18.59 | -21 to -16.3 |
|  |  |  |  | 139 | 0.95 | 0.1 | 14 |  | 11 | -26.00 | -27 to -25 | -10.02 | -12 to -8 | -10.16 | -12.9 to -8.7 | -9.20 | -10.4 to -8.1 |
|  |  |  |  | 246 |  | 0.2 | 49 |  | 12 | -31.95 | -34.1 to -29.8 | -20.14 | -23.6 to -16.5 | -18.47 | -20.7 to -16.2 | -17.85 | -19.8 to -16 |
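A table styled like the one above could be built along the following lines. To keep the sketch self-contained, it uses a small hypothetical data frame in place of the output of summary(sim_screen_obj, show_full_inputs = TRUE), and the choice of columns to merge is an assumption made for illustration.

```r
library(flextable)

# Hypothetical stand-in for a summary table with repeated input values
tbl <- data.frame(
  n_sims = c(500, 500, 500, 500),
  sim_auc = c(0.70, 0.70, 0.75, 0.75),
  event_rate = c(0.1, 0.2, 0.1, 0.2),
  all_median = c(-25.97, -32.03, -25.99, -31.91)
)

ft <- flextable(tbl)

# Merge vertically adjacent cells that hold the same value in the chosen columns
ft <- merge_v(ft, j = c("n_sims", "sim_auc"))

# Draw borders around every cell so the merged regions are easy to see
ft <- theme_box(ft)

ft
```

The same steps can be chained with the %>% operator, starting from the summary() call instead of the stand-in data frame.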