Title: | Power Analysis for Moderation and Mediation |
Version: | 0.1.0 |
Description: | Power analysis and sample size determination for moderation, mediation, and moderated mediation in models fitted by structural equation modelling using the 'lavaan' package by Rosseel (2012) <doi:10.18637/jss.v048.i02> or by multiple regression. The package 'manymome' by Cheung and Cheung (2024) <doi:10.3758/s13428-023-02224-z> is used to specify the indirect paths or conditional indirect paths to be tested. |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
Config/testthat/parallel: | true |
Depends: | R (≥ 4.4.0) |
URL: | https://sfcheung.github.io/power4mome/ |
BugReports: | https://github.com/sfcheung/power4mome/issues |
Imports: | lavaan, stats, manymome (≥ 0.2.8), pbapply, parallel, pgnorm, lmhelprs (≥ 0.4.2), psych, yaml, graphics, methods |
Config/Needs/website: | rmarkdown |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-08-31 07:40:34 UTC; shufa |
Author: | Shu Fai Cheung |
Maintainer: | Shu Fai Cheung <shufai.cheung@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-09-04 14:20:16 UTC |
power4mome: Power Analysis for Moderation and Mediation
Description
Power analysis and sample size determination for moderation, mediation, and moderated mediation in models fitted by structural equation modelling using the 'lavaan' package by Rosseel (2012) doi:10.18637/jss.v048.i02 or by multiple regression. The package 'manymome' by Cheung and Cheung (2024) doi:10.3758/s13428-023-02224-z is used to specify the indirect paths or conditional indirect paths to be tested.
Author(s)
Maintainer: Shu Fai Cheung shufai.cheung@gmail.com (ORCID)
Authors:
See Also
Useful links:
Do a Test on Each Replication
Description
Do a test on each
replication in the output of
sim_out()
.
Usage
do_test(
sim_all,
test_fun,
test_args = list(),
map_names = c(fit = "fit"),
results_fun = NULL,
results_args = list(),
parallel = FALSE,
progress = FALSE,
ncores = max(1, parallel::detectCores(logical = FALSE) - 1)
)
Arguments
sim_all |
The output of
|
test_fun |
A function to do the
test. See 'Details' for the requirement
of this function. There are some
built-in test functions in |
test_args |
A list of arguments
to be passed to the |
map_names |
A named character
vector specifying how the content of
the element |
results_fun |
The function to be
used to extract the test results.
See |
results_args |
A list of
arguments to be passed to the
|
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
Details
The function do_test()
does
an arbitrary test in each
replication using the function set to
test_fun
.
Value
An object of the class test_out
,
which is a list of length equal to
sim_all
, one element for each
replication. Each element of the list
has two elements:
-
test
: The output of the function set totest_fun
. -
test_results
: The output of the function set toresults_fun
.
The role of do_test()
The function do_test()
is used by
the all-in-one function
power4test()
. Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
Major Test-Related Arguments
The test function (test_fun)
The function set to test_fun
,
the test function, usually
should work
on the output of lavaan::sem()
,
lmhelprs::many_lm()
, or
stats::lm()
, but can also be a
function that works on the output
of the function set to fit_function
when calling fit_model()
or
power4test()
(see fit_model_args
).
The function has two default requirements.
First, it has an argument fit
, to
be set to the output of
lavaan::sem()
or another output
stored in the element extra$fit
of
a replication in the sim_all
object. (This requirement can be
relaxed; see the section on map_names
.)
That is, the function definition
should be of this from: FUN(fit, ...)
. This is the form of all
test_*
functions in power4mome
.
If other arguments are to be passed
to the test function, they can be set
to test_args
as a named list.
Second, the test function must returns an output that (a) can be processed by the results function (see below), or (b) is of the required format for the output of a results function (see the next section). If it already returns an output of the required format, then there is no need to set the results function.
The results function (results_fun)
The test results will be extracted
from the output of test_fun
by the
function set to results_fun
,
the results function. If
the test_fun
already returns an
output of the expected format
(see below), then set results_fun
to NULL
, the default. The output
of test_fun
will be used for
estimating power.
The function set to results_fun
must accept the output of test_fun
,
as the first argument, and return a
named list (which can be a data frame)
or a named vector with some
of the following
elements:
-
est
: Optional. The estimate of a parameter, if applicable. -
se
: Optional. The standard error of the estimate, if applicable. -
cilo
: Optional. The lower limit of the confidence interval, if applicable. -
cihi
: Optional. The upper limit of the confidence interval, if applicable. -
sig
: Required. If1
, the test is significant. If0
, the test is not significant. If the test cannot be done for any reason, it should beNA
.
The results can then be used to estimate the power or Type I error of the test.
For example, if
the null hypothesis is false, then
the proportion of significant, that
is, the mean of the values of sig
across replications, is the power.
Built-in test functions
The package power4mome
has some ready-to-use
test functions:
Please refer to their help pages for examples.
The argument map_names
This argument is for developers using
a test function that has a different
name for the argument of the fit
object ("fit"
, the default).
If test_fun
is set to a function
that works on an output of, say,
lavaan::sem()
but the argument name
for the output is not fit
, the
mapping can be changed by
map_names
.
For example,
lavaan::parameterEstimates()
receives an output of lavaan::sem()
and reports the test results of model
parameters. However, the argument
name for the lavaan
output is
object.
To instruct do_test()
to
do the test correctly when setting
test_fun
to
lavaan::parameterEstimates
, add
map_names = c(object = "fit")
. The
element fit
in a replication will
then set to the argument object
of
lavaan::parameterEstimates()
.
The background for having the results_fun
argument
In the early development of
power4mome
, test_fun
is designed
to accept existing functions from
other packages, such as
manymome::indirect_effect()
. Their
outputs are not of the required
format for power analysis, and so
results functions are needed to
process their outputs. In the current
version of power4mome
, some
ready-to-use test functions, usually
wrappers of these existing functions
from other packages, have been
developed, and they no longer need
results functions to process the
output. The argument results_fun
is
kept for backward compatibility and
advanced users to use test functions
from other packages.
See Also
See power4test()
for
the all-in-one function that uses
this function. See test_indirect_effect()
,
test_cond_indirect()
,
test_cond_indirect_effects()
,
test_moderation()
,
test_index_of_mome()
, and
test_parameters()
for examples of
test functions.
Examples
# Specify the population model
mod <-
"
m ~ x
y ~ m + x
"
# Specify the effect sizes (population parameter values)
es <-
"
y ~ m: m
m ~ x: m
y ~ x: n
"
# Generate several simulated datasets
data_all <- sim_data(nrep = 5,
model = mod,
pop_es = es,
n = 100,
iseed = 1234)
# Fit the population model to each datasets
fit_all <- fit_model(data_all)
# Generate Monte Carlo estimates for forming Monte Carlo confidence intervals
mc_all <- gen_mc(fit_all,
R = 50,
iseed = 4567)
# Combine the results to a 'sim_all' object
sim_all <- sim_out(data_all = data_all,
fit = fit_all,
mc_out = mc_all)
# Test the indirect effect in each replication
# Set `parallel` to TRUE for faster, usually much faster, analysis
# Set `progress` to TRUE to display the progress of the analysis
test_all <- do_test(sim_all,
test_fun = test_indirect_effect,
test_args = list(x = "x",
m = "m",
y = "y",
mc_ci = TRUE),
parallel = FALSE,
progress = FALSE)
# The result for each dataset
lapply(test_all, function(x) x$test_results)
Fit a Model to a List of Datasets
Description
Get the output of
sim_data()
and fit a model to each
of the stored datasets.
Usage
fit_model(
data_all = NULL,
model = NULL,
fit_function = "lavaan",
arg_data_name = "data",
arg_model_name = "model",
arg_group_name = "group",
...,
fit_out = NULL,
parallel = FALSE,
progress = FALSE,
ncores = max(1, parallel::detectCores(logical = FALSE) - 1)
)
Arguments
data_all |
The output
of |
model |
The model to be fitted.
If |
fit_function |
The function to
be used to fit the model. Can also
be a string: |
arg_data_name |
The name of the
argument of |
arg_model_name |
The name of
the argument of |
arg_group_name |
The name of
the argument of |
... |
Optional arguments to be
passed to |
fit_out |
If set to a |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
Details
By default, the function fit_model()
extracts the model stored in the output of
sim_data()
,fits the model to each dataset simulated using
fit_function
, default to"lavaan"
andlavaan::sem()
will be called,and returns the results.
If the datasets
were generated from a multigroup
model when calling sim_data()
,
a multigroup model is fitted.
Value
An object of the class fit_out
,
which is a list of the output of
fit_function
(lavaan::sem()
by default). If an error occurred
when fitting the model to a dataset,
then this element will be the error
message from the fit function.
The role of fit_model()
This function is used by the
all-in-one function power4test()
.
Users usually do not call this
function directly, though
developers can use this function to
customize the model fitting step in
power analysis.
See Also
See power4test()
for
the all-in-one function that uses
this function, and sim_data()
for the function generating datasets
for this function.
Examples
# Specify the population model
mod <-
"m ~ x
y ~ m + x"
# Specify the effect sizes (population parameter values)
es <-
"
y ~ m: m
m ~ x: m
y ~ x: n
"
# Generate several simulated datasets
data_all <- sim_data(nrep = 5,
model = mod,
pop_es = es,
n = 100,
iseed = 1234)
# Fit the population model to each datasets
fit_all <- fit_model(data_all)
fit_all[[1]]
# Fit the population model using the MLR estimator
fit_all_mlr <- fit_model(data_all,
estimator = "MLR")
fit_all_mlr[[1]]
# Fit a model different from the population model,
# with the MLR estimator
mod2 <-
"m ~ x
y ~ m"
fit_all_mlr2 <- fit_model(data_all,
mod2,
estimator = "MLR")
fit_all_mlr2[[1]]
Generate Bootstrap Estimates
Description
Get a list of the output
of lavaan::sem()
or lmhelprs::many_lm()
and generate
bootstrap estimates of model
parameters.
Usage
gen_boot(
fit_all,
R = 100,
...,
iseed = NULL,
parallel = FALSE,
progress = FALSE,
ncores = max(1, parallel::detectCores(logical = FALSE) - 1),
compute_implied_stats = FALSE
)
Arguments
fit_all |
The output of
|
R |
The number of replications to generate the bootstrap estimates for each fit output. |
... |
Optional arguments to be
passed to |
iseed |
The seed for the random
number generator. Default is |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
compute_implied_stats |
Whether
implied statistics are computed
in each bootstrap samples. Usually
not needed and so default to |
Details
The function gen_boot()
simply calls manymome::do_boot()
on each output of
lavaan::sem()
or lmhelprs::many_lm()
in fit_all
. The
simulated
estimates can then be used to test
effects such as indirect effects,
usually by functions from the
manymome
package, such as
manymome::indirect_effect()
.
Value
A boot_list
object, which is a list
of the output of manymome::do_boot()
.
The role of gen_boot()
This function is used by the
all-in-one function power4test()
.
Users usually do not call this
function directly, though
developers can use this function to
customize the workflow of the
power analysis.
See Also
See power4test()
for
the all-in-one function that uses
this function.
Examples
# Specify the population model
mod <-
"m ~ x
y ~ m + x"
# Specify the effect sizes (population parameter values)
es <-
"
y ~ m: m
m ~ x: m
y ~ x: n
"
# Generate several simulated datasets
data_all <- sim_data(nrep = 2,
model = mod,
pop_es = es,
n = 50,
iseed = 1234)
# Fit the population model to each datasets
fit_all <- fit_model(data_all)
# Generate bootstrap estimates for each replication
boot_all <- gen_boot(fit_all,
R = 10,
iseed = 4567)
boot_all
Generate Monte Carlo Estimates
Description
Get a list of the output
of lavaan::sem()
and generate
Monte Carlo estimates of model
parameters.
Usage
gen_mc(
fit_all,
R = 100,
...,
iseed = NULL,
parallel = FALSE,
progress = FALSE,
ncores = max(1, parallel::detectCores(logical = FALSE) - 1),
compute_implied_stats = FALSE
)
Arguments
fit_all |
The output of
|
R |
The number of replications to generate the Monte Carlo estimates for each fit output. |
... |
Optional arguments to be
passed to |
iseed |
The seed for the random
number generator. Default is |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
compute_implied_stats |
Whether
implied statistics are computed
in each Monte Carlo replication. Usually
not needed and so default to |
Details
The function gen_mc()
simply calls
manymome::do_mc()
on each output of
lavaan::sem()
in fit_all
. The
simulated
estimates can then be used to test
effects such as indirect effects,
usually by functions from the
manymome
package, such as
manymome::indirect_effect()
.
Value
An mc_list
object, which is a list
of the output of manymome::do_mc()
.
The role of gen_mc()
This function is used by the
all-in-one function power4test()
.
Users usually do not call this
function directly, though
developers can use this function to
customize the workflow of the
power analysis.
See Also
See power4test()
for
the all-in-one function that uses
this function.
Examples
# Specify the population model
mod <-
"m ~ x
y ~ m + x"
# Specify the effect sizes (population parameter values)
es <-
"
y ~ m: m
m ~ x: m
y ~ x: n
"
# Generate several simulated datasets
data_all <- sim_data(nrep = 5,
model = mod,
pop_es = es,
n = 100,
iseed = 1234)
# Fit the population model to each datasets
fit_all <- fit_model(data_all)
# Generate Monte Carlo estimates for each replication
mc_all <- gen_mc(fit_all,
R = 100,
iseed = 4567)
mc_all
Plot a Power Curve
Description
Plotting the results
in a 'power_curve' object, such as the
estimated power against sample size,
or the results of power4test_by_n()
or power4test_by_es()
.
Usage
## S3 method for class 'power_curve'
plot(
x,
what = c("ci", "power_curve"),
main = paste0("Power Curve ", "(Predictor: ", switch(x$predictor, n = "Sample Size", es
= "Effect Size"), ")"),
xlab = switch(x$predictor, n = "Sample Size", es = "Effect Size"),
ylab = "Estimated Power",
pars_ci = list(),
type = "l",
ylim = c(0, 1),
ci_level = 0.95,
...
)
## S3 method for class 'power4test_by_n'
plot(
x,
what = c("ci", "power_curve"),
main = "Estimated Power vs. Sample Size",
xlab = "Sample Size",
ylab = "Estimated Power",
pars_ci = list(),
type = "l",
ylim = c(0, 1),
ci_level = 0.95,
...
)
## S3 method for class 'power4test_by_es'
plot(
x,
what = c("ci", "power_curve"),
main = paste0("Estimated Power vs. Effect Size / Parameter (", attr(x[[1]],
"pop_es_name"), ")"),
xlab = paste0("Effect Size / Parameter (", attr(x[[1]], "pop_es_name"), ")"),
ylab = "Estiamted Power",
pars_ci = list(),
type = "l",
ylim = c(0, 1),
ci_level = 0.95,
...
)
Arguments
x |
The object to be plotted.
It can be a |
what |
A character vector of
what to include in the
plot. Possible values are
|
main |
The title of the plot. |
xlab , ylab |
The labels for the horizontal and vertical axes, respectively. |
pars_ci |
A named list of
arguments to be passed to |
type |
An argument of the
default plot method |
ylim |
A two-element numeric vector of the range of the vertical axis. |
ci_level |
The level of
confidence of the confidence intervals,
if requested. Default is |
... |
Optional arguments.
Passed to |
Details
The plot
method of power_curve
objects currently plots the relation
between estimated power and
the predictor. Other elements
can be requested (see the argument
what
), and they can be customized
individually.
Value
The plot
-methods return x
invisibly. They
are called for their side effects.
See Also
power_curve()
,
power4test_by_n()
, and
power4test_by_es()
.
Examples
# Specify the population model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the effect sizes (population parameter values)
model_simple_med_es <-
"
y ~ m: l
m ~ x: m
y ~ x: s
"
# Simulate datasets to check the model
sim_only <- power4test(nrep = 10,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 50,
fit_model_args = list(fit_function = "lm"),
do_the_test = FALSE,
iseed = 1234,
parallel = FALSE,
progress = FALSE)
# By n: Do a test for different sample sizes
# Set `parallel` to TRUE for faster, usually much faster, analysis
# Set `progress` to TRUE to display the progress of the analysis
out1 <- power4test_by_n(sim_only,
nrep = 10,
test_fun = test_parameters,
test_args = list(par = "y~x"),
n = c(25, 50, 100),
by_seed = 1234,
parallel = FALSE,
progress = FALSE)
pout1 <- power_curve(out1)
pout1
plot(pout1)
# By pop_es: Do a test for different population values of a model parameter
# Set `parallel` to TRUE for faster, usually much faster, analysis
# Set `progress` to TRUE to display the progress of the analysis
out2 <- power4test_by_es(sim_only,
nrep = 10,
test_fun = test_parameters,
test_args = list(par = "y~x"),
pop_es_name = "y ~ x",
pop_es_values = c(0, .3, .5),
by_seed = 1234,
parallel = FALSE,
progress = FALSE)
pout2 <- power_curve(out2)
plot(pout2)
Plot The Results of 'x_from_power'
Description
It plots the results of 'x_from_power', such as the estimated power against sample size.
Usage
## S3 method for class 'x_from_power'
plot(
x,
what = c("ci", "power_curve", "final_x", "final_power", "target_power", switch(x$x, n =
"sig_area", es = NULL)),
text_what = c("final_x", "final_power", switch(x$x, n = "sig_area", es = NULL)),
digits = 3,
main = paste0("Power Curve ", "(Target Power: ", formatC(x$target_power, digits =
digits, format = "f"), ")"),
xlab = NULL,
ylab = "Estimated Power",
ci_level = 0.95,
pars_ci = list(),
pars_power_curve = list(),
pars_ci_final_x = list(lwd = 2, length = 0.2, col = "blue"),
pars_target_power = list(lty = "dashed", lwd = 2, col = "black"),
pars_final_x = list(lty = "dotted"),
pars_final_power = list(lty = "dotted", col = "blue"),
pars_text_final_x = list(y = 0, pos = 3, cex = 1),
pars_text_final_power = list(pos = 3, cex = 1),
pars_sig_area = list(col = adjustcolor("lightblue", alpha.f = 0.1)),
pars_text_sig_area = list(cex = 1),
...
)
## S3 method for class 'n_region_from_power'
plot(
x,
what = c("ci", "power_curve", "final_x", "final_power", "target_power", "sig_area"),
text_what = c("final_x", "final_power", "sig_area"),
digits = 3,
main = paste0("Power Curve ", "(Target Power: ", formatC(x$below$target_power, digits =
digits, format = "f"), ")"),
xlab = NULL,
ylab = "Estimated Power",
ci_level = 0.95,
pars_ci = list(),
pars_power_curve = list(),
pars_ci_final_x = list(lwd = 2, length = 0.2, col = "blue"),
pars_target_power = list(lty = "dashed", lwd = 2, col = "black"),
pars_final_x = list(lty = "dotted"),
pars_final_power = list(lty = "dotted", col = "blue"),
pars_text_final_x = list(y = 0, pos = 3, cex = 1),
pars_text_final_power = list(cex = 1),
pars_sig_area = list(col = adjustcolor("lightblue", alpha.f = 0.1)),
pars_text_sig_area = list(cex = 1),
...
)
Arguments
x |
An |
what |
A character vector of
what to include in the
plot. Possible values are
|
text_what |
A character vector
of what numbers to be added as
labels. Possible values are
|
digits |
The number of digits after the decimal that will be used when adding numbers. |
main |
The title of the plot. |
xlab , ylab |
The labels for the horizontal and vertical axes, respectively. |
ci_level |
The level of
confidence of the confidence intervals,
if requested. Default is |
pars_ci |
A named list of
arguments to be passed to |
pars_power_curve |
A named list of
arguments to be passed to |
pars_ci_final_x |
A named list of
arguments to be passed to |
pars_target_power |
A named list
of arguments to be passed to |
pars_final_x |
A
named list of arguments to be passed
to |
pars_final_power |
A
named list of arguments to be passed
to |
pars_text_final_x |
A
named list of arguments to be passed
to |
pars_text_final_power |
A
named list of arguments to be passed
to |
pars_sig_area |
A named list
of arguments to be passed to
|
pars_text_sig_area |
A named list
of arguments to be passed to
|
... |
Optional arguments.
Passed to |
Details
The plot
method of x_from_power
objects currently plots the relation
between estimated power and
the values examined by x_from_power()
.
Other elements
can be requested (see the argument
what
), and they can be customized
individually.
The plot
-method for
n_region_from_power
objects is
a modified version of the plot
-method
for x_from_power
. It plots the
results of two runs of n_from_power()
in one plot. It is otherwise similar
to the plot
-method for x_from_power
.
Value
The plot
-method of x_from_power
returns x
invisibly.
It is called for its side effect.
The plot
-method of n_region_from_power
returns x
invisibly.
It is called for its side effect.
See Also
Examples
# Specify the population model
mod <-
"
m ~ x
y ~ m + x
"
# Specify the population values
mod_es <-
"
m ~ x: m
y ~ m: l
y ~ x: n
"
# Generate the datasets
sim_only <- power4test(nrep = 10,
model = mod,
pop_es = mod_es,
n = 100,
do_the_test = FALSE,
iseed = 1234)
# Do a test
test_out <- power4test(object = sim_only,
test_fun = test_parameters,
test_args = list(pars = "m~x"))
# Determine the sample size with a power of .80 (default)
power_vs_n <- x_from_power(test_out,
x = "n",
progress = TRUE,
target_power = .80,
final_nrep = 10,
max_trials = 1,
seed = 2345)
plot(power_vs_n)
Parse YAML-Stye Values For 'pop_es'
Description
Convert a YAML string
to a vector or list of pop_es
specification.
Usage
pop_es_yaml(text)
Arguments
text |
The multiline string to be parsed to a specification of population values. |
Details
The function pop_es_yaml()
allows users to specify population
values of a model using one single
string, as in 'lavaan' model syntax.
Value
Either a named vector (for a single-group model) or a list of named vector (for a multigroup model) of the population values of the parameters (the effect sizes).
Specify 'pop_es' Using a Multiline String
When setting the argument pop_es
,
instead of using a named vector
or named list for
pop_es
, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml()
.
Single-Group Model
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with
tag:
.-
tag
can be the name of a parameter, inlavaan
model syntax format.For example,
m ~ x
denotes the path fromx
tom
.
A tag in
lavaan
model syntax can specify more than one parameter using+
.For example,
y ~ m + x
denotes the two paths fromm
andx
toy
.
Alternatively, the
tag
can be either.beta.
or.cov.
.Use
.beta.
to set the default values for all regression coefficients.Use
.cov.
to set the default values for all correlations of exogenous variables (e.g., predictors).
-
After each tag is the value of the population value:
-
nil
for nil (zero),-
s
for small, -
m
for medium, and -
l
for large. -
si
,mi
, andli
for small, medium, and large a standardized indirect effect, respectively.
Note:
n
cannot be used in this mode.The value for each label is determined by
es1
andes2
as described inptable_pop()
.The value can also be set to a numeric value, such as
.30
or-.30
.
-
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small
, while
the path from m
to y
is large.
Multigroup Model
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value
.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed by two or more lines. Each line starts with a hyphen
-
and the value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x
to m
for the two
groups are 0 (nil
) and
small (s
), respectively.
Another equivalent way to specify
the values are using []
, on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta.
and .cov.
can
also be used for multigroup models.
Which Approach To Use
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
Technical Details
The multiline string is parsed by yaml::read_yaml()
.
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
See Also
ptable_pop()
,
power4test()
, and other functions
that have the pop_es
argument.
Examples
mod_es <- c("y ~ m" = "l",
"m ~ x" = "m",
"y ~ x" = "nil")
mod_es_yaml <-
"
y ~ m: l
m ~ x: m
y ~ x: nil
"
pop_es_yaml(mod_es_yaml)
Estimate the Power of a Test
Description
An all-in-one function that receives a model specification, generates datasets, fits a model, does the target test, and returns the test results.
Usage
power4test(
object = NULL,
nrep = NULL,
ptable = NULL,
model = NULL,
pop_es = NULL,
standardized = TRUE,
n = NULL,
number_of_indicators = NULL,
reliability = NULL,
x_fun = list(),
e_fun = list(),
process_data = NULL,
fit_model_args = list(),
R = NULL,
ci_type = "mc",
gen_mc_args = list(),
gen_boot_args = list(),
test_fun = NULL,
test_args = list(),
map_names = c(fit = "fit"),
results_fun = NULL,
results_args = list(),
test_name = NULL,
test_note = NULL,
do_the_test = TRUE,
sim_all = NULL,
iseed = NULL,
parallel = FALSE,
progress = TRUE,
ncores = max(1, parallel::detectCores(logical = FALSE) - 1),
es1 = c(n = 0, nil = 0, s = 0.1, m = 0.3, l = 0.5, si = 0.141, mi = 0.361, li = 0.51),
es2 = c(n = 0, nil = 0, s = 0.05, m = 0.1, l = 0.15),
es_ind = c("si", "mi", "li"),
n_std = 1e+05,
std_force_monte_carlo = FALSE
)
## S3 method for class 'power4test'
print(
x,
what = c("data", "test"),
digits = 3,
digits_descriptive = 2,
data_long = FALSE,
test_long = FALSE,
fit_to_all_args = list(),
...
)
Arguments
object |
Optional. If set to a
|
nrep |
The number of replications
to generate the simulated datasets.
Default is |
ptable |
The output of
|
model |
The |
pop_es |
The character vector or
multiline string to
specify population effect sizes
(population values of parameters). See
the help page on how to specify this
argument.
Ignored if |
standardized |
Logical. If
|
n |
The sample size for each dataset. Default is 100. |
number_of_indicators |
A named
vector to specify the number of
indicators for each factors. See
the help page on how to set this
argument. Default is |
reliability |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the reliability coefficient
of each set of indicators. Default
is |
x_fun |
The function(s) used to
generate the exogenous variables or
error terms. If
not supplied, or set to |
e_fun |
The function(s) used to
generate the error terms of indicators,
if any. If
not supplied, or set to |
process_data |
If not |
fit_model_args |
A list of the
arguments to be passed to |
R |
The number of replications
to generate the Monte Carlo or
bootstrapping estimates
for each fit output. No Monte Carlo
nor bootstrapping
estimates will be generated if |
ci_type |
The type of
simulation-based confidence
intervals to use. Can be either
|
gen_mc_args |
A list of
arguments to be passed to
|
gen_boot_args |
A list of
arguments to be passed to
|
test_fun |
A function to do the
test. See 'Details' for the requirement
of this function. There are some
built-in test functions in |
test_args |
A list of arguments
to be passed to the |
map_names |
A named character
vector specifying how the content of
the element |
results_fun |
The function to be
used to extract the test results.
See |
results_args |
A list of
arguments to be passed to the
|
test_name |
String. The name
of the test. Default is |
test_note |
String. An optional
note for the test, stored in the
attribute |
do_the_test |
If |
sim_all |
If set to either a
|
iseed |
The seed for the random
number generator. Default is |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
es1 |
Set the
values for each label of the effect
size (population value) for correlations and regression
paths.
Used only if |
es2 |
Set the
values for each label of the effect
size (population value) for product term.
Used only if |
es_ind |
The names of labels denoting the effect size of an indirect effect. They will be used to determine the population values of the component paths along an indirect path. |
n_std |
The sample size used to
determine the error variances by
simulation when |
std_force_monte_carlo |
Logical.
If |
x |
The object to be printed. |
what |
A string vector of
what to print, |
digits |
The numbers of digits displayed after the decimal. |
digits_descriptive |
The number of digits displayed after the decimal for the descriptive statistics table. |
data_long |
If |
test_long |
If |
fit_to_all_args |
A named list
of arguments to be passed to
|
... |
Optional arguments to be passed to other print methods |
Details
The function power4test()
is an
all-in-one function for
estimating the power of a test for
a model, given the sample size
and effect sizes (population values
of model parameters).
Value
An object of the class power4test
,
which is a list with two elements:
-
sim_all
: The output ofsim_out()
. -
test_all
: A named list of the output ofdo_test()
. The names are the values oftest_name
. This list can have more than one test because a call topower4test()
can add new tests to apower4test
object.
The print
method of power4test
returns x
invisibly. Called for
its side effect.
Workflow
This is the workflow:
If
object
is an output of the output of a previous call topower4test()
withdo_the_test
set toFALSE
and so has only the model and the simulated data, the following steps will be skipped and go directly to doing the test.Call
sim_data()
to determine the population model and generate the datasets, using arguments such asmodel
andpop_es
.Call
fit_model()
to fit a model to each of the datasets, which is the population model by default.If
R
is notNULL
andci_type = "mc"
, callgen_mc()
to generate Monte Carlo estimates usingmanymome::do_mc()
. The estimates can be used by supported functions such astest_indirect_effect()
.If
R
is notNULL
andci_type = "boot"
, callgen_boot()
to generate bootstrap estimates usingmanymome::do_boot()
. The estimates can be used by supported functions such astest_indirect_effect()
.Merge the results into a
sim_out
object by callingsim_out()
.If
do_the_test
isFALSE
, skip the remaining steps and return apower4test
object, which contains only the data generated and optionally the Monte Carlo or bootstrap estimates.
If
do_the_test
isTRUE
, do the test.-
do_test()
will be called to do the test in the fit output of each dataset.
-
Return a
power4test
object which include the output ofsim_out
and, ifdo_the_test
isTRUE
, the output ofdo_test()
.
This function is to be used when users are interested only in the power of one or several tests on a particular aspect of the model, such as a parameter, given a specific effect sizes and sample sizes.
Detailed description on major arguments can be found in sections below.
NOTE: The technical internal workflow of
of power4test()
can be found in
this page: https://sfcheung.github.io/power4mome/articles/power4test_workflow.html.
Updating a Condition
The function power4test()
can also be used to
update a condition when only some
selected aspects is to be changed.
For example,
instead of calling this function with
all the arguments set just to change
the sample size, it can be called
by supplying an existing
power4test
object and set only
n
to a new sample size. The data
and the tests will be updated
automatically. See the examples for
an illustration.
Adding Another Test
The function power4test()
can also be used to
add a test to the output from a
previous call to power4test()
.
For example, after simulating the
datasets and doing one test,
the output can be set to object
of power4test()
, and set only
test_fun
and, optionally,
test_fun_args
to do one more test
on the generated datasets. The output
will be the original
object with the results of the new
test added. See the examples for
an illustration.
Model Fitting Arguments ('fit_model_args')
For power analysis, usually, the
population model (model
) is to be
fitted, and there is no need to
set fit_model_args
.
If power analysis is to be conducted
for fitting a model that is not the
population model, of if non-default
settings are desired when fitting
a model, then the argument fit_model_args
needed to be set to customize the
call to fit_model()
.
For example,
users may want to examine the power
of a test when a misspecified model
is fitted, or the power of a test
when MLR is used as the estimator
when calling lavaan::sem()
.
See the help page of fit_model()
for some examples.
Specify the Population Model by 'model'
Single-Group Model
For a single-group model, model
should be a lavaan
model syntax
string of the form of the model.
The population values of the model
parameters are to be determined by
pop_es
.
If the model has latent factors,
the syntax in model
should specify
only the structural model for the
latent factors. There is no
need to specify the measurement
part. Other functions will generate
the measurement part on top of this
model.
For example, this is a simple mediation model:
"m ~ x y ~ m + x"
Whether m
, x
, and y
denote
observed variables or latent factors
are determined by other functions,
such as power4test()
.
Multigroup Model
Because the model is the population
model, equality constraints are
irrelevant and the model syntax
specifies only the form of the
model. Therefore, model
is
specified as in the case of single
group models.
Specify 'pop_es' Using Named Vectors
The argument pop_es
is for specifying
the population values of model
parameters. This section describes
how to do this using named vectors.
Single-Group Model
If pop_es
is specified by a named
vector, it must follow the convention
below.
The names of the vectors are
lavaan
names for the selected parameters. For example,m ~ x
denotes the path fromx
tom
.Alternatively, the names can be either
".beta."
or".cov."
. Use".beta."
to set the default values for all regression coefficients. Use".cov."
to set the default values for all correlations of exogenous variables (e.g., predictors).The names can also be of this form:
".ind.(<path>)"
, whether<path>
denote path in the model. For example,".ind.(x->m->y)"
denotes the path fromx
throughm
toy
. Alternatively, thelavaan
symbol~
can also be used:".ind.(y~m~x)"
. This form is used to set the indirect effect (standardized, by default) along this path. The value for this name will override other settings.If using
lavaan
names, can specify more than one parameter using+
. For example,y ~ m + x
denotes the two paths fromm
andx
toy
.The value of each element can be the label for the effect size:
n
for nil,s
for small,m
for medium, andl
for large. The value for each label is determined byes1
andes2
. See the section on specifying these two arguments.The value of
pop_es
can also be set to a value, but it must be quoted as a string, such as"y ~ x" = ".31"
.
This is an example:
c(".beta." = "s", "m1 ~ x" = "-m", "m2 ~ m1" = "l", "y ~ x:w" = "s")
In this example,
All regression coefficients are set to small (
s
) by default, unless specified otherwise.The path from
x
tom1
is set to medium and negative (-m
).The path from
m1
tom2
is set to large (l
).The coefficient of the product term
x:w
when predictingy
is set to small (s
).
Indirect Effect
When setting an indirect effect to
a symbol (default: "si"
, "mi"
,
"li"
, with "i"
added to differentiate
them from the labels for a direct path),
the corresponding value is used to
determine the population values of
all component paths along the pathway.
All the values are assumed to be equal.
Therefore, ".ind.(x->m->y)" = ".20"
is equivalent to setting m ~ x
and y ~ m
to the square root of
.20, such that the corresponding
indirect effect is equal to the
designated value.
This behavior, though restricted, is for quick manipulation of the indirect effect. If different values along a pathway, set the value for each path directly.
Only nonnegative value is supported.
Therefore, ".ind.(x->m->y)" = "-si"
and ".ind.(x->m->y)" = "-.20"
will
throw an error.
Multigroup Model
The argument pop_es
also supports multigroup
models.
For pop_es
, instead of
named vectors, named list of
named vectors should be used.
The names are the parameters, or keywords such as
.beta.
and.cov.
, like specifying the population values for a single group model.The elements are character vectors. If it has only one element (e.g., a single string), then it is the the population value for all groups. If it has more than one element (e.g., a vector of three strings), then they are the population values of the groups. For a model of k groups, each vector must have either k elements or one element.
This is an example:
list("m ~ x" = "m", "y ~ m" = c("s", "m", "l"))
In this model, the population value
of the path m ~ x
is medium (m
) for
all groups, while the population
values for the path y ~ m
are
small (s
), medium (m
), and large (l
),
respectively.
Specify 'pop_es' Using a Multiline String
When setting the argument pop_es
,
instead of using a named vector
or named list for
pop_es
, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml()
.
Single-Group Model
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with
tag:
.-
tag
can be the name of a parameter, inlavaan
model syntax format.For example,
m ~ x
denotes the path fromx
tom
.
A tag in
lavaan
model syntax can specify more than one parameter using+
.For example,
y ~ m + x
denotes the two paths fromm
andx
toy
.
Alternatively, the
tag
can be either.beta.
or.cov.
.Use
.beta.
to set the default values for all regression coefficients.Use
.cov.
to set the default values for all correlations of exogenous variables (e.g., predictors).
-
After each tag is the value of the population value:
-
nil
for nil (zero),-
s
for small, -
m
for medium, and -
l
for large. -
si
,mi
, andli
for small, medium, and large a standardized indirect effect, respectively.
Note:
n
cannot be used in this mode.The value for each label is determined by
es1
andes2
as described inptable_pop()
.The value can also be set to a numeric value, such as
.30
or-.30
.
-
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small
, while
the path from m
to y
is large.
Multigroup Model
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value
.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed by two or more lines. Each line starts with a hyphen
-
and the value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x
to m
for the two
groups are 0 (nil
) and
small (s
), respectively.
Another equivalent way to specify
the values are using []
, on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta.
and .cov.
can
also be used for multigroup models.
Which Approach To Use
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
Technical Details
The multiline string is parsed by yaml::read_yaml()
.
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
Set the Values for Effect Size Labels ('es1' and 'es2')
The vector es1
is for correlations,
regression coefficients, and
indirect effect, and the
vector es2
is for for standardized
moderation effect, the coefficients
of a product term. These labels
are to be used in interpreting
the specification in pop_es
.
Set 'number_of_indicators' and 'reliability'
The arguments number_of_indicators
and reliability
are used to
specify the number of indicators
(e.g., items) for each factor,
and the population reliability
coefficient of each factor,
if the variables in the model
syntax are latent variables.
Single-Group Model
If a variable in the model is to be
replaced by indicators in the generated
data, set
number_of_indicators
to a named
numeric vector. The names are the
variables of variables with
indicators, as appeared in the
model
syntax. The value of each
name is the number of indicators.
The
argument reliability
should then be
set a named numeric vector (or list,
see the section on multigroup models)
to specify the population reliability
coefficient ("omega") of each set of
indicators. The population standardized factor
loadings are then computed to ensure
that the population reliability
coefficient is of the target value.
These are examples for a single group model:
number of indicator = c(m = 3, x = 4, y = 5)
The numbers of indicators for m
,
x
, and y
are 3, 4, and 5,
respectively.
reliability = c(m = .90, x = .80, y = .70)
The population reliability
coefficients of m
, x
, and y
are
.90, .80, and .70, respectively.
Multigroup Models
Multigroup models are supported.
The number of groups is inferred
from pop_es
(see the help page
of ptable_pop()
), or directly
from ptable
.
For a multigroup model, the number of indicators for each variable must be the same across groups.
However, the population reliability
coefficients can be different
across groups. For a multigroup model
of k groups,
with one or more population reliability
coefficients differ across groups,
the argument reliability
should be
set to a named list. The names are
the variables to which the population
reliability coefficients are to be
set. The element for each name is
either a single value for the common
reliability coefficient, or a
numeric vector of the reliability
coefficient of each group.
This is an example of reliability
for a model with 2 groups:
reliability = list(x = .80, m = c(.70, .80))
The reliability coefficients of x
are
.80 in all groups, while the
reliability coefficients of m
are
.70 in one group and .80 in another.
Equal Numbers of Indicators and/or Reliability Coefficients
If all variables in the model has
the same number of indicators,
number_of_indicators
can be set
to one single value.
Similarly, if all sets of indicators
have the same population reliability
in all groups, reliability
can also
be set to one single value.
Specify The Distributions of Exogenous Variables Or Error Terms Using 'x_fun'
By default, variables and error
terms are generated
from a multivariate normal distribution.
If desired, users can supply the
function used to generate an exogenous
variable and error term by setting x_fun
to
a named list.
The names of the list are the variables for which a user function will be used to generate the data.
Each element of the list must also be a list. The first element of this list, can be unnamed, is the function to be used. If other arguments need to be supplied, they should be included as named elements of this list.
For example:
x_fun = list(x = list(power4mome::rexp_rs), w = list(power4mome::rbinary_rs, p1 = .70)))
The variables x
and w
will be
generated by user-supplied functions.
For x
, the function is
power4mome::rexp_rs
. No additional
argument when calling this function.
For w
, the function is
power4mome::rbinary_rx
. The argument
p1 = .70
will be passed to this
function when generating the values
of w
.
If a variable is an endogenous
variable (e.g., being predicted by
another variable in a model), then
x_fun
is used to generate its
error term. Its implied population
distribution may still be different
from that generate by x_fun
because
the distribution also depends on the
distribution of other variables
predicting it.
These are requirements for the user-functions:
They must return a numeric vector.
They mush has an argument
n
for the number of values.The population mean and standard deviation of the generated values must be 0 and 1, respectively.
The package power4mome
has
helper functions for generating
values from some common nonnormal
distributions and then scaling them
to have population mean and standard
deviation equal to 0 and 1 (by default), respectively.
These are some of them:
To use x_fun
, the variables must
have zero covariances with other
variables in the population. It is
possible to generate nonnormal
multivariate data but we believe this
is rarely needed when estimating
power before having the data.
Major Test-Related Arguments
The test function (test_fun)
The function set to test_fun
,
the test function, usually
should work
on the output of lavaan::sem()
,
lmhelprs::many_lm()
, or
stats::lm()
, but can also be a
function that works on the output
of the function set to fit_function
when calling fit_model()
or
power4test()
(see fit_model_args
).
The function has two default requirements.
First, it has an argument fit
, to
be set to the output of
lavaan::sem()
or another output
stored in the element extra$fit
of
a replication in the sim_all
object. (This requirement can be
relaxed; see the section on map_names
.)
That is, the function definition
should be of this from: FUN(fit, ...)
. This is the form of all
test_*
functions in power4mome
.
If other arguments are to be passed
to the test function, they can be set
to test_args
as a named list.
Second, the test function must returns an output that (a) can be processed by the results function (see below), or (b) is of the required format for the output of a results function (see the next section). If it already returns an output of the required format, then there is no need to set the results function.
The results function (results_fun)
The test results will be extracted
from the output of test_fun
by the
function set to results_fun
,
the results function. If
the test_fun
already returns an
output of the expected format
(see below), then set results_fun
to NULL
, the default. The output
of test_fun
will be used for
estimating power.
The function set to results_fun
must accept the output of test_fun
,
as the first argument, and return a
named list (which can be a data frame)
or a named vector with some
of the following
elements:
-
est
: Optional. The estimate of a parameter, if applicable. -
se
: Optional. The standard error of the estimate, if applicable. -
cilo
: Optional. The lower limit of the confidence interval, if applicable. -
cihi
: Optional. The upper limit of the confidence interval, if applicable. -
sig
: Required. If1
, the test is significant. If0
, the test is not significant. If the test cannot be done for any reason, it should beNA
.
The results can then be used to estimate the power or Type I error of the test.
For example, if
the null hypothesis is false, then
the proportion of significant, that
is, the mean of the values of sig
across replications, is the power.
Built-in test functions
The package power4mome
has some ready-to-use
test functions:
Please refer to their help pages for examples.
The argument map_names
This argument is for developers using
a test function that has a different
name for the argument of the fit
object ("fit"
, the default).
If test_fun
is set to a function
that works on an output of, say,
lavaan::sem()
but the argument name
for the output is not fit
, the
mapping can be changed by
map_names
.
For example,
lavaan::parameterEstimates()
receives an output of lavaan::sem()
and reports the test results of model
parameters. However, the argument
name for the lavaan
output is
object.
To instruct do_test()
to
do the test correctly when setting
test_fun
to
lavaan::parameterEstimates
, add
map_names = c(object = "fit")
. The
element fit
in a replication will
then set to the argument object
of
lavaan::parameterEstimates()
.
Examples
# Specify the model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the population values
model_simple_med_es <-
"
m ~ x: m
y ~ m: l
y ~ x: n
"
# Set nrep to a large number in real analysis, such as 400
# Set `parallel` to TRUE for faster, usually much faster, analysis
# Set `progress` to TRUE to display the progress of the analysis
out <- power4test(nrep = 10,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 100,
test_fun = test_parameters,
test_args = list(pars = "m~x"),
iseed = 1234,
parallel = FALSE,
progress = TRUE)
print(out,
test_long = TRUE)
# Change the sample size
out1 <- power4test(out,
n = 200,
iseed = 1234,
parallel = FALSE,
progress = TRUE)
print(out1,
test_long = TRUE)
# Add one more test
out2 <- power4test(out,
test_fun = test_parameters,
test_args = list(pars = "y~x"),
parallel = FALSE,
progress = TRUE)
print(out2,
test_long = TRUE)
Power By Effect Sizes
Description
Estimate power for a set of effect sizes (population values of a model parameter).
Usage
power4test_by_es(
object,
pop_es_name = NULL,
pop_es_values = NULL,
progress = TRUE,
...,
by_seed = NULL,
by_nrep = NULL,
save_sim_all = TRUE
)
## S3 method for class 'power4test_by_es'
c(..., sort = TRUE, skip_checking_models = FALSE)
as.power4test_by_es(original_object, pop_es_name)
## S3 method for class 'power4test_by_es'
print(x, print_all = FALSE, digits = 3, ...)
Arguments
object |
A |
pop_es_name |
The name of the
parameter. See the help page
of |
pop_es_values |
A numeric
vector of the population values
of the parameter specified in
|
progress |
Logical. Whether progress of the simulation will be displayed. |
... |
For |
by_seed |
If set to a number,
it will be used to generate the
seeds for each call to |
by_nrep |
If set to a number,
it will be used to generate the
number of replications ( |
save_sim_all |
If |
sort |
WHen combining the
objects, whether they will be sorted
by population values. Default is |
skip_checking_models |
Whether
the check of the data generation model
will be checked. Default is |
original_object |
The object
to be converted to a |
x |
The object to be printed. |
print_all |
If |
digits |
The numbers of digits displayed after the decimal. |
Details
The function power4test_by_es()
regenerates
datasets for a set of effect sizes
(population values of a model parmeter)
and does the stored tests in each of
them.
Optionally, it can also be run
on a object with no stored tests.
In this case, additional arguments
must be set to instruct power4test()
on the tests to be conducted.
It is usually used to examine the power over a sets of effect sizes (population values).
The c
method of power4test_by_es
objects
is used to combine tests from different
runs of power4test_by_es()
.
The function as.power4test_by_es()
is used to convert a power4test
object to a power4test_by_es
object, if it is not already one.
Useful when concatenating
power4test
objects with
power4test_by_es
objects.
Value
The function
power4test_by_es()
returns a
power4test_by_es
object, which is a
list of power4test
objects, one for
each population value of the parameter.
The method c.power4test_by_es()
returns
a power4test_by_es
object with
all the elements (tests for different
values of pop_es_values
) combined.
The function as.power4test_by_es()
returns
a power4test_by_es
object converted
from the input object.
The print
-method of power4test_by_es
objects returns the object invisibly.
It is called for its side-effect.
See Also
Examples
# Specify the model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the population values
model_simple_med_es <-
"
m ~ x: m
y ~ m: l
y ~ x: n
"
sim_only <- power4test(nrep = 2,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 100,
R = 40,
ci_type = "boot",
fit_model_args = list(fit_function = "lm"),
do_the_test = FALSE,
iseed = 1234)
test_out <- power4test(object = sim_only,
test_fun = test_indirect_effect,
test_args = list(x = "x",
m = "m",
y = "y",
boot_ci = TRUE,
mc_ci = FALSE))
out <- power4test_by_es(test_out,
pop_es_name = "y ~ m",
pop_es_values = c(.10, .20))
out_reject <- rejection_rates(out)
out_reject
Power By Sample Sizes
Description
Estimate power for a set of sample sizes.
Usage
power4test_by_n(
object,
n = NULL,
progress = TRUE,
...,
by_seed = NULL,
by_nrep = NULL,
save_sim_all = TRUE
)
## S3 method for class 'power4test_by_n'
c(..., sort = TRUE, skip_checking_models = FALSE)
as.power4test_by_n(original_object)
## S3 method for class 'power4test_by_n'
print(x, print_all = FALSE, ...)
Arguments
object |
A |
n |
A numeric vector of sample sizes for which the simulation will be conducted. |
progress |
Logical. Whether progress of the simulation will be displayed. |
... |
For |
by_seed |
If set to a number,
it will be used to generate the
seeds for each call to |
by_nrep |
If set to a number,
it will be used to generate the
number of replications ( |
save_sim_all |
If |
sort |
When combining the
objects, whether they will be sorted
by sample sizes. Default is |
skip_checking_models |
Whether
the check of the data generation model
will be checked. Default is |
original_object |
The object
to be converted to a |
x |
The object to be printed. |
print_all |
If |
Details
The function power4test_by_n()
regenerates
datasets for a set of sample sizes
and does the stored tests in each of
them.
Optionally, it can also be run
on a object with no stored tests.
In this case, additional arguments
must be set to instruct power4test()
on the tests to be conducted.
It is usually used to examine the power over a set of sample sizes.
The c
method of power4test_by_n
objects
is used to combine tests from different
runs of power4test_by_n()
.
The function as.power4test_by_n()
is used to convert a power4test
object to a power4test_by_n
object, if it is not already one.
Useful when concatenating
power4test
objects with
power4test_by_n
objects.
Value
The function
power4test_by_n()
returns a
power4test_by_n
object, which is a
list of power4test
objects, one for
each sample size.
The method c.power4test_by_n()
returns
a power4test_by_n
object with
all the elements (tests for different
sample sizes) combined.
The function as.power4test_by_n()
returns
a power4test_by_n
object converted
from the input object.
The print
-method of power4test_by_n
objects returns the object invisibly.
It is called for its side-effect.
See Also
Examples
# Specify the model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the population values
model_simple_med_es <-
"
m ~ x: m
y ~ m: l
y ~ x: n
"
sim_only <- power4test(nrep = 2,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 100,
R = 40,
ci_type = "boot",
fit_model_args = list(fit_function = "lm"),
do_the_test = FALSE,
iseed = 1234)
test_out <- power4test(object = sim_only,
test_fun = test_indirect_effect,
test_args = list(x = "x",
m = "m",
y = "y",
boot_ci = TRUE,
mc_ci = FALSE))
out <- power4test_by_n(test_out,
n = c(100, 110, 120))
out_reject <- rejection_rates(out)
out_reject
Power Curve
Description
Estimate the relation between power and a characteristic, such as sample size or population effect size (population value of a model parameter).
Usage
power_curve(
object,
formula = NULL,
start = NULL,
lower_bound = NULL,
upper_bound = NULL,
nls_args = list(),
nls_control = list(),
verbose = FALSE,
models = c("nls", "logistic", "lm")
)
## S3 method for class 'power_curve'
print(x, data_used = FALSE, digits = 3, right = FALSE, row.names = FALSE, ...)
Arguments
object |
An object of the class
|
formula |
A formula of the model
for |
start |
Either a named vector
of the start value(s) of parameter(s)
in |
lower_bound |
Either a named vector
of the lower bound(s) of parameter(s)
in |
upper_bound |
Either a named vector
of the upper bound(s) of parameter(s)
in |
nls_args |
A named list of
arguments to be used when calling
|
nls_control |
A named list of
arguments to be passed the |
verbose |
Logical. Whether messages will be printed when trying different models. |
models |
Models to try. Support
|
x |
A |
data_used |
Logical. Whether the rejection rates data frame used to fit the model is printed. |
digits , right , row.names |
Arguments of the same names used
by the |
... |
For the |
Details
The function power_curve()
retrieves the information
from the output of
power4test_by_n()
or
power4test_by_es()
, and
estimate the power curve: the
relation between the characteristic
varied, sample size for
power4test_by_n()
and the
population effect size for
power4test_by_es()
, and the
rejection rate of the test conducted
by power4test_by_n()
or
power4test_by_es()
. This
rejection rate is the power when the
null hypothesis is false (e.g., the
population value of the effect size
being tested is nonzero).
The model fitted is not intended to
be a precise model for the relation
across a wide range. It is only a
crude estimate based on the limited
number of values of the
characteristic (e.g., sample size)
examined, which can be as small as
four or even smaller. The model is
intended to be
used for only for the range covered,
and for estimating the probable
sample size or effect size with a
desirable level of power. This value
should then be studied by higher
precision through simulation
using functions such as
power4test()
.
These are the models to be tried, in the following order:
One or nonlinear models, to be fitted by
stats::nls()
. If several models are specified, all will be fitted and the one with the smallest deviance will be used.If all the nonlinear models failed, for whatever reason, a logistic regression will be fitted by
stats::glm()
to predict the binary significant test results.If the logistic model also failed, for whatever reason, a simple linear regression model will be fitted. Although the power curve is nonlinear across a wide range of, say, sample size, a linear model can still be a good enough approximation for a narrow range of the predictor.
The output can then be plotted to
visualize the power curve, using
the plot
method (plot.power_curve()
)
for the output
of power_curve()
.
This function can be used directly,
but is also used internally by
functions such as x_from_power()
.
Value
It returns a list which is a
power_curve
object, with the
following elements:
-
fit
: The model fitted, which is the output ofstats::nls()
,stats::glm()
, orstats::lm()
. -
reject_df
: The table of reject rates and other characteristics, which is generated byrejection_rates()
. -
predictor
: The predictor or the power curve, ether"n"
(sample size) or"es"
(population effect size). -
call
: The call used to run this function.
The print
method of power_curve
object returns x
invisibly. Called
for its side-effect.
See Also
power4test_by_n()
and power4test_by_es()
for the output supported by
power_curve()
, plot.power_curve()
for the plot
method and
predict.power_curve()
for the predict
method of the output
of power_curve()
.
Examples
# Specify the population model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the effect sizes (population parameter values)
model_simple_med_es <-
"
y ~ m: l
m ~ x: m
y ~ x: s
"
# Simulate datasets to check the model
# Set `parallel` to TRUE for faster, usually much faster, analysis
# Set `progress` to TRUE to display the progress of the analysis
sim_only <- power4test(nrep = 10,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 50,
fit_model_args = list(fit_function = "lm"),
do_the_test = FALSE,
iseed = 1234,
parallel = FALSE,
progress = FALSE)
# By n: Do a test for different sample sizes
out1 <- power4test_by_n(sim_only,
nrep = 10,
test_fun = test_parameters,
test_args = list(par = "y~x"),
n = c(25, 50, 100),
by_seed = 1234,
parallel = FALSE,
progress = FALSE)
pout1 <- power_curve(out1)
pout1
plot(pout1)
# By pop_es: Do a test for different population values of a model parameter
out2 <- power4test_by_es(sim_only,
nrep = 10,
test_fun = test_parameters,
test_args = list(par = "y~x"),
pop_es_name = "y ~ x",
pop_es_values = c(0, .3, .5),
by_seed = 1234,
parallel = FALSE,
progress = FALSE)
pout2 <- power_curve(out2)
pout2
plot(pout2)
Predict Method for a 'power_curve' Object
Description
Compute the predicted
values in a model fitted by
power_curve()
.
Usage
## S3 method for class 'power_curve'
predict(object, newdata, ...)
Arguments
object |
A |
newdata |
A data frame with
a column named |
... |
Additional arguments.
Passed to the corresponding
|
Details
The predict
method of power_curve
objects works in two modes.
If new
data is not supplied (through
newdata
), it retrieves the stored
results and calls the corresponding
methods to compute the predicted
values, which are the predicted
rejection rates (power levels if
the null hypothesis is false,
e.g., the population effect size is
equal to zero).
If new data is supplied, such as a named list with a vector of sample sizes, they will be used to compute the predicted rejection rates.
Value
It returns a numeric vector of the predicted rejection rates.
See Also
Examples
# Specify the population model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the effect sizes (population parameter values)
model_simple_med_es <-
"
y ~ m: l
m ~ x: m
y ~ x: s
"
# Simulate datasets to check the model
# Set `parallel` to TRUE for faster, usually much faster, analysis
# Set `progress` to TRUE to display the progress of the analysis
sim_only <- power4test(nrep = 10,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 50,
fit_model_args = list(fit_function = "lm"),
do_the_test = FALSE,
iseed = 1234,
parallel = FALSE,
progress = FALSE)
# By n: Do a test for different sample sizes
out1 <- power4test_by_n(sim_only,
nrep = 10,
test_fun = test_parameters,
test_args = list(par = "y~x"),
n = c(25, 100, 200, 1000),
by_seed = 1234,
parallel = FALSE,
progress = FALSE)
pout1 <- power_curve(out1)
pout1
predict(pout1,
newdata = list(x = c(150, 250, 500)))
# By pop_es: Do a test for different population values of a model parameter
out2 <- power4test_by_es(sim_only,
nrep = 10,
test_fun = test_parameters,
test_args = list(par = "y~x"),
pop_es_name = "y ~ x",
pop_es_values = seq(0, .7, .15),
by_seed = 1234,
parallel = FALSE,
progress = FALSE)
pout2 <- power_curve(out2)
pout2
predict(pout2,
newdata = list(x = c(.25, .55)))
Generate the Population Model
Description
Generate the complete population model using the model syntax and user-specified effect sizes (population parameter values).
Usage
ptable_pop(
model = NULL,
pop_es = NULL,
es1 = c(n = 0, nil = 0, s = 0.1, m = 0.3, l = 0.5, si = 0.141, mi = 0.361, li = 0.51),
es2 = c(n = 0, nil = 0, s = 0.05, m = 0.1, l = 0.15),
es_ind = c("si", "mi", "li"),
standardized = TRUE,
n_std = 1e+05,
std_force_monte_carlo = FALSE,
add_cov_for_moderation = TRUE
)
model_matrices_pop(x, ..., drop_list_single_group = TRUE)
Arguments
model |
String. The model defined
by |
pop_es |
It can be a data frame
with these columns: |
es1 |
Set the
values for each label of the effect
size (population value) for correlations and regression
paths.
Used only if |
es2 |
Set the
values for each label of the effect
size (population value) for product term.
Used only if |
es_ind |
The names of labels denoting the effect size of an indirect effect. They will be used to determine the population values of the component paths along an indirect path. |
standardized |
Logical. If
|
n_std |
The sample size used to
determine the error variances by
simulation when |
std_force_monte_carlo |
Logical.
If |
add_cov_for_moderation |
Logical. If |
x |
It can be a 'lavaan' model
syntax, to be passed to |
... |
If |
drop_list_single_group |
If
|
Details
The function ptable_pop()
generates a lavaan
parameter table that can be used
to generate data based on the population
values of model parameters.
Value
The function ptable_pop()
returns
a lavaan
parameter table of the
model, with the column start
set to the
population values.
The function model_matrices_pop()
returns a lavaan
LISREL-style model
matrices (like the output of
lavaan::lavInspect()
with what
set to "free"
), with matrix elements
set to the population values. If
x
is the model syntax, it will be
stored in the attributes model
.
If the model is a multigroup model
with k groups (k greater than 1),
then it returns a list of k lists
of lavaan
LISREL-style model
matrices unless drop_list_single_group
is TRUE
.
The role of ptable_pop()
The function ptable_pop()
is used by
the all-in-one function
power4test()
. Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
Specify the Population Model by 'model'
Single-Group Model
For a single-group model, model
should be a lavaan
model syntax
string of the form of the model.
The population values of the model
parameters are to be determined by
pop_es
.
If the model has latent factors,
the syntax in model
should specify
only the structural model for the
latent factors. There is no
need to specify the measurement
part. Other functions will generate
the measurement part on top of this
model.
For example, this is a simple mediation model:
"m ~ x y ~ m + x"
Whether m
, x
, and y
denote
observed variables or latent factors
are determined by other functions,
such as power4test()
.
Multigroup Model
Because the model is the population
model, equality constraints are
irrelevant and the model syntax
specifies only the form of the
model. Therefore, model
is
specified as in the case of single
group models.
Specify 'pop_es' Using Named Vectors
The argument pop_es
is for specifying
the population values of model
parameters. This section describes
how to do this using named vectors.
Single-Group Model
If pop_es
is specified by a named
vector, it must follow the convention
below.
The names of the vectors are
lavaan
names for the selected parameters. For example,m ~ x
denotes the path fromx
tom
.Alternatively, the names can be either
".beta."
or".cov."
. Use".beta."
to set the default values for all regression coefficients. Use".cov."
to set the default values for all correlations of exogenous variables (e.g., predictors).The names can also be of this form:
".ind.(<path>)"
, whether<path>
denote path in the model. For example,".ind.(x->m->y)"
denotes the path fromx
throughm
toy
. Alternatively, thelavaan
symbol~
can also be used:".ind.(y~m~x)"
. This form is used to set the indirect effect (standardized, by default) along this path. The value for this name will override other settings.If using
lavaan
names, can specify more than one parameter using+
. For example,y ~ m + x
denotes the two paths fromm
andx
toy
.The value of each element can be the label for the effect size:
n
for nil,s
for small,m
for medium, andl
for large. The value for each label is determined byes1
andes2
. See the section on specifying these two arguments.The value of
pop_es
can also be set to a value, but it must be quoted as a string, such as"y ~ x" = ".31"
.
This is an example:
c(".beta." = "s", "m1 ~ x" = "-m", "m2 ~ m1" = "l", "y ~ x:w" = "s")
In this example,
All regression coefficients are set to small (
s
) by default, unless specified otherwise.The path from
x
tom1
is set to medium and negative (-m
).The path from
m1
tom2
is set to large (l
).The coefficient of the product term
x:w
when predictingy
is set to small (s
).
Indirect Effect
When setting an indirect effect to
a symbol (default: "si"
, "mi"
,
"li"
, with "i"
added to differentiate
them from the labels for a direct path),
the corresponding value is used to
determine the population values of
all component paths along the pathway.
All the values are assumed to be equal.
Therefore, ".ind.(x->m->y)" = ".20"
is equivalent to setting m ~ x
and y ~ m
to the square root of
.20, such that the corresponding
indirect effect is equal to the
designated value.
This behavior, though restricted, is for quick manipulation of the indirect effect. If different values along a pathway, set the value for each path directly.
Only nonnegative value is supported.
Therefore, ".ind.(x->m->y)" = "-si"
and ".ind.(x->m->y)" = "-.20"
will
throw an error.
Multigroup Model
The argument pop_es
also supports multigroup
models.
For pop_es
, instead of
named vectors, named list of
named vectors should be used.
The names are the parameters, or keywords such as
.beta.
and.cov.
, like specifying the population values for a single group model.The elements are character vectors. If it has only one element (e.g., a single string), then it is the the population value for all groups. If it has more than one element (e.g., a vector of three strings), then they are the population values of the groups. For a model of k groups, each vector must have either k elements or one element.
This is an example:
list("m ~ x" = "m", "y ~ m" = c("s", "m", "l"))
In this model, the population value
of the path m ~ x
is medium (m
) for
all groups, while the population
values for the path y ~ m
are
small (s
), medium (m
), and large (l
),
respectively.
Specify 'pop_es' Using a Multiline String
When setting the argument pop_es
,
instead of using a named vector
or named list for
pop_es
, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml()
.
Single-Group Model
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with
tag:
.-
tag
can be the name of a parameter, inlavaan
model syntax format.For example,
m ~ x
denotes the path fromx
tom
.
A tag in
lavaan
model syntax can specify more than one parameter using+
.For example,
y ~ m + x
denotes the two paths fromm
andx
toy
.
Alternatively, the
tag
can be either.beta.
or.cov.
.Use
.beta.
to set the default values for all regression coefficients.Use
.cov.
to set the default values for all correlations of exogenous variables (e.g., predictors).
-
After each tag is the value of the population value:
-
nil
for nil (zero),-
s
for small, -
m
for medium, and -
l
for large. -
si
,mi
, andli
for small, medium, and large a standardized indirect effect, respectively.
Note:
n
cannot be used in this mode.The value for each label is determined by
es1
andes2
as described inptable_pop()
.The value can also be set to a numeric value, such as
.30
or-.30
.
-
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small
, while
the path from m
to y
is large.
Multigroup Model
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value
.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed by two or more lines. Each line starts with a hyphen
-
and the value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x
to m
for the two
groups are 0 (nil
) and
small (s
), respectively.
Another equivalent way to specify
the values are using []
, on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta.
and .cov.
can
also be used for multigroup models.
Which Approach To Use
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
Technical Details
The multiline string is parsed by yaml::read_yaml()
.
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
Set the Values for Effect Size Labels ('es1' and 'es2')
The vector es1
is for correlations,
regression coefficients, and
indirect effect, and the
vector es2
is for for standardized
moderation effect, the coefficients
of a product term. These labels
are to be used in interpreting
the specification in pop_es
.
The role of model_matrices_pop()
The function model_matrices_pop()
generates models matrices with
population values, used by ptable_pop()
.
Users usually do not
call this function directly, though
developers can use this build their own
workflows to generate the data.
See Also
power4test()
, and
pop_es_yaml()
on an alternative
way to specify population values.
Examples
# Specify the model
model1 <-
"
m1 ~ x + c1
m2 ~ m1 + x2 + c1
y ~ m2 + m1 + x + w + x:w + c1
"
# Specify the population values
model1_es <- c("m1 ~ x" = "-m",
"m2 ~ m1" = "s",
"y ~ m2" = "l",
"y ~ x" = "m",
"y ~ w" = "s",
"y ~ x:w" = "s",
"x ~~ w" = "s")
ptable_final1 <- ptable_pop(model1,
pop_es = model1_es)
ptable_final1
# Use a multiline string, illustrated by a simpler model
model2 <-
"
m ~ x
y ~ m + x
"
model2_es_a <- c("m ~ x" = "s",
"y ~ m" = "m",
"y ~ x" = "nil")
model2_es_b <-
"
m ~ x: s
y ~ m: m
y ~ x: nil
"
ptable_model2_a <- ptable_pop(model2,
pop_es = model2_es_a)
ptable_model2_b <- ptable_pop(model2,
pop_es = model2_es_b)
ptable_model2_a
ptable_model2_b
identical(ptable_model2_a,
ptable_model2_b)
# model_matrices_pop
model_matrices_pop(ptable_final1)
model_matrices_pop(model1,
pop_es = model1_es)
Random Variable From a Beta Distribution
Description
Generate random numbers from a beta distribution, rescaled to have user-specified population mean and standard deviation.
Usage
rbeta_rs(n = 10, shape1 = 0.5, shape2 = 0.5, pmean = 0, psd = 1)
Arguments
n |
The number of random numbers to generate. |
shape1 |
|
shape2 |
|
pmean |
Population mean. |
psd |
Population standard deviation. |
Details
First, specify the two parameters,
shape1
and shape2
, and the
desired population mean and standard
deviation. The random numbers, drawn
from a beta distribution by
stats::rbeta()
will then be
rescaled with the desired population
mean and standard deviation.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rbeta_rs(n = 5000,
shape1 = .5,
shape2 = .5,
pmean = 3,
psd = 1)
mean(x)
sd(x)
hist(x)
Random Variable From a Beta Distribution (User Range)
Description
Generate random numbers from a beta distribution, rescaled to have user-specified population mean and standard deviation, and within a specific range.
Usage
rbeta_rs2(n = 10, bmean, bsd, blow = 0, bhigh = 1)
Arguments
n |
The number of random numbers to generate. |
bmean |
The population mean. |
bsd |
The population standard
deviation. If |
blow |
The lower bound of the target range. |
bhigh |
The upper bound of the target range. |
Details
First, specify the two parameters,
shape1
and shape2
, and the
desired population mean and standard
deviation. The random numbers, drawn
from a beta distribution by
stats::rbeta()
will then be
rescaled to the desired population range.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rbeta_rs2(n = 5000,
bmean = .80,
bsd = .10,
blow = .00,
bhigh = .95)
mean(x)
sd(x)
hist(x)
y <- rbeta_rs2(n = 5000,
bmean = 4,
bsd = 3,
blow = -10,
bhigh = 10)
mean(y)
sd(y)
hist(y)
Random Binary Variable
Description
Generate random numbers from a distribution of 0 or 1, rescaled to have user-specified population mean and standard deviation.
Usage
rbinary_rs(n = 10, p1 = 0.5, pmean = 0, psd = 1)
Arguments
n |
The number of random numbers to generate. |
p1 |
The probability of being 1, before rescaling. |
pmean |
Population mean. |
psd |
Population standard deviation. |
Details
First, specify probability of 1
(p1
), and the desired population
mean and standard deviation. The
random numbers, drawn from a
distribution of 0 (1 - p1
probability) and 1 (p1
probability), will then be rescaled
with the desired population mean and
standard.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rbinary_rs(n = 5000,
p1 = .5,
pmean = 3,
psd = 1)
mean(x)
sd(x)
hist(x)
Rejection Rates
Description
Get all rejection rates
of all tests stored in a power4test
object or other supported objects.
Usage
rejection_rates(object, ...)
## Default S3 method:
rejection_rates(object, ...)
## S3 method for class 'power4test'
rejection_rates(
object,
all_columns = FALSE,
ci = TRUE,
level = 0.95,
se = FALSE,
collapse = c("none", "all_sig", "at_least_one_sig", "at_least_k_sig"),
at_least_k = 1,
...
)
## S3 method for class 'power4test_by_es'
rejection_rates(
object,
all_columns = FALSE,
ci = TRUE,
level = 0.95,
se = FALSE,
...
)
## S3 method for class 'power4test_by_n'
rejection_rates(
object,
all_columns = FALSE,
ci = TRUE,
level = 0.95,
se = FALSE,
...
)
## S3 method for class 'rejection_rates_df'
print(x, digits = 3, annotation = TRUE, abbreviate_col_names = TRUE, ...)
Arguments
object |
The object
from which the rejection rates
are to be computed, such as
a |
... |
Optional arguments. For
the |
all_columns |
If |
ci |
If |
level |
The level of confidence
for the confidence intervals, if
|
se |
If |
collapse |
Whether a single
decision (significant vs. not significant)
is made across all tests for a test
that consists of several tests
(e.g., the tests of several parameters).
If |
at_least_k |
Used by |
x |
The |
digits |
The number of digits to be printed after the decimal. |
annotation |
Logical. Whether additional notes will be printed. |
abbreviate_col_names |
Logical. Whether some column names will be abbreviated. |
Details
For a power4test
object,
rejection_rates loops over the tests stored
in a power4test
object and retrieves
the rejection rate of each test.
The rejection_rates
method for
power4test_by_es
objects
is used to compute the rejection
rates from a power4test_by_es
object, with effect sizes added to
the output.
The rejection_rates
method for
power4test_by_n
objects
is used to compute the rejection
rates, with sample sizes added to
the output.
Value
The rejection_rates
method returns
a rejection_rates_df
object,
with a print
method.
If the input (object
) is a
power4test
object, the
rejection_rates_df
object is
a data-frame like object with the
number of
rows equal to the number of tests.
Note that some tests, such as
the test by test_parameters()
,
conduct one test for each parameter.
Each such test is counted as one
test.
The data frame has at least these columns:
-
test
: The name of the test. -
label
: The label for each test, or"Test"
if a test only does one test (e.g.,test_indirect_effect()
). -
pvalid
: The proportion of valid tests across all replications. -
reject
: The rejection rate for each test. If the null hypothesis is false, then this is the power.
The rejection_rates
method
for power4test_by_es
objects
returns an object of the
class rejection_rates_df_by_es
,
which is a subclass of
rejection_rates_df
.
It is a data frame which is
similar to the output of
rejection_rates()
, with two
columns added for the effect size (pop_es_name
and
pop_es_values
)
for each test.
The rejection_rates
method
for power4test_by_n
objects
returns an object of the
class rejection_rates_df_by_n
,
which is a subclass of
rejection_rates_df
.
It is a data frame which is
similar to the output of
a power4test
object, with a
column n
added for the sample size
for each test.
The print
method of a
rejection_rates_df
object returns
the object invisibly. It is called
for its side-effect.
See Also
power4test()
,
power4test_by_n()
, and
power4test_by_es()
, which are
supported by this method.
Examples
# Specify the population model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the effect sizes (population parameter values)
model_simple_med_es <-
"
y ~ m: l
m ~ x: m
y ~ x: n
"
# Generate some datasets to check the model
sim_only <- power4test(nrep = 4,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 100,
R = 50,
ci_type = "boot",
fit_model_args = list(fit_function = "lm"),
do_the_test = FALSE,
iseed = 1234)
# Do the test 'test_indirect_effect' on each datasets
test_out <- power4test(object = sim_only,
test_fun = test_indirect_effect,
test_args = list(x = "x",
m = "m",
y = "y",
boot_ci = TRUE,
mc_ci = FALSE))
# Do the test 'test_parameters' on each datasets
# and add the results to 'test_out'
test_out <- power4test(object = test_out,
test_fun = test_parameters)
# Compute and print the rejection rates for stored tests
rejection_rates(test_out)
# See the help pages of power4test_by_n() and power4test_by_es()
# for other examples.
Random Variable From an Exponential Distribution
Description
Generate random numbers from an exponential distribution, rescaled to have user-specified population mean and standard deviation.
Usage
rexp_rs(n = 10, rate = 1, pmean = 0, psd = 1, rev = FALSE)
Arguments
n |
The number of random numbers to generate. |
rate |
|
pmean |
Population mean. |
psd |
Population standard deviation. |
rev |
If TRUE, the distribution is revered to generate a negatively skewed distribution. Default is FALSE. |
Details
First, specify the parameter,
rate
, and the
desired population mean and standard
deviation. The random numbers, drawn
from an exponential distribution by
stats::rexp()
, will then be
rescaled with the desired population
mean and standard.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rexp_rs(n = 5000,
rate = 4,
pmean = 3,
psd = 1)
mean(x)
sd(x)
hist(x)
Random Variable From a Lognormal Distribution
Description
Generate random numbers from a lognormal distribution, rescaled to have user-specified population mean and standard deviation.
Usage
rlnorm_rs(n = 10, mui = 0, sigma = 1, rev = FALSE, pmean = 0, psd = 1)
Arguments
n |
The number of random numbers to generate. |
mui |
The parameter |
sigma |
The parameter |
rev |
If TRUE, the distribution is revered to generate a negatively skewed distribution. Default is FALSE. |
pmean |
Population mean. |
psd |
Population standard deviation. |
Details
First, specify the parameter,
mui
and sigma
, and the
desired population mean and standard
deviation. The random numbers, drawn
from a lognormal distribution by
stats::rlnorm()
, will then be
rescaled with the desired population
mean and standard.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rlnorm_rs(n = 5000,
mui = 0,
sigma = 1,
pmean = 0,
psd = 1)
mean(x)
sd(x)
hist(x)
Random Variable From a Generalized Normal Distribution
Description
Generate random numbers from generalized normal distribution, rescaled to have user-specified population mean and standard deviation.
Usage
rpgnorm_rs(n = 10, p = 2, pmean = 0, psd = 1)
Arguments
n |
The number of random numbers to generate. |
p |
The parameter of the distribution. Must be a positive non-zero number. Default is 2, resulting in a normal distribution. Higher than 2 results in negative excess kurtosis. Lower than 2 results in positive excess kurtosis. |
pmean |
Population mean. |
psd |
Population standard deviation. |
Details
First, specify the parameter p
and the desired population
mean and standard deviation. The
random numbers, drawn from a
generalized normal distribution by
pgnorm::rpgnorm()
, will then be
rescaled with the desired population
mean and standard.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rpgnorm_rs(n = 5000,
p = 2,
pmean = 0,
psd = 1)
mean(x)
sd(x)
hist(x)
x_kurt <- function(p) {gamma(5/p)*gamma(1/p)/(gamma(3/p)^2) - 3}
p <- 6
x <- rpgnorm_rs(n = 50000,
p = p,
pmean = 0,
psd = 1)
mean(x)
sd(x)
x_kurt(p)
qqnorm(x); qqline(x)
p <- 1
x <- rpgnorm_rs(n = 50000,
p = p,
pmean = 0,
psd = 1)
mean(x)
sd(x)
x_kurt(p)
qqnorm(x); qqline(x)
Random Variable From a t Distribution
Description
Generate random numbers from a t distribution, rescaled to have user-specified population mean and standard deviation.
Usage
rt_rs(n = 10, df = 5, pmean = 0, psd = 1)
Arguments
n |
The number of random numbers to generate. |
df |
|
pmean |
Population mean. |
psd |
Population standard deviation. |
Details
First, specify the parameter df
and the desired population
mean and standard deviation. The
random numbers, drawn from the
generalized normal distribution by
stats::rt()
, will then be
rescaled with the desired population
mean and standard.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- rt_rs(n = 5000,
df = 5,
pmean = 3,
psd = 1)
mean(x)
sd(x)
hist(x)
Random Variable From a Uniform Distribution
Description
Generate random numbers from a uniform distribution, with user-specified population mean and standard deviation.
Usage
runif_rs(n = 10, min = 0, max = 1, pmean = 0, psd = 1)
Arguments
n |
The number of random numbers to generate. |
min |
min for runif. |
max |
max for runif. |
pmean |
Population mean. |
psd |
Population standard deviation. |
Details
First, the user specifies the parameters, min and max, and the desired population mean and standard deviation. Then the random numbers will be generated and rescaled with the desired population mean and standard.
Value
A vector of the generated random numbers.
Examples
set.seed(90870962)
x <- runif_rs(n = 5000,
min = 2,
max = 4,
pmean = 3,
psd = 1)
mean(x)
sd(x)
hist(x)
Simulate Datasets Based on a Model
Description
Get a model matrix and effect size specification and simulate a number of datasets, along with other information.
The function
Usage
sim_data(
nrep = 10,
ptable = NULL,
model = NULL,
pop_es = NULL,
...,
n = 100,
iseed = NULL,
number_of_indicators = NULL,
reliability = NULL,
x_fun = list(),
e_fun = list(),
process_data = NULL,
parallel = FALSE,
progress = FALSE,
ncores = max(1, parallel::detectCores(logical = FALSE) - 1)
)
## S3 method for class 'sim_data'
print(
x,
digits = 3,
digits_descriptive = 2,
data_long = TRUE,
fit_to_all_args = list(),
est_type = "standardized",
variances = NULL,
pure_x = TRUE,
pure_y = TRUE,
...
)
pool_sim_data(object, as_list = FALSE)
Arguments
nrep |
The number of replications to generate the simulated datasets. Default is 10. |
ptable |
The output of
|
model |
The |
pop_es |
The character to
specify population effect sizes.
See ptable_pop on
how to specify this argument.
Ignored if |
... |
For sim_data, parameters
to be passed to |
n |
The sample size for each dataset. Default is 100. |
iseed |
The seed for the random
number generator. Default is |
number_of_indicators |
A named
vector to specify the number of
indicators for each factors. See
the help page on how to set this
argument. Default is |
reliability |
A named vector
(for a single-group model) or a
named list of named vectors
(for a multigroup model)
to set the reliability coefficient
of each set of indicators. Default
is |
x_fun |
The function(s) used to
generate the exogenous variables or
error terms. If
not supplied, or set to |
e_fun |
The function(s) used to
generate the error terms of indicators,
if any. If
not supplied, or set to |
process_data |
If not |
parallel |
If |
progress |
If |
ncores |
The number of CPU cores to use if parallel processing is used. |
x |
The |
digits |
The numbers of digits displayed after the decimal. |
digits_descriptive |
The number of digits displayed after the decimal for the descriptive statistics table. |
data_long |
If |
fit_to_all_args |
A named list
of arguments to be passed to
|
est_type |
The type of estimates
to be printed. Can be a character
vector of one to two elements. If
only |
variances |
Logical. Whether
variances and error variances are printed.
Default depends on |
pure_x , pure_y |
When Logical. When printing indirect effects, whether only "pure" x-variables (variables not predicted by another other variables) and/or "pure" y-variables (variables that do not predict any other variables other than indicators) will be included in enumerating the paths. |
object |
Either a |
as_list |
Logical. If |
Details
The function sim_data()
generates
a list of datasets based on a population
model.
Value
The function sim_out()
returns
a list of the class sim_data
,
with length nrep
. Each element
is a sim_data_i
object, with
the following major elements:
-
ptable
: Alavaan
parameter table of the model, with population values set in the columnstart
. (It is the output of the functionptable_pop()
.) -
mm_out
: The population model represented by model matrices as inlavaan
. (It is the output of the functionmodel_matrices_pop()
.) -
mm_lm_out
: A list of regression model formula, one for each endogenous variable. (It is the output of the internal functionmm_lm()
.) -
mm_lm_dat_out
: A simulated dataset generated from the population model. (It is the output of the internal functionmm_lm_data()
). -
model_original
: The original model syntax (i.e., the argumentmodel
). -
model_final
: A modified model syntax if the model is a latent variable model. Indicators are added to the syntax. -
fit0
: The output oflavaan::sem()
withptable
as the model anddo.fit
set toFALSE
. Used for the easy retrieval of information about the model.
The print
method of sim_data
returns x
invisibly. It is called for
its side effect.
The function pool_sim_data()
returns
either one data frame or a list
of data frames, depending on the
argument as_list
The role of sim_data()
The function sim_data()
is used by
the all-in-one function
power4test()
. Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
Workflow
The function sim_data()
does two tasks:
Determine the actual population model with population values based on:
A model syntax for the observed variables (for a path model) or the latent factors (for a latent variable model).
A textual specification of the effect sizes of parameters.
The number of indicators for each latent factor if the model is a latent variable model.
The reliability of each latent factor as measured by the indicators if the model is a latent factor model.
Generate nrep simulated datasets from the population model.
The simulated datasets can then be used to fit a model, test parameters, and estimate power.
The output is usually used by
fit_model()
to fit a target model,
by default the population model, to each
of the dataset.
Set 'number_of_indicators' and 'reliability'
The arguments number_of_indicators
and reliability
are used to
specify the number of indicators
(e.g., items) for each factor,
and the population reliability
coefficient of each factor,
if the variables in the model
syntax are latent variables.
Single-Group Model
If a variable in the model is to be
replaced by indicators in the generated
data, set
number_of_indicators
to a named
numeric vector. The names are the
variables of variables with
indicators, as appeared in the
model
syntax. The value of each
name is the number of indicators.
The
argument reliability
should then be
set a named numeric vector (or list,
see the section on multigroup models)
to specify the population reliability
coefficient ("omega") of each set of
indicators. The population standardized factor
loadings are then computed to ensure
that the population reliability
coefficient is of the target value.
These are examples for a single group model:
number of indicator = c(m = 3, x = 4, y = 5)
The numbers of indicators for m
,
x
, and y
are 3, 4, and 5,
respectively.
reliability = c(m = .90, x = .80, y = .70)
The population reliability
coefficients of m
, x
, and y
are
.90, .80, and .70, respectively.
Multigroup Models
Multigroup models are supported.
The number of groups is inferred
from pop_es
(see the help page
of ptable_pop()
), or directly
from ptable
.
For a multigroup model, the number of indicators for each variable must be the same across groups.
However, the population reliability
coefficients can be different
across groups. For a multigroup model
of k groups,
with one or more population reliability
coefficients differ across groups,
the argument reliability
should be
set to a named list. The names are
the variables to which the population
reliability coefficients are to be
set. The element for each name is
either a single value for the common
reliability coefficient, or a
numeric vector of the reliability
coefficient of each group.
This is an example of reliability
for a model with 2 groups:
reliability = list(x = .80, m = c(.70, .80))
The reliability coefficients of x
are
.80 in all groups, while the
reliability coefficients of m
are
.70 in one group and .80 in another.
Equal Numbers of Indicators and/or Reliability Coefficients
If all variables in the model has
the same number of indicators,
number_of_indicators
can be set
to one single value.
Similarly, if all sets of indicators
have the same population reliability
in all groups, reliability
can also
be set to one single value.
Specify The Distributions of Exogenous Variables Or Error Terms Using 'x_fun'
By default, variables and error
terms are generated
from a multivariate normal distribution.
If desired, users can supply the
function used to generate an exogenous
variable and error term by setting x_fun
to
a named list.
The names of the list are the variables for which a user function will be used to generate the data.
Each element of the list must also be a list. The first element of this list, can be unnamed, is the function to be used. If other arguments need to be supplied, they should be included as named elements of this list.
For example:
x_fun = list(x = list(power4mome::rexp_rs), w = list(power4mome::rbinary_rs, p1 = .70)))
The variables x
and w
will be
generated by user-supplied functions.
For x
, the function is
power4mome::rexp_rs
. No additional
argument when calling this function.
For w
, the function is
power4mome::rbinary_rx
. The argument
p1 = .70
will be passed to this
function when generating the values
of w
.
If a variable is an endogenous
variable (e.g., being predicted by
another variable in a model), then
x_fun
is used to generate its
error term. Its implied population
distribution may still be different
from that generate by x_fun
because
the distribution also depends on the
distribution of other variables
predicting it.
These are requirements for the user-functions:
They must return a numeric vector.
They mush has an argument
n
for the number of values.The population mean and standard deviation of the generated values must be 0 and 1, respectively.
The package power4mome
has
helper functions for generating
values from some common nonnormal
distributions and then scaling them
to have population mean and standard
deviation equal to 0 and 1 (by default), respectively.
These are some of them:
To use x_fun
, the variables must
have zero covariances with other
variables in the population. It is
possible to generate nonnormal
multivariate data but we believe this
is rarely needed when estimating
power before having the data.
Specify the Population Model by 'model'
Single-Group Model
For a single-group model, model
should be a lavaan
model syntax
string of the form of the model.
The population values of the model
parameters are to be determined by
pop_es
.
If the model has latent factors,
the syntax in model
should specify
only the structural model for the
latent factors. There is no
need to specify the measurement
part. Other functions will generate
the measurement part on top of this
model.
For example, this is a simple mediation model:
"m ~ x y ~ m + x"
Whether m
, x
, and y
denote
observed variables or latent factors
are determined by other functions,
such as power4test()
.
Multigroup Model
Because the model is the population
model, equality constraints are
irrelevant and the model syntax
specifies only the form of the
model. Therefore, model
is
specified as in the case of single
group models.
Specify 'pop_es' Using Named Vectors
The argument pop_es
is for specifying
the population values of model
parameters. This section describes
how to do this using named vectors.
Single-Group Model
If pop_es
is specified by a named
vector, it must follow the convention
below.
The names of the vectors are
lavaan
names for the selected parameters. For example,m ~ x
denotes the path fromx
tom
.Alternatively, the names can be either
".beta."
or".cov."
. Use".beta."
to set the default values for all regression coefficients. Use".cov."
to set the default values for all correlations of exogenous variables (e.g., predictors).The names can also be of this form:
".ind.(<path>)"
, whether<path>
denote path in the model. For example,".ind.(x->m->y)"
denotes the path fromx
throughm
toy
. Alternatively, thelavaan
symbol~
can also be used:".ind.(y~m~x)"
. This form is used to set the indirect effect (standardized, by default) along this path. The value for this name will override other settings.If using
lavaan
names, can specify more than one parameter using+
. For example,y ~ m + x
denotes the two paths fromm
andx
toy
.The value of each element can be the label for the effect size:
n
for nil,s
for small,m
for medium, andl
for large. The value for each label is determined byes1
andes2
. See the section on specifying these two arguments.The value of
pop_es
can also be set to a value, but it must be quoted as a string, such as"y ~ x" = ".31"
.
This is an example:
c(".beta." = "s", "m1 ~ x" = "-m", "m2 ~ m1" = "l", "y ~ x:w" = "s")
In this example,
All regression coefficients are set to small (
s
) by default, unless specified otherwise.The path from
x
tom1
is set to medium and negative (-m
).The path from
m1
tom2
is set to large (l
).The coefficient of the product term
x:w
when predictingy
is set to small (s
).
Indirect Effect
When setting an indirect effect to
a symbol (default: "si"
, "mi"
,
"li"
, with "i"
added to differentiate
them from the labels for a direct path),
the corresponding value is used to
determine the population values of
all component paths along the pathway.
All the values are assumed to be equal.
Therefore, ".ind.(x->m->y)" = ".20"
is equivalent to setting m ~ x
and y ~ m
to the square root of
.20, such that the corresponding
indirect effect is equal to the
designated value.
This behavior, though restricted, is for quick manipulation of the indirect effect. If different values along a pathway, set the value for each path directly.
Only nonnegative value is supported.
Therefore, ".ind.(x->m->y)" = "-si"
and ".ind.(x->m->y)" = "-.20"
will
throw an error.
Multigroup Model
The argument pop_es
also supports multigroup
models.
For pop_es
, instead of
named vectors, named list of
named vectors should be used.
The names are the parameters, or keywords such as
.beta.
and.cov.
, like specifying the population values for a single group model.The elements are character vectors. If it has only one element (e.g., a single string), then it is the the population value for all groups. If it has more than one element (e.g., a vector of three strings), then they are the population values of the groups. For a model of k groups, each vector must have either k elements or one element.
This is an example:
list("m ~ x" = "m", "y ~ m" = c("s", "m", "l"))
In this model, the population value
of the path m ~ x
is medium (m
) for
all groups, while the population
values for the path y ~ m
are
small (s
), medium (m
), and large (l
),
respectively.
Specify 'pop_es' Using a Multiline String
When setting the argument pop_es
,
instead of using a named vector
or named list for
pop_es
, the population values of
model parameters can also be
specified using a multiline string,
as illustrated below, to be parsed
by pop_es_yaml()
.
Single-Group Model
This is an example of the multiline string for a single-group model:
y ~ m: l m ~ x: m y ~ x: nil
The string must follow this format:
Each line starts with
tag:
.-
tag
can be the name of a parameter, inlavaan
model syntax format.For example,
m ~ x
denotes the path fromx
tom
.
A tag in
lavaan
model syntax can specify more than one parameter using+
.For example,
y ~ m + x
denotes the two paths fromm
andx
toy
.
Alternatively, the
tag
can be either.beta.
or.cov.
.Use
.beta.
to set the default values for all regression coefficients.Use
.cov.
to set the default values for all correlations of exogenous variables (e.g., predictors).
-
After each tag is the value of the population value:
-
nil
for nil (zero),-
s
for small, -
m
for medium, and -
l
for large. -
si
,mi
, andli
for small, medium, and large a standardized indirect effect, respectively.
Note:
n
cannot be used in this mode.The value for each label is determined by
es1
andes2
as described inptable_pop()
.The value can also be set to a numeric value, such as
.30
or-.30
.
-
This is another example:
.beta.: s y ~ m: l
In this example, all regression
coefficients are small
, while
the path from m
to y
is large.
Multigroup Model
This is an example of the string for a multigroup model:
y ~ m: l m ~ x: - nil - s y ~ x: nil
The format is similar to that for
a single-group model. If a parameter
has the same value for all groups,
then the line can be specified
as in the case of a single-group
model: tag: value
.
If a parameter has different values across groups, then it must be in this format:
A line starts with the tag, followed by two or more lines. Each line starts with a hyphen
-
and the value for a group.
For example:
m ~ x: - nil - s
This denotes that the model has
two groups. The values of the path
from x
to m
for the two
groups are 0 (nil
) and
small (s
), respectively.
Another equivalent way to specify
the values are using []
, on
the same line of a tag.
For example:
m ~ x: [nil, s]
The number of groups is inferred from the number of values for a parameter. Therefore, if a tag has more than one value, each tag must has the same number of value, or only one value.
The tag .beta.
and .cov.
can
also be used for multigroup models.
Which Approach To Use
Note that using named vectors or named lists is more reliable. However, using a multiline string is more user-friendly. If this method failed, please use named vectors or named list instead.
Technical Details
The multiline string is parsed by yaml::read_yaml()
.
Therefore, the format requirement
is actually that of YAML. Users
knowledgeable of YAML can use other
equivalent way to specify the string.
Set the Values for Effect Size Labels ('es1' and 'es2')
The vector es1
is for correlations,
regression coefficients, and
indirect effect, and the
vector es2
is for for standardized
moderation effect, the coefficients
of a product term. These labels
are to be used in interpreting
the specification in pop_es
.
See Also
Examples
# Specify the model
mod <-
"m ~ x
y ~ m + x"
# Specify the population values
es <-
"
y ~ m: m
m ~ x: m
y ~ x: n
"
# Generate the simulated datasets
data_all <- sim_data(nrep = 5,
model = mod,
pop_es = es,
n = 100,
iseed = 1234)
data_all
Create a 'sim_out' Object
Description
Combine the outputs
of sim_data()
, fit_model()
,
and optionally gen_mc()
and/or
gen_boot()
to one
single object.
Usage
sim_out(data_all, ...)
## S3 method for class 'sim_out'
print(x, digits = 3, digits_descriptive = 2, fit_to_all_args = list(), ...)
Arguments
data_all |
The output of
|
... |
Named arguments of
objects to be added to each
replication under the element
|
x |
The |
digits |
The numbers of digits displayed after the decimal. |
digits_descriptive |
The number of digits displayed after the decimal for the descriptive statistics table. |
fit_to_all_args |
A named list
of arguments to be passed to
|
Details
It merges into one object the output
of sim_data()
, which is a list of
nrep
simulated datasets,
fit_model()
, which is a list of the
lavaan::sem()
output for the nrep
datasets, and optionally the output
of gen_mc()
or gen_boot()
, which is a list of the
R
sets of Monte Carlo or bootstrap estimates
based on the results of
fit_model()
. The list has nrep
elements, each element with the data,
the model fit
results, and optionally the Monte
Carlo estimates matched.
This object can then be used for testing effects of interests, which are further processed to estimate the power of this test.
The function sim_out()
is used by
the all-in-one function
power4test()
. Users usually do not
call this function directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
Value
The function sim_out()
returns a sim_out
object, which
is a list of length equal to the
length of data_all
. Each element
of the list is a sim_data
object
with the element extra
added to
it. Other named elements will be
added under this name. For example.
the output of fit_model()
for this replication can be added
to fit
, under extra
. See
the description of the argument
...
for details.
The print
method of sim_out
returns x
invisibly. Called for
its side effect.
See Also
Examples
# Specify the model
mod <-
"m ~ x
y ~ m + x"
# Specify the population values
es <-
"
y ~ m: m
m ~ x: m
y ~ x: n
"
# Generate the simulated datasets
dats <- sim_data(nrep = 5,
model = mod,
pop_es = es,
n = 100,
iseed = 1234)
# Fit the population model to each dataset
fits <- fit_model(dats)
# Combine the results to one object
sim_out_all <- sim_out(data_all = dats,
fit = fits)
sim_out_all
# Verify that the elements of fits are set to extra$fit
library(lavaan)
parameterEstimates(fits[[1]])
parameterEstimates(sim_out_all[[1]]$extra$fit)
parameterEstimates(fits[[2]])
parameterEstimates(sim_out_all[[2]]$extra$fit)
Summarize Test Results
Description
Extract and summarize test results.
Usage
summarize_tests(
object,
collapse = c("none", "all_sig", "at_least_one_sig", "at_least_k_sig"),
at_least_k = 1
)
## S3 method for class 'test_summary_list'
print(x, digits = 3, ...)
## S3 method for class 'test_summary'
print(x, digits = 2, ...)
## S3 method for class 'test_out_list'
print(x, digits = 3, test_long = FALSE, ...)
Arguments
object |
A |
collapse |
Whether a single
decision (significant vs. not significant)
is made across all tests for a test
that consists of several tests
(e.g., the tests of several parameters).
If |
at_least_k |
Used by |
x |
The object to be printed. |
digits |
The numbers of digits after the decimal when printing numeric results. |
... |
Optional arguments. Not used. |
test_long |
If |
Details
The function summarize_tests()
is used to extract
information from each test stored
in a power4test
object.
The method print.test_out_list()
is
used to print the content of a list
of test stored in a power4test
object, with the option to print
just the names of tests.
Value
The function summarize_tests()
returns
a list of the class test_summary_list
.
Each element contains a summary of a
test stored. The elements are of
the class test_summary
, with
these elements:
-
test_attributes
: The stored information of a test, for printing. -
nrep
: The number of datasets (replications). -
mean
: The means of numeric information. For significance tests, these are the rejection rates. -
nvalid
: The number of non-NA
replications used to compute each mean.
The print
methods returns x
invisibly.
They are called for their side
effects.
The role of summarize_tests()
and related functions
The function summarize_tests()
and
related print methods are used by
the all-in-one function
power4test()
and its summary
method. Users usually do not
call them directly, though
developers can use this function to
develop other functions for power
analysis, or to build their own
workflows to do the power analysis.
See Also
Examples
# Specify the model
mod <-
"
m ~ x
y ~ m + x
"
# Specify the population values
es <-
"
y ~ m: l
m ~ x: m
y ~ x: n
"
# Simulated datasets
sim_only <- power4test(nrep = 2,
model = mod,
pop_es = es,
n = 100,
do_the_test = FALSE,
iseed = 1234)
# Test the parameters in each dataset
test_out <- power4test(object = sim_only,
test_fun = test_parameters)
# Print the summary
summarize_tests(test_out)
Summarize 'x_from_power' Results
Description
The summary method of
the output of x_from_power()
.
Usage
## S3 method for class 'x_from_power'
summary(object, ...)
## S3 method for class 'n_region_from_power'
summary(object, ...)
## S3 method for class 'summary.x_from_power'
print(x, digits = 3, ...)
## S3 method for class 'summary.n_region_from_power'
print(x, digits = 3, ...)
Arguments
object |
An |
... |
Additional arguments. Not used for now. |
x |
The output of
|
digits |
The number of digits after the decimal when printing the results. |
Details
The summary
method simply prepares the
results of x_from_power()
to be printed in details.
Value
The summary
method for
x_from_power
objects returns an
object of the class
summary.x_from_power
, which is
simply the output of x_from_power()
,
with a print
method dedicated for
detailed summary. Please refer
to x_from_power()
for the contents.
The print
-method of summary.x_from_power
objects returns the object x
invisibly.
It is called for its side effect.
The print
-method of summary.n_region_from_power
objects returns the object x
invisibly.
It is called for its side effect.
See Also
x_from_power()
,
n_region_from_power()
Examples
# Specify the population model
mod <-
"
m ~ x
y ~ m + x
"
# Specify the population values
mod_es <-
"
m ~ x: m
y ~ m: l
y ~ x: n
"
# Generate the datasets
sim_only <- power4test(nrep = 5,
model = mod,
pop_es = mod_es,
n = 100,
do_the_test = FALSE,
iseed = 2345)
# Do a test
test_out <- power4test(object = sim_only,
test_fun = test_parameters,
test_args = list(pars = "m~x"))
# Determine the sample size with a power of .80 (default)
power_vs_n <- x_from_power(test_out,
x = "n",
progress = TRUE,
target_power = .80,
final_nrep = 5,
max_trials = 1,
seed = 1234)
summary(power_vs_n)
Test a Conditional Indirect Effect
Description
Test a conditional
indirect effect
for a power4test
object.
Usage
test_cond_indirect(
fit = fit,
x = NULL,
m = NULL,
y = NULL,
wvalues = NULL,
mc_ci = TRUE,
mc_out = NULL,
boot_ci = FALSE,
boot_out = NULL,
check_post_check = TRUE,
...,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
Arguments
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
wvalues |
A numeric vector of
named elements. The names are the
variable names of the moderators,
and the values are the values to
which the moderators will be set to.
Default is |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
Details
This function is to be used in
power4test()
for testing a
conditional
indirect effect, by setting it
to the test_fun
argument.
It uses manymome::cond_indirect()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm()
, although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm()
.
It can also be used to test
a conditional effect on a direct path
with no mediator. Just omit m
when
calling the function.
Value
In its normal usage, it returns a named numeric vector with the following elements:
-
est
: The mean of the estimated indirect effect across datasets. -
cilo
andcihi
: The means of the lower and upper limits of the confidence interval (95% by default), respectively. -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
).
See Also
Examples
# Specify the model
model_simple_mod <-
"
m ~ x + w + x:w
y ~ m + x
"
# Specify the population values
model_simple_mod_es <-
"
y ~ m: l
y ~ x: n
m ~ x: m
m ~ w: n
m ~ x:w: l
"
# Simulate the data
sim_only <- power4test(nrep = 5,
model = model_simple_mod,
pop_es = model_simple_mod_es,
n = 100,
R = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the test in each replication
test_ind <- power4test(object = sim_only,
test_fun = test_cond_indirect,
test_args = list(x = "x",
m = "m",
y = "y",
wvalues = c(w = 1),
mc_ci = TRUE))
print(test_ind,
test_long = TRUE)
Test Several Conditional Indirect Effects
Description
Test several conditional
indirect effects
for a power4test
object.
Usage
test_cond_indirect_effects(
fit = fit,
x = NULL,
m = NULL,
y = NULL,
wlevels = NULL,
mc_ci = TRUE,
mc_out = NULL,
boot_ci = FALSE,
boot_out = NULL,
check_post_check = TRUE,
...,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
Arguments
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
wlevels |
The output of
|
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
Details
This function is to be used in
power4test()
for testing several
conditional
indirect effects, by setting it
to the test_fun
argument.
It uses manymome::cond_indirect_effects()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm()
, although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm()
.
It can also be used to test
conditional effects on a direct path
with no mediator. Just omit m
when
calling the function.
Value
In its normal usage, it returns
the output returned by
manymome::cond_indirect_effects()
,
with the following modifications:
-
est
: The estimated conditional indirect effects. -
cilo
andcihi
: The lower and upper limits of the confidence interval (95% by default), respectively. -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
). -
test_label
: A column of labels generated to label the conditional effects.
See Also
Examples
# Specify the model
model_simple_mod <-
"
m ~ x + w + x:w
y ~ m + x
"
# Specify the population values
model_simple_mod_es <-
"
y ~ m: l
y ~ x: n
m ~ x: m
m ~ w: n
m ~ x:w: l
"
# Simulate the data
# Set nrep to a larger value in real analysis, such as 400
sim_only <- power4test(nrep = 5,
model = model_simple_mod,
pop_es = model_simple_mod_es,
n = 100,
R = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the tests in each replication
test_out <- power4test(object = sim_only,
test_fun = test_cond_indirect_effects,
test_args = list(x = "x",
m = "m",
y = "y",
wlevels = c("w"),
mc_ci = TRUE))
print(test_out,
test_long = TRUE)
Test a Moderated Mediation Effect
Description
Test a moderated
mediation effect for a power4test
object.
Usage
test_index_of_mome(
fit = fit,
x = NULL,
m = NULL,
y = NULL,
w = NULL,
mc_ci = TRUE,
mc_out = NULL,
boot_ci = FALSE,
boot_out = NULL,
check_post_check = TRUE,
...,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
Arguments
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
w |
The name of the moderator. |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
Details
This function is to be used in
power4test()
for testing a
moderated mediation effect, by
setting it to the test_fun
argument.
It uses manymome::index_of_mome()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm()
, although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm()
.
Value
In its normal usage, it returns a named numeric vector with the following elements:
-
est
: The mean of the estimated indirect effect across datasets. -
cilo
andcihi
: The means of the lower and upper limits of the confidence interval (95% by default), respectively. -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
).
See Also
Examples
# Specify the model
mod <-
"
m ~ x + w + x:w
y ~ m
"
# Specify the population values
mod_es <-
"
m ~ x: n
y ~ x: m
m ~ w: l
m ~ x:w: l
"
# Simulate the data
sim_only <- power4test(nrep = 2,
model = mod,
pop_es = mod_es,
n = 100,
R = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the test in each replication
test_out <- power4test(object = sim_only,
test_fun = test_index_of_mome,
test_args = list(x = "x",
m = "m",
y = "y",
w = "w",
mc_ci = TRUE))
print(test_out,
test_long = TRUE)
Test an Indirect Effect
Description
Test an indirect effect
for a power4test
object.
Usage
test_indirect_effect(
fit = fit,
x = NULL,
m = NULL,
y = NULL,
mc_ci = TRUE,
mc_out = NULL,
boot_ci = FALSE,
boot_out = NULL,
check_post_check = TRUE,
...,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
Arguments
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
A character vector of the
name(s) of mediator(s). The path
moves from the first mediator in the
vector to the last mediator in the
vector. Can be |
y |
The name of the |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
Details
This function is to be used in
power4test()
for testing an
indirect effect, by setting it
to the test_fun
argument.
It uses manymome::indirect_effect()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm()
, although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm()
.
Value
In its normal usage, it returns a named numeric vector with the following elements:
-
est
: The mean of the estimated indirect effect across datasets. -
cilo
andcihi
: The means of the lower and upper limits of the confidence interval (95% by default), respectively. -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
).
See Also
Examples
# Specify the model
model_simple_med <-
"
m ~ x
y ~ m + x
"
# Specify the population values
model_simple_med_es <-
"
y ~ m: l
m ~ x: m
y ~ x: n
"
# Simulate the data
sim_only <- power4test(nrep = 5,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 100,
R = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the test in each replication
test_ind <- power4test(object = sim_only,
test_fun = test_indirect_effect,
test_args = list(x = "x",
m = "m",
y = "y",
mc_ci = TRUE))
print(test_ind,
test_long = TRUE)
Test Several Indirect Effects
Description
Test several indirect effects
for a power4test
object.
Usage
test_k_indirect_effects(
fit = fit,
x = NULL,
m = NULL,
y = NULL,
mc_ci = TRUE,
mc_out = NULL,
boot_ci = FALSE,
boot_out = NULL,
check_post_check = TRUE,
...,
omnibus = c("no", "all_sig", "at_least_one_sig", "at_least_k_sig"),
at_least_k = 1,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
Arguments
fit |
The fit object, to be
passed to |
x |
The name of the |
m |
Must be a list of character
vectors. Each character vector stores the
name(s) of mediator(s) along a path.
The path
moves from the first mediator in the
vector to the last mediator in the
vector. If |
y |
The name of the |
mc_ci |
Logical. If |
mc_out |
The pre-generated
Monte Carlo estimates generated by
manymome::do_mc, stored in
a |
boot_ci |
Logical. If |
boot_out |
The pre-generated
bootstrap estimates generated by
manymome::do_boot, stored in
a |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
omnibus |
If |
at_least_k |
The minimum number
of paths required to be significant
for the omnibus test to be considered
significant. Used when
|
fit_name |
The name of the
model fit object to be extracted.
Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
Details
This function is to be used in
power4test()
for testing an
indirect effect, by setting it
to the test_fun
argument.
It uses manymome::many_indirect_effects()
to do the test. It can be used on
models fitted by lavaan::sem()
or fitted by a sequence of calls
to stats::lm()
, although only
nonparametric bootstrap confidence
interval is supported for models
fitted by regression using
stats::lm()
.
Value
In its normal usage, it returns a data frame with the following columns:
-
est
: The estimated indirect effect for each path. -
cilo
andcihi
: The lower and upper limits of the confidence interval (95% by default), respectively, for each indirect effect -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
). -
test_label
: A column of labels generated to label the indirect effects.
If omnibus
is "all_sig"
or
"at_least_one"sig"
, then
the data frame has only one row,
and the columns "est"
, "cilo"
,
and "cihi"
are NA
. The column
sig
is determined by whether
all paths are significant ("all_sig"
)
or whether at least one path is
significant ("at_least_one_sig"
).
See Also
Examples
# Specify the model
model_simple_med <-
"
m1 ~ x
m2 ~ x
y ~ m1 + m2 + x
"
# Specify the population values
model_simple_med_es <-
"
y ~ m1: s
m1 ~ x: m
y ~ m2: s
m2 ~ x: l
y ~ x: n
"
# Simulate the data
sim_only <- power4test(nrep = 5,
model = model_simple_med,
pop_es = model_simple_med_es,
n = 100,
R = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the test in each replication
test_ind <- power4test(object = sim_only,
test_fun = test_k_indirect_effects,
test_args = list(x = "x",
y = "y",
mc_ci = TRUE))
print(test_ind,
test_long = TRUE)
# Set omnibus = "all_sig" to declare
# significant only if all paths are
# significant
test_ind_all_sig <- power4test(
object = sim_only,
test_fun = test_k_indirect_effects,
test_args = list(x = "x",
y = "y",
mc_ci = TRUE,
omnibus = "all_sig"))
print(test_ind_all_sig,
test_long = TRUE)
Test All Moderation Effects
Description
Test all moderation
effects by testing all product
terms for a power4test
object.
Usage
test_moderation(
fit = fit,
standardized = FALSE,
check_post_check = TRUE,
...,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
Arguments
fit |
The fit object, to be
passed to |
standardized |
Logical. If |
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
fit_name |
The name of the fit
results for which the parameter names
will be displayed. Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
Details
This function is to be used in
power4test()
for testing all
product terms, by
setting it to the test_fun
argument.
It is just a wrapper to
test_parameters()
. It will
first identifies all product
terms (terms with :
in the names),
and then call test_parameters()
,
with pars
set to select the
regression coefficients of these
terms.
Value
In its normal usage, it returns
the output returned by
lavaan::parameterEstimates()
or lmhelprs::lm_list_to_partable()
,
with the following modifications:
-
est
: The parameter estimates, even if standardized estimates are requested (notest.std
). -
cilo
andcihi
: The lower and upper limits of the confidence interval (95% by default), respectively (notci.lower
andci.upper
). -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
). -
test_label
: A column of labels generated bylavaan::lav_partable_labels()
, which are usually the labels used bycoef()
to label the parameters.
See Also
power4test()
,
test_parameters()
Examples
# Specify the model
mod <-
"
m ~ x + w1 + x:w1
y ~ m + x
"
# Specify the population values
mod_es <-
"
m ~ x: n
y ~ x: m
m ~ w1: n
m ~ x:w1: l
"
# Simulate the data
sim_only <- power4test(nrep = 4,
model = mod,
pop_es = mod_es,
n = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the test in each replication
test_out <- power4test(object = sim_only,
test_fun = test_moderation)
print(test_out,
test_long = TRUE)
Test All Free Parameters
Description
Test all free parameters,
including user-defined parameters,
for a power4test
object.
Usage
test_parameters(
fit = fit,
standardized = FALSE,
pars = NULL,
op = NULL,
remove.nonfree = TRUE,
check_post_check = TRUE,
...,
omnibus = c("no", "all_sig", "at_least_one_sig", "at_least_k_sig"),
at_least_k = 1,
fit_name = "fit",
get_map_names = FALSE,
get_test_name = FALSE
)
find_par_names(object, fit_name = "fit")
Arguments
fit |
The fit object, to be
passed to |
standardized |
Logical. If |
pars |
Optional. If set to
a character vector, only parameters
with |
op |
Optional. If set to a
character vector, only parameters with
operators (e.g., |
remove.nonfree |
Logical. If
|
check_post_check |
Logical. If
|
... |
Additional arguments to
be passed to |
omnibus |
If |
at_least_k |
The minimum number
of paths required to be significant
for the omnibus test to be considered
significant. Used when
|
fit_name |
The name of the fit
results for which the parameter names
will be displayed. Default is |
get_map_names |
Logical. Used
by |
get_test_name |
Logical. Used
by |
object |
A |
Details
This function is to be used in
power4test()
for testing all
free and user-defined model
parameters, by
setting it to the test_fun
argument.
For models fitted by lavaan
,
it uses lavaan::parameterEstimates()
to do the test. If bootstrapping was
requested (by setting se = "boot"
),
then it supports bootstrap
confidence intervals returned by
lavaan::parameterEstimates()
.
It has preliminary, though limited,
supported for models fitted by
stats::lm()
(through
lmhelprs::many_lm()
). Tests are
conducted by ordinary least squares
confidence intervals based on
the t statistic, reported by
stats::confint()
applied to
the output of stats::lm()
.
Value
In its normal usage, it returns
the output returned by
lavaan::parameterEstimates()
or lmhelprs::lm_list_to_partable()
,
with the following modifications:
-
est
: The parameter estimates, even if standardized estimates are requested (notest.std
). -
cilo
andcihi
: The lower and upper limits of the confidence interval (95% by default), respectively (notci.lower
andci.upper
). -
sig
: Whether a test by confidence interval is significant (1
) or not significant (0
). -
test_label
: A column of labels generated bylavaan::lav_partable_labels()
, which are usually the labels used bycoef()
to label the parameters.
Find the names of parameters
To use the argument pars
, the
names as appeared in the function
coef()
must be used. For the
output of lavaan
, this can
usually be inferred from the
parameter syntax (e.g., y~x
,
no space). If not sure, call
coef()
on the output of lavaan
.
If a parameter is labelled, then
the label should be used in par
.
If not sure, the function
find_par_names()
can be used to
find valid names.
See Also
Examples
# Specify the model
mod <-
"
m ~ x
y ~ m + x
"
# Specify the population values
mod_es <-
"
y ~ m: l
m ~ x: m
y ~ x: n
"
# Simulate the data
sim_only <- power4test(nrep = 2,
model = mod,
pop_es = mod_es,
n = 100,
do_the_test = FALSE,
iseed = 1234)
# Do the tests in each replication
test_out <- power4test(object = sim_only,
test_fun = test_parameters)
print(test_out,
test_long = TRUE)
# Do the tests in each replication: Standardized solution
# Delta method SEs will be used to do the tests
test_out <- power4test(object = sim_only,
test_fun = test_parameters,
test_args = list(standardized = TRUE))
print(test_out,
test_long = TRUE)
# Do the tests in each replication: Parameters with the selected operator
test_out <- power4test(object = sim_only,
test_fun = test_parameters,
test_args = list(op = "~"))
print(test_out,
test_long = TRUE)
# Finding valid parameter names
find_par_names(sim_only)
Sample Size and Effect Size Determination
Description
It searches by simulation the sample size (given other factors, such as effect sizes) or effect size (given other factors, such as sample size) with power to detect an effect close to a target value.
Usage
x_from_power(
object,
x,
pop_es_name = NULL,
target_power = 0.8,
what = c("point", "ub", "lb"),
goal = switch(what, point = "ci_hit", ub = "close_enough", lb = "close_enough"),
ci_level = 0.95,
tolerance = 0.02,
x_interval = switch(x, n = c(50, 2000), es = NULL),
extendInt = NULL,
progress = TRUE,
simulation_progress = TRUE,
max_trials = 10,
final_nrep = 400,
final_R = 1000,
seed = NULL,
x_include_interval = FALSE,
check_es_interval = TRUE,
power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL,
upper_bound = NULL, nls_control = list(), nls_args = list()),
save_sim_all = FALSE,
algorithm = NULL,
control = list()
)
n_from_power(
object,
pop_es_name = NULL,
target_power = 0.8,
what = c("point", "ub", "lb"),
goal = switch(what, point = "ci_hit", ub = "close_enough", lb = "close_enough"),
ci_level = 0.95,
tolerance = 0.02,
x_interval = c(50, 2000),
extendInt = NULL,
progress = TRUE,
simulation_progress = TRUE,
max_trials = 10,
final_nrep = 400,
final_R = 1000,
seed = NULL,
x_include_interval = FALSE,
check_es_interval = TRUE,
power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL,
upper_bound = NULL, nls_control = list(), nls_args = list()),
save_sim_all = FALSE,
algorithm = NULL,
control = list()
)
n_region_from_power(
object,
pop_es_name = NULL,
target_power = 0.8,
ci_level = 0.95,
tolerance = 0.02,
x_interval = c(50, 2000),
extendInt = NULL,
progress = TRUE,
simulation_progress = TRUE,
max_trials = 10,
final_nrep = 400,
final_R = 1000,
seed = NULL,
x_include_interval = FALSE,
check_es_interval = TRUE,
power_curve_args = list(power_model = NULL, start = NULL, lower_bound = NULL,
upper_bound = NULL, nls_control = list(), nls_args = list()),
save_sim_all = FALSE,
algorithm = NULL,
control = list()
)
## S3 method for class 'x_from_power'
print(x, digits = 3, ...)
## S3 method for class 'n_region_from_power'
print(x, digits = 3, ...)
Arguments
object |
A |
x |
For |
pop_es_name |
The name of the
parameter. Required if |
target_power |
The target power, a value greater than 0 and less than one. |
what |
The value for which is
searched: the estimate power ( |
goal |
The goal of the search.
If |
ci_level |
The level of confidence of the confidence intervals computed for the estimated power. Default is .95, denoting 95%. |
tolerance |
Used when the goal
is |
x_interval |
A vector of
two values, the minimum value
and the maximum values of |
extendInt |
Whether |
progress |
Logical. Whether the searching progress is reported. |
simulation_progress |
Logical.
Whether the progress in each call
to |
max_trials |
The maximum number of trials in searching the value with the target power. Rounded up if not an integer. |
final_nrep |
The number of
replications in the final stage,
also the maximum number of replications
in each call to |
final_R |
The number of
Monte Carlo simulation or
bootstrapping samples in the final
stage. The |
seed |
If not |
x_include_interval |
Logical.
Whether the minimum and maximum
values in |
check_es_interval |
If |
power_curve_args |
A named
list of arguments to be passed
|
save_sim_all |
If |
algorithm |
The algorithm for
finding |
control |
A named list of additional arguments to be passed to the algorithm to be used. For advanced users. |
digits |
The number of digits after the decimal when printing the results. |
... |
Optional arguments. Not used for now. |
Details
This is how to use x_from_power()
:
Specify the model by
power4test()
, withdo_the_test = FALSE
, and set the magnitude of the effect sizes to the minimum levels to detect.Add the test using
power4test()
usingtest_fun
andtest_args
(see the help page ofpower4test()
for details). Run it on the starting sample size or effect size.Call
x_from_power()
on the output ofpower4test()
returned from the previous step. This function will iteratively repeat the analysis on either other sample sizes, or other values for a selected model parameter (the effect sizes), trying to achieve a goal (goal
) for a value of interest (what
).
If the goal
is "ci_hit"
, the
search will try to find a value (a sample
size, or a population value of
the selected model parameter) with
a power level close enough to the
target power, defined by having its
confidence interval for the power
including the target power.
If the goal
is "close_enough"
,
then the search will try to find a
value of x
with its level of
power ("point"
), the upper bound
of the confidence interval for this
level of power ("ub"
), or the
lower bound of the confidence interval
fro this level of power ("lb"
)
"close enough" to the target level of
power, defined by having an absolute
difference less than the tolerance
.
If several values of x
(sample
size or the population value of
a model parameter) have already been
examined by power4test_by_n()
or
power4test_by_es()
, the output
of these two functions can also be
used as object
by x_from_power()
.
Usually, the default values of the arguments should be sufficient.
The results can be viewed using
summary()
, and the output has
a plot
method (plot.x_from_power()
) to
plot the relation between power and
values (of x
) examined.
A detailed illustration on how to use this function for sample size can be found from this page:
https://sfcheung.github.io/power4mome/articles/x_from_power_for_n.html
The function n_from_power()
is just
a wrapper of x_from_power()
, with
x
set to "n"
.
The function n_region_from_power()
is just
a wrapper of x_from_power()
, with
x
set to "n"
, with two passes, one
with what = "ub"
and one with
what = "lb"
.
The print
method only prints
basic information. Call the
summary
method of x_from_power
objects
(summary.x_from_power()
) and its
print
method for detailed results
Value
The function x_from_power()
returns an x_from_power
object,
which is a list with the following
elements:
-
power4test_trials
: The output ofpower4test_by_n()
for all sample sizes examined, or ofpower4test_by_es()
for all population values of the selected parameter examined. -
rejection_rates
: The output ofrejection_rates()
. -
x_tried
: The sample sizes or population values examined. -
power_tried
: The estimated rejection rates for all the values examined. -
x_final
: The sample size or population value in the solution.NA
if a solution not found. -
power_final
: The estimated power of the value in the solution.NA
if a solution not found. -
i_final
: The position of the solution inpower4test_trials
.NA
if a solution not found. -
ci_final
: The confidence interval of the estimated power in the solution, formed by normal approximation.NA
if a solution not found. -
ci_level
: The level of confidence ofci_final
. -
nrep_final
: The number of replications (nrep
) when estimating the power in the solution. -
power_curve
: The output ofpower_curve()
when estimating the power curve. -
target_power
: The requested target power. -
power_tolerance
: The allowed difference between the solution's estimated power and the target power. Determined by the number of replications and the level of confidence of the confidence intervals. -
x_estimated
: The value (sample size or population value) with the target power, estimated bypower_curve
. This is used, when solution not found, to determine the range of the values to search when calling the function again. -
start
: The time and date when the process started. -
end
: The time and date when the process ended. -
time_spent
: The time spent in doing the search. -
args
: A named list of the arguments ofx_from_power()
used in the search. -
call
: The call when this function is called.
The function n_region_from_power()
returns a named list of two output of
n_from_power()
, of the class
n_region_from_power
. The output
with what = "ub"
is named "below"
,
and the output with what = "lb"
is
namd "above"
.
The print
-method of x_from_power
objects returns the object x
invisibly.
It is called for its side effect.
The print
-method of x_from_power_region
objects returns the object x
invisibly.
It is called for its side effect.
Algorithms
Two algorithms are currently available, the simple (though inefficient) bisection method, and a method that makes use of the estimated crude power curve.
Unlike typical root-finding problems,
the prediction of the level of power
is stochastic. Moreover, the computational
cost is high when Monte Carlo or
bootstrap confidence intervals are
used to do a test because the estimation
of the power for one single value of
x
can sometimes take one minute or
longer. Therefore, in addition to
the simple bisection method, a method,
named power curve method, was also
specifically developed for this
scenario.
Bisection Method
This method, algorithm = "bisection"
,
basically starts with
an interval that probably encloses the
value of x
that meets the goal,
and then successively narrows this
interval. The mid-point of this
interval is used as the estimate.
Though simple, there are cases in
which it can be slow. Nevertheless,
preliminary examination suggests that
this method is good enough for common
scenarios. Therefore, this method is
the default algorithm when x
is
n
.
Power Curve Method
This method, algorithm = "power_curve"
,
starts with a crude
power curve based on a few points.
This tentative model is then used
to suggest the values to examine in
the next iteration. The form, not
just the parameters, of the
model can change across iterations,
as more and more data points are
available.
This method can be used only with
the goal "ci_hit"
.
This method is the default method
for x = "es"
with goal = "ci_hit"
because the relation
between the power and the population
value of a parameter varies across
parameters, unlike the relation
between power and sample size. Therefore,
taking into account the working
power curve may help finding the
desired value of x
.
The technical internal workflow of
this method implemented in
x_from_power()
can be found in
this page: https://sfcheung.github.io/power4mome/articles/x_from_power_workflow.html.
See Also
power4test()
, power4test_by_n()
,
and power4test_by_es()
.
Examples
# Specify the population model
mod <-
"
m ~ x
y ~ m + x
"
# Specify the population values
mod_es <-
"
m ~ x: m
y ~ m: l
y ~ x: n
"
# Generate the datasets
sim_only <- power4test(nrep = 5,
model = mod,
pop_es = mod_es,
n = 100,
do_the_test = FALSE,
iseed = 2345)
# Do a test
test_out <- power4test(object = sim_only,
test_fun = test_parameters,
test_args = list(pars = "m~x"))
# Determine the sample size with a power of .80 (default)
# In real analysis, to have more stable results:
# - Use a larger final_nrep (e.g., 400).
# If the default values are OK, this call is sufficient:
# power_vs_n <- x_from_power(test_out,
# x = "n",
# seed = 4567)
power_vs_n <- x_from_power(test_out,
x = "n",
progress = TRUE,
target_power = .80,
final_nrep = 5,
max_trials = 1,
seed = 1234)
summary(power_vs_n)
plot(power_vs_n)