Help for package PublicationBiasBenchmark

Type:

Package

Title:

Benchmark for Publication Bias Correction Methods

Version:

0.1.3

Maintainer:

František Bartoš <f.bartos96@gmail.com>

Description:

Implements a unified interface for benchmarking meta-analytic publication bias correction methods through simulation studies (see Bartoš et al., 2025, <doi:10.48550/arXiv.2510.19489>). It provides 1) predefined data-generating mechanisms from the literature, 2) functions for running meta-analytic methods on simulated data, 3) pre-simulated datasets and pre-computed results for reproducible benchmarks, 4) tools for visualizing and comparing method performance.

License:

GPL-3

Encoding:

UTF-8

RoxygenNote:

7.3.3

Depends:

R (≥ 3.5.0)

Imports:

stats, metafor, osfr, MASS, numDeriv, pwr, sandwich, clubSandwich, lmtest, puniform, Rdpack

Suggests:

RoBMA, testthat (≥ 3.0.0), rmarkdown, knitr, kableExtra, DT, ggplot2, scales, ggdist, rprojroot, desc

RdMacros:

Rdpack

URL:

https://github.com/FBartos/PublicationBiasBenchmark, https://fbartos.github.io/PublicationBiasBenchmark/

BugReports:

https://github.com/FBartos/PublicationBiasBenchmark/issues

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2025-12-05 08:35:10 UTC; fbartos

Author:

František Bartoš

[aut, cre], Samuel Pawel

[aut], Björn S. Siepe

[aut]

Repository:

CRAN

Date/Publication:

2025-12-05 11:50:08 UTC

PublicationBiasBenchmark: Benchmark for Publication Bias Correction Methods

Description

Implements a unified interface for benchmarking meta-analytic publication bias correction methods through simulation studies (see Bartoš et al., 2025, doi:10.48550/arXiv.2510.19489). It provides 1) predefined data-generating mechanisms from the literature, 2) functions for running meta-analytic methods on simulated data, 3) pre-simulated datasets and pre-computed results for reproducible benchmarks, 4) tools for visualizing and comparing method performance.

Author(s)

Maintainer: František Bartoš f.bartos96@gmail.com (ORCID)

Authors:

Samuel Pawel (ORCID)
Björn S. Siepe (ORCID)

Options for the PublicationBiasBenchmark package

Description

A placeholder object and functions for the PublicationBiasBenchmark package.

Usage

PublicationBiasBenchmark.options(...)

PublicationBiasBenchmark.get_option(name)

Arguments

...

named option(s) to change - for a list of available options, see details below.

name

the name of the option to get the current value of - for a list of available options, see details below.

Details

"resources_directory": Location where the benchmark data/results/measures are stored
"prompt_for_download": Whether each file download should ask for explicit approval

Value

The current value of all available PublicationBiasBenchmark options (after applying any changes specified) is returned invisibly as a named list.

Calculate sample variance of generic statistic

Description

Calculate sample variance of generic statistic

Usage

S_G_squared(G)

Arguments

G

Vector of generic statistics

Value

Sample variance S_G^2

Calculate sample variance of squared errors

Description

Calculate sample variance of squared errors

Usage

S_theta_minus_theta_squared(theta_hat, theta)

Arguments

theta_hat

Vector of estimates

theta

True parameter value

Value

Sample variance S_(theta_hat - theta)^2

Calculate sample variance of estimates

Description

Calculate sample variance of estimates

Usage

S_theta_squared(theta_hat)

Arguments

theta_hat

Vector of estimates

Value

Sample variance S_theta^2

Calculate sample variance of CI widths

Description

Calculate sample variance of CI widths

Usage

S_w_squared(ci_upper, ci_lower)

Arguments

ci_upper

Vector of upper CI bounds

ci_lower

Vector of lower CI bounds

Value

Sample variance S_w^2

Compare method with Multiple Measures for a DGM

Description

This is a high-level wrapper function that computes multiple pairwise comparison measures for a Data-Generating Mechanism (DGM) and saves the results to CSV files. It provides a clean and extensible interface for comparing method performance.

Usage

compare_measures(
  dgm_name,
  method,
  method_setting,
  measures = NULL,
  verbose = TRUE,
  estimate_col = "estimate",
  true_effect_col = "mean_effect",
  convergence_col = "convergence",
  method_replacements = NULL,
  n_repetitions = 1000,
  overwrite = FALSE,
  conditions = NULL
)

Arguments

dgm_name

Character string specifying the name of the DGM dataset to download.

method

Character vector of method names

method_setting

Character vector of method settings, must be same length as method

measures

Character vector of measures to compute. If NULL, computes all standard measures.

verbose

Print detailed progress of the calculation.

estimate_col

Character string specifying the column name containing parameter estimates. Default is "estimate"

true_effect_col

Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect"

convergence_col

Character string specifying the column name containing convergence indicators. Default is "convergence"

method_replacements

Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with:

method: Character vector of replacement method names
method_setting: Character vector of replacement method settings (same length as methods)
power_test_type: Optional character vector of power test types for each replacement method (same length as methods). If not specified, uses the main power_test_type parameter

If multiple elements are specified within the vectors, these replacements are applied consecutively in case the previous replacements also failed to converge. Defaults to NULL, i.e., omitting repetitions without converged results on method-by-method basis.

n_repetitions

Number of repetitions in each condition. Necessary method replacement. Defaults to 1000.

overwrite

Logical indicating whether to overwrite existing files. Defaults to FALSE, which means only missing files will be downloaded.

conditions

Data frame of conditions from dgm_conditions()

Value

Invisible list of computed comparison data frames

Compare method with a Single Measure for a DGM

Description

This function provides pairwise comparison of method for Data-Generating Mechanisms (DGMs). It compares method performance on a condition-by-condition basis using estimates. For each pair of method, if method A has an estimate closer to the true value than method B, it gets a score of 1, if further it gets 0, and if equal it gets 0.5.

Usage

compare_single_measure(
  dgm_name,
  measure_name,
  method,
  method_setting,
  conditions,
  estimate_col = "estimate",
  true_effect_col = "mean_effect",
  convergence_col = "convergence",
  method_replacements = NULL,
  n_repetitions = 1000,
  overwrite = FALSE,
  ...
)

Arguments

dgm_name

Character string specifying the name of the DGM dataset to download.

measure_name

Name of the measure to compute (e.g., "bias", "mse")

method

Character vector of method names

method_setting

Character vector of method settings, must be same length as method

conditions

Data frame of conditions from dgm_conditions()

estimate_col

Character string specifying the column name containing parameter estimates. Default is "estimate"

true_effect_col

Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect"

convergence_col

Character string specifying the column name containing convergence indicators. Default is "convergence"

method_replacements

Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with:

method: Character vector of replacement method names
method_setting: Character vector of replacement method settings (same length as methods)
power_test_type: Optional character vector of power test types for each replacement method (same length as methods). If not specified, uses the main power_test_type parameter

n_repetitions

Number of repetitions in each condition. Necessary method replacement. Defaults to 1000.

overwrite

Logical indicating whether to overwrite existing files. Defaults to FALSE, which means only missing files will be downloaded.

...

Additional arguments passed to measure functions

Value

Data frame with pairwise comparison scores in long format (method_a, method_b, score)

Compute Multiple Performance measures for a DGM

Description

This is a high-level wrapper function that computes multiple performance measures for a Data-Generating Mechanism (DGM) and saves the results to CSV files. It provides a clean and extensible interface for computing standard simulation performance measures.

Usage

compute_measures(
  dgm_name,
  method,
  method_setting,
  measures = NULL,
  verbose = TRUE,
  power_test_type = "p_value",
  power_threshold_p_value = 0.05,
  power_threshold_bayes_factor = 10,
  estimate_col = "estimate",
  true_effect_col = "mean_effect",
  ci_lower_col = "ci_lower",
  ci_upper_col = "ci_upper",
  p_value_col = "p_value",
  bf_col = "BF",
  convergence_col = "convergence",
  method_replacements = NULL,
  n_repetitions = 1000,
  overwrite = FALSE,
  conditions = NULL
)

Arguments

dgm_name

Character string specifying the DGM name

method

Character vector of method names

method_setting

Character vector of method settings, must be same length as method

measures

Character vector of measures to compute. If NULL, computes all standard measures.

verbose

Print detailed progress of the calculation.

power_test_type

Character vector specifying the test type for power computation: "p_value" (default) or "bayes_factor" for each method. If a single value is provided, it is repeated for all methods.

power_threshold_p_value

Numeric threshold for power computation with p-values. Default is 0.05 (reject H0 if p < 0.05).

power_threshold_bayes_factor

Numeric threshold for power computation with Bayes factors. Default is 10 (reject H0 if BF > 10)

estimate_col

Character string specifying the column name containing parameter estimates. Default is "estimate"

true_effect_col

Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect"

ci_lower_col

Character string specifying the column name containing lower confidence interval bounds. Default is "ci_lower"

ci_upper_col

Character string specifying the column name containing upper confidence interval bounds. Default is "ci_upper"

p_value_col

Character string specifying the column name containing p-values. Default is "p_value"

bf_col

Character string specifying the column name containing Bayes factors. Default is "BF"

convergence_col

Character string specifying the column name containing convergence indicators. Default is "convergence"

method_replacements

Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with:

method: Character vector of replacement method names
method_setting: Character vector of replacement method settings (same length as methods)
power_test_type: Optional character vector of power test types for each replacement method (same length as methods). If not specified, uses the main power_test_type parameter

n_repetitions

Number of repetitions in each condition. Necessary method replacement. Defaults to 1000.

overwrite

Logical indicating whether to overwrite existing results. If FALSE (default), will skip computation for method-measure combinations that already exist

conditions

Data frame of conditions from dgm_conditions()

Value

TRUE upon successfully computation of the results file

Examples


# Download DGM results
dgm_name <- "no_bias"
download_dgm_results(dgm_name)

# Basic usage
compute_measures(
  dgm_name        = dgm_name,
  method          = c("mean", "RMA", "PET"),
  method_setting  = c("default", "default", "default"),
  measures        = c("bias", "mse", "coverage")
)

# With method replacements for non-converged results
method_replacements <- list(
  "RMA-default" = list(method = "FMA", method_setting = "default"),
  "PET-default" = list(method = c("WLS", "FMA"),
                       method_setting = c("default", "default"))
)

compute_measures(
  dgm_name            = dgm_name,
  method              = c("RMA", "PET"),
  method_setting      = c("default", "default"),
  method_replacements = method_replacements,
  measures            = c("bias", "mse")
)

Compute Performance Measures

Description

This function provides a modular and extensible way to compute performance measures (PM) for Data-Generating Mechanisms (DGMs). It handles different types of measures and automatically determines the required arguments for each measure function.

Usage

compute_single_measure(
  dgm_name,
  measure_name,
  method,
  method_setting,
  conditions,
  measure_fun,
  measure_mcse_fun,
  power_test_type = "p_value",
  estimate_col = "estimate",
  true_effect_col = "mean_effect",
  ci_lower_col = "ci_lower",
  ci_upper_col = "ci_upper",
  p_value_col = "p_value",
  bf_col = "BF",
  convergence_col = "convergence",
  power_threshold_p_value = 0.05,
  power_threshold_bayes_factor = 10,
  method_replacements = NULL,
  n_repetitions = 1000,
  overwrite = FALSE,
  ...
)

Arguments

dgm_name

Character string specifying the DGM name

measure_name

Name of the measure to compute (e.g., "bias", "mse")

method

Character vector of method names

method_setting

Character vector of method settings, must be same length as method

conditions

Data frame of conditions from dgm_conditions()

measure_fun

Function to compute the measure

measure_mcse_fun

Function to compute the MCSE for the measure

power_test_type

Character vector specifying the test type for power computation: "p_value" (default) or "bayes_factor" for each method. If a single value is provided, it is repeated for all methods.

estimate_col

Character string specifying the column name containing parameter estimates. Default is "estimate"

true_effect_col

Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect"

ci_lower_col

Character string specifying the column name containing lower confidence interval bounds. Default is "ci_lower"

ci_upper_col

Character string specifying the column name containing upper confidence interval bounds. Default is "ci_upper"

p_value_col

Character string specifying the column name containing p-values. Default is "p_value"

bf_col

Character string specifying the column name containing Bayes factors. Default is "BF"

convergence_col

Character string specifying the column name containing convergence indicators. Default is "convergence"

power_threshold_p_value

Numeric threshold for power computation with p-values. Default is 0.05 (reject H0 if p < 0.05).

power_threshold_bayes_factor

Numeric threshold for power computation with Bayes factors. Default is 10 (reject H0 if BF > 10)

method_replacements

Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with:

method: Character vector of replacement method names
method_setting: Character vector of replacement method settings (same length as methods)
power_test_type: Optional character vector of power test types for each replacement method (same length as methods). If not specified, uses the main power_test_type parameter

n_repetitions

Number of repetitions in each condition. Necessary method replacement. Defaults to 1000.

overwrite

Logical indicating whether to overwrite existing results. If FALSE (default), will skip computation for method-measure combinations that already exist

...

Additional arguments passed to measure functions

Value

TRUE upon successfully computation of the results file

Create standardized empty method result for convergence failures

Description

Create standardized empty method result for convergence failures

Usage

create_empty_result(method_name, note, extra_columns = NULL)

Arguments

method_name

Character string of the method name

note

Character string describing the failure reason

extra_columns

Character vector of additional empty columns to add to the table

Value

Data frame with standardized empty result structure

DGM Method

Description

S3 Method for defining data-generating mechanisms. See simulate_dgm() for usage and further details.

Usage

dgm(dgm_name, settings)

Arguments

dgm_name

Character string specifying the DGM type

settings

List containing the required parameters for the DGM or numeric condition_id

Value

A data frame with simulated data following the structure described in the Output Structure section. This is an S3 generic method that dispatches to specific DGM implementations based on dgm_name.

Output Structure

The returned data frame follows a standardized schema that downstream functions rely on. Across the currently implemented DGMs, the following columns are used:

yi (numeric): The effect size estimate.
sei (numeric): Standard error of yi.
ni (integer): Total sample size for the estimate (e.g., sum over groups where applicable).
es_type (character): Effect size type, used to disambiguate the scale of yi. Currently used values are "SMD" (standardized mean difference / Cohen's d), "logOR" (log odds ratio), and "none" (unspecified generic continuous coefficient).
study_id (integer/character, optional): Identifier of the primary study/cluster when a DGM yields multiple estimates per study (e.g., Alinaghi2018, PRE). If absent, each row is treated as an independent study.

Examples


simulate_dgm("Carter2019", 1)

Alinaghi and Reed (2018) Data-Generating Mechanism

Description

This data-generating mechanism simulates univariate regression studies where a variable X affects a continuous outcome Y. Each study estimates the coefficient of X, which consists of a fixed component (alpha1) representing the overall mean effect, and a random component that varies across studies but is constant within each study. In the "Random Effects" environment ("RE"), each study produces one estimate, and the population effect differs across studies. In the "Panel Random Effects" environment ("PRE"), each study has 10 estimates, modeling the common scenario where multiple estimates per study are available, with publication selection targeting the study rather than individual estimates.

The description and code is based on Hong and Reed (2021). The data-generating mechanism was introduced in Alinaghi and Reed (2018).

Usage

## S3 method for class 'Alinaghi2018'
dgm(dgm_name, settings)

Arguments

dgm_name

DGM name (automatically passed)

settings

List containing

environment: Type of the simulation environment. One of "FE", "RE", or "PRE".
mean_effect: Mean effect
bias: Type of publication bias. One of "none", "positive", or "significant".

Details

This data-generating mechanism is based on Alinaghi & Reed (2018), who study univariate regression models where a variable X affects a continuous variable Y. The parameter of interest is the coefficient on X. In the "Random Effects" environment ("RE"), each study produces one estimate, and the population effect differs across studies. The coefficient on X equals a fixed component (alpha1) plus a random component that is fixed within a study but varies across studies. The overall mean effect of X on Y is given by alpha1. In the "Panel Random Effects" environment ("PRE"), each study has 10 estimates, modeling the common scenario where multiple estimates per study are available. In this environment, effect estimates and standard errors are simulated to be more similar within studies than across studies, and publication selection targets the study rather than individual estimates (a study must have at least 7 out of 10 estimates that are significant or correctly signed.).

A distinctive feature of Alinaghi & Reed's experiments is that the number of effect size estimates is fixed before publication selection, making the meta-analyst's sample size endogenous and affected by the effect size. Large population effects are subject to less publication selection, as most estimates satisfy the selection criteria (statistical significance or correct sign). The sample size of all primary studies is fixed at 100 observations. (Neither the number of estimates nor the sample size of primary studies can be changed in the current implementation of the function.)

Another feature is the separation of statistical significance and sign of the estimated effect as criteria for selection. Significant/correctly-signed estimates are always "published," while insignificant/wrong-signed estimates have only a 10% chance of being published. This allows for different and sometimes conflicting consequences for estimator performance.

Value

Data frame with

yi: effect size
sei: standard error
ni: sample size
study_id: study identifier
es_type: effect size type

References

Alinaghi N, Reed WR (2018). “Meta-analysis and publication bias: How well does the FAT-PET-PEESE procedure work?” Research Synthesis Methods, 9(2), 285-311. doi:10.1002/jrsm.1298.

Hong S, Reed WR (2021). “Using Monte Carlo experiments to select meta-analytic estimators.” Research Synthesis Methods, 12(2), 192-215. doi:10.1002/jrsm.1467.

Bom and Rachinger (2019) Data-Generating Mechanism

Description

Simulates univariate regression environments to estimate the effect of X1 on Y (parameter alpha1). Effect heterogeneity is introduced via an omitted variable (X2) correlated with X1, whose coefficient (alpha2) is randomly distributed with mean zero and variance sigma2_h.

The description and code is based on Hong and Reed (2021). The data-generating mechanism was introduced in Bom and Rachinger (2019).

Usage

## S3 method for class 'Bom2019'
dgm(dgm_name, settings)

Arguments

dgm_name

DGM name (automatically passed)

settings

List containing

mean_effect: Mean effect
effect_heterogeneity: Mean effect heterogeneity
bias: Proportion of studies affected by publication bias
n_studies: Number of effect size estimates
sample_sizes: Sample sizes of the effect size estimates. A vector of sample sizes needs to be supplied. The sample sizes in the vector are sequentially reused until all effect size estimates are generated.

Details

This function simulates univariate regression environments, focusing on estimating the effect of a variable X1 on a dependent variable Y, represented by the parameter alpha1. The simulation introduces variation in the standard errors of estimated effects by allowing sample sizes to differ across primary studies. Effect heterogeneity is modeled through an omitted variable (X2) that is correlated with X1, where the coefficient on the omitted variable, alpha2, is randomly distributed across studies with mean zero and variance sigma2_h.

Publication selection is modeled in two regimes: (1) no selection, and (2) 50% selection. Under 50% selection, each estimate has a 50% chance of being evaluated for inclusion. If selected, only positive and statistically significant estimates are published; otherwise, new estimates are generated until this criterion is met. This process continues until the meta-analyst’s sample reaches its predetermined size.

Value

Data frame with

yi: effect size
sei: standard error
ni: sample size
es_type: effect size type

References

Bom PR, Rachinger H (2019). “A kinked meta-regression model for publication bias correction.” Research Synthesis Methods, 10(4), 497-514. doi:10.1002/jrsm.1352.

Hong S, Reed WR (2021). “Using Monte Carlo experiments to select meta-analytic estimators.” Research Synthesis Methods, 12(2), 192-215. doi:10.1002/jrsm.1467.

Carter et al. (2019) Data-Generating Mechanism

Description

This data-generating mechanism simulates primary studies estimating treatment effects using Cohen's d. The observed effect size is modeled as a fixed mean plus random heterogeneity across studies, with sample sizes varying to generate differences in standard errors. The simulation introduces publication bias via a selection algorithm where the probability of publication depends nonlinearly on the sign and p-value of the effect, with regimes for no, medium, and strong publication bias. It also incorporates questionable research practices (QRPs) such as optional outlier removal, selection between dependent variables, use of moderators, and optional stopping.

The description and code is based on Hong and Reed (2021). The data-generating mechanism was introduced in Carter et al. (2019).

Usage

## S3 method for class 'Carter2019'
dgm(dgm_name, settings)

Arguments

dgm_name

DGM name (automatically passed)

settings

List containing

mean_effect: Mean effect
effect_heterogeneity: Mean effect heterogeneity
bias: Degree of publication bias with one of following levels: "none", "medium", "high".
QRP: Degree of questionable research practices with one of following levels: "none", "medium", "high".
n_studies: Number of effect size estimates

Details

This simulation environment is based on the framework described by Carter, Schönbrodt, Gervais, and Hilgard (2019). In this setup, primary studies estimate the effect of a treatment using Cohen's d as the effect size metric. The observed difference between treatment and control groups is modeled as the sum of a fixed effect (alpha1) and a random component, which introduces effect heterogeneity across studies. The degree of heterogeneity is controlled by the parameter sigma2_h. Variability in the standard errors of d is generated by simulating primary studies with different sample sizes.

The simulation incorporates two main types of distortions in the research environment. First, a publication selection algorithm is used, where the probability of a study being "published" depends nonlinearly on both the sign of the estimated effect and its p-value. Three publication selection regimes are modeled: "No Publication Bias," "Medium Publication Bias," and "Strong Publication Bias," each defined by different parameters in the selection algorithm. Second, the simulation includes four types of questionable research practices (QRPs): (a) optional removal of outliers, (b) optional selection between two dependent variables, (c) optional use of moderators, and (d) optional stopping.

Value

Data frame with

yi: effect size
sei: standard error
ni: sample size
es_type: effect size type

References

Carter EC, Schönbrodt FD, Gervais WM, Hilgard J (2019). “Correcting for bias in psychology: A comparison of meta-analytic methods.” Advances in Methods and Practices in Psychological Science, 2(2), 115-144. doi:10.1177/2515245919847196.

Hong S, Reed WR (2021). “Using Monte Carlo experiments to select meta-analytic estimators.” Research Synthesis Methods, 12(2), 192-215. doi:10.1002/jrsm.1467.

Stanley, Doucouliagos, and Ioannidis (2017) Data-Generating Mechanism

Description

Simulates two scenarios for meta-analysis studies investigating the effect of a treatment in: (1) Log Odds Ratio scenario, where the outcome is binary and effect heterogeneity is controlled by a random component, and (2) Cohen's d scenario, where the outcome is continuous and effect heterogeneity is introduced through a random component. Both scenarios allow for varying sample sizes and publication selection regimes, affecting the inclusion of study estimates based on their statistical significance and sign.

The description and code is based on Hong and Reed (2021). The data-generating mechanism was introduced in Stanley et al. (2017).

Usage

## S3 method for class 'Stanley2017'
dgm(dgm_name, settings)

Arguments

dgm_name

DGM name (automatically passed)

settings

List containing

environment: Type of the simulation environment. One of "logOR" or "SMD".
mean_effect: Mean effect
effect_heterogeneity: Mean effect heterogeneity
bias: Proportion of studies affected by publication bias
n_studies: Number of effect size estimates
sample_sizes: Sample sizes of the effect size estimates. A vector of sample sizes needs to be supplied. The sample sizes in the vector are sequentially reused until all effect size estimates are generated.

Details

This function simulates two meta-analysis scenarios to evaluate the effect of a binary treatment variable (⁠treat = {0, 1}⁠) on study outcomes, incorporating both effect heterogeneity and publication selection mechanisms.

In the Log Odds Ratio ("logOR") scenario, primary studies assess the impact of treatment on a binary success indicator (Y = 1). The control group has a fixed 10% probability of success, while the treatment group's probability is increased by a fixed effect and a mean-zero random component, whose variance (sigma2_h) controls effect heterogeneity. Each study estimates a logistic regression, with the coefficient on treat (alpha1) as the effect of interest. Study sample sizes vary, resulting in different standard errors for estimated effects.

In the Cohen's d ("SMD") scenario, the outcome variable is continuous. The treatment effect is modeled as a fixed effect (alpha1) plus a random component (variance sigma2_h). Each study computes Cohen's d, the standardized mean difference between treatment and control groups. Study sample sizes vary, affecting the standard errors of d.

Value

Data frame with

yi: effect size
sei: standard error
ni: sample size
es_type: effect size type

References

Hong S, Reed WR (2021). “Using Monte Carlo experiments to select meta-analytic estimators.” Research Synthesis Methods, 12(2), 192-215. doi:10.1002/jrsm.1467.

Stanley TD, Doucouliagos H, Ioannidis JP (2017). “Finding the power to reduce publication bias.” Statistics in Medicine, 36(10), 1580-1598. doi:10.1002/sim.7228.

Default DGM handler

Description

Default DGM handler

Usage

## Default S3 method:
dgm(dgm_name, settings)

Arguments

dgm_name

Character string specifying the DGM type

settings

List containing the required parameters for the DGM or numeric condition_id

Value

Throws an error indicating the DGM type is unknown. This default method is only called when no specific DGM implementation is found for the given dgm_name.

Normal Unbiased Data-Generating Mechanism

Description

An example data-generating mechanism to simulate effect sizes without publication bias.

Usage

## S3 method for class 'no_bias'
dgm(dgm_name, settings)

Arguments

dgm_name

DGM name (automatically passed)

settings

List containing

mean_effect: Mean effect
heterogeneity: Effect heterogeneity
n_studies: Number of effect size estimates

Details

Sample sizes of individual effect size estimates are generated from a negative binomial distribution based on empirical sample size distribution presented in Appendix B of Maier et al. (2023)

Value

Data frame with

yi: effect size
sei: standard error
ni: sample size
es_type: effect size type

References

Maier M, Bartoš F, Wagenmakers E (2023). “Robust Bayesian meta-analysis: Addressing publication bias with model-averaging.” Psychological Methods, 28(1), 107-122. doi:10.1037/met0000405.

Return Pre-specified DGM Settings

Description

This function returns the list of pre-specified settings for a given Data Generating Mechanism (DGM).

Usage

dgm_conditions(dgm_name)

get_dgm_condition(dgm_name, condition_id)

Arguments

dgm_name

Character string specifying the DGM type

condition_id

which conditions should settings be returned for.

Value

A data frame containing the pre-specified settings including a condition_id column which maps settings id to the corresponding settings.

Examples

head(dgm_conditions("Carter2019"))
get_dgm_condition("Carter2019", condition_id = 1)

head(dgm_conditions("Alinaghi2018"))

head(dgm_conditions("Stanley2017"))

Download Datasets/Results/Measures of a DGM

Description

This function downloads datasets/results/measures of a specified Data-Generating Mechanism (DGM) from the OSF repository (https://osf.io/exf3m/). The datasets/results/measures are saved to the location specified via PublicationBiasBenchmark.options(resources_directory = "/path/"). To set the location permanently, specify the PublicationBiasBenchmark_RESOURCES environment variable. The data are stored in dgm_name/datasets, dgm_name/results, dgm_name/measures subfolders.

Usage

download_dgm_datasets(
  dgm_name,
  overwrite = FALSE,
  progress = TRUE,
  max_try = 10
)

download_dgm_results(
  dgm_name,
  overwrite = FALSE,
  progress = TRUE,
  max_try = 10
)

download_dgm_measures(
  dgm_name,
  overwrite = FALSE,
  progress = TRUE,
  max_try = 10
)

Arguments

dgm_name

Character string specifying the name of the DGM dataset to download.

overwrite

Logical indicating whether to overwrite existing files. Defaults to FALSE, which means only missing files will be downloaded.

progress

Logical indicating whether to show progress downloading files. Defaults to TRUE.

max_try

Integet specifying how many times should the function attempt reconnecting to OSF upon failure.

Value

TRUE if the download was successful, otherwise an error is raised.

Examples


  download_dgm_datasets("no_bias")

Get performance measure function

Description

Returns the function for computing a specific performance measure. Can also be used to list available measures.

Usage

measure(measure, ...)

Arguments

measure

Character string specifying the measure name. If missing, returns a vector of all available measures.

...

Additional arguments passed to methods.

Value

A function that computes the requested measure, or a character vector of measure names.

Get performance measure MCSE function

Description

Returns the function for computing the Monte Carlo Standard Error (MCSE) of a specific performance measure. Can also be used to list available measures with MCSE functions.

Usage

measure_mcse(measure, ...)

Arguments

measure

Character string specifying the measure name. If missing, returns a vector of all available measures.

...

Additional arguments passed to methods.

Value

A function that computes the MCSE of the requested measure, or a character vector of measure names.

Performance Measures and Monte Carlo Standard Errors

Description

A comprehensive set of functions for computing performance measures and their Monte Carlo Standard Errors (MCSE) for simulation studies. All functions are based on definitions from Table 3 in Siepe et al. (2024). Winkler interval score is defined in Winkler (1972). Positive and negative likelihood ratios are defined in Huang and Trinquart (2023) and Deeks and Altman (2004). Also see Morris et al. (2019) for additional details. Bias and relative bias were modified to account for possibly different true values across repetitions.

Usage

bias(theta_hat, theta)

bias_mcse(theta_hat)

relative_bias(theta_hat, theta)

relative_bias_mcse(theta_hat, theta)

mse(theta_hat, theta)

mse_mcse(theta_hat, theta)

rmse(theta_hat, theta)

rmse_mcse(theta_hat, theta)

empirical_variance(theta_hat)

empirical_variance_mcse(theta_hat)

empirical_se(theta_hat)

empirical_se_mcse(theta_hat)

coverage(ci_lower, ci_upper, theta)

coverage_mcse(ci_lower, ci_upper, theta)

power(test_rejects_h0)

power_mcse(test_rejects_h0)

mean_ci_width(ci_upper, ci_lower)

mean_ci_width_mcse(ci_upper, ci_lower)

mean_generic_statistic(G)

mean_generic_statistic_mcse(G)

positive_likelihood_ratio(tp, fp, fn, tn)

positive_likelihood_ratio_mcse(tp, fp, fn, tn)

negative_likelihood_ratio(tp, fp, fn, tn)

negative_likelihood_ratio_mcse(tp, fp, fn, tn)

interval_score(ci_lower, ci_upper, theta, alpha = 0.05)

interval_score_mcse(ci_lower, ci_upper, theta, alpha = 0.05)

Arguments

theta_hat

Vector of parameter estimates from simulations

theta

True parameter value

ci_lower

Vector of lower confidence interval bounds

ci_upper

Vector of upper confidence interval bounds

test_rejects_h0

Logical vector indicating whether statistical tests reject the null hypothesis

G

Vector of generic statistics from simulations

tp

Numeric with the count of true positive hypothesis tests

fp

Numeric with the count of false positive hypothesis tests

fn

Numeric with the count of false negative hypothesis tests

tn

Numeric with the count of true negative hypothesis tests

alpha

Numeric indicating the 1 - coverage level for interval_score calculation

Details

The package provides the following performance measures and their corresponding MCSE functions:

bias(theta_hat, theta): Bias estimate
relative_bias(theta_hat, theta): Relative bias estimate
mse(theta_hat, theta): Mean Square Error
rmse(theta_hat, theta): Root Mean Square Error
empirical_variance(theta_hat): Empirical variance
empirical_se(theta_hat): Empirical standard error
coverage(ci_lower, ci_upper, theta): Coverage probability
mean_ci_width(ci_upper, ci_lower): Mean confidence interval width
interval_score(ci_lower, ci_upper, theta, alpha): interval_score
power(test_rejects_h0): Statistical power
positive_likelihood_ratio(tp, fp, fn, tn): Log positive likelihood ratio
negative_likelihood_ratio(tp, fp, fn, tn): Log negative likelihood ratio
mean_generic_statistic(G): Mean of any generic statistic

Value

Each metric function returns a numeric value representing the performance measure. Each MCSE function returns a numeric value representing the Monte Carlo standard error.

References

Deeks JJ, Altman DG (2004). “Diagnostic tests 4: likelihood ratios.” BMJ, 329(7458), 168–169. doi:10.1136/bmj.329.7458.168.

Huang Q, Trinquart L (2023). “Relative likelihood ratios for neutral comparisons of statistical tests in simulation studies.” Biometrical Journal, 66(1), 2200102. doi:10.1002/bimj.202200102.

Morris TP, White IR, Crowther MJ (2019). “Using simulation studies to evaluate statistical methods.” Statistics in Medicine, 38(11), 2074–2102. doi:10.1002/sim.8086.

Siepe BS, Bartoš F, Morris TP, Boulesteix A, Heck DW, Pawel S (2024). “Simulation studies for methodological research in psychology: A standardized template for planning, preregistration, and reporting.” Psychological Methods. doi:10.1037/met0000695.

Winkler RL (1972). “A decision-theoretic approach to interval estimation.” Journal of the American Statistical Association, 67(337), 187–191. doi:10.1080/01621459.1972.10481224.

Examples

# Generate some example data
set.seed(123)
theta_true <- 0.5
theta_estimates <- rnorm(1000, mean = theta_true, sd = 0.1)

# Compute bias and its MCSE
bias_est <- bias(theta_estimates, theta_true)
bias_se <- bias_mcse(theta_estimates)

# Compute MSE and its MCSE
mse_est <- mse(theta_estimates, theta_true)
mse_se <- mse_mcse(theta_estimates, theta_true)

# Example with coverage
ci_lower <- theta_estimates - 1.96 * 0.1
ci_upper <- theta_estimates + 1.96 * 0.1
coverage_est <- coverage(ci_lower, ci_upper, theta_true)
coverage_se <- coverage_mcse(ci_lower, ci_upper, theta_true)

Method Method

Description

S3 Method for defining methods. See run_method() for usage and further details.

Usage

method(method_name, data, settings)

Arguments

method_name

Character string specifying the method type

data

Data frame containing yi (effect sizes) and sei (standard errors)

settings

Either a character identifying a method version or list containing method-specific settings. An emty input will result in running the default (first implemented) version of the method.

Value

A data frame with method results following the structure described in the Output Structure section. This is an S3 generic method that dispatches to specific method implementations based on method_name.

Output Structure

The returned data frame follows a standardized schema that downstream functions rely on. All methods return the following columns:

method (character): The name of the method used.
estimate (numeric): The meta-analytic effect size estimate.
standard_error (numeric): Standard error of the estimate.
ci_lower (numeric): Lower bound of the 95% confidence interval.
ci_upper (numeric): Upper bound of the 95% confidence interval.
p_value (numeric): P-value for the estimate.
BF (numeric): Bayes Factor for the estimate.
convergence (logical): Whether the method converged successfully.
note (character): Additional notes describing convergence issues.

Some methods may include additional method-specific columns beyond these standard columns. Use get_method_extra_columns() to query which additional columns a particular method returns.

Examples


data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4),
  sei = c(0.1, 0.15, 0.08, 0.12)
)
result <- run_method("RMA", data, "default")

AK Method

Description

Implements the Andrews & Kasy (AK) method for publication bias correction in meta-analysis. The AK method categorizes estimated effects into groups with different probabilities of being published. AK1 uses symmetric selection grouping estimates into significant (|t| >= 1.96) and insignificant (|t| < 1.96) estimates. AK2 uses asymmetric selection with four groups based on both significance and sign: highly significant positive/negative effects and marginally significant positive/negative effects, each with different publication probabilities. See Andrews and Kasy (2019) for details.

Usage

## S3 method for class 'AK'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes), sei (standard errors), and study_id (for clustering wherever available)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": Uses AK1 estimator (symmetric selection)
"AK1": Symmetric selection model grouping estimates into significant (|t| >= 1.96) and insignificant (|t| < 1.96) categories with relative publication probabilities of 1 and p1 respectively.
"AK2": Asymmetric selection model with four groups based on t-statistics: (a) t >= 1.96, (b) t < -1.96, (c) -1.96 <= t < 0, and (d) 0 <= t < 1.96, with relative publication probabilities of 1, p1, p2, and p3 respectively.

Value

Data frame with AK results

References

Andrews I, Kasy M (2019). “Identification of and correction for publication bias.” American Economic Review, 109(8), 2766–2794. doi:10.1257/aer.20180310.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply AK method
result <- run_method("AK", data, "default")
print(result)

Endogenous Kink Method

Description

Implements the endogenous kink (EK) method proposed by Bom and Rachinger for publication bias correction in meta-analysis. This method modifies the PET-PEESE approach by incorporating a non-linear relationship between publication bias and standard errors through a kinked regression specification. The method recognizes that when the true effect is non-zero, there is minimal publication selection when standard errors are very small (since most estimates are significant), but selection increases as standard errors grow. The kink point is endogenously determined using a two-step procedure based on the confidence interval of the initial effect estimate. See Bom and Rachinger (2019) for details.

Usage

## S3 method for class 'EK'
method(method_name, data, settings = NULL)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (no settings version are implemented)

Value

Data frame with EK results

References

Bom PR, Rachinger H (2019). “A kinked meta-regression model for publication bias correction.” Research Synthesis Methods, 10(4), 497-514. doi:10.1002/jrsm.1352.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply EK method
result <- run_method("EK", data)
print(result)

Fixed Effects Meta-Analysis Method

Description

Implements the publication bias-unadjusted fixed effects meta-analysis.

Usage

## S3 method for class 'FMA'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": T-distribution adjustment (test = "t") and cluster robust standard errors with small-sample adjustment (if converged, otherwise no small-sample adjustment or no cluster robust standard errors) for fixed effects meta-analysis if study_ids is specified in the data

Value

Data frame with FMA results

References

There are no references for Rd macro ⁠\insertAllCites⁠ on this help page.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply FMA method
result <- run_method("FMA", data)
print(result)

PEESE (Precision-Effect Estimate with Standard Errors) Method

Description

Implements the Precision-Effect Estimate with Standard Errors method for publication bias correction. PEESE regresses effect sizes against standard errors^2 to correct for publication bias. The intercept represents the bias-corrected effect size estimate. See Stanley and Doucouliagos (2014) for details.

Usage

## S3 method for class 'PEESE'
method(method_name, data, settings = NULL)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (no settings version are implemented)

Value

Data frame with PEESE results

References

Stanley TD, Doucouliagos H (2014). “Meta-regression approximations to reduce publication selection bias.” Research Synthesis Methods, 5(1), 60–78. doi:10.1002/jrsm.1095.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply PEESE method
result <- run_method("PEESE", data)
print(result)

PET (Precision-Effect Test) Method

Description

Implements the Precision-Effect Test for publication bias correction. PET regresses effect sizes against standard errors to test for and correct publication bias. The intercept represents the bias-corrected effect size estimate. See Stanley and Doucouliagos (2014) for details.

Usage

## S3 method for class 'PET'
method(method_name, data, settings = NULL)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (no settings version are implemented)

Value

Data frame with PET results

References

Stanley TD, Doucouliagos H (2014). “Meta-regression approximations to reduce publication selection bias.” Research Synthesis Methods, 5(1), 60–78. doi:10.1002/jrsm.1095.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply PET method
result <- run_method("PET", data)
print(result)

PET-PEESE (Precision-Effect Test and Precision-Effect Estimate with Standard Errors) Method

Description

Implements the Precision-Effect Test and Precision-Effect Estimate with Standard Errors (PET-PEESE) regresses effect sizes against standard errors^2 to correct for publication bias. The intercept represents the bias-corrected effect size estimate. See Stanley and Doucouliagos (2014) for details.

Usage

## S3 method for class 'PETPEESE'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": (conditional_alpha = 0.10) determines whether to use PET (PET's effect is not significant at alpha = 0.10 or PEESE estimate (PET's effect is significant at alpha = 0.10)

Value

Data frame with PET-PEESE results

References

Stanley TD, Doucouliagos H (2014). “Meta-regression approximations to reduce publication selection bias.” Research Synthesis Methods, 5(1), 60–78. doi:10.1002/jrsm.1095.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply PETPEESE method
result <- run_method("PETPEESE", data)
print(result)

Random Effects Meta-Analysis Method

Description

Implements the publication bias-unadjusted random-effects meta-analysis.

Usage

## S3 method for class 'RMA'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": Restricted Maximum Likelihood estimator (method = "REML") with Knapp-Hartung adjustment (test = "knha") for a simple random effects meta-analysis and Restricted Maximum Likelihood estimator (method = "REML") with t-distribution adjustment (test = "t") and cluster robust standard errors with small-sample adjustment (if converged, otherwise no small-sample adjustment or no cluster robust standard errors) for a multilevel random effects meta-analysis if study_ids is specified in the data

Value

Data frame with RMA results

References

There are no references for Rd macro ⁠\insertAllCites⁠ on this help page.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply RMA method
result <- run_method("RMA", data)
print(result)

Robust Bayesian Meta-Analysis (RoBMA) Method

Description

Implements the robust Bayesian meta-analysis (RoBMA) method that uses Bayesian model-averaging to combine results across several complementary publication bias adjustment methods. See Maier et al. (2023) and Bartoš et al. (2023) for details. If "study_id" column is included in the data input, the method uses multilevel parameterization as described in Bartoš et al. (2025).

Note that the prior settings is dispatched based on "es_type" column attached to the dataset. The resulting estimates are then summarized on the same scale as was the dataset input (for "r", heterogeneity is summarized on Fisher's z).

Important: This method requires JAGS (Just Another Gibbs Sampler) to be installed on your system. Please download and install JAGS from https://mcmc-jags.sourceforge.io/ before using this method.

Usage

## S3 method for class 'RoBMA'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes), sei (standard errors), es_type (either "SMD" for Cohen's d / Hedge's g, "logOR" for log odds ratio, "z" for Fisher's z, or "r" for correlations. Defaults to "none" which re-scales the default priors to unit-information width based on total sample size supplied "ni".)

settings

List of method settings (see Details.)

Details

The following settings are implemented

"default": RoBMA-PSMA with publication bias adjustment as described in Bartoš et al. (2023). (the MCMC settings was reduced to speed-up the simulations) with the three-level specification whenever "study_ids" are supplied with the data
"PSMA": RoBMA-PSMA with publication bias adjustment as described in Bartoš et al. (2023). (the MCMC settings was reduced to speed-up the simulations) with the three-level specification whenever "study_ids" are supplied with the data

Value

Data frame with RoBMA results

References

Bartoš F, Maier M, Wagenmakers E (2025). “Robust Bayesian multilevel meta-analysis: Adjusting for publication bias in the presence of dependent effect sizes.” ArXiV Preprint. doi:10.31234/osf.io/9tgp2_v1.

Bartoš F, Maier M, Wagenmakers E, Doucouliagos H, Stanley TD (2023). “Robust Bayesian meta-analysis: Model-averaging across complementary publication bias adjustment methods.” Research Synthesis Methods, 14(1), 99–116. doi:10.1002/jrsm.1594.

Maier M, Bartoš F, Wagenmakers E (2023). “Robust Bayesian meta-analysis: Addressing publication bias with model-averaging.” Psychological Methods, 28(1), 107-122. doi:10.1037/met0000405.

Examples


# Generate some example data
data <- data.frame(
  yi      = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei     = c(0.1, 0.15, 0.08, 0.12, 0.09),
  es_type = "SMD"
)

# Apply RoBMA method
result <- run_method("RoBMA", data)
print(result)

SM (Selection Models) Method

Description

Implements selection models for publication bias correction in meta-analysis. The method first fits a random effects meta-analysis model, then applies selection modeling to adjust for publication bias using the metafor package. Selection models account for the probability that studies are published based on their p-values or effect sizes. See Vevea and Hedges (1995) for details.

Usage

## S3 method for class 'SM'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default" or "3PSM": 3-parameter step function selection model with Maximum Likelihood estimator (method = "ML") and one step at one-sided p = 0.025 (i.e., selection for significance))
"4PSM": 4-parameter step function selection model with Maximum Likelihood estimator (method = "ML") and two steps at one-sided p = 0.025 and p = 0.50 (i.e., selection for significance and direction of the effect)

Value

Data frame with SM results

References

Vevea JL, Hedges LV (1995). “A general linear model for estimating effect size in the presence of publication bias.” Psychometrika, 60(3), 419–435. doi:10.1007/BF02294384.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply SM method
result <- run_method("SM", data, "3PSM")
print(result)

WAAPWLS (Weighted Average of Adequately Powered Studies) Method

Description

Implements the WAAP-WLS method for meta-analysis, which combines WLS and WAAP approaches. First fits a WLS model to all studies, then identifies high-powered studies based on the criterion that the WLS estimate divided by 2.8 is greater than or equal to the standard error. If at least 2 high-powered studies are found, uses WAAP (weighted average of adequate power studies only), otherwise uses the original WLS estimate. See Stanley et al. (2017) for details.

Usage

## S3 method for class 'WAAPWLS'
method(method_name, data, settings = NULL)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (no settings version are implemented)

Value

Data frame with WAAPWLS results

References

Stanley TD, Doucouliagos H, Ioannidis JP (2017). “Finding the power to reduce publication bias.” Statistics in Medicine, 36(10), 1580-1598. doi:10.1002/sim.7228.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply WAAPWLS method
result <- run_method("WAAPWLS", data)
print(result)

Weighted and Iterated Least Squares (WILS) Method

Description

Implements the weighted and iterated least squares (WILS) method for publication bias correction in meta-analysis. The method is based on the idea of using excess statistical significance (ESS) to identify how many underpowered studies should be removed to reduce publication selection bias. See Stanley and Doucouliagos (2024) for details.

Usage

## S3 method for class 'WILS'
method(method_name, data, settings = NULL)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details)

Details

The WILS method has two implementation versions based on Stanley & Doucouliagos (2024). The following settings are implemented

"default": The simulation version (default) uses residuals from the t ~ Precision regression for the first iteration, then switches to individual excess statistical significance (ESS) for subsequent iterations.
"example": The example version consistently uses residuals from the t ~ Precision regression to identify studies to remove across all iterations.

Value

Data frame with WILS results

References

Stanley TD, Doucouliagos H (2024). “Harnessing the power of excess statistical significance: Weighted and iterative least squares.” Psychological Methods, 29(2), 407–420. doi:10.1037/met0000502.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply WILS method
result <- run_method("WILS", data)
print(result)

WLS (Weighted Least Squares) Method

Description

Implements the Weighted Least Squares method for meta-analysis. WLS fits a weighted regression model with effect sizes as the outcome and weights based on the inverse of the squared standard errors. The intercept represents the weighted average effect size estimate.

Usage

## S3 method for class 'WLS'
method(method_name, data, settings = NULL)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (no settings version are implemented)

Value

Data frame with WLS results

References

There are no references for Rd macro ⁠\insertAllCites⁠ on this help page.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply WLS method
result <- run_method("WLS", data)
print(result)

Default method handler

Description

Default method handler

Usage

## Default S3 method:
method(method_name, data, settings = list())

Arguments

method_name

Character string specifying the method type

data

Data frame containing yi (effect sizes) and sei (standard errors)

settings

Either a character identifying a method version or list containing method-specific settings. An emty input will result in running the default (first implemented) version of the method.

Value

Throws an error indicating the method type is unknown. This default method is only called when no specific method implementation is found for the given method_name.

Mean Method

Description

Implements the unweighted mean method. I.e., the mean of observed effect sizes.

Usage

## S3 method for class 'mean'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": No settings

Value

Data frame with mean results

References

There are no references for Rd macro ⁠\insertAllCites⁠ on this help page.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply mean method
result <- run_method("mean", data)
print(result)

pcurve (P-Curve) Method

Description

Implements the p-Curve method which analyzes the distribution of p-values from significant studies to assess whether the significant findings reflect true effects or QRP/publication bias. The method also provides tests for the evidential value, lack of evidential value, and p-hacking. See Simonsohn et al. (2014) for details.

The current implementation does not provide a test against the null hypothsis of no effect and does not produce confidence intervals of the estimate.

Usage

## S3 method for class 'pcurve'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes), sei (standard errors), and ni (sample sizes wherever available, otherwise set to Inf)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": no options

Value

Data frame with P-Curve results

References

Simonsohn U, Nelson LD, Simmons JP (2014). “p-curve and effect size: Correcting for publication bias using only significant results.” Perspectives on Psychological Science, 9(6), 666–681. doi:10.1177/1745691614553988.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply pcurve method
result <- run_method("pcurve", data)
print(result)

puniform (P-Uniform) Method

Description

Implements the p-uniform method for publication bias detection and correction. P-uniform uses the distribution of p-values from significant studies to test for publication bias and estimate the effect size corrected for publication bias. The method assumes that p-values follow a uniform distribution under the null hypothesis of no effect, and uses this to detect and correct for bias. See van Assen et al. (2015) and van Aert and van Assen (2025) for details.

Usage

## S3 method for class 'puniform'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details)

Details

The following settings are implemented

"default": Default p-uniform analysis settings.
"star": P-uniform star version of the method.

Value

Data frame with P-Uniform results

References

van Aert RCM, van Assen MALM (2025). “Correcting for publication bias in a meta-analysis with the p-uniform* method.” Psychonomic Bulletin & Review. https://osf.io/preprints/metaarxiv/zqjr9/.

van Assen MALM, van Aert RCM, Wicherts JM (2015). “Meta-analysis using effect size distributions of only statistically significant studies.” Psychological Methods, 20(3), 293–309. doi:10.1037/met0000025.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply puniform method
result <- run_method("puniform", data)
print(result)

Trim-and-Fill Meta-Analysis Method

Description

Implements the trim-and-fill method for adjusting publication bias in meta-analysis using the metafor package.

Usage

## S3 method for class 'trimfill'
method(method_name, data, settings)

Arguments

method_name

Method name (automatically passed)

data

Data frame with yi (effect sizes) and sei (standard errors)

settings

List of method settings (see Details.)

Details

The following settings are implemented

"default": Random effects model fitted with Restricted Maximum Likelihood estimator (method = "REML") with Knapp-Hartung adjustment (test = "knha"), followed by trim-and-fill using left-side trimming (side = "left") and L0 estimator (estimator = "L0").

Value

Data frame with trim-and-fill results

References

There are no references for Rd macro ⁠\insertAllCites⁠ on this help page.

Examples

# Generate some example data
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
  sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)

# Apply trimfill method
result <- run_method("trimfill", data)
print(result)

Method Extra Columns

Description

Retrieves the character vector of custom columns for a given method. These are method-specific columns beyond the standard columns (method, estimate, standard_error, ci_lower, ci_upper, p_value, BF, convergence, note) that each method returns.

When implementing new methods, consider using standardized column names for consistency:

Heterogeneity: tau_estimate, tau_ci_lower, tau_ci_upper, tau_p_value, tau_BF
Publication Bias: bias_coefficient, bias_coefficient_se, bias_p_value, bias_BF

Usage

get_method_extra_columns(method_name)

method_extra_columns(method_name)

## Default S3 method:
method_extra_columns(method_name)

Arguments

method_name

Character string of the method name

Value

Character vector of extra column names, or empty character vector if no extra columns are defined for the method

Examples

# Get extra columns for PET method
get_method_extra_columns("PET")

# Get extra columns for RMA method
get_method_extra_columns("RMA")

Return Pre-specified Method Settings

Description

This function returns the list of pre-specified settings for a given Method

Usage

method_settings(method_name)

get_method_setting(method_name, version_id)

Arguments

method_name

Character string specifying the method type

version_id

which method version should be used.

Value

A list containing the pre-specified settings. For most methods, the list contains extension of the function call, however, a more elaborate list of settings that is dispatched within the method call is possible.

Examples

method_settings("RMA")
get_method_setting("RMA", version_id = "default")

Retrieve a Pre-Simulated Condition and Repetition From a DGM

Description

This function returns a pre-simulated dataset of a given repetition and condition from a dgm. The pre-simulated datasets must be already stored locally. See download_dgm function for more guidance.

Usage

retrieve_dgm_dataset(dgm_name, condition_id, repetition_id = NULL)

Arguments

dgm_name

Character string specifying the DGM type

condition_id

which conditions should settings be returned for.

repetition_id

Which repetition should be returned. The complete condition can be returned by setting to either NULL.

Value

A data.frame

Examples


  # get condition 1, repetition 1
  retrieve_dgm_dataset("no_bias", condition_id = 1, repetition_id = 1)

  # get condition 1, all repetitions
  retrieve_dgm_dataset("no_bias", condition_id = 1)

Retrieve Pre-Computed Performance measures for a DGM

Description

This function returns pre-computed performance measures for a specified Data-Generating Mechanism (DGM). The pre-computed measures must be already stored locally. See download_dgm_measures() function for more guidance.

Usage

retrieve_dgm_measures(
  dgm_name,
  measure = NULL,
  method = NULL,
  method_setting = NULL,
  condition_id = NULL,
  replacement = FALSE
)

Arguments

dgm_name

Character string specifying the DGM type

measure

Which performance measure should be returned (e.g., "bias", "mse", "coverage"). All measures can be returned by setting to NULL.

method

Which method(s) should be returned. The complete results are returned by setting to NULL (default setting).

method_setting

Which method setting(s) should be returned. The complete results are returned by setting to NULL (default setting).

condition_id

which conditions should settings be returned for.

replacement

Whether performance measures computed using replacement should be returned. Defaults to FALSE.

Value

A data.frame

Examples


  # get bias measures for all methods and conditions
  retrieve_dgm_measures("no_bias", measure = "bias")

  # get all measures for RMA method
  retrieve_dgm_measures("no_bias", method = "RMA")

  # get MSE measures for PET method in condition 1
  retrieve_dgm_measures("no_bias", measure = "mse", method = "PET", condition_id = 1)

Retrieve a Pre-Computed Results of a Method Applied to DGM

Description

This function returns a pre-computed results of a given method at a specific repetition and condition from a dgm. The pre-computed results must be already stored locally. See download_dgm_results() function for more guidance.

Usage

retrieve_dgm_results(
  dgm_name,
  method = NULL,
  method_setting = NULL,
  condition_id = NULL,
  repetition_id = NULL
)

Arguments

dgm_name

Character string specifying the DGM type

method

Which method(s) should be returned. The complete results are returned by setting to NULL (default setting).

method_setting

Which method setting(s) should be returned. The complete results are returned by setting to NULL (default setting).

condition_id

which conditions should settings be returned for.

repetition_id

Which repetition should be returned. The complete condition can be returned by setting to either NULL.

Value

A data.frame

Examples


  # get condition 1, repetition 1 for default method setting
  retrieve_dgm_results("no_bias", condition_id = 1, repetition_id = 1)

  # get condition 1, all repetitions for default method setting
  retrieve_dgm_results("no_bias", condition_id = 1)

Generic method function for publication bias correction

Description

This function provides a unified interface to various publication bias correction methods. The specific method is determined by the first argument. See vignette("Adding_New_Methods", package = "PublicationBiasBenchmark") for details of extending the package with new methods

Usage

run_method(method_name, data, settings = NULL, silent = FALSE)

Arguments

method_name

Character string specifying the method type

data

Data frame containing yi (effect sizes) and sei (standard errors)

settings

Either a character identifying a method version or list containing method-specific settings. An emty input will result in running the default (first implemented) version of the method.

silent

Logical indicating whether error messages from the method should be suppressed.

Value

A data frame with standardized method results

Output Structure

The returned data frame follows a standardized schema that downstream functions rely on. All methods return the following columns:

method (character): The name of the method used.
estimate (numeric): The meta-analytic effect size estimate.
standard_error (numeric): Standard error of the estimate.
ci_lower (numeric): Lower bound of the 95% confidence interval.
ci_upper (numeric): Upper bound of the 95% confidence interval.
p_value (numeric): P-value for the estimate.
BF (numeric): Bayes Factor for the estimate.
convergence (logical): Whether the method converged successfully.
note (character): Additional notes describing convergence issues.

Some methods may include additional method-specific columns beyond these standard columns. Use get_method_extra_columns() to query which additional columns a particular method returns.

Examples

# Example usage with RMA method
data <- data.frame(
  yi = c(0.2, 0.3, 0.1, 0.4),
  sei = c(0.1, 0.15, 0.08, 0.12)
)
result <- run_method("RMA", data, "default")

Simulate From Data-Generating Mechanism

Description

This function provides a unified interface to various data-generating mechanisms for simulation studies. The specific DGM is determined by the first argument. See vignette("Adding_New_DGMs", package = "PublicationBiasBenchmark") for details of extending the package with new DGMs.

Usage

simulate_dgm(dgm_name, settings)

Arguments

dgm_name

Character string specifying the DGM type

settings

List containing the required parameters for the DGM or numeric condition_id

Value

A data frame containing the generated data with standardized structure

Output Structure

The returned data frame follows a standardized schema that downstream functions rely on. Across the currently implemented DGMs, the following columns are used:

yi (numeric): The effect size estimate.
sei (numeric): Standard error of yi.
ni (integer): Total sample size for the estimate (e.g., sum over groups where applicable).
es_type (character): Effect size type, used to disambiguate the scale of yi. Currently used values are "SMD" (standardized mean difference / Cohen's d), "logOR" (log odds ratio), and "none" (unspecified generic continuous coefficient).
study_id (integer/character, optional): Identifier of the primary study/cluster when a DGM yields multiple estimates per study (e.g., Alinaghi2018, PRE). If absent, each row is treated as an independent study.

Examples


simulate_dgm("Carter2019", 1)

simulate_dgm("Carter2019", list(mean_effect = 0, effect_heterogeneity = 0,
                       bias = "high", QRP = "high", n_studies = 10))

simulate_dgm("Stanley2017", list(environment = "SMD", mean_effect = 0,
                        effect_heterogeneity = 0, bias = 0, n_studies = 5,
                        sample_sizes = c(32,64,125,250,500)))

Upload Datasets of a DGM

Description

This function uploads datasets of a specified Data-Generating Mechanism (DGM) to the OSF repository at https://osf.io/exf3m/.

This is an internal function intended for the benchmark maintainer. It requires OSF repository authentication (via osfr::osf_auth()) and repository access.

Usage

upload_dgm_datasets(dgm_name, overwrite = FALSE, progress = TRUE, max_try = 10)

upload_dgm_results(dgm_name, overwrite = FALSE, progress = TRUE, max_try = 10)

upload_dgm_measures(dgm_name, overwrite = TRUE, progress = TRUE, max_try = 10)

Arguments

dgm_name

Character string specifying the name of the DGM dataset to upload.

overwrite

Logical indicating whether to overwrite existing files on OSF. Defaults to TRUE for performance measures and FALSE for results and datasets

progress

Logical indicating whether to show progress uploading files. Defaults to TRUE.

max_try

Integet specifying how many times should the function attempt reconnecting to OSF upon failure.

Value

TRUE if the upload was successful, otherwise an error is raised.

Validate DGM Settings

Description

This function validates the settings provided for a given Data Generating Mechanism (DGM).

Usage

validate_dgm_setting(dgm_name, settings)

Arguments

dgm_name

Character string specifying the DGM type

settings

List containing the required parameters for the DGM or numeric condition_id

Value

Error or TRUE depending whether the settings are valid for the specified DGM.

Examples

validate_dgm_setting("Carter2019", list(mean_effect = 0,
                        effect_heterogeneity = 0, bias = "high",
                        QRP = "high", n_studies = 10))

validate_dgm_setting("Alinaghi2018", list(environment = "FE",
                        mean_effect = 0, bias = "positive"))

validate_dgm_setting("Stanley2017", list(environment = "SMD",
                        mean_effect = 0,
                        effect_heterogeneity = 0, bias = 0, n_studies = 5,
                        sample_sizes = c(32,64,125,250,500)))

PublicationBiasBenchmark: Benchmark for Publication Bias Correction Methods

Description

Author(s)

See Also

Options for the PublicationBiasBenchmark package

Description

Usage

Arguments

Details

Value

Calculate sample variance of generic statistic

Description

Usage

Arguments

Value

Calculate sample variance of squared errors

Description

Usage

Arguments

Value

Calculate sample variance of estimates

Description

Usage

Arguments

Value

Calculate sample variance of CI widths

Description

Usage

Arguments

Value

Compare method with Multiple Measures for a DGM

Description

Usage

Arguments

Value

Compare method with a Single Measure for a DGM

Description

Usage

Arguments

Value

Compute Multiple Performance measures for a DGM

Description

Usage

Arguments

Value

Examples

Compute Performance Measures

Description

Usage

Arguments

Value

Create standardized empty method result for convergence failures

Description

Usage

Arguments

Value

DGM Method

Description

Usage

Arguments

Value

Output Structure

See Also

Examples

Alinaghi and Reed (2018) Data-Generating Mechanism

Description

Usage

Arguments

Details

Value

References

See Also

Bom and Rachinger (2019) Data-Generating Mechanism

Description

Usage

Arguments

Details

Value

References

See Also