ReproStat

Reproducibility Diagnostics for Statistical Modeling

ReproStat helps you answer a practical question that standard model summaries do not answer well:

If the data changed a little, would the substantive result still look the same?

The package repeatedly perturbs a dataset, refits the model, and measures how stable the outputs remain. It summarizes that behavior through coefficient, p-value, variable-selection, and prediction stability summaries, plus a composite Reproducibility Index.

This makes ReproStat useful when you want to move beyond single-fit inference and assess how sensitive your modeling conclusions are to ordinary data variation.


What ReproStat Is For

ReproStat is designed for analysts who want to know whether a model result is robust to ordinary data variation or fragile enough that small changes could alter the conclusion.

ReproStat does not claim to prove scientific reproducibility on its own. It quantifies the stability of model outputs under controlled perturbation schemes so you can assess how fragile or dependable the modeling result appears.


Installation

If the package is available on CRAN:

install.packages("ReproStat")

To install the development version from GitHub:

remotes::install_github("ntiGideon/ReproStat")

Optional dependencies

Package   Purpose
MASS      Robust M-estimation backend via backend = "rlm"
glmnet    Penalized regression backend via backend = "glmnet"
ggplot2   ggplot-based plotting helpers

Core Idea

The package follows a simple workflow:

original data
    ->
perturb data many times
    ->
refit the model each time
    ->
measure how much results change
    ->
summarize the stability of those changes

If coefficient signs, significance decisions, selected variables, predictions, and model rankings remain similar across perturbations, the analysis is more reproducible in the sense ReproStat measures. If they vary substantially, the result may be fragile.
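The loop ReproStat automates can be sketched in a few lines of base R. This is illustrative only, not the package's internal code; it tracks a single stability dimension (coefficient signs) under bootstrap perturbation:

```r
# Hand-rolled perturb -> refit -> measure loop (illustrative sketch).
set.seed(1)
B      <- 200
fit0   <- lm(mpg ~ wt + hp + disp, data = mtcars)
signs0 <- sign(coef(fit0))

sign_agreement <- replicate(B, {
  idx <- sample(nrow(mtcars), replace = TRUE)         # bootstrap resample
  fit <- lm(mpg ~ wt + hp + disp, data = mtcars[idx, ])
  mean(sign(coef(fit)) == signs0)                     # share of coefficients keeping their sign
})

mean(sign_agreement)  # near 1 means coefficient signs are highly reproducible
```

ReproStat extends this idea across multiple perturbation schemes and output dimensions, and packages the results for the summary functions shown below.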


Quick Start

library(ReproStat)

set.seed(1)

diag_obj <- run_diagnostics(
  mpg ~ wt + hp + disp,
  data = mtcars,
  B = 200,
  method = "bootstrap"
)

print(diag_obj)
coef_stability(diag_obj)
pvalue_stability(diag_obj)
selection_stability(diag_obj)
prediction_stability(diag_obj)
reproducibility_index(diag_obj)
ri_confidence_interval(diag_obj, R = 500, seed = 1)

Main Workflow

1. Fit the diagnostic object

run_diagnostics() is the main entry point. It fits the original model, perturbs the data B times, refits the model, and stores the results for downstream summaries.

diag_obj <- run_diagnostics(
  mpg ~ wt + hp + disp,
  data = mtcars,
  B = 200,
  method = "bootstrap"
)

2. Inspect individual stability dimensions

coef_stability(diag_obj)
pvalue_stability(diag_obj)
selection_stability(diag_obj)
prediction_stability(diag_obj)

These functions answer different questions:

- coef_stability(): do coefficient signs and magnitudes stay consistent across refits?
- pvalue_stability(): do significance decisions persist under perturbation?
- selection_stability(): are the same variables selected each time?
- prediction_stability(): how much do the model's predictions vary?

3. Summarize with the Reproducibility Index

ri <- reproducibility_index(diag_obj)
ri

The RI is a compact summary of multiple stability components, scaled to 0-100.

4. Visualize the result

oldpar <- par(mfrow = c(2, 2))
plot_stability(diag_obj, "coefficient")
plot_stability(diag_obj, "pvalue")
plot_stability(diag_obj, "selection")
plot_stability(diag_obj, "prediction")
par(oldpar)

Interpreting the Outputs

Perturbation methods

Method        What it tests                     Good when you want to assess
"bootstrap"   resampling variability            ordinary data-driven stability
"subsample"   sample composition sensitivity    robustness to who enters the sample
"noise"       measurement perturbation          sensitivity to noisy recorded values

Reproducibility Index

The RI is best treated as an interpretive summary, not as a hard universal threshold.

RI range   Interpretation
90-100     Highly stable under the chosen perturbation design
70-89      Moderately stable; overall pattern is fairly dependable
50-69      Mixed stability; some conclusions may be sensitive
< 50       Low stability; investigate model dependence and data fragility

Two important cautions: the RI is always relative to the perturbation design you chose, and a high RI quantifies stability of model outputs, not the scientific reproducibility of the finding itself.
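For intuition only, here is a toy sketch of how several component stabilities might be combined into a single 0-100 score. This is not ReproStat's actual formula; the component names and the unweighted average are invented for illustration:

```r
# Toy illustration only -- NOT ReproStat's actual RI computation.
# Suppose each stability dimension has been scored on [0, 1]:
components <- c(coef = 0.95, pvalue = 0.88, selection = 0.90, prediction = 0.85)

# One plausible composite: a simple average, rescaled to 0-100.
ri_toy <- 100 * mean(components)
ri_toy  # 89.5
```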


Supported Backends

ReproStat supports four model-fitting backends through the same interface:

Backend    Model family                Notes
"lm"       ordinary least squares      default
"glm"      generalized linear models   use family = ...
"rlm"      robust regression           requires MASS
"glmnet"   penalized regression        requires glmnet

Examples:

# Logistic regression
diag_glm <- run_diagnostics(
  am ~ wt + hp + qsec,
  data = mtcars,
  B = 100,
  backend = "glm",
  family = stats::binomial()
)

# Robust regression
if (requireNamespace("MASS", quietly = TRUE)) {
  diag_rlm <- run_diagnostics(
    mpg ~ wt + hp,
    data = mtcars,
    B = 100,
    backend = "rlm"
  )
}

# Penalized regression
if (requireNamespace("glmnet", quietly = TRUE)) {
  diag_lasso <- run_diagnostics(
    mpg ~ wt + hp + disp + qsec,
    data = mtcars,
    B = 100,
    backend = "glmnet",
    en_alpha = 1
  )
}

Comparing Models with CV Ranking Stability

ReproStat also helps evaluate model-selection stability, not just single-model stability.

cv_ranking_stability() repeatedly runs cross-validation across competing formulas and records how often each model ranks best.

models <- list(
  baseline = mpg ~ wt + hp,
  medium   = mpg ~ wt + hp + disp,
  full     = mpg ~ wt + hp + disp + qsec
)

cv_obj <- cv_ranking_stability(models, mtcars, v = 5, R = 50)
cv_obj$summary
plot_cv_stability(cv_obj, metric = "top1_frequency")
plot_cv_stability(cv_obj, metric = "mean_rank")

This matters when the model with the lowest average error is not the model that ranks best most consistently.
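The distinction can be seen with a tiny made-up example: one model can win most cross-validation repeats while another has the lower average error.

```r
# Made-up CV errors (rows = models, columns = CV repeats).
errs <- rbind(
  baseline = c(3.1, 3.1, 3.1, 3.1, 9.0),
  full     = c(3.0, 3.2, 3.2, 3.2, 3.2)
)

rowMeans(errs)                        # 'full' has the lower mean error (3.16 vs 4.28)
rowMeans(apply(errs, 2, rank) == 1)   # but 'baseline' ranks first in 3 of 5 repeats
```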


Example: End-to-End Analysis

library(ReproStat)

set.seed(42)

diag_obj <- run_diagnostics(
  mpg ~ wt + hp + disp,
  data = mtcars,
  B = 200,
  method = "bootstrap"
)

# Individual summaries
coef_stability(diag_obj)
pvalue_stability(diag_obj)
selection_stability(diag_obj)
prediction_stability(diag_obj)$mean_variance

# Composite summary
ri <- reproducibility_index(diag_obj)
cat(sprintf("RI = %.1f\n", ri$index))
ri_confidence_interval(diag_obj, R = 500, seed = 42)

# Visuals
oldpar <- par(mfrow = c(2, 2))
plot_stability(diag_obj, "coefficient")
plot_stability(diag_obj, "pvalue")
plot_stability(diag_obj, "selection")
plot_stability(diag_obj, "prediction")
par(oldpar)

Learn More on the pkgdown Site

The pkgdown site provides the full documentation, including the function reference and longer-form articles.


Citation

If you use ReproStat in published work, cite the package and any associated manuscript or software record that accompanies the release you used.


License

GPL (>= 3)