| Title: | Censored Linear Regression Models under Heavy‑tailed Distributions |
| Version: | 0.0.1 |
| Maintainer: | Yessenia Alvarez Gil <yessenia.alvarez@ufpe.br> |
| Description: | Functions for fitting univariate linear regression models under Scale Mixtures of Skew-Normal (SMSN) distributions, considering left, right or interval censoring and missing responses. Estimation is performed via an EM-type algorithm. Includes selection criteria, sample generation and envelope. For details, see Gil, Y.A., Garay, A.M., and Lachos, V.H. (2025) <doi:10.1007/s10260-025-00797-x>. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Depends: | R (≥ 3.5.0) |
| Imports: | mvtnorm, mnormt, cubature, ggplot2 |
| NeedsCompilation: | no |
| Packaged: | 2025-11-15 02:01:56 UTC; Usuario |
| Author: | Yessenia Alvarez Gil [aut, cre], Aldo M. Garay [aut], Victor H. Lachos [aut] |
| Repository: | CRAN |
| Date/Publication: | 2025-11-19 19:40:12 UTC |
Fit Censored Linear Regression Model under Scale Mixtures of Skew-Normal Distributions
Description
Fits a univariate linear regression model with censoring and/or missing values in the response variable, assuming it follows a distribution from the Scale Mixtures of Skew-Normal (SMSN) family. Computes standard errors using the empirical information matrix and provides model selection criteria (AIC, BIC, CAIC, HQ). Optionally generates envelope plots based on martingale residuals.
Usage
CensRegSMSN(
  cc,
  x,
  y,
  beta = NULL,
  sigma2 = NULL,
  lambda = NULL,
  nu = NULL,
  cens = "Int",
  UL = NULL,
  get.init = TRUE,
  show.envelope = FALSE,
  error = 1e-04,
  iter.max = 300,
  family = "ST",
  verbose = TRUE
)
Arguments
cc |
Indicator vector for incomplete observations of length |
x |
Design matrix (of dimension |
y |
Response vector of length |
beta |
Optional initial values for the regression coefficients. Default is NULL. |
sigma2 |
Optional initial value for the scale parameter. Default is NULL. |
lambda |
Optional initial value for the shape parameter (for skewed distributions). Default is NULL. |
nu |
Optional initial value for the distribution-specific parameter. Required for |
cens |
Character indicating the type of censoring. Should be one of |
UL |
Vector of upper limits of length |
get.init |
Logical; if |
show.envelope |
Logical; if |
error |
Convergence threshold for the algorithm. Default is 1e-04. |
iter.max |
Maximum number of iterations allowed in the algorithm. Default is 300. |
family |
Character string indicating the distribution family. Possible values include: |
verbose |
Logical indicating whether results should be printed to the console. Default is TRUE. |
Details
The model assumes that the response variable follows a distribution from the Scale Mixtures of Skew-Normal (SMSN) family, which allows for heavy tails and/or asymmetry.
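To illustrate how a member of the SMSN family combines asymmetry and heavy tails, the following base R sketch draws from a skew-t distribution through the usual scale-mixture representation: a skew-normal variate divided by the square root of a Gamma mixing variable. The function name r_skew_t and its parametrization (location mu, scale sigma2, shape lambda, degrees of freedom nu) are assumptions made for this illustration; it is not the sampler used inside the package.

```r
# Sketch: skew-t draws via the scale-mixture representation
#   Y = mu + Z / sqrt(U),  Z ~ skew-normal(0, sigma2, lambda),
#   U ~ Gamma(nu / 2, rate = nu / 2).
# r_skew_t is a hypothetical helper, not a function of the package.
r_skew_t <- function(n, mu = 0, sigma2 = 1, lambda = 0, nu = 4) {
  delta <- lambda / sqrt(1 + lambda^2)
  t0 <- abs(rnorm(n))   # half-normal component, source of the skewness
  t1 <- rnorm(n)        # symmetric normal component
  z  <- sqrt(sigma2) * (delta * t0 + sqrt(1 - delta^2) * t1)  # skew-normal
  u  <- rgamma(n, shape = nu / 2, rate = nu / 2)              # mixing variable
  mu + z / sqrt(u)
}

set.seed(1)
y_sym  <- r_skew_t(1e4, lambda = 0, nu = 4)  # symmetric Student-t case
y_skew <- r_skew_t(1e4, lambda = 3, nu = 4)  # right-skewed case
c(median_symmetric = median(y_sym), median_skewed = median(y_skew))
```

Setting lambda = 0 recovers a symmetric heavy-tailed distribution, while a large nu brings the draws close to the skew-normal case.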
Interval censoring is a general framework that includes left and right censoring and missing responses, providing a unified treatment for all cases.
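The unified treatment can be pictured by writing every observation as an interval with a lower and an upper limit. The sketch below does this for five hypothetical observations and computes each likelihood contribution under normal errors, used here only to keep the example short (the package works with SMSN distributions). The vectors lower and upper and the helper contrib are illustrative names; they do not describe how cc, y and UL are encoded by the package.

```r
# Sketch: one interval per observation (values are hypothetical).
kind  <- c("observed", "left", "right", "interval", "missing")
lower <- c(1.2, -Inf, 2.0, 0.5, -Inf)
upper <- c(1.2,  0.3, Inf, 1.5,  Inf)

# Likelihood contribution under N(mu, sigma^2) errors:
# the density for exact values, an interval probability otherwise.
contrib <- function(lower, upper, mu = 0, sigma = 1) {
  exact <- lower == upper
  ifelse(exact,
         dnorm(lower, mu, sigma),
         pnorm(upper, mu, sigma) - pnorm(lower, mu, sigma))
}

res <- data.frame(kind, lower, upper, contribution = contrib(lower, upper))
res
```

A missing response corresponds to the interval (-Inf, Inf), whose probability is 1, so it adds nothing to the likelihood beyond what the covariates carry.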
For the Skew Contaminated Normal ("SCN") and the Contaminated Normal ("CN") distributions, the nu parameter must be a two-dimensional vector with values in the interval (0, 1).
Value
A list with the following components:
beta |
Estimated regression coefficients. |
sigma2 |
Estimated scale parameter. |
lambda |
Estimated shape parameter. For symmetric distributions ("N", "T", "CN"), this is zero. |
nu |
Estimated parameters of the scale mixture distribution. |
SE |
Standard errors of the estimated parameters. |
iter |
Number of iterations until convergence. |
logver |
Value of the log-likelihood function at convergence, computed under the fitted model. |
AIC, BIC, CAIC, HQ |
Information criteria for model selection. |
residual |
Transformed martingale residuals used for envelope plots. Returned only if |
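For reference, the four criteria can be computed from a log-likelihood value with their textbook definitions, as in the sketch below. The helper info_criteria, the number of parameters k and the numeric inputs are hypothetical; the exact formulas used by the package are assumed to match these definitions and should be checked against the reference.

```r
# Sketch: textbook information criteria (smaller is better).
# loglik: log-likelihood at convergence; k: number of estimated
# parameters; n: sample size. info_criteria is a hypothetical helper.
info_criteria <- function(loglik, k, n) {
  c(AIC  = -2 * loglik + 2 * k,
    BIC  = -2 * loglik + k * log(n),
    CAIC = -2 * loglik + k * (log(n) + 1),
    HQ   = -2 * loglik + 2 * k * log(log(n)))
}

# Illustrative values only, not output from a fitted model
info_criteria(loglik = -750.3, k = 5, n = 500)
```

When comparing fits from different families on the same data, the model with the smallest value of a given criterion is preferred.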
References
Gil, Y. A., Garay, A. M. & Lachos, V. H. Likelihood-based inference for interval censored regression models under heavy-tailed distributions. Stat Methods Appl 34, 519–544 (2025). doi:10.1007/s10260-025-00797-x.
Examples
# See examples in ?gen_SMSNCens_sample for a complete workflow
# illustrating data generation and model fitting.
Generate Simulated Censored Data under Heavy‑tailed Distributions
Description
Simulates a univariate linear regression dataset with censoring and/or missing values in the response variable, assuming that the error follows an SMSN distribution.
Usage
gen_SMSNCens_sample(
  n,
  x,
  beta,
  sigma2,
  lambda,
  nu,
  cens = "Int",
  pcens = 0,
  pna = 0,
  family = "ST"
)
Arguments
n |
Integer. Sample size to be generated. |
x |
Numeric matrix of covariates (dimension |
beta |
Numeric vector of regression coefficients of length |
sigma2 |
Positive numeric scalar. Scale parameter of SMSN class. |
lambda |
Numeric scalar. Shape parameter that controls the skewness in the SMSN class. Ignored when |
nu |
Distribution-specific parameter: for |
cens |
Character string indicating the type of censoring: |
pcens |
Proportion of censored observations. Must be between 0 and 1. Default is 0. |
pna |
Proportion of missing values (treated as extreme interval censoring). Must be between 0 and 1. Only allowed when |
family |
Character string indicating the error distribution family. Possible values: |
Details
The following procedures are applied to the generated response variable to produce incomplete observations:
- Left censoring: values below a cutoff point (defined based on pcens) are replaced by that cutoff, indicating that the true value is less than or equal to it.
- Right censoring: values above a cutoff point (also based on pcens) are replaced by that value, indicating that the true value is greater than or equal to it.
- Interval censoring: a subset of observations is randomly selected (based on pcens), and each value is replaced by an interval centered at the true value.
- Missing data: an additional subset of observations (defined based on pna) is replaced by unbounded intervals of the form (-Inf, Inf), representing complete uncertainty about the true value.
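The procedures above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the latent response is drawn from a normal distribution for brevity, the cutoffs are taken as sample quantiles at pcens and 1 - pcens, and the interval half-width (half) is an arbitrary constant. All object names are hypothetical.

```r
set.seed(42)
n     <- 200
y     <- rnorm(n)   # stand-in for the latent response (normal, for brevity)
pcens <- 0.1
pna   <- 0.05

# Left censoring: values at or below the cutoff are replaced by the cutoff.
cut_left <- unname(quantile(y, pcens))       # assumed cutoff rule
cc_left  <- as.integer(y <= cut_left)
y_left   <- pmax(y, cut_left)

# Right censoring: values at or above the cutoff are replaced by the cutoff.
cut_right <- unname(quantile(y, 1 - pcens))  # assumed cutoff rule
cc_right  <- as.integer(y >= cut_right)
y_right   <- pmin(y, cut_right)

# Interval censoring: a random subset gets an interval centered at the
# true value; a further, disjoint subset becomes missing, (-Inf, Inf).
idx_int <- sample(n, round(pcens * n))
idx_na  <- sample(setdiff(seq_len(n), idx_int), round(pna * n))
half    <- 0.5                               # hypothetical half-width
lower <- upper <- y
lower[idx_int] <- y[idx_int] - half
upper[idx_int] <- y[idx_int] + half
lower[idx_na]  <- -Inf
upper[idx_na]  <- Inf
cc_int <- as.integer(seq_len(n) %in% c(idx_int, idx_na))

# Observed proportions of incomplete responses under each scheme
c(left = mean(cc_left), right = mean(cc_right),
  interval_or_missing = mean(cc_int))
```

Uncensored observations keep lower equal to upper, so a single pair of vectors describes complete, interval-censored and missing responses alike.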
Value
A list with the following components:
y |
Fully observed response values (uncensored). |
yc |
Incomplete response values. |
cc |
Censoring indicator. |
UL |
Vector of upper limits of the censoring interval. Equal to |
Examples
set.seed(1997)
# Generate covariates and true parameter values
n <- 500
x <- cbind(1, rnorm(n))
beta <- c(2, -1)
sigma2 <- 1
lambda <- 3
nu <- 3
# Generate a simulated dataset under the SMSN-ICR model,
# with 10% interval-censored and 5% missing responses
sample <- gen_SMSNCens_sample(n = n, x = x, beta = beta, sigma2 = sigma2,
                              lambda = lambda, nu = nu, cens = "Int",
                              pcens = 0.1, pna = 0.05, family = "ST")
# Fit the SMSN-ICR model using the generated data
fit <- CensRegSMSN(sample$cc, x, sample$yc, cens = "Int", UL = sample$UL,
                   get.init = TRUE, show.envelope = TRUE, family = "ST")
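As a possible follow-up to the fit, estimates and standard errors can be combined into approximate Wald confidence intervals. The sketch below uses a hypothetical helper, wald_ci, and made-up numbers rather than output from the model above. Applying it to a fitted object would mean pairing fit$beta with the matching entries of fit$SE, whose ordering is not documented here and is therefore an assumption to verify.

```r
# Sketch: normal-approximation (Wald) confidence intervals.
# wald_ci is a hypothetical helper, not part of the package.
wald_ci <- function(est, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  cbind(estimate = est, lower = est - z * se, upper = est + z * se)
}

# Made-up estimates and standard errors, for illustration only
wald_ci(est = c(2.05, -0.97), se = c(0.08, 0.06))
```

These intervals rely on the asymptotic normality of the estimators, so they are best read as rough guides when the proportion of censored or missing responses is high.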