| Type: | Package |
| Title: | ROC Analysis in Three-Class Classification Problems for Clustered Data |
| Date: | 2025-10-01 |
| Version: | 1.0.3 |
| Maintainer: | Duc-Khanh To <tdkhanh@hcmus.edu.vn> |
| Description: | Statistical methods for ROC surface analysis in three-class classification problems for clustered data and in the presence of covariates. In particular, the package provides covariate-specific point and interval estimation for: (i) true class fractions (TCFs) at fixed pairs of thresholds; (ii) the ROC surface; (iii) the volume under the ROC surface (VUS); (iv) the optimal pairs of thresholds. Methods considered in points (i), (ii) and (iv) are proposed and discussed in To et al. (2022) <doi:10.1177/09622802221089029>. Referring to point (iv), three different selection criteria are implemented: Generalized Youden Index (GYI), Closest to Perfection (CtP) and Maximum Volume (MV). Methods considered in point (iii) are proposed and discussed in Xiong et al. (2018) <doi:10.1177/0962280217742539>. Visualization tools are also provided. We refer readers to the articles cited above for all details. |
| License: | GPL-3 |
| Depends: | R (≥ 3.5.0), stats, utils, graphics, nlme, Rcpp (≥ 0.12.3) |
| Imports: | rgl, ellipse, numDeriv, ggplot2, ggpubr, foreach, iterators, parallel, doParallel |
| LinkingTo: | Rcpp, RcppArmadillo |
| Encoding: | UTF-8 |
| LazyData: | true |
| LazyLoad: | yes |
| ByteCompile: | yes |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | yes |
| URL: | https://github.com/toduckhanh/ClusROC |
| BugReports: | https://github.com/toduckhanh/ClusROC/issues |
| Suggests: | testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Packaged: | 2025-10-01 15:29:07 UTC; duckh |
| Author: | Duc-Khanh To |
| Repository: | CRAN |
| Date/Publication: | 2025-10-01 17:00:10 UTC |
ROC Analysis in Three-Class Classification Problems for Clustered Data
Description
This package implements techniques for ROC surface analysis for clustered data and in the presence of covariates. In particular, the package provides covariate-specific point and interval estimation for: (i) true class fractions (TCFs) at fixed pairs of thresholds; (ii) the ROC surface; (iii) the volume under the ROC surface (VUS); (iv) the optimal pairs of thresholds. Methods considered in points (i), (ii) and (iv) are proposed and discussed in To et al. (2022). Referring to point (iv), three different selection criteria are implemented: Generalized Youden Index (GYI), Closest to Perfection (CtP) and Maximum Volume (MV). Methods considered in point (iii) are proposed and discussed in Xiong et al. (2018). Visualization tools are also provided. We refer readers to the articles cited above for all details.
Details
| Package: | ClusROC |
| Type: | Package |
| Version: | 1.0-2 |
| Date: | 2022-10-10 |
| License: | GPL 2 | GPL 3 |
| Lazy load: | yes |
Major functions are clus_lme, clus_roc_surface, clus_opt_thres3, clus_vus and clus_tcfs.
Author(s)
Duc-Khanh To, with contributions from Gianfranco Adimari and Monica Chiogna
Maintainer: Duc-Khanh To <toduc@stat.unipd.it>
References
Bantis, L. E., Nakas, C. T., Reiser, B., Myall, D., and Dalrymple-Alford, J. C. (2017). Construction of joint confidence regions for the optimal true class fractions of Receiver Operating Characteristic (ROC) surfaces and manifolds. Statistical Methods in Medical Research, 26, 3, 1429-1442.
Gurka, M. J., Edwards, L. J., Muller, K. E., and Kupper, L. L. (2006). Extending the Box-Cox transformation to the linear mixed model. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169, 2, 273-288.
Gurka, M. J. and Edwards, L. J. (2011). Estimating variance components and random effects using the Box-Cox transformation in the linear mixed model. Communications in Statistics - Theory and Methods, 40, 3, 515-531.
Kauermann, G. and Carroll, R. J. (2001). A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association, 96, 456, 1387-1396.
Liang, K. Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 1, 13-22.
Mancl, L. A. and DeRouen, T. A. (2001). A covariance estimator for GEE with improved small-sample properties. Biometrics, 57, 1, 126-134.
To, D-K., Adimari, G., Chiogna, M. and Risso, D. (2022). Receiver operating characteristic estimation and threshold selection criteria in three-class classification problems for clustered data. Statistical Methods in Medical Research, 31, 7, 1325-1341.
Xiong, C., Luo, J., Chen, L., Gao, F., Liu, J., Wang, G., Bateman, R. and Morris, J. C. (2018). Estimating diagnostic accuracy for clustered ordinal diagnostic groups in the three-class case – Application to the early diagnosis of Alzheimer disease. Statistical Methods in Medical Research, 27, 3, 701-714.
A subset of energy choice data in 4 cities of Ethiopia
Description
A subset of the energy choice data used in Alem et al. (2016). The authors used the full dataset to investigate the determinants of household cooking fuel choice and energy transition in urban Ethiopia. The full dataset is publicly available at doi:10.1016/j.eneco.2016.06.025.
Usage
data(EnergyEthiopia)
Format
A data frame with 2088 observations from 1123 households (or clusters) in the capital Addis Ababa and 9 variables:
uqid |
the id of the household (which yields 1123 clusters). |
energy2 |
a factor with 3 levels (types) of cooking energy state at each time (2000, 2004, 2009), i.e., 1 (clean fuel only - electricity, gas and kerosene), 2 (a mix of clean and biomass), 3 (biomass fuel only - firewood, charcoal, dung and crop residues). |
hhs |
household size. |
hhs_ft |
a factor with 4 levels of household size: small (1 ≤ hhs ≤ 4); medium (5 ≤ hhs ≤ 8); large (9 ≤ hhs ≤ 12); very large (hhs ≥ 13). |
lrconsaeu |
log of real consumption per adult equivalent units. |
lfirewood_pr |
Firewood log price. |
lcharcol_pr |
Charcoal log price. |
lkerosene_pr |
Kerosene log price. |
lelectric_pr |
Electricity log price. |
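The four household-size levels of hhs_ft can be reproduced from hhs with base R. The sketch below is illustrative only: the vector hhs_example is made up rather than taken from EnergyEthiopia, and the label strings are assumed to match the level names listed above, which may differ from the labels actually stored in the dataset.

```r
# Hypothetical household sizes (not taken from EnergyEthiopia)
hhs_example <- c(1, 4, 5, 8, 9, 12, 13, 20)

# Right-closed intervals (0,4], (4,8], (8,12], (12,Inf] reproduce the
# cut-offs 1-4, 5-8, 9-12 and 13+ for integer household sizes.
# The label strings are an assumption based on the description above.
hhs_ft_example <- cut(hhs_example,
                      breaks = c(0, 4, 8, 12, Inf),
                      labels = c("small", "medium", "large", "very large"))

# Frequency of each household-size category
table(hhs_ft_example)
```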
References
Alem, Y., Beyene, A. D., Köhlin, G., & Mekonnen, A. (2016). "Modeling household cooking fuel choice: A panel multinomial logit approach". Energy Economics, 59, 129-137.
A subset of mouse brain cells data
Description
A subset of the mouse brain cells data used in To et al. (2022). It is used to evaluate the ability of the Lamp5 gene to discriminate three types of glutamatergic neurons. The full dataset is publicly available at https://portal.brain-map.org/atlases-and-data/rnaseq/mouse-v1-and-alm-smart-seq.
Usage
data(MouseNeurons)
Format
A data frame with 860 observations from 23 clusters and 7 variables:
sample_name |
name of each observation. |
subclass_label |
a factor with 3 levels (types) of glutamatergic neurons, i.e., L2/3 IT (Layer 2/3 Intratelencephalic), L4 (Layer 4) and L5 PT (Layer 5 Pyramidal Tract) neurons. |
genotype_id |
the mouse genotype (which yields 23 clusters). |
sex |
the sex of the mouse. |
age_days |
the age of the mouse, in days. |
Slc17a7_cpm |
count per million of Slc17a7 (Solute Carrier Family 17 Member 7) gene expression. |
Lamp5_cpm |
count per million of Lamp5 (Lysosomal Associated Membrane Protein Family Member 5) gene expression. |
References
To, D-K., Adimari, G., Chiogna, M. and Risso, D. (2022) “Receiver operating characteristic estimation and threshold selection criteria in three-class classification problems for clustered data”. Statistical Methods in Medical Research, 31, 7, 1325-1341.
Confidence Intervals for Covariate-specific VUS
Description
Computes confidence intervals for covariate-specific VUS.
Usage
ci_clus_vus(x, ci_level = 0.95)
Arguments
x |
an object of class "VUS", a result of |
ci_level |
a confidence level to be used for constructing the confidence interval; default is 0.95. |
Details
A confidence interval for covariate-specific VUS is based on the normal approximation. If the lower bound (or the upper bound) of the confidence interval is smaller than 0 (or greater than 1), it is set to 0 (or 1). Logit and probit transformations are also available to guarantee that the confidence limits lie inside (0, 1).
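The three kinds of interval can be sketched in base R as follows. This is a minimal illustration, not the package's code: vus_est and vus_se are hypothetical numbers, and the transformed intervals assume the usual delta-method standard errors on the logit and probit scales, which may differ in detail from what ci_clus_vus computes.

```r
# Hypothetical VUS estimate and standard error (not package output)
vus_est  <- 0.85
vus_se   <- 0.09
ci_level <- 0.95
z <- qnorm(1 - (1 - ci_level) / 2)

# Normal-approximation interval, truncated to [0, 1]
ci_norm <- c(max(0, vus_est - z * vus_se), min(1, vus_est + z * vus_se))

# Logit scale: delta-method SE is se / (v * (1 - v)); back-transform with plogis()
se_logit <- vus_se / (vus_est * (1 - vus_est))
ci_logit <- plogis(qlogis(vus_est) + c(-1, 1) * z * se_logit)

# Probit scale: delta-method SE is se / dnorm(qnorm(v)); back-transform with pnorm()
se_probit <- vus_se / dnorm(qnorm(vus_est))
ci_probit <- pnorm(qnorm(vus_est) + c(-1, 1) * z * se_probit)

rbind(normal = ci_norm, logit = ci_logit, probit = ci_probit)
```

With these inputs the untransformed upper limit exceeds 1 and is truncated, while the logit and probit limits stay strictly inside (0, 1) by construction.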
Value
ci_clus_vus returns an object of class inheriting from "ci_VUS" class. An object of class "ci_VUS" is a list, containing at least the following components:
ci_vus_norm |
the normal approximation-based confidence interval for covariate-specific VUS. |
ci_vus_log |
the confidence interval for covariate-specific VUS, after using logit-transformation. |
ci_vus_prob |
the confidence interval for covariate-specific VUS, after using probit-transformation. |
ci_level |
fixed confidence level. |
newdata |
value(s) of covariate(s). |
n_p |
total number of regressors in the model. |
Linear Mixed-Effects Models for a continuous diagnostic test or a biomarker (or a classifier).
Description
clus_lme fits the cluster-effect model for a continuous diagnostic test in a three-class setting as described in Xiong et al. (2018) and To et al. (2022).
Usage
clus_lme(
fixed_formula,
name_class,
name_clust,
data = sys.frame(sys.parent()),
subset,
na_action = na.fail,
levl_class = NULL,
ap_var = TRUE,
boxcox = FALSE,
interval_lambda = c(-2, 2),
trace = TRUE,
...
)
Arguments
fixed_formula |
a two-sided linear formula object, describing the fixed-effects part of the model for three classes, with the response on the left of ~ operator and the terms, separated by + operators, on the right. For example, |
name_class |
name of variable indicating three classes (or three groups) in the data. |
name_clust |
name of variable indicating clusters in the data. |
data |
a data frame containing the variables in the model. |
subset |
an optional expression indicating the subset of the rows of data that should be used in the fit. This can be a logical vector, or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default. |
na_action |
a function that indicates what should happen when the data contain NAs. The default action ( |
levl_class |
a vector (of strings) containing the ordered name chosen for the disease classes. The ordering is intended to be “increasing” with respect to the disease severity. If |
ap_var |
a logical value. Default = |
boxcox |
a logical value. Default = |
interval_lambda |
a vector containing the end-points of the interval for searching the Box-Cox parameter, |
trace |
a logical value. Default = |
... |
additional arguments for |
Details
This function fits a linear mixed-effect model for a continuous diagnostic test in a three-class setting in order to account for the cluster and covariate effects on the test result. See Xiong et al. (2018) and To et al. (2022) for more details.
Estimation is done by using lme with the restricted maximum likelihood (REML) method.
A Box-Cox transformation for the model can be used when the distributions of test results are skewed (Gurka et al. 2006). The estimation procedure is described in To et al. (2022). The Box-Cox parameter \lambda is estimated by a grid search on the interval (-2, 2), as discussed in Gurka and Edwards (2011).
The estimated variance-covariance matrix for the estimated parameters is obtained by the sandwich formula (see Liang and Zeger, 1986; Kauermann and Carroll, 2001; Mancl and DeRouen, 2001), as discussed in To et al. (2022).
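The idea of a grid search for the Box-Cox parameter can be sketched in base R. This is a simplified illustration under stated assumptions: it uses an ordinary linear model fitted with lm (no cluster random effect, no REML), simulated data, and a hypothetical grid step of 0.05; the procedure in clus_lme works with the linear mixed model and may differ in its details.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
# Simulated skewed response: log(y) is linear in x, so the true lambda is 0
y <- exp(1 + 0.5 * x + rnorm(n, sd = 0.5))

# Box-Cox transformation; lambda = 0 corresponds to the logarithm
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for an ordinary linear model
# (up to an additive constant), including the Jacobian term
profile_loglik <- function(lambda) {
  rss <- sum(resid(lm(box_cox(y, lambda) ~ x))^2)
  -n / 2 * log(rss / n) + (lambda - 1) * sum(log(y))
}

# Grid search on the interval (-2, 2); the step size is an assumption
lambda_grid <- seq(-2, 2, by = 0.05)
loglik_grid <- vapply(lambda_grid, profile_loglik, numeric(1))
lambda_hat  <- lambda_grid[which.max(loglik_grid)]
lambda_hat
```

Because the simulated response is log-normal, the selected value should fall close to 0, that is, close to the log transformation.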
Value
clus_lme returns an object of class "clus_lme", i.e., a list containing at least the following components:
call |
the matched call. |
est_para |
a vector containing the estimated parameters. |
se_para |
a vector containing the standard errors. |
vcov_sand |
the estimated covariance matrix for all estimated parameters. |
residual |
a list of residuals. |
fitted |
a list of fitted values. |
randf |
a vector of estimated random effects for each cluster level. |
n_coef |
total number of coefficients included in the model. |
n_p |
total number of regressors in the model. |
icc |
an estimate of the intra-class correlation (ICC). |
terms |
the |
boxcox |
logical value indicating whether the Box-Cox transformation was applied or not. |
data |
the data frame used to fit the model. |
Generic functions such as print and plot are also used to show results of the fit.
References
Gurka, M. J., Edwards, L. J., Muller, K. E., and Kupper, L. L. (2006) “Extending the Box-Cox transformation to the linear mixed model”. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169, 2, 273-288.
Gurka, M. J. and Edwards, L. J. (2011) “Estimating variance components and random effects using the Box-Cox transformation in the linear mixed model”. Communications in Statistics - Theory and Methods, 40, 3, 515-531.
Kauermann, G. and Carroll, R. J. (2001) “A note on the efficiency of sandwich covariance matrix estimation”. Journal of the American Statistical Association, 96, 456, 1387-1396.
Liang, K. Y. and Zeger, S. L. (1986) “Longitudinal data analysis using generalized linear models”. Biometrika, 73, 1, 13-22.
Mancl, L. A. and DeRouen, T. A. (2001) “A covariance estimator for GEE with improved small-sample properties”. Biometrics, 57, 1, 126-134.
To, D-K., Adimari, G., Chiogna, M. and Risso, D. (2022) “Receiver operating characteristic estimation and threshold selection criteria in three-class classification problems for clustered data”. Statistical Methods in Medical Research, 31, 7, 1325-1341.
Xiong, C., Luo, J., Chen, L., Gao, F., Liu, J., Wang, G., Bateman, R. and Morris, J. C. (2018) “Estimating diagnostic accuracy for clustered ordinal diagnostic groups in the three-class case – Application to the early diagnosis of Alzheimer disease”. Statistical Methods in Medical Research, 27, 3, 701-714.
Examples
## Example 1:
data(data_3class)
head(data_3class)
## A model with two covariates
out1 <- clus_lme(fixed_formula = Y ~ X1 + X2, name_class = "D",
name_clust = "id_Clus", data = data_3class)
print(out1)
plot(out1)
## Example 2: Box-Cox transformation
data(data_3class_bcx)
out2 <- clus_lme(fixed_formula = Y ~ X, name_class = "D",
name_clust = "id_Clus", data = data_3class_bcx,
boxcox = TRUE)
print(out2)
plot(out2)
Estimation of the covariate-specific optimal pair of thresholds for clustered data.
Description
clus_opt_thres3 estimates covariate-specific optimal pair of thresholds of a continuous diagnostic test in a clustered design, with three classes of diseases.
Usage
clus_opt_thres3(
method = c("GYI", "CtP", "MV"),
out_clus_lme,
newdata,
ap_var = TRUE,
control = list()
)
Arguments
method |
the method to be used. See 'Details'. |
out_clus_lme |
an object of class "clus_lme", i.e., a result of |
newdata |
a data frame (containing specific value(s) of covariate(s)) in which to look for variables with which to estimate the covariate-specific optimal pair of thresholds. In the absence of covariates, no values need to be specified. |
ap_var |
logical value. If set to |
control |
a list of control parameters. See 'Details'. |
Details
This function implements the estimation methods discussed in To et al. (2022) for the covariate-specific optimal pair of thresholds in a clustered design with three ordinal groups. The estimators are based on the results from the clus_lme function, which fits the linear mixed-effect model by using the REML approach.
Before performing estimation, a check for the monotone ordering assumption is performed. This means that, for the fixed values of covariates, the three predicted mean values for test results in the three diagnostic groups are compared. If the assumption is not met, the covariate-specific optimal pair of thresholds at those values of covariates is not estimated.
The estimation procedure uses three criteria. Method "GYI" is the Generalized Youden Index, which maximizes the sum of the three covariate-specific True Class Fractions (TCFs). Method "CtP" is based on the Closest to Perfection approach: the optimal pair of thresholds is obtained by minimizing the distance, in the unit cube, between a generic point on the covariate-specific ROC surface and the top corner (1, 1, 1). Method "MV" is based on the Maximum Volume approach, which searches for thresholds that maximize the volume of a box under the covariate-specific ROC surface. The user can select more than one method. This function can also estimate the covariate-specific optimal pair of thresholds at multiple points for covariates.
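The three criteria can be illustrated with a small base R sketch. It is not the package's implementation: the class means mu and standard deviations sigma are hypothetical values standing in for covariate-specific quantities derived from a fitted model, test results are assumed normal within each class, and the helper names (tcfs_normal, obj_gyi, obj_ctp, obj_mv) are made up for this example.

```r
# Hypothetical covariate-specific means and standard deviations of the
# test result in the three classes (monotone ordering: mu1 < mu2 < mu3)
mu    <- c(0, 2, 4)
sigma <- c(1, 1, 1)

# TCFs at thresholds t = (t1, t2) under a normal model
tcfs_normal <- function(t, mu, sigma) {
  c(pnorm((t[1] - mu[1]) / sigma[1]),
    pnorm((t[2] - mu[2]) / sigma[2]) - pnorm((t[1] - mu[2]) / sigma[2]),
    1 - pnorm((t[2] - mu[3]) / sigma[3]))
}

# Objective functions, all written for minimization with optim()
obj_gyi <- function(t) -sum(tcfs_normal(t, mu, sigma))                 # maximize sum of TCFs
obj_ctp <- function(t) sqrt(sum((1 - tcfs_normal(t, mu, sigma))^2))    # distance to (1, 1, 1)
obj_mv  <- function(t) -prod(tcfs_normal(t, mu, sigma))                # maximize box volume

start <- c(0.5, 3.5)  # arbitrary starting pair with t1 < t2
thres_gyi <- optim(start, obj_gyi, method = "Nelder-Mead")$par
thres_ctp <- optim(start, obj_ctp, method = "Nelder-Mead")$par
thres_mv  <- optim(start, obj_mv,  method = "Nelder-Mead")$par

rbind(GYI = thres_gyi, CtP = thres_ctp, MV = thres_mv)
```

With equal variances the GYI thresholds fall at the midpoints between adjacent class means, which gives a quick sanity check of the optimizer.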
The asymptotic variance-covariance matrix of the (estimated) covariate-specific optimal thresholds is estimated by using the Delta method under the normal assumption. If the Box-Cox transformation is applied to the linear mixed-effect model, a nonparametric bootstrap procedure for clustered data will be used to obtain the estimated asymptotic covariance matrix (see To et al. 2022, for more details).
The control argument is a list that can supply any of the following components:
method_optim |
Optimization method to be used. There are three options: "L-BFGS-B", "BFGS" and "Nelder-Mead". Default is "L-BFGS-B". |
start |
Starting values in the optimization procedure. If it is NULL, a starting point will be automatically obtained. |
maxit |
The maximum number of iterations. Default is 200. |
lower, upper |
Possible bounds on the threshold range, for the optimization based on the "L-BFGS-B" method. Defaults are -Inf and Inf. |
n_boot |
Number of bootstrap replicates for estimating the covariance matrix (when the Box-Cox transformation is applied). Default is 250. |
parallel |
A logical value. If set to TRUE, parallel computing is employed in the bootstrap resampling process. |
ncpus |
Number of processes to be used in parallel computing. Default is 2. |
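A partial control list only needs to name the components to be changed. The sketch below shows one common base R idiom, modifyList, for combining a user-supplied list with the documented defaults. How clus_opt_thres3 merges the list internally is not stated here, so this is an assumption; the default for parallel is also not documented above and is taken to be FALSE purely for illustration.

```r
# Defaults as documented above; start = NULL means "choose automatically".
# parallel = FALSE is an assumption, since no default is stated.
control_default <- list(method_optim = "L-BFGS-B", start = NULL, maxit = 200,
                        lower = -Inf, upper = Inf, n_boot = 250,
                        parallel = FALSE, ncpus = 2)

# A user only supplies the components to override
control_user <- list(method_optim = "Nelder-Mead", maxit = 500)

# Components named in control_user replace the defaults; the rest are kept
control_full <- modifyList(control_default, control_user)
str(control_full)

# The partial list would then be passed as, for example:
# clus_opt_thres3(method = "GYI", out_clus_lme = out1,
#                 newdata = data.frame(X1 = 1), control = control_user)
```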
Value
clus_opt_thres3 returns an object of "clus_opt_thres3" class, which is a list containing at least the following components:
call |
the matched call. |
method |
the methods used to obtain the estimated optimal pair of thresholds. |
thres3 |
a vector or matrix containing the estimated optimal thresholds. |
thres3_se |
a vector or matrix containing the estimated standard errors. |
vcov_thres3 |
a matrix or list of matrices containing the estimated variance-covariance matrices. |
tcfs |
a vector or matrix containing the estimated TCFs at the optimal thresholds. |
mess_order |
a diagnostic message from checking the monotone ordering. |
newdata |
value(s) of covariate(s). |
n_p |
total number of regressors in the model. |
Generic functions such as print and plot are also used to show the results.
References
To, D-K., Adimari, G., Chiogna, M. and Risso, D. (2022) “Receiver operating characteristic estimation and threshold selection criteria in three-class classification problems for clustered data”. Statistical Methods in Medical Research, 31, 7, 1325-1341.
Examples
data(data_3class)
## One covariate
out1 <- clus_lme(fixed_formula = Y ~ X1, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### Estimate covariate-specific optimal thresholds at one value of one covariate,
### with 3 methods
out_thres_1 <- clus_opt_thres3(method = c("GYI", "MV", "CtP"),
out_clus_lme = out1,
newdata = data.frame(X1 = 1), ap_var = TRUE)
print(out_thres_1)
plot(out_thres_1)
## Two covariates
out2 <- clus_lme(fixed_formula = Y ~ X1 + X2, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### Estimate covariate-specific optimal thresholds at one point, with 3 methods
out_thres_2 <- clus_opt_thres3(method = c("GYI", "MV", "CtP"),
out_clus_lme = out2,
newdata = data.frame(X1 = 1, X2 = 0),
ap_var = TRUE)
print(out_thres_2)
plot(out_thres_2)
### Estimate covariate-specific optimal thresholds at three points, with 3 methods
out_thres_3 <- clus_opt_thres3(method = c("GYI", "MV", "CtP"),
out_clus_lme = out2,
newdata = data.frame(X1 = c(-0.5, 0.5, 0.5),
X2 = c(0, 0, 1)),
ap_var = TRUE)
print(out_thres_3)
plot(out_thres_3, colors = c("forestgreen", "blue"))
Plot an estimated covariate-specific ROC surface for clustered data.
Description
clus_roc_surface estimates and makes a 3D plot of a covariate-specific ROC surface for a continuous diagnostic test, in a clustered design, with three ordinal groups.
Usage
clus_roc_surface(
out_clus_lme,
newdata,
step_tcf = 0.01,
main = NULL,
file_name = NULL,
ellips = FALSE,
thresholds = NULL,
ci_level = ifelse(ellips, 0.95, NULL)
)
Arguments
out_clus_lme |
an object of class "clus_lme", a result of |
newdata |
a data frame with 1 row (containing specific value(s) of covariate(s)) in which to look for variables with which to estimate the covariate-specific ROC surface. In the absence of covariates, no values need to be specified. |
step_tcf |
number: increment to be used in the grid for |
main |
the main title for plot. |
file_name |
File name to create on disk. |
ellips |
a logical value. If set to |
thresholds |
a specified pair of thresholds, used to construct the ellipsoidal confidence region for TCFs. |
ci_level |
a confidence level to be used for constructing the ellipsoidal confidence region; default is 0.95. |
Details
This function implements a method in To et al. (2022) for estimating the covariate-specific ROC surface of a continuous diagnostic test in a clustered design, with three ordinal groups. The estimator is based on the results from clus_lme with the REML approach.
Before performing estimation, a check for the monotone ordering assumption is performed. This means that, for the fixed values of covariates, the three predicted mean values for test results in the three diagnostic groups are compared. If the assumption is not met, the covariate-specific ROC surface at those values of covariates is not estimated.
The ellipsoidal confidence region for TCFs at a given pair of thresholds, if required, is constructed by using the normal approximation and is plotted in the ROC surface space. The default confidence level is 0.95. Note that, if the Box-Cox transformation is applied to the linear mixed-effect model, the pair of thresholds must be given on the original scale. If the constructed confidence region for TCFs falls outside the unit cube, a probit transformation is automatically applied to obtain an appropriate confidence region inside the unit cube (see Bantis et al., 2017).
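To make the object being plotted concrete, the following base R sketch evaluates a ROC surface on a grid of (TCF1, TCF3) values. It rests on assumptions that are not taken from the package: test results are normal within each class, mu and sigma are hypothetical class means and standard deviations at fixed covariate values, and roc_surface_normal is a made-up helper name. The grid increment plays the role of step_tcf.

```r
# Hypothetical class means and standard deviations at fixed covariate values
mu    <- c(0, 2, 4)
sigma <- c(1, 1, 1)

# Grid for TCF1 and TCF3; the increment mirrors the step_tcf argument
step_tcf <- 0.01
p1 <- seq(step_tcf, 1 - step_tcf, by = step_tcf)
p3 <- p1

# TCF2 as a function of (TCF1, TCF3) under a normal model:
# t1 and t2 are the thresholds giving TCF1 = p1 and TCF3 = p3
roc_surface_normal <- function(p1, p3, mu, sigma) {
  t1 <- mu[1] + sigma[1] * qnorm(p1)
  t2 <- mu[3] + sigma[3] * qnorm(1 - p3)
  tcf2 <- pnorm((t2 - mu[2]) / sigma[2]) - pnorm((t1 - mu[2]) / sigma[2])
  pmax(tcf2, 0)  # the surface is 0 wherever t1 >= t2
}

# Matrix of surface heights, one row per p1 value and one column per p3 value
surface <- outer(p1, p3, roc_surface_normal, mu = mu, sigma = sigma)
range(surface)
```

The matrix surface could be passed to a 3D plotting routine such as persp(p1, p3, surface) for a quick static view; clus_roc_surface itself returns an rgl plot.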
Value
clus_roc_surface returns a 3D rgl plot of the estimated covariate-specific ROC surface.
References
Bantis, L. E., Nakas, C. T., Reiser, B., Myall, D., and Dalrymple-Alford, J. C. (2017). “Construction of joint confidence regions for the optimal true class fractions of Receiver Operating Characteristic (ROC) surfaces and manifolds”. Statistical Methods in Medical Research, 26, 3, 1429-1442.
To, D-K., Adimari, G., Chiogna, M. and Risso, D. (2022) “Receiver operating characteristic estimation and threshold selection criteria in three-class classification problems for clustered data”. Statistical Methods in Medical Research, 31, 7, 1325-1341.
Examples
data(data_3class)
## One covariate
out1 <- clus_lme(fixed_formula = Y ~ X1, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### plot only covariate-specific ROC surface
clus_roc_surface(out_clus_lme = out1, newdata = data.frame(X1 = 1))
### plot covariate-specific ROC surface and a 95% ellipsoidal confidence region for TCFs
clus_roc_surface(out_clus_lme = out1, newdata = data.frame(X1 = 1),
ellips = TRUE, thresholds = c(0.9, 3.95))
## Two covariates
out2 <- clus_lme(fixed_formula = Y ~ X1 + X2, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### plot only covariate-specific ROC surface
clus_roc_surface(out_clus_lme = out2, newdata = data.frame(X1 = 1, X2 = 1))
### plot covariate-specific ROC surface and a 95% ellipsoidal confidence region for TCFs
clus_roc_surface(out_clus_lme = out2, newdata = data.frame(X1 = 1, X2 = 1),
ellips = TRUE, thresholds = c(0.9, 3.95))
Estimation of the covariate-specific TCFs for clustered data.
Description
clus_tcfs estimates covariate-specific True Class Fractions (TCFs), at a specified pair of thresholds, of a continuous diagnostic test in a clustered design with three ordinal groups. This function can also estimate covariate-specific TCFs at multiple points for covariates.
Usage
clus_tcfs(out_clus_lme, newdata, thresholds, ap_var = FALSE)
Arguments
out_clus_lme |
an object of class "clus_lme", a result of |
newdata |
a data frame (containing specific value(s) of covariate(s)) in which to look for variables with which to estimate covariate-specific TCFs. In the absence of covariates, no values need to be specified. |
thresholds |
a specified pair of thresholds. |
ap_var |
logical value. If set to |
Details
This function implements a method in To et al. (2022) for estimating covariate-specific TCFs at a specified pair of thresholds of a continuous diagnostic test in a clustered design with three ordinal groups. The estimator is based on results from clus_lme, which uses the REML approach. The asymptotic variance-covariance matrix of the estimated covariate-specific TCFs is estimated through the Delta method. Note that, if the Box-Cox transformation is applied for the linear mixed-effect model, the pair of thresholds must be input in the original scale.
Before performing estimation, a check for the monotone ordering assumption is performed. This means that, for the fixed values of covariates, the three predicted mean values for test results in the three diagnostic groups are compared. If the assumption is not met, the covariate-specific TCFs at those values of covariates are not estimated.
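The Delta method step can be sketched in base R. Everything specific here is an assumption made for illustration: the parameter vector theta_hat holds hypothetical class means and standard deviations (the package's actual parameterization, with regression coefficients and variance components, is richer), vcov_theta is a made-up covariance matrix, and the Jacobian is approximated by central finite differences with a hypothetical helper jacobian_fd.

```r
# TCFs at thresholds thr = (t1, t2) under a normal model,
# with theta = (mu1, mu2, mu3, sigma1, sigma2, sigma3)
tcfs_normal <- function(theta, thr) {
  mu <- theta[1:3]
  sigma <- theta[4:6]
  c(TCF1 = pnorm((thr[1] - mu[1]) / sigma[1]),
    TCF2 = pnorm((thr[2] - mu[2]) / sigma[2]) - pnorm((thr[1] - mu[2]) / sigma[2]),
    TCF3 = 1 - pnorm((thr[2] - mu[3]) / sigma[3]))
}

# Hypothetical estimates and their covariance matrix (not package output)
theta_hat  <- c(0.5, 2.0, 4.2, 1.0, 1.2, 1.1)
vcov_theta <- diag(c(0.02, 0.03, 0.02, 0.01, 0.01, 0.01))
thresholds <- c(1, 4)

# Monotone ordering check on the three predicted means
stopifnot(theta_hat[1] < theta_hat[2], theta_hat[2] < theta_hat[3])

# Central finite-difference Jacobian: one column per parameter
jacobian_fd <- function(f, theta, h = 1e-6) {
  sapply(seq_along(theta), function(j) {
    e <- replace(numeric(length(theta)), j, h)
    (f(theta + e) - f(theta - e)) / (2 * h)
  })
}

tcfs_est  <- tcfs_normal(theta_hat, thresholds)
J         <- jacobian_fd(function(th) tcfs_normal(th, thresholds), theta_hat)
tcfs_vcov <- J %*% vcov_theta %*% t(J)   # Delta method: J V J'
tcfs_se   <- sqrt(diag(tcfs_vcov))

rbind(estimate = tcfs_est, se = tcfs_se)
```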
Value
clus_tcfs returns an object of class "TCFs", which is a list containing at least the following components:
call |
the matched call. |
tcfs_est |
a vector or matrix containing the estimated TCFs. |
tcf_vcov |
a matrix or list of matrices containing the estimated variance-covariance matrices. |
thresholds |
specified pair of thresholds. |
mess_order |
a diagnostic message from checking the monotone ordering. |
newdata |
value(s) of covariate(s). |
n_p |
total number of regressors in the model. |
The generic function print is also used to show the results.
References
To, D-K., Adimari, G., Chiogna, M. and Risso, D. (2022) “Receiver operating characteristic estimation and threshold selection criteria in three-class classification problems for clustered data”. Statistical Methods in Medical Research, 31, 7, 1325-1341.
Examples
data(data_3class)
## One covariate
out1 <- clus_lme(fixed_formula = Y ~ X1, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### Estimate TCFs at one single value of X1, (t1, t2) = (1, 4)
out_tcfs_1 <- clus_tcfs(out_clus_lme = out1, newdata = data.frame(X1 = 1),
thresholds = c(1, 4), ap_var = TRUE)
print(out_tcfs_1)
## Two covariates
out2 <- clus_lme(fixed_formula = Y ~ X1 + X2, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### Estimate covariate-specific TCFs at point (X1, X2) = (1, 0), and (t1, t2) = (1, 4)
out_tcfs_2 <- clus_tcfs(out_clus_lme = out2,
newdata = data.frame(X1 = 1, X2 = 0),
thresholds = c(1, 4), ap_var = TRUE)
print(out_tcfs_2)
### Estimate covariate-specific TCFs at three points and (t1, t2) = (1, 4)
out_tcfs_3 <- clus_tcfs(out_clus_lme = out2,
newdata = data.frame(X1 = c(-0.5, 0.5, 0.5),
X2 = c(0, 0, 1)),
thresholds = c(1, 4), ap_var = TRUE)
print(out_tcfs_3)
Estimation of the covariate-specific VUS for clustered data.
Description
This function estimates the covariate-specific VUS of a continuous diagnostic test in the setting of clustered data as described in Xiong et al. (2018). This function can also estimate covariate-specific VUS at multiple points for covariates.
Usage
clus_vus(out_clus_lme, newdata, ap_var = TRUE, subdivisions = 1000, ...)
Arguments
out_clus_lme |
an object of class "clus_lme", a result of |
newdata |
a data frame (containing specific value(s) of covariate(s)) in which to look for variables with which to estimate covariate-specific VUS. In the absence of covariates, no values need to be specified. |
ap_var |
logical value. If set to |
subdivisions |
the maximum number of subintervals used to approximate integral. Default is 1000. |
... |
additional arguments to be passed to |
Details
This function implements a method in Xiong et al. (2018) for estimating covariate-specific VUS of a continuous diagnostic test in a clustered design with three ordinal groups. The estimator is based on results from clus_lme, which uses the REML approach. The standard error of the estimated covariate-specific VUS is approximated through the Delta method.
Before performing estimation, a check for the monotone ordering assumption is performed. This means that, for the fixed values of covariates, the three predicted mean values for test results in the three diagnostic groups are compared. If the assumption is not met, the covariate-specific VUS at those values of covariates is not estimated. In addition, this function performs the statistical test of H_0: VUS = 1/6 versus an alternative of interest.
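Under a normal model, the VUS equals the probability that test results from the three classes are correctly ordered, and it can be written as a one-dimensional integral. The base R sketch below evaluates that integral with integrate and forms a Wald-type statistic for H_0: VUS = 1/6. The means, standard deviations and standard error are hypothetical, the helper name vus_normal is made up, and the one-sided alternative is an assumption chosen for illustration; this is not the package's code.

```r
# VUS = P(Y1 < Y2 < Y3) for independent normal test results, obtained by
# integrating over the value s of the class-2 result
vus_normal <- function(mu, sigma, subdivisions = 1000) {
  integrand <- function(s) {
    pnorm((s - mu[1]) / sigma[1]) *
      (1 - pnorm((s - mu[3]) / sigma[3])) *
      dnorm(s, mean = mu[2], sd = sigma[2])
  }
  integrate(integrand, lower = -Inf, upper = Inf,
            subdivisions = subdivisions)$value
}

# With identical distributions in the three classes the VUS is 1/6
vus_chance <- vus_normal(mu = c(0, 0, 0), sigma = c(1, 1, 1))

# Hypothetical covariate-specific means with monotone ordering
vus_est <- vus_normal(mu = c(0, 2, 4), sigma = c(1, 1, 1))

# Wald-type test of H0: VUS = 1/6 with a hypothetical standard error;
# the one-sided alternative VUS > 1/6 is an assumption
vus_se  <- 0.04
z_stat  <- (vus_est - 1 / 6) / vus_se
p_value <- 1 - pnorm(z_stat)

c(vus_chance = vus_chance, vus_est = vus_est, z = z_stat, p = p_value)
```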
Value
clus_vus returns an object of class "VUS" which is a list containing at least the following components:
call |
the matched call. |
vus_est |
a vector containing the estimated covariate-specific VUS. |
vus_se |
a vector containing the standard errors. |
mess_order |
a diagnostic message from checking the monotone ordering. |
newdata |
value(s) of covariate(s). |
n_p |
total number of regressors in the model. |
The generic function print is also used to show the results.
References
Xiong, C., Luo, J., Chen, L., Gao, F., Liu, J., Wang, G., Bateman, R. and Morris, J. C. (2018) “Estimating diagnostic accuracy for clustered ordinal diagnostic groups in the three-class case – Application to the early diagnosis of Alzheimer disease”. Statistical Methods in Medical Research, 27, 3, 701-714.
Examples
data(data_3class)
## One covariate
out1 <- clus_lme(fixed_formula = Y ~ X1, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### Estimate covariate-specific VUS at one value of one covariate
out_vus1 <- clus_vus(out1, newdata = data.frame(X1 = 0.5))
ci_clus_vus(out_vus1, ci_level = 0.95)
### Estimate covariate-specific VUS at multiple values of one covariate
out_vus2 <- clus_vus(out1, newdata = data.frame(X1 = c(-0.5, 0, 0.5)))
ci_clus_vus(out_vus2, ci_level = 0.95)
## Two covariates
out2 <- clus_lme(fixed_formula = Y ~ X1 + X2, name_class = "D",
name_clust = "id_Clus", data = data_3class)
### Estimate covariate-specific VUS at one point
out_vus3 <- clus_vus(out2, newdata = data.frame(X1 = 1.5, X2 = 1))
ci_clus_vus(out_vus3, ci_level = 0.95)
### Estimate covariate-specific VUS at three points
out_vus4 <- clus_vus(out2, newdata = data.frame(X1 = c(-0.5, 0.5, 0.5),
X2 = c(0, 0, 1)))
ci_clus_vus(out_vus4, ci_level = 0.95)
A simulated data
Description
A simulated data example with 30 clusters.
Usage
data(data_3class)
Format
A data frame with 225 observations (from 30 clusters).
id_Clus |
the id number of cluster. |
Y |
a vector containing test results. |
D |
a factor with 3 levels for the disease status, 1, 2, 3. The levels correspond to benign disease, early stage and late stage. |
X1 |
a continuous covariate. |
X2 |
a binary covariate. |
A simulated data
Description
A simulated data example with 60 clusters. This dataset is used in an example of analysis with the Box-Cox transformation.
Usage
data(data_3class_bcx)
Format
A data frame with 582 observations (from 60 clusters).
id_Clus |
the id number of cluster. |
Y |
a vector containing test results. |
D |
a factor with 3 levels for disease status, 1, 2, 3. The levels correspond to benign disease, early stage and late stage. |
X |
a continuous covariate. |
Plot a clus_lme object.
Description
Diagnostic plots for the linear mixed-effect model, fitted by clus_lme.
Usage
## S3 method for class 'clus_lme'
plot(x, file_name = NULL, ...)
Arguments
x |
an object of class "clus_lme", i.e., a result of |
file_name |
File name to create on disk. |
... |
further arguments used with |
Details
plot.clus_lme provides three diagnostic plots: Q-Q plots for residuals, Fitted vs. Residuals values, and Q-Q plot for cluster effects, based on ggplot().
Value
plot.clus_lme returns the diagnostic plots for the linear mixed-effect model, fitted by clus_lme.
Plot of confidence regions for covariate-specific optimal pair of thresholds.
Description
This function plots confidence regions for covariate-specific optimal pair of thresholds.
Usage
## S3 method for class 'clus_opt_thres3'
plot(
x,
ci_level = 0.95,
colors = NULL,
xlims,
ylims,
size_point = 0.5,
size_path = 0.5,
names_labels,
nrow_legend = 1,
file_name = NULL,
...
)
Arguments
x |
an object of class "clus_opt_thres3", i.e., a result of |
ci_level |
confidence level to be used for constructing the confidence regions; default is 0.95. |
colors |
a character vector of color names to be used for drawing the confidence regions. If specified, its length must equal the number of considered points (each point corresponds to a set of values for the covariates). |
xlims, ylims |
numeric vectors of dimension 2, giving the limits for x and y axes in the plot. |
size_point, size_path |
numeric values, indicating sizes for point(s) and line(s) in the plot. |
names_labels |
an optional character vector giving the label names for the covariates. |
nrow_legend |
an optional number of rows in the legend. |
file_name |
File name to create on disk. |
... |
further arguments used with |
Details
plot.clus_opt_thres3 plots the confidence regions (and point estimates) of the covariate-specific optimal pairs of thresholds. The plots are based on ggplot().
Value
plot.clus_opt_thres3 returns plots of confidence regions of covariate-specific optimal pair of thresholds.
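To illustrate what such a region looks like, the base-R sketch below draws an elliptical confidence region for a pair of thresholds under a bivariate normal approximation. The point estimate `est`, the covariance matrix `V` and the helper `conf_ellipse()` are hypothetical values and names chosen for illustration. They are neither output of nor functions in ClusROC, and the package may construct its regions differently.

```r
# Hypothetical helper: boundary of the ci_level confidence ellipse
# {t : (t - est)' V^{-1} (t - est) = qchisq(ci_level, df = 2)}.
conf_ellipse <- function(est, V, ci_level = 0.95, n = 200) {
  r <- sqrt(qchisq(ci_level, df = 2))      # radius on the standardized scale
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- rbind(cos(theta), sin(theta))  # 2 x n points on the unit circle
  L <- t(chol(V))                          # lower-triangular factor, V = L %*% t(L)
  t(est + r * L %*% circle)                # n x 2 matrix of boundary points
}

est <- c(t1 = 2.1, t2 = 4.3)                  # made-up optimal pair of thresholds
V <- matrix(c(0.04, 0.01, 0.01, 0.09), 2, 2)  # made-up covariance matrix

region <- conf_ellipse(est, V, ci_level = 0.95)
plot(region, type = "l", xlab = "Threshold 1", ylab = "Threshold 2")
points(est[1], est[2], pch = 19)              # point estimate
```

Raising `ci_level` enlarges the ellipse, and a larger off-diagonal entry in `V` tilts it.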
Print summary results from ci_clus_vus
Description
print.ci_clus_vus displays the results of the output from ci_clus_vus.
Usage
## S3 method for class 'ci_clus_vus'
print(x, digits = 3, ...)
Arguments
x |
an object of class "ci_clus_vus", a result of |
digits |
minimal number of significant digits, see |
... |
further arguments passed to |
Details
print.ci_clus_vus shows a summary table for confidence interval limits for covariate-specific VUS.
Value
print.ci_clus_vus shows a summary table for confidence intervals for covariate-specific VUS.
Print summary results of a clus_lme object
Description
print.clus_lme displays results of the output from clus_lme.
Usage
## S3 method for class 'clus_lme'
print(x, digits = max(3L, getOption("digits") - 3L), call = TRUE, ...)
Arguments
x |
an object of class "clus_lme", a result of |
digits |
minimal number of significant digits, see |
call |
logical. If set to |
... |
further arguments passed to |
Details
print.clus_lme shows a summary table for the estimated parameters in the cluster-effect model (continuous diagnostic test in three-class setting).
Value
print.clus_lme returns a summary table for the estimated parameters in the cluster-effect model.
Print summary results from clus_opt_thres3
Description
print.clus_opt_thres3 displays the results of the output from clus_opt_thres3.
Usage
## S3 method for class 'clus_opt_thres3'
print(x, digits = 3, call = TRUE, ...)
Arguments
x |
an object of class "clus_opt_thres3", a result of |
digits |
minimal number of significant digits, see |
call |
logical. If set to |
... |
further arguments passed to |
Details
print.clus_opt_thres3 shows a summary table for covariate-specific optimal pair of thresholds estimates.
Value
print.clus_opt_thres3 returns a summary table for results of covariate-specific optimal pair of thresholds estimation.
Print summary results from clus_tcfs
Description
print.clus_tcfs displays the results of the output from clus_tcfs.
Usage
## S3 method for class 'clus_tcfs'
print(x, digits = 3, call = TRUE, ...)
Arguments
x |
an object of class "clus_tcfs", a result of |
digits |
minimal number of significant digits, see |
call |
logical. If set to |
... |
further arguments passed to |
Details
print.clus_tcfs shows a summary table for covariate-specific TCFs estimates.
Value
print.clus_tcfs returns a summary table for covariate-specific TCFs estimates.
Print summary results from clus_vus
Description
print.clus_vus displays the results of the output from clus_vus.
Usage
## S3 method for class 'clus_vus'
print(x, digits = 3, call = TRUE, ...)
Arguments
x |
an object of class "clus_vus", a result of |
digits |
minimal number of significant digits, see |
call |
logical. If set to |
... |
further arguments passed to |
Details
print.clus_vus shows a summary table for covariate-specific VUS estimates, containing estimates, standard errors, z-values and p-values for testing the null hypothesis H_0: VUS = 1/6 against the alternative H_A: VUS > 1/6.
Value
print.clus_vus returns a summary table for covariate-specific VUS estimates.
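The test reported in the summary table can be reproduced by hand from an estimate and its standard error. The sketch below uses made-up values for `vus_hat` and `se_hat` (they are not package output) and assumes the usual normal approximation. The reference value 1/6 is the VUS of a test that separates the three classes no better than chance.

```r
vus_hat <- 0.62   # hypothetical covariate-specific VUS estimate
se_hat  <- 0.05   # hypothetical standard error of the estimate

# Standardized distance of the estimate from the chance level 1/6
z_value <- (vus_hat - 1/6) / se_hat

# One-sided p-value for H_A: VUS > 1/6
p_value <- pnorm(z_value, lower.tail = FALSE)

signif(c(estimate = vus_hat, se = se_hat, z = z_value, p = p_value), 3)
```

A small p-value indicates that the test discriminates among the three classes better than chance at the given covariate values.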