1 Introduction

Meta-analysis is a common tool for integrating findings across multiple OMIC scans, particularly when investigators have access only to summary results from each study. Traditional meta-analysis techniques often overlook hidden non-independence among study elements, such as overlapping or related subjects, which can bias the aggregated results. The corrmeta package implements correlated meta-analysis, which accounts for these dependencies in studies with potentially related subjects (Province 2005; Borecki and Province 2008; Province and Borecki 2013). This vignette covers basic usage of the corrmeta package.

2 Installation

2.1 Install from CRAN

install.packages("corrmeta")

Try this first before other installation methods.

2.2 Install from GitHub

devtools::install_github("wsjung/corrmeta")

2.3 Load the package

library(corrmeta)

Check that there is no error when loading the package.

3 Simple example

3.1 Preprocessing

3.1.1 Load data

data(snp_example, package="corrmeta")
varlist <- c("trt1","trt2","trt3")

This loads snp_example, a short, simulated SNP-trait association dataset with one p-value column per scan (trt1, trt2, and trt3). Although the examples use SNP data, corrmeta works with any OMIC unit of inference that is common to the input scans. corrmeta requires a single dataframe as input, with the OMIC units of inference in the column markname and one column per scan.
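If the scans start out as separate tables, they need to be combined into this layout first. The following is a minimal sketch in base R, assuming each scan is a data frame with a markname column and a single p-value column. The names scan1, scan2, pval, and scan_cols are hypothetical and not part of corrmeta.

```r
# Hypothetical per-scan summary tables (names and values are illustrative only).
scan1 <- data.frame(markname = c("m1", "m2", "m3"), pval = c(0.20, 0.50, 0.01))
scan2 <- data.frame(markname = c("m2", "m3", "m4"), pval = c(0.30, 0.04, 0.70))

# Rename each p-value column after its scan so the columns stay distinct.
names(scan1)[names(scan1) == "pval"] <- "trt1"
names(scan2)[names(scan2) == "pval"] <- "trt2"

# Full outer join on markname: markers absent from a scan become NA.
combined <- Reduce(
  function(x, y) merge(x, y, by = "markname", all = TRUE),
  list(scan1, scan2)
)

# Column names that would be passed as the list of scans.
scan_cols <- c("trt1", "trt2")
combined
```

Using all = TRUE keeps markers that appear in only one scan, which matters when the scans do not cover exactly the same markers.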

3.2 Correlated meta-analysis

With the data loaded, we can now run the function tetracorr, which takes the input dataframe and varlist, the list of scans given as column names of that dataframe. Briefly, tetracorr converts the input p-values to z-scores using the complement probit transformation and then calculates the tetrachoric correlations (the two-category case of polychoric correlations) between scans.
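To make the transformation concrete, here is a small sketch of the complement probit step in base R, assuming it amounts to taking the standard normal quantile of 1 - p. It illustrates the idea only and is not corrmeta's internal code.

```r
# Example p-values (the first three trt1 values in snp_example).
p <- c(0.35580, 0.58850, 0.18840)

# Complement probit: z = qnorm(1 - p), written with lower.tail = FALSE,
# which behaves better numerically when p is very small.
z <- qnorm(p, lower.tail = FALSE)
z
```

Small p-values map to large positive z-scores, p = 0.5 maps to 0, and p-values above 0.5 map to negative z-scores.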

tc <- tetracorr(snp_example, varlist)
tc
## $sigma
## # A tibble: 3 × 4
##   row     trt1  trt2   trt3
##   <chr>  <dbl> <dbl>  <dbl>
## 1 trt1   1     0.215 -0.215
## 2 trt2   0.215 1      0.127
## 3 trt3  -0.215 0.127  1    
## 
## $sum_sigma
## [1] 3.253552

tetracorr returns an object with two elements. sigma is the table of tetrachoric correlation coefficients between each pair of input scans. sum_sigma is the sum of all entries of sigma, including the diagonal, which is why it exceeds 3 in this example.
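As a quick check on how the two elements relate, the sketch below rebuilds the matrix from the rounded coefficients printed above and sums every entry, diagonal included. Because the printed coefficients are rounded to three decimals, the result only approximates the reported sum_sigma.

```r
scans <- c("trt1", "trt2", "trt3")

# Rounded coefficients copied from the printed $sigma table.
sigma <- matrix(
  c( 1.000, 0.215, -0.215,
     0.215, 1.000,  0.127,
    -0.215, 0.127,  1.000),
  nrow = 3, byrow = TRUE, dimnames = list(scans, scans)
)

# Sum of all nine entries: about 3.254, versus the reported 3.253552.
sum(sigma)
```

With an actual tetracorr result, the same sum could presumably be obtained by dropping the label column first, for example sum(as.matrix(tc$sigma[, -1])).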

3.3 Fisher’s method

The final correlated meta-analysis p-value can be computed using Fisher’s method. fishp takes the input dataframe, the list of scans, and the two outputs from tetracorr.

fishp(snp_example, varlist, tc$sigma, tc$sum_sigma)
##          markname    trt1   trt2    trt3 num_obs sum_sigma_var sum_chisq
## 1  c01b000015585s 0.35580 0.7356 0.69200       3      3.253552  3.417249
## 2  c01b000015644s 0.58850 0.4539 0.71640       3      3.253552  3.307147
## 3  c01b000015647s 0.18840 0.3029 0.21110       3      3.253552  8.837928
## 4  c01b000015717s 0.99820 0.2474 0.20290       3      3.253552  5.987185
## 5  c01b000015721s 0.74750 0.2206 0.19540       3      3.253552  6.870263
## 6  c01b000016805s 0.08051 0.1532 0.79100       3      3.253552  9.259684
## 7  c01b000016809s 0.07062 0.2896 0.85790       3      3.253552  8.085928
## 8  c01b000016856s 0.74300 0.5204 0.31930       3      3.253552  4.183682
## 9  c01b000016946s 0.77860 0.6758 0.80840       3      3.253552  1.709628
## 10 c01b000016963s 0.82460 0.7960 0.30990       3      3.253552  3.185037
## 11 c01b000016968s 0.13200 0.5866 0.25170       3      3.253552  7.875766
## 12 c01b000016977s 0.82080 0.7761 0.21520       3      3.253552  3.974274
## 13 c01b000016993s 0.18290 0.6209 0.06663       3      3.253552  9.768003
## 14 c01b000017041s 0.76820 0.8736 0.54980       3      3.253552  1.994077
## 15 c01b000017101s 0.24760 0.3189 0.10090       3      3.253552  9.664888
## 16 c01b000017147s 0.03534 0.9412 0.99310       3      3.253552  6.820527
## 17 c01b000017181s 0.84080 0.7264 0.76440       3      3.253552  1.523440
## 18 c01b000017375s 0.97000 0.2214 0.03283       3      3.253552  9.909312
## 19 c01b000017379s 0.56130 0.5311 0.05570       3      3.253552  8.196160
##         sum_z    pvalue     meta_z     meta_p meta_nlog10p
## 1  -0.7616582 0.7549448 -0.4222612 0.66358283   0.17810486
## 2  -0.6800542 0.7694257 -0.3770202 0.64692071   0.18914894
## 3   2.2024960 0.1829002  1.2210578 0.11103206   0.95455159
## 4  -1.3972360 0.4246272 -0.7746239 0.78071902   0.10750524
## 5   0.9616926 0.3330121  0.5331598 0.29696150   0.52729986
## 6   1.6145585 0.1594917  0.8951069 0.18536498   0.73197231
## 7   0.9548107 0.2318753  0.5293445 0.29828326   0.52537112
## 8  -0.2341224 0.6518348 -0.1297968 0.55163641   0.25834708
## 9  -2.0954750 0.9443755 -1.1617257 0.87732654   0.05683873
## 10 -1.2643232 0.7852901 -0.7009374 0.75832894   0.12014237
## 11  1.5673289 0.2473471  0.8689229 0.19244465   0.71569416
## 12 -0.8889986 0.6801580 -0.4928584 0.68894370   0.16181627
## 13  2.0978928 0.1347681  1.1630661 0.12240135   0.91221381
## 14 -2.0016627 0.9202425 -1.1097164 0.86643938   0.06226182
## 15  2.4292787 0.1394923  1.3467855 0.08902466   1.05048968
## 16 -2.2198268 0.3377643 -1.2306660 0.89077610   0.05023145
## 17 -2.3202403 0.9579223 -1.2863350 0.90083691   0.04535383
## 18  0.7274177 0.1285234  0.4032784 0.34337171   0.46423549
## 19  1.3596307 0.2240816  0.7537756 0.22549200   0.64686886
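To see how the output columns relate to one another, the following sketch recomputes row 3 (c01b000015647s) by hand in base R. The formulas are inferred from the printed output rather than taken from the corrmeta source, so treat them as assumptions: sum_chisq as Fisher's statistic, pvalue as its chi-square tail probability with two degrees of freedom per scan, sum_z as the sum of complement probit z-scores, and meta_z as sum_z divided by the square root of sum_sigma.

```r
# p-values for marker c01b000015647s, copied from the table above.
p <- c(trt1 = 0.18840, trt2 = 0.3029, trt3 = 0.21110)
sum_sigma <- 3.253552  # reported by tetracorr for the complete data

sum_chisq <- -2 * sum(log(p))                            # Fisher's statistic
pvalue    <- pchisq(sum_chisq, df = 2 * length(p),
                    lower.tail = FALSE)                  # unadjusted Fisher p-value
sum_z     <- sum(qnorm(p, lower.tail = FALSE))           # summed z-scores
meta_z    <- sum_z / sqrt(sum_sigma)                     # correlation-adjusted z
meta_p    <- pnorm(meta_z, lower.tail = FALSE)

round(c(sum_chisq = sum_chisq, pvalue = pvalue, sum_z = sum_z,
        meta_z = meta_z, meta_p = meta_p,
        meta_nlog10p = -log10(meta_p)), 4)
```

Under these assumptions the recomputed values agree with the printed row up to rounding. The adjustment enters through sum_sigma: with independent scans it would equal the number of scans (3), so the positive net correlation here shrinks meta_z slightly.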

4 Example with missing samples

This example shows how corrmeta handles missing samples, that is, markers with no p-value in one or more scans. This is possible because every lower-dimensional marginal of a multivariate normal (MVN) distribution is itself MVN (see Province and Borecki 2013). The example dataset is the same as above, but with some values removed.

4.1 Preprocessing

data(snp_example_missing, package="corrmeta")
varlist <- c("trt1","trt2","trt3")
snp_example_missing
##          markname    trt1   trt2    trt3
## 1  c01b000015585s 0.35580     NA      NA
## 2  c01b000015644s 0.58850 0.4539      NA
## 3  c01b000015647s 0.18840 0.3029 0.21110
## 4  c01b000015717s 0.99820 0.2474 0.20290
## 5  c01b000015721s 0.74750 0.2206 0.19540
## 6  c01b000016805s 0.08051 0.1532 0.79100
## 7  c01b000016809s 0.07062 0.2896 0.85790
## 8  c01b000016856s 0.74300 0.5204 0.31930
## 9  c01b000016946s 0.77860 0.6758 0.80840
## 10 c01b000016963s 0.82460 0.7960 0.30990
## 11 c01b000016968s 0.13200 0.5866 0.25170
## 12 c01b000016977s 0.82080 0.7761 0.21520
## 13 c01b000016993s 0.18290 0.6209 0.06663
## 14 c01b000017041s 0.76820 0.8736 0.54980
## 15 c01b000017101s 0.24760 0.3189 0.10090
## 16 c01b000017147s 0.03534 0.9412 0.99310
## 17 c01b000017181s 0.84080 0.7264 0.76440
## 18 c01b000017375s 0.97000 0.2214 0.03283
## 19 c01b000017379s 0.56130 0.5311 0.05570

We can see that trt2 is missing c01b000015585s and trt3 is missing both c01b000015585s and c01b000015644s.

4.2 Correlated meta-analysis

tc <- tetracorr(snp_example_missing, varlist)
tc
## $sigma
## # A tibble: 3 × 4
##   row     trt1  trt2   trt3
##   <chr>  <dbl> <dbl>  <dbl>
## 1 trt1   1     0.319 -0.212
## 2 trt2   0.319 1      0.192
## 3 trt3  -0.212 0.192  1    
## 
## $sum_sigma
## [1] 3.597483

4.3 Fisher’s method

fishp(snp_example_missing, varlist, tc$sigma, tc$sum_sigma)
##          markname    trt1   trt2    trt3 num_obs sum_sigma_var sum_chisq
## 1  c01b000015585s 0.35580     NA      NA       1      1.000000  2.066773
## 2  c01b000015644s 0.58850 0.4539      NA       2      2.637578  2.640113
## 3  c01b000015647s 0.18840 0.3029 0.21110       3      3.597483  8.837928
## 4  c01b000015717s 0.99820 0.2474 0.20290       3      3.597483  5.987185
## 5  c01b000015721s 0.74750 0.2206 0.19540       3      3.597483  6.870263
## 6  c01b000016805s 0.08051 0.1532 0.79100       3      3.597483  9.259684
## 7  c01b000016809s 0.07062 0.2896 0.85790       3      3.597483  8.085928
## 8  c01b000016856s 0.74300 0.5204 0.31930       3      3.597483  4.183682
## 9  c01b000016946s 0.77860 0.6758 0.80840       3      3.597483  1.709628
## 10 c01b000016963s 0.82460 0.7960 0.30990       3      3.597483  3.185037
## 11 c01b000016968s 0.13200 0.5866 0.25170       3      3.597483  7.875766
## 12 c01b000016977s 0.82080 0.7761 0.21520       3      3.597483  3.974274
## 13 c01b000016993s 0.18290 0.6209 0.06663       3      3.597483  9.768003
## 14 c01b000017041s 0.76820 0.8736 0.54980       3      3.597483  1.994077
## 15 c01b000017101s 0.24760 0.3189 0.10090       3      3.597483  9.664888
## 16 c01b000017147s 0.03534 0.9412 0.99310       3      3.597483  6.820527
## 17 c01b000017181s 0.84080 0.7264 0.76440       3      3.597483  1.523440
## 18 c01b000017375s 0.97000 0.2214 0.03283       3      3.597483  9.909312
## 19 c01b000017379s 0.56130 0.5311 0.05570       3      3.597483  8.196160
##         sum_z    pvalue      meta_z    meta_p meta_nlog10p
## 1   0.3697081 0.9134561  0.36970809 0.3558000   0.44879406
## 2  -0.1078742 0.8524690 -0.06642244 0.5264792   0.27861874
## 3   2.2024960 0.1829002  1.16122324 0.1227756   0.91088807
## 4  -1.3972360 0.4246272 -0.73666555 0.7693371   0.11388331
## 5   0.9616926 0.3330121  0.50703373 0.3060656   0.51418552
## 6   1.6145585 0.1594917  0.85124462 0.1973167   0.70483607
## 7   0.9548107 0.2318753  0.50340539 0.3073396   0.51238142
## 8  -0.2341224 0.6518348 -0.12343648 0.5491193   0.26033332
## 9  -2.0954750 0.9443755 -1.10479851 0.8653765   0.06279488
## 10 -1.2643232 0.7852901 -0.66658985 0.7474829   0.12639873
## 11  1.5673289 0.2473471  0.82634374 0.2043046   0.68972193
## 12 -0.8889986 0.6801580 -0.46870728 0.6803606   0.16726087
## 13  2.0978928 0.1347681  1.10607323 0.1343474   0.87177070
## 14 -2.0016627 0.9202425 -1.05533782 0.8543646   0.06835677
## 15  2.4292787 0.1394923  1.28079001 0.1001337   0.99941966
## 16 -2.2198268 0.3377643 -1.17036060 0.8790721   0.05597552
## 17 -2.3202403 0.9579223 -1.22330167 0.8893921   0.05090673
## 18  0.7274177 0.1285234  0.38351686 0.3506683   0.45510351
## 19  1.3596307 0.2240816  0.71683887 0.2367368   0.62573429
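The sum_sigma_var column shows the subdimensional property at work: for each marker, only the rows and columns of sigma that belong to the scans actually observed appear to enter the variance term. The sketch below checks this against the printed output using the rounded coefficients from section 4.2. The helper sub_sigma_sum is hypothetical and not a corrmeta function.

```r
scans <- c("trt1", "trt2", "trt3")

# Rounded coefficients copied from the $sigma table in section 4.2.
sigma <- matrix(
  c( 1.000, 0.319, -0.212,
     0.319, 1.000,  0.192,
    -0.212, 0.192,  1.000),
  nrow = 3, byrow = TRUE, dimnames = list(scans, scans)
)

# Hypothetical helper: sum the sub-matrix of sigma for the non-missing scans.
sub_sigma_sum <- function(pvals, sigma) {
  observed <- names(pvals)[!is.na(pvals)]
  sum(sigma[observed, observed, drop = FALSE])
}

row1 <- c(trt1 = 0.3558, trt2 = NA,     trt3 = NA)  # c01b000015585s
row2 <- c(trt1 = 0.5885, trt2 = 0.4539, trt3 = NA)  # c01b000015644s

sub_sigma_sum(row1, sigma)  # 1, matching the reported 1.000000
sub_sigma_sum(row2, sigma)  # about 2.638, versus the reported 2.637578
```

For a marker observed in a single scan the variance term is 1, so meta_p reduces to that scan's own p-value, as row 1 of the output shows (meta_p = 0.3558).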

5 Session info

## R version 4.3.3 (2024-02-29)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS 15.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Chicago
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] corrmeta_1.0.1   dplyr_1.1.4      magrittr_2.0.4   BiocStyle_2.30.0
## 
## loaded via a namespace (and not attached):
##  [1] vctrs_0.6.5         cli_3.6.5           knitr_1.50         
##  [4] rlang_1.1.6         xfun_0.53           purrr_1.1.0        
##  [7] generics_0.1.4      jsonlite_2.0.0      glue_1.8.0         
## [10] htmltools_0.5.8.1   sass_0.4.10         rmarkdown_2.30     
## [13] evaluate_1.0.5      jquerylib_0.1.4     tibble_3.3.0       
## [16] fastmap_1.2.0       mvtnorm_1.3-3       yaml_2.3.10        
## [19] lifecycle_1.0.4     bookdown_0.45       BiocManager_1.30.26
## [22] compiler_4.3.3      pkgconfig_2.0.3     tidyr_1.3.1        
## [25] polycor_0.8-1       rstudioapi_0.17.1   digest_0.6.37      
## [28] admisc_0.38         R6_2.6.1            utf8_1.2.6         
## [31] tidyselect_1.2.1    parallel_4.3.3      pillar_1.11.1      
## [34] bslib_0.9.0         withr_3.0.2         tools_4.3.3        
## [37] cachem_1.1.0

References

Borecki, IB, and MA Province. 2008. “Genetic and Genomic Discovery Using Family Studies.” Circulation 118 (10): 1057–63. https://doi.org/10.1161/CIRCULATIONAHA.107.714592.
Province, MA. 2005. “Meta-Analyses of Correlated Genomic Scans.” Genet Epidemiol 29 (274).
Province, MA, and IB Borecki. 2013. “A Correlated Meta-Analysis Strategy for Data Mining ’Omic’ Scans.” Pac Symp Biocomput, 236–46.