Introduction to MiscMetabar: an R package to facilitate visualization and reproducibility in metabarcoding analysis

Raison d’être

Quick overview

For an introduction to metabarcoding in R, please see the state of the field vignette. The import, export and track vignette explains how to import and export phyloseq objects. It also shows how to summarize useful information (number of sequences, samples and clusters) across bioinformatic pipelines.
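
As a rough illustration of the kind of summary mentioned above, the three counts can be obtained with phyloseq's core accessors. This is only a minimal sketch on the `GlobalPatterns` dataset shipped with phyloseq; it does not use the MiscMetabar tracking functions described in the import, export and track vignette, and the name `summary_counts` is introduced here for illustration only.

```r
library("phyloseq")

# Example dataset shipped with phyloseq (an assumption made for this sketch)
data("GlobalPatterns", package = "phyloseq")

# Number of samples, clusters (OTUs/ASVs) and sequences (reads)
summary_counts <- c(
  samples   = nsamples(GlobalPatterns),
  clusters  = ntaxa(GlobalPatterns),
  sequences = sum(sample_sums(GlobalPatterns))
)
print(summary_counts)
```

Computing the same three numbers after each step of a pipeline gives a simple way to see where sequences, samples or clusters are lost.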

If you are interested in ecological metrics, see the vignettes describing alpha-diversity and beta-diversity analysis. The vignette filter taxa and samples describes some data-filtering processes using MiscMetabar, and the reclustering tutorial introduces the different ways of clustering already-clustered OTUs/ASVs. The vignette tengeler explores the dataset from Tengeler et al. (2020) using some MiscMetabar functions.

For developers, I also wrote a vignette describing some coding rules.

Summarize a physeq object

library("MiscMetabar")
library("phyloseq")
data("data_fungi")
summary_plot_pq(data_fungi)

Create an interactive table of the tax_table

data("GlobalPatterns", package = "phyloseq")
tax_datatable(subset_taxa(
  GlobalPatterns,
  rowSums(GlobalPatterns@otu_table) > 100000
))
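
The filter above uses `rowSums()` on the OTU table, which gives per-taxon totals only when taxa are stored as rows (as they are in `GlobalPatterns`). A minimal sketch of an orientation-independent variant, using phyloseq's `taxa_sums()` and `prune_taxa()`, might look as follows. The names `abundant` and `gp_abundant` are introduced here for illustration only.

```r
library("phyloseq")
data("GlobalPatterns", package = "phyloseq")

# taxa_sums() returns the total number of reads per taxon,
# whether taxa are stored as rows or as columns
abundant <- taxa_sums(GlobalPatterns) > 100000

# Keep only the taxa flagged TRUE; the result is still a phyloseq object
gp_abundant <- prune_taxa(abundant, GlobalPatterns)
ntaxa(gp_abundant)
```

The resulting `gp_abundant` object can then be passed to `tax_datatable()` in the same way as the subset built in the example above.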

Sankey diagram of the tax_table

gp <- subset_taxa(GlobalPatterns, GlobalPatterns@tax_table[, 1] == "Archaea")
sankey_pq(gp, taxa = c(1:5))

(Interactive Sankey diagram of the first five taxonomic ranks of the Archaea in GlobalPatterns. Link widths show the number of OTUs: 208 Archaea OTUs in total, split into Crenarchaeota, 106 OTUs, and Euryarchaeota, 102 OTUs.)
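
The numbers attached to the links of the Sankey diagram are OTU counts per pair of consecutive taxonomic ranks. As a hedged sketch of how such counts could be tabulated by hand for the first two ranks, the following uses base R on the `tax_table`. It is not how `sankey_pq()` computes them internally; the names `tax` and `links` are introduced here for illustration, and the subset is built with the `Kingdom` column name instead of the column index used above.

```r
library("phyloseq")
data("GlobalPatterns", package = "phyloseq")

# Same Archaea subset as above, selected by rank name
gp <- subset_taxa(GlobalPatterns, Kingdom == "Archaea")

# Convert the taxonomy table to a plain data frame
tax <- as.data.frame(as(tax_table(gp), "matrix"))

# Count OTUs for each Kingdom -> Phylum link
links <- as.data.frame(
  table(from = tax$Kingdom, to = tax$Phylum),
  stringsAsFactors = FALSE
)
links <- links[links$Freq > 0, ]
print(links)
```

Repeating the same tabulation for each pair of adjacent ranks (Phylum to Class, Class to Order, and so on) should give the counts shown on the remaining links.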

UpSet plot to visualize the distribution of taxa across sample variables

upset_pq(gp, "SampleType", taxa = "Class")

References

Tengeler, A.C., Dam, S.A., Wiesmann, M. et al. Gut microbiota from persons with attention-deficit/hyperactivity disorder affects the brain in mice. Microbiome 8, 44 (2020). https://doi.org/10.1186/s40168-020-00816-x

Session information

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Debian GNU/Linux 12 (bookworm)
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.11.0 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0
#> 
#> locale:
#>  [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
#>  [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Europe/Paris
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] MiscMetabar_0.14.1 purrr_1.0.4        dplyr_1.1.4        dada2_1.34.0      
#> [5] Rcpp_1.0.14        ggplot2_3.5.1      phyloseq_1.50.0   
#> 
#> loaded via a namespace (and not attached):
#>   [1] bitops_1.0-9                deldir_2.0-4               
#>   [3] permute_0.9-7               rlang_1.1.5                
#>   [5] magrittr_2.0.3              ade4_1.7-22                
#>   [7] matrixStats_1.5.0           compiler_4.4.2             
#>   [9] mgcv_1.9-1                  png_0.1-8                  
#>  [11] vctrs_0.6.5                 reshape2_1.4.4             
#>  [13] stringr_1.5.1               pwalign_1.0.0              
#>  [15] pkgconfig_2.0.3             crayon_1.5.3               
#>  [17] fastmap_1.2.0               XVector_0.44.0             
#>  [19] labeling_0.4.3              Rsamtools_2.20.0           
#>  [21] rmarkdown_2.29              UCSC.utils_1.0.0           
#>  [23] xfun_0.50                   zlibbioc_1.50.0            
#>  [25] cachem_1.1.0                GenomeInfoDb_1.40.1        
#>  [27] jsonlite_1.8.9              biomformat_1.32.0          
#>  [29] rhdf5filters_1.16.0         DelayedArray_0.30.1        
#>  [31] Rhdf5lib_1.26.0             BiocParallel_1.38.0        
#>  [33] jpeg_0.1-10                 parallel_4.4.2             
#>  [35] cluster_2.1.6               R6_2.6.0                   
#>  [37] bslib_0.9.0                 stringi_1.8.4              
#>  [39] RColorBrewer_1.1-3          ComplexUpset_1.3.3         
#>  [41] GenomicRanges_1.56.2        jquerylib_0.1.4            
#>  [43] SummarizedExperiment_1.34.0 iterators_1.0.14           
#>  [45] knitr_1.49                  IRanges_2.38.1             
#>  [47] Matrix_1.7-1                splines_4.4.2              
#>  [49] igraph_2.1.4                tidyselect_1.2.1           
#>  [51] abind_1.4-8                 yaml_2.3.10                
#>  [53] vegan_2.6-10                codetools_0.2-20           
#>  [55] hwriter_1.3.2.1             lattice_0.22-6             
#>  [57] tibble_3.2.1                plyr_1.8.9                 
#>  [59] Biobase_2.64.0              withr_3.0.2                
#>  [61] ShortRead_1.62.0            evaluate_1.0.3             
#>  [63] survival_3.7-0              RcppParallel_5.1.10        
#>  [65] Biostrings_2.72.1           pillar_1.10.1              
#>  [67] MatrixGenerics_1.16.0       DT_0.33                    
#>  [69] foreach_1.5.2               stats4_4.4.2               
#>  [71] generics_0.1.3              S4Vectors_0.42.1           
#>  [73] munsell_0.5.1               scales_1.3.0               
#>  [75] glue_1.8.0                  tools_4.4.2                
#>  [77] interp_1.1-6                data.table_1.16.4          
#>  [79] GenomicAlignments_1.40.0    rhdf5_2.48.0               
#>  [81] grid_4.4.2                  tidyr_1.3.1                
#>  [83] ape_5.8-1                   crosstalk_1.2.1            
#>  [85] latticeExtra_0.6-30         colorspace_2.1-1           
#>  [87] patchwork_1.3.0             networkD3_0.4              
#>  [89] nlme_3.1-167                GenomeInfoDbData_1.2.12    
#>  [91] cli_3.6.3                   S4Arrays_1.4.1             
#>  [93] gtable_0.3.6                sass_0.4.9                 
#>  [95] digest_0.6.37               BiocGenerics_0.50.0        
#>  [97] SparseArray_1.4.8           htmlwidgets_1.6.4          
#>  [99] farver_2.1.2                htmltools_0.5.8.1          
#> [101] multtest_2.60.0             lifecycle_1.0.4            
#> [103] httr_1.4.7                  MASS_7.3-61