| Type: | Package | 
| Title: | Detecting Trait Clustering in Environmental Gradients | 
| Version: | 0.1.1 | 
| Author: | Mateu Menendez-Serra, Vicente J. Ontiveros, Emilio O. Casamayor, David Alonso | 
| Maintainer: | Mateu Menendez-Serra <mateu.menendez@ceab.csic.es> | 
| Description: | The Randomized Trait Community Clustering method (Triado-Margarit et al., 2019, <doi:10.1038/s41396-019-0454-4>) is a statistical approach for determining whether an observed trait clustering pattern is related to an increasing environmental constraint. The method 1) determines whether trait clustering exists in the sampled communities and 2) assesses whether the observed clustering signal is related to an increasing environmental constraint along an environmental gradient. In addition, when the effect of the environmental gradient is not linear, it allows consistent thresholds in community assembly to be determined from trait values. | 
| License: | GPL-3 | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.1.0 | 
| Imports: | matrixStats, vegan, Rcpp | 
| Suggests: | testthat, knitr, rmarkdown | 
| LinkingTo: | testthat, Rcpp | 
| NeedsCompilation: | yes | 
| Packaged: | 2020-06-12 17:17:37 UTC; macbookair | 
| Depends: | R (≥ 3.5.0) | 
| Repository: | CRAN | 
| Date/Publication: | 2020-06-12 17:50:03 UTC | 
RTCC: Detecting trait clustering in environmental gradients with the Randomized Trait Community Clustering method
Description
A set of functions that determine whether the observed traits show clustering/overdispersion patterns in the observed samples and, if so, establish whether the observed pattern is linked to the effect of an environmental gradient.
Details
The study of phenotypic similarities and differences within species along environmental gradients can be a powerful tool that complements taxon-based approaches when assessing the contribution of stochastic and deterministic processes to community assembly. To this end, this package provides an easy implementation of a method for detecting clustering/overdispersion patterns along an environmental gradient (Triado-Margarit et al., 2019). A first function assesses whether the observed traits exhibit a clustering/overdispersion pattern in the tested samples. If so, two subsequent functions determine whether the observed pattern is linked to the effect of an environmental variable and assess its statistical significance.
Data entry
The data consist of presence-absence observations along a measured environmental gradient and quantitative trait information for the observed organisms.
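As an illustration of this layout, the following base R sketch builds three small tables in the shape the functions expect. All organism names, sample names, column names and values are invented for the example; only the arrangement (traits plus environmental range, presence-absence, sample metadata) follows the description above.

```r
# Toy inputs mirroring the three-table layout; every value here is invented.
organisms <- paste0("org_", 1:6)

# table1: organism names, trait values, then the min/max of the
# environmental variable at which each organism was observed
table1 <- data.frame(
  organism = organisms,
  trait_a  = c(2.1, 3.4, 2.8, 5.0, 4.2, 3.9),
  trait_b  = c(40, 55, 61, 47, 52, 66),
  min_env  = c(1, 1, 5, 5, 10, 10),
  max_env  = c(10, 20, 20, 30, 30, 30),
  stringsAsFactors = FALSE
)

# table2: presence-absence (0/1), organisms in rows, samples in columns
table2 <- data.frame(
  organism = organisms,
  S1 = c(1, 1, 0, 0, 0, 0),
  S2 = c(1, 1, 1, 1, 0, 0),
  S3 = c(0, 1, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# table3: sample names and the measured environmental variable
table3 <- data.frame(
  sample_ID = c("S1", "S2", "S3"),
  env       = c(2, 8, 25),
  stringsAsFactors = FALSE
)

# Consistency checks: same organisms in the same order, same samples
stopifnot(identical(table1$organism, table2$organism))
stopifnot(identical(colnames(table2)[-1], table3$sample_ID))
```

With tables shaped like these, the trait columns would be 2:3 and the minimum and maximum environmental columns would be 4 and 5.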
References
Triado-Margarit, X., Capitan, J.A., Menendez-Serra, M. et al. (2019) A Randomized Trait Community Clustering approach to unveil consistent environmental thresholds in community assembly. ISME J 13, 2681–2689. https://doi.org/10.1038/s41396-019-0454-4
Genomic data linked to saline lagoons.
Description
A dataset containing genomic data of 544 genomes that matched 16S rRNA data from saline lagoons of the Monegros desert area.
Usage
group_information
Format
A data frame with 544 rows and 14 variables:
- genome
- Genome IMG code 
- Genome_Size
- Genome size 
- GC_perc
- GC percentage 
- Coding_base_perc
- Coding base percentage 
- CDS_perc
- CDS percentage 
- RNA_perc
- RNA percentage 
- rRNA_count
- rRNA count 
- Transporter_perc
- Transporter proteins percentage 
- Signal_peptide_perc
- Signal peptide percentage 
- Transmembrane_perc
- Transmembrane proteins percentage 
- Gene_Count
- Gene count 
- min_env
- Minimum environmental value where the organism has been observed 
- max_env
- Maximum environmental value where the organism has been observed 
- rel_abundance
- Relative abundance of the organism on the metacommunity 
...
Source
Triadó-Margarit, X., Capitán, J.A., Menéndez-Serra, M. et al. A Randomized Trait Community Clustering approach to unveil consistent environmental thresholds in community assembly. ISME J 13, 2681–2689 (2019).
Salinity values of saline lagoons.
Description
A dataset containing salinity values of 136 lagoons on the Monegros desert area.
Usage
metadata
Format
A data frame with 136 rows and 2 variables:
- sample_ID
- Sample internal code 
- salinity
- Sample salinity value 
Source
Triadó-Margarit, X., Capitán, J.A., Menéndez-Serra, M. et al. A Randomized Trait Community Clustering approach to unveil consistent environmental thresholds in community assembly. ISME J 13, 2681–2689 (2019).
Trait selection
Description
This function determines whether the selected traits exhibit a clustering/overdispersion signal in the tested samples. For each trait, it compares the observed Mean Pairwise Distance (MPD) of each sample against a distribution of MPDs of synthetic communities obtained by a randomization test. Each synthetic community is built by maintaining the original sample richness and randomly selecting organisms from the global pool.
Usage
rtcc1(table1, table2, table3, traits_columns, repetitions)
Arguments
| table1 | A data frame containing organism names in the first column and their trait values in the following columns. It must also contain two columns with the maximum and minimum values of the tested environmental variable at which each organism has been observed. | 
| table2 | A presence-absence observation table with organism names in the first column and sample names as the remaining column names. | 
| table3 | A data frame containing sample names in the first column and environmental parameters in the following columns. | 
| traits_columns | Table 1 column numbers where different trait values appear. | 
| repetitions | Number of simulated synthetic communities distributions. | 
Value
The function returns a data frame with trait names as column names, containing the p-value distribution of each trait.
Examples
data(group_information)
data(table_presence_absence)
data(metadata)
rtcc1(group_information, table_presence_absence, metadata, 2:11, 100)
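The randomization test described for this function can be illustrated for one sample and one trait with a minimal base R sketch. This is not the package's implementation: the helpers `mpd` and `mpd_p_value`, the toy data and the one-sided p-value (share of synthetic communities at least as clustered as the observed one) are assumptions made for illustration.

```r
# Mean pairwise distance (MPD) of a numeric trait vector
mpd <- function(x) mean(dist(x))

# Hypothetical helper: randomization p-value for trait clustering in one sample.
#   pool_traits: trait values of all organisms in the global pool
#   present:     logical vector, TRUE where the organism occurs in the sample
mpd_p_value <- function(pool_traits, present, repetitions = 100) {
  richness <- sum(present)                 # synthetic communities keep this richness
  observed <- mpd(pool_traits[present])
  # Each synthetic community draws `richness` organisms at random from the pool
  null_mpd <- replicate(repetitions, mpd(sample(pool_traits, richness)))
  # Assumed one-sided test: fraction of synthetic MPDs not larger than the observed
  mean(null_mpd <= observed)
}

set.seed(42)
# Toy pool: 40 organisms with widely spread traits, 10 with very similar traits
pool_traits <- c(runif(40, 0, 100), runif(10, 48, 52))
present <- c(rep(FALSE, 40), rep(TRUE, 10))  # the sample holds the 10 similar ones
p <- mpd_p_value(pool_traits, present, repetitions = 200)
```

A small `p` indicates that the sample is more clustered for this trait than expected from random draws of equal richness; an overdispersion test would look at the opposite tail.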
Clustering signal along an environmental gradient
Description
For a given trait, this function determines whether the observed trait clustering/overdispersion in the metacommunity is linked to an environmental gradient. To do so, it sequentially removes samples in decreasing order of the environmental variable and, at each step, computes the h-index of the remaining metacommunity. This index is based on the percentage of samples in a metacommunity that present significant trait clustering/overdispersion.
Usage
rtcc2(
  table1,
  table2,
  table3,
  species_abundances,
  trait_col_number,
  min_env_col,
  max_env_col,
  env_var_col,
  h_iteration,
  repetitions,
  model
)
Arguments
| table1 | A data frame containing organism names in the first column and their trait values in the following columns. It must also contain two columns with the maximum and minimum values of the tested environmental variable at which each organism has been observed. | 
| table2 | A presence-absence observation table with organism names in the first column and sample names as the remaining column names. | 
| table3 | A data frame containing sample names in the first column and environmental parameters in the following columns. | 
| species_abundances | A vector containing the relative abundance of each organism in the whole data set, in the same order as the organisms appear in Table 1. | 
| trait_col_number | Table 1 column number of the tested trait. | 
| min_env_col | Table 1 column number indicating the minimum value of the environmental variable where each organism has been observed. | 
| max_env_col | Table 1 column number indicating the maximum value of the environmental variable where each organism has been observed. | 
| env_var_col | Table 3 column number indicating the tested environmental variable. | 
| h_iteration | Number of h-index calculations for computing a confidence interval. | 
| repetitions | Number of simulated synthetic communities distributions. | 
| model | Model selection. All models build synthetic communities based on the organism richness of the observed communities. - Model 1: organisms are selected randomly from the global pool. - Model 2: organisms are selected randomly with a probability based on their relative abundance in the global pool. - Model 3: organisms are selected randomly, but only those whose environmental range includes the value of the simulated community are eligible. - Model 4: organisms are selected randomly, but only those whose environmental range includes the value of the simulated community are eligible, and the selection probability is based on their relative abundance in the global pool. | 
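The four sampling models can be sketched as a single base R helper. The function `draw_community` and its arguments are hypothetical and only illustrate the selection rules listed for `model`; the package's internal code may differ.

```r
# Hypothetical helper: draw one synthetic community under models 1 to 4.
# Returns the indices of the selected organisms.
draw_community <- function(richness, abundance, min_env, max_env,
                           env_value, model) {
  idx <- seq_along(abundance)
  # Models 3 and 4: only organisms whose range includes env_value are eligible
  eligible <- if (model %in% c(3, 4)) {
    idx[min_env <= env_value & env_value <= max_env]
  } else {
    idx
  }
  stopifnot(richness <= length(eligible))
  # Models 2 and 4: selection probability follows relative abundance
  prob <- if (model %in% c(2, 4)) abundance[eligible] else NULL
  # Index into `eligible` so a single eligible organism is handled correctly
  eligible[sample.int(length(eligible), size = richness, prob = prob)]
}

set.seed(7)
abundance <- c(0.40, 0.30, 0.10, 0.10, 0.05, 0.05)  # invented values
min_env   <- c(1, 1, 5, 5, 10, 10)
max_env   <- c(10, 20, 20, 30, 30, 30)

m1 <- draw_community(3, abundance, min_env, max_env, env_value = 25, model = 1)
m3 <- draw_community(3, abundance, min_env, max_env, env_value = 25, model = 3)
```

In this toy pool only organisms 4, 5 and 6 tolerate an environmental value of 25, so a model 3 draw of richness 3 necessarily contains exactly those three, whereas model 1 can return any three organisms.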
Value
The function returns a data frame with the maximum value of the environmental variable in the remaining metacommunity after each sequential removal, the h-index calculated for each environmental value, and its standard deviation.
Examples
data(group_information)
data(table_presence_absence)
data(metadata)
rtcc2(group_information, table_presence_absence, metadata, group_information$sums,
9, 12, 13, 2, 100, 100, model = 1)
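The sequential removal performed by this function can be illustrated with a short base R sketch. It assumes that a p-value per sample is already available and uses the plain proportion of samples with p below 0.05 as a stand-in for the h-index; the exact definition used by the package is given in Triado-Margarit et al. (2019). The function name `h_along_gradient` and the toy values are invented.

```r
# Assumed stand-in for the h-index: share of remaining samples with p < alpha.
h_along_gradient <- function(env, p_values, alpha = 0.05) {
  # Order samples by decreasing value of the environmental variable
  ord <- order(env, decreasing = TRUE)
  env <- env[ord]
  p_values <- p_values[ord]
  n <- length(env)
  # Step k keeps samples k..n, i.e. the k - 1 highest-value samples are removed
  data.frame(
    max_env = env,
    h = vapply(seq_len(n), function(k) mean(p_values[k:n] < alpha), numeric(1))
  )
}

env      <- c(5, 10, 20, 40, 80, 160)               # invented gradient values
p_values <- c(0.60, 0.45, 0.30, 0.04, 0.01, 0.001)  # invented per-sample p-values
res <- h_along_gradient(env, p_values)
```

Here the index falls from 0.5 to 0 once the three samples with the highest environmental values are removed, which is the kind of decay expected when clustering is driven by the gradient.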
Clustering signal significance.
Description
For a given trait and environmental variable, this function creates a null model of the clustering/overdispersion pattern in order to test whether the observed pattern differs statistically from that expected at random. To do so, it sequentially removes random samples from the metacommunity and, at each step, computes the h-index of the remaining metacommunity. This index is based on the percentage of samples in a metacommunity that present significant trait clustering/overdispersion. After h iterations, it computes a 95% confidence interval of the obtained h-index for each point of the environmental gradient.
Usage
rtcc3(
  table1,
  table2,
  table3,
  species_abundances,
  trait_col_number,
  min_env_col,
  max_env_col,
  env_var_col,
  h_iteration,
  repetitions,
  model
)
Arguments
| table1 | A data frame containing organism names in the first column and their trait values in the following columns. It must also contain two columns with the maximum and minimum values of the tested environmental variable at which each organism has been observed. | 
| table2 | A presence-absence observation table with organism names in the first column and sample names as the remaining column names. | 
| table3 | A data frame containing sample names in the first column and environmental parameters in the following columns. | 
| species_abundances | A vector containing the relative abundance of each organism in the whole data set, in the same order as the organisms appear in Table 1. | 
| trait_col_number | Table 1 column number of the tested trait. | 
| min_env_col | Table 1 column number indicating the minimum value of the environmental variable where each organism has been observed. | 
| max_env_col | Table 1 column number indicating the maximum value of the environmental variable where each organism has been observed. | 
| env_var_col | Table 3 column number indicating the tested environmental variable. | 
| h_iteration | Number of h-index calculations for computing a confidence interval. | 
| repetitions | Number of simulated synthetic communities distributions. | 
| model | Model selection. All models build synthetic communities based on the organism richness of the observed communities. - Model 1: organisms are selected randomly from the global pool. - Model 2: organisms are selected randomly with a probability based on their relative abundance in the global pool. - Model 3: organisms are selected randomly, but only those whose environmental range includes the value of the simulated community are eligible. - Model 4: organisms are selected randomly, but only those whose environmental range includes the value of the simulated community are eligible, and the selection probability is based on their relative abundance in the global pool. | 
Value
The function returns a data frame with the maximum value of the environmental variable corresponding to the same number of samples in the ordered removal, the h-index calculated for each environmental value, and the percentiles 0.025, 0.5 and 0.975 of the obtained distribution for each point (median and 95% confidence interval).
Examples
data(group_information)
data(table_presence_absence)
data(metadata)
rtcc3(group_information, table_presence_absence, metadata, group_information$sums,
9, 12, 13, 2, 50, 20, model = 1)
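The null model built by this function can be sketched in base R as follows. As before, per-sample p-values are assumed to be available, the h-index is approximated by the proportion of samples with p below 0.05, and the name `h_null_envelope` is hypothetical; this illustrates the random-removal idea rather than reproducing the package code.

```r
# Hypothetical helper: null envelope of the (assumed) h-index under random removal.
h_null_envelope <- function(p_values, h_iteration = 100, alpha = 0.05) {
  n <- length(p_values)
  # One row per iteration, one column per removal step
  h_mat <- t(replicate(h_iteration, {
    shuffled <- sample(p_values)  # random removal order instead of gradient order
    vapply(seq_len(n), function(k) mean(shuffled[k:n] < alpha), numeric(1))
  }))
  # Percentiles 0.025, 0.5 and 0.975 at each removal step
  q <- apply(h_mat, 2, quantile, probs = c(0.025, 0.5, 0.975))
  data.frame(step = seq_len(n), lower = q[1, ], median = q[2, ], upper = q[3, ])
}

set.seed(1)
p_values <- c(0.60, 0.45, 0.30, 0.04, 0.01, 0.001)  # invented per-sample p-values
envelope <- h_null_envelope(p_values, h_iteration = 200)
```

An observed curve obtained by removing samples in gradient order can then be compared with `lower` and `upper` at each step; values outside that band suggest that the pattern is not explained by random removal alone.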
Genome presence-absence data of 136 saline lagoons.
Description
A dataset containing presence-absence data of 544 genomes on 136 saline lagoons of the Monegros desert area.
Usage
table_presence_absence
Format
A data frame with 544 rows and 137 variables:
- genome
- Genome IMG code 
- MON_10
- Sample presence-absence observations 
- MON_100
- Sample presence-absence observations 
- MON_101
- Sample presence-absence observations 
- MON_103
- Sample presence-absence observations 
- MON_104
- Sample presence-absence observations 
- MON_106
- Sample presence-absence observations 
- MON_107
- Sample presence-absence observations 
- MON_108
- Sample presence-absence observations 
- MON_109
- Sample presence-absence observations 
- MON_11
- Sample presence-absence observations 
- MON_110
- Sample presence-absence observations 
- MON_111
- Sample presence-absence observations 
- MON_112
- Sample presence-absence observations 
- MON_113
- Sample presence-absence observations 
- MON_114
- Sample presence-absence observations 
- MON_116
- Sample presence-absence observations 
- MON_117
- Sample presence-absence observations 
- MON_118
- Sample presence-absence observations 
- MON_119
- Sample presence-absence observations 
- MON_12
- Sample presence-absence observations 
- MON_120
- Sample presence-absence observations 
- MON_122
- Sample presence-absence observations 
- MON_123
- Sample presence-absence observations 
- MON_124
- Sample presence-absence observations 
- MON_125
- Sample presence-absence observations 
- MON_126
- Sample presence-absence observations 
- MON_127
- Sample presence-absence observations 
- MON_128
- Sample presence-absence observations 
- MON_129
- Sample presence-absence observations 
- MON_13
- Sample presence-absence observations 
- MON_130
- Sample presence-absence observations 
- MON_131
- Sample presence-absence observations 
- MON_133
- Sample presence-absence observations 
- MON_134
- Sample presence-absence observations 
- MON_135
- Sample presence-absence observations 
- MON_136
- Sample presence-absence observations 
- MON_137
- Sample presence-absence observations 
- MON_138
- Sample presence-absence observations 
- MON_139
- Sample presence-absence observations 
- MON_14
- Sample presence-absence observations 
- MON_140
- Sample presence-absence observations 
- MON_141
- Sample presence-absence observations 
- MON_142
- Sample presence-absence observations 
- MON_144
- Sample presence-absence observations 
- MON_145
- Sample presence-absence observations 
- MON_146
- Sample presence-absence observations 
- MON_147
- Sample presence-absence observations 
- MON_148
- Sample presence-absence observations 
- MON_15
- Sample presence-absence observations 
- MON_17
- Sample presence-absence observations 
- MON_18
- Sample presence-absence observations 
- MON_19
- Sample presence-absence observations 
- MON_2
- Sample presence-absence observations 
- MON_20
- Sample presence-absence observations 
- MON_21
- Sample presence-absence observations 
- MON_22
- Sample presence-absence observations 
- MON_23
- Sample presence-absence observations 
- MON_24
- Sample presence-absence observations 
- MON_25
- Sample presence-absence observations 
- MON_26
- Sample presence-absence observations 
- MON_27
- Sample presence-absence observations 
- MON_28
- Sample presence-absence observations 
- MON_29
- Sample presence-absence observations 
- MON_30
- Sample presence-absence observations 
- MON_31
- Sample presence-absence observations 
- MON_32
- Sample presence-absence observations 
- MON_33
- Sample presence-absence observations 
- MON_34
- Sample presence-absence observations 
- MON_35
- Sample presence-absence observations 
- MON_36
- Sample presence-absence observations 
- MON_37
- Sample presence-absence observations 
- MON_38
- Sample presence-absence observations 
- MON_39
- Sample presence-absence observations 
- MON_4
- Sample presence-absence observations 
- MON_40
- Sample presence-absence observations 
- MON_41
- Sample presence-absence observations 
- MON_42
- Sample presence-absence observations 
- MON_43
- Sample presence-absence observations 
- MON_44
- Sample presence-absence observations 
- MON_45
- Sample presence-absence observations 
- MON_46
- Sample presence-absence observations 
- MON_47
- Sample presence-absence observations 
- MON_48
- Sample presence-absence observations 
- MON_49
- Sample presence-absence observations 
- MON_5
- Sample presence-absence observations 
- MON_50
- Sample presence-absence observations 
- MON_51
- Sample presence-absence observations 
- MON_52
- Sample presence-absence observations 
- MON_53
- Sample presence-absence observations 
- MON_54
- Sample presence-absence observations 
- MON_55
- Sample presence-absence observations 
- MON_56
- Sample presence-absence observations 
- MON_57
- Sample presence-absence observations 
- MON_58
- Sample presence-absence observations 
- MON_59
- Sample presence-absence observations 
- MON_60
- Sample presence-absence observations 
- MON_61
- Sample presence-absence observations 
- MON_62
- Sample presence-absence observations 
- MON_63
- Sample presence-absence observations 
- MON_64
- Sample presence-absence observations 
- MON_65
- Sample presence-absence observations 
- MON_66
- Sample presence-absence observations 
- MON_67
- Sample presence-absence observations 
- MON_68
- Sample presence-absence observations 
- MON_69
- Sample presence-absence observations 
- MON_7
- Sample presence-absence observations 
- MON_70
- Sample presence-absence observations 
- MON_71
- Sample presence-absence observations 
- MON_72
- Sample presence-absence observations 
- MON_73
- Sample presence-absence observations 
- MON_74
- Sample presence-absence observations 
- MON_75
- Sample presence-absence observations 
- MON_76
- Sample presence-absence observations 
- MON_77
- Sample presence-absence observations 
- MON_78
- Sample presence-absence observations 
- MON_79
- Sample presence-absence observations 
- MON_8
- Sample presence-absence observations 
- MON_80
- Sample presence-absence observations 
- MON_81
- Sample presence-absence observations 
- MON_82
- Sample presence-absence observations 
- MON_83
- Sample presence-absence observations 
- MON_84
- Sample presence-absence observations 
- MON_85
- Sample presence-absence observations 
- MON_86
- Sample presence-absence observations 
- MON_88
- Sample presence-absence observations 
- MON_9
- Sample presence-absence observations 
- MON_90
- Sample presence-absence observations 
- MON_91
- Sample presence-absence observations 
- MON_92
- Sample presence-absence observations 
- MON_93
- Sample presence-absence observations 
- MON_94
- Sample presence-absence observations 
- MON_95
- Sample presence-absence observations 
- MON_96
- Sample presence-absence observations 
- MON_97
- Sample presence-absence observations 
- MON_98
- Sample presence-absence observations 
- MON_99
- Sample presence-absence observations 
...
Source
Triadó-Margarit, X., Capitán, J.A., Menéndez-Serra, M. et al. A Randomized Trait Community Clustering approach to unveil consistent environmental thresholds in community assembly. ISME J 13, 2681–2689 (2019).