This vignette guides users through estimating the fraction unbound in plasma (fup) from mass spectrometry data collected with an ultracentrifugation (UC) assay. Fraction unbound in plasma is a chemical-specific parameter that describes the fraction of a chemical in plasma that is free; the free chemical is usually the portion responsible for pharmacological effects (Redgrave, Roberts, and West (1975)).
The mass spectrometry data should be collected from an assay that uses ultracentrifugation as seen in Figure 1 (Smeltz, Wambaugh, and Wetmore (2023), Kreutz et al. (2023)).
Fig 1: fup UC experimental set up
First, we load in the example dataset from
invitroTKstats
.
Many datasets are loaded in: fup_uc_L0
,
fup_uc_L1
, fup_uc_L2
, fup_uc_L3
,
and fup_uc_L4
. These datasets are fup data at
Level 0, 1, 2, 3, and 4 respectively. Additional datasets associated
with Level 4 processing are also loaded in:
fup_uc_L2_heldout
and fup_uc_PREJAGS
. These
will be described later in the “Level 4 processing” section. Lastly, a
fup_uc_cheminfo
dataset is loaded in that contains chemical
information necessary for identification mapping; it is used to create
Level 0 data. For the purpose of this vignette, we’ll start with
fup_uc_L0
, the Level 0 data, to demonstrate the complete
pipelining process.
fup_uc_L0
is the output from the
merge_level0
function which compiles raw lab data from
specified Excel files into a singular data frame. The data frame
contains exactly one row per sample with information obtained from the
mass spectrometer. For more details on curating raw lab data to a
singular Level 0 data frame, see the “Data Guide Creation and Level-0
Data Compilation” vignette.
The following table displays the first three rows of
fup_uc_L0
, our Level 0 data.
Compound | DTXSID | Lab.Compound.ID | Date | Sample | Type | Compound.Conc | Peak.Area | ISTD.Peak.Area | ISTD.Name | Analysis.Params | Level0.File | Level0.Sheet | Sample Text | Sample.Type | Dilution.Factor | Replicate |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | 010320 | 20200103_PFAS_PPB_UC_Sample004 | Blank | 0.00000 | M2-8:2FTS | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | Crash Blank | CC | 1 | ||||
8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | 010320 | 20200103_PFAS_PPB_UC_Sample005 | Blank | 0.00000 | 792.493 | M2-8:2FTS | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | Crash Mixed Matrix Blank | CC | 1 | |||
8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | 010320 | 20200103_PFAS_PPB_UC_Sample007 | Standard | 0.00008 | 4.745 | 742.940 | M2-8:2FTS | 2.12 | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | CC1 - 0.053 pg/uL | CC | 1 |
format_fup_uc
is the Level 1 function used to create a
standardized data frame. This level of processing is necessary because
naming conventions or formatting can differ across data sets.
If the Level 0 data already contains the required column, then the
existing column name can be specified. For example,
fup_uc_L0
already contains a column specifying the sample
name called “Sample”. However, the default column name for sample name
is “Lab.Sample.Name”. Therefore, we specify the correct column with
sample.col = "Sample"
. In general, to specify an already
existing column that differs from the default, the user must use the
parameter with the .col
suffix.
If the Level 0 data does not already contain the required column,
then the entire column can be populated with a single value. For
example, fup_uc_L0
does not contain a column specifying
biological replicates. Therefore, we populate the required column with
biological.replicates = 1
. In general, to specify a single
value for an entire column, the user must use the parameter without the
.col
suffix.
Users should be mindful if they choose to specify a single value for all of their samples; they should verify this action is one they wish to take.
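The two ways of supplying a required column can be sketched in base R. This is only an illustration of the naming convention, not the internals of format_fup_uc: the helper `standardize_column` and the toy data are hypothetical.

```r
# Toy Level 0 data whose sample-name column is called "Sample"
# (hypothetical values, for illustration only)
toy_L0 <- data.frame(
  Sample = c("S1", "S2", "S3"),
  Peak.Area = c(10.2, 11.5, 9.8)
)

# Hypothetical helper: either rename an existing column (the `.col` route)
# or fill a new column with one value (the single-entry route)
standardize_column <- function(data, std.name, col = NULL, value = NULL) {
  if (!is.null(col) && col %in% names(data)) {
    names(data)[names(data) == col] <- std.name
  } else if (!is.null(value)) {
    data[[std.name]] <- value
  } else {
    stop("Provide either an existing column name or a single value.")
  }
  data
}

# Analogous to sample.col = "Sample"
toy_L1 <- standardize_column(toy_L0, "Lab.Sample.Name", col = "Sample")
# Analogous to biological.replicates = 1
toy_L1 <- standardize_column(toy_L1, "Biological.Replicates", value = 1)

print(toy_L1)
```

Filling a column with a single value applies that value to every sample, which is why the choice should be verified before use.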
Some columns must be present in the Level 0 data while others can be
filled with a single value. At minimum, the following columns must be
present in the Level 0 data and specification with a single entry is not
permitted: sample.col
, lab.compound.col
,
dtxsid.col
, compound.col
,
area.col
, type.col
, and
istd.col
.
If there is no additional note.col
in the Level 0 data,
users should use note.col = NULL
to fill the column with
“Note”.
The rest of the following columns may either be specified from the
Level 0 data or filled with a single value: date.col
or
date
, test.conc.col
or test.conc
,
cal.col
or cal
, dilution.col
or
dilution
, istd.name.col
or
istd.name
, istd.conc.col
or
istd.conc
, uc.assay.conc.col
or
uc.assay.conc
, biological.replicates.col
or
biological.replicates
,
technical.replicates.col
or
technical.replicates
, analysis.method.col
or
analysis.method
, analysis.instrument.col
or
analysis.instrument
, analysis.parameters.col
or analysis.parameters
, level0.file.col
or
level0.file
, and level0.sheet.col
or
level0.sheet
.
Argument | Default | Required in L0? | Corresp. single-entry Argument | Descr. |
---|---|---|---|---|
FILENAME | MYDATA | N/A | | Output and input filename |
data.in | | N/A | | Level 0 data frame |
sample.col | Lab.Sample.Name | Y | | Lab sample name |
lab.compound.col | Lab.Compound.Name | Y | | Lab test compound name (abbr.) |
dtxsid.col | DTXSID | Y | | EPA's DSSTox Structure ID |
date.col | Date | N | date | Lab measurement date |
compound.col | Compound.Name | Y | | Formal test compound name |
area.col | Area | Y | | Target analyte peak area |
type.col | Sample.Type | Y | | Sample type (CC/AF/T1/T5) |
test.conc.col | Test.Compound.Conc | N | test.conc | Standard test chemical concentration |
cal.col | Cal | N | cal | MS calibration |
dilution.col | Dilution.Factor | N | dilution | Number of times sample was diluted |
istd.col | ISTD.Area | Y | | Internal standard peak area |
istd.name.col | ISTD.Name | N | istd.name | Internal standard name |
istd.conc.col | ISTD.Conc | N | istd.conc | Internal standard concentration |
uc.assay.conc.col | UC.Assay.Conc | N | uc.assay.conc | Intended initial test concentration |
biological.replicates.col | Biological.Replicates | N | biological.replicates | Replicates with the same analyte |
technical.replicates.col | Technical.Replicates | N | technical.replicates | Repeated measurements from one sample |
analysis.method.col | Analysis.Method | N | analysis.method | Analytical chemistry analysis method |
analysis.instrument.col | Analysis.Instrument | N | analysis.instrument | Analytical chemistry analysis instrument |
analysis.parameters.col | Analysis.Parameters | N | analysis.parameters | Analytical chemistry analysis parameters |
note.col | Note | N | | Additional notes |
level0.file.col | Level0.File | N | level0.file | Raw data filename |
level0.sheet.col | Level0.Sheet | N | level0.sheet | Raw data sheet name |
output.res | FALSE | N/A | | Export results (TSV)? |
save.bad.types | FALSE | N/A | | Export bad data (TSV)? |
sig.figs | 5 | N/A | | Number of significant figures |
INPUT.DIR | | N/A | | Input directory of Level 0 file |
OUTPUT.DIR | | N/A | | Export directory to save Level 1 files |
A TSV file containing the level-1 data can be exported to the user’s
per-session temporary directory, whose path can be found with
tempdir()
. For more details, see [https://www.collinberke.com/til/posts/2023-10-24-temp-directories/].
To avoid exporting to this temporary directory, an
OUTPUT.DIR
must be specified. We have omitted this export
entirely with output.res = FALSE
(the default). The option
to omit exporting a TSV file is also available at levels 2 and 3 and
will be used from this point forward.
fup_uc_L1_curated <- format_fup_uc(FILENAME = "Fup_UC_vignette",
data.in = fup_uc_L0,
# columns present in L0 data
sample.col = "Sample",
lab.compound.col = "Lab.Compound.ID",
compound.col = "Compound",
area.col = "Peak.Area",
test.conc.col = "Compound.Conc",
cal.col = "Date",
istd.col = "ISTD.Peak.Area",
technical.replicates.col = "Replicate",
analysis.parameters.col = "Analysis.Params",
# columns not present in L0 data
istd.conc = 1,
test.nominal.conc = 10,
biological.replicates = 1,
analysis.method = "UPLC-MS/MS",
analysis.instrument = "Waters Xevo TQ-S micro(QEB0036)",
note.col = NULL,
# don't export output TSV file
output.res = FALSE
)
#> 240 observations of 3 chemicals based on 3 separate measurements (calibrations).
All of our samples are successfully formatted and returned in
fup_uc_L1_curated
, our Level 1 data produced from
format_fup_uc
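. Each sample has one of the following sample
types: CC (calibration curve), AF (aqueous fraction), T1 (whole plasma at 1 hour), or T5 (whole plasma at 5 hours).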
If any samples had a different sample type, they would have been
removed and reported to the user. If the user wants to export the
removed samples as a TSV, the user should set the parameter
save.bad.types = TRUE
.
The following table displays the first three rows of
fup_uc_L1_curated
. In addition to the columns specified by
the user, there is an additional column called Response
.
This column is the test compound concentration and is calculated as
\(\textrm{Response} = \frac{\textrm{Analyte
Area}}{\textrm{ISTD Area}} * \textrm{ISTD Conc}\) where \(\textrm{Analyte Area}\) is defined by the
Area
column, \(\textrm{ISTD
Area}\) is defined by the ISTD.Area
column, and
\(\textrm{ISTD Conc}\) is defined by
the ISTD.Conc
column.
Lab.Sample.Name | Date | Compound.Name | DTXSID | Lab.Compound.Name | Sample.Type | Dilution.Factor | Calibration | ISTD.Name | ISTD.Conc | ISTD.Area | Area | Analysis.Method | Analysis.Instrument | Analysis.Parameters | Note | Level0.File | Level0.Sheet | Test.Compound.Conc | Test.Nominal.Conc | Biological.Replicates | Technical.Replicates | Response |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
20200103_PFAS_PPB_UC_Sample004 | 010320 | 8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | CC | 1 | 010320 | M2-8:2FTS | 1 | 678.1172 | 0.000 | UPLC-MS/MS | Waters Xevo TQ-S micro(QEB0036) | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | 0.00000 | 10 | 1 | 0.000000000 | |||
20200103_PFAS_PPB_UC_Sample005 | 010320 | 8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | CC | 1 | 010320 | M2-8:2FTS | 1 | 792.4930 | UPLC-MS/MS | Waters Xevo TQ-S micro(QEB0036) | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | 0.00000 | 10 | 1 | |||||
20200103_PFAS_PPB_UC_Sample007 | 010320 | 8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | CC | 1 | 010320 | M2-8:2FTS | 1 | 742.9400 | 4.745 | UPLC-MS/MS | Waters Xevo TQ-S micro(QEB0036) | 2.12 | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | 0.00008 | 10 | 1 | 0.006386788 |
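As a quick check of the Response formula, the third row of the table can be recomputed by hand. The sketch below uses only the Area, ISTD.Area, and ISTD.Conc values shown in that row; the variable names are illustrative.

```r
# Values taken from the third row of the Level 1 table above
area      <- 4.745    # Area: target analyte peak area
istd_area <- 742.94   # ISTD.Area: internal standard peak area
istd_conc <- 1        # ISTD.Conc: internal standard concentration

# Response = (Analyte Area / ISTD Area) * ISTD Conc
response <- area / istd_area * istd_conc
print(response)
```

The result agrees with the Response value reported for that sample (approximately 0.006387).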
sample_verification
is the Level 2 function used to add
a verification column. The verification column indicates whether a
sample should be included in the point estimation (Level 3) and credible
interval (Level 4) processing. This column allows users to keep all
samples in their data but only utilize the reliable samples for
fup estimation. All of the data in Level 2 is identical to
the data in Level 1 with the exception of the additional
Verified
column.
To determine whether a sample should be included, the user should consult the wet-lab scientists from where their data originates or a chemist who may be able to provide reliable rationale for samples that should not be verified. This level of processing allows the user to receive feedback from the wet-lab scientists, exclude erroneous or unreliable samples, and produce new fup estimates. Thus, there is an open channel of communication between the user and the wet-lab scientists or chemists.
We will use the already processed Level 2 data frame,
fup_uc_L2
, to regenerate our exclusion data. Note, all of
our samples are verified but we are explaining how to create an
exclusion list for learning purposes. In general, the user would not
have access to the exclusion information a priori.
The exclusion data frame must include the following columns:
Variables
, Values
, and Message
.
The Variables
column contains the variable names used to
filter the excluded rows. Here, we are using
Lab.Sample.Name
and DTXSID
to identify the
excluded rows separated by a “|”. The Values
column
contains the values of the variables, as a character, also separated by
a “|”. The Message
column contains the reason for
exclusion. Here we are using the reasons listed in the
Verified
column in fup_uc_L2
. The user should
refrain from using “|” in any of their descriptions to avoid conflicts
with the sample_verification
function.
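# Use verification data from the loaded `fup_uc_L2` data frame
library(dplyr) # provides %>%, filter(), mutate(), and select()
exclusion <- fup_uc_L2 %>%
  # keep only samples that were not verified
  filter(Verified != "Y") %>%
  # variables (and their values) that identify each excluded row
  mutate("Variables" = "Lab.Sample.Name|DTXSID") %>%
  mutate("Values" = paste(Lab.Sample.Name, DTXSID, sep = "|")) %>%
  # reason for exclusion
  mutate("Message" = Verified) %>%
  select(Variables, Values, Message)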
Variables | Values | Message |
---|---|---|
As expected, our exclusion data frame is empty because all of our
samples are verified. If all of the user’s samples are verified, they
simply do not provide an exclusion.info
data frame in
sample_verification
.
fup_uc_L2_curated <- sample_verification(FILENAME = "fup_UC_vignette",
data.in = fup_uc_L1_curated,
assay = "fup-UC",
# don't export output TSV file
output.res = FALSE)
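For contrast with the empty case, a non-empty exclusion data frame might look like the sketch below. The excluded sample and the reason are hypothetical (no sample in this dataset is actually excluded); only the column layout and the “|” separator follow the description above.

```r
# Hypothetical exclusion list: flags one sample/chemical pair for illustration
exclusion_example <- data.frame(
  Variables = "Lab.Sample.Name|DTXSID",
  Values = paste("20200103_PFAS_PPB_UC_Sample007", "DTXSID00192353", sep = "|"),
  Message = "Hypothetical reason: suspected pipetting error",
  stringsAsFactors = FALSE
)

# Each row should carry as many values as variables,
# and the message itself must not contain the "|" separator
n_vars <- lengths(strsplit(exclusion_example$Variables, "|", fixed = TRUE))
n_vals <- lengths(strsplit(exclusion_example$Values, "|", fixed = TRUE))
stopifnot(all(n_vars == n_vals))
stopifnot(!any(grepl("|", exclusion_example$Message, fixed = TRUE)))

# The data frame would then be supplied through exclusion.info (not run):
# sample_verification(FILENAME = "Fup_UC_vignette",
#                     data.in = fup_uc_L1_curated,
#                     assay = "fup-UC",
#                     exclusion.info = exclusion_example,
#                     output.res = FALSE)
```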
Our Level 2 data now contains a Verified
column. If the
sample should be included, the column contains a “Y” for yes. If the
sample should be excluded, the column contains the reason for
exclusion.
The following table displays some rows of the Level 2 data.
Lab.Sample.Name | Date | Compound.Name | DTXSID | Lab.Compound.Name | Sample.Type | Dilution.Factor | Calibration | ISTD.Name | ISTD.Conc | ISTD.Area | Area | Analysis.Method | Analysis.Instrument | Analysis.Parameters | Note | Level0.File | Level0.Sheet | Test.Compound.Conc | Test.Nominal.Conc | Biological.Replicates | Technical.Replicates | Response | Verified |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
20200103_PFAS_PPB_UC_Sample004 | 010320 | 8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | CC | 1 | 010320 | M2-8:2FTS | 1 | 678.1172 | 0.000 | UPLC-MS/MS | Waters Xevo TQ-S micro(QEB0036) | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | 0.00000 | 10 | 1 | 0.000000000 | Y | |||
20200103_PFAS_PPB_UC_Sample005 | 010320 | 8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | CC | 1 | 010320 | M2-8:2FTS | 1 | 792.4930 | UPLC-MS/MS | Waters Xevo TQ-S micro(QEB0036) | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | 0.00000 | 10 | 1 | Y | |||||
20200103_PFAS_PPB_UC_Sample007 | 010320 | 8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | CC | 1 | 010320 | M2-8:2FTS | 1 | 742.9400 | 4.745 | UPLC-MS/MS | Waters Xevo TQ-S micro(QEB0036) | 2.12 | 20220201_PFAS-LC_FractionUnbound_MGS.xlsx | 20200103 | 0.00008 | 10 | 1 | 0.006386788 | Y |
calc_fup_uc_point
is the Level 3 function used to calculate
the fup point estimate from ultracentrifugation for each test
compound using a Frequentist framework.
Mathematically, fup is the ratio of the compound concentration in the aqueous fraction after centrifugation to the compound concentration in the sample after incubation for 5 hours. It can be expressed as \[f_{up} = \frac{C_{\textrm{aqueous fraction}}}{C_{\textrm{incubation}}}\] where \(C_{\textrm{aqueous fraction}}\) is the compound concentration in the aqueous fraction and \(C_{\textrm{incubation}}\) is the compound concentration in the incubated sample.
The concentrations are defined as the mean AF or T5 response multiplied by its dilution factor.
fup_uc_L3_curated <- calc_fup_uc_point(FILENAME = "Fup_UC_vignette",
data.in = fup_uc_L2_curated,
# don't export output TSV file
output.res = FALSE)
#> [1] "8:2 Fluorotelomer sulfonic acid f_up = 0.04"
#> [1] "Perfluorooctanoyl fluoride f_up = 0.00305"
#> [1] "Potassium perfluorobutanesulfonate f_up = 0.00501"
#> [1] "Fraction unbound values calculated for 3 chemicals."
#> [1] "Fraction unbound values calculated for 3 measurements."
Compound.Name | DTXSID | Lab.Compound.Name | Calibration | Fup |
---|---|---|---|---|
8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | All Data | 0.039992451 |
Perfluorooctanoyl fluoride | DTXSID0059829 | PFOA-F | All Data | 0.003046138 |
Potassium perfluorobutanesulfonate | DTXSID3037707 | K-PFBS | All Data | 0.005009286 |
Our Level 3 data contains a Fup
estimate and a
Calibration
column that details which data was used to
calculate the point estimate.
calc_fup_uc
is the Level 4 function used to calculate
fup point estimates and credible intervals from
ultracentrifugation using a Bayesian framework. Markov chain Monte Carlo
(MCMC) simulations are used to randomly sample from the posterior
distribution with a uniform prior.
To run Level 4, JAGS must be installed on the user’s machine. To
determine the correct path, the user may call
runjags::findjags()
or may need to specify the JAGS
installation location directly through the JAGS.PATH
argument.
JAGS is a software used for conducting Bayesian hierarchical modeling
using Markov Chain Monte Carlo (MCMC) simulation.
invitroTKstats
contains the JAGS models for each of the
applicable assays and utilizes the JAGS model under the hood to run the
MCMC simulations via the runjags
dependency to obtain the
level-4 Bayesian estimates.
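Before running Level 4, it can help to confirm that the R-side dependencies are in place. The following is a minimal sketch using base R plus the runjags::findjags() call mentioned above; it only reports what is missing and does not install anything.

```r
# R packages that Level 4 processing relies on
needed_pkgs <- c("rjags", "runjags", "coda")

# Which of them are not installed in the current library paths?
missing_pkgs <- setdiff(needed_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # These could be installed from the R console with, e.g.:
  # install.packages(missing_pkgs)
} else {
  message("All required R packages are installed.")
}

# Try to locate the JAGS executable only when runjags is available
if (!("runjags" %in% missing_pkgs)) {
  jags_path <- runjags::findjags()
  print(jags_path)
}
```

If no usable JAGS path is reported, the installation location has to be supplied manually through JAGS.PATH.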
The JAGS installation location is passed to the JAGS.PATH
argument within the level-4 function.
The rjags
,
runjags
, and coda
R packages need to be
installed via the R console using the install.packages
function.
We pass fup_uc_L2_curated
, the Level 2 data frame, and
not fup_uc_L3_curated
, the Level 3 data
frame, into calc_fup_uc
. This is because Level 3 and Level
4 processing are not sequential; they are methods that calculate
different statistical quantities. The following code chunk takes a while
to run; previous runtimes are around 10 minutes.
fup_uc_L4_curated <- calc_fup_uc(FILENAME = "Fup_UC_vignette",
data.in = fup_uc_L2_curated,
JAGS.PATH = runjags::findjags()
)
The fup intervals are returned to the user’s R session, in
an exported TSV file, and in an exported .RData file. There is no
parameter to prevent the TSV or RData files from being exported because
of the potential for the simulations to crash. If there are no crashes,
then the exported TSV file is identical to the user’s R session and the
exported .RData file. fup_uc_L4
is an example exported
.RData file.
Additionally, intermediate files are saved to the user’s current
working directory if TEMP.DIR = NULL
. These include a Level
2 heldout set, fup_uc_L2_heldout
, containing unverified
samples and a Level 4 PREJAGS list, fup_uc_PREJAGS
,
containing arguments provided to JAGS. Because the PREJAGS list is
overwritten with each compound, fup_uc_PREJAGS
only
contains information relevant to the last tested compound, K-PFBS in
this case.
Our Level 4 data contains a credible interval for fup and for fstable which measures the compound’s stability. Mathematically, \(f_{stable} = \frac{\textrm{T5}}{\textrm{T1}}\) where a value of \(1\) indicates no compound breakdown.
Compound | DTXSID | Lab.Compound.Name | Fstable.Med | Fstable.Low | Fstable.High | Fup.Med | Fup.Low | Fup.High | Fup.point |
---|---|---|---|---|---|---|---|---|---|
8:2 Fluorotelomer sulfonic acid | DTXSID00192353 | 8:2 FTS | 0.934688 | 0.8940251 | 0.9998001 | 0.039551450 | 0.037112872 | 0.041812808 | 0.039992451 |
Perfluorooctanoyl fluoride | DTXSID0059829 | PFOA-F | 0.999929 | 0.9907321 | 0.9999990 | 0.003166725 | 0.002614648 | 0.003749361 | 0.003046138 |
Potassium perfluorobutanesulfonate | DTXSID3037707 | K-PFBS | 0.999937 | 0.9936837 | 0.9999990 | 0.005359600 | 0.004983673 | 0.005782690 | 0.005009286 |
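One simple sanity check on the Level 4 output is whether each frequentist point estimate falls inside the corresponding Bayesian credible interval. The sketch below re-enters the Fup columns from the table above by hand; the added column names `Point.In.Interval` and `Interval.Width` are illustrative and not produced by the package.

```r
# Fup columns copied from the Level 4 table above
L4_summary <- data.frame(
  Lab.Compound.Name = c("8:2 FTS", "PFOA-F", "K-PFBS"),
  Fup.Med   = c(0.039551450, 0.003166725, 0.005359600),
  Fup.Low   = c(0.037112872, 0.002614648, 0.004983673),
  Fup.High  = c(0.041812808, 0.003749361, 0.005782690),
  Fup.point = c(0.039992451, 0.003046138, 0.005009286)
)

# Does the point estimate lie within the 95% credible interval?
L4_summary$Point.In.Interval <- with(
  L4_summary, Fup.point >= Fup.Low & Fup.point <= Fup.High
)

# Width of the credible interval, a rough measure of uncertainty
L4_summary$Interval.Width <- with(L4_summary, Fup.High - Fup.Low)

print(L4_summary[, c("Lab.Compound.Name", "Point.In.Interval", "Interval.Width")])
```

For all three chemicals in this example, the point estimate lies within the credible interval.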
Generally, data processing pipelines should include minimal to no manual coding. It is best to keep clean code that is easily reproducible and transferable. The user should aim to have all the required data and meta-data files properly formatted to avoid further modifications throughout the pipeline.
In this section, we provide the equations for the Bayesian model (i.e., priors and likelihoods) used to estimate the fraction unbound in plasma (\(f_{up}\)), from the ultracentrifugation assay, and the uncertainty about that estimate. The following sub-sections are organized such that:
Some of the indices are reused between sections (e.g. \(i\), \(w^*_i\), etc.). However, it should be noted that these are not meant to be understood across sub-sections; rather, only understood within the context of the section they are in.
NOTE: Readers unfamiliar with JAGS should be aware that JAGS software parameterizes the normal distribution with precision (\(\tau\)) rather than variance (\(\sigma^2\)), where precision is the reciprocal of variance (i.e., \(\tau = 1/\sigma^2\)).
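Because precision is the reciprocal of variance (\(\tau = 1/\sigma^2\)) and base R's normal functions take a standard deviation, a conversion is needed when reproducing a JAGS normal distribution in R. A minimal sketch, using the precision of 0.01 that appears in the log-scale calibration prior of this model:

```r
# Precision as it would be written in a JAGS model
tau <- 0.01

# Equivalent standard deviation for use with rnorm(), dnorm(), etc.
sigma <- 1 / sqrt(tau)
print(sigma)

# Draws from N(mu = 0, tau = 0.01), expressed with R's sd argument
set.seed(1)
draws <- rnorm(5, mean = 0, sd = sigma)
print(draws)
```

A precision of 0.01 therefore corresponds to a standard deviation of 10, which is a fairly diffuse prior on the log10 scale.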
Each chemical may have more than one day of experimentation, and thus
multiple calibrations. Suppose for the chemical of interest there are a
total of \(n_{cal}\) calibrations
(Num.cal
). For a particular calibration \(w \in (1,\ldots,n_{cal})\) we assume the
following priors for our Bayesian model:
Prior for log-scale constant analytic standard deviation
(log.const.analytic.sd
):
\[ log(\sigma_a)_w \sim \textrm{Unif} \left( a = -6,b = 1 \right); \space \sigma_{a,w} = 10^{log(\sigma_a)_w} \]
where \(a\) and \(b\) are the minimum and maximum,
respectively, and \(\sigma_{a,w}\) is
the converted parameter used in later equations
(const.analytic.sd
).
Prior for the log-scale heteroscedastic analytic slope
(log.hetero.analytic.slope
):
\[ log(m_h)_w \sim \textrm{Unif} \left( a = -6,b = 1 \right) ; \space m_{h,w} = 10^{log(m_h)_w} \]
where \(a\) and \(b\) are the minimum and maximum,
respectively, and \(m_{h,w}\) is the
converted parameter used in later equations
(hetero.analytic.slope
).
Prior for the threshold concentration
(C.thresh
):
\[ C_{thresh,w} \sim \textrm{Unif} \left( a = 0, b = \frac{conc_{target,w}}{10} \right)\]
where \(a\) and \(b\) are the minimum and maximum,
respectively, and \(conc_{target,w}\)
is the expected initial concentration (Test.Nominal.Conc
)
for calibration index \(w\).
Prior for the log-scale calibration
(log.calibration
):
\[ log(cal)_w \sim \textrm{N} \left( \mu = 0, \tau = 0.01 \right); \space cal_w = 10^{log(cal)_w} \]
where \(\mu\) and \(\tau\) are the mean and precision,
respectively, and \(cal_w\) is the
converted parameter used in later equations
(calibration
).
Prior for the background
(background
):
\[ \gamma_w \sim \textrm{Exp} \left( \lambda = 100 \right) \]
where \(\lambda\) is the rate parameter.
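To get a feel for the ranges these priors cover, they can be sampled directly in base R. This is only a sketch of the prior distributions as written above, for a single calibration, and is not the JAGS model itself; `conc_target` is set to 10 to match the nominal concentration used earlier in this vignette, and the R variable names are illustrative.

```r
set.seed(42)
n_draws <- 1000
conc_target <- 10  # expected initial concentration (Test.Nominal.Conc)

# log10-uniform priors, converted back to the natural scale
const_analytic_sd     <- 10^runif(n_draws, min = -6, max = 1)
hetero_analytic_slope <- 10^runif(n_draws, min = -6, max = 1)

# Threshold concentration: uniform between 0 and one tenth of the target
C_thresh <- runif(n_draws, min = 0, max = conc_target / 10)

# log10-scale calibration: normal with precision 0.01, i.e. sd = 1 / sqrt(0.01)
calibration <- 10^rnorm(n_draws, mean = 0, sd = 1 / sqrt(0.01))

# Background: exponential with rate 100 (prior mean of 0.01)
background <- rexp(n_draws, rate = 100)

summary(const_analytic_sd)
summary(C_thresh)
summary(background)
```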
Suppose \(n\) is the total number of
response observations (Num.obs
), and \(y_i\) indicates the \(i^{th}\) observation, for all sample types.
For each observation we obtain a posterior MCMC estimate for the
observations with the following:
Calibration curve slope for the \(i^{th}\) observation (slope
)
is assumed to be the calibration estimate corresponding to the
calibration index (\(w^*_i\)):
\[ m_i = cal_{w^*_i}\]
where \(w^*_i\) is the calibration
index (obs.cal
) for the \(i^{th}\) observation, such that \(w^* = (w^*_1,...,w^*_n)\) and \(w^*_i \in w\) (i.e. \(w^*_i\) indicates one of the \(n_{cal}\) calibrations).
Calibration curve intercept for the \(i^{th}\) observation
(intercept
) is assumed to be the background estimate
corresponding to the calibration index (\(w^*_i\)):
\[ \alpha_i = \gamma_{w^*_i} \]
Estimation for the predicted values for response observations
(Response.pred
):
\[ x_i = \frac{m_i*((C_{c^*_i}- C_{thresh,w^*_i}) * \beta_i + \alpha_i)}{df_i} \]
where
\(c^*_i\) is the concentration index (obs.conc) for the \(i^{th}\) observation,
\(df_i\) is the dilution factor (Dilution.Factor) for the \(i^{th}\) observation, and
\[ \beta_i = \begin{cases} 1 & \textrm{if } (C_{c^*_i}- C_{thresh,w^*_i}) \ge 0\\ 0 & \textrm{otherwise} \end{cases} \]
Estimation of the precision for observations
(Response.prec
):
\[ \tau^*_i = \frac{1}{(\sigma_{a,w^*_i}+m_{h,w^*_i}*x_i)^2} \]
Likelihood for the observations
(Response.obs
):
\[ y_i \sim N(\mu = x_i,\tau = \tau^*_i) \]
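The predicted response and its precision can be evaluated by hand for a single observation. The sketch below transcribes the two equations above as written; the function names and the numeric inputs are hypothetical and chosen only to show the effect of the threshold indicator \(\beta_i\).

```r
# Predicted response for one observation, following the equation for x_i
predict_response <- function(conc, C_thresh, slope, intercept, dilution) {
  # beta is 1 when the concentration is at or above the threshold, else 0
  beta <- as.numeric(conc - C_thresh >= 0)
  slope * ((conc - C_thresh) * beta + intercept) / dilution
}

# Precision of one observation, following the equation for tau*_i
response_precision <- function(x, const_sd, hetero_slope) {
  1 / (const_sd + hetero_slope * x)^2
}

# Concentration above the threshold: the concentration term contributes
x_above <- predict_response(conc = 1, C_thresh = 0.5, slope = 2,
                            intercept = 0.1, dilution = 1)

# Concentration below the threshold: only the intercept term remains
x_below <- predict_response(conc = 0.2, C_thresh = 0.5, slope = 2,
                            intercept = 0.1, dilution = 1)

tau_above <- response_precision(x_above, const_sd = 0.1, hetero_slope = 0.1)

print(c(x_above = x_above, x_below = x_below, tau_above = tau_above))
```

Because the standard deviation grows with the predicted response, larger responses are modeled with lower precision.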
Prior for the log-scale fraction unbound in plasma
(log.Fup
):
\[ log(f_{up}) \sim \textrm{Unif} \left( a = -15, b = 0 \right) ; \space f_{up} = 10^{log(f_{up})} \]
where \(a\) and \(b\) are the minimum and maximum,
respectively, and \(f_{up}\) is the
converted parameter used in later equations (Fup
).
Prior for the log-scale chemical loss fraction
(log.Floss
):
\[ log(f_{loss}) \sim \textrm{Unif} \left( a = -6,b = 0 \right); \space f_{stable} = 1 - 10^{log(f_{loss})} \]
where \(a\) and \(b\) are the minimum and maximum,
respectively, and \(f_{stable}\) is the
converted parameter used in later equations (Fstable
)
estimating the fraction of stable chemical in the assay and available
for plasma binding.
Suppose \(n_s\) is the total number
of series run (i.e. biological replicates) (Num.series
) and
\(C_i\) indicates the \(i^{th}\) series. Each series has a total of
3 observations, including the 1 and 5 hour whole plasma and aqueous
fraction observations (i.e. sample types = T1
or
T5
or AF
, respectively), such that there are a
total of \(n^*\) observations (i.e.,
\(n^* = 3*n_s\)). For each series
obtain the following:
Prior for the whole plasma 1 hour sample (T1
)
concentration (Conc
):
\[ C_i \sim \textrm{N} \left( \mu = conc_{target,w^*_i}, \tau = 100 \right) \]
where
\(w^*_i\) is the calibration index (obs.cal) for the \(i^{th}\) series, such that \(w^* = (w^*_1, \ldots,w^*_n)\) and \(w^*_i \in w\) (i.e. \(w^*_i\) indicates one of the \(n_{cal}\) calibrations), and
\(conc_{target,w^*_i}\) is the expected initial concentration (Test.Nominal.Conc) corresponding to the \(i^{th}\) series, given the calibration index \(w^*_i\).
Posterior estimate for the whole plasma 5 hour sample
(T5
) concentrations (after potential breakdown):
\[ C_{T5,i} = f_{stable} * C_i \]
Posterior estimate for the aqueous fraction sample (AF
)
concentrations for the stable chemical at T5
:
\[ C_{AF,i} = f_{up}*C_{T5,i} \]
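The chain from the T1 concentration to the T5 and AF concentrations can be traced with a small worked example. The parameter values below are hypothetical and serve only to illustrate the relationships \(f_{stable} = 1 - 10^{log(f_{loss})}\), \(C_{T5} = f_{stable} * C\), and \(C_{AF} = f_{up} * C_{T5}\).

```r
# Hypothetical log10-scale parameter values
log_floss <- -2   # log10 of the fraction of chemical lost
log_fup   <- -2   # log10 of the fraction unbound

# Convert to the natural scale
f_stable <- 1 - 10^log_floss  # fraction of chemical remaining stable
f_up     <- 10^log_fup        # fraction unbound in plasma

# Whole plasma concentration at 1 hour (T1), hypothetical value
C_T1 <- 10

# Whole plasma concentration at 5 hours (T5), after potential breakdown
C_T5 <- f_stable * C_T1

# Aqueous fraction (AF) concentration of the stable chemical
C_AF <- f_up * C_T5

print(c(f_stable = f_stable, C_T5 = C_T5, C_AF = C_AF))
```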
Posterior estimates for the fraction of stable chemical in the assay,
including the median (Fstable.Med
) and \(95\%\) credible interval
(Fstable.Low
and Fstable.High
):
\[ \textrm{Fstable.Med} = f_{stable,0.5} \]
\[ \textrm{Fstable.CI} = (\textrm{Fstable.Low},\textrm{Fstable.High}) = (f_{stable,0.025},f_{stable,0.975}) \]
where \(f_{stable,p}\) indicates the percentile (\(p\)) for the posterior distribution of the chemical stability fraction, (\(p = 0.5\) indicates the \(50\%\) percentile, i.e. median, for the posterior distribution).
Posterior estimates for the chemical fraction unbound in plasma,
including the median (Fup.Med
) and \(95\%\) credible interval
(Fup.Low
and Fup.High
):
\[ \textrm{Fup.Med} = f_{up,0.5} \]
\[ \textrm{Fup.CI} = (\textrm{Fup.Low},\textrm{Fup.High}) = (f_{up,0.025},f_{up,0.975}) \]
where \(f_{up,p}\) indicates the percentile (\(p\)) for the posterior distribution of the fraction of the chemical unbound in plasma, (\(p = 0.5\) indicates the \(50\%\) percentile, i.e. median, for the posterior distribution).
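The median and 95% credible interval are percentiles of the posterior draws. The sketch below shows that summary step in base R; the draws are simulated stand-ins (a hypothetical normal distribution on the log10 scale), whereas real draws would come from the JAGS MCMC output.

```r
set.seed(123)

# Stand-in posterior draws of log10(fup); hypothetical, for illustration only
log_fup_draws <- rnorm(4000, mean = -1.4, sd = 0.02)

# Convert to the natural scale
fup_draws <- 10^log_fup_draws

# 2.5th, 50th, and 97.5th percentiles of the posterior distribution
fup_summary <- quantile(fup_draws, probs = c(0.025, 0.5, 0.975))
names(fup_summary) <- c("Fup.Low", "Fup.Med", "Fup.High")

print(fup_summary)
```

The same percentile summary applies to the draws of \(f_{stable}\).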
Point estimate for the chemical fraction unbound in plasma
(Fup.point
):
Suppose there are a total of \(n_{AF}\) observed aqueous fraction
(AF
) responses, then estimate the mean response:
\[ \hat{\mu_{AF}} = \frac{\sum_{j = 1}^{n_{AF}}(y_{AF,j} * df_{AF,j})}{n_{AF}} \]
where \(y_{AF,j}\) is the response
and \(df_{AF,j}\) is the dilution
factor for the \(j^{th}\) aqueous
fraction observation, sample type = AF
.
Suppose there are a total of \(n_{T5}\) observed whole plasma 5 hour
(T5
) responses, then estimate the mean response:
\[ \hat{\mu_{T5}} = \frac{\sum_{j = 1}^{n_{T5}}(y_{T5,j} * df_{T5,j})}{n_{T5}} \]
where \(y_{T5,j}\) is the response
and \(df_{T5,j}\) is the dilution
factor for the \(j^{th}\) whole plasma
5 hour observation, sample type = T5
.
Then the fraction unbound in plasma point estimate can be estimated as follows:
\[ \textrm{Fup.point} = \frac{\hat{\mu_{AF}}}{\hat{\mu_{T5}}} \]
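The point estimate can be reproduced on a small toy data set. The responses and dilution factors below are hypothetical, and the helper `mean_conc` is illustrative rather than part of the package; the sketch only mirrors the ratio of dilution-corrected mean responses defined above.

```r
# Hypothetical responses for one chemical (three AF and three T5 samples)
toy <- data.frame(
  Sample.Type     = c("AF", "AF", "AF", "T5", "T5", "T5"),
  Response        = c(0.010, 0.012, 0.011, 0.50, 0.55, 0.45),
  Dilution.Factor = c(2, 2, 2, 10, 10, 10)
)

# Mean of response multiplied by dilution factor for one sample type
mean_conc <- function(data, type) {
  rows <- data$Sample.Type == type
  mean(data$Response[rows] * data$Dilution.Factor[rows])
}

mu_AF <- mean_conc(toy, "AF")  # mean dilution-corrected AF response
mu_T5 <- mean_conc(toy, "T5")  # mean dilution-corrected T5 response

# Fraction unbound point estimate
fup_point <- mu_AF / mu_T5
print(fup_point)
```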
Data passed to JAGS as part of the PREJAGS
object:
Data | Notation | Description |
---|---|---|
Test.Nominal.Conc | \(conc_{target}\) | expected initial concentration |
Num.cal | \(n_{cal}\) | total number of calibrations |
Num.obs | \(n\) | total number of responses |
Response.obs | \(y\) | all sample responses |
obs.conc | \(c^*\) | concentration indices for all samples |
obs.cal | \(w^*\) | calibration index for all samples |
Dilution.Factor | \(df\) | dilution factors for all samples |
Conc | \(C\) | the standard test chemical concentration of CC samples (and NA placeholders for T1, T5, and AF samples) |
Num.cc.obs | \(n_{CC}\) | total number of CC samples |
Num.series | \(n_s\) | number of biological replicates (series) |
MCMC Parameters in JAGS:
JAGS Parameter Name | Parameter | Distribution | Prior/Posterior/Calculated |
---|---|---|---|
log.const.analytic.sd | \(log(\sigma_a)\) | Uniform | Prior |
log.hetero.analytic.slope | \(log(m_h)\) | Uniform | Prior |
C.thresh | \(C_{thresh}\) | Uniform | Prior |
log.calibration | \(log(cal)\) | Normal | Prior |
background | \(\gamma\) | Exponential | Prior |
const.analytic.sd | \(\sigma_a\) | | Calculated |
hetero.analytic.slope | \(m_h\) | | Calculated |
calibration | \(cal\) | | Calculated |
slope | \(m\) | | Calculated |
intercept | \(\alpha\) | | Calculated |
Response.pred | \(x\) | | Calculated |
Response.prec | \(\tau^*\) | | Calculated |
Response.obs | \(y\) | Normal | Posterior |
log.Fup | \(log(f_{up})\) | Uniform | Prior |
Fup | \(f_{up}\) | | Calculated |
log.Floss | \(log(f_{loss})\) | Uniform | Prior |
Fstable | \(f_{stable}\) | | Calculated |
Conc | \(C\) | Normal (T1) / - (T5, AF) | Prior (T1) / Calculated (T5, AF) |