Fraction Unbound in Plasma - Ultracentrifugation (fup UC)

US EPA’s Center for Computational Toxicology and Exposure ccte@epa.gov

Introduction

This vignette guides users on how to estimate fraction unbound in plasma (fup) from mass spectrometry data using ultracentrifugation (UC). Fraction unbound in plasma is a chemical specific parameter that describes the amount of free chemical in the plasma that is usually responsible for pharmacological effects (Redgrave, Roberts, and West (1975)).

The mass spectrometry data should be collected from an assay that uses ultracentrifugation as seen in Figure 1 (Smeltz, Wambaugh, and Wetmore (2023), Kreutz et al. (2023)).

Fig 1: fup UC experimental set up

Fig 1: f~up~ UC experimental set up

Suggested packages for use with this vignette

# Primary package 
library(invitroTKstats)
# Data manipulation package 
library(dplyr)
# Table formatting package 
library(flextable)

Load Data

First, we load in the example dataset from invitroTKstats.

# Load example fup UC data 
data("Fup-UC-example")

Many datasets are loaded in: fup_uc_L0, fup_uc_L1, fup_uc_L2, fup_uc_L3, and fup_uc_L4. These datasets are fup data at Level 0, 1, 2, 3, and 4 respectively. Additional datasets associated with Level 4 processing are also loaded in: fup_uc_L2_heldout and fup_uc_PREJAGS. These will be described later in the “Level 4 processing” section. Lastly, a fup_uc_cheminfo dataset is loaded in that contains chemical information necessary for identification mapping; it is used to create Level 0 data. For the purpose of this vignette, we’ll start with fup_uc_L0, the Level 0 data, to demonstrate the complete pipelining process.

fup_uc_L0 is the output from the merge_level0 function which compiles raw lab data from specified Excel files into a singular data frame. The data frame contains exactly one row per sample with information obtained from the mass spectrometer. For more details on curating raw lab data to a singular Level 0 data frame, see the “Data Guide Creation and Level-0 Data Compilation” vignette.

The following table displays the first three rows of fup_uc_L0, our Level 0 data.

Table 1: Level 0 data

Compound

DTXSID

Lab.Compound.ID

Date

Sample

Type

Compound.Conc

Peak.Area

ISTD.Peak.Area

ISTD.Name

Analysis.Params

Level0.File

Level0.Sheet

Sample Text

Sample.Type

Dilution.Factor

Replicate

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

010320

20200103_PFAS_PPB_UC_Sample004

Blank

0.00000

M2-8:2FTS

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

Crash Blank

CC

1

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

010320

20200103_PFAS_PPB_UC_Sample005

Blank

0.00000

792.493

M2-8:2FTS

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

Crash Mixed Matrix Blank

CC

1

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

010320

20200103_PFAS_PPB_UC_Sample007

Standard

0.00008

4.745

742.940

M2-8:2FTS

2.12

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

CC1 - 0.053 pg/uL

CC

1

Level 1 processing

format_fup_uc is the Level 1 function used to create a standardized data frame. This level of processing is necessary because naming conventions or formatting can differ across data sets.

If the Level 0 data already contains the required column, then the existing column name can be specified. For example, fup_uc_L0 already contains a column specifying the sample name called “Sample”. However, the default column name for sample name is “Lab.Sample.Name”. Therefore, we specify the correct column with sample.col = "Sample". In general, to specify an already existing column that differs from the default, the user must use the parameter with the .col suffix.

If the Level 0 data does not already contain the required column, then the entire column can be populated with a single value. For example, fup_uc_L0 does not contain a column specifying biological replicates. Therefore, we populate the required column with biological.replicates = 1. In general, to specify a single value for an entire column, the user must use the parameter without the .col suffix.

Users should be mindful if they choose to specify a single value for all of their samples; they should verify this action is one they wish to take.

Some columns must be present in the Level 0 data while others can be filled with a single value. At minimum, the following columns must be present in the Level 0 data and specification with a single entry is not permitted: sample.col, lab.compound.col, dtxsid.col, compound.col, area.col, type.col, and istd.col.

If there is no additional note.col in the Level 0 data, users should use note.col = NULL to fill the column with “Note”.

The rest of the following columns may either be specified from the Level 0 data or filled with a single value: date.col or date, test.conc.col or test.conc, cal.col or cal, dilution.col or dilution, istd.name.col or istd.name, istd.conc.col or istd.conc, uc.assay.conc.col or uc.assay.conc, biological.replicates.col or biological.replicates, technical.replicates.col or technical.replicates, analysis.method.col or analysis.method, analysis.instrument.col or analysis.instrument, analysis.parameters.col or analysis.parameters, level0.file.col or level0.file, and level0.sheet.col or level0.sheet.

Table 2: Level 1 `format_fup_uc` function arguments

Argument

Default

Required in L0?

Corresp. single-entry Argument

Descr.

FILENAME

MYDATA

N/A

Output and input filename

data.in

N/A

Level 0 data frame

sample.col

Lab.Sample.Name

Y

Lab sample name

lab.compound.col

Lab.Compound.Name

Y

Lab test compound name (abbr.)

dtxsid.col

DTXSID

Y

EPA's DSSTox Structure ID

date.col

Date

N

date

Lab measurement date

compound.col

Compound.Name

Y

Formal test compound name

area.col

Area

Y

Target analyte peak area

type.col

Sample.Type

Y

Sample type (CC/AF/T1/T5)

test.conc.col

Test.Compound.Conc

N

test.conc

Standard test chemical concentration

cal.col

Cal

N

cal

MS calibration

dilution.col

Dilution.Factor

N

dilution

Number of times sample was diluted

istd.col

ISTD.Area

Y

Internal standard peak area

istd.name.col

ISTD.Name

N

istd.name

Internal standard name

istd.conc.col

ISTD.Conc

N

istd.conc

Internal standard concentration

uc.assay.conc.col

UC.Assay.Conc

N

uc.assay.conc

Intended initial test concentration

biological.replicates.col

Biological.Replicates

N

biological.replicates

Replicates with the same analyte

technical.replicates.col

Technical.Replicates

N

technical.replicates

Repeated measurements from one sample

analysis.method.col

Analysis.Method

N

analysis.method

Analytical chemistry analysis method

analysis.instrument.col

Analysis.Instrument

N

analysis.instrument

Analytical chemistry analysis instrument

analysis.parameters.col

Analysis.Parameters

N

analysis.parameters

Analytical chemistry analysis parameters

note.col

Note

N

Additional notes

level0.file.col

Level0.File

N

level0.file

Raw data filename

level0.sheet.col

Level0.Sheet

N

level0.sheet

Raw data sheet name

output.res

FALSE

N/A

Export results (TSV)?

save.bad.types

FALSE

N/A

Export bad data (TSV)?

sig.figs

5

N/A

Number of significant figures

INPUT.DIR

N/A

Input directory of Level 0 file

OUTPUT.DIR

N/A

Export directory to save Level 1 files

A TSV file containing the level-1 data can be exported to the user’s per-session temporary directory. This temporary directory is a per-session directory whose path can be found with the following code: tempdir(). For more details, see [https://www.collinberke.com/til/posts/2023-10-24-temp-directories/].

To avoid exporting to this temporary directory, an OUTPUT.DIR must be specified. We have omitted this export entirely with output.res = FALSE (the default). The option to omit exporting a TSV file is also available at levels 2 and 3 and will be used from this point forward.

fup_uc_L1_curated <- format_fup_uc(FILENAME = "Fup_UC_vignette",
                                   data.in = fup_uc_L0,
                                   # columns present in L0 data
                                   sample.col = "Sample",
                                   lab.compound.col = "Lab.Compound.ID",
                                   compound.col = "Compound",
                                   area.col = "Peak.Area", 
                                   test.conc.col = "Compound.Conc",
                                   cal.col = "Date", 
                                   istd.col = "ISTD.Peak.Area",
                                   technical.replicates.col = "Replicate",
                                   analysis.parameters.col = "Analysis.Params",
                                   # columns not present in L0 data 
                                   istd.conc = 1, 
                                   test.nominal.conc = 10, 
                                   biological.replicates = 1, 
                                   analysis.method = "UPLC-MS/MS",
                                   analysis.instrument = "Waters Xevo TQ-S micro(QEB0036)",
                                   note.col = NULL,
                                   # don't export output TSV file
                                   output.res = FALSE
                                   )
#> 240 observations of 3 chemicals based on 3 separate measurements (calibrations).

All of our samples are successfully formatted and returned in fup_uc_L1_curated, our Level 1 data produced from format_fup_uc. Each sample has one of the following sample types

  1. Calibration Curve (CC)
  2. Ultracentrifugation Aqueous Fraction (AF)
  3. Whole Plasma T1h Sample (T1)
  4. Whole Plasma T5h Sample (T5)

If any samples had a different sample type, they would have been removed and reported to the user. If the user wants to export the removed samples as a TSV, the user should set the parameter save.bad.types = TRUE.

The following table displays the first three rows of fup_uc_L1_curated. In addition to the columns specified by the user, there is an additional column called Response. This column is the test compound concentration and is calculated as \(\textrm{Response} = \frac{\textrm{Analyte Area}}{\textrm{ISTD Area}} * \textrm{ISTD Conc}\) where \(\textrm{Analyte Area}\) is defined by the Area column, \(\textrm{ISTD Area}\) is defined by the ISTD.Area column, and \(\textrm{ISTD Conc}\) is defined by the ISTD.Conc column.

Table 3: Level 1 data

Lab.Sample.Name

Date

Compound.Name

DTXSID

Lab.Compound.Name

Sample.Type

Dilution.Factor

Calibration

ISTD.Name

ISTD.Conc

ISTD.Area

Area

Analysis.Method

Analysis.Instrument

Analysis.Parameters

Note

Level0.File

Level0.Sheet

Test.Compound.Conc

Test.Nominal.Conc

Biological.Replicates

Technical.Replicates

Response

20200103_PFAS_PPB_UC_Sample004

010320

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

CC

1

010320

M2-8:2FTS

1

678.1172

0.000

UPLC-MS/MS

Waters Xevo TQ-S micro(QEB0036)

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

0.00000

10

1

0.000000000

20200103_PFAS_PPB_UC_Sample005

010320

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

CC

1

010320

M2-8:2FTS

1

792.4930

UPLC-MS/MS

Waters Xevo TQ-S micro(QEB0036)

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

0.00000

10

1

20200103_PFAS_PPB_UC_Sample007

010320

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

CC

1

010320

M2-8:2FTS

1

742.9400

4.745

UPLC-MS/MS

Waters Xevo TQ-S micro(QEB0036)

2.12

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

0.00008

10

1

0.006386788

Level 2 processing

sample_verification is the Level 2 function used to add a verification column. The verification column indicates whether a sample should be included in the point estimation (Level 3) and credible interval (Level 4) processing. This column allows users to keep all samples in their data but only utilize the reliable samples for fup estimation. All of the data in Level 2 is identical to the data in Level 1 with the exception of the additional Verified column.

To determine whether a sample should be included, the user should consult the wet-lab scientists from where their data originates or a chemist who may be able to provide reliable rationale for samples that should not be verified. This level of processing allows the user to receive feedback from the wet-lab scientists, exclude erroneous or unreliable samples, and produce new fup estimates. Thus, there is an open channel of communication between the user and the wet-lab scientists or chemists.

We will use the already processed Level 2 data frame, fup_uc_L2, to regenerate our exclusion data. Note, all of our samples are verified but we are explaining how to create an exclusion list for learning purposes. In general, the user would not have access to the exclusion information a priori.

The exclusion data frame must include the following columns: Variables, Values, and Message. The Variables column contains the variable names used to filter the excluded rows. Here, we are using Lab.Sample.Name and DTXSID to identify the excluded rows separated by a “|”. The Values column contains the values of the variables, as a character, also separated by a “|”. The Message column contains the reason for exclusion. Here we are using the reasons listed in the Verified column in fup_uc_L2. The user should refrain from using “|” in any of their descriptions to avoid conflicts with the sample_verifiation function.

# Use verification data from loaded in `fup_uc_L2` data frame 
exclusion <- fup_uc_L2 %>% 
  filter(Verified != "Y") %>% 
  mutate("Variables" = "Lab.Sample.Name|DTXSID") %>% 
  mutate("Values" = paste(Lab.Sample.Name, DTXSID, sep = "|")) %>% 
  mutate("Message" = Verified) %>% 
  select(Variables, Values, Message)
Table 4: Exclusion data

Variables

Values

Message

As expected, our exclusion data frame is empty because all of our samples are verified. If all of the user’s samples are verified, they simply do not provide an exclusion.info data frame in sample_verification.

fup_uc_L2_curated <- sample_verification(FILENAME = "fup_UC_vignette", 
                                         data.in = fup_uc_L1_curated,
                                         assay = "fup-UC",
                                         # don't export output TSV file
                                         output.res = FALSE)

Our Level 2 data now contains a Verified column. If the sample should be included, the column contains a “Y” for yes. If the sample should should be excluded, the column contains the reason for exclusion.

The following table displays some rows of the Level 2 data.

Table 5: Level 2 data

Lab.Sample.Name

Date

Compound.Name

DTXSID

Lab.Compound.Name

Sample.Type

Dilution.Factor

Calibration

ISTD.Name

ISTD.Conc

ISTD.Area

Area

Analysis.Method

Analysis.Instrument

Analysis.Parameters

Note

Level0.File

Level0.Sheet

Test.Compound.Conc

Test.Nominal.Conc

Biological.Replicates

Technical.Replicates

Response

Verified

20200103_PFAS_PPB_UC_Sample004

010320

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

CC

1

010320

M2-8:2FTS

1

678.1172

0.000

UPLC-MS/MS

Waters Xevo TQ-S micro(QEB0036)

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

0.00000

10

1

0.000000000

Y

20200103_PFAS_PPB_UC_Sample005

010320

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

CC

1

010320

M2-8:2FTS

1

792.4930

UPLC-MS/MS

Waters Xevo TQ-S micro(QEB0036)

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

0.00000

10

1

Y

20200103_PFAS_PPB_UC_Sample007

010320

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

CC

1

010320

M2-8:2FTS

1

742.9400

4.745

UPLC-MS/MS

Waters Xevo TQ-S micro(QEB0036)

2.12

20220201_PFAS-LC_FractionUnbound_MGS.xlsx

20200103

0.00008

10

1

0.006386788

Y

Level 3 processing

calc_fup_uc is the Level 3 function used to calculate the fup point estimate from ultracentrifugation for each test compound using a Frequentist framework.

Mathematically, fup is the ratio of the compound concentration in the aqueous fraction after centrifugation to the compound concentration in the sample after incubation for 5 hours. It can be expressed as \[f_{up} = \frac{C_{\textrm{aqueous fraction}}}{C_{\textrm{incubation}}}\] where \(C_{\textrm{aqueous fraction}}\) is the compound concentration in the aqueous fraction and \(C_{\textrm{incubation}}\) is the compound concentration in the incubated sample.

The concentrations are defined as the mean AF or T5 response multiplied by its dilution factor.

fup_uc_L3_curated <- calc_fup_uc_point(FILENAME = "Fup_UC_vignette", 
                                 data.in = fup_uc_L2_curated,
                                 # don't export output TSV file 
                                 output.res = FALSE)
#> [1] "8:2 Fluorotelomer sulfonic acid f_up = 0.04"
#> [1] "Perfluorooctanoyl fluoride f_up = 0.00305"
#> [1] "Potassium perfluorobutanesulfonate f_up = 0.00501"
#> [1] "Fraction unbound values calculated for 3 chemicals."
#> [1] "Fraction unbound values calculated for 3 measurements."
Table 6: Level 3 data

Compound.Name

DTXSID

Lab.Compound.Name

Calibration

Fup

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

All Data

0.039992451

Perfluorooctanoyl fluoride

DTXSID0059829

PFOA-F

All Data

0.003046138

Potassium perfluorobutanesulfonate

DTXSID3037707

K-PFBS

All Data

0.005009286

Our Level 3 data contains a Fup estimate and a Calibration column that details which data was used to calculate the point estimate.

Level 4 processing

calc_fup_uc is the Level 4 function used to calculate fup point estimates and credible intervals from ultracentrifugation using a Bayesian framework. Markov chain Monte Carlo (MCMC) simulations are used to randomly sample from the posterior distribution with a uniform prior.

To run Level 4, one needs to have JAGS installed on their machine. To determine the correct path, the user may use the runjags::findjags() or may need to specify the JAGS installation location for the JAGS.PATH argument.

JAGS is a software used for conducting Bayesian hierarchical modeling using Markov Chain Monte Carlo (MCMC) simulation. invitroTKstats contains the JAGS models for each of the applicable assays and utilizes the JAGS model under the hood to run the MCMC simulations via the runjags dependency to obtain the level-4 Bayesian estimates.

  1. Download and install the “Just Another Gibbs Sampler” (JAGS) software, which is freely available here.
  2. Once JAGS is installed one will need to identify the installation location, which is likely to be in a system location (e.g. “Program Files”). Users may need this location to specify the JAGS.PATH argument within the level-4 function.
  3. Finally, if not already installed, the rjags, runjags, and coda R packages need to be installed via the R console using the install.packages function.

We pass fup_uc_L2_curated, the Level 2 data frame, and not fup_uc_L3_curated, the Level 3 data frame, into calc_fup_uc. This is because Level 3 and Level 4 processing are not sequential; they are methods that calculate different statistical quantities. The following code chunk takes a while to run; previous runtimes are around 10 minutes.

fup_uc_L4_curated <- calc_fup_uc(FILENAME = "Fup_UC_vignette",
                                 data.in = fup_uc_L2_curated,
                                 JAGS.PATH = runjags::findjags()
                                 )

The fup intervals are returned to the user’s R session, in an exported TSV file, and in an exported .RData file. There is no parameter to prevent the TSV or RData files from being exported because of the potential for the simulations to crash. If there are no crashes, then the exported TSV file is identical to the user’s R session and the exported .RData file. fup_uc_L4 is an example exported .RData file.

Additionally, intermediate files are saved to the user’s current working directory if TEMP.DIR = NULL. These include a Level 2 heldout set, fup_uc_L2_heldout, containing unverified samples and a Level 4 PREJAGS list, fup_uc_PREJAGS, containing arguments provided to JAGS. Because the PREJAGS list is overwritten with each compound, fup_uc_PREJAGS only contains information relevant to the last tested compound, K-PFBS in this case.

Our Level 4 data contains a credible interval for fup and for fstable which measures the compound’s stability. Mathematically, \(f_{stable} = \frac{\textrm{T5}}{\textrm{T1}}\) where a value of \(1\) indicates no compound breakdown.

Table 7: Level 4 data

Compound

DTXSID

Lab.Compound.Name

Fstable.Med

Fstable.Low

Fstable.High

Fup.Med

Fup.Low

Fup.High

Fup.point

8:2 Fluorotelomer sulfonic acid

DTXSID00192353

8:2 FTS

0.934688

0.8940251

0.9998001

0.039551450

0.037112872

0.041812808

0.039992451

Perfluorooctanoyl fluoride

DTXSID0059829

PFOA-F

0.999929

0.9907321

0.9999990

0.003166725

0.002614648

0.003749361

0.003046138

Potassium perfluorobutanesulfonate

DTXSID3037707

K-PFBS

0.999937

0.9936837

0.9999990

0.005359600

0.004983673

0.005782690

0.005009286

Best Practices: Food for thought

Generally, data processing pipelines should include minimal to no manual coding. It is best to keep clean code that is easily reproducible and transferable. The user should aim to have all the required data and meta-data files properly formatted to avoid further modifications throughout the pipeline.

Appendix

L4 Bayesian Modeling Information

In this section, we provide the equations for the Bayesian model (i.e., priors and likelihoods) used to estimate the fraction unbound in plasma (\(f_{up}\)), from the ultracentrifugation assay, and the uncertainty about that estimate. The following sub-sections are organized such that:

Some of the indices are reused between sections (e.g. \(i\), \(w^*_i\), etc.). However, it should be noted that these are not meant to be understood across sub-sections; rather, only understood within the context of the section they are in.

NOTE: Readers unfamiliar with JAGS should be aware that JAGS software uses precision (\(\tau\)) rather than variance (\(\sigma^2\)) (i.e., \(\tau = \sigma^2\)).

Measurement Model

Each chemical may have more than one day of experimentation, and thus multiple calibrations. Suppose for the chemical of interest there are a total of \(n_{cal}\) calibrations (Num.cal). For a particular calibration \(w \in (1,\ldots,n_{cal})\) we assume the following priors for our Bayesian model:

Prior for log-scale constant analytic standard deviation (log.const.analytic.sd):

\[ log(\sigma_a)_w \sim \textrm{Unif} \left( a = -6,b = 1 \right); \space \sigma_{a,w} = 10^{log(\sigma_a)_w} \]

where \(a\) and \(b\) are the minimum and maximum, respectively, and \(\sigma_{a,w}\) is the converted parameter used in later equations (const.analytic.sd).

Prior for the log-scale heteroscedastic analytic slope (log.hetero.analytic.slope):

\[ log(m_h)_w \sim \textrm{Unif} \left( a = -6,b = 1 \right) ; \space m_{h,w} = 10^{log(m_h)_w} \]

where \(a\) and \(b\) are the minimum and maximum, respectively, and \(m_{h,w}\) is the converted parameter used in later equations (hetero.analytic.slope).

Prior for the threshold concentration (C.thresh):

\[ C_{thresh,w} \sim \textrm{Unif} \left( a = 0, b = \frac{conc_{target,w}}{10} \right)\]

where \(a\) and \(b\) are the minimum and maximum, respectively, and \(conc_{target,w}\) is the expected initial concentration (Test.Nominal.Conc) for calibration index \(w\).

Prior for the log-scale calibration (log.calibration):

\[ log(cal)_w \sim \textrm{N} \left( \mu = 0, \tau = 0.01 \right); \space cal_w = 10^{log(cal)_w} \]

where \(\mu\) and \(\tau\) are the mean and precision, respectively, and \(cal_w\) is the converted parameter used in later equations (calibration).

Prior for the background (background):

\[ \gamma_w \sim \textrm{Exp} \left( \lambda = 100 \right) \]

where \(\lambda\) is the rate parameter.

Estimation of Response Observations

Suppose \(n\) is the total number of response observations (Num.obs), and \(y_i\) indicates the \(i^{th}\) observation, for all sample types. For each observation we obtain a posterior MCMC estimate for the observations with the following:

Calibration curve slope for the \(i^{th}\) observation (slope) is assumed to be the calibration estimate corresponding to the calibration index (\(w^*_i\)):

\[ m_i = cal_{w^*_i}\]

where \(w^*_i\) is the calibration index (obs.cal) for the \(i^{th}\) observation, such that \(w^* = (w^*_1,...,w^*_n)\) and \(w^*_i \in w\) (i.e. \(w^*_i\) indicates one of the \(n_{cal}\) calibrations).

Calibration curve intercept for the \(i^{th}\) observation (intercept) is assumed to be the background estimate corresponding to the calibration index (\(w^*_i\)):

\[ \alpha_{CC,i} = \gamma_{w^*_i} \]

Estimation for the predicted values for response observations (Response.pred):

\[ x_i = \frac{m_i*((C_{c^*_i}- C_{thresh,w^*_i}) * \beta_i + \alpha_i)}{df_i} \]

where

  • \(c^*_i\) is the index for the standard test chemical concentration (obs.conc) for the \(i^{th}\) observation
  • \(df_i\) is the dilution factor (Dilution.Factor) for the \(i^{th}\) observation
  • \(\beta_i\) is the JAGS step function for the \(i^{th}\) observation, such that

\[ \beta_i = \begin{cases} 1 & if \space (C_{c^*_i}- C_{thresh,w^*_i}) \ge 0\\ 0 & o.w. \end{cases} \]

Estimation of the precision for observations (Response.prec):

\[ \tau^*_i = \frac{1}{(\sigma_{a,w^*_i}+m_{h,w^*_i}*x_i)^2} \]

Likelihood for the observations (Response.obs):

\[ y_i \sim N(\mu = x_i,\tau = \tau^*_i) \]

Binding Model

Prior for the log-scale fraction unbound in plasma (log.Fup):

\[ log(f_{up}) \sim \textrm{Unif} \left( a = -15, b = 0 \right) ; \space f_{up} = 10^{log(f_{up})} \]

where \(a\) and \(b\) are the minimum and maximum, respectively, and \(f_{up}\) is the converted parameter used in later equations (Fup).

Prior for the log-scale chemical loss fraction (log.Floss):

\[ log(f_{loss}) \sim \textrm{Unif} \left( a = -6,b = 0 \right); \space f_{stable} = 1 - 10^{log(f_{loss})} \]

where \(a\) and \(b\) are the minimum and maximum, respectively, and \(f_{stable}\) is the converted parameter used in later equations (Fstable) estimating the fraction of stable chemical in the assay and available for plasma binding.

Suppose \(n_s\) is the total number of series run (i.e. biological replicates) (Num.series) and \(C_i\) indicates the \(i^{th}\) series. Each series has a total of 3 observations, including the 1 and 5 hour whole plasma and aqueous fraction observations (i.e. sample types = T1 or T5 or AF, respectively), such that there are a total of \(n*\) observations (i.e., \(n^* = 3*n_s\)). For each series obtain the following:

Prior for the whole plasma 1 hour sample (T1) concentration (Conc):

\[ C_i \sim \textrm{N} \left( \mu = conc_{target,w^*_i}, \tau = 100 \right) \]

where

  • \(\mu\) and \(\tau\) are the mean and precision, respectively
  • \(i\) indicates the \(i^{th}\) series run (biological replicate)
  • \(w^*_i\) represents the calibration index (obs.cal) for the \(i^{th}\) series \(w^* = (w^*_1, \ldots,w^*_n)\) and \(w^*_i \in w\) (i.e. \(w^*_i\) indicates one of the \(n_{cal}\) calibrations)
  • \(conc_{target,w^*_i}\) is the expected initial concentration (Test.Nominal.Conc) corresponding to the \(i^{th}\) series, given the calibration index \(w^*_i\)

Posterior estimate for the whole plasma 5 hour sample (T5) concentrations (after potential breakdown):

\[ C_{T5,i} = f_{stable} * C_i \]

Posterior estimate for the aqueous fraction sample (AF) concentrations for the stable chemical at T5:

\[ C_{AF,i} = f_{up}*C_{T5,i} \]

Post-MCMC Estimates

Fraction of Chemical Stability

Posterior estimates for the fraction of stable chemical in the assay, including the median (Fstable.Med) and \(95\%\) credible interval (Fstable.Low and Fstable.High):

\[ \textrm{Fstable.Med} = f_{stable,0.5} \]

\[ \textrm{Fstable.CI} = (\textrm{Fstable.Low},\textrm{Fstable.High}) = (f_{stable,0.025},f_{stable,0.975}) \]

where \(f_{stable,p}\) indicates the percentile (\(p\)) for the posterior distribution of the chemical stability fraction, (\(p = 0.5\) indicates the \(50\%\) percentile, i.e. median, for the posterior distribution).

Fraction of Chemical Unbound in Plasma (Bayesian Estimates)

Posterior estimates for the chemical fraction unbound in plasma, including the median (Fup.Med) and \(95\%\) credible interval (Fup.Low and Fup.High):

\[ \textrm{Fup.Med} = f_{up,0.5} \]

\[ \textrm{Fup.CI} = (\textrm{Fup.Low},\textrm{Fup.High}) = (f_{up,0.025},f_{up,0.975}) \]

where \(f_{up,p}\) indicates the percentile (\(p\)) for the posterior distribution of the fraction of the chemical unbound in plasma, (\(p = 0.5\) indicates the \(50\%\) percentile, i.e. median, for the posterior distribution).

Fraction Unbound in Plasma (Empirical Point Estimate)

Point estimate for the chemical fraction unbound in plasma (Fup.point):

Suppose there are a total of \(n_{AF}\) observed aqueous fraction (AF) responses, then estimate the mean response:

\[ \hat{\mu_{AF}} = \frac{\sum_{j = 1}^{n_{AF}}(y_{AF,j} * df_{AF,j})}{n_{AF}} \]

where \(y_{AF,j}\) is the response and \(df_{AF,j}\) is the dilution factor for the \(j^{th}\) aqueous fraction observation, sample type = AF.

Suppose there are a total of \(n_{T5}\) observed whole plasma 5 hour (T5) responses, then estimate the mean response:

\[ \hat{\mu_{T5}} = \frac{\sum_{j = 1}^{n_{T5}}(y_{T5,j} * df_{T5})}{n_{T5}} \]

where \(y_{T5,j}\) is the response and \(df_{T5,j}\) is the dilution factor for the \(j^{th}\) whole plasma 5 hour observation, sample type = T5.

Then the fraction unbound in plasma point estimate can be estimated as follows:

\[ \textrm{Fup.point} = \frac{\hat{\mu_{AF}}}{\hat{\mu_{T5}}} \]

JAGS to L4 Equation Notation Tables

Data passed to JAGS as part of the PREJAGS object:

Data Notation Description
Test.Nominal.Conc \(conc_{target}\) expected initial concentration
Num.cal \(n_{cal}\) total number of calibrations
Num.obs \(n\) total number of responses
Response.obs \(y\) all sample responses
obs.conc \(c^*\) concentration indices for all samples
obs.cal \(w^*\) calibration index for all samples
Dilution.Factor \(df\) dilution factors for all samples
Conc \(C\) the standard test chemical concentration of CC samples (and NA placeholders for T1, T5, and AF samples)
Num.cc.obs \(n_{CC}\) total number of CC samples
Num.series \(n_s\) number of biological replicates (series)

MCMC Parameters in JAGS:

JAGS Parameter Name Parameter Distribution Prior/Posterior/Calculated
log.const.analytic.sd \(log(\sigma_a)\) Uniform Prior
log.hetero.analytic.slope \(log(m_h)\) Uniform Prior
C.thresh \(C_{thresh}\) Uniform Prior
log.calibration \(log(cal)\) Normal Prior
background \(\gamma\) Exponential Prior
const.analytic.sd \(\sigma_a\) Calculated
hetero.analytic.slope \(m_h\) Calculated
calibration \(cal\) Calculated
slope \(m\) Calculated
intercept \(\alpha\) Calculated
Response.pred \(x\) Calculated
Response.prec \(\tau^*\) Calculated
Response.obs \(y\) Normal Posterior
log.Fup \(log(f_{up})\) Uniform Prior
Fup \(f_{up}\) Calculated
log.Floss \(log(f_{loss})\) Uniform Prior
Fstable \(f_{stable}\) Calculated
Conc \(C\) Normal (T1) / - (T5, AF) Prior (T1) / Calculated (T5, AF)

References

Kreutz, Anna, Matthew S. Clifton, W. Matthew Henderson, Marci G. Smeltz, Matthew Phillips, John F. Wambaugh, and Barbara A. Wetmore. 2023. “Category-Based Toxicokinetic Evaluations of Data-Poor Per- and Polyfluoroalkyl Substances (PFAS) Using Gas Chromatography Coupled with Mass Spectrometry.” Toxics 11 (5): 463.
Redgrave, T. G., D. C. K. Roberts, and C. E. West. 1975. “Separation of Plasma Lipoproteins by Density-Gradient Ultracentrifugation.” Analytical Biochemistry 65 (1–2): 42–49.
Smeltz, Marci, John F. Wambaugh, and Barbara A. Wetmore. 2023. “Plasma Protein Binding Evaluations of Per- and Polyfluoroalkyl Substances for Category-Based Toxicokinetic Assessment.” Chemical Research in Toxicology 36 (6): 870–81.