Introduction to htrSPRanalysis

This R package consists of a library of functions for processing and analyzing data from SPR binding kinetics experiments, and also contains an example script that executes those functions in a basic pipeline.

Install using install.packages(htrSPRanalysis)

Then load the library using:

library(htrSPRanalysis)

Input

The user must provide two input files

  1. A time series file that contains the times and corresponding response unit (RU) values for a complete high-throughput SPR run. The software assumes this file is in the output format from the Carterra LSA, with 384 spots on the chip. There are two header lines, as shown below:

  1. A sample information file in the form of an Excel spreadsheet with the following columns

Sample information column definitions

Reading and processing input files, obtaining user supplied options

The function process_input when executed in Rstudio will prompt the user for the sample information file and the titration data file. It will also prompt for settings such as the minimum and maximum RU to use for the dissocation window determination, the maximum iterations and the tolerances ptol and ftolto use for the non-linear least squares curve fit. It returns a list object with the sample information, the original time series, the baseline adjusted RUs and several other variables used internally. This is the object that will be passed to the functions for fitting and plotting.

As soon as these files are selected, the software will read in both files and check that the formats are correct. This may take some time, depending on the size of the titration data file.

Next, the user will be prompted to select the minimum allowed \(k_d\). This is used in the curve/parameter fitting as a lower bound.

The maximum number of iterations is the number of iterations at which the optimization algorithm should stop if convergence is not obtained.

The automatic detection of the dissociation window will be suppressed if the maximum RU is too high or the minimum RU is too low. The following prompts allow the user to determine these.

It is highly recommended that this software be run on a high-performance platform. The compute-intensive processes are all parallelized. The software automatically detects the total number of cores available. The user can change this to their preferred value.

There are three default output files:

  1. A .pdf file with each page corresponding to a spot on the chip. For each spot, there is a sensorgram with the baseline corrected data and the fitted RU curve for each concentration used to fit the curves/parameters, a residual plot, a response curve showing the selected concentrations and a table of kinetics parameters and their standard errors.

  2. A .pdf file that tracks any errors encountered in fitting/plotting

  3. A .csv file that outputs all fitted parameters and their standard errors

The software uses the data directory as the default output path, and uses the sample information filename along with a time/date stamp to name each output file. These can be changed by the user via arguments to process_input

Output

As explained in the previous section, the standard pipeline outputs \(3\) files, a .pdf consisting of plots and tables for each spot on the chip:

The Pipeline

# CRAN has strict limits on package sizes, and the titration files tend to be
#large. Therefore, we will need to download the titration data from the gitlab 
#repository. The following will download necessary files from gitlab onto
#your computer.

# By default, the files will go to the temp directory for this R session. 
# To change this, change the filename in destfile 

destfile <- file.path(tempdir(),"titration_data.xlsx")

download.file("https://gitlab.oit.duke.edu/janice/htrspranalysis/-/raw/master/inst/extdata/titration_data.xlsx?ref_type=heads",
              destfile = destfile, mode = "wb")

sample_sheet_path <- system.file("extdata",
                                 "sample_sheet.xlsx",
                                 package="htrSPRanalysis")

data_file_path <- destfile

output_file_path <- tempdir()

# process the input
processed_input <- process_input(sample_sheet_path,
                    data_file_path,
                    num_cores = 2,
                    output_file_path = output_file_path)

Obtain plots of raw data

plots_before_processing <- get_plots_before_baseline(processed_input)

Example plot:

plots_before_processing[[1]]
#> Warning: Removed 225 rows containing missing values or values outside the scale range
#> (`geom_point()`).

plot of chunk unnamed-chunk-5

Fit the data to the 1:1 Languir model

fits_list <- get_fits(processed_input)

Plot the sensorgrams with fitted curves

plot_list <- get_fitted_plots(processed_input, fits_list)

Example plot:

plot_list[[1]]

plot of chunk unnamed-chunk-8

Create the response curve plots

These plots show the average response vs \(log_{10}\) of the concentration of analyte. The color indicates the concentrations selected for fitting. These should be within the linear part of the curve.

rc_list <- get_rc_plots(processed_input)

Example plot

rc_list[[5]]

plot of chunk unnamed-chunk-10

Create a pdf with fitted sensorgrams, curve parameters, standard errors, residual plot and response curve.

pdf_pages <- create_pdf(processed_input, fits_list, rc_list, plot_list)

plot of chunk unnamed-chunk-11plot of chunk unnamed-chunk-11plot of chunk unnamed-chunk-11plot of chunk unnamed-chunk-11plot of chunk unnamed-chunk-11

Create a .csv file with all model parameters and standard errors.

parameter_table <- create_csv(processed_input, fits_list)

Bulk shift example

Pipeline


sample_sheet_path <- system.file("extdata", "bulkshift_example_sample_sheet.xlsx", package="htrSPRanalysis")
data_file_path <- system.file("extdata", "bulkshift_example_titration_data.xlsx", package="htrSPRanalysis")

# process the input

pdfpath <- tempdir()


bulkshift_example <- process_input(sample_sheet_path = sample_sheet_path,
                data_file_path = data_file_path,
                num_cores = 2, 
                output_file_path = pdfpath)
#> Joining with `by = join_by(Sample, Block)`
#> New names:

bulkshift_plots_before_processing <- get_plots_before_baseline(bulkshift_example)

bulkshift_fits <- get_fits(bulkshift_example)

bulkshift_plots <- get_fitted_plots(processed_input = bulkshift_example,
                                    fits_list = bulkshift_fits)

bulkshift_rc <- get_rc_plots(processed_input = bulkshift_example)

Output

bulkshift_plots_before_processing[[1]]
#> Warning: Removed 114 rows containing missing values or values outside the scale range
#> (`geom_point()`).

plot of chunk unnamed-chunk-14


bulkshift_plots[[1]]

plot of chunk unnamed-chunk-14


bulkshift_rc[[1]]

plot of chunk unnamed-chunk-14


create_pdf(processed_input = bulkshift_example, 
           fits_list = bulkshift_fits,
           rc_list = bulkshift_rc)

create_csv(processed_input = bulkshift_example, 
           fits_list = bulkshift_fits)
#> # A tibble: 1 × 42
#>   Ligand Analyte Block Row   Column   ROI Rmax1 Rmax2 Rmax3 Rmax4 Rmax5 Rmax_se1
#>   <chr>  <chr>   <dbl> <chr>  <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>
#> 1 Antib… Antige…     1 A          1     1  59.8  71.9  83.6  87.8  91.4     2.78
#> # ℹ 30 more variables: Rmax_se2 <dbl>, Rmax_se3 <dbl>, Rmax_se4 <dbl>,
#> #   Rmax_se5 <dbl>, ka <dbl>, ka_se <dbl>, kd <dbl>, kd_se <dbl>, KD <dbl>,
#> #   KD_se <dbl>, Bulkshift1 <dbl>, Bulkshift2 <dbl>, Bulkshift3 <dbl>,
#> #   Bulkshift4 <dbl>, Bulkshift5 <dbl>, Bulkshift_se1 <dbl>,
#> #   Bulkshift_se2 <dbl>, Bulkshift_se3 <dbl>, Bulkshift_se4 <dbl>,
#> #   Bulkshift_se5 <dbl>, R01 <dbl>, R02 <dbl>, R03 <dbl>, R04 <dbl>, R05 <dbl>,
#> #   R0_se1 <dbl>, R0_se2 <dbl>, R0_se3 <dbl>, R0_se4 <dbl>, R0_se5 <dbl>