updateColMeta() function
read_annotations()
is.AptName() to check for AptName format
is.apt() may return TRUE for both SeqIds and AptNames
is.AptName(), to explicitly check if a SomaScan identifier is an AptName
is.AptName() code and tests based on code/tests from is.SeqId()
read_annotations()
preProcessAdat() to handle missing
ColCheck
preProcessAdat() to handle adats that are missing ColCheck in column annotation data
library() calls loading third-party packages to new code chunk that now appears in analysis workflow articles
preProcessAdat() function
preProcessAdat() to filter features, filter samples, generate data QC plots of normalization scale factors by covariates, and perform standard analyte RFU transformations including log10, centering, and scaling
calcOutlierMap() function
calcOutlierMap() and its print and plot S3 methods, along with getOutlierIds() for identifying sample level outliers from outlier map object
ggplot2 as a package dependency
ex_clin_data object
tibble object with additional sample annotation fields smoking_status and alcohol_use to demonstrate merging to a soma_adat object
preProcessAdat() function
README
read_adat() example in README
example_data.adat object
preProcessAdat() function for pre-processing
README and loading and wrangling vignette article with section including code to join the ex_clin_data object to the example_data adat
figure(), close_figure(), save_png(), and expect_snapshot_plot() for saving plot snapshot output to testthat/helper.R
preProcessAdat() messaging, print and QC plot output
calc_eLOD() function (#131)
soma_adat or data.frame
crayon bug and ui_bullet() issue (#129, #130)
crayon and usethis as dependencies in favor of cli
ui_bullet() internal calls within loadAdatsAsList() and write_adat()
Summary.soma_adat() operations (#121)
min(), max(), any(), range(), etc. would return the incorrect value due to an as.matrix() conversion under the hood
soma_adat, just like Summary.data.frame()
collapseAdats() now maintains Cal.Set entries of Col.Meta (#113)
collapseAdats() now attempts to smartly merge the Col.Meta attribute (potentially numerous elements) in the final object, preserving the “Cal.Set” and “ColCheck” columns in particular
Col.Meta attribute is a combined product of the individual ADAT elements, and the intersect of the analyte features (as is the case for the rbind() that is called)
read_annotations() to load the individual Excel files
lift_master object to alpha sort columns
vignette() code references
read_annotations() example documentation now points to the most recent 11k Excel annotations file
parseHeader() example now prints list elements separately, rather than full object, which slowed website rendering
rhub.yaml configuration file to comply with rhub v2
pkgdown.yaml to macOS-14
pkgdown.yaml file to enable deployment
ubuntu machine was taking too long to build
getSomaScanLiftCCC(), parseCheck() and release utilities which were previously untested
pivotExpressionSet()
lift_adat() functionality (@stufield, #81, #78)
lift_adat() now takes a bridge = argument, replacing the anno.tbl = argument. Lifting is now performed internally for a better (and safer) user experience, without the necessity of an external annotations (Excel) file.
is_lifted() is new and returns a boolean according to whether the signal space (RFU) has been previously lifted
Lifting accessor function for Lin’s CCC values (#88)
getSomaScanLiftCCC() accesses the lifting correlations between SomaScan versions for each analyte
tibble split by sample matrix (serum or plasma)
merge_clin() is newly exported (#80)
soma_adat objects easily
dplyr
Newly exported ADAT “get**” helpers (#83)
getAdatVersion()
getSomaScanVersion()
getSignalSpace()
checkSomaScanVersion()
getAdatVersion() gets a new S3 method (#92)
soma_adat or list depending on the situation
Newly exported functions that were previously internal only:
addAttributes()
addClass()
cleanNames()
README is now simplified (#35)
README
README into their own vignettes
collapseAdats() better combines HEADER information (#86)
PlateScale and Cal*, are better maintained in the final collapsed ADAT
HEADER information in the resulting (collapsed) soma_adat
Update read_annotations() with 11k content (#85)
Update transform() and scaleAnalytes()
scaleAnalytes() (internal) now skips missing references and is much more like a “step” in the recipes package
transform() gets edge case protection with drop = FALSE in case a single-analyte soma_adat is scaled.
New row.names() S3 method support for soma_adat class
rownames()
NextMethod() which normally would invoke data.frame, we now force the data.frame method in case there are tbl_df or grouped_df classes present that would be dispatched. Those are bypassed in favor of the data.frame method because tbl_df 1) can nuke the attributes, and 2) triggers a warning about adding rownames to a tibble.
New grouped_df S3 print support for the grouped soma_adat
soma_adat class
New grouped_df S3 method support for soma_adat class (#66)
grouped_df data objects were previously unsupported and were interfering with downstream S3 methods for dplyr verbs once NextMethod() was called
soma_adat class itself (and most importantly, with its attributes intact)
tidyr::separate.soma_adat() S3 method was simplified (#72)
%||% helper internally
stopifnot() to be more informative
is_intact_attr() is now much quieter, signaling only when called indirectly (#71)
is_intact_attr() can be, sometimes deeply, nested in the call stack
Development and improvements to the pkgdown website
README now links to the pkgdown website
SomaDataIO no longer depends on desc package
README.md
sysdata.rda no longer contains non-exported functions (#59)
convertColMeta()
genRowNames()
parseCheck()
syncColMeta()
scaleAnalytes()
write_adat(), via a call to apply(), which expects a 2-dim object when replacing those values.
write_adat() no longer uses apply() and instead converts the entire RFU data frame to a matrix (maintains original dimensions), and uses vectorized format conversion via sprintf()
sprintf() is only called once on a long vector, rather than 1000s of times on shorter vectors (inside apply()).
SomaScanObjects.R (thanks @Hijinx725!, #40)
Rscript --vanilla merge_clin.R for merging clinical variables into existing *.adat SomaScan data files
meta.csv and meta2.csv files to run examples with random data but with valid index keys
dir(system.file("cli", "merge", package = "SomaDataIO"))
example_data.adat was reduced in size to n = 10 samples (from 192) to conform to CRAN size requirements (< 5MB)
example_data10.adat to reflect this change
system.file()
example_data object itself however remains true to its original file (https://github.com/SomaLogic/SomaLogic-Data/blob/master/example_data.adat)
inst/example/ was renamed inst/extdata/ to conform to CRAN package standard naming conventions
single_sample.adat was removed from package data as it is now redundant (however still used in unit testing)
SomaDataObjects was renamed and is now SomaScanObjects
read.adat() is now soft-deprecated; please use read_adat() instead
warn() -> stop() for functions that have been soft deprecated since v5.0.0
getSomamers()
getSomamerData()
meltExpressionSet()
tibble has new max_extra_cols = argument, which is set to 6 for the print.soma_adat method
base::merge() on a soma_adat is strongly discouraged
dplyr::*_join() alternatives which are designed to preserve soma_adat attributes
prepHeaderMeta() (@stufield)
CreatedDate and CreatedBy in the HEADER entry. This currently breaks the writer
CreatedDateHistory was removed as an entry from written ADATs
CreatedByHistory was combined and dated for written ADATs
NULL behavior remains if keys are missing
CreatedBy and CreatedDate will be generated either as new entries or over-written as appropriate
Bug-fix release related to write_adat():
write_adat() that resulted from adding/removing clinical (non-SomaScan) variables to an ADAT. Export via write_adat() resulted in a broken ADAT file (@stufield, #18)
write_adat() now has much higher fidelity to the original text file (*.adat) in full-cycle read-write-read operations, particularly in the presence of bangs (!) in the Header section and in floating point decimals in the ?Col.Meta section
write_adat() no longer converts commas (,) to semi-colons (;) in the ?Col.Meta block (originally introduced to avoid cell alignment issues in *.csv formats)
write_adat() no longer concatenates written ADATs when writing to the same file. Data is over-written to file to avoid mangled ADATs resulting from re-writing to the same connection and to match the default behavior of write.table(), write.csv(), etc.
read_adat() now has more consistent character type
the Barcode2 variable in standard ADATs is now forced to character class, and R’s read.delim() is not allowed to “guess” the type
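The effect of forcing a character class, rather than letting read.delim() infer the type, can be seen in a minimal base R sketch. The two-row table and its Barcode2 values are invented for illustration, and this is not the read_adat() implementation; only read.delim() and its colClasses = argument are taken as given.

```r
# Hypothetical tab-delimited snippet; the Barcode2 values are invented
txt <- "SampleId\tBarcode2\nS1\t0012\nS2\t0345\n"

# Default: read.delim() guesses each column type from its contents,
# so an all-digit Barcode2 column is read as a number
guessed <- read.delim(text = txt)

# Explicit: force Barcode2 to character so leading zeros survive
forced <- read.delim(text = txt, colClasses = c(Barcode2 = "character"))

class(guessed$Barcode2)  # inferred from the data
class(forced$Barcode2)   # character, regardless of contents
forced$Barcode2
```

Fixing the type up front means the column class no longer depends on which barcodes happen to be present in a given file.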
Decreased dependency on magrittr pipes (%>%) in favor of the native R pipe (|>). As a result the package now depends on R >= 4.1.0
SomaDataIO will continue to re-export magrittr pipes for backward compatibility, but this should not be considered permanent. Please code accordingly
Migration to the default branch in GitHub from master -> main (@stufield, #19)
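For code that still relies on the re-exported magrittr pipe, moving to the native pipe noted above is usually mechanical. A minimal base R sketch, assuming R >= 4.1.0; the rfu vector is a made-up example, not package data:

```r
rfu <- c(100, 1000, 10000)   # hypothetical RFU values

# Native pipe: the left-hand side becomes the first argument of each call
centered <- rfu |> log10() |> scale(scale = FALSE) |> as.numeric()

# Equivalent nested call, for comparison
nested <- as.numeric(scale(log10(rfu), scale = FALSE))

identical(centered, nested)
```

One difference to keep in mind when migrating: the native pipe has no `.` placeholder. From R 4.2.0 a `_` placeholder is available, but only with a named argument.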
Numerous non-user-facing (API) changes for internal package maintenance, efficiency, and structural upgrades were included
Upgrades primarily from improvements to SomaLogic internal code base, including: (@stufield)
readr package for parsing and importing ADATs (e.g. read.delim() over readr::read_delim()). This is mostly for code simplification, but can often result in marked speed improvements. As the SomaScan plex size increases, this speed improvement will become more important.
parseHeader() was dramatically simplified, now reading in lines 20L at a time until the RFU block is reached. In addition, once the block is reached, all header lines are read-in once and indexed (as opposed to line-by-line).
read_adat() now specifies column types via colClasses = which for the majority of the ADAT is type double for the RFU columns. This should dramatically improve speed of ingest.
write_adat() was simplified internally, with fewer nested apply and for-loops.
UTF-8.
New getAnalytes() S3 method for class recipe from the recipes package.
New loadAdatsAsList() to load multiple ADAT files in a single call and optionally collapse them into a single data frame (@stufield, #8).
New getTargetNames() function to map ADAT
seq.XXXX.XX names to corresponding protein targets from the
annotations table
SomaLogic Inc. is now SomaLogic Operating Co. Inc.
Added new documentation regarding Col.Meta (@stufield, #12).
Research Use Only (“RUO”) language was added to the README (@stufield, #10).
Numerous internal code improvements from SomaLogic code-base (@stufield)
stop() over ui_stop() and warning() over ui_warn(), using usethis, cli, and crayon shims aliases.
purrr very selectively and no longer uses stringr.
New lift_adat() was added to provide ‘lifting’ functionality (@stufield, #11)
transform.soma_adat() method which simplifies linear scaling of soma_adat columns (analytes).
Minor improvements and updates to the README.Rmd (@stufield, #7)
adat2eSet() link in README (#5).
README regarding Biobase installation.
pkgdown and links to Issues (#4).
Startup message was improved with dynamic width (@stufield).
New locateSeqId() function to pull out
SeqId regex. (@stufield).
New read_annotations() function (@stufield, #2)
*.xlsx).
New set_rn() drop-in replacement for magrittr::set_rownames()
getFeatures() was renamed to be less ambiguous and
better align with internal SomaLogic code usage. Now use
getAnalytes() (@stufield)
getFeatureData() was also renamed to
getAnalyteInfo() (@stufield)
Various upgrades as required by code changes in external package dependencies, e.g. tidyverse.
New alias for read_adat(), read.adat(), for backward compatibility with previous versions of SomaDataIO (@stufield)