rconf is a lightweight configuration tool for R that parses basic YAML configuration files without any external dependencies. This vignette is a guide to using rconf in several scenarios, from basic configuration loading to more advanced techniques such as dynamic configuration selection and merging configurations.
The simplest way to use rconf is to load a configuration file and extract settings. For example, assume you have a configuration file stored in the package’s inst/extdata directory. You can load it as follows:
library(rconf)
# Load the default configuration from the sample file in extdata.
cfg <- get_config(system.file("extdata", "config.yml", package = "rconf"))
print(cfg)
## $raw_data_dir
## [1] "/data/proteomics/raw"
##
## $processed_data_dir
## [1] "/data/proteomics/processed"
##
## $sample_metadata
## [1] "/data/proteomics/metadata/samples.csv"
##
## $normalization_method
## [1] "median"
##
## $quantification
## [1] "LFQ"
##
## $protein_fdr
## [1] 0.01
##
## $differential_expression
## [1] TRUE
##
## $p_value_cutoff
## [1] 0.05
##
## $plot
## $plot$output_dir
## [1] "results/plots"
##
## $plot$format
## [1] "png"
##
## $plot$dpi
## [1] 300
The returned list cfg contains keys such as raw_data_dir, sample_metadata, and the analysis parameters defined in your YAML file.
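Because the result is an ordinary R list, nested settings can be reached with `$` or `[[`. The sketch below uses a hand-built list that mirrors part of the printed output above, so it runs without the sample file. The list is an illustrative stand-in, and the file name `volcano` is a made-up example.

```r
# Hand-built stand-in mirroring part of the printed configuration above.
cfg <- list(
  raw_data_dir = "/data/proteomics/raw",
  protein_fdr = 0.01,
  plot = list(output_dir = "results/plots", format = "png", dpi = 300)
)

# Top-level and nested access.
fdr <- cfg$protein_fdr
dpi <- cfg[["plot"]][["dpi"]]

# Build an output file name from nested settings ("volcano" is hypothetical).
plot_file <- file.path(cfg$plot$output_dir, paste0("volcano.", cfg$plot$format))
print(plot_file)
```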
Your YAML file can contain multiple configuration sets (e.g., default, development, production). For example, consider the following YAML snippet:
default:
  raw_data_dir: "/data/proteomics/raw"
  processed_data_dir: "/data/proteomics/processed"
  sample_metadata: "/data/proteomics/metadata/samples.csv"
  normalization_method: "median"
development:
  raw_data_dir: "/data/dev/proteomics/raw"
  processed_data_dir: "/data/dev/proteomics/processed"
  sample_metadata: "/data/dev/proteomics/metadata/samples_dev.csv"
  normalization_method: "quantile"
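You can load a specific configuration by passing the desired configuration name. Note that the call below reads the sample file bundled with the package rather than the snippet above, so the printed value reflects that file's contents: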
# Load the 'development' configuration
dev_cfg <- get_config(system.file("extdata", "config.yml", package = "rconf"),
                      config_name = "development")
print(dev_cfg$raw_data_dir)
## NULL
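A NULL like the one above is easy to miss, because `$` on a list silently returns NULL for a missing key. One way to fail early is a small accessor. `require_key` below is a hypothetical helper, not part of rconf, and it is shown on a hand-built list rather than a loaded file.

```r
# Hypothetical helper (not part of rconf): stop if a key is absent.
require_key <- function(cfg, key) {
  if (!key %in% names(cfg)) {
    stop(sprintf("Configuration key '%s' not found", key), call. = FALSE)
  }
  cfg[[key]]
}

# Stand-in list whose settings sit one level down, under `default`.
cfg <- list(default = list(raw_data_dir = "/data/proteomics/raw"))

# Succeeds: the key exists at this level.
found <- require_key(cfg$default, "raw_data_dir")

# Fails with an informative message instead of returning NULL.
res <- tryCatch(require_key(cfg, "raw_data_dir"),
                error = function(e) conditionMessage(e))
print(res)
```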
The rconf parser supports the following:
yields a value of "bam".
Inline Arrays: Arrays specified inline are flattened into atomic vectors. For example, the inline array [25, 30, 35] is parsed into the numeric vector c(25, 30, 35).
Nested Keys: Indentation is used to create nested lists. For example, a plot key followed by indented output_dir and format entries results in a list plot with elements output_dir and format.
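To make these rules concrete, here is a rough base-R sketch of how the value part of a single `key: value` line could be interpreted when it carries an inline comment or an inline array. It only illustrates the behaviour described above and is not rconf's actual implementation. `parse_value` and the sample keys `file_type` and `read_lengths` are hypothetical.

```r
# Hypothetical sketch, not rconf's parser: interpret the value part of a line.
parse_value <- function(line) {
  # Drop an inline comment (naive: assumes no '#' inside quoted strings).
  line <- sub("\\s+#.*$", "", line)
  # Remove everything up to the first colon, leaving the value.
  value <- trimws(sub("^[^:]*:", "", line))
  if (grepl("^\\[.*\\]$", value)) {
    # Inline array: strip brackets, split on commas, strip quotes.
    items <- trimws(strsplit(gsub("^\\[|\\]$", "", value), ",")[[1]])
    items <- gsub("^[\"']|[\"']$", "", items)
    # Return numbers when every item converts cleanly, else strings.
    nums <- suppressWarnings(as.numeric(items))
    if (!anyNA(nums)) nums else items
  } else {
    # Scalar: strip surrounding quotes.
    gsub("^[\"']|[\"']$", "", value)
  }
}

scalar <- parse_value("file_type: \"bam\"  # alignment format")
vec <- parse_value("read_lengths: [25, 30, 35]")
str(list(scalar = scalar, vec = vec))
```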
Because rconf returns a list, you can easily override configuration values at runtime. For instance:
# Load default configuration
cfg <- get_config(system.file("extdata", "config.yml", package = "rconf"))
# Override a parameter
cfg$normalization_method <- "z-score"
print(cfg$normalization_method)
## [1] "z-score"
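When several values need to change at once, the same idea extends to a named list of overrides. This is a minimal sketch on a hand-built list standing in for a loaded configuration. The `overrides` values are illustrative.

```r
# Stand-in for a loaded configuration.
cfg <- list(
  normalization_method = "median",
  protein_fdr = 0.01,
  p_value_cutoff = 0.05
)

# Runtime overrides collected elsewhere (illustrative values).
overrides <- list(normalization_method = "quantile", p_value_cutoff = 0.01)

# Warn about keys that do not exist yet, then apply all overrides by name.
unknown <- setdiff(names(overrides), names(cfg))
if (length(unknown) > 0) {
  warning("New keys added: ", paste(unknown, collapse = ", "))
}
cfg[names(overrides)] <- overrides

str(cfg)
```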
You can also merge multiple configurations. For example, if you have a default configuration and want to override specific parameters with values from the development configuration, you can do so as follows:
base_cfg <- get_config(system.file("extdata", "config.yml", package = "rconf"), config_name = "default")
dev_cfg <- get_config(system.file("extdata", "config.yml", package = "rconf"), config_name = "development")
combined_cfg <- merge_configs(base_cfg, dev_cfg)
print(combined_cfg)
## $raw_data_dir
## [1] "/data/proteomics/raw"
##
## $processed_data_dir
## [1] "/data/proteomics/processed"
##
## $sample_metadata
## [1] "/data/proteomics/metadata/samples.csv"
##
## $normalization_method
## [1] "median"
##
## $quantification
## [1] "LFQ"
##
## $protein_fdr
## [1] 0.01
##
## $differential_expression
## [1] TRUE
##
## $p_value_cutoff
## [1] 0.05
##
## $plot
## $plot$output_dir
## [1] "results/plots"
##
## $plot$format
## [1] "png"
##
## $plot$dpi
## [1] 300
##
##
## $default
## $default$raw_data_dir
## [1] "/data/proteomics/raw"
##
## $default$processed_data_dir
## [1] "/data/proteomics/processed"
##
## $default$sample_metadata
## [1] "/data/proteomics/metadata/samples.csv"
##
## $default$normalization_method
## [1] "median"
##
## $default$quantification
## [1] "LFQ"
##
## $default$protein_fdr
## [1] 0.01
##
## $default$differential_expression
## [1] TRUE
##
## $default$p_value_cutoff
## [1] 0.05
##
## $default$plot
## $default$plot$output_dir
## [1] "results/plots"
##
## $default$plot$format
## [1] "png"
##
## $default$plot$dpi
## [1] 300
If your project needs to choose configurations dynamically (e.g., based on an environment variable), you can create a helper function:
select_config <- function() {
  env <- Sys.getenv("APP_ENV", unset = "default")
  cfg <- get_config(system.file("extdata", "config.yml", package = "rconf"), config_name = env)
  cfg
}
# Example usage:
Sys.setenv(APP_ENV = "development")
current_cfg <- select_config()
print(current_cfg)
## $default
## $default$raw_data_dir
## [1] "/data/proteomics/raw"
##
## $default$processed_data_dir
## [1] "/data/proteomics/processed"
##
## $default$sample_metadata
## [1] "/data/proteomics/metadata/samples.csv"
##
## $default$normalization_method
## [1] "median"
##
## $default$quantification
## [1] "LFQ"
##
## $default$protein_fdr
## [1] 0.01
##
## $default$differential_expression
## [1] TRUE
##
## $default$p_value_cutoff
## [1] 0.05
##
## $default$plot
## $default$plot$output_dir
## [1] "results/plots"
##
## $default$plot$format
## [1] "png"
##
## $default$plot$dpi
## [1] 300
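A typo in `APP_ENV` could otherwise go unnoticed, so it can help to validate the name before loading. The sketch below only resolves the name; its result would then be passed as `config_name`. `resolve_env` and the set of allowed names are assumptions made for illustration.

```r
# Hypothetical helper: resolve the configuration name from APP_ENV.
resolve_env <- function(allowed = c("default", "development", "production")) {
  env <- Sys.getenv("APP_ENV", unset = "default")
  if (!env %in% allowed) {
    warning(sprintf("Unknown APP_ENV '%s'; falling back to 'default'", env))
    env <- "default"
  }
  env
}

# A recognised name is returned unchanged.
Sys.setenv(APP_ENV = "development")
known <- resolve_env()

# An unrecognised name falls back to "default" (warning suppressed here).
Sys.setenv(APP_ENV = "staging")
fallback <- suppressWarnings(resolve_env())

print(c(known, fallback))
```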
Empty or Missing Values: Ensure that your YAML file does not contain only comments or blank lines; otherwise, rconf will return an empty list.
Parsing Errors: Double-check your YAML syntax (e.g., colon-separated key-value pairs, consistent indentation). The parser assumes a 2-space indentation for nested keys.
Overriding Behavior: When merging configurations, note that nested lists are replaced rather than deeply merged. Use a custom merge function if you require deep merging.
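If deep merging is needed, a recursive merge is short to write. `deep_merge` below is a hypothetical helper, not part of rconf, demonstrated on hand-built lists. Base R's `utils::modifyList()` behaves similarly for named lists.

```r
# Hypothetical helper: recursively merge `override` into `base`.
deep_merge <- function(base, override) {
  for (key in names(override)) {
    if (is.list(base[[key]]) && is.list(override[[key]])) {
      # Both sides are lists: descend and merge their elements.
      base[[key]] <- deep_merge(base[[key]], override[[key]])
    } else {
      # Otherwise the override value wins.
      base[[key]] <- override[[key]]
    }
  }
  base
}

base <- list(plot = list(output_dir = "results/plots", format = "png", dpi = 300))
override <- list(plot = list(format = "pdf"))

# Shallow replacement: the whole `plot` list is swapped, losing output_dir and dpi.
shallow <- base
shallow[names(override)] <- override

# Deep merge: only `format` changes, the other nested keys are kept.
merged <- deep_merge(base, override)
str(merged)
```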