
fastbioclim is an R package for creating custom-time
bioclimatic and derived environmental summary variables from supplied
raster data. It is designed to overcome computational bottlenecks and
methodological inflexibility by automatically switching between
processing frameworks to handle large-scale extents on standard
hardware.

Working with large climate datasets often presents a major challenge: the data are too large to fit into memory. fastbioclim addresses this problem with an efficient, unified interface that automatically switches between two frameworks:
- In-memory ("terra"): For smaller datasets, fastbioclim uses the terra package to maximize speed.
- On-disk ("tiled"): For rasters that exceed available RAM, it automatically switches to an on-disk tiling framework. This approach leverages the high-performance Rfast and exactextractr packages to process data in chunks, ensuring scalability on standard personal computers without requiring the end user to manage these decisions.

The core of the package is a pair of flexible functions, derive_bioclim() and derive_statistics(), that automatically select the optimal processing framework, providing a seamless experience for generating temporally matched environmental variables.
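
To make the switching idea concrete, here is a minimal sketch of how a size-based decision between the two frameworks could look. It is not the package's actual logic: `choose_method()`, its arguments, the 8 bytes per value, and the 50% safety margin are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of fastbioclim: pick a workflow from a rough
# estimate of how many bytes the input values would occupy in memory.
choose_method <- function(n_cells, n_layers, available_bytes,
                          n_inputs = 3, bytes_per_value = 8,
                          safety_fraction = 0.5) {
  # Bytes needed to hold every input value as a double (assumed 8 bytes each)
  needed_bytes <- as.numeric(n_cells) * n_layers * n_inputs * bytes_per_value
  # Use the in-memory path only if the estimate fits in a fraction of the RAM
  if (needed_bytes <= available_bytes * safety_fraction) "terra" else "tiled"
}

ram <- 8 * 1024^3  # assume 8 GiB of available memory

# 1000 x 1000 cells, 12 monthly layers, 3 input variables: about 288 MB
small <- choose_method(1e6, 12, available_bytes = ram)
# 40000 x 40000 cells with the same layers: about 460 GB
large <- choose_method(1.6e9, 12, available_bytes = ram)

print(c(small = small, large = large))
```

With these assumed numbers the small raster is routed to "terra" and the large one to "tiled". The real package may weigh other factors when making this choice.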
- auto method: Automatically manages memory by switching between the in-memory ("terra") and on-disk ("tiled") frameworks based on raster size and available RAM.
- Use the derive_statistics() function to compute summary statistics (mean, max, min, standard deviation, etc.) for any other time series data available (e.g., wind speed, evapotranspiration, cloud cover).

You can install the development version of fastbioclim from GitHub with:
# install.packages("remotes")
remotes::install_github("gepinillab/fastbioclim")
# Install to get the package example data
remotes::install_github("gepinillab/egdata.fastbioclim")

The package provides two core functions for variable calculation:
- derive_bioclim(): For calculating the standard and extended set of 35 bioclimatic variables.
- derive_statistics(): For deriving summary statistics from any other environmental variable.

Note: The package also includes aggregation functions such as calculate_average(), calculate_roll(), and calculate_sum() to prepare your time-series data.
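
As a rough illustration of what these two functions compute, the sketch below works through the arithmetic for a single grid cell in base R. The monthly values are made up, and the code is not taken from the package. It uses the common definitions of three bioclimatic variables (annual mean temperature, temperature annual range, annual precipitation) and R's sample standard deviation, which may differ from the estimator the package uses.

```r
# Monthly values for one hypothetical grid cell (made-up numbers)
tmin <- c(10, 10, 11, 12, 13, 14, 14, 14, 13, 12, 11, 10)          # deg C
tmax <- tmin + 10                                                   # deg C
prcp <- c(120, 110, 130, 150, 90, 40, 20, 15, 30, 80, 100, 110)     # mm
wind <- c(2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 4.0, 3.5, 3.0, 2.5, 2.0, 1.5)  # m/s

# Bioclimatic variables: summaries of the monthly climate series
tavg  <- (tmin + tmax) / 2          # monthly mean temperature
bio01 <- mean(tavg)                 # annual mean temperature
bio07 <- max(tmax) - min(tmin)      # temperature annual range
bio12 <- sum(prcp)                  # annual precipitation

# Summary statistics for any other variable, here wind speed
wind_stats <- c(mean = mean(wind), max = max(wind), stdev = sd(wind))

print(c(bio01 = bio01, bio07 = bio07, bio12 = bio12))
print(wind_stats)
```

derive_bioclim() and derive_statistics() apply this kind of per-cell summary across every cell of the input rasters.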
This example demonstrates the core functionality using the small Ecuador example dataset from the egdata.fastbioclim package.
library(fastbioclim)
library(terra)
library(future.apply)
library(progressr)
# Load the monthly example rasters for Ecuador
tmin_ecu <- system.file("extdata/ecuador/", package = "egdata.fastbioclim") |>
  list.files("tmin", full.names = TRUE) |> rast()
tmax_ecu <- system.file("extdata/ecuador/", package = "egdata.fastbioclim") |>
  list.files("tmax", full.names = TRUE) |> rast()
prcp_ecu <- system.file("extdata/ecuador/", package = "egdata.fastbioclim") |>
  list.files("prcp", full.names = TRUE) |> rast()
# The function will automatically use the fast in-memory "terra" method for this small dataset
output_dir_bioclim <- file.path(tempdir(), "bioclim_ecuador")
bioclim_vars <- derive_bioclim(
  bios = 1:19,
  tmin = tmin_ecu,
  tmax = tmax_ecu,
  prcp = prcp_ecu,
  output_dir = output_dir_bioclim,
  overwrite = TRUE
)
plot(bioclim_vars[[c("bio01", "bio12")]])

# Derive environmental summary variables for a different factor (e.g., wind speed)
wind_rast <- system.file("extdata/ecuador/", package = "egdata.fastbioclim") |>
  list.files("wind", full.names = TRUE) |> rast()
output_dir_custom <- file.path(tempdir(), "wind_ecuador")
custom_stats <- derive_statistics(
  variable = wind_rast,
  stats = c("mean", "max", "stdev"),
  output_prefix = "wind",
  output_dir = output_dir_custom,
  overwrite = TRUE
)
plot(custom_stats)

The main benefit of fastbioclim appears with large datasets. The method = "auto" setting in derive_bioclim() and derive_statistics() handles them automatically: when the wrapper function detects that the input rasters are too large to fit in memory, it switches to the tiled workflow.
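
The idea behind a tiled workflow can be sketched in base R: cover the study area with tiles of a fixed size in degrees, process one tile at a time, and combine the results. The helper below is hypothetical and is not fastbioclim's internal code. `make_tiles()` and the bounding box are assumptions chosen only to show how a tile size of 20 degrees partitions an extent.

```r
# Hypothetical helper, NOT fastbioclim's internal code: cover a bounding box
# with square tiles of `tile_degrees`, clipping the last row and column.
make_tiles <- function(xmin, xmax, ymin, ymax, tile_degrees) {
  # Lower-left corners; the tiny offset avoids a zero-width tile when the
  # extent is an exact multiple of the tile size
  x0 <- seq(xmin, xmax - 1e-9, by = tile_degrees)
  y0 <- seq(ymin, ymax - 1e-9, by = tile_degrees)
  grid <- expand.grid(xmin = x0, ymin = y0)
  data.frame(
    xmin = grid$xmin,
    xmax = pmin(grid$xmin + tile_degrees, xmax),
    ymin = grid$ymin,
    ymax = pmin(grid$ymin + tile_degrees, ymax)
  )
}

# An arbitrary 90 x 90 degree bounding box split into 20-degree tiles
tiles <- make_tiles(-120, -30, -60, 30, tile_degrees = 20)

# Each tile would be read, summarised, and written on its own, so only one
# tile's worth of data needs to be in memory at a time
tile_area <- (tiles$xmax - tiles$xmin) * (tiles$ymax - tiles$ymin)

print(nrow(tiles))
print(sum(tile_area))
```

Smaller tiles lower peak memory use at the cost of more read and write operations, which is the trade-off a tile-size setting controls in a scheme like this.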
Important Requirement: For the tiled workflow to
function, your input SpatRaster objects must be pointing to
files on disk, not held entirely in memory.
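
One way to check this requirement before calling the package is sketched below with terra. `ensure_on_disk()` is a hypothetical helper, not a fastbioclim function. It relies on terra's inMemory(), writeRaster(), rast(), and sources(). The usage part runs only if terra is installed.

```r
# Hypothetical helper, NOT part of fastbioclim: return a file-backed version
# of a SpatRaster if any of its values are held in memory.
ensure_on_disk <- function(x, filename) {
  if (any(terra::inMemory(x))) {
    terra::writeRaster(x, filename, overwrite = TRUE)
    x <- terra::rast(filename)  # re-open so the object points to the file
  }
  x
}

if (requireNamespace("terra", quietly = TRUE)) {
  # A small 12-layer raster created and filled entirely in memory
  r_mem <- terra::rast(nrows = 10, ncols = 10, nlyrs = 12)
  terra::values(r_mem) <- runif(terra::ncell(r_mem) * terra::nlyr(r_mem))

  r_disk <- ensure_on_disk(r_mem, tempfile(fileext = ".tif"))

  print(terra::inMemory(r_mem))   # in-memory source
  print(terra::inMemory(r_disk))  # file-backed source
  print(terra::sources(r_disk))   # path of the backing file
}
```

Rasters created with rast() from file paths, as in the examples in this document, already point to files on disk and need no conversion.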
# Conceptual example for large, file-based rasters
tmin_neo <- system.file("extdata/neotropics/", package = "egdata.fastbioclim") |>
  list.files("tmin", full.names = TRUE) |> rast()
tmax_neo <- system.file("extdata/neotropics/", package = "egdata.fastbioclim") |>
  list.files("tmax", full.names = TRUE) |> rast()
prcp_neo <- system.file("extdata/neotropics/", package = "egdata.fastbioclim") |>
  list.files("prcp", full.names = TRUE) |> rast()
output_dir_bios <- file.path(tempdir(), "bioclim_neotropics")
# Optional: ACTIVATE PROGRESS BAR
progressr::handlers(global = TRUE)
# Optional: DEFINE PARALLEL PLAN FOR EVEN FASTER PROCESSING
future::plan("multisession", workers = 4)
# The call is identical. `derive_bioclim` will detect the large file size
# and automatically use the memory-safe tiled method.
large_scale_vars <- derive_bioclim(
  bios = 1:19,
  tmin = tmin_neo,
  tmax = tmax_neo,
  prcp = prcp_neo,
  output_dir = output_dir_bios,
  tile_degrees = 20,
  overwrite = TRUE
)
print(large_scale_vars)
plot(large_scale_vars[["bio11"]])

This R package is currently under active development. While it is functional, it may contain bugs, and its API may change.
Contributions, bug reports, and feature requests are highly encouraged. Please open an issue on our GitHub repository to provide feedback.