---
title: "Bias surface"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Bias surface}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 10,
  fig.height = 6,
  dpi = 100,
  out.width = "95%"
)
```

<hr>

# Summary

-   [Description](#description)
-   [Getting ready](#getting-ready)
-   [Preparing the Bias Layer](#preparing-the-bias-layer)
    -   [Function Arguments](#function-arguments)
    -   [Direct vs. Inverse Bias Preparation](#direct-vs.-inverse-bias-preparation)
-   [Applying Bias to Predictions](#applying-bias-to-predictions)
    -   [Function Arguments](#function-arguments)
    -   [Comparing Applied Biases](#comparing-applied-biases)
    -   [Three-Dimensional Example](#sec-three-dimensional-example)
-   [Save and export](#save-and-export)

<hr>

# Description {#description}

Real-world biodiversity data is rarely collected systematically. Biologists often sample near roads, rivers, or research stations, creating spatial clustering that does not necessarily reflect optimal habitat.

The `nicheR` package allows you to simulate these spatial sampling biases. This workflow involves two steps:

1.  **`prepare_bias()`**: Standardizing raw environmental or anthropocentric covariates into probability-scaling surfaces.

2.  **`apply_bias()`**: Mathematically fusing those bias surfaces with your habitat suitability predictions.

<br>

# Getting ready {#getting-ready}

First, we load the core packages required for our spatial and niche operations. For this vignette, we assume you have already defined a nicheR_ellipsoid object.

```{r get_ready, results='hide', message=FALSE, warning=FALSE}
library(nicheR)
library(terra)

# Saving original plotting parameters
original_par <- par(no.readonly = TRUE)

# 1. Load reference niche (nicheR_ellipsoid object)
data("ref_ellipse", package = "nicheR")

# 2. Load pre-calculated prediction surfaces (from previous vignettes)
# These SpatRasters contain "suitability", "Mahalanobis", "suitability_trunc", etc.
pred <- terra::rast(system.file("extdata", "predictions_rast.tif", package = "nicheR"))
pred_3d <- terra::rast(system.file("extdata", "predictions_3d_rast.tif", package = "nicheR"))
```

<br>

# Preparing the Bias Layer {#preparing-the-bias-layer}

Raw bias proxies (e.g., species richness, nighttime lights, distance to roads) come in varying scales and units. `prepare_bias()` resamples, aligns, and min-max standardizes these layers to a strict [0, 1] scale. If multiple layers are provided, it allows you to assign unique directional effects to each before multiplying them into a single composite surface.

```{r, results='hide', message=FALSE, warning=FALSE}
# Load a sample bias layer containing 'sp_richness' and 'nighttime'
biases_file <- system.file("extdata", "ma_biases.tif", package = "nicheR")
raw_bias <- terra::rast(biases_file)

# --- Plotting the Output ---
par(mfrow = c(1, 2), mar = c(4, 4, 3, 2))

# Plot the raw inputs
terra::plot(raw_bias[["sp_richness"]], main = "Species Richness")
terra::plot(raw_bias[["nighttime"]], main = "Nighttime Lights")
```

<br>

## Function Arguments {#function-arguments}

-   **`bias_surface`**: A `SpatRaster` (single or multi-layer) or a list of `SpatRaster` objects representing your raw bias proxies.

-   **`effect_direction`**: Character vector dictating how the raw values map to sampling probability:

    -   `"direct"`: Higher raw values = higher sampling probability (kept as is).

    -   `"inverse"`: Higher raw values = lower sampling probability (transformed to `1 - x`).

    -   *Note: If providing a multi-layer raster, you can pass a vector of directions mapping to each respective layer.*

-   **`template_layer`**: Optional `SpatRaster`. The reference layer used to align all bias layers (matching resolution, extent, and CRS). If `NULL`, the algorithm automatically selects the finest-resolution layer from your input.

-   **`include_composite`**: Logical. If `TRUE` (default), the function outputs the combined, multiplied composite bias surface.

-   **`include_processed_layers`**: Logical. If `TRUE`, includes the standardized individual layers *before* they are multiplied together. Default is `FALSE`.

-   **`mask_na`**: Logical. Controls how `NA` (NoData) pixels are handled across multiple layers:

    -   `TRUE` (Intersection): If a pixel is `NA` in *any* layer, the final composite pixel becomes `NA`.

    -   `FALSE` (Union - default): Ignores `NA`s where other layers have valid values, preserving as much spatial coverage as possible.

-   **`verbose`**: Logical. If `TRUE`, prints processing steps to the console.

<br>

## Example: Mixed-Direction Composite Bias

In this example, our `raw_bias` raster contains two layers: `sp_richness` and `nighttime` (nighttime lights/urbanization).

Ecologically, sampling effort is likely highest in areas with known high species richness, but lowest in highly urbanized areas. Therefore, we will assign a `"direct"` effect to richness and an `"inverse"` effect to nighttime lights to create a realistic composite bias surface.

```{r, results='hide', message=FALSE, warning=FALSE}
# Prepare a composite bias surface mapping unique directions to each layer
prep_composite <- prepare_bias(bias_surface = raw_bias, 
                               effect_direction = c("direct", "inverse"), 
                               verbose = FALSE)
# Plot the resulting unified bias probability surface
terra::plot(prep_composite$composite_surface, main = "Composite Bias Surface")
```

*Notice how the final composite surface creates "hotspots" for sampling in areas that feature both high species richness AND low urbanization, accurately reflecting the combined `effect_direction` parameter.*

<br>

# Applying Bias to Predictions {#applying-bias-to-predictions}

Once your bias surface is scaled between 0 and 1, you must integrate it with your actual ecological prediction (e.g., habitat suitability). `apply_bias()` aligns the grids and multiplies the suitability by the bias. **The resulting output is a weighted surface, not a true biological probability.**

<br>

## Function Arguments

-   **`prepared_bias`**: A single-layer `SpatRaster`, or the list output generated by `prepare_bias()`.

-   **`prediction`**: A `SpatRaster` containing your suitability predictions. **Values must be within [0, 1]**.

-   **`prediction_layer`**: Character. If your `prediction` raster has multiple layers, specify the exact name of the layer to use (e.g., `"suitability"`). If `NULL`, it defaults to the first layer.

-   **`effect_direction`**: Character. Dictates how the prepared bias is applied to the suitability:

    -   `"direct"` (default): Multiplies suitability $\times$ bias.

    -   `"inverse"`: Multiplies suitability $\times$ $(1 - \text{bias})$.

-   **`verbose`**: Logical. If `TRUE`, prints processing steps to the console.

<br>

## Example: Comparing Applied Biases

Below, we take a standard suitability prediction and restrict it using the composite bias layer we just prepared. We will apply it as a `"direct"` multiplier, meaning areas with high composite bias scores will retain their suitability, while areas with low bias scores will be penalized.

```{r}
# Apply the composite bias to our suitability layer
applied_bias <- apply_bias(prepared_bias = prep_composite, 
                           prediction = pred, 
                           prediction_layer = "suitability",
                           effect_direction = "direct")

# --- Plotting the Output ---
par(mfrow = c(1, 2), mar = c(4, 4, 3, 2))

# Original Biological Suitability
terra::plot(pred[["suitability"]], main = "Habitat Suitability")

# Suitability mathematically restricted by our composite sampling bias
terra::plot(applied_bias[[1]], main = "Suitability + Composite Bias")
```

*In the final comparison, observe how the spatial footprint of the species shrinks based on the bias layer. When drawing points from the `applied_bias` raster using `sample_biased_data()`, the algorithm is forced to ignore large swaths of highly suitable habitat simply because the simulated sampling effort (driven by nighttime lights and richness) in those areas is too low.*

## Three-Dimensional Example {#sec-three-dimensional-example}

Because collection bias is fundamentally geographic, the process of applying it works identically regardless of how many environmental dimensions were used to define the fundamental niche. Here, we apply the exact same composite bias surface to our 3-dimensional species prediction.

```{r}
# Apply the composite bias to our 3D suitability layer
applied_bias_3d <- apply_bias(prepared_bias = prep_composite, 
                              prediction = pred_3d, 
                              prediction_layer = "suitability",
                              effect_direction = "direct")

# --- Plotting the Output ---
par(mfrow = c(1, 2), mar = c(4, 4, 3, 2))

terra::plot(pred_3d[["suitability"]], main = "3D Habitat Suitability")
terra::plot(applied_bias_3d[[1]], main = "3D Suitability + Composite Bias")
```

```{r par_reset}
# Reset plotting parameters
par(original_par)
```

# Save and export {#save-and-export}

```{r}
# Save the final biased prediction layers to a temporary directory
temp_rast <- file.path(tempdir(), "applied_bias_rast.tif")
temp_rast_3d <- file.path(tempdir(), "applied_bias_3d_rast.tif")

terra::writeRaster(applied_bias[[1]], filename = temp_rast, overwrite = TRUE)
terra::writeRaster(applied_bias_3d[[1]], filename = temp_rast_3d, overwrite = TRUE)
```
