Type: Package
Title: Compositional Statistical Framework for RNA Fractionation Analysis
Version: 1.0.0
Date: 2025-10-16
Description: A compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data. 'FracFixR' addresses the fundamental challenge in fractionated RNA-seq experiments where library preparation and sequencing depth obscure the original proportions of RNA fractions. It reconstructs original fraction proportions using non-negative linear regression, estimates the "lost" unrecoverable fraction, corrects individual transcript frequencies, and performs differential proportion testing between conditions. Supports any RNA fractionation protocol including polysome profiling, sub-cellular localization, and RNA-protein complex isolation.
License: CC BY 4.0
Encoding: UTF-8
LazyData: true
Depends: R (≥ 4.0.0)
Imports: future.apply (≥ 1.8.1), nnls (≥ 1.4), ggplot2 (≥ 3.3.5), dplyr (≥ 1.0.7), RColorBrewer (≥ 1.1-2), tidyr (≥ 1.1.3), matrixStats (≥ 0.60.0), aod (≥ 1.3.1), stats, utils, future, grDevices
Suggests: testthat (≥ 3.0.0), knitr, rmarkdown
RoxygenNote: 7.3.2
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2025-10-16 02:59:56 UTC; acleynen
Author: Alice Cleynen ORCID iD [aut, cre], Agin Ravindran [aut], Nikolay Shirokikh ORCID iD [aut]
Maintainer: Alice Cleynen <alice.cleynen@cnrs.fr>
Repository: CRAN
Date/Publication: 2025-10-21 17:40:02 UTC

FracFixR: Fraction Correction Framework for RNA-seq Data

Description

A compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data. FracFixR addresses the fundamental challenge in fractionated RNA-seq experiments where library preparation and sequencing depth obscure the original proportions of RNA fractions.

Author(s)

Maintainer: Alice Cleynen alice.cleynen@cnrs.fr (ORCID)

Authors:


DiffPropTest: Statistical Testing for Differential Proportions

Description

Performs statistical testing to identify transcripts with significantly different proportions between conditions in specified fraction(s). Implements three test options: GLM (most powerful), Logit, and Wald.

Usage

DiffPropTest(NormObject, Conditions, Types, Test = c("GLM", "Logit", "Wald"))

Arguments

NormObject

Output from FracFixR() function

Conditions

Character vector of exactly 2 conditions to compare

Types

Character vector of fraction type(s) to analyze. Can be single fraction or multiple (will be combined)

Test

Statistical test to use: "GLM", "Logit", or "Wald"

Details

Value

Data frame with columns:

Examples

data(example_counts)
data(example_annotation)

# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)
# Run differential testing
diff_results <- DiffPropTest(results,
                            Conditions = c("Control", "Treatment"),
                            Types = "Heavy_Polysome",
                            Test = "GLM")


FracFixR: Main Function for Fraction Correction

Description

This is the core function that implements the FracFixR framework. It takes raw count data from total and fractionated samples and reconstructs the original fraction proportions through compositional modeling.

Usage

FracFixR(MatrixCounts, Annotation, parallel = TRUE)

Arguments

MatrixCounts

A numeric matrix of raw transcript/gene counts with:

  • Rows: transcripts/genes (must have rownames)

  • Columns: samples (must have colnames matching Annotation$Sample)

Annotation

A data.frame with required columns:

  • Sample: sample identifiers matching column names in MatrixCounts

  • Condition: experimental condition (e.g., "Control", "Treatment")

  • Type: fraction type (must include at least one "Total")

  • Replicate: replicate identifier

parallel

A boolean indicating whether to use parallel processing of the transcripts (default=TRUE).

Details

The function works by:

  1. Filtering transcripts based on presence in Total samples

  2. For each condition and replicate, fitting NNLS regression

  3. Estimating global fraction weights and individual transcript proportions

  4. Calculating the "lost" unrecoverable fraction

Value

A list containing:

References

Cleynen et al. FracFixR: A compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data.

Examples

# Load example data
data(example_counts)
data(example_annotation)

# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)

# View fraction proportions
print(results$Fractions)


PlotComparison: Create Volcano Plot for Differential Results

Description

Generates avolcano plot showing transcripts with significant differential proportions between conditions.

Usage

PlotComparison(DiffPropResult, Conditions = NULL, Types = NULL, cutoff = NULL)

Arguments

DiffPropResult

Output from DiffPropTest() function

Conditions

Character vector of conditions being compared

Types

Character vector of fraction types analyzed

cutoff

Optional y-axis maximum for plot

Value

Volcano plot-type object

Examples

data(example_counts)
data(example_annotation)

# Run FracFixR
results <- FracFixR(example_counts, example_annotation,parallel=FALSE)
# Run differential testing
diff_results <- DiffPropTest(results,
                            Conditions = c("Control", "Treatment"),
                            Types = "Heavy_Polysome",
                            Test = "GLM")
# Create volcano plot
volcano <- PlotComparison(diff_results, 
                         Conditions = c("Control", "Treatment"),
                         Types = "Heavy_Polysome")


PlotFractions: Visualize Fraction Proportions

Description

Creates a stacked bar plot showing the distribution of RNA across fractions for each replicate, including the "lost" fraction.

Usage

PlotFractions(FracFixed)

Arguments

FracFixed

Output from FracFixR() function

Value

ggplot2 object showing fraction proportions

Examples

data(example_counts)
data(example_annotation)

# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)
# Create fraction plot
frac_plot <- PlotFractions(results)
# Save plot with ggsave("fractions.pdf", frac_plot, width = 10, height = 8)


ProcessReplicate: Core NNLS Regression for Individual Replicates

Description

This function implements the mathematical core of FracFixR: fitting a non-negative least squares (NNLS) regression to estimate fraction weights and correct individual transcript abundances.

Usage

ProcessReplicate(RepMat, transcriptlist)

Arguments

RepMat

Data frame with transcripts as rows, samples as columns. Must include a "Total" column representing the whole cell lysate

transcriptlist

Character vector of transcript IDs to use for regression. These should be informative transcripts in the appropriate abundance range

Details

Mathematical basis:

Total = \alpha_0 + \alpha_1 \times Fraction1 + \alpha_2 \times Fraction2 + ... + \epsilon

Where \alpha_0 represents the "lost" fraction and other \alpha_i are fraction weights

Value

List containing:


beta_binomial_wald: Beta-Binomial Wald Test

Description

Implements Wald test using beta-binomial distribution to account for overdispersion in count data. Useful when variance exceeds that expected under binomial distribution.

Usage

beta_binomial_wald(counts, successes, annotation)

Arguments

counts

Total count matrix

successes

Success count matrix (not proportions)

annotation

Sample metadata

Value

Data frame with test results for all transcripts


Example annotation data frame

Description

A data frame containing sample annotations for the example_counts matrix. Describes the experimental design with conditions, fraction types, and replicates.

Usage

example_annotation

Format

A data frame with 12 rows and 4 columns:

Sample

Sample identifier matching column names in example_counts

Condition

Experimental condition (Control or Treatment)

Type

Fraction type (Total, Light_Polysome, or Heavy_Polysome)

Replicate

Replicate identifier (Rep1 or Rep2)

Source

Simulated data generated for package examples

Examples

data(example_annotation)
head(example_annotation)
table(example_annotation$Condition, example_annotation$Type)

Example RNA-seq count matrix

Description

A matrix containing simulated RNA-seq counts for 100 genes across 12 samples. The data simulates a polysome profiling experiment with two conditions (Control and Treatment) and three fractions (Total, Light_Polysome, Heavy_Polysome).

Usage

example_counts

Format

A numeric matrix with 100 rows (genes) and 12 columns (samples):

rows

Gene identifiers (Gene1 to Gene100)

columns

Sample identifiers (Sample1 to Sample12)

Source

Simulated data generated for package examples

Examples

data(example_counts)
dim(example_counts)
head(example_counts[, 1:6])

extract_condition_matrix: Extract and Prepare Data for Statistical Testing

Description

Extracts count and proportion data for specified conditions and fraction types. Handles both single fraction and combined fraction analyses.

Usage

extract_condition_matrix(
  originalcounts,
  proportions,
  annotation,
  conditions,
  types
)

Arguments

originalcounts

Original count matrix

proportions

Proportion estimates from FracFixR

annotation

Sample annotation data frame

conditions

Vector of conditions to extract

types

Vector of fraction types to analyze

Value

List containing:


logit_diff_test: Logit-based Differential Test

Description

Alternative to GLM using logit transformation. Faster but potentially less powerful than the full GLM approach.

Usage

logit_diff_test(counts, successes, annotation)

Arguments

counts

Total count matrix

successes

Proportion matrix

annotation

Sample metadata

Value

Data frame with test results for all transcripts


Polysome profiling annotation example

Description

An alternative annotation data frame for polysome profiling experiments with monosome and polysome fractions.

Usage

polysome_annotation

Format

A data frame with 12 rows and 4 columns:

Sample

Sample identifier

Condition

Experimental condition (Control or Stress)

Type

Fraction type (Total, Monosome, or Polysome)

Replicate

Replicate identifier (Rep1 or Rep2)

Source

Simulated data generated for package examples

Examples

data(polysome_annotation)
head(polysome_annotation)

run_glm: Binomial GLM for Single Transcript

Description

Fits a binomial generalized linear model to test for differential proportions between conditions for a single transcript.

Usage

run_glm(transcript, counts, successes, sample_info)

Arguments

transcript

Transcript identifier

counts

Total count matrix

successes

Proportion matrix

sample_info

Sample metadata with Condition factor

Value

Data frame with test results for this transcript


Subcellular fractionation annotation example

Description

An annotation data frame for subcellular fractionation experiments with nuclear and cytoplasmic fractions.

Usage

subcellular_annotation

Format

A data frame with 12 rows and 4 columns:

Sample

Sample identifier

Condition

Experimental condition (WT or Mutant)

Type

Fraction type (Total, Nuclear, or Cytoplasmic)

Replicate

Replicate identifier (Rep1 or Rep2)

Source

Simulated data generated for package examples

Examples

data(subcellular_annotation)
head(subcellular_annotation)