Introduction to lorbridge: Bridging Log-Odds Ratios and Correspondence Analysis

Why lorbridge?

Clinical and medical researchers routinely report odds ratios (ORs) from logistic regression as their primary measure of association. An OR of 1.83, for example, means the odds of the outcome are 83% higher for each one-unit increase in the predictor. That is a statement about odds rather than probabilities, and it takes statistical training to interpret intuitively.
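To see why the odds scale is hard to read, the short sketch below applies the OR of 1.83 to a baseline probability. The baseline of 0.20 is a hypothetical value chosen only for illustration; it does not come from the package or its data.

```r
# How an odds ratio shifts a probability (hypothetical baseline, illustration only)
or        <- 1.83
p_base    <- 0.20                      # hypothetical baseline probability
odds_base <- p_base / (1 - p_base)     # probability -> odds
odds_new  <- odds_base * or            # apply the odds ratio
p_new     <- odds_new / (1 + odds_new) # odds -> probability

pct_change_odds <- 100 * (or - 1)               # percent change in odds
pct_change_prob <- 100 * (p_new - p_base) / p_base  # percent change in probability

cat(sprintf("Odds change: %.0f%%  |  Probability: %.3f -> %.3f\n",
            pct_change_odds, p_base, p_new))
```

The odds rise by 83%, but the probability rises by a smaller relative amount, and that amount depends on the baseline. This is the gap a bounded, correlation-like metric is meant to close.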

lorbridge provides a formal mathematical bridge (Kim & Grochowalski, 2019) that re-expresses log-odds ratios (LORs) as cosine theta, a metric bounded between −1 and +1 that can be read like a Pearson correlation. The package also extends this bridge into singly-ordered (SONSCA) and doubly-ordered (DONSCA) nonsymmetric correspondence analysis, giving researchers visual geometric maps alongside their regression results.

Dataset

The package includes lorbridge_data, an individual-level dataset (N = 900) with four variables: Vocabulary Meaning scores (VM), the same scores binned into six categories (VMbin), a binary minority/majority indicator (minority), and a race label (Race):

library(lorbridge)
data(lorbridge_data)
str(lorbridge_data)
#> 'data.frame':    900 obs. of  4 variables:
#>  $ VM      : int  54 59 62 63 63 63 63 63 63 65 ...
#>  $ VMbin   : Factor w/ 6 levels "VM4","VM1","VM2",..: 2 2 2 2 2 2 2 2 2 3 ...
#>  $ minority: int  1 1 1 0 0 0 0 1 1 1 ...
#>  $ Race    : chr  "Race2" "Race1" "Race1" "Race4" ...

Subprogram 1: Binary Logistic Regression

1a. Continuous predictor (VM per 1 SD)

res_1a <- blr_continuous(
  outcome   = lorbridge_data$minority,
  predictor = lorbridge_data$VM
)
print(res_1a$summary_table[, c("LOR","OR","OR_lo","OR_hi","p",
                                "Nagelkerke_R2","YuleQ","r_meta")],
      digits = 4)
#>       LOR    OR OR_lo  OR_hi        p Nagelkerke_R2   YuleQ  r_meta
#> 1 -0.4339 0.648 0.561 0.7484 3.57e-09       0.05504 -0.2136 -0.1188

Plain-English interpretation: A one-standard-deviation increase in VM score is associated with an OR of approximately 0.65 for minority membership, that is, about 35% lower odds. The LOR of −0.43 translates to an r_meta of approximately −0.12 on the familiar −1 to +1 scale: a small but statistically reliable negative association (p < .001).
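The relationship between the LOR and the other columns can be sketched with standard textbook conversions: Yule's Q and Y from the OR, and a logit-based conversion to a standardized mean difference followed by the usual d-to-r formula. The helper lor_to_metrics is a hypothetical name, and the assumption that blr_continuous uses these exact formulas is not confirmed by the package. The formulas do reproduce the OR, YuleQ, and r_meta values printed above to the digits shown.

```r
# Standard effect-size conversions from a log-odds ratio (LOR).
# Assumption: textbook formulas, not necessarily the package's internals.
lor_to_metrics <- function(lor) {
  or     <- exp(lor)                          # odds ratio
  yule_q <- (or - 1) / (or + 1)               # Yule's Q
  yule_y <- (sqrt(or) - 1) / (sqrt(or) + 1)   # Yule's Y
  d      <- lor * sqrt(3) / pi                # logit-based standardized difference
  r      <- d / sqrt(d^2 + 4)                 # d-to-r, assuming equal group sizes
  data.frame(LOR = lor, OR = or, YuleQ = yule_q, YuleY = yule_y, r = r)
}

m <- lor_to_metrics(-0.4339)
print(round(m, 4))
```

All of these measures share the sign of the LOR and equal zero when OR = 1, which is why they can be read as direction and strength on a common bounded scale.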

1b. Categorical predictor (VM bins, VM4 as reference)

res_1b <- blr_categorical(
  outcome   = lorbridge_data$minority,
  predictor = lorbridge_data$VMbin,
  ref_level = "VM4"
)
print(res_1b$results[, c("Category","LOR","OR","p","YuleQ","r_meta","cos_theta")],
      digits = 4)
#>   Category     LOR     OR         p   YuleQ   r_meta cos_theta
#> 1      VM1  1.0290 2.7982 1.306e-01  0.4734  0.27288         1
#> 2      VM2  1.0016 2.7225 6.187e-05  0.4627  0.26614         1
#> 3      VM3  0.5041 1.6555 1.197e-03  0.2468  0.13764         1
#> 4      VM5 -0.3643 0.6947 2.144e-01 -0.1801 -0.09991        -1
#> 5      VM6  1.4990 4.4771 8.621e-02  0.6348  0.38189         1

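Note: With a binary outcome the contingency table has only two rows, so the correspondence analysis solution is one-dimensional and every cosine theta is exactly ±1. The sign carries the substantive information: positive means the minority group is over-represented relative to VM4; negative means it is under-represented. Strength of association is read from LOR, YuleQ, and r_meta instead.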
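The ±1 result follows from geometry alone: in one dimension, two nonzero coordinates either point the same way or opposite ways. A minimal sketch with hypothetical coordinates (the values below are made up and are not output from the package):

```r
# Cosine of the angle between two coordinate vectors
cosine <- function(u, v) sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2)))

# Hypothetical one-dimensional coordinates (illustrative values only)
row_coord  <- 0.42
col_coords <- c(VM1 = 0.80, VM3 = 0.15, VM5 = -0.30)

# In 1D the cosine reduces to the sign of the product of the two coordinates
cos_1d <- sapply(col_coords, function(v) cosine(row_coord, v))
print(round(cos_1d, 6))
```

Every value is +1 or −1 up to floating-point rounding, regardless of how large or small the coordinates are.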

Subprogram 2: SONSCA

Singly-Ordered Nonsymmetric Correspondence Analysis (SONSCA) is applied to the race-by-IQ and race-by-VM contingency tables, with Race2 as the row anchor and IQ4 (or VM4) as the column anchor. The code in this section uses the IQ table, tab_IQ.

data(tab_IQ)
row_anchor <- "Race2"
col_anchor <- "IQ4"
races      <- setdiff(rownames(tab_IQ), row_anchor)
bins       <- setdiff(colnames(tab_IQ), col_anchor)

# Pairwise CCMs for Race1 vs Race2 at IQ1 vs IQ4
sonsca_ccm(tab_IQ, row_k = "Race1", bin_j = "IQ1",
           row_anchor = row_anchor, col_anchor = col_anchor)
#>    Race Bin       OR     OR_lo    OR_hi        LOR     LOR_lo    LOR_hi
#> 1 Race1 IQ1 1.028571 0.4805975 2.201341 0.02817088 -0.7327251 0.7890669
#>        YuleQ      Q_lo      Q_hi       YuleY       Y_lo      Y_hi      r_meta
#> 1 0.01408451 -0.350806 0.3752619 0.007042603 -0.1811595 0.1947471 0.007765475
#>         r_lo      r_hi
#> 1 -0.1979878 0.2125476

# SONSCA coordinates and cosine theta matrix
sc  <- sonsca_coords(tab_IQ)
cos_s <- sonsca_cosines(sc$row_coords, sc$col_coords,
                        row_anchor = row_anchor,
                        col_anchor = col_anchor)
round(cos_s[races, bins], 3)
#>          IQ1    IQ2    IQ3    IQ5    IQ6
#> Race1  0.024  0.039 -0.143 -0.080 -0.474
#> Race3 -0.311 -0.271 -0.367  0.433  0.450
#> Race4 -0.853 -0.846 -0.928  0.798  0.304

pct <- inertia_pct(tab_IQ)
cat(sprintf("Dimension 1: %.1f%%  |  Dimension 2: %.1f%%\n", pct[1], pct[2]))
#> Dimension 1: 83.3%  |  Dimension 2: 16.5%
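Percentages of this kind are typically obtained from the squared singular values of a centred, weighted profile matrix. The sketch below follows one common formulation of nonsymmetric correspondence analysis, in which column profiles are centred by the row masses and weighted by the column masses. The table is hypothetical, and both the choice of predictor variable and the assumption that inertia_pct works this way are unconfirmed.

```r
# Hypothetical 3 x 4 contingency table (counts made up for illustration)
tab <- matrix(c(20, 15, 10,  5,
                10, 20, 20, 10,
                 5, 10, 15, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("R", 1:3), paste0("C", 1:4)))

P      <- tab / sum(tab)   # correspondence matrix
r_mass <- rowSums(P)       # row masses p_i.
c_mass <- colSums(P)       # column masses p_.j

# Centred column profiles: p_ij / p_.j - p_i.
Pi <- sweep(P, 2, c_mass, "/") - r_mass

# Weight each column by sqrt(p_.j), then take singular values
sv <- svd(sweep(Pi, 2, sqrt(c_mass), "*"))$d

total_inertia <- sum(sv^2)             # numerator of Goodman-Kruskal tau
pct           <- 100 * sv^2 / total_inertia
print(round(pct, 1))
```

Because each column of the centred profiles sums to zero, a table with three rows has at most two nonzero dimensions, so the last percentage is zero up to rounding.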

Subprogram 3: DONSCA

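Doubly-Ordered Nonsymmetric Correspondence Analysis (DONSCA) is applied to the 6 × 6 IQ × VM table, with IQ4 and VM4 (index 4 on each margin) as the row and column anchors. The result has one cosine theta per non-anchor row and column pair.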

data(tab_IQ_VM)
fit <- donsca_fit(tab_IQ_VM)
cos_d <- donsca_cosines(fit, col_anchor_idx = 4, row_anchor_idx = 4)
head(cos_d, 6)
#>   Row Col cos_theta
#> 1 IQ1 VM1 0.8045030
#> 2 IQ1 VM2 0.7295621
#> 3 IQ1 VM3 0.2470193
#> 4 IQ1 VM5 0.1681869
#> 5 IQ1 VM6 0.2093072
#> 6 IQ2 VM1 0.4604104

Multinomial logistic regression with CCMs

data(lorbridge_data)
# Illustrative setup only: VM (numeric) is the predictor and VMbin stands in
# for the outcome. In practice, use IQ bins as the outcome and VM as the
# predictor, per the paper.
data(tab_IQ_VM)

# Build long-format data from tab_IQ_VM for multinomial logit
vm_vals <- c(54,59,62,63,65,67,69,71,73,74,76,78,80,81,82,84,85,86,87,89,
             90,92,93,95,96,98,100,101,103,104,105,107,108,110,112,113,
             115,117,119,121,123,125,126,128,130,132,134,136,138,139,
             143,147,149)
rows6 <- paste0("IQ", 1:6)
# (Full X_wide matrix omitted here for brevity — see unified analysis script)

Key References

Kim, S.-K., & Grochowalski, J. H. (2019). Gaining from discretization of continuous data: The correspondence analysis biplot approach. Behavior Research Methods, 51(2), 589–601. https://doi.org/10.3758/s13428-018-1161-1

Kim, S.-K. (2020). Test treatment effect differences in repeatedly measured symptoms with binary values: The matched correspondence analysis approach. Behavior Research Methods, 52, 1480–1490.

Kim, S.-K. (2024). Factorization of person response profiles to identify summative profiles carrying central response patterns. Psychological Methods, 29(4), 723–730.