
The dissemination of aggregate health statistics derived from clinical and administrative data carries an inherent tension between analytic utility and patient confidentiality. When reported frequencies are sufficiently small, individuals within a subgroup may be vulnerable to re-identification — particularly in stratified or cross-tabulated outputs where demographic, geographic, or clinical covariates intersect. Such vulnerabilities have prompted formal regulatory responses, culminating in data use agreement (DUA) obligations codified by federal agencies and clinical research networks alike.

Federal agencies, including CMS, AHRQ, NCI, and CDC, each maintain formal small-cell suppression requirements as a condition of data access and publication. Clinical research networks enforce similar standards: PCORnet® and PEDSnet require minimum cell-size thresholds of 11 and 5, respectively, across all distributed data queries under their data sharing agreements.

For studies spanning multiple reporting dimensions, such as
stratified demographic breakdowns or multi-site analyses, manually
identifying and suppressing all qualifying primary and complementary
cells across large tables is cumbersome and error-prone.
countmaskr automates this workflow end to end, protecting
patient privacy and helping end users meet their DUA
obligations consistently and reproducibly across
institutional data sharing pipelines.

Original counts:

| Age | N |
|---|---|
| 0 - 1 | 4 |
| 2 - 9 | 71 |
| 10 - 19 | 925 |
| 20 - 29 | 0 |
| 30 - 39 | 0 |

Masked counts:

| Age | N |
|---|---|
| 0 - 1 | <11 |
| 2 - 9 | <80 |
| 10 - 19 | 925 |
| 20 - 29 | 0 |
| 30 - 39 | 0 |
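
The masking shown in the age tables above can be sketched in a few lines of base R. This is only an illustration of primary and complementary (secondary) suppression, not countmaskr's implementation or API: the helper name `mask_small_cells()`, its `threshold` argument, and the convention of reporting the complementary cell as an upper bound at the next multiple of 10 are all assumptions chosen to reproduce the example.

```r
# Hypothetical helper for illustration only; this is NOT the countmaskr API.
mask_small_cells <- function(n, threshold = 11) {
  stopifnot(is.numeric(n), all(n >= 0))
  out <- as.character(n)

  # Primary suppression: non-zero counts below the threshold are hidden.
  primary <- n > 0 & n < threshold
  out[primary] <- paste0("<", threshold)

  # Complementary suppression: if exactly one cell was hidden, it could be
  # recovered by subtracting the visible cells from a published total, so
  # the smallest remaining cell at or above the threshold is coarsened too.
  if (sum(primary) == 1) {
    candidates <- which(n >= threshold)
    if (length(candidates) > 0) {
      j <- candidates[which.min(n[candidates])]
      # Assumed convention: upper bound at the next multiple of 10.
      out[j] <- paste0("<", (floor(n[j] / 10) + 1) * 10)
    }
  }
  out
}

age <- data.frame(
  Age = c("0 - 1", "2 - 9", "10 - 19", "20 - 29", "30 - 39"),
  N = c(4, 71, 925, 0, 0)
)
age$N_masked <- mask_small_cells(age$N)
print(age)
```

Zero counts are left visible here because the example tables leave them visible. Whether zeros should be masked depends on the applicable DUA.
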

You can install countmaskr from GitHub with devtools:

    # install.packages("devtools")
    devtools::install_github("Query-Fulfillment/countmaskr")

or using pak:

    # install.packages("pak")
    pak::pkg_install("Query-Fulfillment/countmaskr")