library(chmsflow)

chmsflow harmonizes variables from the Canadian Health Measures Survey (CHMS) across cycles 1–6 and derives health indicators used in health research. It works with the recodeflow package to transform raw CHMS variables into analysis-ready versions using recoding rules defined in CSV metadata files.
The package includes two metadata CSV files (variables.csv and variable-details.csv) that define how raw CHMS variables are recoded, and 42 functions that derive new health indicators. The table below summarizes the available variables, organized by section and subject as defined in variables.csv:
| Section | Subject | Examples |
|---|---|---|
| Sociodemographics | Age, sex, ethnicity | clc_age, clc_sex, pgdcgt |
| Socioeconomic | Income, education, occupation, marital status | adj_hh_income, income_quintile, edudr04 |
| Health status | Blood pressure, hypertension | sbp_adj_mmhg, htn_status, htn_control_status |
| Health status | Chronic disease (diabetes, CKD, CVD) | diab_status, ckd_status, cvd_status |
| Health status | Medication (8 drug classes from ATC codes) | ace_med, any_htn_med, diab_med |
| Health status | Weight, height, cholesterol | nonhdl_mmoll, waist_height_ratio, hwmdbmi |
| Health status | Family history | cvd_premature_famhist_status, fam_bp |
| Health behaviour | Alcohol, diet | alc_risk_score, fv_daily_times, healthy_diet_indicator |
| Health behaviour | Exercise | exercise_min_week, enough_exercise_indicator |
| Health behaviour | Smoking | pack_years, smoke |
For the full variable list, see Variable schema reference.
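As a rough illustration of how the section and subject organization can be used, the sketch below filters a small stand-in for variables.csv. The toy data frame and its column names (variable, section, subject) are assumptions made for this example. The rows are copied from the table above, and the real file contains many more variables and columns.

```r
# Toy stand-in for variables.csv (assumed column names; rows taken from the table above)
toy_variables <- data.frame(
  variable = c("clc_age", "clc_sex", "sbp_adj_mmhg", "diab_status", "pack_years"),
  section = c(
    "Sociodemographics", "Sociodemographics",
    "Health status", "Health status", "Health behaviour"
  ),
  subject = c(
    "Age, sex, ethnicity", "Age, sex, ethnicity",
    "Blood pressure, hypertension", "Chronic disease (diabetes, CKD, CVD)", "Smoking"
  ),
  stringsAsFactors = FALSE
)

# List the variables that belong to one section
health_status_vars <- toy_variables$variable[toy_variables$section == "Health status"]
print(health_status_vars)

# Count the variables in each section
print(table(toy_variables$section))
```

The same subsetting pattern should apply to the packaged metadata once it is loaded, provided its column names match the ones assumed here.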
Name each cycle's data object cyclex (for example, cycle4). Keep medication data separate as cyclex_meds.
Use rec_with_table() from recodeflow to transform source variables and derive new ones.

# Install release version from CRAN
install.packages("chmsflow")
# Install the most recent version from GitHub
devtools::install_github("Big-Life-Lab/chmsflow")

Use rec_with_table() from recodeflow to transform CHMS variables. The cycle data object must be named cyclex for recoding to work properly.
library(recodeflow)
# Recode a source variable (age)
cycle4_ages <- rec_with_table(
  cycle4, "clc_age",
  variable_details = variable_details, log = TRUE
)
  value_to From rows_recoded
1 copy [3, 80] 50
2 NA::a 996 0
3 NA::b [997, 999] 0
4 <NA> else 0
head(cycle4_ages)
  clc_age
1 73
2 33
3 22
4 47
5 74
6 22
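The log above lists the rules applied to clc_age: values in [3, 80] are copied, 996 maps to NA::a, [997, 999] maps to NA::b, and anything else becomes NA. The base R sketch below mirrors those four rules to make the logic concrete. The helper name recode_age_sketch and the separate na_tag column are hypothetical choices for this example, and this is not how rec_with_table() is implemented.

```r
# Hypothetical helper mirroring the rule table in the log above
recode_age_sketch <- function(raw_age) {
  value <- rep(NA_real_, length(raw_age))
  na_tag <- rep(NA_character_, length(raw_age))

  in_range <- !is.na(raw_age) & raw_age >= 3 & raw_age <= 80
  is_tag_a <- !is.na(raw_age) & raw_age == 996
  is_tag_b <- !is.na(raw_age) & raw_age >= 997 & raw_age <= 999

  value[in_range] <- raw_age[in_range] # "copy" rule: keep valid values unchanged
  na_tag[is_tag_a] <- "a"              # NA::a rule
  na_tag[is_tag_b] <- "b"              # NA::b rule
  # "else" rule: value stays NA and no tag is recorded

  data.frame(raw = raw_age, value = value, na_tag = na_tag, stringsAsFactors = FALSE)
}

print(recode_age_sketch(c(73, 33, 996, 998, 150)))
```

Keeping the tag in its own column is only a device for this sketch; the package stores tagged missing values differently.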
chmsflow handles three types of variables, each recoded differently.
Source variables are mapped directly from raw CHMS columns. Variable names may differ across cycles, but chmsflow harmonizes them to a single name.
# Recode sex (same variable name across all cycles)
cycle4_sexes <- rec_with_table(
  cycle4, "clc_sex",
  variable_details = variable_details, log = TRUE
)
  value_to From rows_recoded
1 1 1 27
2 2 2 23
3 NA::a 6 0
4 NA::b [7, 9] 0
5 NA(b) else 0
head(cycle4_sexes)
  clc_sex
1 2
2 2
3 2
4 2
5 2
6 2
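For source variables whose raw names do differ across cycles, the harmonization idea can be sketched in base R as a lookup from raw names to one shared name. Everything in this example is hypothetical: the column names age_old and age_new, the harmonized name age, and the helper harmonize_names() are placeholders, not CHMS variable names or chmsflow functions.

```r
# Hypothetical raw data from two cycles where one measure has two different names
cycle_a <- data.frame(age_old = c(34, 51))
cycle_b <- data.frame(age_new = c(28, 67))

# Hypothetical lookup: raw column name -> harmonized name
name_map <- c(age_old = "age", age_new = "age")

# Rename any column that appears in the lookup; leave other columns untouched
harmonize_names <- function(data, name_map) {
  known <- names(data) %in% names(name_map)
  names(data)[known] <- unname(name_map[names(data)[known]])
  data
}

# After renaming, the cycles share a column name and can be stacked
combined <- rbind(
  harmonize_names(cycle_a, name_map),
  harmonize_names(cycle_b, name_map)
)
print(combined)
```

In chmsflow this mapping lives in the metadata CSV files rather than in code, so the sketch only shows the effect, not the mechanism.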
Some variables convert continuous measurements into categories using thresholds defined in variable-details.csv.
# Recode age into 4 groups
cycle4_categorical_ages <- rec_with_table(
  cycle4, "agegroup4",
  variable_details = variable_details, log = TRUE
)
  value_to From rows_recoded
1 1 [20, 39] 12
2 2 [40, 59] 23
3 3 [60, 69] 6
4 4 [70, 79] 9
5 NA::a 996 0
6 NA::b [997, 999] 0
7 NA(b) else 0
head(cycle4_categorical_ages)
  agegroup4
1 4
2 1
3 1
4 2
5 4
6 1
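The threshold logic in the log above can be approximated in base R with cut(). This is a minimal sketch, assuming integer ages, so that the closed intervals [20, 39], [40, 59], [60, 69], and [70, 79] can be written as left-closed breaks at 20, 40, 60, 70, and 80. It does not handle the missing codes (996 to 999), which simply fall outside the breaks and return NA.

```r
# Ages from the head(cycle4_ages) output shown earlier
ages <- c(73, 33, 22, 47, 74, 22)

# Left-closed breaks reproduce the four intervals in the log for integer ages
agegroup4_sketch <- cut(
  ages,
  breaks = c(20, 40, 60, 70, 80),
  right = FALSE,
  labels = FALSE
)
print(agegroup4_sketch)
```

For these six ages the sketch returns groups 4, 1, 1, 2, 4, 1, matching the first six rows of agegroup4 above.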
Derived variables are computed by R functions referenced as Func:: entries in variable-details.csv. These require their input variables to be present in the data. See Derived variables for details.
# Derive adjusted systolic blood pressure
# bpmdpbps (raw SBP) must be in the data for sbp_adj_mmhg to be computed
cycle4_adjusted_SBPs <- rec_with_table(
  cycle4, c("bpmdpbps", "sbp_adj_mmhg"),
  variable_details = variable_details, log = TRUE
)
  value_to From rows_recoded
1 copy [73, 216] 50
2 NA::a 996 0
3 NA::b [997, 999] 0
4 <NA> else 0
head(cycle4_adjusted_SBPs)
  bpmdpbps sbp_adj_mmhg
1 207 203.91
2 101 105.33
3 92 96.96
4 196 193.68
5 152 152.76
6 79 84.87
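To give a feel for what a derivation function does, the sketch below fits a straight line to the six raw and adjusted values displayed above. The rows are consistent with adjusted = 11.4 + 0.93 × raw, but that relationship is inferred here from the displayed output, not taken from the package source. The helper adjust_sbp_sketch() is hypothetical and ignores missing codes, which the package's own function has to handle.

```r
# The six rows displayed above
raw_sbp <- c(207, 101, 92, 196, 152, 79)
shown_adjusted <- c(203.91, 105.33, 96.96, 193.68, 152.76, 84.87)

# Fit a straight line to the displayed pairs (illustrative only)
fit <- lm(shown_adjusted ~ raw_sbp)
print(round(coef(fit), 2))

# Hypothetical helper using the line that matches the displayed rows
adjust_sbp_sketch <- function(sbp) {
  11.4 + 0.93 * sbp
}
print(round(adjust_sbp_sketch(raw_sbp), 2))
```

Because the derived column depends on bpmdpbps, the raw variable has to be requested alongside sbp_adj_mmhg, as in the rec_with_table() call above.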
To learn how Func:: and DerivedVar:: entries work, see Derived variables.
To learn how haven::tagged_na() handles CHMS missing codes, see Missing data (tagged_na).