In clinical studies, patients often face multiple mutually exclusive
outcomes of equal clinical importance. Traditional survival analysis
focusing on a single endpoint can provide incomplete insights,
particularly when competing events like recovery and death,
disease-specific versus non-disease mortality, or treatment success
versus treatment-related toxicity occur simultaneously. This
jointCompRisk
package implements advanced methods for
analyzing competing risks data with joint inference capabilities,
providing both traditional competing risks analysis and novel
severity-weighted approaches that incorporate longitudinal ordinal
severity scores.
Key Features: 1. Simultaneous analysis of multiple disease outcomes - Compares cumulative incidence functions across different competing risks in a single statistical model 2. Disease severity integration - Incorporates repeated measurements of disease progression over time to weight the analysis by clinical severity 3. Clinically interpretable results - Provides outcomes in meaningful time units such as days of life gained or lost, rather than abstract statistical measures 4. Methodologically validated - Statistical approaches verified and published in peer-reviewed literature (Wen, Wang & Hu, 2023, Statistics in Medicine)
Hospital-based Treatment Studies: patients either recover (treatment success, resource availability for new patients) or experience adverse outcomes including death (treatment failure).
Cancer Research: Multiple disease-related events (local recurrence, distant metastasis, secondary cancers, death) may respond differently to treatment interventions.
Cardiovascular Studies: Various clinical events may show differential treatment responses across mortality and morbidity outcomes.
The package enables simultaneous analysis of multiple competing outcomes while respecting the clinical importance of each endpoint.
The cumulative incidence function (CIF) describes the probability of experiencing a specific event type over time while accounting for competing events. Unlike standard survival analysis that treats competing events as censored, CIF provides realistic estimates by properly incorporating all competing risks.
Mathematically, the CIF for event type \(k\) is defined as
\[
F_{k}(t) = P\bigl(T \le t,\ \delta = k \bigr) = \int_{0}^{t}
S\!\bigl(u^{-}\bigr)\, d\Lambda_{k}(u)
\] where
- \(T\) is the time to the first event
of any type,
- \(\delta\) indicates the type of
event that occurred,
- \(S(t) = P(T > t)\) is the overall
survival function (the probability no event has occurred by time \(t\)),
- \(\Lambda_k(t)\) is the cumulative
hazard function for event type \(k\),
- \(S(u^-) = \lim_{s \uparrow u} S(s)\)
denotes the left-continuous limit of the survival function at time \(u\).
This formulation shows that the CIF for event type \(k\) accumulates over time as the hazard for that event type increases, weighted by the probability of surviving free of any event up to that point.
The restricted mean survival time (RMST) up to a specified time \(\tau\) represents the average survival time accumulated by that point. Unlike the median survival, RMST uses the entire survival curve up to \(\tau\) and is particularly useful when the survival curve does not reach 50%.
It is defined as
\[ \text{RMST}(\tau) = E\bigl[\min(T, \tau)\bigr] = \int_0^\tau S(t) \, dt \]
where \(S(t)\) is the overall survival function.
In competing risks settings, we can define endpoint-specific restricted mean times:
Restricted Mean Time Gained (RMTG) for
beneficial events:
\[
\text{RMTG}_k(\tau) = \int_0^\tau F_k(t) \, dt
\]
Restricted Mean Time Lost (RMTL) for adverse
events:
\[
\text{RMTL}_k(\tau) = \int_0^\tau F_k(t) \, dt
\]
where \(F_k(t)\) is the cumulative incidence function for event type \(k\).
Load the package
Sample Data Throughout this vignette, we use simulated data that follows the structure of Adaptive COVID-19 Treatment Trials (ACTT). The dataset represents a clinical trial of 150 patients with competing risks survival data (recovery vs. death) and longitudinal ordinal severity scores measured over a 30-day follow-up period. 1. Treatment group: 1.5× faster recovery times, 1.8× improved survival 2. Severity scores: 1-8 scale (mild to severe illness) measured at scheduled visits 3. Realistic treatment effects demonstrating significant benefits across standard and weighted analyses
#Sample datasets included with package
main_df <- read.csv("main_df.csv")
long_df <- read.csv("long_df.csv")
#View data structure
head(main_df)
## ID TimeToRecovery TimeToDeath RecoveryCensoringIndicator
## 1 Patient_001 3 12 0
## 2 Patient_002 13 7 1
## 3 Patient_003 2 30 0
## 4 Patient_004 15 30 1
## 5 Patient_005 9 14 0
## 6 Patient_006 11 30 0
## DeathCensoringIndicator BaselineScore Treatment
## 1 0 5 1
## 2 1 6 0
## 3 1 4 1
## 4 1 6 0
## 5 0 4 1
## 6 1 6 1
## PersonID OrdinalScore RelativeDay
## 1 Patient_001 5 0
## 2 Patient_001 6 1
## 3 Patient_001 6 3
## 4 Patient_001 5 5
## 5 Patient_001 5 7
## 6 Patient_001 5 10
main_df
contains baseline demographics, treatment
assignment, censoring indicators, and time-to-event outcomes
(TimeToRecovery, TimeToDeath). long_df
contains
longitudinal severity measurements with one row per visit per
participant, enabling joint modeling of how baseline characteristics and
evolving severity scores predict competing outcomes.
In addition to the above columns, we need an additional column 1. Baseline Score: baseline disease severity As well as we need an additional longitudinal dataset with 1. Patient ID: matching the main dataset 2. Time Point: Days since treatment starts 3. Severity Score: Disease severity score at each time point
Standard CIF analysis provides traditional competing risks analysis with joint inference capabilities.
The prep_data_cif() function preprocesses your data by excluding patients with zero survival time, handling “discharge-to-die” cases (recoded as censored for recovery), creating event time variables (etime, estatus, etype2), and splitting data by treatment group.
Customize column names to match your dataset:
mydata_std <- prep_data_cif(
data = main_df,
ID = "ID", #change this
TimeToRecovery = "TimeToRecovery", #change this
TimeToDeath = "TimeToDeath", #change this
Recov_Censoring = "RecoveryCensoringIndicator", #change this
Death_Censoring = "DeathCensoringIndicator", #change this
Treatment = "Treatment" #change this
)
The do_cif_analysis()
function performs two analyses:
RMLT1 (time lost through death1) and RMLT2 (time lost through death2).
The tau parameter sets the time horizon (e.g., tau=15 analyzes the first
15 days).
Each analysis returns formatted results with group-specific estimates
and treatment contrasts, containing estimates, standard errors,
confidence intervals, and p-values for both treatment groups and their
difference. The function provides comprehensive statistical inference
for competing risk endpoints in an easy-to-interpret format with
groups
(individual group results) and contrast
(treatment difference) components. For RMLT1, positive differences
typically indicate beneficial recovery outcomes, while for RMLT2,
negative differences indicate beneficial survival outcomes (less time
lost to death).
res_std <- do_cif_analysis(mydata_std, tau=15)
# Access RMLT1 results (recovery/discharge)
res_std$RMLT1$groups # Group estimates
## Group Estimate se Lower95 Upper95
## 1 Group1 (trt=1) 3.760347 0.4769287 2.8255670 4.695128
## 2 Group2 (trt=0) 1.167820 0.3530140 0.4759129 1.859728
## Contrast Estimate se Lower95 Upper95 p_value
## 1 Difference 2.592527 0.5933632 1.429535 3.755519 1.246981e-05
## Group Estimate se Lower95 Upper95
## 1 Group1 (trt=1) 1.082678 0.3039021 0.4870301 1.678326
## 2 Group2 (trt=0) 2.743811 0.4912417 1.7809770 3.706644
## Contrast Estimate se Lower95 Upper95 p_value
## 1 Difference -1.661133 0.577646 -2.793319 -0.5289465 1.995969
According to the results above, using the simulated example data, we can interpret the result as below:
RMLT1 (Recovery Analysis): Treatment group gains 2.59 more days through recovery (3.76 vs 1.17 days, p < 0.001). The positive difference indicates that treated patients experience significantly more recovery events and accumulate more time in recovered states, demonstrating enhanced recovery outcomes with strong statistical significance.
RMLT2 (Death Analysis): Treatment group loses 1.66 fewer days to death (1.08 vs 2.74 days, p = 0.004). The negative difference indicates that treated patients experience reduced mortality risk, with fewer deaths occurring and delayed time to death when it does occur, representing a clear survival benefit with strong statistical significance.
Clinical Conclusion: Both standard competing risks measures consistently show treatment benefits - more time gained through recovery (beneficial) and less time lost to death (beneficial), demonstrating that treatment enhances recovery while reducing mortality across both competing endpoints. The treatment effect is statistically significant for both outcomes, indicating robust clinical efficacy.
Weighted analysis incorporates disease severity changes over time, providing insights when patient conditions vary during follow-up.
The function merges time-to-event data with longitudinal severity measurements, calculating time spent in different severity states and applying clinical weights. Severity scores are flexible (example uses scores 4-7, but any range can be used to meet your dataset’s needs) with matching weights required for each state. In practice, death weights often increase with severity (example - State 4: 2.0, State 5: 1.5, State 6: 1.0, State 7: 0.5) to emphasize concerning states, while discharge weights (if applicable) often decrease with severity (example - State 4: 0.5, State 5: 1.0, State 6: 1.5, State 7: 2.0) to emphasize favorable recovery states. This produces separate datasets enabling computation of weighted restricted mean lifetime 1 and 2. Weights can also be customized based on stakeholder priorities—hospital administrators might prioritize shorter stays, while patients might weight quality of life differently than clinical severity.
Handling Missing Measurements in Weighted Analysis: Here the weighted CIF analysis uses a “last observation carried forward” approach for gaps between measurement time points. When longitudinal severity scores are missing between recorded visits (e.g., measurements on day 4 and day 7 with no intermediate assessments), the algorithm assumes the patient’s severity state remains constant at the last observed value throughout the entire interval. Specifically, if a patient has a severity score of 5 on day 4 and the next measurement is on day 7, regardless of the severity score on day 7, the algorithm credits all 3 intervening days (days 4-6) to severity state 5. This approach calculates interval_length = day_7 - day_4 = 3 days and adds this entire duration to the appropriate severity state counter without interpolation or imputation of intermediate values. While this method may not capture dynamic changes within measurement intervals, it provides a conservative and computationally tractable approach for incorporating longitudinal severity information into competing risks analysis. Researchers should consider the frequency of their measurement schedule when interpreting weighted results, as longer intervals between assessments may reduce the precision of severity-weighted estimates.
prepped_w <- prep_data_weighted_cif2(
data_main = main_df,
data_long = long_df,
wID_main = "ID",
wTimeToRecovery_main = "TimeToRecovery",
wTimeToDeath_main = "TimeToDeath",
wRecov_Censoring_main = "RecoveryCensoringIndicator",
wDeath_Censoring_main = "DeathCensoringIndicator",
wTreatment_main = "Treatment",
wBaselineScore_main = "BaselineScore",
wID_long = "PersonID",
wADY_long = "RelativeDay",
wScore_long = "OrdinalScore",
wStates_death = c(4,5,6,7,8),
wWeights_death = c(2,1.5,1,0.5,0.2),
wStates_discharge = c(4,5,6,7,8),
wWeights_discharge = c(0.2,0.5,1,1.5,2)
)
The function do_weighted_cif_analysis()
performs
weighted competing risks analysis using the datasets prepared by
prep_data_weighted_cif2()
, computing Weighted RMLT1 (in our
simulated example it would be recovery/discharge analysis with
severity-weighted time) and Weighted RMLT2 (in our simulated example it
would be death analysis with severity-weighted time) at the specified
time horizon tau
. Unlike standard analysis that treats all
time equally, this approach emphasizes clinically relevant periods -
time spent in better states before discharge receives higher weight in
WRMLT1 calculations, while time spent in severe states before death
receives higher weight in WRMLT2 calculations. The function returns
formatted results with $groups
and $contrast
components for both weighted measures, containing estimates, standard
errors, confidence intervals, and p-values that reflect the
severity-adjusted treatment effects on competing outcomes.
Different tau
values reveal treatment effects over time:
short-term (e.g., tau=15) shows immediate treatment effects, long-term
(e.g., tau=29) reveals sustained or delayed benefits, and multiple
horizons demonstrate the evolution of treatment effects. Treatment
benefits may be immediate, delayed, or diminishing, so analyzing
multiple time points provides comprehensive insights.
For weighted analysis, here in this example, WRMLT1 represents recovery/discharge time weighted by clinical severity states, while WRMLT2 shows death time weighted by severity progression. Both measures capture timing and severity of clinical outcomes simultaneously. Positive differences in WRMLT1 indicate more severity-weighted recovery time (beneficial), while negative differences in WRMLT2 indicate less severity-weighted time lost to death (beneficial). Results provide clinically meaningful metrics in time units, facilitating evidence-based treatment decisions.
#Analysis at 15 days
res_w15 <- do_weighted_cif_analysis(prepped_w, tau=15)
# Display WRMLT1 results (discharge-focused analysis)
#Group estimates with standard errors
res_w15$WRMLT1$groups
## Group Estimate se Lower95 Upper95
## 1 Group1 (trt=1) 3.955174 0.6654671 2.6508587 5.259490
## 2 Group2 (trt=0) 1.674784 0.5287521 0.6384295 2.711138
## Contrast Estimate se Lower95 Upper95 p_value
## 1 Difference 2.280391 0.849956 0.614477 3.946304 0.007297557
## Group Estimate se Lower95 Upper95
## 1 Group1 (trt=1) 1.587423 0.4903069 0.6264214 2.548424
## 2 Group2 (trt=0) 3.007981 0.6181803 1.7963481 4.219615
## Contrast Estimate se Lower95 Upper95 p_value
## 1 Difference -1.420559 0.7890169 -2.967032 0.1259145 1.928205
WRMLT1 (Recovery/Discharge Analysis): Treatment group shows 2.28 more severity-weighted recovery days (3.96 vs 1.67 days, p = 0.007). The positive difference indicates that treated patients accumulate significantly more time in recovery-related severity states, representing enhanced recovery outcomes weighted by clinical improvement trajectories.
WRMLT2 (Death Analysis): Treatment group shows 1.42 fewer severity-weighted death days (1.59 vs 3.01 days, p = 0.072). The negative difference indicates that treated patients lose less time to death when accounting for disease severity progression, representing a survival benefit, though this approaches but does not reach statistical significance at the 0.05 level.
Clinical Conclusion: Both weighted measures consistently show treatment benefits - more severity-weighted recovery time (beneficial) and less severity-weighted time lost to death (beneficial), confirming that treatment advantages persist even when accounting for disease severity trajectories.