cLDA constraint for discrete-time models in R formulas

I work with longitudinal (repeated measures) models of clinical trial data. Patients are randomized to different treatment groups and measured over multiple pre-specified points in time. An example dataset is the FEV dataset from the mmrm package, where ARMCD denotes treatment groups and AVISIT denotes discrete time points:

library(tidyverse)data("fev_data", package = "mmrm")data <- fev_data %>%  as_tibble() %>%  mutate(ARMCD = as.character(ARMCD)) %>%  select(-VISITN, -VISITN2)data#> # A tibble: 800 × 8#>    USUBJID AVISIT ARMCD RACE                      SEX    FEV1_BL  FEV1 WEIGHT#>    <fct>   <fct>  <chr> <fct>                     <fct>    <dbl> <dbl>  <dbl>#>  1 PT1     VIS1   TRT   Black or African American Female    25.3  NA    0.677#>  2 PT1     VIS2   TRT   Black or African American Female    25.3  40.0  0.801#>  3 PT1     VIS3   TRT   Black or African American Female    25.3  NA    0.709#>  4 PT1     VIS4   TRT   Black or African American Female    25.3  20.5  0.809#>  5 PT2     VIS1   PBO   Asian                     Male      45.0  NA    0.465#>  6 PT2     VIS2   PBO   Asian                     Male      45.0  31.5  0.233#>  7 PT2     VIS3   PBO   Asian                     Male      45.0  36.9  0.360#>  8 PT2     VIS4   PBO   Asian                     Male      45.0  48.8  0.507#>  9 PT3     VIS1   PBO   Black or African American Female    43.5  NA    0.682#> 10 PT3     VIS2   PBO   Black or African American Female    43.5  36.0  0.892#> # ℹ 790 more rows#> # ℹ Use `print(n = ...)` to see more rows

It is common to use a so-called "constrained longitudinal data analysis" (cLDA) approach which pools all treatment groups at baseline. If time were continuous with baseline at AVISIT == 0, it would be easy to enforce cLDA in the model formula:

formula <- FEV1 ~ AVISIT + AVISIT:ARMCD

However, this does not work with discrete time because the extra AVISITVIS1:ARMCDTRT term distinguishes between baselines for PBO and TRT.

colnames(model.matrix(formula, data = data))#> [1] "(Intercept)"         "AVISITVIS2"          "AVISITVIS3"          "AVISITVIS4"         #> [5] "AVISITVIS1:ARMCDTRT" "AVISITVIS2:ARMCDTRT" "AVISITVIS3:ARMCDTRT" "AVISITVIS4:ARMCDTRT"

A workaround I have seen others use is to manually enforce cLDA in the data.

count(data, ARMCD, AVISIT)#> # A tibble: 8 × 3#>   ARMCD AVISIT     n#>   <chr> <fct>  <int>#> 1 PBO   VIS1     105#> 2 PBO   VIS2     105#> 3 PBO   VIS3     105#> 4 PBO   VIS4     105#> 5 TRT   VIS1      95#> 6 TRT   VIS2      95#> 7 TRT   VIS3      95#> 8 TRT   VIS4      95clda <- mutate(data, ARMCD = ifelse(AVISIT == "VIS1", "PBO", ARMCD))count(clda, ARMCD, AVISIT)#> # A tibble: 7 × 3#>   ARMCD AVISIT     n#>   <chr> <fct>  <int>#> 1 PBO   VIS1     200#> 2 PBO   VIS2     105#> 3 PBO   VIS3     105#> 4 PBO   VIS4     105#> 5 TRT   VIS2      95#> 6 TRT   VIS3      95#> 7 TRT   VIS4      95

But the terms in the model are all the same.

colnames(model.matrix(formula, data = clda))#> [1] "(Intercept)"         "AVISITVIS2"          "AVISITVIS3"          "AVISITVIS4"         #> [5] "AVISITVIS1:ARMCDTRT" "AVISITVIS2:ARMCDTRT" "AVISITVIS3:ARMCDTRT" "AVISITVIS4:ARMCDTRT"

Even worse, the model matrix is no longer full rank.

as.integer(Matrix::rankMatrix(model.matrix(formula, data = clda)))#> [1] 7

Is there a way to code up the formula and/or contrasts to use cLDA without having to recode the data or model matrix?

It seems oddly low-level to have to manually generate a custom model matrix and then call lm() or brms::brm() on the hacked model matrix instead of the actual data. And it would prevent me from straightforwardly working with many of the nice post-processing tools that brms provides.

Latest Images

Trending Articles

Latest Images