In depth flow cytometry characterization of the peripheral blood immune cell compartment composition and its association with pro-inflammatory factors in patients with stage 5 chronic kidney disease



Authors and affiliations:

Ivan Zahradka1, Filip Tichanek2, Lucie Majercikova3, Jana Douskova3, Mathias Streitz4, Stephan Schlickeiser5, Klara Osickova1, Vojtech Petr1, Petra Hruba3, Ondrej Viklicky1,3

1 Department of Nephrology, Institute for Clinical and Experimental Medicine, Czech Republic
2 Department of Data Science, Institute for Clinical and Experimental Medicine, Czech Republic
3 Transplantation Laboratory, Institute for Clinical and Experimental Medicine, Prague, Czech Republic
4 Department of Experimental Animal Facilities and Biorisk Management, Federal Research Institute for Animal Health, Greifswald-Insel Riems, Germany
5 Berlin Institute of Health Center of Regenerative Therapies (BRCT), Berlin Institute of Health at Charité, Berlin, Germany


This is a statistical report of the study In depth flow cytometry characterization of the peripheral blood immune cell compartment composition and its association with pro-inflammatory factors in patients with stage 5 chronic kidney disease that under review in the Frontiers in Immunulogy

When using this code or data, cite the original publication:

TO BE ADDED

BibTex citation for the original publication:

TO BE ADDED


Original GitHub repository: https://github.com/filip-tichanek/https://github.com/filip-tichanek/rTX_lymphocytes

Statistical code with all details of statistical analysis is shown a HTML report.

Original data are available here


1 Introduction

1.1 Statistical methodology

All analyses were run in R, version 4.1 (R Core Team 2024).

Initially, we noted that only 3 patients had repeated renal transplantation. Their lymphocyte counts were noticeably different from other patients, so we excluded these re-transplanted patients from all analyses.

First, we performed single-variable Permutational Analyses of Variance (PERMANOVAs) using the ‘Vegan’ R package (Oksanen et al. 2024) to identify which factors explain the largest portion of variance in the multivariate lymphocyte composition when considered as single predictors. We then selected those with \(R^2 > 2\) and included them in a multivariable PERMANOVA, in a sequential manner, ordered by \(R^2\) (the most explanatory factor first). This analysis was conducted primarily with all ‘specific’ data (the most detailed sub-populations of lymphocytes available), but also with more general data (broader categories of lymphocyte populations), and subsets of specific data of similar types (dendritic cells, B-lymphocytes, T-lymphocytes). As a sensitivity analysis, we repeated the construction of the multivariable PERMANOVA, using a p-value threshold (< 0.05) in the univariable models for variable selection, in cases where this yielded a different set of predictors than selection based on \(R^2\). All PERMANOVAs were run with Euclidean distances after log10-transformation and z-standardization of the lymphocyte counts, using 5000 permutations.

Predictors (clinical characteristics) that showed a statistically significant effect in any of the multivariable PERMANOVAs were then used in a series of generalized linear models, fitted with ‘glmmTMB’ package (Brooks et al. 2017). Although variable selection for these PERMANOVAs was primarily based on \(R^2\), the same set of covariates would have been selected if PERMANOVA variable selection had been based on p-values < 0.05. These models predicted the count of a specific lymphocyte sub-population assuming a Poisson distribution, with observation-specific random intercepts to account for overdispersion. The selected predictors were: Smoking, cmv (past Cytomegalovirus infection), receiver_age_30y (receiver age scaled so that a 30-year increase represents a one-unit change), Dialysis, and ascvd (Atherosclerotic Cardiovascular Disease). P-values for each predictor were corrected using the Benjamini-Hochberg correction (FDR) for multiple comparisons (Benjamini and Hochberg 1995), accounting for the repeated estimation of the effect of each predictor across 18 lymphocyte sub-populations. The results were visualized using a volcano plot, showing the estimated effect of the predictor on the X axis (as estimated log-fold difference in count when the predictor value increases by 1 unit) and the -log10(P-value) on the Y axis (the higher the value, the stronger the evidence of the effect direction). As a sensitivity analysis, we repeated the same procedure using linear model with log-transformed counts as outcomes.

Further, we also show Spearman’s rank correlation between lymphocyte sub-population and the selected patients characteristics, visualized with ‘ComplexHeatmap’ R package (Gu, Eils, and Schlesner 2016). For the visualization, lymphocyte sub-populations were clustered based on similarity in distribution of their abundance using UPGMA.

2 Analysis

2.1 Initialization

2.1.1 Packages

if (TRUE) {rm(list = ls() )}
if (TRUE) {
  suppressWarnings(suppressMessages({
    library(tidyverse)
    library(stringr)
    library(stringi)
    library(ggpubr)
    library(emmeans)
    library(gtsummary)
    library(car)
    library(RJDBC)
    library(sjPlot)
    library(flextable)
    library(openxlsx)
    library(mgcv)
    library(pROC)
    library(cowplot)
    library(boot)
    library(glmnet)
    library(brms)
    library(janitor)
    library(arm)
    library(corrplot)
    library(lubridate)
    library(kableExtra)    
    library(ggdist)
    library(bayesplot)
    library(coxed)
    library(quantreg)
    library(ggbeeswarm)    
    library(vegan)
    library(circlize)
    library(ComplexHeatmap)
    library(glmmTMB)
    library(ggrepel)
  }))
}

2.1.2 Functions

setwd('~/1_ticf_sec/483_ZAHI_lymphocytes')
source('483_functions.R')

2.1.3 Directories

folders <- c("data", 
             "gitignore",
             "gitignore/run",
             "gitignore/figures",
             "gitignore/outputs",
             "gitignore/data")

invisible(
  lapply(folders, function(x) if (!dir.exists(x)) 
    dir.create(x, recursive = TRUE)
    )
  )

2.1.4 Seed

set.seed(16)

2.2 Data

2.2.1 Import and wrangling

Import

if (!file.exists("data/data_all.csv")) {
  data_excel_all <- read.xlsx("gitignore/data/FACS_lymfocyty_filip2.xlsx") %>%
    dplyr::mutate(
      id = as.character(1:nrow(.)),
      group = factor(group)
    ) %>%
    dplyr::rename(
    `non_switched_memory` = `non-switched.memory`) %>% 
    dplyr::select(
      -`rodne_cislo`, -`Jmeno`,
      -`Sekce.klinické.proměnné`,
      -`sekce.dendritické.buňky`,
      -`sekce.B-lymfocyty`,
      -`Sekce.T-lymfocyty`
    ) %>%
    dplyr::select(id, everything())

  readr::write_csv(data_excel_all, "data/data_all.csv")
}

data_excel_all <- readr::read_csv("data/data_all.csv") %>%
  mutate(group = factor(group),
         id = as.character(id))
## Rows: 136 Columns: 40
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): group
## dbl (39): id, Diabetes, PRD.GN, Smoking, hep.B, ASCVD, lecena.hypertenze, le...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


data_avail <- data_excel_all %>% 
  dplyr::rename(
    mDC1 = mDC1_absp,
    mDC2 = mDC2_absp,
    mDC3 = mDC3_absp,
    pDC = pDC_absp,
    `naive B cells` = naiveB_absp,
    `marginal-zone B cells` = marginal.B,
    `class non-switched memory cells` = `non_switched_memory`,
    `class switched memory cells` = switched.memory,
    plasmablast = Plazmablasty_absp,
    `transitional B cells` = Transitional_absp,
    `CD4+ CM cells` = CD.4.CM,
    `CD4+ EM cells` = CD.4.EM,
    `CD4+ naive cells` = CD4_naive_absp,
    `CD4+ TEMRA cells` = CD4_TEMRA_absp,
    `CD8+ CM cells` = CD8_CM_absp,
    `CD8+ EM cells` = CD8_EM_absp,
    `CD8+ naive cells` = CD.8.naive,
    `CD8+ TEMRA cells` = CD8_TEMRA_absp,
    mDC1 = mDC1_absp,
    mDC2 = mDC2_absp,
    mDC3 = mDC3_absp,
    pDC = pDC_absp,
    `naive B cells` = naiveB_absp,
    `marginal-zone B cells` = marginal.B,
    `HLA-DR+ cells`= `HLA_DR+_Absp`, 
     `total mDC` = `mDC_absp`, 
     `total CD19+ B cells` = `CD.19`, 
     `total CD3+ cells` = `CD3_absp`, 
     `total CD4+ cells` = `CD4_absp`, 
     `total CD8+ cells` = `CD8_absp`,
     `diabetes` = `Diabetes`,
    `hepatitis_b` = `hep.B`,
    `hypertension` = `lecena.hypertenze`,
    `hyperuricemia` = `lecena.hyperurikemie`,
    `dyslipidemia` = `lecena.dyslipidemie`,
    `anti_HLA` = `anti-HLA.preTx.nebo.historicky.ano/ne`,
    `receiver_age` = `věk.příjemce`,
    `receiver_female` = `pohlavi.prijemce.(ž=1)`,
    `dialysis` = `Dialysis`,
    `re_transplantation` = `retransplantace`) 

skimr::skim(data_avail)
Data summary
Name data_avail
Number of rows 136
Number of columns 40
_______________________
Column type frequency:
character 1
factor 1
numeric 38
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
id 0 1 1 3 0 136 0

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
group 0 1 FALSE 2 pat: 107, con: 29

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
diabetes 29 0.79 0.21 0.41 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▂
PRD.GN 29 0.79 0.46 0.50 0.00 0.00 0.00 1.00 1.00 ▇▁▁▁▇
Smoking 29 0.79 0.27 0.45 0.00 0.00 0.00 1.00 1.00 ▇▁▁▁▃
hepatitis_b 29 0.79 0.04 0.19 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
ASCVD 29 0.79 0.20 0.40 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▂
hypertension 29 0.79 0.94 0.23 0.00 1.00 1.00 1.00 1.00 ▁▁▁▁▇
hyperuricemia 29 0.79 0.42 0.50 0.00 0.00 0.00 1.00 1.00 ▇▁▁▁▆
dyslipidemia 29 0.79 0.51 0.50 0.00 0.00 1.00 1.00 1.00 ▇▁▁▁▇
anti_HLA 29 0.79 0.24 0.43 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▂
receiver_age 29 0.79 50.81 13.12 18.86 41.76 53.14 62.18 80.44 ▂▅▇▇▁
receiver_female 29 0.79 0.29 0.46 0.00 0.00 0.00 1.00 1.00 ▇▁▁▁▃
dialysis 29 0.79 0.75 0.44 0.00 0.50 1.00 1.00 1.00 ▃▁▁▁▇
re_transplantation 29 0.79 0.03 0.17 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
CMV 29 0.79 0.81 0.39 0.00 1.00 1.00 1.00 1.00 ▂▁▁▁▇
HLA-DR+ cells 0 1.00 69.44 45.68 2.14 39.75 57.17 87.68 263.98 ▇▆▂▁▁
total mDC 0 1.00 54.32 39.33 0.30 26.43 46.15 71.87 198.29 ▇▇▃▁▁
mDC1 0 1.00 8.45 6.44 0.11 4.26 7.48 10.99 32.33 ▇▇▂▁▁
mDC2 0 1.00 40.12 32.28 0.10 16.67 33.05 55.62 156.81 ▇▆▂▁▁
mDC3 0 1.00 0.47 0.62 0.00 0.17 0.33 0.58 5.94 ▇▁▁▁▁
pDC 0 1.00 12.81 17.65 0.50 4.83 8.11 13.91 159.52 ▇▁▁▁▁
total CD19+ B cells 0 1.00 137.46 128.13 0.30 56.24 99.52 182.46 815.84 ▇▂▁▁▁
naive B cells 0 1.00 79.31 85.00 0.00 25.77 55.40 95.69 564.31 ▇▂▁▁▁
marginal-zone B cells 0 1.00 20.25 18.36 0.04 7.86 15.16 26.10 105.19 ▇▃▁▁▁
class non-switched memory cells 0 1.00 19.59 21.69 0.00 5.92 13.75 25.53 156.59 ▇▁▁▁▁
class switched memory cells 0 1.00 15.89 17.11 0.00 5.81 10.70 20.28 116.93 ▇▂▁▁▁
plasmablast 0 1.00 0.83 1.62 0.00 0.16 0.41 0.78 11.61 ▇▁▁▁▁
transitional B cells 0 1.00 10.52 14.63 0.00 2.38 6.68 12.57 125.80 ▇▁▁▁▁
total CD3+ cells 1 0.99 1269.76 669.92 0.20 776.51 1207.49 1591.39 4492.87 ▅▇▂▁▁
total CD4+ cells 1 0.99 750.07 377.31 0.02 495.59 704.21 958.30 2922.12 ▆▇▁▁▁
CD4+ CM cells 1 0.99 383.01 228.08 0.00 251.82 341.99 508.35 1794.23 ▇▆▁▁▁
CD4+ EM cells 1 0.99 96.00 75.20 0.00 43.19 75.54 127.54 444.02 ▇▅▁▁▁
CD4+ naive cells 1 0.99 249.29 175.27 0.02 118.51 212.75 326.79 957.51 ▇▇▂▁▁
CD4+ TEMRA cells 1 0.99 23.14 63.55 0.00 1.93 5.99 15.31 562.90 ▇▁▁▁▁
total CD8+ cells 1 0.99 433.05 325.32 0.18 217.43 348.76 567.49 2331.76 ▇▃▁▁▁
CD8+ CM cells 1 0.99 63.67 67.78 0.13 28.18 46.59 76.16 565.37 ▇▁▁▁▁
CD8+ EM cells 1 0.99 97.95 84.91 0.04 39.55 80.00 130.53 495.01 ▇▃▁▁▁
CD8+ naive cells 1 0.99 99.66 100.77 0.00 37.69 76.30 123.68 823.58 ▇▁▁▁▁
CD8+ TEMRA cells 1 0.99 172.29 199.42 0.02 56.92 109.65 213.23 1496.27 ▇▁▁▁▁
names(data_avail)
##  [1] "id"                              "group"                          
##  [3] "diabetes"                        "PRD.GN"                         
##  [5] "Smoking"                         "hepatitis_b"                    
##  [7] "ASCVD"                           "hypertension"                   
##  [9] "hyperuricemia"                   "dyslipidemia"                   
## [11] "anti_HLA"                        "receiver_age"                   
## [13] "receiver_female"                 "dialysis"                       
## [15] "re_transplantation"              "CMV"                            
## [17] "HLA-DR+ cells"                   "total mDC"                      
## [19] "mDC1"                            "mDC2"                           
## [21] "mDC3"                            "pDC"                            
## [23] "total CD19+ B cells"             "naive B cells"                  
## [25] "marginal-zone B cells"           "class non-switched memory cells"
## [27] "class switched memory cells"     "plasmablast"                    
## [29] "transitional B cells"            "total CD3+ cells"               
## [31] "total CD4+ cells"                "CD4+ CM cells"                  
## [33] "CD4+ EM cells"                   "CD4+ naive cells"               
## [35] "CD4+ TEMRA cells"                "total CD8+ cells"               
## [37] "CD8+ CM cells"                   "CD8+ EM cells"                  
## [39] "CD8+ naive cells"                "CD8+ TEMRA cells"

data_excel <- data_excel_all %>% 
  na.omit()

dim(data_excel)
## [1] 106  40

Clinical data

data_clinical <- data_excel %>% 
  dplyr::select(
    id, Diabetes, PRD.GN, Smoking, hep.B, ASCVD, 
    lecena.hypertenze, lecena.hyperurikemie, lecena.dyslipidemie,
    `anti-HLA.preTx.nebo.historicky.ano/ne`,
    `věk.příjemce`, `pohlavi.prijemce.(ž=1)`, 
    Dialysis, retransplantace, CMV) %>% 
  dplyr::rename(
    `diabetes` = `Diabetes`,
    `hepatitis_b` = `hep.B`,
    `hypertension` = `lecena.hypertenze`,
    `hyperuricemia` = `lecena.hyperurikemie`,
    `dyslipidemia` = `lecena.dyslipidemie`,
    `anti_HLA` = `anti-HLA.preTx.nebo.historicky.ano/ne`,
    `receiver_age` = `věk.příjemce`,
    `receiver_female` = `pohlavi.prijemce.(ž=1)`,
    `dialysis` = `Dialysis`,
    `re_transplantation` = `retransplantace`)

summary(data_clinical)
##       id               diabetes         PRD.GN          Smoking      
##  Length:106         Min.   :0.000   Min.   :0.0000   Min.   :0.0000  
##  Class :character   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Mode  :character   Median :0.000   Median :0.0000   Median :0.0000  
##                     Mean   :0.217   Mean   :0.4528   Mean   :0.2642  
##                     3rd Qu.:0.000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##                     Max.   :1.000   Max.   :1.0000   Max.   :1.0000  
##   hepatitis_b          ASCVD         hypertension    hyperuricemia   
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :1.0000   Median :0.0000  
##  Mean   :0.03774   Mean   :0.1981   Mean   :0.9434   Mean   :0.4245  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##   dyslipidemia       anti_HLA       receiver_age   receiver_female 
##  Min.   :0.0000   Min.   :0.0000   Min.   :18.86   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:41.48   1st Qu.:0.0000  
##  Median :1.0000   Median :0.0000   Median :53.33   Median :0.0000  
##  Mean   :0.5094   Mean   :0.2453   Mean   :50.79   Mean   :0.2925  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:62.26   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :80.44   Max.   :1.0000  
##     dialysis      re_transplantation      CMV        
##  Min.   :0.0000   Min.   :0.0000     Min.   :0.0000  
##  1st Qu.:0.2500   1st Qu.:0.0000     1st Qu.:1.0000  
##  Median :1.0000   Median :0.0000     Median :1.0000  
##  Mean   :0.7453   Mean   :0.0283     Mean   :0.8113  
##  3rd Qu.:1.0000   3rd Qu.:0.0000     3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000     Max.   :1.0000

General lymphocytes (wider lymphocytes categories)

data_lymphocytes_general <- data_excel %>% 
  dplyr::select(
    id, `HLA_DR+_Absp`, `mDC_absp`, 
    `CD.19`, `CD3_absp`, `CD4_absp`, `CD8_absp`) %>% 
    rename(
     `HLA-DR+ cells`= `HLA_DR+_Absp`, 
     `total mDC` = `mDC_absp`, 
     `total CD19+ B cells` = `CD.19`, 
     `total CD3+ cells` = `CD3_absp`, 
     `total CD4+ cells` = `CD4_absp`, 
     `total CD8+ cells` = `CD8_absp`
    )

summary(data_lymphocytes_general)
##       id            HLA-DR+ cells       total mDC        total CD19+ B cells
##  Length:106         Min.   :  2.143   Min.   :  0.9969   Min.   :  0.3032   
##  Class :character   1st Qu.: 41.339   1st Qu.: 31.8812   1st Qu.: 49.3718   
##  Mode  :character   Median : 59.071   Median : 47.1999   Median : 83.6212   
##                     Mean   : 72.969   Mean   : 59.7426   Mean   :109.3094   
##                     3rd Qu.: 91.785   3rd Qu.: 74.6530   3rd Qu.:131.7112   
##                     Max.   :263.984   Max.   :198.2863   Max.   :815.8440   
##  total CD3+ cells total CD4+ cells total CD8+ cells 
##  Min.   : 240.0   Min.   : 116.2   Min.   :  44.87  
##  1st Qu.: 745.2   1st Qu.: 477.1   1st Qu.: 202.81  
##  Median :1036.9   Median : 689.2   Median : 326.15  
##  Mean   :1206.8   Mean   : 731.7   Mean   : 400.42  
##  3rd Qu.:1515.0   3rd Qu.: 912.4   3rd Qu.: 519.38  
##  Max.   :4492.9   Max.   :2922.1   Max.   :2331.76


data_lymphocytes_general_log10 <- data_lymphocytes_general %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),  
    .fns = ~ arm::rescale(log10(.x + 0.5)),  
    .names = "scaled_log10_{.col}" 
  )) %>% 
  dplyr::select(-c(`HLA-DR+ cells` : `total CD8+ cells`))

summary(data_lymphocytes_general_log10)
##       id            scaled_log10_HLA-DR+ cells scaled_log10_total mDC
##  Length:106         Min.   :-2.33382           Min.   :-2.2848499    
##  Class :character   1st Qu.:-0.27386           1st Qu.:-0.2557272    
##  Mode  :character   Median :-0.01045           Median : 0.0000093    
##                     Mean   : 0.00000           Mean   : 0.0000000    
##                     3rd Qu.: 0.31597           3rd Qu.: 0.3000675    
##                     Max.   : 1.10152           Max.   : 0.9421523    
##  scaled_log10_total CD19+ B cells scaled_log10_total CD3+ cells
##  Min.   :-2.47345                 Min.   :-1.41203             
##  1st Qu.:-0.23674                 1st Qu.:-0.33065             
##  Median : 0.04644                 Median :-0.01514             
##  Mean   : 0.00000                 Mean   : 0.00000             
##  3rd Qu.: 0.29138                 3rd Qu.: 0.34719             
##  Max.   : 1.27773                 Max.   : 1.38593             
##  scaled_log10_total CD4+ cells scaled_log10_total CD8+ cells
##  Min.   :-1.63279              Min.   :-1.30202             
##  1st Qu.:-0.28429              1st Qu.:-0.28711             
##  Median : 0.06748              Median : 0.03384             
##  Mean   : 0.00000              Mean   : 0.00000             
##  3rd Qu.: 0.33573              3rd Qu.: 0.34833             
##  Max.   : 1.44914              Max.   : 1.36410

Specific lymphocytes (narrower lymphocytes categories)

data_lymphocytes_specific <- data_excel %>% 
  dplyr::select(
    id, 
    `mDC1_absp`,`mDC2_absp`, `mDC3_absp`,
    `pDC_absp`,
    `naiveB_absp`,
    `marginal.B`, `non_switched_memory`,`switched.memory`,   
    `Plazmablasty_absp`,`Transitional_absp`, 
    `CD.4.CM`, `CD.4.EM`, `CD4_naive_absp`, `CD4_TEMRA_absp`,
    `CD8_CM_absp`, `CD8_EM_absp`, `CD.8.naive`, `CD8_TEMRA_absp`) %>% 
  dplyr::rename(
    mDC1 = mDC1_absp,
    mDC2 = mDC2_absp,
    mDC3 = mDC3_absp,
    pDC = pDC_absp,
    `naive B cells` = naiveB_absp,
    `marginal-zone B cells` = marginal.B,
    `class non-switched memory cells` = non_switched_memory,
    `class switched memory cells` = switched.memory,
    plasmablast = Plazmablasty_absp,
    `transitional B cells` = Transitional_absp,
    `CD4+ CM cells` = CD.4.CM,
    `CD4+ EM cells` = CD.4.EM,
    `CD4+ naive cells` = CD4_naive_absp,
    `CD4+ TEMRA cells` = CD4_TEMRA_absp,
    `CD8+ CM cells` = CD8_CM_absp,
    `CD8+ EM cells` = CD8_EM_absp,
    `CD8+ naive cells` = CD.8.naive,
    `CD8+ TEMRA cells` = CD8_TEMRA_absp
  )


summary(data_lymphocytes_specific)
##       id                 mDC1              mDC2               mDC3       
##  Length:106         Min.   : 0.2758   Min.   :  0.4985   Min.   :0.0000  
##  Class :character   1st Qu.: 4.8975   1st Qu.: 19.5353   1st Qu.:0.1776  
##  Mode  :character   Median : 7.8591   Median : 35.4651   Median :0.3278  
##                     Mean   : 9.1475   Mean   : 44.0277   Mean   :0.4632  
##                     3rd Qu.:11.7542   3rd Qu.: 57.8292   3rd Qu.:0.6290  
##                     Max.   :32.3276   Max.   :156.8147   Max.   :2.3878  
##       pDC           naive B cells    marginal-zone B cells
##  Min.   :  0.5043   Min.   :  0.00   Min.   : 0.04332     
##  1st Qu.:  4.9067   1st Qu.: 20.12   1st Qu.: 6.33619     
##  Median :  8.1418   Median : 44.65   Median :11.67662     
##  Mean   : 13.2617   Mean   : 61.07   Mean   :15.90749     
##  3rd Qu.: 13.6297   3rd Qu.: 73.50   3rd Qu.:21.74574     
##  Max.   :159.5242   Max.   :483.96   Max.   :84.39243     
##  class non-switched memory cells class switched memory cells  plasmablast     
##  Min.   :  0.000                 Min.   :  0.000             Min.   : 0.0000  
##  1st Qu.:  5.064                 1st Qu.:  4.246             1st Qu.: 0.1612  
##  Median :  9.015                 Median :  8.742             Median : 0.3952  
##  Mean   : 15.477                 Mean   : 13.659             Mean   : 0.9237  
##  3rd Qu.: 21.957                 3rd Qu.: 16.796             3rd Qu.: 0.8395  
##  Max.   :156.593                 Max.   :116.931             Max.   :11.6077  
##  transitional B cells CD4+ CM cells     CD4+ EM cells      CD4+ naive cells 
##  Min.   : 0.000       Min.   :  32.75   Min.   :  0.3629   Min.   :  5.646  
##  1st Qu.: 1.981       1st Qu.: 240.25   1st Qu.: 49.2556   1st Qu.:104.642  
##  Median : 6.113       Median : 333.86   Median : 88.3852   Median :183.540  
##  Mean   : 9.024       Mean   : 381.14   Mean   :104.3018   Mean   :222.655  
##  3rd Qu.:11.119       3rd Qu.: 509.06   3rd Qu.:134.9888   3rd Qu.:295.569  
##  Max.   :67.714       Max.   :1794.23   Max.   :444.0170   Max.   :733.693  
##  CD4+ TEMRA cells  CD8+ CM cells     CD8+ EM cells      CD8+ naive cells
##  Min.   :  0.000   Min.   :  3.097   Min.   :  0.2016   Min.   :  0.00  
##  1st Qu.:  1.808   1st Qu.: 26.823   1st Qu.: 34.9957   1st Qu.: 35.29  
##  Median :  5.434   Median : 41.187   Median : 67.9814   Median : 68.61  
##  Mean   : 25.329   Mean   : 62.879   Mean   : 87.0337   Mean   : 94.07  
##  3rd Qu.: 15.317   3rd Qu.: 76.214   3rd Qu.:113.6403   3rd Qu.:112.14  
##  Max.   :562.900   Max.   :565.372   Max.   :495.0150   Max.   :466.48  
##  CD8+ TEMRA cells  
##  Min.   :   6.427  
##  1st Qu.:  48.558  
##  Median : 107.805  
##  Mean   : 157.089  
##  3rd Qu.: 187.504  
##  Max.   :1496.275


data_lymphocytes_specific_log10 <- data_lymphocytes_specific %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),  
    .fns = ~ arm::rescale(log10(.x + 0.5)), 
    .names = "scaled_log10_{.col}" 
  )) %>% 
  dplyr::select(-c(`mDC1` : `CD8+ TEMRA cells`))

summary(data_lymphocytes_specific_log10)
##       id            scaled_log10_mDC1  scaled_log10_mDC2  scaled_log10_mDC3 
##  Length:106         Min.   :-1.40589   Min.   :-2.01301   Min.   :-0.76626  
##  Class :character   1st Qu.:-0.19815   1st Qu.:-0.28743   1st Qu.:-0.36353  
##  Mode  :character   Median : 0.07406   Median : 0.04923   Median :-0.09803  
##                     Mean   : 0.00000   Mean   : 0.00000   Mean   : 0.00000  
##                     3rd Qu.: 0.31237   3rd Qu.: 0.32741   3rd Qu.: 0.31316  
##                     Max.   : 0.92592   Max.   : 0.89833   Max.   : 1.55780  
##  scaled_log10_pDC  scaled_log10_naive B cells
##  Min.   :-1.2566   Min.   :-1.98655          
##  1st Qu.:-0.2893   1st Qu.:-0.28944          
##  Median :-0.0199   Median : 0.06826          
##  Mean   : 0.0000   Mean   : 0.00000          
##  3rd Qu.: 0.2627   3rd Qu.: 0.29379          
##  Max.   : 1.6573   Max.   : 1.15117          
##  scaled_log10_marginal-zone B cells
##  Min.   :-1.65986                  
##  1st Qu.:-0.27830                  
##  Median : 0.03625                  
##  Mean   : 0.00000                  
##  3rd Qu.: 0.36540                  
##  Max.   : 1.09610                  
##  scaled_log10_class non-switched memory cells
##  Min.   :-1.45126                            
##  1st Qu.:-0.28560                            
##  Median :-0.02608                            
##  Mean   : 0.00000                            
##  3rd Qu.: 0.38949                            
##  Max.   : 1.33062                            
##  scaled_log10_class switched memory cells scaled_log10_plasmablast
##  Min.   :-1.45378                         Min.   :-0.5423         
##  1st Qu.:-0.31789                         1st Qu.:-0.3329         
##  Median : 0.01853                         Median :-0.1058         
##  Mean   : 0.00000                         Mean   : 0.0000         
##  3rd Qu.: 0.33483                         3rd Qu.: 0.1962         
##  Max.   : 1.30161                         Max.   : 1.8461         
##  scaled_log10_transitional B cells scaled_log10_CD4+ CM cells
##  Min.   :-1.04586                  Min.   :-1.77095          
##  1st Qu.:-0.34169                  1st Qu.:-0.21861          
##  Median : 0.08932                  Median : 0.03896          
##  Mean   : 0.00000                  Mean   : 0.00000          
##  3rd Qu.: 0.33704                  3rd Qu.: 0.36938          
##  Max.   : 1.11522                  Max.   : 1.35669          
##  scaled_log10_CD4+ EM cells scaled_log10_CD4+ naive cells
##  Min.   :-2.66956           Min.   :-2.0134              
##  1st Qu.:-0.27555           1st Qu.:-0.2874              
##  Median : 0.06704           Median : 0.0529              
##  Mean   : 0.00000           Mean   : 0.0000              
##  3rd Qu.: 0.31594           3rd Qu.: 0.3419              
##  Max.   : 1.01746           Max.   : 0.8939              
##  scaled_log10_CD4+ TEMRA cells scaled_log10_CD8+ CM cells
##  Min.   :-0.86654              Min.   :-1.37013          
##  1st Qu.:-0.35749              1st Qu.:-0.25020          
##  Median :-0.04321              Median :-0.01689          
##  Mean   : 0.00000              Mean   : 0.00000          
##  3rd Qu.: 0.28313              3rd Qu.: 0.32002          
##  Max.   : 1.47255              Max.   : 1.42372          
##  scaled_log10_CD8+ EM cells scaled_log10_CD8+ naive cells
##  Min.   :-2.08450           Min.   :-2.49097             
##  1st Qu.:-0.23177           1st Qu.:-0.30354             
##  Median : 0.07852           Median : 0.03358             
##  Mean   : 0.00000           Mean   : 0.00000             
##  3rd Qu.: 0.31953           3rd Qu.: 0.28378             
##  Max.   : 1.01299           Max.   : 1.01216             
##  scaled_log10_CD8+ TEMRA cells
##  Min.   :-1.17442             
##  1st Qu.:-0.29107             
##  Median : 0.06679             
##  Mean   : 0.00000             
##  3rd Qu.: 0.31527             
##  Max.   : 1.25234

3 Exploration and analysis

3.1 Number of re-transplantations

There is very small proportion of re-trans plantations. How many? Are these patients different in terms of lymphocyte composition?

table(data_clinical$re_transplantation)
## 
##   0   1 
## 103   3

data_clinical %>% 
  dplyr::left_join(data_lymphocytes_specific, by = 'id') %>% 
  dplyr::select(-id) %>% 
  dplyr::mutate(re_transplantation) %>% 
  gtsummary::tbl_summary(by = 're_transplantation') %>% 
  add_p() %>% 
  add_q()

Characteristic

0
N = 103

1

1
N = 3

1

p-value

2

q-value

3
diabetes 22 (21%) 1 (33%) 0.5 >0.9
PRD.GN 46 (45%) 2 (67%) 0.6 >0.9
Smoking 27 (26%) 1 (33%) >0.9 >0.9
hepatitis_b 4 (3.9%) 0 (0%) >0.9 >0.9
ASCVD 21 (20%) 0 (0%) >0.9 >0.9
hypertension 97 (94%) 3 (100%) >0.9 >0.9
hyperuricemia 43 (42%) 2 (67%) 0.6 >0.9
dyslipidemia 52 (50%) 2 (67%) >0.9 >0.9
anti_HLA 25 (24%) 1 (33%) >0.9 >0.9
receiver_age 53 (41, 62) 55 (36, 63) >0.9 >0.9
receiver_female 31 (30%) 0 (0%) 0.6 >0.9
dialysis 76 (74%) 3 (100%) 0.6 >0.9
CMV 83 (81%) 3 (100%) >0.9 >0.9
mDC1 8 (5, 12) 12 (7, 32) 0.13 0.4
mDC2 35 (19, 58) 50 (13, 51) >0.9 >0.9
mDC3 0.33 (0.17, 0.65) 0.37 (0.33, 0.49) 0.6 >0.9
pDC 8 (5, 14) 10 (5, 11) >0.9 >0.9
naive B cells 45 (20, 74) 33 (13, 121) 0.7 >0.9
marginal-zone B cells 11 (6, 21) 23 (20, 24) 0.10 0.3
class non-switched memory cells 9 (5, 22) 28 (24, 40) 0.025 0.2
class switched memory cells 8 (4, 17) 26 (9, 43) 0.090 0.3
plasmablast 0.40 (0.14, 0.84) 0.24 (0.20, 1.01) >0.9 >0.9
transitional B cells 6 (2, 12) 1 (1, 1) 0.025 0.2
CD4+ CM cells 320 (236, 497) 678 (550, 702) 0.028 0.2
CD4+ EM cells 78 (48, 134) 166 (132, 377) 0.034 0.2
CD4+ naive cells 184 (104, 309) 183 (64, 201) 0.5 >0.9
CD4+ TEMRA cells 5 (2, 15) 22 (14, 56) 0.065 0.3
CD8+ CM cells 41 (25, 75) 131 (103, 166) 0.017 0.2
CD8+ EM cells 68 (33, 106) 240 (118, 279) 0.019 0.2
CD8+ naive cells 68 (33, 111) 220 (55, 273) 0.2 0.4
CD8+ TEMRA cells 106 (46, 172) 416 (110, 457) 0.067 0.3
1

n (%); Median (Q1, Q3)

2

Fisher’s exact test; Wilcoxon rank sum test

3

False discovery rate correction for multiple testing

It seems that patients with re-transplantation may have very different lymphocyte composition, but small number of such patients does not allow to infer the effect of re-transplantation. Thus, the 3 patients will be removed for further analysis

3.1.1 Re-transplantated patients removal

ids <- data_clinical %>% filter(re_transplantation == 1) %>% 
  dplyr::select(id)

ids <- ids$id

data_clinical <- data_clinical %>% 
  dplyr::filter(!id %in% ids) %>% 
  dplyr::select(-re_transplantation)

3.2 Patients vs controls

median_diffs <- data_excel_all %>%
  dplyr::select(`HLA_DR+_Absp`:`CD8_TEMRA_absp`, group) %>%
  tidyr::pivot_longer(cols = -group, names_to = "variable", values_to = "value") %>%
  dplyr::group_by(variable, group) %>%
  dplyr::summarise(median = mean(value, na.rm = TRUE), .groups = "drop") %>%
  tidyr::pivot_wider(names_from = group, values_from = median) %>%
  dplyr::mutate(median_diff = .[[3]] / .[[2]]) %>%
  dplyr::select(variable, median_diff)

# Create the gtsummary table
tbl <- data_excel_all %>%
  dplyr::select(`HLA_DR+_Absp`:`CD8_TEMRA_absp`, group) %>%
  gtsummary::tbl_summary(
    by = group,
    statistic = all_continuous() ~ "{mean} ({p25}, {median}, {p75})"
  ) %>%
  gtsummary::modify_table_body(
    ~ dplyr::left_join(.x, median_diffs, by = "variable")
  ) %>%
  gtsummary::modify_header(median_diff ~ "**Mean counts ratio**") %>%
  gtsummary::modify_fmt_fun(
    median_diff ~ (function(x) gtsummary::style_number(x, digits = 2))
) %>% 
  gtsummary::add_p()
  
tbl

Characteristic

control
N = 29

1

patient
N = 107

1

Mean counts ratio

p-value

2
HLA_DR+_Absp 56 (24, 46, 79) 73 (41, 60, 93) 1.31 0.046
mDC_absp 33 (2, 27, 65) 60 (32, 48, 78) 1.79 <0.001
mDC1_absp 6 (2, 6, 9) 9 (5, 8, 12) 1.57 0.015
mDC2_absp 25 (1, 18, 45) 44 (19, 35, 58) 1.79 0.001
mDC3_absp 0.51 (0.16, 0.30, 0.47) 0.47 (0.17, 0.33, 0.65) 0.92 0.3
pDC_absp 11 (5, 8, 14) 13 (5, 8, 14) 1.15 >0.9
CD.19 243 (144, 213, 260) 109 (49, 82, 133) 0.45 <0.001
naiveB_absp 148 (88, 120, 174) 61 (20, 45, 74) 0.41 <0.001
marginal.B 36 (23, 30, 48) 16 (6, 11, 22) 0.44 <0.001
non_switched_memory 35 (19, 29, 37) 15 (5, 9, 22) 0.44 <0.001
switched.memory 24 (13, 18, 33) 14 (4, 9, 17) 0.56 <0.001
Plazmablasty_absp 0.52 (0.21, 0.46, 0.68) 0.92 (0.16, 0.39, 0.84) 1.78 0.6
Transitional_absp 16 (6, 10, 18) 9 (2, 6, 11) 0.55 0.003
CD3_absp 1,500 (1,226, 1,475, 1,841) 1,207 (743, 1,037, 1,517) 0.80 0.001
    Unknown 0 1 0.80
CD4_absp 817 (647, 800, 1,033) 732 (477, 689, 914) 0.90 0.043
    Unknown 0 1 0.90
CD.4.CM 390 (309, 375, 460) 381 (239, 334, 510) 0.98 0.3
    Unknown 0 1 0.98
CD.4.EM 66 (27, 51, 113) 104 (49, 88, 135) 1.59 0.006
    Unknown 0 1 1.59
CD4_naive_absp 347 (220, 307, 376) 223 (104, 184, 296) 0.64 <0.001
    Unknown 0 1 0.64
CD4_TEMRA_absp 15 (2, 7, 15) 25 (2, 5, 16) 1.67 0.4
    Unknown 0 1 1.67
CD8_absp 552 (363, 462, 723) 400 (201, 326, 520) 0.72 0.006
    Unknown 0 1 0.72
CD8_CM_absp 67 (40, 54, 73) 63 (27, 41, 76) 0.94 0.2
    Unknown 0 1 0.94
CD8_EM_absp 138 (78, 117, 200) 87 (35, 68, 116) 0.63 0.003
    Unknown 0 1 0.63
CD.8.naive 120 (47, 94, 132) 94 (35, 69, 112) 0.78 0.13
    Unknown 0 1 0.78
CD8_TEMRA_absp 228 (83, 131, 287) 157 (47, 108, 193) 0.69 0.085
    Unknown 0 1 0.69
1

Mean (Q1, Median, Q3)

2

Wilcoxon rank sum test

3.3 PERMANOVA

3.3.1 General data

3.3.1.1 Prepare data

outcomes_gen <- data_lymphocytes_general_log10 %>% 
  filter(!id %in% ids) %>% 
  dplyr::select(-id)

data_predict <- data_clinical %>% 
  dplyr::select(-id)

3.3.1.2 Run single-predictor analysis

modelac <- 'perm_general'

assign(modelac, run(
  expr = perm_over(outcome = outcomes_gen,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()
predictor r2 p_val
Smoking 0.0458487 0.0027994
ASCVD 0.0448173 0.0039992
anti_HLA 0.0223096 0.0615877
diabetes 0.0205435 0.0949810
CMV 0.0147104 0.1953609
receiver_age 0.0086260 0.4385123
hypertension 0.0084216 0.4337133
hyperuricemia 0.0082793 0.4725055
receiver_female 0.0079409 0.4851030
dialysis 0.0064187 0.5806839
PRD.GN 0.0056397 0.6440712
dyslipidemia 0.0040431 0.7630474
hepatitis_b 0.0021462 0.8810238

3.3.1.3 Run multivariable analysis - R2 selection


## by margin
modelac <- 'perm_general_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD +
                   anti_HLA + 
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 5.5715327 0.0363342 3.9690721 0.0110989
ASCVD 1 3.5270048 0.0230010 2.5125827 0.0595940
anti_HLA 1 2.7383895 0.0178582 1.9507856 0.1117888
diabetes 1 0.7845024 0.0051161 0.5588671 0.6512349
Residual 98 137.5662083 0.8971256 NA NA
Total 102 153.3410767 1.0000000 NA NA


## sequential
modelac <- 'perm_general_multivariable_sequential'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD +
                   anti_HLA + 
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 7.0304932 0.0458487 5.0084126 0.0027997
ASCVD 1 5.1051071 0.0332925 3.6367979 0.0144986
anti_HLA 1 2.8547657 0.0186171 2.0336901 0.1017898
diabetes 1 0.7845024 0.0051161 0.5588671 0.6512349
Residual 98 137.5662083 0.8971256 NA NA
Total 102 153.3410767 1.0000000 NA NA

3.3.1.4 Run multivariable analysis - P-val selection


## by margin
modelac <- 'perm_general_multivariable_margin_sensP'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 5.263274 0.0343240 3.727387 0.0134987
ASCVD 1 5.105107 0.0332925 3.615375 0.0148985
Residual 100 141.205476 0.9208588 NA NA
Total 102 153.341077 1.0000000 NA NA


## sequential
modelac <- 'perm_general_multivariable_sequential_sensP'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 7.030493 0.0458487 4.978910 0.0027997
ASCVD 1 5.105107 0.0332925 3.615375 0.0148985
Residual 100 141.205476 0.9208588 NA NA
Total 102 153.341077 1.0000000 NA NA

3.3.2 All specific data

3.3.2.1 Prepare

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>% 
  dplyr::select(-id)

data_predict <- data_clinical %>% 
  dplyr::select(-id)

3.3.2.2 Run single-predictor analysis

modelac <- 'perm_specific'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()
predictor r2 p_val
Smoking 0.0405064 0.0011998
ASCVD 0.0267541 0.0103979
CMV 0.0246470 0.0179964
diabetes 0.0218696 0.0279944
receiver_age 0.0202068 0.0415917
dialysis 0.0147878 0.1343731
dyslipidemia 0.0129008 0.2155569
anti_HLA 0.0112876 0.2871426
receiver_female 0.0110641 0.3135373
hyperuricemia 0.0093547 0.4423115
PRD.GN 0.0062281 0.7750450
hepatitis_b 0.0051746 0.7854429
hypertension 0.0031043 0.9694061

3.3.2.3 Run multivariable analyses

Note that both selection with R2 > 2 as well as P-value < 0.05 selections results in the same multivariable PERMANOVA


## by margin
modelac <- 'perm_specific_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   CMV +
                   ASCVD +
                   diabetes +
                   receiver_age,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 11.838048 0.0257911 2.825554 0.0088991
CMV 1 9.960709 0.0217011 2.377463 0.0242976
ASCVD 1 6.828019 0.0148760 1.629740 0.1107889
diabetes 1 5.654930 0.0123202 1.349742 0.1947805
receiver_age 1 9.165706 0.0199690 2.187709 0.0330967
Residual 97 406.394825 0.8853984 NA NA
Total 102 458.996571 1.0000000 NA NA

## sequential
modelac <- 'perm_specific_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   CMV +
                   ASCVD +
                   diabetes +
                   receiver_age,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 18.592295 0.0405064 4.437686 0.0007999
CMV 1 10.431152 0.0227260 2.489750 0.0198980
ASCVD 1 8.249543 0.0179730 1.969035 0.0554945
diabetes 1 6.163051 0.0134272 1.471023 0.1525847
receiver_age 1 9.165706 0.0199690 2.187709 0.0330967
Residual 97 406.394825 0.8853984 NA NA
Total 102 458.996571 1.0000000 NA NA

3.3.3 Dendritic cells

3.3.3.1 Prepare data

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>% 
  dplyr::select(
    scaled_log10_mDC1:scaled_log10_pDC
    )

data_predict <- data_clinical %>%
  dplyr::select(-id)

3.3.3.2 Run single-predictor analysis

modelac <- 'perm_specific_dendritic'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()
predictor r2 p_val
Smoking 0.0282209 0.0335933
ASCVD 0.0224578 0.0681864
receiver_age 0.0154331 0.1801640
receiver_female 0.0112941 0.3221356
dialysis 0.0107099 0.3305339
dyslipidemia 0.0095974 0.4065187
diabetes 0.0087107 0.4435113
PRD.GN 0.0086675 0.4519096
CMV 0.0040339 0.7716457
hyperuricemia 0.0037099 0.8062388
hypertension 0.0034601 0.8002400
hepatitis_b 0.0027088 0.8526295
anti_HLA 0.0018789 0.9256149

3.3.3.3 Run multivariable analyses


## by margin
modelac <- 'perm_specific_dendritic_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 3.007452 0.0290616 3.064018 0.0283972
ASCVD 1 2.411052 0.0232985 2.456401 0.0591941
Residual 100 98.153856 0.9484806 NA NA
Total 102 103.485356 1.0000000 NA NA

## sequential
modelac <- 'perm_specific_dendritic_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 2.920448 0.0282209 2.975378 0.0323968
ASCVD 1 2.411052 0.0232985 2.456401 0.0591941
Residual 100 98.153856 0.9484806 NA NA
Total 102 103.485356 1.0000000 NA NA

Note that as only Smoking would be selected on the basis of p-value from a multivariable model, resulting in a multivariable model.

3.3.4 B-lymphocytes

3.3.4.1 Prepare data

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>%
  dplyr::select(
   `scaled_log10_naive B cells`:`scaled_log10_transitional B cells`
    )

data_predict <- data_clinical %>% dplyr::select(-id)

3.3.4.2 Run single-predictor analysis

modelac <- 'perm_specific_lymphB'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()
predictor r2 p_val
Smoking 0.0446542 0.0077984
dialysis 0.0353771 0.0227954
diabetes 0.0276732 0.0411918
ASCVD 0.0212909 0.0927814
hyperuricemia 0.0202773 0.0983803
anti_HLA 0.0189893 0.1113777
receiver_female 0.0171100 0.1535693
dyslipidemia 0.0070766 0.5236953
CMV 0.0045456 0.7064587
PRD.GN 0.0044617 0.7204559
hepatitis_b 0.0038473 0.7268546
hypertension 0.0029038 0.8392322
receiver_age 0.0025997 0.8854229

3.3.4.3 Run multivariable analyses - R2 selection


## by margin
modelac <- 'perm_specific_lymphB_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes +
                   ASCVD +
                   hyperuricemia,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 4.9076996 0.0320696 3.4901724 0.0261974
dialysis 1 2.9890295 0.0195320 2.1256860 0.1008899
diabetes 1 2.6227091 0.0171382 1.8651726 0.1283872
ASCVD 1 0.9802364 0.0064054 0.6971075 0.5213479
hyperuricemia 1 0.8173329 0.0053409 0.5812566 0.6107389
Residual 97 136.3963756 0.8912894 NA NA
Total 102 153.0326473 1.0000000 NA NA

## sequential
modelac <- 'perm_specific_lymphB_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes +
                   ASCVD +
                   hyperuricemia,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 6.8335483 0.0446542 4.8597639 0.0076992
dialysis 1 5.3788543 0.0351484 3.8252400 0.0187981
diabetes 1 2.6075522 0.0170392 1.8543936 0.1282872
ASCVD 1 0.9989839 0.0065279 0.7104400 0.5104490
hyperuricemia 1 0.8173329 0.0053409 0.5812566 0.6107389
Residual 97 136.3963756 0.8912894 NA NA
Total 102 153.0326473 1.0000000 NA NA

3.3.4.4 Run multivariable analyses - P-value selection


## by margin
modelac <- 'perm_specific_lymphB_multivariable_margin_sensP'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 5.857672 0.0382773 4.195776 0.0127987
dialysis 1 4.783796 0.0312600 3.426573 0.0286971
diabetes 1 2.607552 0.0170392 1.867757 0.1260874
Residual 99 138.212693 0.9031582 NA NA
Total 102 153.032647 1.0000000 NA NA

## sequential
modelac <- 'perm_specific_lymphB_multivariable_sequential_sensP'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes ,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
Smoking 1 6.833548 0.0446542 4.894784 0.0076992
dialysis 1 5.378854 0.0351484 3.852805 0.0179982
diabetes 1 2.607552 0.0170392 1.867757 0.1260874
Residual 99 138.212693 0.9031582 NA NA
Total 102 153.032647 1.0000000 NA NA

3.3.5 T-lymphocytes

3.3.5.1 Prepare data

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>%
  dplyr::select(
   `scaled_log10_CD4+ CM cells`:`scaled_log10_CD8+ TEMRA cells`
    )

data_predict <- data_clinical %>% dplyr::select(-id)

3.3.5.2 Run single-predictor analysis

modelac <- 'perm_specific_lymphT'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()
predictor r2 p_val
CMV 0.0503748 0.0013997
Smoking 0.0436505 0.0021996
receiver_age 0.0359540 0.0081984
ASCVD 0.0330791 0.0109978
diabetes 0.0242087 0.0495901
dyslipidemia 0.0189910 0.0979804
anti_HLA 0.0102753 0.3571286
hepatitis_b 0.0074380 0.4791042
receiver_female 0.0063772 0.6338732
PRD.GN 0.0063165 0.6354729
hyperuricemia 0.0039844 0.8436313
hypertension 0.0030739 0.8600280
dialysis 0.0013107 0.9912018

3.3.5.3 Run multivariable analyses

Note that selection on the basis of P-value would result in the same multivariable PERMANOVA as based on the R2


## by margin
modelac <- 'perm_specific_lymphT_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   CMV +
                   Smoking +
                   receiver_age +
                   ASCVD +
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
CMV 1 8.981104 0.0443558 5.100516 0.0020998
Smoking 1 4.433402 0.0218957 2.517802 0.0424958
receiver_age 1 7.217228 0.0356444 4.098782 0.0059994
ASCVD 1 2.759608 0.0136291 1.567226 0.1710829
diabetes 1 2.012991 0.0099417 1.143211 0.3149685
Residual 97 170.799801 0.8435451 NA NA
Total 102 202.478567 1.0000000 NA NA

## sequential
modelac <- 'perm_specific_lymphT_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   CMV +
                   Smoking +
                   receiver_age +
                   ASCVD +
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()
Df SumOfSqs R2 F Pr(>F)
CMV 1 10.199825 0.0503748 5.792647 0.0009999
Smoking 1 7.937709 0.0392027 4.507955 0.0026997
receiver_age 1 6.740675 0.0332908 3.828139 0.0079992
ASCVD 1 4.787567 0.0236448 2.718938 0.0347965
diabetes 1 2.012991 0.0099417 1.143211 0.3149685
Residual 97 170.799801 0.8435451 NA NA
Total 102 202.478567 1.0000000 NA NA

3.4 Correlation matrix

3.4.1 General

set.seed(483)
# Prepare table
data_matrix <- data_clinical %>% 
  dplyr::left_join(data_lymphocytes_general, by = 'id') %>% 
  dplyr::select(-id) %>% 
  cor(method = 'spearman') %>% 
  data.frame() %>% 
  dplyr::filter(
    row.names(.) %in% colnames(data_lymphocytes_general)[-1]
    ) %>% 
  dplyr::select(
    row.names(perm_specific_multivariable_sequential)[1:5],
    dialysis
    ) %>% as.matrix()

colnames(data_matrix)[1] <- 'smoking'
colnames(data_matrix)[2] <- 'CMV seropositivity'
colnames(data_matrix)[6] <- 'dialysis treatment'
colnames(data_matrix)[5] <- 'age'
colnames(data_matrix)[4] <- 'diabetes mellitus'

# Calculate distance and perform hierarchical clustering
distance_matrix2 <- dist(t(outcomes_gen))
hc2 <- hclust(distance_matrix2, method = 'average')

col_fun = colorRamp2(c(-0.5, 0, 0.5), c("blue", "white", "red"))

# Heatmap

tr <- Heatmap(data_matrix, 
        name = "Spearman rho", 
        cluster_rows = hc2, 
        show_row_dend = TRUE, 
        show_column_dend = FALSE,
        row_names_gp = gpar(fontsize = 11),
        column_names_side = "top",
        row_dend_width = unit(3.2, "cm"),
        row_split = 2,
        gap = unit(1.5, "mm"),
        row_title = NULL,
        column_title = NULL,
        row_title_rot = 0,
        row_title_gp = gpar(fontsize = 14))

print(tr)

Spearman rho correlation between selected clinical characteristics and counts of borader categories of lymphocyte sub-populations. Dendrogram shows correlated sub-populations, based on UPGMA clustering

pdf("gitignore/figures/heatmap_general.pdf", width = 5.5, height = 4)
print(tr)
dev.off()
## png 
##   2

3.4.2 Specific


# Prepare table
set.seed(483)
data_matrix <- data_clinical %>% 
  dplyr::left_join(data_lymphocytes_specific, by = 'id') %>% 
  dplyr::select(-id) %>% 
  cor(method = 'spearman') %>% 
  data.frame() %>% 
  dplyr::filter(
    row.names(.) %in% colnames(data_lymphocytes_specific)[-1]
    ) %>% 
  dplyr::select(
    row.names(perm_specific_multivariable_sequential)[1:5],
    dialysis
    ) %>% as.matrix()


colnames(data_matrix)[1] <- 'smoking'
colnames(data_matrix)[2] <- 'CMV seropositivity'
colnames(data_matrix)[6] <- 'dialysis treatment'
colnames(data_matrix)[5] <- 'age'
colnames(data_matrix)[4] <- 'diabetes mellitus'


## define outcomes as all speficif features
outcomes <- data_lymphocytes_specific_log10 %>% 
  dplyr::select(-id)

# Calculate distance and perform hierarchical clustering
distance_matrix2 <- dist(t(outcomes))
hc2 <- hclust(distance_matrix2, method = 'average')

cluster_assignment <- cutree(hc2, k = 6)

  # Assign colors per cluster
cluster_colors <- c("darkblue", "red4", "darkgreen", "grey40", "orange3", "purple3")
cluster_colors <- c("darkblue", "red4", "purple3", "grey40", "orange3", "darkgreen")

# Map colors to rows based on cluster
label_colors <- cluster_colors[cluster_assignment]

tr <- Heatmap(data_matrix, 
        name = "Spearman rho", 
        cluster_rows = hc2, 
        show_row_dend = TRUE, 
        show_column_dend = FALSE,
        row_names_gp = gpar(fontsize = 11, col = label_colors),
        column_names_side = "top",
        row_dend_width = unit(3.2, "cm"),
        row_split = 6,
        gap = unit(1.5, "mm"),
        row_title = NULL,
        column_title = NULL,
        row_title_rot = 0,
        row_title_side = c("right"),
        row_title_gp = gpar(fontsize = 14, col = label_colors),
        col = col_fun)

print(tr)

Matrix of Spearman correlation coefficients between clinical characteristics and counts of various peripheral blood immune cell subpopulations. The dendrogram illustrates similarities in cell counts, where closely linked tips indicate that the associated lymphocyte subpopulations are interrelated, with higher counts tending to co-occur. Correlations with an entire lymphocyte cluster suggest a specific influence (e.g., CMV infection), while correlations with individual subpopulations across different clusters indicate a more diffuse effect (e.g., smoking). The dendrogram is constructed using the unweighted pair group method with arithmetic mean (UPGMA) clustering. Clusters of subpopulations (as defined by UPGMA) are as follows: (1, grey) plasmablast; (2, red) plasmacytoid dendritic cells (pDC); (3, purple) B cells – including naive B cells, transitional B cells, class-switched memory cells, marginal-zone B cells, and class non-switched memory cells; (4, blue) myeloid dendritic cells (mDC1, mDC2, mDC3); (5, green) naive T cells – CD4+ and CD8+; and (6, orange) memory T cells – including CD4+ and CD8+ TEMRA, effector memory (EM), and central memory (CM) subsets.

pdf("gitignore/figures/heatmap_specific.pdf", width = 6, height = 6)
tr
dev.off()
## png 
##   2

3.5 Univariate analysis

3.5.1 Data

Lets to log10-transform data but not standardize to express log-FC. Connect to relevant clinical data

data_lymphocytes_specific_counts <- data_lymphocytes_specific %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),
    .names = "count_{.col}" 
  )) %>% 
  dplyr::select(-c(`mDC1` : `CD8+ TEMRA cells`)) %>% 
  dplyr::mutate(dplyr::across(
    dplyr::where(is.numeric), replace_zeros
    )) %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),
    .fns = ~ round(.x * 1e6) 
  ))
  

data_glm <-  data_clinical %>% 
  dplyr::mutate(receiver_age_30y = receiver_age/30) %>% 
  dplyr::select(
    id, Smoking, CMV, receiver_age_30y, dialysis, ASCVD
    ) %>% 
  dplyr::left_join(data_lymphocytes_specific_counts, by = 'id')

3.5.2 Running models

3.5.2.1 GLM


modelac <- "glm_ranint_poisson"
namesy <- names(data_lymphocytes_specific_counts)[-1]

assign(modelac, run(
  expr = glm_iori_poisson(
    outcomes = namesy,
    data = data_glm
  ),
  path = paste0("gitignore/run/", modelac),
  reuse = TRUE
))

glm_ranint_poisson <- glm_ranint_poisson %>%
  dplyr::mutate(
    Smoking_P_val_adj = p.adjust(Smoking_P_val, method = "BH"),
    cmv_P_val_adj = p.adjust(cmv_P_val, method = "BH"),
    receiver_age_30y_P_val_adj = p.adjust(
      receiver_age_30y_P_val,
      method = "BH"
    ),
    dialysis_P_val_adj = p.adjust(dialysis_P_val, method = "BH"),
    ascvd_P_val_adj = p.adjust(
      ascvd_P_val,
      method = "BH"
    )
  ) %>%
  dplyr::select(
    Outcome,
    Smoking_logFC, Smoking_P_val, Smoking_P_val_adj,
    cmv_logFC, cmv_P_val, cmv_P_val_adj,
    receiver_age_30y_logFC, receiver_age_30y_P_val, receiver_age_30y_P_val_adj,
    dialysis_logFC, dialysis_P_val, dialysis_P_val_adj,
    ascvd_logFC, ascvd_P_val, ascvd_P_val_adj
  )

glm_ranint_poisson_CR <- glm_ranint_poisson %>%
  dplyr::mutate(
    Smoking_CR = round(exp(Smoking_logFC), 2),
    cmv_CR = round(exp(cmv_logFC), 2),
    receiver_age_30y_CR = round(exp(receiver_age_30y_logFC), 2),
    dialysis_CR = round(exp(dialysis_logFC), 2),
    ascvd_CR = round(exp(ascvd_logFC), 2)
  ) %>%
  dplyr::select(
    Outcome,
    Smoking_CR, Smoking_P_val, Smoking_P_val_adj,
    cmv_CR, cmv_P_val, cmv_P_val_adj,
    receiver_age_30y_CR,
    receiver_age_30y_P_val, receiver_age_30y_P_val_adj,
    dialysis_CR, dialysis_P_val, dialysis_P_val_adj,
    ascvd_CR, ascvd_P_val, ascvd_P_val_adj
  )

kableExtra::kbl(glm_ranint_poisson_CR,
  caption = "Results of generalized linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `CR` suffix indicates the count ratio, representing the fold expected change in the outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values coCRected for multiple testing using the Benjamini-Hochberg coCRection, which accounts for the repeated estimation of each predictors effect across 18 outcomes."
)
Results of generalized linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `CR` suffix indicates the count ratio, representing the fold expected change in the outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values coCRected for multiple testing using the Benjamini-Hochberg coCRection, which accounts for the repeated estimation of each predictors effect across 18 outcomes.
Outcome Smoking_CR Smoking_P_val Smoking_P_val_adj cmv_CR cmv_P_val cmv_P_val_adj receiver_age_30y_CR receiver_age_30y_P_val receiver_age_30y_P_val_adj dialysis_CR dialysis_P_val dialysis_P_val_adj ascvd_CR ascvd_P_val ascvd_P_val_adj
count_mDC1_absp 1.45 0.0821632 0.1643264 0.73 0.1668553 0.7508491 0.75 0.1833661 0.5500984 1.22 0.3374549 0.7592735 1.44 0.1379511 0.3547313
count_mDC2_absp 0.66 0.0451806 0.1643264 1.00 0.9974734 0.9974734 1.26 0.2768216 0.7118270 1.14 0.5268359 0.8797687 1.53 0.0719346 0.2589644
count_mDC3_absp 1.12 0.6498684 0.6498684 0.93 0.7861076 0.8323492 0.62 0.0519317 0.3115902 1.35 0.1998693 0.7592735 1.57 0.1022325 0.3066975
count_pDC_absp 1.50 0.0631027 0.1643264 1.12 0.6441707 0.7829139 1.02 0.9413153 0.9966868 0.85 0.4483543 0.8797687 0.80 0.3606195 0.4993193
count_naiveB_absp 1.42 0.1292595 0.1938893 1.13 0.6284973 0.7829139 0.87 0.5432927 0.8994371 1.24 0.3345707 0.7592735 1.39 0.2186927 0.3936468
count_marginal.B 1.61 0.0436430 0.1643264 0.74 0.2367602 0.7829139 1.14 0.5746919 0.8994371 2.26 0.0003625 0.0065243 0.89 0.6576413 0.6847042
count_non_switched_memory 1.65 0.0783700 0.1643264 0.89 0.6959235 0.7829139 1.28 0.3902863 0.7805726 1.49 0.1465900 0.7592735 1.26 0.4782797 0.5739357
count_switched.memory 2.35 0.0009759 0.0175665 1.20 0.5081204 0.7829139 0.97 0.9028493 0.9966868 1.10 0.6922724 0.8900645 1.31 0.3557353 0.4993193
count_Plazmablasty_absp 1.49 0.1913251 0.2459894 0.83 0.5634339 0.7829139 0.65 0.1578373 0.5500984 1.42 0.2319909 0.7592735 1.56 0.2043473 0.3936468
count_Transitional_absp 1.43 0.2577554 0.2899748 1.16 0.6726890 0.7829139 0.92 0.8037449 0.9966868 1.55 0.1544323 0.7592735 1.45 0.3081146 0.4993193
count_CD.4.CM 1.31 0.0565256 0.1643264 0.89 0.4309489 0.7829139 0.95 0.7428517 0.9966868 0.98 0.8587102 0.9349747 1.49 0.0120619 0.1884576
count_CD.4.EM 1.43 0.0671297 0.1643264 1.23 0.3251786 0.7829139 0.90 0.5996248 0.8994371 0.83 0.3318537 0.7592735 1.62 0.0314096 0.1884576
count_CD4_naive_absp 1.34 0.1034590 0.1862263 0.69 0.0570816 0.3424894 0.61 0.0081360 0.0732243 1.01 0.9710655 0.9710655 1.32 0.1833883 0.3936468
count_CD4_TEMRA_absp 1.71 0.1639893 0.2270621 5.79 0.0000286 0.0002574 1.04 0.9299784 0.9966868 0.85 0.6662095 0.8900645 1.20 0.6847042 0.6847042
count_CD8_CM_absp 1.24 0.2954203 0.3127980 1.16 0.4986506 0.7829139 0.81 0.3262163 0.7339866 0.97 0.8830317 0.9349747 1.69 0.0255606 0.1884576
count_CD8_EM_absp 1.36 0.2256127 0.2707352 1.21 0.4844465 0.7829139 0.70 0.1650548 0.5500984 0.87 0.5702644 0.8797687 1.70 0.0705542 0.2589644
count_CD.8.naive 1.55 0.0131338 0.1182038 1.21 0.3135782 0.7829139 0.34 0.0000000 0.0000001 0.91 0.5865124 0.8797687 1.12 0.5710413 0.6424214
count_CD8_TEMRA_absp 1.44 0.1138749 0.1863407 3.01 0.0000117 0.0002106 1.00 0.9986771 0.9986771 1.08 0.7462042 0.8954450 1.24 0.4163171 0.5352649

3.5.2.2 LM with log-transformed outcomes


modelac <- "lm_lognormal_res"
namesy <- names(data_lymphocytes_specific_counts)[-1]

assign(modelac, run(
  expr = lm_lognormal(
    outcomes = namesy,
    data = data_glm
  ),
  path = paste0("gitignore/run/", modelac),
  reuse = TRUE
))

lm_lognormal_res <- lm_lognormal_res %>%
  dplyr::mutate(
    Smoking_P_val_adj = p.adjust(Smoking_P_val, method = "BH"),
    cmv_P_val_adj = p.adjust(cmv_P_val, method = "BH"),
    receiver_age_30y_P_val_adj = p.adjust(
      receiver_age_30y_P_val,
      method = "BH"
    ),
    dialysis_P_val_adj = p.adjust(dialysis_P_val, method = "BH"),
    ascvd_P_val_adj = p.adjust(
      ascvd_P_val,
      method = "BH"
    )
  ) %>%
  dplyr::select(
    Outcome,
    Smoking_logFC, Smoking_P_val, Smoking_P_val_adj,
    cmv_logFC, cmv_P_val, cmv_P_val_adj,
    receiver_age_30y_logFC, receiver_age_30y_P_val, receiver_age_30y_P_val_adj,
    dialysis_logFC, dialysis_P_val, dialysis_P_val_adj,
    ascvd_logFC, ascvd_P_val, ascvd_P_val_adj
  ) 

kableExtra::kbl(lm_lognormal_res,
  caption = "Results of linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `logFC` suffix indicates effect size, representing the expected change in the log-transformed outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values corrected for multiple testing using the Benjamini-Hochberg correction, which accounts for the repeated estimation of each predictors effect across 18 outcomes."
)
Results of linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `logFC` suffix indicates effect size, representing the expected change in the log-transformed outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values corrected for multiple testing using the Benjamini-Hochberg correction, which accounts for the repeated estimation of each predictors effect across 18 outcomes.
Outcome Smoking_logFC Smoking_P_val Smoking_P_val_adj cmv_logFC cmv_P_val cmv_P_val_adj receiver_age_30y_logFC receiver_age_30y_P_val receiver_age_30y_P_val_adj dialysis_logFC dialysis_P_val dialysis_P_val_adj ascvd_logFC ascvd_P_val ascvd_P_val_adj
count_mDC1 0.3711239 0.0948305 0.1896610 -0.3197519 0.1829036 0.7934812 -0.2887731 0.1996895 0.5990684 0.1989222 0.3541938 0.7969361 0.3626161 0.1531878 0.3939114
count_mDC2 -0.4144419 0.0548502 0.1896610 0.0007097 0.9975553 0.9975553 0.2289242 0.2935603 0.7548693 0.1272455 0.5404066 0.8982470 0.4264018 0.0838837 0.3019814
count_mDC3 0.1090727 0.6605534 0.6605534 -0.0706402 0.7928688 0.8395081 -0.4747767 0.0622559 0.3735356 0.2991869 0.2164798 0.7969361 0.4495573 0.1160404 0.3481212
count_pDC 0.4070755 0.0744107 0.1896610 0.1096121 0.6549940 0.7934812 0.0163910 0.9431970 0.9986792 -0.1612947 0.4636596 0.8982470 -0.2292577 0.3771888 0.5222614
count_naive B cells 0.3523952 0.1442311 0.2163467 0.1217758 0.6397513 0.7934812 -0.1435178 0.5567140 0.9174456 0.2177035 0.3514220 0.7969361 0.3271442 0.2355266 0.4239479
count_marginal-zone B cells 0.4748985 0.0531207 0.1896610 -0.3017388 0.2537325 0.7934812 0.1342758 0.5873032 0.9174456 0.8152768 0.0008029 0.0144526 -0.1194329 0.6681030 0.6944102
count_class non-switched memory cells 0.4990383 0.0908003 0.1896610 -0.1200495 0.7053166 0.7934812 0.2475932 0.4064960 0.8129919 0.3997357 0.1621042 0.7969361 0.2301580 0.4930265 0.5916318
count_class switched memory cells 0.8528785 0.0018573 0.0334315 0.1854228 0.5223059 0.7934812 -0.0320384 0.9061113 0.9986792 0.0994278 0.7017471 0.9022463 0.2734379 0.3724143 0.5222614
count_plasmablast 0.3972205 0.2078180 0.2671946 -0.1902894 0.5763191 0.7934812 -0.4364630 0.1736525 0.5990684 0.3529092 0.2489292 0.7969361 0.4417628 0.2210217 0.4239479
count_transitional B cells 0.3571943 0.2748097 0.3091609 0.1444667 0.6827302 0.7934812 -0.0797304 0.8099440 0.9986792 0.4365448 0.1701619 0.7969361 0.3682896 0.3250975 0.5222614
count_CD4+ CM cells 0.2662910 0.0672589 0.1896610 -0.1191345 0.4466116 0.7934812 -0.0465784 0.7508338 0.9986792 -0.0241084 0.8633987 0.9389245 0.4013788 0.0166557 0.2364102
count_CD4+ EM cells 0.3600509 0.0787413 0.1896610 0.2096334 0.3420507 0.7934812 -0.1049344 0.6116304 0.9174456 -0.1853401 0.3487012 0.7969361 0.4844738 0.0394017 0.2364102
count_CD4+ naive cells 0.2956169 0.1173124 0.2098389 -0.3742337 0.0678888 0.4073327 -0.4883709 0.0117519 0.1057674 0.0064000 0.9719739 0.9719739 0.2765180 0.1997567 0.4239479
count_CD4+ TEMRA cells 0.5389918 0.1799550 0.2491685 1.7557156 0.0000992 0.0008931 0.0345931 0.9322192 0.9986792 -0.1622497 0.6764319 0.9022463 0.1800379 0.6944102 0.6944102
count_CD8+ CM cells 0.2149534 0.3124541 0.3308338 0.1506054 0.5129856 0.7934812 -0.2050316 0.3430901 0.7719528 -0.0293573 0.8867620 0.9389245 0.5251825 0.0326982 0.2364102
count_CD8+ EM cells 0.3106208 0.2424994 0.2909993 0.1941851 0.4990655 0.7934812 -0.3617595 0.1810423 0.5990684 -0.1413279 0.5829951 0.8982470 0.5307058 0.0824452 0.3019814
count_CD8+ naive cells 0.4384032 0.0179858 0.1618722 0.1930028 0.3305317 0.7934812 -1.0653236 0.0000001 0.0000018 -0.0933802 0.5988313 0.8982470 0.1146529 0.5837404 0.6567079
count_CD8+ TEMRA cells 0.3664244 0.1282349 0.2098389 1.1007013 0.0000485 0.0008731 -0.0003711 0.9987836 0.9987836 0.0728481 0.7541683 0.9050020 0.2157022 0.4321354 0.5556027

3.5.3 Volcano plot

General setting and getting legend


glm_ranint_poisson$Outcome <- c('mDC1', 'mDC2', 'mDC3', 'pDC', 
  'naive B cells', 'marginal-zone B cells',
  'class non-switched memory cells', 'class switched memory cells', 
  'plasmablast', 'transitional B cells', 
  'CD4+ CM cells', 'CD4+ EM cells', 'CD4+ naive cells', 'CD4+ TEMRA cells', 
  'CD8+ CM cells', 'CD8+ EM cells', 'CD8+ naive cells', 'CD8+ TEMRA cells')

color_map <- c('Increased' = '#A83226', 
               'Decreased' = '#2A5CA8', 
               'FDR>0.05' = 'grey20') 


dummy_data <- data.frame(
  Count = factor(c("Increased", "Decreased", "FDR>0.05"), levels = c("Increased", "Decreased", "FDR>0.05")),
  x = 1:3,
  y = 1:3
)

dummy_plot <- ggplot(dummy_data, aes(x, y, color = Count)) +
  geom_point() +
  scale_color_manual(values = color_map) +
  theme(legend.position = "bottom",
        legend.text = element_text(size = 14),
        legend.title = element_text(size = 14))

get_legend <- function(my_ggplot) {
  tmp <- ggplot_gtable(ggplot_build(my_ggplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}

common_legend <- get_legend(dummy_plot)

3.5.3.1 Smoking

plotac <- 'fig1'
predictor <- 'Smoking'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             pv_adj = if_else(pv_adj < 1e-6, 1e-6, pv_adj),
             pv_raw = if_else(pv_raw < 1e-6, 1e-6, pv_raw)) %>%

           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

           geom_text_repel(aes(
             x = m_xlab,
             y = m_ylab),
             show.legend = FALSE,
             alpha = 1,
             box.padding = 0.75,
             size = 3.7,
             seed = 17) +

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('smoking') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.2 Cytomegalovirus

plotac <- 'fig2'
predictor <- 'cmv'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             pv_adj = if_else(pv_adj < 1e-6, 1e-6, pv_adj),
             pv_raw = if_else(pv_raw < 1e-6, 1e-6, pv_raw)) %>%

           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

           geom_text_repel(aes(
             x = m_xlab,
             y = m_ylab),
             show.legend = FALSE,
             alpha = 1,
             box.padding = 0.6,
             size = 4.1,
             seed = 17) +

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('CMV seropositivity') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.3 Receiver age (effect of 30 years plus)

plotac <- "fig3"
predictor <- "receiver_age_30y"

path <- paste0("gitignore/figures/", plotac, "_", predictor, ".rds")

if (file.exists(path) == FALSE) {
  assign(plotac, glm_ranint_poisson %>%
    dplyr::select(Outcome, contains(predictor)) %>%
    dplyr::mutate(
      pv_adj = get(paste0(predictor, "_P_val_adj")),
      pv_raw = get(paste0(predictor, "_P_val")),
      b_coe = get(paste0(predictor, "_logFC")),
    ) %>%
    dplyr::mutate(
      m_ylab = -log10(pv_raw),
      m_xlab = b_coe,
      rescat = if_else(
        pv_adj >= 0.05,
        "FDR>0.05",
        if_else(
          b_coe > 0,
          "Increased",
          "Decreased"
        )
      ),
      m_ylab = if_else(m_ylab > 6, 6, m_ylab)
    ) %>%
    mutate(
      rescat = factor(rescat,
        levels = c("Increased", "Decreased", "FDR>0.05")
      ),
      meta_labels = if_else(pv_raw < 0.05, Outcome, "")
    ) %>%
    mutate(
      meta_labels = str_remove(meta_labels, "count_")
    ) %>%
    ggplot(aes(
      x = b_coe,
      y = m_ylab,
      color = rescat,
      alpha = rescat,
      label = meta_labels
    )) +
    geom_point(size = 2, shape = 16, alpha = 0.6) +
    geom_vline(
      xintercept = 0,
      linetype = "dashed",
      linewidth = 0.2,
      color = "grey30"
    ) +
    geom_text_repel(
      aes(
        x = m_xlab,
        y = m_ylab
      ),
      show.legend = FALSE,
      alpha = 1,
      box.padding = 0.7,
      size = 4.1,
      seed = 17
    ) +
    labs(x = "log(fold count increase)", y = "-log10 (P-value)") +
    scale_color_manual(values = color_map, name = "") +
    ggtitle("age (30y)") +
    theme(
      legend.text = element_text(size = 14),
      axis.title = element_text(size = 14)
    ) +
    coord_cartesian(
      ylim = c(0, 6),
      xlim = c(-1.8, 1.8)
    ))

  saveRDS(get(plotac), file = path)
  get(plotac)
} else {
  assign(plotac, readRDS(path))
  get(plotac)
}

3.5.3.4 Dialysis

plotac <- 'fig4'
predictor <- 'dialysis'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             m_ylab = if_else(m_ylab > 6, 6, m_ylab)
             ) %>%
           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

             geom_text(aes(
              x = m_xlab - 0.3,
              y = m_ylab - 0.15, 
              label = meta_labels),
              size = 4.1,
              hjust = 0.5,  
              vjust = 1,
              alpha = 1) +  

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('dialysis treatment') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.5 ASCVD

plotac <- 'fig5'
predictor <- 'ascvd'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             m_ylab = if_else(m_ylab > 6, 6, m_ylab)
             ) %>%
           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

           geom_text_repel(aes(
             x = m_xlab,
             y = m_ylab),
             show.legend = FALSE,
             alpha = 1,
             box.padding = 0.7,
             size = 4.1,
             seed = 17) +

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('ASCVD') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.6 Volcans matrix

plotac <- "fig6"
predictor <- "volcans_matrix"

path <- paste0("gitignore/figures/", plotac, "_", predictor, ".rds")

if (!file.exists(path)) {
  # Remove legends from individual plots
  fig1 <- fig1 + theme(legend.position = "none")
  fig2 <- fig2 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )
  fig3 <- fig3 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )
  fig4 <- fig4 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )
  fig5 <- fig5 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

  combined_plot <- ggarrange(
    fig1,
    fig2,
    fig3,
    fig4,
    fig5,
    nrow = 1,
    ncol = 5,
    widths = c(1.1, 1, 1, 1, 1)
  )

  final_plot <- ggarrange(combined_plot,
    common_legend,
    ncol = 1,
    align = "hv",
    heights = c(10, 1)
  )

  saveRDS(final_plot, file = path)

  ggsave(
    filename = paste0("gitignore/figures/", plotac, "_", predictor),
    device = "pdf",
    height = 5,
    width = 11.5
  )

  final_plot
} else {
  assign(plotac, readRDS(path))
  get(plotac)
}

The volcano plot illustrates the effects of clinical characteristics on the counts of various peripheral blood immune cell subpopulations, estimated using a generalized linear model (GLM) with a Poisson distribution. The x-axis represents the effect size of each predictor on the outcomes. For example, an effect size of 0.85 for smoking on switched memory lymphocytes suggests that smoking is associated with a 2.33-fold increase in the count of these cells, as determined by exp(0.85). The y-axis displays the −log⁡10(P-value) for each association, before P-value adjustment. Only associations with unadjusted P-values below 0.05 are labeled.

4 Reproducibility

sessionInfo()
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=cs_CZ.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=cs_CZ.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=cs_CZ.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Prague
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggrepel_0.9.5         glmmTMB_1.1.9         ComplexHeatmap_2.20.0
##  [4] circlize_0.4.16       vegan_2.6-6.1         lattice_0.22-5       
##  [7] permute_0.9-7         ggbeeswarm_0.6.0      quantreg_5.98        
## [10] SparseM_1.81          coxed_0.3.3           survival_3.7-0       
## [13] rms_6.8-1             Hmisc_5.1-3           bayesplot_1.8.1      
## [16] ggdist_3.3.2          kableExtra_1.4.0      lubridate_1.8.0      
## [19] corrplot_0.92         arm_1.12-2            lme4_1.1-35.5        
## [22] MASS_7.3-65           janitor_2.2.0         brms_2.21.0          
## [25] Rcpp_1.0.13           glmnet_4.1-8          Matrix_1.7-0         
## [28] boot_1.3-31           cowplot_1.1.1         pROC_1.18.0          
## [31] mgcv_1.9-1            nlme_3.1-168          openxlsx_4.2.8       
## [34] flextable_0.9.6       sjPlot_2.8.16         RJDBC_0.2-10         
## [37] rJava_1.0-6           DBI_1.1.2             car_3.1-2            
## [40] carData_3.0-5         gtsummary_2.0.2       emmeans_1.10.4       
## [43] ggpubr_0.4.0          stringi_1.7.6         forcats_1.0.0        
## [46] stringr_1.5.1         dplyr_1.1.4           purrr_1.0.2          
## [49] readr_2.1.2           tidyr_1.3.1           tibble_3.2.1         
## [52] ggplot2_3.5.1         tidyverse_1.3.1      
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.4                matrixStats_1.3.0       httr_1.4.2             
##   [4] RColorBrewer_1.1-2      insight_0.20.2          doParallel_1.0.17      
##   [7] repr_1.1.7              numDeriv_2016.8-1.1     tools_4.4.3            
##  [10] backports_1.5.0         sjlabelled_1.2.0        utf8_1.2.4             
##  [13] R6_2.5.1                GetoptLong_1.0.5        withr_3.0.1            
##  [16] Brobdingnag_1.2-7       prettyunits_1.1.1       gridExtra_2.3          
##  [19] cli_3.6.3               textshaping_0.3.6       performance_0.12.2     
##  [22] gt_0.11.0               Cairo_1.6-2             officer_0.6.6          
##  [25] sandwich_3.0-1          labeling_0.4.2          sass_0.4.9             
##  [28] mvtnorm_1.1-3           polspline_1.1.25        ggridges_0.5.3         
##  [31] askpass_1.1             QuickJSR_1.3.1          commonmark_1.9.1       
##  [34] systemfonts_1.0.4       StanHeaders_2.32.10     foreign_0.8-90         
##  [37] gfonts_0.2.0            svglite_2.1.3           readxl_1.3.1           
##  [40] rstudioapi_0.16.0       httpcode_0.3.0          generics_0.1.3         
##  [43] shape_1.4.6             vroom_1.5.7             distributional_0.4.0   
##  [46] zip_2.2.0               inline_0.3.19           loo_2.4.1              
##  [49] fansi_1.0.6             S4Vectors_0.42.1        abind_1.4-5            
##  [52] lifecycle_1.0.4         multcomp_1.4-18         yaml_2.3.5             
##  [55] snakecase_0.11.1        promises_1.2.0.1        crayon_1.5.0           
##  [58] haven_2.4.3             magick_2.7.3            pillar_1.9.0           
##  [61] knitr_1.48              rjson_0.2.21            estimability_1.5.1     
##  [64] codetools_0.2-19        glue_1.7.0              V8_4.4.2               
##  [67] fontLiberation_0.1.0    data.table_1.15.4       vctrs_0.6.5            
##  [70] png_0.1-7               cellranger_1.1.0        gtable_0.3.0           
##  [73] assertthat_0.2.1        datawizard_0.12.2       xfun_0.46              
##  [76] mime_0.12               coda_0.19-4             iterators_1.0.14       
##  [79] ellipsis_0.3.2          TH.data_1.1-0           bit64_4.0.5            
##  [82] fontquiver_0.2.1        rstan_2.32.6            tensorA_0.36.2.1       
##  [85] TMB_1.9.14              vipor_0.4.5             rpart_4.1.24           
##  [88] colorspace_2.0-2        BiocGenerics_0.50.0     nnet_7.3-20            
##  [91] tidyselect_1.2.1        processx_3.8.4          bit_4.0.4              
##  [94] compiler_4.4.3          curl_4.3.2              rvest_1.0.2            
##  [97] htmlTable_2.4.0         xml2_1.3.3              fontBitstreamVera_0.1.1
## [100] posterior_1.6.0         checkmate_2.3.2         scales_1.3.0           
## [103] callr_3.7.6             digest_0.6.37           minqa_1.2.4            
## [106] rmarkdown_2.27          htmltools_0.5.8.1       pkgconfig_2.0.3        
## [109] base64enc_0.1-3         highr_0.11              dbplyr_2.1.1           
## [112] fastmap_1.2.0           rlang_1.1.4             GlobalOptions_0.1.2    
## [115] htmlwidgets_1.6.4       shiny_1.9.1             farver_2.1.0           
## [118] zoo_1.8-9               jsonlite_1.8.8          magrittr_2.0.3         
## [121] Formula_1.2-4           munsell_0.5.0           gdtools_0.3.7          
## [124] plyr_1.8.6              pkgbuild_1.3.1          parallel_4.4.3         
## [127] sjmisc_2.8.10           ggeffects_1.7.0         splines_4.4.3          
## [130] hms_1.1.1               sjstats_0.19.0          ps_1.7.7               
## [133] uuid_1.0-3              markdown_1.13           ggsignif_0.6.3         
## [136] stats4_4.4.3            rstantools_2.1.1        crul_1.5.0             
## [139] reprex_2.0.1            evaluate_1.0.0          RcppParallel_5.1.8     
## [142] modelr_0.1.8            nloptr_2.0.0            tzdb_0.2.0             
## [145] foreach_1.5.2           httpuv_1.6.5            cards_0.2.2            
## [148] MatrixModels_0.5-3      openssl_1.4.6           clue_0.3-65            
## [151] cardx_0.2.1             skimr_2.1.5             broom_1.0.6            
## [154] xtable_1.8-4            rstatix_0.7.0           later_1.3.0            
## [157] viridisLite_0.4.0       ragg_1.2.1              beeswarm_0.4.0         
## [160] IRanges_2.38.1          cluster_2.1.8.1         bridgesampling_1.1-2

References

Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society Series B: Statistical Methodology 57 (1): 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x.
Brooks, Mollie E., Kasper Kristensen, Koen J. van, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler, and Benjamin M. Bolker. 2017. glmmTMB Balances Speed and Flexibility Among Packages for Zero-Inflated Generalized Linear Mixed Modeling” 9. https://doi.org/10.32614/RJ-2017-066.
Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” https://doi.org/10.1093/bioinformatics/btw313.
Oksanen, Jari, Gavin L. Simpson, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre, Peter R. Minchin, R. B. O’Hara, et al. 2024. “Vegan: Community Ecology Package.” https://CRAN.R-project.org/package=vegan.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.