In depth flow cytometry characterization of the peripheral blood immune cell compartment composition and its association with pro-inflammatory factors in patients with stage 5 chronic kidney disease

Authors and affiliations:

Ivan Zahradka¹, Filip Tichanek², Lucie Majercikova³, Jana Douskova³, Mathias Streitz⁴, Stephan Schlickeiser⁵, Klara Osickova¹, Vojtech Petr¹, Petra Hruba³, Ondrej Viklicky^1,3

¹ Department of Nephrology, Institute for Clinical and Experimental Medicine, Czech Republic
² Department of Data Science, Institute for Clinical and Experimental Medicine, Czech Republic
³ Transplantation Laboratory, Institute for Clinical and Experimental Medicine, Prague, Czech Republic
⁴ Department of Experimental Animal Facilities and Biorisk Management, Federal Research Institute for Animal Health, Greifswald-Insel Riems, Germany
⁵ Berlin Institute of Health Center of Regenerative Therapies (BRCT), Berlin Institute of Health at Charité, Berlin, Germany

This is a statistical report of the study In depth flow cytometry characterization of the peripheral blood immune cell compartment composition and its association with pro-inflammatory factors in patients with stage 5 chronic kidney disease that under review in the Frontiers in Immunulogy

When using this code or data, cite the original publication:

TO BE ADDED

BibTex citation for the original publication:

TO BE ADDED

Original GitHub repository: https://github.com/filip-tichanek/https://github.com/filip-tichanek/rTX_lymphocytes

Statistical code with all details of statistical analysis is shown a HTML report.

Original data are available here

1 Introduction

1.1 Statistical methodology

All analyses were run in R, version 4.1 (R Core Team 2024).

Initially, we noted that only 3 patients had repeated renal transplantation. Their lymphocyte counts were noticeably different from other patients, so we excluded these re-transplanted patients from all analyses.

First, we performed single-variable Permutational Analyses of Variance (PERMANOVAs) using the ‘Vegan’ R package (Oksanen et al. 2024) to identify which factors explain the largest portion of variance in the multivariate lymphocyte composition when considered as single predictors. We then selected those with \(R^2 > 2\) and included them in a multivariable PERMANOVA, in a sequential manner, ordered by \(R^2\) (the most explanatory factor first). This analysis was conducted primarily with all ‘specific’ data (the most detailed sub-populations of lymphocytes available), but also with more general data (broader categories of lymphocyte populations), and subsets of specific data of similar types (dendritic cells, B-lymphocytes, T-lymphocytes). As a sensitivity analysis, we repeated the construction of the multivariable PERMANOVA, using a p-value threshold (< 0.05) in the univariable models for variable selection, in cases where this yielded a different set of predictors than selection based on \(R^2\). All PERMANOVAs were run with Euclidean distances after log10-transformation and z-standardization of the lymphocyte counts, using 5000 permutations.

Predictors (clinical characteristics) that showed a statistically significant effect in any of the multivariable PERMANOVAs were then used in a series of generalized linear models, fitted with ‘glmmTMB’ package (Brooks et al. 2017). Although variable selection for these PERMANOVAs was primarily based on \(R^2\), the same set of covariates would have been selected if PERMANOVA variable selection had been based on p-values < 0.05. These models predicted the count of a specific lymphocyte sub-population assuming a Poisson distribution, with observation-specific random intercepts to account for overdispersion. The selected predictors were: Smoking, cmv (past Cytomegalovirus infection), receiver_age_30y (receiver age scaled so that a 30-year increase represents a one-unit change), Dialysis, and ascvd (Atherosclerotic Cardiovascular Disease). P-values for each predictor were corrected using the Benjamini-Hochberg correction (FDR) for multiple comparisons (Benjamini and Hochberg 1995), accounting for the repeated estimation of the effect of each predictor across 18 lymphocyte sub-populations. The results were visualized using a volcano plot, showing the estimated effect of the predictor on the X axis (as estimated log-fold difference in count when the predictor value increases by 1 unit) and the -log10(P-value) on the Y axis (the higher the value, the stronger the evidence of the effect direction). As a sensitivity analysis, we repeated the same procedure using linear model with log-transformed counts as outcomes.

Further, we also show Spearman’s rank correlation between lymphocyte sub-population and the selected patients characteristics, visualized with ‘ComplexHeatmap’ R package (Gu, Eils, and Schlesner 2016). For the visualization, lymphocyte sub-populations were clustered based on similarity in distribution of their abundance using UPGMA.

2 Analysis

2.1 Initialization

2.1.1 Packages

if (TRUE) {rm(list = ls() )}
if (TRUE) {
  suppressWarnings(suppressMessages({
    library(tidyverse)
    library(stringr)
    library(stringi)
    library(ggpubr)
    library(emmeans)
    library(gtsummary)
    library(car)
    library(RJDBC)
    library(sjPlot)
    library(flextable)
    library(openxlsx)
    library(mgcv)
    library(pROC)
    library(cowplot)
    library(boot)
    library(glmnet)
    library(brms)
    library(janitor)
    library(arm)
    library(corrplot)
    library(lubridate)
    library(kableExtra)    
    library(ggdist)
    library(bayesplot)
    library(coxed)
    library(quantreg)
    library(ggbeeswarm)    
    library(vegan)
    library(circlize)
    library(ComplexHeatmap)
    library(glmmTMB)
    library(ggrepel)
  }))
}

2.1.2 Functions

setwd('~/1_ticf_sec/483_ZAHI_lymphocytes')
source('483_functions.R')

2.1.3 Directories

folders <- c("data", 
             "gitignore",
             "gitignore/run",
             "gitignore/figures",
             "gitignore/outputs",
             "gitignore/data")

invisible(
  lapply(folders, function(x) if (!dir.exists(x)) 
    dir.create(x, recursive = TRUE)
    )
  )

2.1.4 Seed

set.seed(16)

2.2 Data

2.2.1 Import and wrangling

Import

if (!file.exists("data/data_all.csv")) {
  data_excel_all <- read.xlsx("gitignore/data/FACS_lymfocyty_filip2.xlsx") %>%
    dplyr::mutate(
      id = as.character(1:nrow(.)),
      group = factor(group)
    ) %>%
    dplyr::rename(
    `non_switched_memory` = `non-switched.memory`) %>% 
    dplyr::select(
      -`rodne_cislo`, -`Jmeno`,
      -`Sekce.klinické.proměnné`,
      -`sekce.dendritické.buňky`,
      -`sekce.B-lymfocyty`,
      -`Sekce.T-lymfocyty`
    ) %>%
    dplyr::select(id, everything())

  readr::write_csv(data_excel_all, "data/data_all.csv")
}

data_excel_all <- readr::read_csv("data/data_all.csv") %>%
  mutate(group = factor(group),
         id = as.character(id))
## Rows: 136 Columns: 40
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): group
## dbl (39): id, Diabetes, PRD.GN, Smoking, hep.B, ASCVD, lecena.hypertenze, le...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


data_avail <- data_excel_all %>% 
  dplyr::rename(
    mDC1 = mDC1_absp,
    mDC2 = mDC2_absp,
    mDC3 = mDC3_absp,
    pDC = pDC_absp,
    `naive B cells` = naiveB_absp,
    `marginal-zone B cells` = marginal.B,
    `class non-switched memory cells` = `non_switched_memory`,
    `class switched memory cells` = switched.memory,
    plasmablast = Plazmablasty_absp,
    `transitional B cells` = Transitional_absp,
    `CD4+ CM cells` = CD.4.CM,
    `CD4+ EM cells` = CD.4.EM,
    `CD4+ naive cells` = CD4_naive_absp,
    `CD4+ TEMRA cells` = CD4_TEMRA_absp,
    `CD8+ CM cells` = CD8_CM_absp,
    `CD8+ EM cells` = CD8_EM_absp,
    `CD8+ naive cells` = CD.8.naive,
    `CD8+ TEMRA cells` = CD8_TEMRA_absp,
    mDC1 = mDC1_absp,
    mDC2 = mDC2_absp,
    mDC3 = mDC3_absp,
    pDC = pDC_absp,
    `naive B cells` = naiveB_absp,
    `marginal-zone B cells` = marginal.B,
    `HLA-DR+ cells`= `HLA_DR+_Absp`, 
     `total mDC` = `mDC_absp`, 
     `total CD19+ B cells` = `CD.19`, 
     `total CD3+ cells` = `CD3_absp`, 
     `total CD4+ cells` = `CD4_absp`, 
     `total CD8+ cells` = `CD8_absp`,
     `diabetes` = `Diabetes`,
    `hepatitis_b` = `hep.B`,
    `hypertension` = `lecena.hypertenze`,
    `hyperuricemia` = `lecena.hyperurikemie`,
    `dyslipidemia` = `lecena.dyslipidemie`,
    `anti_HLA` = `anti-HLA.preTx.nebo.historicky.ano/ne`,
    `receiver_age` = `věk.příjemce`,
    `receiver_female` = `pohlavi.prijemce.(ž=1)`,
    `dialysis` = `Dialysis`,
    `re_transplantation` = `retransplantace`) 

skimr::skim(data_avail)

Data summary
Name	data_avail
Number of rows	136
Number of columns	40
_______________________
Column type frequency:
character	1
factor	1
numeric	38
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
id	0	1	1	3	0	136	0

Variable type: factor

skim_variable	n_missing	complete_rate	ordered	n_unique	top_counts
group	0	1	FALSE	2	pat: 107, con: 29

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
diabetes	29	0.79	0.21	0.41	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▂
PRD.GN	29	0.79	0.46	0.50	0.00	0.00	0.00	1.00	1.00	▇▁▁▁▇
Smoking	29	0.79	0.27	0.45	0.00	0.00	0.00	1.00	1.00	▇▁▁▁▃
hepatitis_b	29	0.79	0.04	0.19	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▁
ASCVD	29	0.79	0.20	0.40	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▂
hypertension	29	0.79	0.94	0.23	0.00	1.00	1.00	1.00	1.00	▁▁▁▁▇
hyperuricemia	29	0.79	0.42	0.50	0.00	0.00	0.00	1.00	1.00	▇▁▁▁▆
dyslipidemia	29	0.79	0.51	0.50	0.00	0.00	1.00	1.00	1.00	▇▁▁▁▇
anti_HLA	29	0.79	0.24	0.43	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▂
receiver_age	29	0.79	50.81	13.12	18.86	41.76	53.14	62.18	80.44	▂▅▇▇▁
receiver_female	29	0.79	0.29	0.46	0.00	0.00	0.00	1.00	1.00	▇▁▁▁▃
dialysis	29	0.79	0.75	0.44	0.00	0.50	1.00	1.00	1.00	▃▁▁▁▇
re_transplantation	29	0.79	0.03	0.17	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▁
CMV	29	0.79	0.81	0.39	0.00	1.00	1.00	1.00	1.00	▂▁▁▁▇
HLA-DR+ cells	0	1.00	69.44	45.68	2.14	39.75	57.17	87.68	263.98	▇▆▂▁▁
total mDC	0	1.00	54.32	39.33	0.30	26.43	46.15	71.87	198.29	▇▇▃▁▁
mDC1	0	1.00	8.45	6.44	0.11	4.26	7.48	10.99	32.33	▇▇▂▁▁
mDC2	0	1.00	40.12	32.28	0.10	16.67	33.05	55.62	156.81	▇▆▂▁▁
mDC3	0	1.00	0.47	0.62	0.00	0.17	0.33	0.58	5.94	▇▁▁▁▁
pDC	0	1.00	12.81	17.65	0.50	4.83	8.11	13.91	159.52	▇▁▁▁▁
total CD19+ B cells	0	1.00	137.46	128.13	0.30	56.24	99.52	182.46	815.84	▇▂▁▁▁
naive B cells	0	1.00	79.31	85.00	0.00	25.77	55.40	95.69	564.31	▇▂▁▁▁
marginal-zone B cells	0	1.00	20.25	18.36	0.04	7.86	15.16	26.10	105.19	▇▃▁▁▁
class non-switched memory cells	0	1.00	19.59	21.69	0.00	5.92	13.75	25.53	156.59	▇▁▁▁▁
class switched memory cells	0	1.00	15.89	17.11	0.00	5.81	10.70	20.28	116.93	▇▂▁▁▁
plasmablast	0	1.00	0.83	1.62	0.00	0.16	0.41	0.78	11.61	▇▁▁▁▁
transitional B cells	0	1.00	10.52	14.63	0.00	2.38	6.68	12.57	125.80	▇▁▁▁▁
total CD3+ cells	1	0.99	1269.76	669.92	0.20	776.51	1207.49	1591.39	4492.87	▅▇▂▁▁
total CD4+ cells	1	0.99	750.07	377.31	0.02	495.59	704.21	958.30	2922.12	▆▇▁▁▁
CD4+ CM cells	1	0.99	383.01	228.08	0.00	251.82	341.99	508.35	1794.23	▇▆▁▁▁
CD4+ EM cells	1	0.99	96.00	75.20	0.00	43.19	75.54	127.54	444.02	▇▅▁▁▁
CD4+ naive cells	1	0.99	249.29	175.27	0.02	118.51	212.75	326.79	957.51	▇▇▂▁▁
CD4+ TEMRA cells	1	0.99	23.14	63.55	0.00	1.93	5.99	15.31	562.90	▇▁▁▁▁
total CD8+ cells	1	0.99	433.05	325.32	0.18	217.43	348.76	567.49	2331.76	▇▃▁▁▁
CD8+ CM cells	1	0.99	63.67	67.78	0.13	28.18	46.59	76.16	565.37	▇▁▁▁▁
CD8+ EM cells	1	0.99	97.95	84.91	0.04	39.55	80.00	130.53	495.01	▇▃▁▁▁
CD8+ naive cells	1	0.99	99.66	100.77	0.00	37.69	76.30	123.68	823.58	▇▁▁▁▁
CD8+ TEMRA cells	1	0.99	172.29	199.42	0.02	56.92	109.65	213.23	1496.27	▇▁▁▁▁

names(data_avail)
##  [1] "id"                              "group"                          
##  [3] "diabetes"                        "PRD.GN"                         
##  [5] "Smoking"                         "hepatitis_b"                    
##  [7] "ASCVD"                           "hypertension"                   
##  [9] "hyperuricemia"                   "dyslipidemia"                   
## [11] "anti_HLA"                        "receiver_age"                   
## [13] "receiver_female"                 "dialysis"                       
## [15] "re_transplantation"              "CMV"                            
## [17] "HLA-DR+ cells"                   "total mDC"                      
## [19] "mDC1"                            "mDC2"                           
## [21] "mDC3"                            "pDC"                            
## [23] "total CD19+ B cells"             "naive B cells"                  
## [25] "marginal-zone B cells"           "class non-switched memory cells"
## [27] "class switched memory cells"     "plasmablast"                    
## [29] "transitional B cells"            "total CD3+ cells"               
## [31] "total CD4+ cells"                "CD4+ CM cells"                  
## [33] "CD4+ EM cells"                   "CD4+ naive cells"               
## [35] "CD4+ TEMRA cells"                "total CD8+ cells"               
## [37] "CD8+ CM cells"                   "CD8+ EM cells"                  
## [39] "CD8+ naive cells"                "CD8+ TEMRA cells"

data_excel <- data_excel_all %>% 
  na.omit()

dim(data_excel)
## [1] 106  40

Clinical data

data_clinical <- data_excel %>% 
  dplyr::select(
    id, Diabetes, PRD.GN, Smoking, hep.B, ASCVD, 
    lecena.hypertenze, lecena.hyperurikemie, lecena.dyslipidemie,
    `anti-HLA.preTx.nebo.historicky.ano/ne`,
    `věk.příjemce`, `pohlavi.prijemce.(ž=1)`, 
    Dialysis, retransplantace, CMV) %>% 
  dplyr::rename(
    `diabetes` = `Diabetes`,
    `hepatitis_b` = `hep.B`,
    `hypertension` = `lecena.hypertenze`,
    `hyperuricemia` = `lecena.hyperurikemie`,
    `dyslipidemia` = `lecena.dyslipidemie`,
    `anti_HLA` = `anti-HLA.preTx.nebo.historicky.ano/ne`,
    `receiver_age` = `věk.příjemce`,
    `receiver_female` = `pohlavi.prijemce.(ž=1)`,
    `dialysis` = `Dialysis`,
    `re_transplantation` = `retransplantace`)

summary(data_clinical)
##       id               diabetes         PRD.GN          Smoking      
##  Length:106         Min.   :0.000   Min.   :0.0000   Min.   :0.0000  
##  Class :character   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Mode  :character   Median :0.000   Median :0.0000   Median :0.0000  
##                     Mean   :0.217   Mean   :0.4528   Mean   :0.2642  
##                     3rd Qu.:0.000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##                     Max.   :1.000   Max.   :1.0000   Max.   :1.0000  
##   hepatitis_b          ASCVD         hypertension    hyperuricemia   
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :1.0000   Median :0.0000  
##  Mean   :0.03774   Mean   :0.1981   Mean   :0.9434   Mean   :0.4245  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##   dyslipidemia       anti_HLA       receiver_age   receiver_female 
##  Min.   :0.0000   Min.   :0.0000   Min.   :18.86   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:41.48   1st Qu.:0.0000  
##  Median :1.0000   Median :0.0000   Median :53.33   Median :0.0000  
##  Mean   :0.5094   Mean   :0.2453   Mean   :50.79   Mean   :0.2925  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:62.26   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :80.44   Max.   :1.0000  
##     dialysis      re_transplantation      CMV        
##  Min.   :0.0000   Min.   :0.0000     Min.   :0.0000  
##  1st Qu.:0.2500   1st Qu.:0.0000     1st Qu.:1.0000  
##  Median :1.0000   Median :0.0000     Median :1.0000  
##  Mean   :0.7453   Mean   :0.0283     Mean   :0.8113  
##  3rd Qu.:1.0000   3rd Qu.:0.0000     3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000     Max.   :1.0000

General lymphocytes (wider lymphocytes categories)

data_lymphocytes_general <- data_excel %>% 
  dplyr::select(
    id, `HLA_DR+_Absp`, `mDC_absp`, 
    `CD.19`, `CD3_absp`, `CD4_absp`, `CD8_absp`) %>% 
    rename(
     `HLA-DR+ cells`= `HLA_DR+_Absp`, 
     `total mDC` = `mDC_absp`, 
     `total CD19+ B cells` = `CD.19`, 
     `total CD3+ cells` = `CD3_absp`, 
     `total CD4+ cells` = `CD4_absp`, 
     `total CD8+ cells` = `CD8_absp`
    )

summary(data_lymphocytes_general)
##       id            HLA-DR+ cells       total mDC        total CD19+ B cells
##  Length:106         Min.   :  2.143   Min.   :  0.9969   Min.   :  0.3032   
##  Class :character   1st Qu.: 41.339   1st Qu.: 31.8812   1st Qu.: 49.3718   
##  Mode  :character   Median : 59.071   Median : 47.1999   Median : 83.6212   
##                     Mean   : 72.969   Mean   : 59.7426   Mean   :109.3094   
##                     3rd Qu.: 91.785   3rd Qu.: 74.6530   3rd Qu.:131.7112   
##                     Max.   :263.984   Max.   :198.2863   Max.   :815.8440   
##  total CD3+ cells total CD4+ cells total CD8+ cells 
##  Min.   : 240.0   Min.   : 116.2   Min.   :  44.87  
##  1st Qu.: 745.2   1st Qu.: 477.1   1st Qu.: 202.81  
##  Median :1036.9   Median : 689.2   Median : 326.15  
##  Mean   :1206.8   Mean   : 731.7   Mean   : 400.42  
##  3rd Qu.:1515.0   3rd Qu.: 912.4   3rd Qu.: 519.38  
##  Max.   :4492.9   Max.   :2922.1   Max.   :2331.76


data_lymphocytes_general_log10 <- data_lymphocytes_general %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),  
    .fns = ~ arm::rescale(log10(.x + 0.5)),  
    .names = "scaled_log10_{.col}" 
  )) %>% 
  dplyr::select(-c(`HLA-DR+ cells` : `total CD8+ cells`))

summary(data_lymphocytes_general_log10)
##       id            scaled_log10_HLA-DR+ cells scaled_log10_total mDC
##  Length:106         Min.   :-2.33382           Min.   :-2.2848499    
##  Class :character   1st Qu.:-0.27386           1st Qu.:-0.2557272    
##  Mode  :character   Median :-0.01045           Median : 0.0000093    
##                     Mean   : 0.00000           Mean   : 0.0000000    
##                     3rd Qu.: 0.31597           3rd Qu.: 0.3000675    
##                     Max.   : 1.10152           Max.   : 0.9421523    
##  scaled_log10_total CD19+ B cells scaled_log10_total CD3+ cells
##  Min.   :-2.47345                 Min.   :-1.41203             
##  1st Qu.:-0.23674                 1st Qu.:-0.33065             
##  Median : 0.04644                 Median :-0.01514             
##  Mean   : 0.00000                 Mean   : 0.00000             
##  3rd Qu.: 0.29138                 3rd Qu.: 0.34719             
##  Max.   : 1.27773                 Max.   : 1.38593             
##  scaled_log10_total CD4+ cells scaled_log10_total CD8+ cells
##  Min.   :-1.63279              Min.   :-1.30202             
##  1st Qu.:-0.28429              1st Qu.:-0.28711             
##  Median : 0.06748              Median : 0.03384             
##  Mean   : 0.00000              Mean   : 0.00000             
##  3rd Qu.: 0.33573              3rd Qu.: 0.34833             
##  Max.   : 1.44914              Max.   : 1.36410

Specific lymphocytes (narrower lymphocytes categories)

data_lymphocytes_specific <- data_excel %>% 
  dplyr::select(
    id, 
    `mDC1_absp`,`mDC2_absp`, `mDC3_absp`,
    `pDC_absp`,
    `naiveB_absp`,
    `marginal.B`, `non_switched_memory`,`switched.memory`,   
    `Plazmablasty_absp`,`Transitional_absp`, 
    `CD.4.CM`, `CD.4.EM`, `CD4_naive_absp`, `CD4_TEMRA_absp`,
    `CD8_CM_absp`, `CD8_EM_absp`, `CD.8.naive`, `CD8_TEMRA_absp`) %>% 
  dplyr::rename(
    mDC1 = mDC1_absp,
    mDC2 = mDC2_absp,
    mDC3 = mDC3_absp,
    pDC = pDC_absp,
    `naive B cells` = naiveB_absp,
    `marginal-zone B cells` = marginal.B,
    `class non-switched memory cells` = non_switched_memory,
    `class switched memory cells` = switched.memory,
    plasmablast = Plazmablasty_absp,
    `transitional B cells` = Transitional_absp,
    `CD4+ CM cells` = CD.4.CM,
    `CD4+ EM cells` = CD.4.EM,
    `CD4+ naive cells` = CD4_naive_absp,
    `CD4+ TEMRA cells` = CD4_TEMRA_absp,
    `CD8+ CM cells` = CD8_CM_absp,
    `CD8+ EM cells` = CD8_EM_absp,
    `CD8+ naive cells` = CD.8.naive,
    `CD8+ TEMRA cells` = CD8_TEMRA_absp
  )


summary(data_lymphocytes_specific)
##       id                 mDC1              mDC2               mDC3       
##  Length:106         Min.   : 0.2758   Min.   :  0.4985   Min.   :0.0000  
##  Class :character   1st Qu.: 4.8975   1st Qu.: 19.5353   1st Qu.:0.1776  
##  Mode  :character   Median : 7.8591   Median : 35.4651   Median :0.3278  
##                     Mean   : 9.1475   Mean   : 44.0277   Mean   :0.4632  
##                     3rd Qu.:11.7542   3rd Qu.: 57.8292   3rd Qu.:0.6290  
##                     Max.   :32.3276   Max.   :156.8147   Max.   :2.3878  
##       pDC           naive B cells    marginal-zone B cells
##  Min.   :  0.5043   Min.   :  0.00   Min.   : 0.04332     
##  1st Qu.:  4.9067   1st Qu.: 20.12   1st Qu.: 6.33619     
##  Median :  8.1418   Median : 44.65   Median :11.67662     
##  Mean   : 13.2617   Mean   : 61.07   Mean   :15.90749     
##  3rd Qu.: 13.6297   3rd Qu.: 73.50   3rd Qu.:21.74574     
##  Max.   :159.5242   Max.   :483.96   Max.   :84.39243     
##  class non-switched memory cells class switched memory cells  plasmablast     
##  Min.   :  0.000                 Min.   :  0.000             Min.   : 0.0000  
##  1st Qu.:  5.064                 1st Qu.:  4.246             1st Qu.: 0.1612  
##  Median :  9.015                 Median :  8.742             Median : 0.3952  
##  Mean   : 15.477                 Mean   : 13.659             Mean   : 0.9237  
##  3rd Qu.: 21.957                 3rd Qu.: 16.796             3rd Qu.: 0.8395  
##  Max.   :156.593                 Max.   :116.931             Max.   :11.6077  
##  transitional B cells CD4+ CM cells     CD4+ EM cells      CD4+ naive cells 
##  Min.   : 0.000       Min.   :  32.75   Min.   :  0.3629   Min.   :  5.646  
##  1st Qu.: 1.981       1st Qu.: 240.25   1st Qu.: 49.2556   1st Qu.:104.642  
##  Median : 6.113       Median : 333.86   Median : 88.3852   Median :183.540  
##  Mean   : 9.024       Mean   : 381.14   Mean   :104.3018   Mean   :222.655  
##  3rd Qu.:11.119       3rd Qu.: 509.06   3rd Qu.:134.9888   3rd Qu.:295.569  
##  Max.   :67.714       Max.   :1794.23   Max.   :444.0170   Max.   :733.693  
##  CD4+ TEMRA cells  CD8+ CM cells     CD8+ EM cells      CD8+ naive cells
##  Min.   :  0.000   Min.   :  3.097   Min.   :  0.2016   Min.   :  0.00  
##  1st Qu.:  1.808   1st Qu.: 26.823   1st Qu.: 34.9957   1st Qu.: 35.29  
##  Median :  5.434   Median : 41.187   Median : 67.9814   Median : 68.61  
##  Mean   : 25.329   Mean   : 62.879   Mean   : 87.0337   Mean   : 94.07  
##  3rd Qu.: 15.317   3rd Qu.: 76.214   3rd Qu.:113.6403   3rd Qu.:112.14  
##  Max.   :562.900   Max.   :565.372   Max.   :495.0150   Max.   :466.48  
##  CD8+ TEMRA cells  
##  Min.   :   6.427  
##  1st Qu.:  48.558  
##  Median : 107.805  
##  Mean   : 157.089  
##  3rd Qu.: 187.504  
##  Max.   :1496.275


data_lymphocytes_specific_log10 <- data_lymphocytes_specific %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),  
    .fns = ~ arm::rescale(log10(.x + 0.5)), 
    .names = "scaled_log10_{.col}" 
  )) %>% 
  dplyr::select(-c(`mDC1` : `CD8+ TEMRA cells`))

summary(data_lymphocytes_specific_log10)
##       id            scaled_log10_mDC1  scaled_log10_mDC2  scaled_log10_mDC3 
##  Length:106         Min.   :-1.40589   Min.   :-2.01301   Min.   :-0.76626  
##  Class :character   1st Qu.:-0.19815   1st Qu.:-0.28743   1st Qu.:-0.36353  
##  Mode  :character   Median : 0.07406   Median : 0.04923   Median :-0.09803  
##                     Mean   : 0.00000   Mean   : 0.00000   Mean   : 0.00000  
##                     3rd Qu.: 0.31237   3rd Qu.: 0.32741   3rd Qu.: 0.31316  
##                     Max.   : 0.92592   Max.   : 0.89833   Max.   : 1.55780  
##  scaled_log10_pDC  scaled_log10_naive B cells
##  Min.   :-1.2566   Min.   :-1.98655          
##  1st Qu.:-0.2893   1st Qu.:-0.28944          
##  Median :-0.0199   Median : 0.06826          
##  Mean   : 0.0000   Mean   : 0.00000          
##  3rd Qu.: 0.2627   3rd Qu.: 0.29379          
##  Max.   : 1.6573   Max.   : 1.15117          
##  scaled_log10_marginal-zone B cells
##  Min.   :-1.65986                  
##  1st Qu.:-0.27830                  
##  Median : 0.03625                  
##  Mean   : 0.00000                  
##  3rd Qu.: 0.36540                  
##  Max.   : 1.09610                  
##  scaled_log10_class non-switched memory cells
##  Min.   :-1.45126                            
##  1st Qu.:-0.28560                            
##  Median :-0.02608                            
##  Mean   : 0.00000                            
##  3rd Qu.: 0.38949                            
##  Max.   : 1.33062                            
##  scaled_log10_class switched memory cells scaled_log10_plasmablast
##  Min.   :-1.45378                         Min.   :-0.5423         
##  1st Qu.:-0.31789                         1st Qu.:-0.3329         
##  Median : 0.01853                         Median :-0.1058         
##  Mean   : 0.00000                         Mean   : 0.0000         
##  3rd Qu.: 0.33483                         3rd Qu.: 0.1962         
##  Max.   : 1.30161                         Max.   : 1.8461         
##  scaled_log10_transitional B cells scaled_log10_CD4+ CM cells
##  Min.   :-1.04586                  Min.   :-1.77095          
##  1st Qu.:-0.34169                  1st Qu.:-0.21861          
##  Median : 0.08932                  Median : 0.03896          
##  Mean   : 0.00000                  Mean   : 0.00000          
##  3rd Qu.: 0.33704                  3rd Qu.: 0.36938          
##  Max.   : 1.11522                  Max.   : 1.35669          
##  scaled_log10_CD4+ EM cells scaled_log10_CD4+ naive cells
##  Min.   :-2.66956           Min.   :-2.0134              
##  1st Qu.:-0.27555           1st Qu.:-0.2874              
##  Median : 0.06704           Median : 0.0529              
##  Mean   : 0.00000           Mean   : 0.0000              
##  3rd Qu.: 0.31594           3rd Qu.: 0.3419              
##  Max.   : 1.01746           Max.   : 0.8939              
##  scaled_log10_CD4+ TEMRA cells scaled_log10_CD8+ CM cells
##  Min.   :-0.86654              Min.   :-1.37013          
##  1st Qu.:-0.35749              1st Qu.:-0.25020          
##  Median :-0.04321              Median :-0.01689          
##  Mean   : 0.00000              Mean   : 0.00000          
##  3rd Qu.: 0.28313              3rd Qu.: 0.32002          
##  Max.   : 1.47255              Max.   : 1.42372          
##  scaled_log10_CD8+ EM cells scaled_log10_CD8+ naive cells
##  Min.   :-2.08450           Min.   :-2.49097             
##  1st Qu.:-0.23177           1st Qu.:-0.30354             
##  Median : 0.07852           Median : 0.03358             
##  Mean   : 0.00000           Mean   : 0.00000             
##  3rd Qu.: 0.31953           3rd Qu.: 0.28378             
##  Max.   : 1.01299           Max.   : 1.01216             
##  scaled_log10_CD8+ TEMRA cells
##  Min.   :-1.17442             
##  1st Qu.:-0.29107             
##  Median : 0.06679             
##  Mean   : 0.00000             
##  3rd Qu.: 0.31527             
##  Max.   : 1.25234

3 Exploration and analysis

3.1 Number of re-transplantations

There is very small proportion of re-trans plantations. How many? Are these patients different in terms of lymphocyte composition?

table(data_clinical$re_transplantation)
## 
##   0   1 
## 103   3

data_clinical %>% 
  dplyr::left_join(data_lymphocytes_specific, by = 'id') %>% 
  dplyr::select(-id) %>% 
  dplyr::mutate(re_transplantation) %>% 
  gtsummary::tbl_summary(by = 're_transplantation') %>% 
  add_p() %>% 
  add_q()

Characteristic	0 N = 103 ¹	1 N = 3 ¹	p-value ²	q-value ³
diabetes	22 (21%)	1 (33%)	0.5	>0.9
PRD.GN	46 (45%)	2 (67%)	0.6	>0.9
Smoking	27 (26%)	1 (33%)	>0.9	>0.9
hepatitis_b	4 (3.9%)	0 (0%)	>0.9	>0.9
ASCVD	21 (20%)	0 (0%)	>0.9	>0.9
hypertension	97 (94%)	3 (100%)	>0.9	>0.9
hyperuricemia	43 (42%)	2 (67%)	0.6	>0.9
dyslipidemia	52 (50%)	2 (67%)	>0.9	>0.9
anti_HLA	25 (24%)	1 (33%)	>0.9	>0.9
receiver_age	53 (41, 62)	55 (36, 63)	>0.9	>0.9
receiver_female	31 (30%)	0 (0%)	0.6	>0.9
dialysis	76 (74%)	3 (100%)	0.6	>0.9
CMV	83 (81%)	3 (100%)	>0.9	>0.9
mDC1	8 (5, 12)	12 (7, 32)	0.13	0.4
mDC2	35 (19, 58)	50 (13, 51)	>0.9	>0.9
mDC3	0.33 (0.17, 0.65)	0.37 (0.33, 0.49)	0.6	>0.9
pDC	8 (5, 14)	10 (5, 11)	>0.9	>0.9
naive B cells	45 (20, 74)	33 (13, 121)	0.7	>0.9
marginal-zone B cells	11 (6, 21)	23 (20, 24)	0.10	0.3
class non-switched memory cells	9 (5, 22)	28 (24, 40)	0.025	0.2
class switched memory cells	8 (4, 17)	26 (9, 43)	0.090	0.3
plasmablast	0.40 (0.14, 0.84)	0.24 (0.20, 1.01)	>0.9	>0.9
transitional B cells	6 (2, 12)	1 (1, 1)	0.025	0.2
CD4+ CM cells	320 (236, 497)	678 (550, 702)	0.028	0.2
CD4+ EM cells	78 (48, 134)	166 (132, 377)	0.034	0.2
CD4+ naive cells	184 (104, 309)	183 (64, 201)	0.5	>0.9
CD4+ TEMRA cells	5 (2, 15)	22 (14, 56)	0.065	0.3
CD8+ CM cells	41 (25, 75)	131 (103, 166)	0.017	0.2
CD8+ EM cells	68 (33, 106)	240 (118, 279)	0.019	0.2
CD8+ naive cells	68 (33, 111)	220 (55, 273)	0.2	0.4
CD8+ TEMRA cells	106 (46, 172)	416 (110, 457)	0.067	0.3
¹ n (%); Median (Q1, Q3)
² Fisher’s exact test; Wilcoxon rank sum test
³ False discovery rate correction for multiple testing

It seems that patients with re-transplantation may have very different lymphocyte composition, but small number of such patients does not allow to infer the effect of re-transplantation. Thus, the 3 patients will be removed for further analysis

3.1.1 Re-transplantated patients removal

ids <- data_clinical %>% filter(re_transplantation == 1) %>% 
  dplyr::select(id)

ids <- ids$id

data_clinical <- data_clinical %>% 
  dplyr::filter(!id %in% ids) %>% 
  dplyr::select(-re_transplantation)

3.2 Patients vs controls

median_diffs <- data_excel_all %>%
  dplyr::select(`HLA_DR+_Absp`:`CD8_TEMRA_absp`, group) %>%
  tidyr::pivot_longer(cols = -group, names_to = "variable", values_to = "value") %>%
  dplyr::group_by(variable, group) %>%
  dplyr::summarise(median = mean(value, na.rm = TRUE), .groups = "drop") %>%
  tidyr::pivot_wider(names_from = group, values_from = median) %>%
  dplyr::mutate(median_diff = .[[3]] / .[[2]]) %>%
  dplyr::select(variable, median_diff)

# Create the gtsummary table
tbl <- data_excel_all %>%
  dplyr::select(`HLA_DR+_Absp`:`CD8_TEMRA_absp`, group) %>%
  gtsummary::tbl_summary(
    by = group,
    statistic = all_continuous() ~ "{mean} ({p25}, {median}, {p75})"
  ) %>%
  gtsummary::modify_table_body(
    ~ dplyr::left_join(.x, median_diffs, by = "variable")
  ) %>%
  gtsummary::modify_header(median_diff ~ "**Mean counts ratio**") %>%
  gtsummary::modify_fmt_fun(
    median_diff ~ (function(x) gtsummary::style_number(x, digits = 2))
) %>% 
  gtsummary::add_p()
  
tbl

Characteristic	control N = 29 ¹	patient N = 107 ¹	Mean counts ratio	p-value ²
HLA_DR+_Absp	56 (24, 46, 79)	73 (41, 60, 93)	1.31	0.046
mDC_absp	33 (2, 27, 65)	60 (32, 48, 78)	1.79	<0.001
mDC1_absp	6 (2, 6, 9)	9 (5, 8, 12)	1.57	0.015
mDC2_absp	25 (1, 18, 45)	44 (19, 35, 58)	1.79	0.001
mDC3_absp	0.51 (0.16, 0.30, 0.47)	0.47 (0.17, 0.33, 0.65)	0.92	0.3
pDC_absp	11 (5, 8, 14)	13 (5, 8, 14)	1.15	>0.9
CD.19	243 (144, 213, 260)	109 (49, 82, 133)	0.45	<0.001
naiveB_absp	148 (88, 120, 174)	61 (20, 45, 74)	0.41	<0.001
marginal.B	36 (23, 30, 48)	16 (6, 11, 22)	0.44	<0.001
non_switched_memory	35 (19, 29, 37)	15 (5, 9, 22)	0.44	<0.001
switched.memory	24 (13, 18, 33)	14 (4, 9, 17)	0.56	<0.001
Plazmablasty_absp	0.52 (0.21, 0.46, 0.68)	0.92 (0.16, 0.39, 0.84)	1.78	0.6
Transitional_absp	16 (6, 10, 18)	9 (2, 6, 11)	0.55	0.003
CD3_absp	1,500 (1,226, 1,475, 1,841)	1,207 (743, 1,037, 1,517)	0.80	0.001
Unknown	0	1	0.80
CD4_absp	817 (647, 800, 1,033)	732 (477, 689, 914)	0.90	0.043
Unknown	0	1	0.90
CD.4.CM	390 (309, 375, 460)	381 (239, 334, 510)	0.98	0.3
Unknown	0	1	0.98
CD.4.EM	66 (27, 51, 113)	104 (49, 88, 135)	1.59	0.006
Unknown	0	1	1.59
CD4_naive_absp	347 (220, 307, 376)	223 (104, 184, 296)	0.64	<0.001
Unknown	0	1	0.64
CD4_TEMRA_absp	15 (2, 7, 15)	25 (2, 5, 16)	1.67	0.4
Unknown	0	1	1.67
CD8_absp	552 (363, 462, 723)	400 (201, 326, 520)	0.72	0.006
Unknown	0	1	0.72
CD8_CM_absp	67 (40, 54, 73)	63 (27, 41, 76)	0.94	0.2
Unknown	0	1	0.94
CD8_EM_absp	138 (78, 117, 200)	87 (35, 68, 116)	0.63	0.003
Unknown	0	1	0.63
CD.8.naive	120 (47, 94, 132)	94 (35, 69, 112)	0.78	0.13
Unknown	0	1	0.78
CD8_TEMRA_absp	228 (83, 131, 287)	157 (47, 108, 193)	0.69	0.085
Unknown	0	1	0.69
¹ Mean (Q1, Median, Q3)
² Wilcoxon rank sum test

3.3 PERMANOVA

3.3.1 General data

3.3.1.1 Prepare data

outcomes_gen <- data_lymphocytes_general_log10 %>% 
  filter(!id %in% ids) %>% 
  dplyr::select(-id)

data_predict <- data_clinical %>% 
  dplyr::select(-id)

3.3.1.2 Run single-predictor analysis

modelac <- 'perm_general'

assign(modelac, run(
  expr = perm_over(outcome = outcomes_gen,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()

predictor	r2	p_val
Smoking	0.0458487	0.0027994
ASCVD	0.0448173	0.0039992
anti_HLA	0.0223096	0.0615877
diabetes	0.0205435	0.0949810
CMV	0.0147104	0.1953609
receiver_age	0.0086260	0.4385123
hypertension	0.0084216	0.4337133
hyperuricemia	0.0082793	0.4725055
receiver_female	0.0079409	0.4851030
dialysis	0.0064187	0.5806839
PRD.GN	0.0056397	0.6440712
dyslipidemia	0.0040431	0.7630474
hepatitis_b	0.0021462	0.8810238

3.3.1.3 Run multivariable analysis - R2 selection


## by margin
modelac <- 'perm_general_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD +
                   anti_HLA + 
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	5.5715327	0.0363342	3.9690721	0.0110989
ASCVD	1	3.5270048	0.0230010	2.5125827	0.0595940
anti_HLA	1	2.7383895	0.0178582	1.9507856	0.1117888
diabetes	1	0.7845024	0.0051161	0.5588671	0.6512349
Residual	98	137.5662083	0.8971256	NA	NA
Total	102	153.3410767	1.0000000	NA	NA



## sequential
modelac <- 'perm_general_multivariable_sequential'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD +
                   anti_HLA + 
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	7.0304932	0.0458487	5.0084126	0.0027997
ASCVD	1	5.1051071	0.0332925	3.6367979	0.0144986
anti_HLA	1	2.8547657	0.0186171	2.0336901	0.1017898
diabetes	1	0.7845024	0.0051161	0.5588671	0.6512349
Residual	98	137.5662083	0.8971256	NA	NA
Total	102	153.3410767	1.0000000	NA	NA

3.3.1.4 Run multivariable analysis - P-val selection


## by margin
modelac <- 'perm_general_multivariable_margin_sensP'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	5.263274	0.0343240	3.727387	0.0134987
ASCVD	1	5.105107	0.0332925	3.615375	0.0148985
Residual	100	141.205476	0.9208588	NA	NA
Total	102	153.341077	1.0000000	NA	NA



## sequential
modelac <- 'perm_general_multivariable_sequential_sensP'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes_gen ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	7.030493	0.0458487	4.978910	0.0027997
ASCVD	1	5.105107	0.0332925	3.615375	0.0148985
Residual	100	141.205476	0.9208588	NA	NA
Total	102	153.341077	1.0000000	NA	NA

3.3.2 All specific data

3.3.2.1 Prepare

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>% 
  dplyr::select(-id)

data_predict <- data_clinical %>% 
  dplyr::select(-id)

3.3.2.2 Run single-predictor analysis

modelac <- 'perm_specific'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()

predictor	r2	p_val
Smoking	0.0405064	0.0011998
ASCVD	0.0267541	0.0103979
CMV	0.0246470	0.0179964
diabetes	0.0218696	0.0279944
receiver_age	0.0202068	0.0415917
dialysis	0.0147878	0.1343731
dyslipidemia	0.0129008	0.2155569
anti_HLA	0.0112876	0.2871426
receiver_female	0.0110641	0.3135373
hyperuricemia	0.0093547	0.4423115
PRD.GN	0.0062281	0.7750450
hepatitis_b	0.0051746	0.7854429
hypertension	0.0031043	0.9694061

3.3.2.3 Run multivariable analyses

Note that both selection with R2 > 2 as well as P-value < 0.05 selections results in the same multivariable PERMANOVA


## by margin
modelac <- 'perm_specific_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   CMV +
                   ASCVD +
                   diabetes +
                   receiver_age,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	11.838048	0.0257911	2.825554	0.0088991
CMV	1	9.960709	0.0217011	2.377463	0.0242976
ASCVD	1	6.828019	0.0148760	1.629740	0.1107889
diabetes	1	5.654930	0.0123202	1.349742	0.1947805
receiver_age	1	9.165706	0.0199690	2.187709	0.0330967
Residual	97	406.394825	0.8853984	NA	NA
Total	102	458.996571	1.0000000	NA	NA


## sequential
modelac <- 'perm_specific_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   CMV +
                   ASCVD +
                   diabetes +
                   receiver_age,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	18.592295	0.0405064	4.437686	0.0007999
CMV	1	10.431152	0.0227260	2.489750	0.0198980
ASCVD	1	8.249543	0.0179730	1.969035	0.0554945
diabetes	1	6.163051	0.0134272	1.471023	0.1525847
receiver_age	1	9.165706	0.0199690	2.187709	0.0330967
Residual	97	406.394825	0.8853984	NA	NA
Total	102	458.996571	1.0000000	NA	NA

3.3.3 Dendritic cells

3.3.3.1 Prepare data

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>% 
  dplyr::select(
    scaled_log10_mDC1:scaled_log10_pDC
    )

data_predict <- data_clinical %>%
  dplyr::select(-id)

3.3.3.2 Run single-predictor analysis

modelac <- 'perm_specific_dendritic'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()

predictor	r2	p_val
Smoking	0.0282209	0.0335933
ASCVD	0.0224578	0.0681864
receiver_age	0.0154331	0.1801640
receiver_female	0.0112941	0.3221356
dialysis	0.0107099	0.3305339
dyslipidemia	0.0095974	0.4065187
diabetes	0.0087107	0.4435113
PRD.GN	0.0086675	0.4519096
CMV	0.0040339	0.7716457
hyperuricemia	0.0037099	0.8062388
hypertension	0.0034601	0.8002400
hepatitis_b	0.0027088	0.8526295
anti_HLA	0.0018789	0.9256149

3.3.3.3 Run multivariable analyses


## by margin
modelac <- 'perm_specific_dendritic_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	3.007452	0.0290616	3.064018	0.0283972
ASCVD	1	2.411052	0.0232985	2.456401	0.0591941
Residual	100	98.153856	0.9484806	NA	NA
Total	102	103.485356	1.0000000	NA	NA


## sequential
modelac <- 'perm_specific_dendritic_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   ASCVD,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	2.920448	0.0282209	2.975378	0.0323968
ASCVD	1	2.411052	0.0232985	2.456401	0.0591941
Residual	100	98.153856	0.9484806	NA	NA
Total	102	103.485356	1.0000000	NA	NA

Note that as only Smoking would be selected on the basis of p-value from a multivariable model, resulting in a multivariable model.

3.3.4 B-lymphocytes

3.3.4.1 Prepare data

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>%
  dplyr::select(
   `scaled_log10_naive B cells`:`scaled_log10_transitional B cells`
    )

data_predict <- data_clinical %>% dplyr::select(-id)

3.3.4.2 Run single-predictor analysis

modelac <- 'perm_specific_lymphB'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()

predictor	r2	p_val
Smoking	0.0446542	0.0077984
dialysis	0.0353771	0.0227954
diabetes	0.0276732	0.0411918
ASCVD	0.0212909	0.0927814
hyperuricemia	0.0202773	0.0983803
anti_HLA	0.0189893	0.1113777
receiver_female	0.0171100	0.1535693
dyslipidemia	0.0070766	0.5236953
CMV	0.0045456	0.7064587
PRD.GN	0.0044617	0.7204559
hepatitis_b	0.0038473	0.7268546
hypertension	0.0029038	0.8392322
receiver_age	0.0025997	0.8854229

3.3.4.3 Run multivariable analyses - R2 selection


## by margin
modelac <- 'perm_specific_lymphB_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes +
                   ASCVD +
                   hyperuricemia,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	4.9076996	0.0320696	3.4901724	0.0261974
dialysis	1	2.9890295	0.0195320	2.1256860	0.1008899
diabetes	1	2.6227091	0.0171382	1.8651726	0.1283872
ASCVD	1	0.9802364	0.0064054	0.6971075	0.5213479
hyperuricemia	1	0.8173329	0.0053409	0.5812566	0.6107389
Residual	97	136.3963756	0.8912894	NA	NA
Total	102	153.0326473	1.0000000	NA	NA


## sequential
modelac <- 'perm_specific_lymphB_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes +
                   ASCVD +
                   hyperuricemia,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	6.8335483	0.0446542	4.8597639	0.0076992
dialysis	1	5.3788543	0.0351484	3.8252400	0.0187981
diabetes	1	2.6075522	0.0170392	1.8543936	0.1282872
ASCVD	1	0.9989839	0.0065279	0.7104400	0.5104490
hyperuricemia	1	0.8173329	0.0053409	0.5812566	0.6107389
Residual	97	136.3963756	0.8912894	NA	NA
Total	102	153.0326473	1.0000000	NA	NA

3.3.4.4 Run multivariable analyses - P-value selection


## by margin
modelac <- 'perm_specific_lymphB_multivariable_margin_sensP'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	5.857672	0.0382773	4.195776	0.0127987
dialysis	1	4.783796	0.0312600	3.426573	0.0286971
diabetes	1	2.607552	0.0170392	1.867757	0.1260874
Residual	99	138.212693	0.9031582	NA	NA
Total	102	153.032647	1.0000000	NA	NA


## sequential
modelac <- 'perm_specific_lymphB_multivariable_sequential_sensP'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   Smoking +
                   dialysis +
                   diabetes ,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
Smoking	1	6.833548	0.0446542	4.894784	0.0076992
dialysis	1	5.378854	0.0351484	3.852805	0.0179982
diabetes	1	2.607552	0.0170392	1.867757	0.1260874
Residual	99	138.212693	0.9031582	NA	NA
Total	102	153.032647	1.0000000	NA	NA

3.3.5 T-lymphocytes

3.3.5.1 Prepare data

outcomes <- data_lymphocytes_specific_log10 %>% 
  filter(!id %in% ids) %>%
  dplyr::select(
   `scaled_log10_CD4+ CM cells`:`scaled_log10_CD8+ TEMRA cells`
    )

data_predict <- data_clinical %>% dplyr::select(-id)

3.3.5.2 Run single-predictor analysis

modelac <- 'perm_specific_lymphT'

assign(modelac, run(
  expr = perm_over(outcome = outcomes,
                   predictors = data_predict,
                   nperm = 5000,
                   seed = 483),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% dplyr::arrange(desc(r2)) %>% kableExtra::kable()

predictor	r2	p_val
CMV	0.0503748	0.0013997
Smoking	0.0436505	0.0021996
receiver_age	0.0359540	0.0081984
ASCVD	0.0330791	0.0109978
diabetes	0.0242087	0.0495901
dyslipidemia	0.0189910	0.0979804
anti_HLA	0.0102753	0.3571286
hepatitis_b	0.0074380	0.4791042
receiver_female	0.0063772	0.6338732
PRD.GN	0.0063165	0.6354729
hyperuricemia	0.0039844	0.8436313
hypertension	0.0030739	0.8600280
dialysis	0.0013107	0.9912018

3.3.5.3 Run multivariable analyses

Note that selection on the basis of P-value would result in the same multivariable PERMANOVA as based on the R2


## by margin
modelac <- 'perm_specific_lymphT_multivariable_margin'
set.seed(483)

assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   CMV +
                   Smoking +
                   receiver_age +
                   ASCVD +
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'margin'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
CMV	1	8.981104	0.0443558	5.100516	0.0020998
Smoking	1	4.433402	0.0218957	2.517802	0.0424958
receiver_age	1	7.217228	0.0356444	4.098782	0.0059994
ASCVD	1	2.759608	0.0136291	1.567226	0.1710829
diabetes	1	2.012991	0.0099417	1.143211	0.3149685
Residual	97	170.799801	0.8435451	NA	NA
Total	102	202.478567	1.0000000	NA	NA


## sequential
modelac <- 'perm_specific_lymphT_multivariable_sequential'
set.seed(483)
assign(modelac, run(
  expr = adonis2(outcomes ~ 
                   CMV +
                   Smoking +
                   receiver_age +
                   ASCVD +
                   diabetes,
                 data = data_predict,
                 permutations = 10000,
                 method = 'euclidean',
                 by = 'terms'),
  path = paste0('gitignore/run/', modelac),
  reuse = TRUE))

get(modelac) %>% kableExtra::kable()

	Df	SumOfSqs	R2	F	Pr(>F)
CMV	1	10.199825	0.0503748	5.792647	0.0009999
Smoking	1	7.937709	0.0392027	4.507955	0.0026997
receiver_age	1	6.740675	0.0332908	3.828139	0.0079992
ASCVD	1	4.787567	0.0236448	2.718938	0.0347965
diabetes	1	2.012991	0.0099417	1.143211	0.3149685
Residual	97	170.799801	0.8435451	NA	NA
Total	102	202.478567	1.0000000	NA	NA

3.4 Correlation matrix

3.4.1 General

set.seed(483)
# Prepare table
data_matrix <- data_clinical %>% 
  dplyr::left_join(data_lymphocytes_general, by = 'id') %>% 
  dplyr::select(-id) %>% 
  cor(method = 'spearman') %>% 
  data.frame() %>% 
  dplyr::filter(
    row.names(.) %in% colnames(data_lymphocytes_general)[-1]
    ) %>% 
  dplyr::select(
    row.names(perm_specific_multivariable_sequential)[1:5],
    dialysis
    ) %>% as.matrix()

colnames(data_matrix)[1] <- 'smoking'
colnames(data_matrix)[2] <- 'CMV seropositivity'
colnames(data_matrix)[6] <- 'dialysis treatment'
colnames(data_matrix)[5] <- 'age'
colnames(data_matrix)[4] <- 'diabetes mellitus'

# Calculate distance and perform hierarchical clustering
distance_matrix2 <- dist(t(outcomes_gen))
hc2 <- hclust(distance_matrix2, method = 'average')

col_fun = colorRamp2(c(-0.5, 0, 0.5), c("blue", "white", "red"))

# Heatmap

tr <- Heatmap(data_matrix, 
        name = "Spearman rho", 
        cluster_rows = hc2, 
        show_row_dend = TRUE, 
        show_column_dend = FALSE,
        row_names_gp = gpar(fontsize = 11),
        column_names_side = "top",
        row_dend_width = unit(3.2, "cm"),
        row_split = 2,
        gap = unit(1.5, "mm"),
        row_title = NULL,
        column_title = NULL,
        row_title_rot = 0,
        row_title_gp = gpar(fontsize = 14))

print(tr)

Spearman rho correlation between selected clinical characteristics and counts of borader categories of lymphocyte sub-populations. Dendrogram shows correlated sub-populations, based on UPGMA clustering


pdf("gitignore/figures/heatmap_general.pdf", width = 5.5, height = 4)
print(tr)
dev.off()
## png 
##   2

3.4.2 Specific


# Prepare table
set.seed(483)
data_matrix <- data_clinical %>% 
  dplyr::left_join(data_lymphocytes_specific, by = 'id') %>% 
  dplyr::select(-id) %>% 
  cor(method = 'spearman') %>% 
  data.frame() %>% 
  dplyr::filter(
    row.names(.) %in% colnames(data_lymphocytes_specific)[-1]
    ) %>% 
  dplyr::select(
    row.names(perm_specific_multivariable_sequential)[1:5],
    dialysis
    ) %>% as.matrix()


colnames(data_matrix)[1] <- 'smoking'
colnames(data_matrix)[2] <- 'CMV seropositivity'
colnames(data_matrix)[6] <- 'dialysis treatment'
colnames(data_matrix)[5] <- 'age'
colnames(data_matrix)[4] <- 'diabetes mellitus'


## define outcomes as all speficif features
outcomes <- data_lymphocytes_specific_log10 %>% 
  dplyr::select(-id)

# Calculate distance and perform hierarchical clustering
distance_matrix2 <- dist(t(outcomes))
hc2 <- hclust(distance_matrix2, method = 'average')

cluster_assignment <- cutree(hc2, k = 6)

  # Assign colors per cluster
cluster_colors <- c("darkblue", "red4", "darkgreen", "grey40", "orange3", "purple3")
cluster_colors <- c("darkblue", "red4", "purple3", "grey40", "orange3", "darkgreen")

# Map colors to rows based on cluster
label_colors <- cluster_colors[cluster_assignment]

tr <- Heatmap(data_matrix, 
        name = "Spearman rho", 
        cluster_rows = hc2, 
        show_row_dend = TRUE, 
        show_column_dend = FALSE,
        row_names_gp = gpar(fontsize = 11, col = label_colors),
        column_names_side = "top",
        row_dend_width = unit(3.2, "cm"),
        row_split = 6,
        gap = unit(1.5, "mm"),
        row_title = NULL,
        column_title = NULL,
        row_title_rot = 0,
        row_title_side = c("right"),
        row_title_gp = gpar(fontsize = 14, col = label_colors),
        col = col_fun)

print(tr)

Matrix of Spearman correlation coefficients between clinical characteristics and counts of various peripheral blood immune cell subpopulations. The dendrogram illustrates similarities in cell counts, where closely linked tips indicate that the associated lymphocyte subpopulations are interrelated, with higher counts tending to co-occur. Correlations with an entire lymphocyte cluster suggest a specific influence (e.g., CMV infection), while correlations with individual subpopulations across different clusters indicate a more diffuse effect (e.g., smoking). The dendrogram is constructed using the unweighted pair group method with arithmetic mean (UPGMA) clustering. Clusters of subpopulations (as defined by UPGMA) are as follows: (1, grey) plasmablast; (2, red) plasmacytoid dendritic cells (pDC); (3, purple) B cells – including naive B cells, transitional B cells, class-switched memory cells, marginal-zone B cells, and class non-switched memory cells; (4, blue) myeloid dendritic cells (mDC1, mDC2, mDC3); (5, green) naive T cells – CD4+ and CD8+; and (6, orange) memory T cells – including CD4+ and CD8+ TEMRA, effector memory (EM), and central memory (CM) subsets.


pdf("gitignore/figures/heatmap_specific.pdf", width = 6, height = 6)
tr
dev.off()
## png 
##   2

3.5 Univariate analysis

3.5.1 Data

Lets to log10-transform data but not standardize to express log-FC. Connect to relevant clinical data

data_lymphocytes_specific_counts <- data_lymphocytes_specific %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),
    .names = "count_{.col}" 
  )) %>% 
  dplyr::select(-c(`mDC1` : `CD8+ TEMRA cells`)) %>% 
  dplyr::mutate(dplyr::across(
    dplyr::where(is.numeric), replace_zeros
    )) %>% 
  dplyr::mutate(dplyr::across(
    .cols = dplyr::where(is.numeric),
    .fns = ~ round(.x * 1e6) 
  ))
  

data_glm <-  data_clinical %>% 
  dplyr::mutate(receiver_age_30y = receiver_age/30) %>% 
  dplyr::select(
    id, Smoking, CMV, receiver_age_30y, dialysis, ASCVD
    ) %>% 
  dplyr::left_join(data_lymphocytes_specific_counts, by = 'id')

3.5.2 Running models

3.5.2.1 GLM


modelac <- "glm_ranint_poisson"
namesy <- names(data_lymphocytes_specific_counts)[-1]

assign(modelac, run(
  expr = glm_iori_poisson(
    outcomes = namesy,
    data = data_glm
  ),
  path = paste0("gitignore/run/", modelac),
  reuse = TRUE
))

glm_ranint_poisson <- glm_ranint_poisson %>%
  dplyr::mutate(
    Smoking_P_val_adj = p.adjust(Smoking_P_val, method = "BH"),
    cmv_P_val_adj = p.adjust(cmv_P_val, method = "BH"),
    receiver_age_30y_P_val_adj = p.adjust(
      receiver_age_30y_P_val,
      method = "BH"
    ),
    dialysis_P_val_adj = p.adjust(dialysis_P_val, method = "BH"),
    ascvd_P_val_adj = p.adjust(
      ascvd_P_val,
      method = "BH"
    )
  ) %>%
  dplyr::select(
    Outcome,
    Smoking_logFC, Smoking_P_val, Smoking_P_val_adj,
    cmv_logFC, cmv_P_val, cmv_P_val_adj,
    receiver_age_30y_logFC, receiver_age_30y_P_val, receiver_age_30y_P_val_adj,
    dialysis_logFC, dialysis_P_val, dialysis_P_val_adj,
    ascvd_logFC, ascvd_P_val, ascvd_P_val_adj
  )

glm_ranint_poisson_CR <- glm_ranint_poisson %>%
  dplyr::mutate(
    Smoking_CR = round(exp(Smoking_logFC), 2),
    cmv_CR = round(exp(cmv_logFC), 2),
    receiver_age_30y_CR = round(exp(receiver_age_30y_logFC), 2),
    dialysis_CR = round(exp(dialysis_logFC), 2),
    ascvd_CR = round(exp(ascvd_logFC), 2)
  ) %>%
  dplyr::select(
    Outcome,
    Smoking_CR, Smoking_P_val, Smoking_P_val_adj,
    cmv_CR, cmv_P_val, cmv_P_val_adj,
    receiver_age_30y_CR,
    receiver_age_30y_P_val, receiver_age_30y_P_val_adj,
    dialysis_CR, dialysis_P_val, dialysis_P_val_adj,
    ascvd_CR, ascvd_P_val, ascvd_P_val_adj
  )

kableExtra::kbl(glm_ranint_poisson_CR,
  caption = "Results of generalized linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `CR` suffix indicates the count ratio, representing the fold expected change in the outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values coCRected for multiple testing using the Benjamini-Hochberg coCRection, which accounts for the repeated estimation of each predictors effect across 18 outcomes."
)

Results of generalized linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `CR` suffix indicates the count ratio, representing the fold expected change in the outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values coCRected for multiple testing using the Benjamini-Hochberg coCRection, which accounts for the repeated estimation of each predictors effect across 18 outcomes.
Outcome	Smoking_CR	Smoking_P_val	Smoking_P_val_adj	cmv_CR	cmv_P_val	cmv_P_val_adj	receiver_age_30y_CR	receiver_age_30y_P_val	receiver_age_30y_P_val_adj	dialysis_CR	dialysis_P_val	dialysis_P_val_adj	ascvd_CR	ascvd_P_val	ascvd_P_val_adj
count_mDC1_absp	1.45	0.0821632	0.1643264	0.73	0.1668553	0.7508491	0.75	0.1833661	0.5500984	1.22	0.3374549	0.7592735	1.44	0.1379511	0.3547313
count_mDC2_absp	0.66	0.0451806	0.1643264	1.00	0.9974734	0.9974734	1.26	0.2768216	0.7118270	1.14	0.5268359	0.8797687	1.53	0.0719346	0.2589644
count_mDC3_absp	1.12	0.6498684	0.6498684	0.93	0.7861076	0.8323492	0.62	0.0519317	0.3115902	1.35	0.1998693	0.7592735	1.57	0.1022325	0.3066975
count_pDC_absp	1.50	0.0631027	0.1643264	1.12	0.6441707	0.7829139	1.02	0.9413153	0.9966868	0.85	0.4483543	0.8797687	0.80	0.3606195	0.4993193
count_naiveB_absp	1.42	0.1292595	0.1938893	1.13	0.6284973	0.7829139	0.87	0.5432927	0.8994371	1.24	0.3345707	0.7592735	1.39	0.2186927	0.3936468
count_marginal.B	1.61	0.0436430	0.1643264	0.74	0.2367602	0.7829139	1.14	0.5746919	0.8994371	2.26	0.0003625	0.0065243	0.89	0.6576413	0.6847042
count_non_switched_memory	1.65	0.0783700	0.1643264	0.89	0.6959235	0.7829139	1.28	0.3902863	0.7805726	1.49	0.1465900	0.7592735	1.26	0.4782797	0.5739357
count_switched.memory	2.35	0.0009759	0.0175665	1.20	0.5081204	0.7829139	0.97	0.9028493	0.9966868	1.10	0.6922724	0.8900645	1.31	0.3557353	0.4993193
count_Plazmablasty_absp	1.49	0.1913251	0.2459894	0.83	0.5634339	0.7829139	0.65	0.1578373	0.5500984	1.42	0.2319909	0.7592735	1.56	0.2043473	0.3936468
count_Transitional_absp	1.43	0.2577554	0.2899748	1.16	0.6726890	0.7829139	0.92	0.8037449	0.9966868	1.55	0.1544323	0.7592735	1.45	0.3081146	0.4993193
count_CD.4.CM	1.31	0.0565256	0.1643264	0.89	0.4309489	0.7829139	0.95	0.7428517	0.9966868	0.98	0.8587102	0.9349747	1.49	0.0120619	0.1884576
count_CD.4.EM	1.43	0.0671297	0.1643264	1.23	0.3251786	0.7829139	0.90	0.5996248	0.8994371	0.83	0.3318537	0.7592735	1.62	0.0314096	0.1884576
count_CD4_naive_absp	1.34	0.1034590	0.1862263	0.69	0.0570816	0.3424894	0.61	0.0081360	0.0732243	1.01	0.9710655	0.9710655	1.32	0.1833883	0.3936468
count_CD4_TEMRA_absp	1.71	0.1639893	0.2270621	5.79	0.0000286	0.0002574	1.04	0.9299784	0.9966868	0.85	0.6662095	0.8900645	1.20	0.6847042	0.6847042
count_CD8_CM_absp	1.24	0.2954203	0.3127980	1.16	0.4986506	0.7829139	0.81	0.3262163	0.7339866	0.97	0.8830317	0.9349747	1.69	0.0255606	0.1884576
count_CD8_EM_absp	1.36	0.2256127	0.2707352	1.21	0.4844465	0.7829139	0.70	0.1650548	0.5500984	0.87	0.5702644	0.8797687	1.70	0.0705542	0.2589644
count_CD.8.naive	1.55	0.0131338	0.1182038	1.21	0.3135782	0.7829139	0.34	0.0000000	0.0000001	0.91	0.5865124	0.8797687	1.12	0.5710413	0.6424214
count_CD8_TEMRA_absp	1.44	0.1138749	0.1863407	3.01	0.0000117	0.0002106	1.00	0.9986771	0.9986771	1.08	0.7462042	0.8954450	1.24	0.4163171	0.5352649

3.5.2.2 LM with log-transformed outcomes


modelac <- "lm_lognormal_res"
namesy <- names(data_lymphocytes_specific_counts)[-1]

assign(modelac, run(
  expr = lm_lognormal(
    outcomes = namesy,
    data = data_glm
  ),
  path = paste0("gitignore/run/", modelac),
  reuse = TRUE
))

lm_lognormal_res <- lm_lognormal_res %>%
  dplyr::mutate(
    Smoking_P_val_adj = p.adjust(Smoking_P_val, method = "BH"),
    cmv_P_val_adj = p.adjust(cmv_P_val, method = "BH"),
    receiver_age_30y_P_val_adj = p.adjust(
      receiver_age_30y_P_val,
      method = "BH"
    ),
    dialysis_P_val_adj = p.adjust(dialysis_P_val, method = "BH"),
    ascvd_P_val_adj = p.adjust(
      ascvd_P_val,
      method = "BH"
    )
  ) %>%
  dplyr::select(
    Outcome,
    Smoking_logFC, Smoking_P_val, Smoking_P_val_adj,
    cmv_logFC, cmv_P_val, cmv_P_val_adj,
    receiver_age_30y_logFC, receiver_age_30y_P_val, receiver_age_30y_P_val_adj,
    dialysis_logFC, dialysis_P_val, dialysis_P_val_adj,
    ascvd_logFC, ascvd_P_val, ascvd_P_val_adj
  ) 

kableExtra::kbl(lm_lognormal_res,
  caption = "Results of linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `logFC` suffix indicates effect size, representing the expected change in the log-transformed outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values corrected for multiple testing using the Benjamini-Hochberg correction, which accounts for the repeated estimation of each predictors effect across 18 outcomes."
)

Results of linear models showing the association between clinical variables and the composition of peripheral blood immune cell subsets. The `logFC` suffix indicates effect size, representing the expected change in the log-transformed outcome when the predictor increases by one unit. The `P_val` suffix denotes unadjusted P-values, while the P_val_adj suffix represents P-values corrected for multiple testing using the Benjamini-Hochberg correction, which accounts for the repeated estimation of each predictors effect across 18 outcomes.
Outcome	Smoking_logFC	Smoking_P_val	Smoking_P_val_adj	cmv_logFC	cmv_P_val	cmv_P_val_adj	receiver_age_30y_logFC	receiver_age_30y_P_val	receiver_age_30y_P_val_adj	dialysis_logFC	dialysis_P_val	dialysis_P_val_adj	ascvd_logFC	ascvd_P_val	ascvd_P_val_adj
count_mDC1	0.3711239	0.0948305	0.1896610	-0.3197519	0.1829036	0.7934812	-0.2887731	0.1996895	0.5990684	0.1989222	0.3541938	0.7969361	0.3626161	0.1531878	0.3939114
count_mDC2	-0.4144419	0.0548502	0.1896610	0.0007097	0.9975553	0.9975553	0.2289242	0.2935603	0.7548693	0.1272455	0.5404066	0.8982470	0.4264018	0.0838837	0.3019814
count_mDC3	0.1090727	0.6605534	0.6605534	-0.0706402	0.7928688	0.8395081	-0.4747767	0.0622559	0.3735356	0.2991869	0.2164798	0.7969361	0.4495573	0.1160404	0.3481212
count_pDC	0.4070755	0.0744107	0.1896610	0.1096121	0.6549940	0.7934812	0.0163910	0.9431970	0.9986792	-0.1612947	0.4636596	0.8982470	-0.2292577	0.3771888	0.5222614
count_naive B cells	0.3523952	0.1442311	0.2163467	0.1217758	0.6397513	0.7934812	-0.1435178	0.5567140	0.9174456	0.2177035	0.3514220	0.7969361	0.3271442	0.2355266	0.4239479
count_marginal-zone B cells	0.4748985	0.0531207	0.1896610	-0.3017388	0.2537325	0.7934812	0.1342758	0.5873032	0.9174456	0.8152768	0.0008029	0.0144526	-0.1194329	0.6681030	0.6944102
count_class non-switched memory cells	0.4990383	0.0908003	0.1896610	-0.1200495	0.7053166	0.7934812	0.2475932	0.4064960	0.8129919	0.3997357	0.1621042	0.7969361	0.2301580	0.4930265	0.5916318
count_class switched memory cells	0.8528785	0.0018573	0.0334315	0.1854228	0.5223059	0.7934812	-0.0320384	0.9061113	0.9986792	0.0994278	0.7017471	0.9022463	0.2734379	0.3724143	0.5222614
count_plasmablast	0.3972205	0.2078180	0.2671946	-0.1902894	0.5763191	0.7934812	-0.4364630	0.1736525	0.5990684	0.3529092	0.2489292	0.7969361	0.4417628	0.2210217	0.4239479
count_transitional B cells	0.3571943	0.2748097	0.3091609	0.1444667	0.6827302	0.7934812	-0.0797304	0.8099440	0.9986792	0.4365448	0.1701619	0.7969361	0.3682896	0.3250975	0.5222614
count_CD4+ CM cells	0.2662910	0.0672589	0.1896610	-0.1191345	0.4466116	0.7934812	-0.0465784	0.7508338	0.9986792	-0.0241084	0.8633987	0.9389245	0.4013788	0.0166557	0.2364102
count_CD4+ EM cells	0.3600509	0.0787413	0.1896610	0.2096334	0.3420507	0.7934812	-0.1049344	0.6116304	0.9174456	-0.1853401	0.3487012	0.7969361	0.4844738	0.0394017	0.2364102
count_CD4+ naive cells	0.2956169	0.1173124	0.2098389	-0.3742337	0.0678888	0.4073327	-0.4883709	0.0117519	0.1057674	0.0064000	0.9719739	0.9719739	0.2765180	0.1997567	0.4239479
count_CD4+ TEMRA cells	0.5389918	0.1799550	0.2491685	1.7557156	0.0000992	0.0008931	0.0345931	0.9322192	0.9986792	-0.1622497	0.6764319	0.9022463	0.1800379	0.6944102	0.6944102
count_CD8+ CM cells	0.2149534	0.3124541	0.3308338	0.1506054	0.5129856	0.7934812	-0.2050316	0.3430901	0.7719528	-0.0293573	0.8867620	0.9389245	0.5251825	0.0326982	0.2364102
count_CD8+ EM cells	0.3106208	0.2424994	0.2909993	0.1941851	0.4990655	0.7934812	-0.3617595	0.1810423	0.5990684	-0.1413279	0.5829951	0.8982470	0.5307058	0.0824452	0.3019814
count_CD8+ naive cells	0.4384032	0.0179858	0.1618722	0.1930028	0.3305317	0.7934812	-1.0653236	0.0000001	0.0000018	-0.0933802	0.5988313	0.8982470	0.1146529	0.5837404	0.6567079
count_CD8+ TEMRA cells	0.3664244	0.1282349	0.2098389	1.1007013	0.0000485	0.0008731	-0.0003711	0.9987836	0.9987836	0.0728481	0.7541683	0.9050020	0.2157022	0.4321354	0.5556027

3.5.3 Volcano plot

General setting and getting legend


glm_ranint_poisson$Outcome <- c('mDC1', 'mDC2', 'mDC3', 'pDC', 
  'naive B cells', 'marginal-zone B cells',
  'class non-switched memory cells', 'class switched memory cells', 
  'plasmablast', 'transitional B cells', 
  'CD4+ CM cells', 'CD4+ EM cells', 'CD4+ naive cells', 'CD4+ TEMRA cells', 
  'CD8+ CM cells', 'CD8+ EM cells', 'CD8+ naive cells', 'CD8+ TEMRA cells')

color_map <- c('Increased' = '#A83226', 
               'Decreased' = '#2A5CA8', 
               'FDR>0.05' = 'grey20') 


dummy_data <- data.frame(
  Count = factor(c("Increased", "Decreased", "FDR>0.05"), levels = c("Increased", "Decreased", "FDR>0.05")),
  x = 1:3,
  y = 1:3
)

dummy_plot <- ggplot(dummy_data, aes(x, y, color = Count)) +
  geom_point() +
  scale_color_manual(values = color_map) +
  theme(legend.position = "bottom",
        legend.text = element_text(size = 14),
        legend.title = element_text(size = 14))

get_legend <- function(my_ggplot) {
  tmp <- ggplot_gtable(ggplot_build(my_ggplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}

common_legend <- get_legend(dummy_plot)

3.5.3.1 Smoking

plotac <- 'fig1'
predictor <- 'Smoking'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             pv_adj = if_else(pv_adj < 1e-6, 1e-6, pv_adj),
             pv_raw = if_else(pv_raw < 1e-6, 1e-6, pv_raw)) %>%

           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

           geom_text_repel(aes(
             x = m_xlab,
             y = m_ylab),
             show.legend = FALSE,
             alpha = 1,
             box.padding = 0.75,
             size = 3.7,
             seed = 17) +

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('smoking') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.2 Cytomegalovirus

plotac <- 'fig2'
predictor <- 'cmv'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             pv_adj = if_else(pv_adj < 1e-6, 1e-6, pv_adj),
             pv_raw = if_else(pv_raw < 1e-6, 1e-6, pv_raw)) %>%

           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

           geom_text_repel(aes(
             x = m_xlab,
             y = m_ylab),
             show.legend = FALSE,
             alpha = 1,
             box.padding = 0.6,
             size = 4.1,
             seed = 17) +

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('CMV seropositivity') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.3 Receiver age (effect of 30 years plus)

plotac <- "fig3"
predictor <- "receiver_age_30y"

path <- paste0("gitignore/figures/", plotac, "_", predictor, ".rds")

if (file.exists(path) == FALSE) {
  assign(plotac, glm_ranint_poisson %>%
    dplyr::select(Outcome, contains(predictor)) %>%
    dplyr::mutate(
      pv_adj = get(paste0(predictor, "_P_val_adj")),
      pv_raw = get(paste0(predictor, "_P_val")),
      b_coe = get(paste0(predictor, "_logFC")),
    ) %>%
    dplyr::mutate(
      m_ylab = -log10(pv_raw),
      m_xlab = b_coe,
      rescat = if_else(
        pv_adj >= 0.05,
        "FDR>0.05",
        if_else(
          b_coe > 0,
          "Increased",
          "Decreased"
        )
      ),
      m_ylab = if_else(m_ylab > 6, 6, m_ylab)
    ) %>%
    mutate(
      rescat = factor(rescat,
        levels = c("Increased", "Decreased", "FDR>0.05")
      ),
      meta_labels = if_else(pv_raw < 0.05, Outcome, "")
    ) %>%
    mutate(
      meta_labels = str_remove(meta_labels, "count_")
    ) %>%
    ggplot(aes(
      x = b_coe,
      y = m_ylab,
      color = rescat,
      alpha = rescat,
      label = meta_labels
    )) +
    geom_point(size = 2, shape = 16, alpha = 0.6) +
    geom_vline(
      xintercept = 0,
      linetype = "dashed",
      linewidth = 0.2,
      color = "grey30"
    ) +
    geom_text_repel(
      aes(
        x = m_xlab,
        y = m_ylab
      ),
      show.legend = FALSE,
      alpha = 1,
      box.padding = 0.7,
      size = 4.1,
      seed = 17
    ) +
    labs(x = "log(fold count increase)", y = "-log10 (P-value)") +
    scale_color_manual(values = color_map, name = "") +
    ggtitle("age (30y)") +
    theme(
      legend.text = element_text(size = 14),
      axis.title = element_text(size = 14)
    ) +
    coord_cartesian(
      ylim = c(0, 6),
      xlim = c(-1.8, 1.8)
    ))

  saveRDS(get(plotac), file = path)
  get(plotac)
} else {
  assign(plotac, readRDS(path))
  get(plotac)
}

3.5.3.4 Dialysis

plotac <- 'fig4'
predictor <- 'dialysis'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             m_ylab = if_else(m_ylab > 6, 6, m_ylab)
             ) %>%
           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

             geom_text(aes(
              x = m_xlab - 0.3,
              y = m_ylab - 0.15, 
              label = meta_labels),
              size = 4.1,
              hjust = 0.5,  
              vjust = 1,
              alpha = 1) +  

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('dialysis treatment') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.5 ASCVD

plotac <- 'fig5'
predictor <- 'ascvd'

path = paste0('gitignore/figures/',plotac,'_', predictor, '.rds')

if(file.exists(path) == FALSE){
  assign(plotac, glm_ranint_poisson %>%
           dplyr::select(Outcome, contains(predictor)) %>%
           dplyr::mutate(
             pv_adj = get(paste0(predictor, '_P_val_adj')),
             pv_raw = get(paste0(predictor, '_P_val')),
             b_coe = get(paste0(predictor, '_logFC')),
             ) %>%
           dplyr::mutate(
             m_ylab = -log10(pv_raw),
             m_xlab = b_coe,
             rescat = if_else(
               pv_adj >= 0.05,
               'FDR>0.05',
               if_else(
                 b_coe > 0,
                 'Increased',
                 'Decreased')),
             m_ylab = if_else(m_ylab > 6, 6, m_ylab)
             ) %>%
           mutate(
             rescat = factor(rescat,
                             levels = c('Increased', 'Decreased', 'FDR>0.05')),
             meta_labels = if_else(pv_raw < 0.05, Outcome, '')) %>%
           mutate(
             meta_labels = str_remove(meta_labels, 'count_')
           ) %>% 
           ggplot(aes(x = b_coe,
                      y = m_ylab,
                      color = rescat,
                      alpha = rescat,
                      label = meta_labels)) +

           geom_point(size = 2, shape = 16, alpha = 0.6) +
           geom_vline(xintercept = 0, 
                      linetype = 'dashed', 
                      linewidth = 0.2, 
                      color = 'grey30') +

           geom_text_repel(aes(
             x = m_xlab,
             y = m_ylab),
             show.legend = FALSE,
             alpha = 1,
             box.padding = 0.7,
             size = 4.1,
             seed = 17) +

           labs(x = 'log(fold count increase)', y = '-log10 (P-value)') +
           scale_color_manual(values = color_map, name = '') +
           ggtitle('ASCVD') +
           theme(legend.text = element_text(size = 14),
                 axis.title = element_text(size = 14))+
           coord_cartesian(ylim = c(0, 6),
                           xlim = c(-1.8, 1.8)) 
         )

  saveRDS(get(plotac), file = path)
  get(plotac)

} else {assign(plotac, readRDS(path))
    get(plotac)}

3.5.3.6 Volcans matrix

plotac <- "fig6"
predictor <- "volcans_matrix"

path <- paste0("gitignore/figures/", plotac, "_", predictor, ".rds")

if (!file.exists(path)) {
  # Remove legends from individual plots
  fig1 <- fig1 + theme(legend.position = "none")
  fig2 <- fig2 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )
  fig3 <- fig3 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )
  fig4 <- fig4 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )
  fig5 <- fig5 + theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

  combined_plot <- ggarrange(
    fig1,
    fig2,
    fig3,
    fig4,
    fig5,
    nrow = 1,
    ncol = 5,
    widths = c(1.1, 1, 1, 1, 1)
  )

  final_plot <- ggarrange(combined_plot,
    common_legend,
    ncol = 1,
    align = "hv",
    heights = c(10, 1)
  )

  saveRDS(final_plot, file = path)

  ggsave(
    filename = paste0("gitignore/figures/", plotac, "_", predictor),
    device = "pdf",
    height = 5,
    width = 11.5
  )

  final_plot
} else {
  assign(plotac, readRDS(path))
  get(plotac)
}

The volcano plot illustrates the effects of clinical characteristics on the counts of various peripheral blood immune cell subpopulations, estimated using a generalized linear model (GLM) with a Poisson distribution. The x-axis represents the effect size of each predictor on the outcomes. For example, an effect size of 0.85 for smoking on switched memory lymphocytes suggests that smoking is associated with a 2.33-fold increase in the count of these cells, as determined by exp(0.85). The y-axis displays the −log⁡10(P-value) for each association, before P-value adjustment. Only associations with unadjusted P-values below 0.05 are labeled.

4 Reproducibility

sessionInfo()
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=cs_CZ.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=cs_CZ.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=cs_CZ.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Prague
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggrepel_0.9.5         glmmTMB_1.1.9         ComplexHeatmap_2.20.0
##  [4] circlize_0.4.16       vegan_2.6-6.1         lattice_0.22-5       
##  [7] permute_0.9-7         ggbeeswarm_0.6.0      quantreg_5.98        
## [10] SparseM_1.81          coxed_0.3.3           survival_3.7-0       
## [13] rms_6.8-1             Hmisc_5.1-3           bayesplot_1.8.1      
## [16] ggdist_3.3.2          kableExtra_1.4.0      lubridate_1.8.0      
## [19] corrplot_0.92         arm_1.12-2            lme4_1.1-35.5        
## [22] MASS_7.3-65           janitor_2.2.0         brms_2.21.0          
## [25] Rcpp_1.0.13           glmnet_4.1-8          Matrix_1.7-0         
## [28] boot_1.3-31           cowplot_1.1.1         pROC_1.18.0          
## [31] mgcv_1.9-1            nlme_3.1-168          openxlsx_4.2.8       
## [34] flextable_0.9.6       sjPlot_2.8.16         RJDBC_0.2-10         
## [37] rJava_1.0-6           DBI_1.1.2             car_3.1-2            
## [40] carData_3.0-5         gtsummary_2.0.2       emmeans_1.10.4       
## [43] ggpubr_0.4.0          stringi_1.7.6         forcats_1.0.0        
## [46] stringr_1.5.1         dplyr_1.1.4           purrr_1.0.2          
## [49] readr_2.1.2           tidyr_1.3.1           tibble_3.2.1         
## [52] ggplot2_3.5.1         tidyverse_1.3.1      
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.4                matrixStats_1.3.0       httr_1.4.2             
##   [4] RColorBrewer_1.1-2      insight_0.20.2          doParallel_1.0.17      
##   [7] repr_1.1.7              numDeriv_2016.8-1.1     tools_4.4.3            
##  [10] backports_1.5.0         sjlabelled_1.2.0        utf8_1.2.4             
##  [13] R6_2.5.1                GetoptLong_1.0.5        withr_3.0.1            
##  [16] Brobdingnag_1.2-7       prettyunits_1.1.1       gridExtra_2.3          
##  [19] cli_3.6.3               textshaping_0.3.6       performance_0.12.2     
##  [22] gt_0.11.0               Cairo_1.6-2             officer_0.6.6          
##  [25] sandwich_3.0-1          labeling_0.4.2          sass_0.4.9             
##  [28] mvtnorm_1.1-3           polspline_1.1.25        ggridges_0.5.3         
##  [31] askpass_1.1             QuickJSR_1.3.1          commonmark_1.9.1       
##  [34] systemfonts_1.0.4       StanHeaders_2.32.10     foreign_0.8-90         
##  [37] gfonts_0.2.0            svglite_2.1.3           readxl_1.3.1           
##  [40] rstudioapi_0.16.0       httpcode_0.3.0          generics_0.1.3         
##  [43] shape_1.4.6             vroom_1.5.7             distributional_0.4.0   
##  [46] zip_2.2.0               inline_0.3.19           loo_2.4.1              
##  [49] fansi_1.0.6             S4Vectors_0.42.1        abind_1.4-5            
##  [52] lifecycle_1.0.4         multcomp_1.4-18         yaml_2.3.5             
##  [55] snakecase_0.11.1        promises_1.2.0.1        crayon_1.5.0           
##  [58] haven_2.4.3             magick_2.7.3            pillar_1.9.0           
##  [61] knitr_1.48              rjson_0.2.21            estimability_1.5.1     
##  [64] codetools_0.2-19        glue_1.7.0              V8_4.4.2               
##  [67] fontLiberation_0.1.0    data.table_1.15.4       vctrs_0.6.5            
##  [70] png_0.1-7               cellranger_1.1.0        gtable_0.3.0           
##  [73] assertthat_0.2.1        datawizard_0.12.2       xfun_0.46              
##  [76] mime_0.12               coda_0.19-4             iterators_1.0.14       
##  [79] ellipsis_0.3.2          TH.data_1.1-0           bit64_4.0.5            
##  [82] fontquiver_0.2.1        rstan_2.32.6            tensorA_0.36.2.1       
##  [85] TMB_1.9.14              vipor_0.4.5             rpart_4.1.24           
##  [88] colorspace_2.0-2        BiocGenerics_0.50.0     nnet_7.3-20            
##  [91] tidyselect_1.2.1        processx_3.8.4          bit_4.0.4              
##  [94] compiler_4.4.3          curl_4.3.2              rvest_1.0.2            
##  [97] htmlTable_2.4.0         xml2_1.3.3              fontBitstreamVera_0.1.1
## [100] posterior_1.6.0         checkmate_2.3.2         scales_1.3.0           
## [103] callr_3.7.6             digest_0.6.37           minqa_1.2.4            
## [106] rmarkdown_2.27          htmltools_0.5.8.1       pkgconfig_2.0.3        
## [109] base64enc_0.1-3         highr_0.11              dbplyr_2.1.1           
## [112] fastmap_1.2.0           rlang_1.1.4             GlobalOptions_0.1.2    
## [115] htmlwidgets_1.6.4       shiny_1.9.1             farver_2.1.0           
## [118] zoo_1.8-9               jsonlite_1.8.8          magrittr_2.0.3         
## [121] Formula_1.2-4           munsell_0.5.0           gdtools_0.3.7          
## [124] plyr_1.8.6              pkgbuild_1.3.1          parallel_4.4.3         
## [127] sjmisc_2.8.10           ggeffects_1.7.0         splines_4.4.3          
## [130] hms_1.1.1               sjstats_0.19.0          ps_1.7.7               
## [133] uuid_1.0-3              markdown_1.13           ggsignif_0.6.3         
## [136] stats4_4.4.3            rstantools_2.1.1        crul_1.5.0             
## [139] reprex_2.0.1            evaluate_1.0.0          RcppParallel_5.1.8     
## [142] modelr_0.1.8            nloptr_2.0.0            tzdb_0.2.0             
## [145] foreach_1.5.2           httpuv_1.6.5            cards_0.2.2            
## [148] MatrixModels_0.5-3      openssl_1.4.6           clue_0.3-65            
## [151] cardx_0.2.1             skimr_2.1.5             broom_1.0.6            
## [154] xtable_1.8-4            rstatix_0.7.0           later_1.3.0            
## [157] viridisLite_0.4.0       ragg_1.2.1              beeswarm_0.4.0         
## [160] IRanges_2.38.1          cluster_2.1.8.1         bridgesampling_1.1-2

References

Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society Series B: Statistical Methodology 57 (1): 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x.

Brooks, Mollie E., Kasper Kristensen, Koen J. van, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler, and Benjamin M. Bolker. 2017. “glmmTMB Balances Speed and Flexibility Among Packages for Zero-Inflated Generalized Linear Mixed Modeling” 9. https://doi.org/10.32614/RJ-2017-066.

Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” https://doi.org/10.1093/bioinformatics/btw313.

Oksanen, Jari, Gavin L. Simpson, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre, Peter R. Minchin, R. B. O’Hara, et al. 2024. “Vegan: Community Ecology Package.” https://CRAN.R-project.org/package=vegan.

R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.