Low Risk of Prolonged SARS-CoV-2 Shedding and Molecular Evolution in Kidney Transplant Recipients during the Omicron Era: A Prospective Observational Study
Statistical report
Authors and affiliations:
Ivan Zahradka1, Vojtech Petr1, Jan Paces2, Jana Zdychova3, Alena Srbova3, Radomira Limberkova4, Timotej Suri4, Filip Tichanek5, Denisa Husakova4, Helena Jirincova4, Miluse Hradilova2, Ilja Striz3, Ondrej Viklicky1
1 Department of Nephrology, Institute for Clinical and Experimental Medicine.
2 Institute of Molecular Genetics, Academy of Sciences of the Czech Republic.
3 Department of Immunology, Institute for Clinical and Experimental Medicine.
4 National Reference Laboratory for Influenza and Other Respiratory Viruses, National Institute of Public Health.
5 Department of Data Science, Institute for Clinical and Experimental Medicine.
This is a statistical report of the study Low Risk of Prolonged SARS-CoV-2 Shedding and Molecular Evolution in Kidney Transplant Recipients during the Omicron Era: A Prospective Observational Study that is accepted in the American Journal of Transplantation journal.
When using this code or data, cite the original publication:
Zahradka, Ivan, Vojtech Petr, Jan Paces, Jana Zdychova, Alena Srbova, Radomira Limberkova, Timotej Suri, et al. 2024. “Low Risk of Prolonged SARS-CoV-2 Shedding and Molecular Evolution in Kidney Transplant Recipients During the Omicron Era: A Prospective Observational Study.” American Journal of Transplantation, December. https://doi.org/10.1016/j.ajt.2024.11.031.
BibTex citation for the original publication: is provided in CITATION.bib
Original GitHub repository: https://github.com/filip-tichanek/covid_viability
Statistical code with all details of statistical analysis is shown a HTML report.
Custom functions are shown in a separate R file
1 Statistical analysis description
Statistical analysis was performed in R (R Core Team 2023). The entire analytical procedure, including the R code, is available online at: https://filip-tichanek.github.io/covid_viability/
Repeatedly measured data (see probability maps models and COVID mutations over time below) were analyzed using a Bayesian framework (Bayesian hierarchical models) to better accommodate the uncertainty associated with the very limited number of independent units (patients) in our study (McElreath 2018; Schoot and Miočević 2020).
1.1 Probability maps
Initially, we used a Bayesian hierarchical model with a Bernoulli distribution to estimate the probability of a viable virus (via) based on the time post the symptoms onset (time) and Ct. Both predictors were assumed to have a linear and additive effect, as exploratory analysis did not suggest the importance of non-linearity and/or interactions. The model was fitted using the brms package (Bürkner 2018, 2017), employing Stan software for probabilistic computing (Stan Development Team 2021). Given that the observations are from repeatedly measured patients, the subject (patient) ID (patient_ID) was included in the model as a random intercept.
Based on the model, we created two maps showing the estimated probability of the viable virus based on the two predictors (time, Ct). One map (fitted map) shows probabilities fitted with the median posterior estimate, not accounting for the uncertainty related to parameter estimation. The second map (whole posterior map) includes all uncertainties in the estimates.
The probabilities based on the fitted map address the question: “What is the probability of a viable virus if the estimated parameters were true?” This approach does not include uncertainty about the parameters. In other words, a probability of viability under 5% does NOT directly reflect a truly minimal risk of viability but represents a “hypothetical” scenario (assuming that the model parameters were estimated correctly). Conversely, the whole posterior map provides the answer to the question: “What is the probability of a viable virus given the available data, also accounting for the uncertainties in the model?”.
The Bayesian models were run using priors specified based on results from a previous study (Berengua et al. 2022).
Mathematically, \(P(\beta | \mathbf{X}, Y)\) is proportional to the product of the prior distribution \(P(\beta)\) and the likelihood \(P(Y | \mathbf{X}, \beta)\):
\[ P(\beta | \mathbf{X}, Y) \propto P(Y | \mathbf{X}, \beta) \times P(\beta) \]
We ran models with both weakly informative priors (with minimal impact on the resulting estimates) and informative (strong) priors. Both types of priors had a normal (Gaussian) distribution, centered at the effect estimated from Berengua data (Berengua et al. 2022), with the spread defined as \(2*SE\) of the estimate (strong prior) or \(10*SE\) (weakly informative prior).
Therefore, the priors can be expressed as \(normal(\mu, \sigma)\), where:
\[\mu = \beta_{berengua}\]
\[\sigma = SE(\beta_{berengua}) \times 2\] for informative prior
and
\[\sigma = SE(\beta_{berengua}) \times 10\] for weakly informative prior, with \(\beta_{berengua}\) being estimated effect from (Berengua et al. 2022), and \(SE[\beta_{berengua}]\) being standard error of estimated effects according to (Berengua et al. 2022).
1.2 COVID mutations over time
The mutation frequency over time was modeled using a Bayesian hierarchical regression model with a poisson distribution and using the same packages as the probability maps (see above). An individual patient was included as a random intercept together with individual observation to handle potential overdispersion. The model included time (week), treatment group (molnupiravir, with remdesivir as the reference), and their interaction (week:molnupiravir).
Weakly informative priors were used to slightly regularize the estimated fixed effects, pushing them toward the null effect, except for week:molnupiravir interaction, where a positive interaction was expected due to the mutagenicity of molnupiravir treatment as recently described by Fountain-Jones (Fountain-Jones et al. 2024).
The priors for the main fixed effects were defined as Gaussian distributions centered at zero:
For the numerical predictor week: \(\mu = 0\) and \(\sigma = 4\times\text{SD}(\text{week})\)
For the binary predictor molnupiravir: \(\mu = 0\) and \(\sigma = 2\)
The prior for the week:molnupiravir interaction was set to \(\mu = 0.1\) and \(\sigma = 0.5\), reflecting the expected association of molnupiravir with slightly higher mutagenicity.
1.3 Univariable negative binomial regression
Negative binomial generalized linear models (NB-GLM), fitted with MASS package (Venables and Ripley 2002), were used to assess the effect of several predictors (fitted in separate univariable models) on the number of days when the patient was positive as shown either using PCR (symptoms_neg_PCR_days) or cell cultures (symptoms_neg_viability_days). Due to the small sample size, P-values were validated using a permutation test of the model coefficients. The small sample size is also the reason why we did not perform multivariable model. As a sensitivity analysis, we also performed two-variables NB-GLM, with the same predictors as above plus with days after last vaccination as a covariate.
2 Analysis
2.1 Initialization
2.1.1 Packages
Open code
if (TRUE) {rm(list = ls() )}
if (TRUE) {
suppressWarnings(suppressMessages({
library(tidyverse)
library(stringr)
library(stringi)
library(ggpubr)
library(emmeans)
library(gtsummary)
library(car)
library(RJDBC)
library(sjPlot)
library(flextable)
library(openxlsx)
library(mgcv)
library(pROC)
library(cowplot)
library(boot)
library(glmnet)
library(brms)
library(projpred)
library(janitor)
library(arm)
library(corrplot)
library(lubridate)
library(kableExtra)
library(ggdist)
library(bayesplot)
library(coxed)
library(quantreg)
library(ggbeeswarm)
library(ggpubr)
library(ggdag)
library(dagitty)
# Functions clashes
select <- dplyr::select
rename <- dplyr::rename
mutate <- dplyr::mutate
recode <- dplyr::recode
summarize <- dplyr::summarize
count <- dplyr::count
# Simple math functions
logit <-function(x){log(x/(1-x))}
inv_logit <- function(x){exp(x)/(1+exp(x))}
}))
}2.1.2 Functions
Open code
getwd()
## [1] "/home/ticf/GitRepo/ticf/446_ZAHI_covid_viability"
source('functions.R')2.1.3 Directories
Open code
folders <- c("data",
"gitignore",
"gitignore/run",
"gitignore/figures",
"gitignore/outputs",
"gitignore/data")
invisible(
lapply(folders, function(x) if (!dir.exists(x))
dir.create(x, recursive = TRUE)
)
)2.1.4 Seed
Open code
set.seed(16)2.2 Data
2.2.1 Import
Open code
data <- read.xlsx('gitignore/data/zdroj virova rna2.xlsx') %>%
select(patient_ID:T_depl_year,
D0_date:D6_PCR,
PRA:DM,
days_last_dose:vacc.last.date) %>%
mutate(patient_ID = as.character(patient_ID),
antiviral = as.numeric(antiviral-1)) %>%
mutate(across(where(is.character),
~if_else(. == "neg.", "0", .))) %>%
mutate(across(where(is.character),
~if_else(. == "neg", "0", .)))
dates <- colnames(data)[grepl('date', colnames(data))]
data <- excel_date(data, dates = dates) %>%
mutate(days_postsymp = D0_date - symptoms_date)
summary(data)
## patient_ID male_sex age_at_COVID birth_date
## Length:23 Min. :0.0000 Min. :28.52 Min. :1952-07-06
## Class :character 1st Qu.:0.5000 1st Qu.:48.19 1st Qu.:1961-08-12
## Mode :character Median :1.0000 Median :52.87 Median :1970-04-13
## Mean :0.7391 Mean :53.47 Mean :1969-07-11
## 3rd Qu.:1.0000 3rd Qu.:61.28 3rd Qu.:1974-10-16
## Max. :1.0000 Max. :70.65 Max. :1994-08-18
##
## tx_date symptoms_date positivity_date
## Min. :2007-09-13 Min. :2022-06-29 Min. :2022-06-30
## 1st Qu.:2014-05-29 1st Qu.:2022-10-08 1st Qu.:2022-10-09
## Median :2019-10-22 Median :2023-01-14 Median :2023-01-14
## Mean :2018-03-13 Mean :2022-12-16 Mean :2022-12-18
## 3rd Qu.:2022-06-22 3rd Qu.:2023-02-16 3rd Qu.:2023-02-20
## Max. :2023-03-03 Max. :2023-06-19 Max. :2023-06-20
##
## neg_PCR_date neg_viability_date symptoms_neg_PCR_days
## Min. :2022-08-01 Min. :2022-07-21 Min. : 8.00
## 1st Qu.:2022-10-29 1st Qu.:2022-10-18 1st Qu.:15.00
## Median :2023-01-30 Median :2023-01-26 Median :18.00
## Mean :2023-01-08 Mean :2022-12-29 Mean :22.74
## 3rd Qu.:2023-03-13 3rd Qu.:2023-02-27 3rd Qu.:29.50
## Max. :2023-07-27 Max. :2023-07-13 Max. :61.00
##
## symptoms_neg_viability_days PCR_viability_difference antiviral
## Min. : 4.00 Min. : 0.000 Min. :0.0000
## 1st Qu.: 8.00 1st Qu.: 4.500 1st Qu.:0.0000
## Median :11.00 Median : 8.000 Median :1.0000
## Mean :13.04 Mean : 9.696 Mean :0.6087
## 3rd Qu.:13.50 3rd Qu.:13.500 3rd Qu.:1.0000
## Max. :33.00 Max. :31.000 Max. :1.0000
##
## days_from_Tx COVID_first_year T_depl_year D0_date
## Min. : 9 Min. :0.0000 Min. :0.0000 Min. :2022-06-30
## 1st Qu.: 80 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:2022-10-11
## Median :1210 Median :0.0000 Median :0.0000 Median :2023-01-16
## Mean :1739 Mean :0.3478 Mean :0.2609 Mean :2022-12-19
## 3rd Qu.:3196 3rd Qu.:1.0000 3rd Qu.:0.5000 3rd Qu.:2023-02-20
## Max. :5529 Max. :1.0000 Max. :1.0000 Max. :2023-06-21
##
## D0_via D0_PCR D1_date D1_via
## Min. :0.0000 Min. :12.80 Min. :2022-07-04 Min. :0.0000
## 1st Qu.:1.0000 1st Qu.:18.30 1st Qu.:2022-10-18 1st Qu.:0.0000
## Median :1.0000 Median :21.20 Median :2023-01-26 Median :0.0000
## Mean :0.7647 Mean :21.84 Mean :2022-12-26 Mean :0.2174
## 3rd Qu.:1.0000 3rd Qu.:26.45 3rd Qu.:2023-02-27 3rd Qu.:0.0000
## Max. :1.0000 Max. :32.50 Max. :2023-06-26 Max. :1.0000
## NA's :6
## D1_PCR D2_date D2_via D2_PCR
## Length:23 Min. :2022-07-11 Min. :0.0000 Length:23
## Class :character 1st Qu.:2022-10-10 1st Qu.:0.0000 Class :character
## Mode :character Median :2023-01-30 Median :0.0000 Mode :character
## Mean :2022-12-30 Mean :0.2353
## 3rd Qu.:2023-03-14 3rd Qu.:0.0000
## Max. :2023-06-29 Max. :1.0000
## NA's :4 NA's :6
## D3_date D3_via D3_PCR D4_date
## Min. :2022-07-18 Min. :0.00 Length:23 Min. :2022-07-25
## 1st Qu.:2022-09-30 1st Qu.:0.00 Class :character 1st Qu.:2022-10-09
## Median :2022-11-09 Median :0.00 Mode :character Median :2022-11-15
## Mean :2022-11-30 Mean :0.25 Mean :2022-12-31
## 3rd Qu.:2023-03-03 3rd Qu.:0.25 3rd Qu.:2023-03-31
## Max. :2023-04-04 Max. :1.00 Max. :2023-07-13
## NA's :14 NA's :15 NA's :16
## D4_via D4_PCR D5_date D5_via
## Min. :0.0000 Length:23 Min. :2022-08-01 Min. :0
## 1st Qu.:0.0000 Class :character 1st Qu.:2022-12-17 1st Qu.:0
## Median :0.0000 Mode :character Median :2023-05-05 Median :0
## Mean :0.1429 Mean :2023-03-01 Mean :0
## 3rd Qu.:0.0000 3rd Qu.:2023-06-15 3rd Qu.:0
## Max. :1.0000 Max. :2023-07-27 Max. :0
## NA's :16 NA's :20 NA's :20
## D5_PCR D6_date D6_via D6_PCR
## Length:23 Min. :2022-08-08 Min. :0 Length:23
## Class :character 1st Qu.:2022-10-16 1st Qu.:0 Class :character
## Mode :character Median :2022-12-24 Median :0 Mode :character
## Mean :2022-12-24 Mean :0
## 3rd Qu.:2023-03-03 3rd Qu.:0
## Max. :2023-05-12 Max. :0
## NA's :21 NA's :22
## PRA HLA.missmatch Tx.přes.DSA Doba.na.HD.(roky)
## Min. : 0.0 Min. :1.000 Min. :0.0000 Min. : 0.000
## 1st Qu.: 1.5 1st Qu.:2.000 1st Qu.:0.0000 1st Qu.: 0.500
## Median :13.0 Median :3.000 Median :0.0000 Median : 1.700
## Mean :17.3 Mean :3.217 Mean :0.1304 Mean : 2.587
## 3rd Qu.:20.0 3rd Qu.:4.000 3rd Qu.:0.0000 3rd Qu.: 3.500
## Max. :89.0 Max. :6.000 Max. :1.0000 Max. :15.100
##
## CMV.pozitivita.příjemce CMV.dárce imunosuprese.před.Tx DM
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :1.0000 Median :1.0000 Median :0.0000 Median :0.0000
## Mean :0.6087 Mean :0.6957 Mean :0.3478 Mean :0.2174
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
##
## days_last_dose vacc.last.date days_postsymp
## Min. : 43.0 Min. :2021-02-06 Length:23
## 1st Qu.:196.0 1st Qu.:2021-10-21 Class :difftime
## Median :339.5 Median :2021-11-26 Mode :numeric
## Mean :332.7 Mean :2022-01-16
## 3rd Qu.:509.0 3rd Qu.:2022-07-25
## Max. :614.0 Max. :2022-11-11
## NA's :1 NA's :1
dim(data)
## [1] 23 482.2.2 Long format
Open code
if(file.exists('data/data_long.txt') == FALSE){
data_long <- data %>%
mutate(D0_PCR = as.character(D0_PCR)) %>%
pivot_longer(
cols = -c(patient_ID:T_depl_year, days_postsymp),
names_to = c("time", ".value"),
names_pattern = "(D\\d+)_(.*)") %>%
filter(!is.na(PCR)) %>%
mutate(PCR = as.numeric(PCR)) %>%
mutate(time = if_else(time == 'D0', '0', time),
time = if_else(time == 'D1', '1', time),
time = if_else(time == 'D2', '2', time),
time = if_else(time == 'D3', '3', time),
time = if_else(time == 'D4', '4', time),
time = if_else(time == 'D5', '5', time),
time = if_else(time == 'D6', '6', time)) %>%
mutate(time = as.numeric(time) + (as.numeric(days_postsymp)/7)) %>%
mutate(time = if_else(time < 0, 0, time)) %>%
filter(!is.na(via)) %>%
select(patient_ID, PCR, time, via) %>%
mutate(patient_ID = factor(patient_ID))
write.table(data_long, 'data/data_long.txt')
}
data_long <- read.table('data/data_long.txt') %>%
mutate(patient_ID = factor(patient_ID))
summary(data_long)
## patient_ID PCR time via
## 20 : 7 Min. : 0.00 Min. :0.1429 Min. :0.0000
## 1 : 6 1st Qu.: 9.60 1st Qu.:1.1071 1st Qu.:0.0000
## 4 : 5 Median :22.95 Median :1.7143 Median :0.0000
## 5 : 5 Mean :20.24 Mean :2.0526 Mean :0.3289
## 17 : 5 3rd Qu.:31.27 3rd Qu.:2.8929 3rd Qu.:1.0000
## 23 : 5 Max. :37.70 Max. :6.2857 Max. :1.0000
## (Other):43
dim(data_long)
## [1] 76 42.2.3 Data for univariable GLMs
Open code
if(file.exists('data/data.txt') == FALSE){
data <- data %>%
select(male_sex, age_at_COVID, T_depl_year, PRA,
days_from_Tx, HLA.missmatch, `Doba.na.HD.(roky)`,
`CMV.dárce`, `CMV.pozitivita.příjemce`,
`imunosuprese.před.Tx`, DM, tx_date, birth_date,
antiviral, days_last_dose,
symptoms_neg_PCR_days,
symptoms_neg_viability_days) %>%
mutate(age_at_Tx = as.numeric(tx_date - birth_date)/365.25,
Years_since_Tx = (days_from_Tx/365.25),
HLA_MM = HLA.missmatch,
HD_vignette = `Doba.na.HD.(roky)`,
CMV_rec = `CMV.pozitivita.příjemce`,
CMV_don = `CMV.dárce`,
IS_preTx = `imunosuprese.před.Tx`,
Remdsivir = antiviral) %>%
mutate(Tx_less_year = if_else(Years_since_Tx < 1, 1, 0),
CMV_MM = if_else(CMV_don == 1 & CMV_rec == 0, 1, 0)) %>%
select(male_sex, age_at_COVID, age_at_Tx,
Years_since_Tx, Tx_less_year,
T_depl_year, PRA, HLA_MM, HD_vignette, CMV_rec, CMV_don,
IS_preTx, DM, Remdsivir,
days_last_dose,
symptoms_neg_PCR_days,
symptoms_neg_viability_days)
write.table(data, 'data/data.txt')
}
data <- read.table('data/data.txt')
summary(data)
## male_sex age_at_COVID age_at_Tx Years_since_Tx
## Min. :0.0000 Min. :28.52 Min. :20.86 Min. : 0.02464
## 1st Qu.:0.5000 1st Qu.:48.19 1st Qu.:44.01 1st Qu.: 0.21903
## Median :1.0000 Median :52.87 Median :51.34 Median : 3.31280
## Mean :0.7391 Mean :53.47 Mean :48.67 Mean : 4.76208
## 3rd Qu.:1.0000 3rd Qu.:61.28 3rd Qu.:58.48 3rd Qu.: 8.74880
## Max. :1.0000 Max. :70.65 Max. :67.10 Max. :15.13758
##
## Tx_less_year T_depl_year PRA HLA_MM
## Min. :0.0000 Min. :0.0000 Min. : 0.0 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.: 1.5 1st Qu.:2.000
## Median :0.0000 Median :0.0000 Median :13.0 Median :3.000
## Mean :0.3478 Mean :0.2609 Mean :17.3 Mean :3.217
## 3rd Qu.:1.0000 3rd Qu.:0.5000 3rd Qu.:20.0 3rd Qu.:4.000
## Max. :1.0000 Max. :1.0000 Max. :89.0 Max. :6.000
##
## HD_vignette CMV_rec CMV_don IS_preTx
## Min. : 0.000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.: 0.500 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median : 1.700 Median :1.0000 Median :1.0000 Median :0.0000
## Mean : 2.587 Mean :0.6087 Mean :0.6957 Mean :0.3478
## 3rd Qu.: 3.500 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :15.100 Max. :1.0000 Max. :1.0000 Max. :1.0000
##
## DM Remdsivir days_last_dose symptoms_neg_PCR_days
## Min. :0.0000 Min. :0.0000 Min. : 43.0 Min. : 8.00
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:196.0 1st Qu.:15.00
## Median :0.0000 Median :1.0000 Median :339.5 Median :18.00
## Mean :0.2174 Mean :0.6087 Mean :332.7 Mean :22.74
## 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:509.0 3rd Qu.:29.50
## Max. :1.0000 Max. :1.0000 Max. :614.0 Max. :61.00
## NA's :1
## symptoms_neg_viability_days
## Min. : 4.00
## 1st Qu.: 8.00
## Median :11.00
## Mean :13.04
## 3rd Qu.:13.50
## Max. :33.00
## 2.2.4 Data of Berengua 2022
Open code
if(file.exists('data/data_berengua.txt') == FALSE){
data_berengua <- read.xlsx('gitignore/data/berengua2022.xlsx') %>%
mutate(week = days/7) %>%
select(ct, week, viability)
write.table(data_berengua, 'data/data_berengua.txt')
}
data_berengua <- read.table('data/data_berengua.txt')
summary(data_berengua)
## ct week viability
## Min. :12.00 Min. :0.1429 Min. :0.0000
## 1st Qu.:17.25 1st Qu.:0.2143 1st Qu.:0.0000
## Median :23.50 Median :0.5714 Median :1.0000
## Mean :23.78 Mean :1.3300 Mean :0.5862
## 3rd Qu.:29.50 3rd Qu.:2.1429 3rd Qu.:1.0000
## Max. :37.50 Max. :4.4286 Max. :1.00002.2.5 Data mutations per time
Open code
if(file.exists('data/data_mutation_time.txt') == FALSE){
data_mutations <- read.xlsx('gitignore/data/stery_mutace_mixedmodel.xlsx')
write.table(data_mutations, 'data/stery_mutace_mixedmodel.txt')
}
data_mutations <- read.table('data/stery_mutace_mixedmodel.txt') %>%
mutate(across(c(treatment, s_all, Patient.ID), as.factor)) %>%
mutate(f_50 = f_50,
f_5 = f_5,
f_05 = f_05,
all = all) %>%
mutate(week = (Day-1)/7,
molnupiravir = if_else(treatment == 'molnupiravir',
1, 0)) %>%
mutate(id_obs = factor(1:nrow(data_mutations)))
summary(data_mutations)
## Day treatment Patient.ID f_50 f_5
## Min. : 1.00 molnupiravir:14 1:14 Min. :28.00 Min. : 30.00
## 1st Qu.: 4.00 remdesivir :38 3:12 1st Qu.:36.00 1st Qu.: 44.00
## Median :10.50 4: 6 Median :56.50 Median : 65.50
## Mean :11.77 5: 8 Mean :51.83 Mean : 65.62
## 3rd Qu.:18.00 6: 6 3rd Qu.:62.00 3rd Qu.: 75.25
## Max. :36.00 7: 6 Max. :82.00 Max. :153.00
##
## f_05 all s_all week molnupiravir
## Min. : 64.0 Min. : 681 all :26 Min. :0.0000 Min. :0.0000
## 1st Qu.: 94.0 1st Qu.: 693 spike:26 1st Qu.:0.4286 1st Qu.:0.0000
## Median :187.0 Median :1926 Median :1.3571 Median :0.0000
## Mean :247.6 Mean :1583 Mean :1.5385 Mean :0.2692
## 3rd Qu.:313.8 3rd Qu.:2445 3rd Qu.:2.4286 3rd Qu.:1.0000
## Max. :782.0 Max. :2451 Max. :5.0000 Max. :1.0000
##
## id_obs
## 1 : 1
## 2 : 1
## 3 : 1
## 4 : 1
## 5 : 1
## 6 : 1
## (Other):46
data_mutations_spike <- data_mutations %>%
filter(s_all == 'spike')
data_mutations_all <- data_mutations %>%
filter(s_all == 'all')2.3 Directed acyclic graph (DAG)
Open code
set.seed(667)
dag <- dagify(
Covid19_clearence ~ Vaccination + Treatment + Immunocompetence,
Immunocompetence ~ IS + Sex + Diabetes +
eGFR + Dialysis_vignette + Age,
IS ~ IS_pre_Tx + Maintance_IS + T_cell_depletion,
Maintance_IS ~ Immunological_risk + Time_from_Tx,
T_cell_depletion ~ Immunological_risk + Time_from_Tx,
Immunological_risk ~ HLA_mismatch + PRA + DSA,
outcome = "Covid19_clearence",
latent = c("Immunological_risk", "Immunocompetence", "IS")
)
tidy_dag <- tidy_dagitty(dag)
tidy_dag$data$node_type <- ifelse(
tidy_dag$data$name %in% dagitty::exposures(dag),
"exposure",
ifelse(tidy_dag$data$name %in% dagitty::outcomes(dag), "outcome",
ifelse(tidy_dag$data$name %in% dagitty::latents(dag), "latent", "other")
)
)
tidy_dag$data$name <- gsub("_", " ", tidy_dag$data$name)
tidy_dag$data[which(tidy_dag$data$name == 'Maintance IS'),]$x <- 2.8
tidy_dag$data[which(tidy_dag$data$to == 'Maintance_IS'),]$xend <- 2.8
tidy_dag$data[which(tidy_dag$data$name == 'Maintance IS'),]$y <- 2.2
tidy_dag$data[which(tidy_dag$data$to == 'Maintance_IS'),]$yend <- 2.2
tidy_dag$data[which(tidy_dag$data$name == 'Immunological risk'),]$y <- 2.9
tidy_dag$data[which(tidy_dag$data$to == 'Immunological_risk'),]$yend <- 2.9
tidy_dag$data[which(tidy_dag$data$name == 'T cell depletion'),]$y <- 1.7
tidy_dag$data[which(tidy_dag$data$to == 'T_cell_depletion'),]$yend <- 1.7
tidy_dag$data[which(tidy_dag$data$name == 'Covid19 clearence'),]$y <- 0.1
tidy_dag$data[which(tidy_dag$data$to == 'Covid19_clearence'),]$yend <- 0.1
tidy_dag$data[which(tidy_dag$data$name == 'Dialysis vignette'),]$x <- 0.3
tidy_dag$data[which(tidy_dag$data$name == 'PRA'),]$x <- 6
tidy_dag$data[which(tidy_dag$data$name == 'DSA'),]$y <- 3.5
# Plot the DAG with custom styles for exposure, latent, outcome, and other nodes
dag <- ggdag(tidy_dag, layout = "nicely") +
geom_dag_point(aes(color = node_type),
fill = "white",
size = 18,
stroke = 1
) +
geom_dag_edges(edge_color = "steelblue") +
geom_dag_text(color = "black",
size = 3.4) +
theme_dag() +
scale_color_manual(
values = c(
"outcome" = "lightblue",
"latent" = "lightgray",
"other" = "pink"
)
) +
labs(color = "Type of variable")
dag IS: immunosupressionOpen code
path <- paste0('gitignore/figures/dag.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = dag, width = 10, height = 7, units = "in")
}2.4 Model - probability map
2.4.1 Initiation
2.4.1.1 Data modification
Open code
data_long_work <- data_long %>%
mutate(PCR = if_else(PCR == 0, 40, PCR))
mean(data_long_work$PCR)
## [1] 30.24079
mean(data_long_work$time)
## [1] 2.052632
data_long2 <- data_long_work %>%
mutate(PCR = PCR - mean(PCR),
time = time - mean(time))2.4.1.2 Berengua model
Open code
data_berengua <- data_berengua %>%
mutate(week_scaled = week - mean(data_long_work$time),
ct_scaled = ct - mean(data_long_work$PCR))
model_berengua <- glm(viability ~ ct_scaled + week_scaled,
family = binomial(link = 'logit'),
data = data_berengua)
model_berengua_gam <- gam(viability ~ s(ct_scaled, k=4) + week_scaled,
family = binomial(link = 'logit'),
data = data_berengua, method = 'ML')
model_berengua_int <- gam(viability ~ ct_scaled*week_scaled,
family = binomial(link = 'logit'),
data = data_berengua, method = 'ML')
summary(model_berengua)
##
## Call:
## glm(formula = viability ~ ct_scaled + week_scaled, family = binomial(link = "logit"),
## data = data_berengua)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.5103 1.9299 -2.855 0.00430 **
## ct_scaled -1.0356 0.3563 -2.907 0.00365 **
## week_scaled -0.3112 0.4165 -0.747 0.45505
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 118.008 on 86 degrees of freedom
## Residual deviance: 23.425 on 84 degrees of freedom
## AIC: 29.425
##
## Number of Fisher Scoring iterations: 8
summary(model_berengua_gam)
##
## Family: binomial
## Link function: logit
##
## Formula:
## viability ~ s(ct_scaled, k = 4) + week_scaled
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.1790 0.7590 1.553 0.120
## week_scaled -0.3112 0.4166 -0.747 0.455
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(ct_scaled) 1 1 8.447 0.00366 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.824 Deviance explained = 80.1%
## -ML = 11.713 Scale est. = 1 n = 87
summary(model_berengua_int)
##
## Family: binomial
## Link function: logit
##
## Formula:
## viability ~ ct_scaled * week_scaled
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.6533 3.0243 -2.200 0.0278 *
## ct_scaled -1.2469 0.5631 -2.214 0.0268 *
## week_scaled -1.5190 2.0392 -0.745 0.4563
## ct_scaled:week_scaled -0.2175 0.3623 -0.600 0.5484
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## R-sq.(adj) = 0.823 Deviance explained = 80.5%
## -ML = 11.484 Scale est. = 1 n = 87
AIC(model_berengua, model_berengua_gam, model_berengua_int)
## df AIC
## model_berengua 3 29.42546
## model_berengua_gam 3 29.42546
## model_berengua_int 4 30.967942.4.1.3 Priors setting
Open code
prior1 <- c(
prior(normal(-5.5, 3.86), class = "Intercept"),
prior(normal(-1.04, 0.71), class = b, coef = 'PCR'),
prior(normal(-0.31, 0.83), class = b, coef = 'time'))
prior2_weak <- c(
prior(normal(-5.5, 19.3), class = "Intercept"),
prior(normal(-1.04, 3.56), class = b, coef = 'PCR'),
prior(normal(-0.31, 4.17), class = b, coef = 'time'))2.4.2 Model fitting
2.4.2.1 Strong prior
Open code
model1_SP <- run(
expr = brm(
formula = via ~ PCR + time + (1|patient_ID),
data = data_long2,
family = bernoulli(link = 'logit'),
chains = 4, iter = 12000, warmup = 2000, cores = 1,
seed = 123,
control = list(adapt_delta = 0.999),
prior = prior1
),
path = 'gitignore/run/model1_SP',
reuse = TRUE)
summary(model1_SP)
## Family: bernoulli
## Links: mu = logit
## Formula: via ~ PCR + time + (1 | patient_ID)
## Data: data_long2 (Number of observations: 76)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~patient_ID (Number of levels: 23)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 3.01 1.50 0.45 6.36 1.00 8923 9820
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -5.50 1.92 -9.79 -2.37 1.00 12051 17954
## PCR -0.92 0.30 -1.61 -0.45 1.00 12776 16173
## time -1.18 0.51 -2.28 -0.27 1.00 17980 23033
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).2.4.2.2 Weakly informative prior
Open code
model2_WIP <- run(
expr = brm(
formula = via ~ PCR + time + (1|patient_ID),
data = data_long2,
family = bernoulli(link = 'logit'),
chains = 4, iter = 12000, warmup = 2000, cores = 1,
seed = 123,
control = list(adapt_delta = 0.999),
prior = prior2_weak
),
path = 'gitignore/run/model2_WIP',
reuse = TRUE)
summary(model2_WIP)
## Family: bernoulli
## Links: mu = logit
## Formula: via ~ PCR + time + (1 | patient_ID)
## Data: data_long2 (Number of observations: 76)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~patient_ID (Number of levels: 23)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 6.79 3.91 1.38 16.33 1.00 8611 12452
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -11.07 5.90 -25.91 -3.14 1.00 9112 10939
## PCR -1.73 0.91 -4.02 -0.55 1.00 9518 12299
## time -3.22 1.82 -7.65 -0.66 1.00 10078 13699
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).2.4.2.3 Model comparison
Open code
waic(model1_SP, model2_WIP)
## Warning:
## 4 (5.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.
## Warning:
## 4 (5.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.
## Output of model 'model1_SP':
##
## Computed from 40000 by 76 log-likelihood matrix
##
## Estimate SE
## elpd_waic -14.5 3.9
## p_waic 7.6 2.4
## waic 29.0 7.9
##
## 4 (5.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.
##
## Output of model 'model2_WIP':
##
## Computed from 40000 by 76 log-likelihood matrix
##
## Estimate SE
## elpd_waic -9.9 2.8
## p_waic 6.1 1.8
## waic 19.8 5.6
##
## 4 (5.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.
##
## Model comparisons:
## elpd_diff se_diff
## model2_WIP 0.0 0.0
## model1_SP -4.6 1.42.4.3 Predicted probabilities
2.4.3.1 Getting probability from whole posterior distribution
Defining new data
Open code
PCR <- seq(min(data_long2$PCR), max(data_long2$PCR), length.out = 100)
time <- seq(min(data_long2$time), max(data_long2$time), length.out = 100)
new_data <- expand_grid(PCR, time) %>%
mutate(patient_ID = NA)
prediction <- function(model, newdata, ndraws){
save_pred <- brms::posterior_epred(model,
newdata = newdata,
ndraws = ndraws,
allow_new_level = TRUE)
newdata <- newdata %>%
mutate(probability = apply(save_pred,
2,
function(x) mean(x, na.rm = TRUE))) %>%
mutate(week = time + mean(data_long_work$time),
PCR_N_runs = PCR + mean(data_long_work$PCR)) %>%
select(-patient_ID)
return(newdata)
}
epred_prediction_SP <- run(
expr = prediction(model1_SP,
new_data,
10000),
path = 'gitignore/run/epred_prediction_SP',
reuse = TRUE)
epred_prediction_WIP <- run(
expr = prediction(model2_WIP,
new_data,
10000),
path = 'gitignore/run/epred_prediction_WIP',
reuse = TRUE)Print cycle threshold per week with probability of viability = ~ 0.05
Open code
## strong prior model, whole posterior distribution taken into account
epred_prediction_SP %>%
filter(
probability>0.045 & probability<0.055,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, PCR_N_runs, probability) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(PCR_N_runs))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 32.3
## 2 2 30.8
## 3 3 29.5
## 4 4 28.3
## 5 5 27.3
## weakly informative prior model, whole posterior distribution taken into account
epred_prediction_WIP %>%
filter(
probability>0.045 & probability<0.055,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, PCR_N_runs, probability) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(PCR_N_runs))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 33.0
## 2 2 30.8
## 3 3 28.9
## 4 4 27.2
## 5 5 25.6Print cycle threshold per week with probability of viability = ~ 0.2
Open code
## strong prior model, whole posterior distribution taken into account
epred_prediction_SP %>%
filter(
probability>0.195 & probability<0.205,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, PCR_N_runs, probability) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(PCR_N_runs))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 29.0
## 2 2 27.6
## 3 3 26.3
## 4 4 25.2
## 5 5 24.1
## weakly informative prior model, whole posterior distribution taken into account
epred_prediction_WIP %>%
filter(
probability>0.195 & probability<0.205,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, PCR_N_runs, probability) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(PCR_N_runs))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 29.4
## 2 2 27.5
## 3 3 25.7
## 4 4 24.1
## 5 5 22.4Print cycle threshold per week with probability of viability = ~ 0.5
Open code
## strong prior model, whole posterior distribution taken into account
epred_prediction_SP %>%
filter(
probability>0.495 & probability<0.505,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, PCR_N_runs, probability) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(PCR_N_runs))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 25.7
## 2 2 24.3
## 3 3 23.2
## 4 4 21.9
## 5 5 20.5
## weakly informative prior model, whole posterior distribution taken into account
epred_prediction_WIP %>%
filter(
probability>0.495 & probability<0.505,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, PCR_N_runs, probability) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(PCR_N_runs))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 25.7
## 2 2 23.9
## 3 3 22.1
## 4 4 20.2
## 5 5 18.42.4.3.2 Fitted probabilities (median posterior estimate)
Generating values predicted solely on the basis of median posterior estimate, not taking into account the the uncertainty of the estimates
Open code
new_data <- new_data %>%
mutate(fitted_SP = inv_logit(fixef(model1_SP)[1,1] +
fixef(model1_SP)[2,1]*PCR +
fixef(model1_SP)[3,1]*time),
fitted_WIP = inv_logit(fixef(model2_WIP)[1,1] +
fixef(model2_WIP)[2,1]*PCR +
fixef(model2_WIP)[3,1]*time))Print cycle threshold per week with probability of viability = ~ 0.05
Open code
## strong prior model, only median posterior estimate taken into account
new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
filter(
fitted_SP>0.045 & fitted_SP<0.055,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, Ct, fitted_SP) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(Ct))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 28.7
## 2 2 27.5
## 3 3 26.3
## 4 4 25.0
## 5 5 23.8
## weakly informative prior model, only median posterior estimate taken into account
new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
filter(
fitted_WIP>0.045 & fitted_WIP<0.055,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, Ct, fitted_WIP) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(Ct))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 27.5
## 2 2 25.7
## 3 3 23.8
## 4 4 21.9
## 5 5 20.1Print cycle threshold per week with probability of viability = ~ 0.2
Open code
## strong prior model, only median posterior estimate taken into account
new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
filter(
fitted_SP>0.18 & fitted_SP<0.222,
(week>0.9 & week<1.1) |
(week>1.9 & week<2.1) |
(week>2.9 & week<3.1) |
(week>3.9 & week<4.1) |
(week>4.9 & week<5.1)
) %>%
select(week, Ct, fitted_SP) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(Ct))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 27.1
## 2 2 25.8
## 3 3 24.6
## 4 4 23.3
## 5 5 22.0
## weakly informative prior model, only median posterior estimate taken into account
new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
filter(
fitted_WIP>0.18 & fitted_WIP<0.22,
(week>0.9 & week<1.1) |
(week>1.9 & week<2.1) |
(week>2.9 & week<3.1) |
(week>3.9 & week<4.1) |
(week>4.9 & week<5.1)
) %>%
select(week, Ct, fitted_WIP) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(Ct))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 26.5
## 2 2 24.8
## 3 3 23.0
## 4 4 21.0
## 5 5 19.1Print cycle threshold per week with probability of viability = ~ 0.5
Open code
## strong prior model, only median posterior estimate taken into account
new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
filter(
fitted_SP>0.49 & fitted_SP<0.51,
(week>0.92 & week<1.08) |
(week>1.92 & week<2.08) |
(week>2.92 & week<3.08) |
(week>3.92 & week<4.08) |
(week>4.92 & week<5.08)
) %>%
select(week, Ct, fitted_SP) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(Ct))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 25.7
## 2 2 24.3
## 3 3 23.0
## 4 4 21.9
## 5 5 20.5
## weakly informative prior model, only median posterior estimate taken into account
new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
filter(
fitted_WIP>0.48 & fitted_WIP<0.52,
(week>0.9 & week<1.1) |
(week>1.9 & week<2.1) |
(week>2.9 & week<3.1) |
(week>3.9 & week<4.1) |
(week>4.9 & week<5.1)
) %>%
select(week, Ct, fitted_WIP) %>%
mutate(week = round(week)) %>%
mutate(week = factor(week)) %>%
group_by(week) %>%
summarize(CT_threshold = mean(Ct))
## # A tibble: 5 × 2
## week CT_threshold
## <fct> <dbl>
## 1 1 25.7
## 2 2 23.9
## 3 3 22.1
## 4 4 20.2
## 5 5 18.32.4.4 Visualization
Setting plots range and breaks
Open code
xl <- c(1/7, 5)
yl <- c(20, 36)
cont_breaks <- c(0.5, 0.2, 0.05)
grey_color <- c('grey10', 'grey30', 'grey50')
white_color <- c('grey99', 'grey84', 'grey69')2.4.4.1 Prediction accounting uncertainity
Heatmap - strong prior
Open code
fig1a_SP <- epred_prediction_SP %>%
mutate(Ct = PCR_N_runs) %>%
ggplot(aes(x = week, y = Ct, z = probability)) +
geom_raster(aes(fill = probability)) +
scale_fill_gradient(low = 'seagreen3', high = 'red3',
name = "Probability of viability") +
geom_contour(aes(color = factor(after_stat(level))),
breaks = cont_breaks,
linewidth = 0.5) +
scale_color_manual(values = white_color,
name = "Probability of viability") +
coord_cartesian(xlim = c(xl[1], xl[2]),
ylim = c(yl[1], yl[2]),
expand = FALSE) +
guides(
fill = guide_colorbar(order = 1, revers = TRUE),
color = guide_legend(order = 2, , title = NULL)) +
ylab("Cycle threshold") +
xlab("Weeks since the symptoms onset")
plotac <- 'fig1a_SP'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac), width = 7, height = 4.5, units = "in")
}Heatmap - weakly informative prior
Open code
fig2a_WIP <- epred_prediction_WIP %>%
mutate(Ct = PCR_N_runs) %>%
ggplot(aes(x = week, y = Ct, z = probability)) +
geom_raster(aes(fill = probability)) +
scale_fill_gradient(low = 'seagreen3', high = 'red3',
name = "Probability of viability") +
geom_contour(aes(color = factor(after_stat(level))),
breaks = cont_breaks,
linewidth = 0.5) +
scale_color_manual(values = white_color,
name = "Probability of viability") +
coord_cartesian(xlim = c(xl[1], xl[2]),
ylim = c(yl[1], yl[2]),
expand = FALSE) +
guides(
fill = guide_colorbar(order = 1, revers = TRUE),
color = guide_legend(order = 2, title = NULL)) +
ylab("Cycle threshold") +
xlab("Weeks since the symptoms onset")
plotac <- 'fig2a_WIP'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac), width = 7, height = 4.5, units = "in")
}2.4.4.2 Fitted probabilities (median posterior estimates)
Heatmap variant - strong prior
Open code
fig3a_SP <- new_data %>%
mutate(
week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)
) %>%
ggplot(aes(x = week, y = Ct, z = fitted_SP)) +
geom_raster(aes(fill = fitted_SP)) +
scale_fill_gradient(low = 'seagreen3',
high = 'red3',
name = "Probability of viability") +
geom_contour(aes(color = factor(after_stat(level))),
breaks = cont_breaks,
linewidth = 0.5) +
scale_color_manual(values = white_color,
name = "Probability of viability") +
coord_cartesian(xlim = c(xl[1], xl[2]),
ylim = c(yl[1], yl[2]),
expand = FALSE) +
guides(
fill = guide_colorbar(order = 1, revers = TRUE),
color = guide_legend(order = 2, , title = NULL)) +
theme(legend.position = "right",
legend.box = "vertical") +
ylab("Cycle threshold") +
xlab("Weeks since the symptoms onset")
plotac <- 'fig3a_SP'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac), width = 7, height = 4.5, units = "in")
}Heatmap variant - weakly informative prior
Open code
fig4a_WIP <- new_data %>%
mutate(week = time + mean(data_long_work$time),
Ct = PCR + mean(data_long_work$PCR)) %>%
ggplot(aes(x = week, y = Ct, z = fitted_WIP)) +
geom_raster(aes(fill = fitted_SP)) +
scale_fill_gradient(low = 'seagreen3', high = 'red3',
name = "Probability of viability") +
geom_contour(aes(color = factor(after_stat(level))),
breaks = cont_breaks,
linewidth = 0.5) +
scale_color_manual(values = white_color,
name = "Probability of viability") +
coord_cartesian(xlim = c(xl[1], xl[2]),
ylim = c(yl[1], yl[2]),
expand = FALSE) +
guides(
fill = guide_colorbar(order = 1, revers = TRUE),
color = guide_legend(order = 2, , title = NULL)) +
ylab("Cycle threshold") +
xlab("Weeks since the symptoms onset") +
theme(legend.position = "right",
legend.box = "vertical")
plotac <- 'fig4a_WIP'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac), width = 7, height = 4.5, units = "in")
}2.4.5 ROC curve
Open code
data_long2 <- data_long2 %>%
mutate(neg_culture = 1-via)
roc_curve <- roc(factor(data_long2$neg_culture) ~ data_long2$PCR,
direction = '<',
quiet = TRUE)
opt_tres <- coords(roc_curve, "best", ret=c("threshold"))$threshold
opt_tres + mean(data_long_work$PCR)
## [1] 24.15
opt_coord <- coords(roc_curve, x = opt_tres)
opt_coord$sensitivity
## [1] 0.9607843
opt_coord$specificity
## [1] 0.84
prevalence <- mean(data_long2$neg_culture) # Proportion of positive cases if not known
true_negatives <- sum(data_long2$neg_culture == 0 & data_long2$PCR < opt_tres)
false_negatives <- sum(data_long2$neg_culture == 1 & data_long2$PCR < opt_tres)
npv <- true_negatives / (true_negatives + false_negatives)
npv
## [1] 0.9130435
roc_data_main <- data.frame(
tpr = rev(roc_curve$sensitivities), # True Positive Rate
fpr = rev(1 - roc_curve$specificities), # False Positive Rate
thresholds = rev(roc_curve$thresholds)
)
roc_currve <- ggplot(roc_data_main, aes(x = fpr, y = tpr)) +
geom_line(color = "blue") +
geom_abline(linetype = "dashed", color = "grey") +
labs(x = "False Positive Rate", y = "True Positive Rate")
roc_currve Open code
roc_curve$auc
## Area under the curve: 0.96672.4.6 ROC with CI
Open code
dat_samples <- clustdat_sampler(data_long2, id_col = "patient_ID", N = 400, seed = 123)
roc_data <- list()
auc <- c()
specificity <- c()
sensitivity <- c()
npv <- c()
for(i in 1:length(dat_samples)){
roc_curve <- roc(dat_samples[[i]]$neg_culture ~ dat_samples[[i]]$PCR,
direction = '<',
quiet = TRUE)
roc_data[[i]] <- data.frame(
tpr = rev(roc_curve$sensitivities), # True Positive Rate
fpr = rev(1 - roc_curve$specificities), # False Positive Rate
thresholds = rev(roc_curve$thresholds))
auc[i] <- roc_curve$auc
opt_coord <- coords(roc_curve, x = opt_tres)
sensitivity[i] <- opt_coord$sensitivity
specificity[i] <- opt_coord$specificity
prevalence <- mean(dat_samples[[i]]$neg_culture)
true_negatives <- sum(dat_samples[[i]]$neg_culture == 0 & dat_samples[[i]]$PCR < opt_tres)
false_negatives <- sum(dat_samples[[i]]$neg_culture == 1 & dat_samples[[i]]$PCR < opt_tres)
npv[i] <- true_negatives / (true_negatives + false_negatives)
}
roc_data_main <- roc_data_main %>%
mutate(list_i = 1)
roc_data <- bind_rows(roc_data, .id = "list_i") %>%
mutate(list_i = factor(list_i))
auc_ci <- round(quantile(auc, probs = c(0.025, 0.975)),2)
roc_curve$auc <- round(roc_curve$auc, 2)
roc_currve <- ggplot(roc_data, aes(x = fpr, y = tpr, by = list_i)) +
geom_line(color = 'skyblue2', alpha = 0.3) +
geom_line(data = roc_data_main,
aes(x = fpr, y = tpr),
color = 'blue') +
geom_abline(linetype = "dashed", color = "grey") +
labs(x = "False Positive Rate", y = "True Positive Rate") +
annotate("text",
x = 0.53,
y = 0.06,
label = paste0("AUC = ", roc_curve$auc, " [", auc_ci[1], " to ", auc_ci[2], "]"),
size = 3,
color = "blue")
roc_currve Open code
## AUC
quantile(auc, probs = c(0.025, 0.975))
## 2.5% 97.5%
## 0.9327034 0.9900882
## sensitivity
quantile(sensitivity, probs = c(0.025, 0.975))
## 2.5% 97.5%
## 0.893617 1.000000
## specificity
quantile(specificity, probs = c(0.025, 0.975))
## 2.5% 97.5%
## 0.7390732 0.9500595
## negative predictive value
quantile(npv, probs = c(0.025, 0.975))
## 2.5% 97.5%
## 0.7618227 1.00000002.4.7 ROC - fitted probability plot combo
Open code
roc_fitprob_combo <- plot_grid(roc_currve, fig3a_SP,
rel_widths = c(0.5, 1),
labels = c("A", "B"))
plotac <- 'roc_fitprob_combo'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac),
width = 8, height = 4, units = "in")
}2.4.8 ROC - probability (whole posterior) plot combo
Open code
roc_prob_combo <- plot_grid(roc_currve, fig1a_SP,
rel_widths = c(0.5, 1),
labels = c("A", "B"))
plotac <- 'roc_prob_combo'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac),
width = 8, height = 4, units = "in")
}2.5 Univariable models
2.5.1 Data exploration
2.5.1.1 Disribution of outcomes
Open code
p1 <- data %>%
ggplot(aes(x = symptoms_neg_PCR_days)) +
geom_histogram()
p2 <- data %>%
ggplot(aes(x = symptoms_neg_viability_days)) +
geom_histogram()
plot_grid(p1, p2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.2.5.1.2 Summary table
Open code
data %>% tbl_summary(
type = list(PRA ~ "continuous")) %>%
modify_caption("Patient characteristics")Characteristic |
N = 23 1 |
|---|---|
| male_sex | 17 (74%) |
| age_at_COVID | 53 (48, 62) |
| age_at_Tx | 51 (44, 59) |
| Years_since_Tx | 3.3 (0.1, 10.4) |
| Tx_less_year | 8 (35%) |
| T_depl_year | 6 (26%) |
| PRA | 13 (0, 22) |
| HLA_MM | |
| 1 | 1 (4.3%) |
| 2 | 7 (30%) |
| 3 | 6 (26%) |
| 4 | 6 (26%) |
| 5 | 1 (4.3%) |
| 6 | 2 (8.7%) |
| HD_vignette | 1.70 (0.50, 3.50) |
| CMV_rec | 14 (61%) |
| CMV_don | 16 (70%) |
| IS_preTx | 8 (35%) |
| DM | 5 (22%) |
| Remdsivir | 14 (61%) |
| days_last_dose | 340 (191, 509) |
| Unknown | 1 |
| symptoms_neg_PCR_days | 18 (15, 30) |
| symptoms_neg_viability_days | 11 (8, 14) |
| 1
n (%); Median (Q1, Q3) |
|
2.5.2 Running GLM-NB models
2.5.2.1 Function to run all models
Open code
run_perm <- function(predictors_vector, outcome, data){
b_coef <- vector('double', length(predictors_vector))
CI_L <- vector('double', length(predictors_vector))
CI_U <- vector('double', length(predictors_vector))
P_parametric <- vector('double', length(predictors_vector))
P_permutation <- vector('double', length(predictors_vector))
for (i in 1:length(predictors_vector)){
pred <- data[, predictors_vector[i]]
out <- data[, outcome]
model <- glm.nb(out ~ pred,
data = data)
permP <- perm_model(model, nsim = 21000)
b_coef[i] <- coef(model)[2]
CI_L[i] <- confint(model)[2,1]
CI_U[i] <- confint(model)[2,2]
P_parametric[i] <- summary(model)$coefficients[2, 4]
P_permutation[i] <- permP[2]
}
results <- data.frame(
Predictor = predictors_vector,
Rate_ratio = exp(b_coef),
CI_L = exp(CI_L),
CI_U = exp(CI_U), P_parametric, P_permutation
)
return(results)
}2.5.2.2 Function to run parametric model with days after last vaccination as a covariate
Open code
run_glmnb <- function(predictors_vector, outcome, data){
b_coef <- vector('double', length(predictors_vector))
CI_L <- vector('double', length(predictors_vector))
CI_U <- vector('double', length(predictors_vector))
P_parametric <- vector('double', length(predictors_vector))
b_coef_DLD <- vector('double', length(predictors_vector))
P_parametric_DLD <- vector('double', length(predictors_vector))
for (i in 1:length(predictors_vector)){
pred <- data[, predictors_vector[i]]
out <- data[, outcome]
model <- glm.nb(out ~ pred + days_last_dose,
data = data)
b_coef[i] <- coef(model)[2]
CI_L[i] <- confint(model)[2,1]
CI_U[i] <- confint(model)[2,2]
P_parametric[i] <- summary(model)$coefficients[2, 4]
b_coef_DLD[i] <- coef(model)[3]
P_parametric_DLD[i] <- summary(model)$coefficients[3, 4]
}
results <- data.frame(
Predictor = predictors_vector,
Rate_ratio = exp(b_coef),
CI_L = exp(CI_L),
CI_U = exp(CI_U), P_parametric,
Rate_ratio_DLD = exp(b_coef_DLD),
P_parametric_DLD
)
return(results)
}2.5.2.3 Days of PCR positive
Open code
set.seed(446)
res_PCR <- run(
expr = run_perm(names(data)[1:15], 'symptoms_neg_PCR_days', data),
path = 'gitignore/run/res_PCR',
reuse = TRUE)
res_PCR[, 2:4] <- round(res_PCR[, 2:4], 3)
res_PCR[, 5:6] <- round(res_PCR[, 5:6], 4)
kableExtra::kable(res_PCR,
caption = 'Effect of various predictors on the number of days with PCR positivity. Estimates are derived from a series of single-variable generalized linear models (GLMs) with a negative binomial distribution, supplemented by a permutation test. `Rate ratio` represents the fold-change in the response for each unit increase in the predictor. `CI_L` and `CI_U` indicate the lower and upper bounds of the 95% confidence interval, respectively. `P_parametric` refers to the P-value from the parametric NB-GLM, while `P_permutation` refers to the P-value obtained from the permutation test.')| Predictor | Rate_ratio | CI_L | CI_U | P_parametric | P_permutation |
|---|---|---|---|---|---|
| male_sex | 0.690 | 0.451 | 1.037 | 0.0795 | 0.1487 |
| age_at_COVID | 1.011 | 0.992 | 1.031 | 0.1978 | 0.2787 |
| age_at_Tx | 1.020 | 1.004 | 1.036 | 0.0085 | 0.0221 |
| Years_since_Tx | 0.937 | 0.905 | 0.970 | 0.0003 | 0.0026 |
| Tx_less_year | 1.744 | 1.244 | 2.459 | 0.0014 | 0.0103 |
| T_depl_year | 1.827 | 1.283 | 2.631 | 0.0010 | 0.0094 |
| PRA | 1.002 | 0.994 | 1.012 | 0.5863 | 0.5437 |
| HLA_MM | 1.216 | 1.073 | 1.383 | 0.0029 | 0.0148 |
| HD_vignette | 1.043 | 0.988 | 1.109 | 0.1514 | 0.1940 |
| CMV_rec | 1.204 | 0.805 | 1.789 | 0.3602 | 0.4680 |
| CMV_don | 1.221 | 0.792 | 1.855 | 0.3572 | 0.4742 |
| IS_preTx | 1.299 | 0.877 | 1.943 | 0.1972 | 0.2804 |
| DM | 0.894 | 0.560 | 1.463 | 0.6446 | 0.7725 |
| Remdsivir | 0.676 | 0.465 | 0.975 | 0.0373 | 0.0885 |
| days_last_dose | 1.001 | 1.000 | 1.002 | 0.2128 | 0.3019 |
Open code
if(file.exists('gitignore/outputs/res_PCR.xlsx') == FALSE){
write.xlsx(res_PCR, 'gitignore/outputs/res_PCR.xlsx')
}2.5.2.4 Days of viable virus
Open code
set.seed(446)
res_viability <- run(
expr = run_perm(names(data)[1:15], 'symptoms_neg_viability_days', data),
path = 'gitignore/run/res_viability',
reuse = TRUE)
res_viability[, 2:4] <- round(res_viability[, 2:4], 3)
res_viability[, 5:6] <- round(res_viability[, 5:6], 4)
kableExtra::kable(res_viability,
caption = 'Effect of various predictors on the number of days with viable virus. Estimates are derived from a series of single-variable generalized linear models (GLMs) with a negative binomial distribution, supplemented by a permutation test. `Rate ratio` represents the fold-change in the response for each unit increase in the predictor. `CI_L` and `CI_U` indicate the lower and upper bounds of the 95% confidence interval, respectively. `P_parametric` refers to the P-value from the parametric NB-GLM, while `P_permutation` refers to the P-value obtained from the permutation test.')| Predictor | Rate_ratio | CI_L | CI_U | P_parametric | P_permutation |
|---|---|---|---|---|---|
| male_sex | 0.637 | 0.398 | 1.001 | 0.0541 | 0.0961 |
| age_at_COVID | 1.006 | 0.984 | 1.029 | 0.5468 | 0.6280 |
| age_at_Tx | 1.015 | 0.997 | 1.033 | 0.0939 | 0.1545 |
| Years_since_Tx | 0.934 | 0.897 | 0.972 | 0.0012 | 0.0057 |
| Tx_less_year | 2.086 | 1.475 | 2.963 | 0.0000 | 0.0035 |
| T_depl_year | 2.513 | 1.899 | 3.328 | 0.0000 | 0.0001 |
| PRA | 1.003 | 0.994 | 1.013 | 0.5415 | 0.5390 |
| HLA_MM | 1.117 | 0.946 | 1.325 | 0.1943 | 0.2593 |
| HD_vignette | 1.048 | 0.987 | 1.122 | 0.1494 | 0.2044 |
| CMV_rec | 1.095 | 0.691 | 1.723 | 0.6977 | 0.7625 |
| CMV_don | 1.646 | 1.029 | 2.615 | 0.0358 | 0.0847 |
| IS_preTx | 1.395 | 0.900 | 2.186 | 0.1405 | 0.2013 |
| DM | 1.116 | 0.662 | 1.934 | 0.6867 | 0.6733 |
| Remdsivir | 0.715 | 0.462 | 1.100 | 0.1295 | 0.1973 |
| days_last_dose | 1.001 | 1.000 | 1.002 | 0.1709 | 0.2581 |
Open code
if(file.exists('gitignore/outputs/res_viability.xlsx') == FALSE){
write.xlsx(res_PCR, 'gitignore/outputs/res_viability.xlsx')
}2.5.2.5 Days of PCR positive with days after vaccination as a covariate
Open code
res_PCR <- run(
expr = run_glmnb(names(data)[1:14], 'symptoms_neg_PCR_days', data),
path = 'gitignore/run/res_PCR_lastdose',
reuse = TRUE)
res_PCR[, 2:4] <- round(res_PCR[, 2:4], 3)
res_PCR[, 5:6] <- round(res_PCR[, 5:6], 4)
colnames(res_PCR)[1] <- 'Main predictor'
colnames(res_PCR)[5] <- 'P'
colnames(res_PCR)[7] <- 'P_DLD'
kableExtra::kable(res_PCR,
caption = 'Effect of various predictors on the number of days with PCR positivity. Estimates are derived from a series of two-variable generalized linear models (GLMs) with a negative binomial distribution, with the first predictor being named in the `main predictor` column and the second predictor being the number of days after last dose application (`DLD`). `Rate ratio` represents the fold-change in the response for each unit increase in the main predictor. `CI_L` and `CI_U` indicate the lower and upper bounds of the 95% confidence interval, respectively. `P` refers to the P-value for the effect of the main predictor, based on the parametric NB-GLM. `Rate_ratio_DLD` and `P_DLD` represent the rate ratio and P-value, respectively, for the effect of DLD') %>%
add_header_above(c(" " = 1,
"Main predictor" = 4,
"Days after last dose" = 2))| Main predictor | Rate_ratio | CI_L | CI_U | P | Rate_ratio_DLD | P_DLD |
|---|---|---|---|---|---|---|
| male_sex | 0.740 | 0.471 | 1.146 | 0.1971 | 1.0004 | 0.4426080 |
| age_at_COVID | 1.011 | 0.989 | 1.033 | 0.2818 | 1.0006 | 0.2236862 |
| age_at_Tx | 1.020 | 1.004 | 1.037 | 0.0099 | 1.0006 | 0.1943146 |
| Years_since_Tx | 0.932 | 0.902 | 0.963 | 0.0000 | 1.0007 | 0.0770951 |
| Tx_less_year | 1.828 | 1.321 | 2.547 | 0.0003 | 1.0006 | 0.1792402 |
| T_depl_year | 1.756 | 1.210 | 2.574 | 0.0029 | 1.0003 | 0.5004493 |
| PRA | 1.000 | 0.990 | 1.010 | 0.9697 | 1.0007 | 0.2586838 |
| HLA_MM | 1.214 | 1.046 | 1.412 | 0.0135 | 1.0000 | 0.9750709 |
| HD_vignette | 1.031 | 0.970 | 1.102 | 0.3385 | 1.0004 | 0.4318466 |
| CMV_rec | 1.111 | 0.701 | 1.751 | 0.6404 | 1.0005 | 0.3604924 |
| CMV_don | 1.255 | 0.821 | 1.897 | 0.2857 | 1.0007 | 0.1894136 |
| IS_preTx | 1.305 | 0.866 | 1.987 | 0.2046 | 1.0005 | 0.3207277 |
| DM | 0.938 | 0.584 | 1.543 | 0.8011 | 1.0006 | 0.2532625 |
| Remdsivir | 0.701 | 0.481 | 1.016 | 0.0602 | 1.0006 | 0.2680594 |
Open code
if(file.exists('gitignore/outputs/res_PCR_lastdose.xlsx') == FALSE){
write.xlsx(res_PCR, 'gitignore/outputs/res_PCR_lastdose.xlsx')
}We will try also estimate effect of the `
2.5.2.6 Days of viable virus with days after vaccination as a covariate
Open code
res_viability <- run(
expr = run_glmnb(names(data)[1:14],
'symptoms_neg_viability_days',
data),
path = 'gitignore/run/res_viability_lastdose',
reuse = TRUE)
res_viability[, 2:4] <- round(res_viability[, 2:4], 3)
res_viability[, 5:6] <- round(res_viability[, 5:6], 4)
colnames(res_viability)[1] <- 'Main predictor'
colnames(res_viability)[5] <- 'P'
colnames(res_viability)[7] <- 'P_DLD'
kableExtra::kable(res_viability,
caption = 'Effect of various predictors on the number of days with viable virus. Estimates are derived from a series of two-variable generalized linear models (GLMs) with a negative binomial distribution, with the first predictor being named in the `main predictor` column and the second predictor being the number of days after last dose application (`DLD`). `Rate ratio` represents the fold-change in the response for each unit increase in the main predictor. `CI_L` and `CI_U` indicate the lower and upper bounds of the 95% confidence interval, respectively. `P` refers to the P-value for the effect of the main predictor, based on the parametric NB-GLM. `Rate_ratio_DLD` and `P_DLD` represent the rate ratio and P-value, respectively, for the effect of DLD') %>%
add_header_above(c(" " = 1,
"Main predictor" = 4,
"Days after last dose" = 2))| Main predictor | Rate_ratio | CI_L | CI_U | P | Rate_ratio_DLD | P_DLD |
|---|---|---|---|---|---|---|
| male_sex | 0.702 | 0.443 | 1.100 | 0.1472 | 1.0006 | 0.3573231 |
| age_at_COVID | 0.998 | 0.976 | 1.021 | 0.8778 | 1.0008 | 0.1722869 |
| age_at_Tx | 1.012 | 0.994 | 1.030 | 0.1931 | 1.0008 | 0.1444768 |
| Years_since_Tx | 0.924 | 0.892 | 0.956 | 0.0000 | 1.0008 | 0.0465434 |
| Tx_less_year | 2.304 | 1.771 | 3.000 | 0.0000 | 1.0007 | 0.0473752 |
| T_depl_year | 2.362 | 1.803 | 3.094 | 0.0000 | 1.0004 | 0.3321615 |
| PRA | 1.000 | 0.991 | 1.011 | 0.9329 | 1.0008 | 0.2363278 |
| HLA_MM | 1.047 | 0.874 | 1.258 | 0.6217 | 1.0006 | 0.3294210 |
| HD_vignette | 1.032 | 0.969 | 1.104 | 0.3478 | 1.0006 | 0.3594460 |
| CMV_rec | 0.976 | 0.585 | 1.619 | 0.9190 | 1.0008 | 0.1959789 |
| CMV_don | 1.708 | 1.117 | 2.611 | 0.0132 | 1.0008 | 0.1441467 |
| IS_preTx | 1.482 | 0.979 | 2.262 | 0.0649 | 1.0006 | 0.2739468 |
| DM | 1.164 | 0.708 | 1.951 | 0.5651 | 1.0009 | 0.1483567 |
| Remdsivir | 0.754 | 0.500 | 1.133 | 0.1762 | 1.0008 | 0.1615478 |
Open code
if(file.exists(
'gitignore/outputs/res_viability_lastdose.xlsx') == FALSE){
write.xlsx(res_viability,
'gitignore/outputs/res_viability_lastdose.xlsx')
}2.6 COVID mutations over time
2.6.1 Priors
2.6.1.1 Calculations to get prior
Open code
2*2*sd(data_mutations$week)
## [1] 5.728403
## check
(5.73/2)*sd(rescale(data_mutations$week))
## [1] 1.4325
sd(data_mutations$week)
## [1] 1.4321012.6.1.2 Priors specification
Open code
prior4_weakly_regularizing <- c(
prior(normal(4, 5), class = "Intercept"),
#prior(exponential(1), class = "sd", group = "Patient.ID"),
#prior(exponential(1), class = "sd", group = "id_obs"),
prior(normal(0, 5.73), class = b, coef = 'week'),
prior(normal(0.1, 2), class = b, coef = 'molnupiravir'),
prior(normal(0, 0.5), class = b, coef = 'week:molnupiravir')
)
prior4_wide <- c(
prior(normal(4, 10), class = "Intercept"),
prior(exponential(1), class = "sd", group = "Patient.ID"),
prior(normal(0, 29), class = b, coef = 'week'),
prior(normal(0.1, 10), class = b, coef = 'molnupiravir'),
prior(normal(0, 2.5), class = b, coef = 'week:molnupiravir'))2.6.2 f_05 all mutations model
2.6.2.1 Fitting and summary
Open code
data <- data_mutations_all
data <- data %>% mutate(outcome = f_05)
model <- 'mutfreBM_all05_wr'
assign(model, run(
expr = brm(outcome ~ week +
molnupiravir +
week:molnupiravir +
(1|Patient.ID) +
(1|id_obs),
family = poisson(),
data = data,
chains = 4, cores = 4,
iter = 12000, warmup = 2000,
seed = 123,
backend = "cmdstanr",
control = list(adapt_delta = 0.999999,
max_treedepth = 15),
prior = prior4_weakly_regularizing),
path = paste0('gitignore/run/', model),
reuse = TRUE))
summary(get(model))
## Family: poisson
## Links: mu = log
## Formula: outcome ~ week + molnupiravir + week:molnupiravir + (1 | Patient.ID) + (1 | id_obs)
## Data: data (Number of observations: 26)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~id_obs (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.34 0.06 0.24 0.48 1.00 15403 22991
##
## ~Patient.ID (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.24 0.22 0.01 0.82 1.00 8276 14482
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 5.71 0.20 5.31 6.10 1.00 21369 19549
## week 0.06 0.05 -0.05 0.17 1.00 20989 24282
## molnupiravir 0.12 0.35 -0.55 0.80 1.00 23934 21993
## week:molnupiravir 0.22 0.13 -0.04 0.48 1.00 27861 26633
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(get(model))
## prior class coef group resp dpar nlpar lb
## (flat) b
## normal(0.1, 2) b molnupiravir
## normal(0, 5.73) b week
## normal(0, 0.5) b week:molnupiravir
## normal(4, 5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd id_obs 0
## student_t(3, 0, 2.5) sd Intercept id_obs 0
## student_t(3, 0, 2.5) sd Patient.ID 0
## student_t(3, 0, 2.5) sd Intercept Patient.ID 0
## ub source
## default
## user
## user
## user
## user
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)2.6.2.2 Posterior check
Open code
pp_check(get(model),
type = 'dens_overlay_grouped',
group = 'molnupiravir',
ndraws = 50)Open code
pp_check(get(model),
type = 'dens_overlay',
ndraws = 50)2.6.2.3 Results exploration
Open code
exp(fixef(get(model))[-1,c(1,3,4)])
## Estimate Q2.5 Q97.5
## week 1.064303 0.9545806 1.185138
## molnupiravir 1.131704 0.5778112 2.225651
## week:molnupiravir 1.245922 0.9561683 1.613694
draws <- as_draws_df(get(model)) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week)
## 95% credible interval for week effect in molnupiravir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_molnupiravir, 0.025)),
p97.5 = exp(quantile(week_molnupiravir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 1.04 1.69
## 95% credible interval for week effect in remdesivir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_remdesivir, 0.025)),
p97.5 = exp(quantile(week_remdesivir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.955 1.19
## 95% credible interval between-treatment difference
## (difference in 1 week log-fold-changes between the two treatment groups)
draws%>%
summarize(
p2.5 = quantile(`b_week:molnupiravir`, 0.025),
p97.5 = quantile(`b_week:molnupiravir`, 0.975)
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 -0.0448 0.4792.6.3 f_5 all mutations model
2.6.3.1 Fitting and summary
Open code
data <- data_mutations_all
data <- data %>% mutate(outcome = f_5)
model <- 'mutfreBM_all5_wr'
assign(model, run(
expr = brm(outcome ~ week +
molnupiravir +
week:molnupiravir +
(1|Patient.ID) +
(1|id_obs),
family = poisson(),
data = data,
chains = 4, cores = 4,
iter = 12000, warmup = 2000,
seed = 123,
backend = "cmdstanr",
control = list(adapt_delta = 0.999999,
max_treedepth = 15),
prior = prior4_weakly_regularizing),
path = paste0('gitignore/run/', model),
reuse = TRUE))
summary(get(model))
## Family: poisson
## Links: mu = log
## Formula: outcome ~ week + molnupiravir + week:molnupiravir + (1 | Patient.ID) + (1 | id_obs)
## Data: data (Number of observations: 26)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~id_obs (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.08 0.04 0.01 0.17 1.00 10426 9950
##
## ~Patient.ID (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.30 0.20 0.10 0.81 1.00 10304 12771
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.34 0.18 3.98 4.70 1.00 12160 13038
## week 0.04 0.02 -0.01 0.09 1.00 31832 24663
## molnupiravir 0.01 0.31 -0.61 0.63 1.00 15255 15894
## week:molnupiravir 0.18 0.06 0.07 0.29 1.00 33197 28194
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(get(model))
## prior class coef group resp dpar nlpar lb
## (flat) b
## normal(0.1, 2) b molnupiravir
## normal(0, 5.73) b week
## normal(0, 0.5) b week:molnupiravir
## normal(4, 5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd id_obs 0
## student_t(3, 0, 2.5) sd Intercept id_obs 0
## student_t(3, 0, 2.5) sd Patient.ID 0
## student_t(3, 0, 2.5) sd Intercept Patient.ID 0
## ub source
## default
## user
## user
## user
## user
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)2.6.3.2 Posterior check
Open code
pp_check(get(model),
type = 'dens_overlay_grouped',
group = 'molnupiravir',
ndraws = 50)Open code
pp_check(get(model),
type = 'dens_overlay',
ndraws = 50)2.6.3.3 Results exploration
Open code
exp(fixef(get(model))[-1,c(1,3,4)])
## Estimate Q2.5 Q97.5
## week 1.040404 0.9905187 1.091040
## molnupiravir 1.007267 0.5452719 1.884628
## week:molnupiravir 1.196312 1.0716785 1.335922
draws <- as_draws_df(get(model)) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week)
## 95% credible interval for week effect in molnupiravir group
## (fold change per week)
draws %>%
summarize(
p2.5 = exp(quantile(week_molnupiravir, 0.025)),
p97.5 = exp(quantile(week_molnupiravir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 1.13 1.37
## 95% credible interval for week effect in remdesivir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_remdesivir, 0.025)),
p97.5 = exp(quantile(week_remdesivir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.991 1.09
## 95% credible interval between-treatment difference
## (difference in week log-ratios between treatment groups)
draws %>%
summarize(
p2.5 = quantile(`b_week:molnupiravir`, 0.025),
p97.5 = quantile(`b_week:molnupiravir`, 0.975)
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.0692 0.2902.6.4 f_50 all mutations model
2.6.4.1 Fitting and summary
Open code
data <- data_mutations_all
data <- data %>% mutate(outcome = f_50)
model <- 'mutfreBM_all50_wr'
assign(model, run(
expr = brm(outcome ~ week +
molnupiravir +
week:molnupiravir +
(1|Patient.ID) +
(1|id_obs),
family = poisson(),
data = data,
chains = 4, cores = 4,
iter = 12000, warmup = 2000,
seed = 123,
backend = "cmdstanr",
control = list(adapt_delta = 0.999999,
max_treedepth = 15),
prior = prior4_weakly_regularizing),
path = paste0('gitignore/run/', model),
reuse = TRUE))
summary(get(model))
## Warning: There were 2 divergent transitions after warmup. Increasing
## adapt_delta above 0.999999 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: poisson
## Links: mu = log
## Formula: outcome ~ week + molnupiravir + week:molnupiravir + (1 | Patient.ID) + (1 | id_obs)
## Data: data (Number of observations: 26)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~id_obs (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.03 0.02 0.00 0.08 1.00 21563 15089
##
## ~Patient.ID (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.19 0.14 0.05 0.56 1.00 8539 10258
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.20 0.13 3.95 4.44 1.00 12113 12081
## week -0.00 0.02 -0.04 0.04 1.00 29788 28963
## molnupiravir -0.11 0.22 -0.53 0.31 1.00 13247 13605
## week:molnupiravir 0.05 0.05 -0.05 0.15 1.00 32761 30612
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(get(model))
## prior class coef group resp dpar nlpar lb
## (flat) b
## normal(0.1, 2) b molnupiravir
## normal(0, 5.73) b week
## normal(0, 0.5) b week:molnupiravir
## normal(4, 5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd id_obs 0
## student_t(3, 0, 2.5) sd Intercept id_obs 0
## student_t(3, 0, 2.5) sd Patient.ID 0
## student_t(3, 0, 2.5) sd Intercept Patient.ID 0
## ub source
## default
## user
## user
## user
## user
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)2.6.4.2 Posterior check
Open code
pp_check(get(model),
type = 'dens_overlay_grouped',
group = 'molnupiravir',
ndraws = 50)Open code
pp_check(get(model),
type = 'dens_overlay',
ndraws = 50)2.6.4.3 Results exploration
Open code
exp(fixef(get(model))[-1,c(1,3,4)])
## Estimate Q2.5 Q97.5
## week 0.9985492 0.9565773 1.042196
## molnupiravir 0.8978597 0.5892974 1.369922
## week:molnupiravir 1.0504875 0.9487221 1.161298
draws <- as_draws_df(get(model)) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week)
## 95% credible interval for week effect in molnupiravir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_molnupiravir, 0.025)),
p97.5 = exp(quantile(week_molnupiravir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.956 1.15
## 95% credible interval for week effect in remdesivir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_remdesivir, 0.025)),
p97.5 = exp(quantile(week_remdesivir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.957 1.04
## 95% credible interval between-treatment difference
## (difference in week log-ratios between treatment groups)
draws%>%
summarize(
p2.5 = quantile(`b_week:molnupiravir`, 0.025),
p97.5 = quantile(`b_week:molnupiravir`, 0.975)
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 -0.0526 0.1502.6.5 f_05 spike mutations model
2.6.5.1 Fitting and summary
Open code
data <- data_mutations_spike
data <- data %>% mutate(outcome = f_05)
model <- 'mutfreBM_spike05_wr'
assign(model, run(
expr = brm(outcome ~ week +
molnupiravir +
week:molnupiravir +
(1|Patient.ID) +
(1|id_obs),
family = poisson(),
data = data,
chains = 4, cores = 4,
iter = 12000, warmup = 2000,
seed = 123,
backend = "cmdstanr",
control = list(adapt_delta = 0.999999,
max_treedepth = 15),
prior = prior4_weakly_regularizing),
path = paste0('gitignore/run/', model),
reuse = TRUE))
summary(get(model))
## Warning: There were 1 divergent transitions after warmup. Increasing
## adapt_delta above 0.999999 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: poisson
## Links: mu = log
## Formula: outcome ~ week + molnupiravir + week:molnupiravir + (1 | Patient.ID) + (1 | id_obs)
## Data: data (Number of observations: 26)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~id_obs (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.26 0.05 0.18 0.38 1.00 13276 20959
##
## ~Patient.ID (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.14 0.14 0.01 0.48 1.00 8131 14502
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.47 0.14 4.20 4.74 1.00 17140 19460
## week 0.02 0.05 -0.07 0.11 1.00 17063 22071
## molnupiravir 0.07 0.25 -0.40 0.57 1.00 19107 21011
## week:molnupiravir 0.18 0.11 -0.04 0.39 1.00 18530 22110
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(get(model))
## prior class coef group resp dpar nlpar lb
## (flat) b
## normal(0.1, 2) b molnupiravir
## normal(0, 5.73) b week
## normal(0, 0.5) b week:molnupiravir
## normal(4, 5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd id_obs 0
## student_t(3, 0, 2.5) sd Intercept id_obs 0
## student_t(3, 0, 2.5) sd Patient.ID 0
## student_t(3, 0, 2.5) sd Intercept Patient.ID 0
## ub source
## default
## user
## user
## user
## user
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)2.6.5.2 Posterior check
Open code
pp_check(get(model),
type = 'dens_overlay_grouped',
group = 'molnupiravir',
ndraws = 50)Open code
pp_check(get(model),
type = 'dens_overlay',
ndraws = 50)2.6.5.3 Results exploration
Open code
exp(fixef(get(model))[-1,c(1,3,4)])
## Estimate Q2.5 Q97.5
## week 1.020113 0.9316610 1.115652
## molnupiravir 1.076704 0.6670866 1.769934
## week:molnupiravir 1.195696 0.9637282 1.478440
draws <- as_draws_df(get(model)) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week)
## 95% credible interval for week effect in molnupiravir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_molnupiravir, 0.025)),
p97.5 = exp(quantile(week_molnupiravir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.999 1.49
## 95% credible interval for week effect in remdesivir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_remdesivir, 0.025)),
p97.5 = exp(quantile(week_remdesivir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.932 1.12
## 95% credible interval between-treatment difference
## (difference in week log-ratios between treatment groups)
draws%>%
summarize(
p2.5 = quantile(`b_week:molnupiravir`, 0.025),
p97.5 = quantile(`b_week:molnupiravir`, 0.975)
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 -0.0369 0.3912.6.6 f_5 spike mutations model
2.6.6.1 Fitting and summary
Open code
data <- data_mutations_spike
data <- data %>% mutate(outcome = f_5)
model <- 'mutfreBM_spike5_wr'
assign(model, run(
expr = brm(outcome ~ week +
molnupiravir +
week:molnupiravir +
(1|Patient.ID) +
(1|id_obs),
family = poisson(),
data = data,
chains = 4, cores = 4,
iter = 12000, warmup = 2000,
seed = 123,
backend = "cmdstanr",
control = list(adapt_delta = 0.999999,
max_treedepth = 15),
prior = prior4_weakly_regularizing),
path = paste0('gitignore/run/', model),
reuse = TRUE))
summary(get(model))
## Family: poisson
## Links: mu = log
## Formula: outcome ~ week + molnupiravir + week:molnupiravir + (1 | Patient.ID) + (1 | id_obs)
## Data: data (Number of observations: 26)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~id_obs (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.05 0.04 0.00 0.14 1.00 18182 17615
##
## ~Patient.ID (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.28 0.19 0.09 0.76 1.00 11677 14363
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.83 0.17 3.49 4.17 1.00 16969 16950
## week 0.01 0.03 -0.05 0.06 1.00 44113 31013
## molnupiravir 0.03 0.29 -0.56 0.60 1.00 20410 17585
## week:molnupiravir 0.05 0.06 -0.07 0.18 1.00 44175 30411
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(get(model))
## prior class coef group resp dpar nlpar lb
## (flat) b
## normal(0.1, 2) b molnupiravir
## normal(0, 5.73) b week
## normal(0, 0.5) b week:molnupiravir
## normal(4, 5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd id_obs 0
## student_t(3, 0, 2.5) sd Intercept id_obs 0
## student_t(3, 0, 2.5) sd Patient.ID 0
## student_t(3, 0, 2.5) sd Intercept Patient.ID 0
## ub source
## default
## user
## user
## user
## user
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)2.6.6.2 Posterior check
Open code
pp_check(get(model),
type = 'dens_overlay_grouped',
group = 'molnupiravir',
ndraws = 50)Open code
pp_check(get(model),
type = 'dens_overlay',
ndraws = 50)2.6.6.3 Results exploration
Open code
exp(fixef(get(model))[-1,c(1,3,4)])
## Estimate Q2.5 Q97.5
## week 1.008393 0.9556913 1.063445
## molnupiravir 1.027577 0.5724063 1.827344
## week:molnupiravir 1.056404 0.9353810 1.192400
draws <- as_draws_df(get(model)) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week)
## 95% credible interval for week effect in molnupiravir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_molnupiravir, 0.025)),
p97.5 = exp(quantile(week_molnupiravir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.956 1.19
## 95% credible interval for week effect in remdesivir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_remdesivir, 0.025)),
p97.5 = exp(quantile(week_remdesivir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.956 1.06
## 95% credible interval between-treatment difference
## (difference in week log-ratios between treatment groups)
draws %>%
summarize(
p2.5 = quantile(`b_week:molnupiravir`, 0.025),
p97.5 = quantile(`b_week:molnupiravir`, 0.975)
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 -0.0668 0.1762.6.7 f_50 spike mutations model
2.6.7.1 Fitting and summary
Open code
data <- data_mutations_spike
data <- data %>% mutate(outcome = f_50)
model <- 'mutfreBM_spike50_wr'
assign(model, run(
expr = brm(outcome ~ week +
molnupiravir +
week:molnupiravir +
(1|Patient.ID) +
(1|id_obs),
family = poisson(),
data = data,
chains = 4, cores = 4,
iter = 12000, warmup = 2000,
seed = 123,
backend = "cmdstanr",
control = list(adapt_delta = 0.999999,
max_treedepth = 15),
prior = prior4_weakly_regularizing),
path = paste0('gitignore/run/', model),
reuse = TRUE))
summary(get(model))
## Family: poisson
## Links: mu = log
## Formula: outcome ~ week + molnupiravir + week:molnupiravir + (1 | Patient.ID) + (1 | id_obs)
## Data: data (Number of observations: 26)
## Draws: 4 chains, each with iter = 12000; warmup = 2000; thin = 1;
## total post-warmup draws = 40000
##
## Multilevel Hyperparameters:
## ~id_obs (Number of levels: 26)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.06 0.04 0.00 0.17 1.00 16124 17554
##
## ~Patient.ID (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.33 0.21 0.12 0.89 1.00 10398 13451
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.72 0.20 3.32 4.12 1.00 15344 15233
## week 0.00 0.03 -0.05 0.06 1.00 32713 26292
## molnupiravir -0.13 0.34 -0.80 0.54 1.00 18212 15747
## week:molnupiravir -0.02 0.07 -0.16 0.13 1.00 33792 27484
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(get(model))
## prior class coef group resp dpar nlpar lb
## (flat) b
## normal(0.1, 2) b molnupiravir
## normal(0, 5.73) b week
## normal(0, 0.5) b week:molnupiravir
## normal(4, 5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd id_obs 0
## student_t(3, 0, 2.5) sd Intercept id_obs 0
## student_t(3, 0, 2.5) sd Patient.ID 0
## student_t(3, 0, 2.5) sd Intercept Patient.ID 0
## ub source
## default
## user
## user
## user
## user
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)2.6.7.2 Posterior check
Open code
pp_check(get(model),
type = 'dens_overlay_grouped',
group = 'molnupiravir',
ndraws = 50)Open code
pp_check(get(model),
type = 'dens_overlay',
ndraws = 50)2.6.7.3 Results exploration
Open code
exp(fixef(get(model))[-1,c(1,3,4)])
## Estimate Q2.5 Q97.5
## week 1.0040371 0.9469659 1.064095
## molnupiravir 0.8809410 0.4508526 1.719678
## week:molnupiravir 0.9842648 0.8556413 1.133287
draws <- as_draws_df(get(model)) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week)
## 95% credible interval for week effect in molnupiravir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_molnupiravir, 0.025)),
p97.5 = exp(quantile(week_molnupiravir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.870 1.12
## 95% credible interval for week effect in remdesivir group
## (fold change per week)
draws%>%
summarize(
p2.5 = exp(quantile(week_remdesivir, 0.025)),
p97.5 = exp(quantile(week_remdesivir, 0.975))
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 0.947 1.06
## 95% credible interval between-treatment difference
## (difference in week log-ratios between treatment groups)
draws%>%
summarize(
p2.5 = quantile(`b_week:molnupiravir`, 0.025),
p97.5 = quantile(`b_week:molnupiravir`, 0.975)
)
## # A tibble: 1 × 2
## p2.5 p97.5
## <dbl> <dbl>
## 1 -0.156 0.1252.6.8 Table of mutations over time
2.6.8.1 Whole genome
Open code
draws_all50_wr <- as_draws_df(mutfreBM_all50_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
summarize(
est_wm = exp(quantile(week_molnupiravir, 0.5)),
CIL_wm = exp(quantile(week_molnupiravir, 0.025)),
CIP_wm = exp(quantile(week_molnupiravir, 0.975)),
est_wr = exp(quantile(week_remdesivir, 0.5)),
CIL_wr = exp(quantile(week_remdesivir, 0.025)),
CIP_wr = exp(quantile(week_remdesivir, 0.975)),
est_int = exp(quantile(`b_week:molnupiravir`, 0.5)),
CIL_int = exp(quantile(`b_week:molnupiravir`, 0.025)),
CIP_int = exp(quantile(`b_week:molnupiravir`, 0.975))
)
draws_all5_wr <- as_draws_df(mutfreBM_all5_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
summarize(
est_wm = exp(quantile(week_molnupiravir, 0.5)),
CIL_wm = exp(quantile(week_molnupiravir, 0.025)),
CIP_wm = exp(quantile(week_molnupiravir, 0.975)),
est_wr = exp(quantile(week_remdesivir, 0.5)),
CIL_wr = exp(quantile(week_remdesivir, 0.025)),
CIP_wr = exp(quantile(week_remdesivir, 0.975)),
est_int = exp(quantile(`b_week:molnupiravir`, 0.5)),
CIL_int = exp(quantile(`b_week:molnupiravir`, 0.025)),
CIP_int = exp(quantile(`b_week:molnupiravir`, 0.975))
)
draws_all05_wr <- as_draws_df(mutfreBM_all05_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
summarize(
est_wm = exp(quantile(week_molnupiravir, 0.5)),
CIL_wm = exp(quantile(week_molnupiravir, 0.025)),
CIP_wm = exp(quantile(week_molnupiravir, 0.975)),
est_wr = exp(quantile(week_remdesivir, 0.5)),
CIL_wr = exp(quantile(week_remdesivir, 0.025)),
CIP_wr = exp(quantile(week_remdesivir, 0.975)),
est_int = exp(quantile(`b_week:molnupiravir`, 0.5)),
CIL_int = exp(quantile(`b_week:molnupiravir`, 0.025)),
CIP_int = exp(quantile(`b_week:molnupiravir`, 0.975))
)
effects = c('1-week FC in molnupiravir',
'1-week FC in remdesivir',
'molnupiravir/remdesivir (ratio of FCs)')
f50tab <- data.frame(
FC = as.numeric(draws_all50_wr[seq(1, 9, 3)]),
CI_L = as.numeric(draws_all50_wr[seq(2, 9, 3)]),
CI_U = as.numeric(draws_all50_wr[seq(3, 9, 3)]))
f5tab <- data.frame(
FC = as.numeric(draws_all5_wr[seq(1, 9, 3)]),
CI_L = as.numeric(draws_all5_wr[seq(2, 9, 3)]),
CI_U = as.numeric(draws_all5_wr[seq(3, 9, 3)]))
f05tab <- data.frame(
FC = as.numeric(draws_all05_wr[seq(1, 9, 3)]),
CI_L = as.numeric(draws_all05_wr[seq(2, 9, 3)]),
CI_U = as.numeric(draws_all05_wr[seq(3, 9, 3)]))
all_table <- cbind(f50tab, f5tab, f05tab)
all_table[,c(1:9)] <- round(all_table[,c(1:9)], 2)
row.names(all_table) <- effects
all_table_html <- kbl(all_table, caption =
"Table X. Estimated 1-week fold-change (FC) in whole-genome mutations numbers over time, separately for different treatment (rows 1 and 2), and ratio between FCs of different treatment groups (molnupiravir/remdesivir; row 3). CI_L and CI_U show bounds of the 95% credible interval. Values of 1 imply null effect. Estimates are based on a Bayesian hierarchical generalized linear model with a Poisson distribution and weakly informative (slightly regularizing) priors. See methods for details") %>%
kable_styling("striped",full_width = T) %>%
add_header_above(c(" " = 1, "Frequency >50%" = 3,
"Frequency >5%" = 3, "Frequency >0.5%" = 3)) %>%
add_header_above(c("Whole genome ICVs" = 10)) %>%
column_spec(1, width_min = '2.4in')
all_table_html| FC | CI_L | CI_U | FC | CI_L | CI_U | FC | CI_L | CI_U | |
|---|---|---|---|---|---|---|---|---|---|
| 1-week FC in molnupiravir | 1.05 | 0.96 | 1.15 | 1.25 | 1.13 | 1.37 | 1.33 | 1.04 | 1.69 |
| 1-week FC in remdesivir | 1.00 | 0.96 | 1.04 | 1.04 | 0.99 | 1.09 | 1.06 | 0.95 | 1.19 |
| molnupiravir/remdesivir (ratio of FCs) | 1.05 | 0.95 | 1.16 | 1.20 | 1.07 | 1.34 | 1.25 | 0.96 | 1.61 |
2.6.8.2 Spike protein
Open code
draws_spike50_wr <- as_draws_df(mutfreBM_spike50_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
summarize(
est_wm = exp(quantile(week_molnupiravir, 0.5)),
CIL_wm = exp(quantile(week_molnupiravir, 0.025)),
CIP_wm = exp(quantile(week_molnupiravir, 0.975)),
est_wr = exp(quantile(week_remdesivir, 0.5)),
CIL_wr = exp(quantile(week_remdesivir, 0.025)),
CIP_wr = exp(quantile(week_remdesivir, 0.975)),
est_int = exp(quantile(`b_week:molnupiravir`, 0.5)),
CIL_int = exp(quantile(`b_week:molnupiravir`, 0.025)),
CIP_int = exp(quantile(`b_week:molnupiravir`, 0.975))
)
draws_spike5_wr <- as_draws_df(mutfreBM_spike5_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
summarize(
est_wm = exp(quantile(week_molnupiravir, 0.5)),
CIL_wm = exp(quantile(week_molnupiravir, 0.025)),
CIP_wm = exp(quantile(week_molnupiravir, 0.975)),
est_wr = exp(quantile(week_remdesivir, 0.5)),
CIL_wr = exp(quantile(week_remdesivir, 0.025)),
CIP_wr = exp(quantile(week_remdesivir, 0.975)),
est_int = exp(quantile(`b_week:molnupiravir`, 0.5)),
CIL_int = exp(quantile(`b_week:molnupiravir`, 0.025)),
CIP_int = exp(quantile(`b_week:molnupiravir`, 0.975))
)
draws_spike05_wr <- as_draws_df(mutfreBM_spike05_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
summarize(
est_wm = exp(quantile(week_molnupiravir, 0.5)),
CIL_wm = exp(quantile(week_molnupiravir, 0.025)),
CIP_wm = exp(quantile(week_molnupiravir, 0.975)),
est_wr = exp(quantile(week_remdesivir, 0.5)),
CIL_wr = exp(quantile(week_remdesivir, 0.025)),
CIP_wr = exp(quantile(week_remdesivir, 0.975)),
est_int = exp(quantile(`b_week:molnupiravir`, 0.5)),
CIL_int = exp(quantile(`b_week:molnupiravir`, 0.025)),
CIP_int = exp(quantile(`b_week:molnupiravir`, 0.975))
)
effects = c('1-week FC in molnupiravir',
'1-week FC in remdesivir',
'molnupiravir/remdesivir (ratio of FCs)')
f50tab <- data.frame(
FC = as.numeric(draws_spike50_wr[seq(1, 9, 3)]),
CI_L = as.numeric(draws_spike50_wr[seq(2, 9, 3)]),
CI_U = as.numeric(draws_spike50_wr[seq(3, 9, 3)]))
f5tab <- data.frame(
FC = as.numeric(draws_spike5_wr[seq(1, 9, 3)]),
CI_L = as.numeric(draws_spike5_wr[seq(2, 9, 3)]),
CI_U = as.numeric(draws_spike5_wr[seq(3, 9, 3)]))
f05tab <- data.frame(
FC = as.numeric(draws_spike05_wr[seq(1, 9, 3)]),
CI_L = as.numeric(draws_spike05_wr[seq(2, 9, 3)]),
CI_U = as.numeric(draws_spike05_wr[seq(3, 9, 3)]))
spike_table <- cbind(f50tab, f5tab, f05tab)
spike_table[,c(1:9)] <- round(spike_table[,c(1:9)], 2)
row.names(spike_table) <- effects
spike_table_html <- kbl(spike_table, caption =
"Table XX. Estimated 1-week fold-change (FC) in spike protein mutations numbers over time, separately for different treatment (rows 1 and 2), and ratio between FCs of different treatment groups (molnupiravir/remdesivir; row 3). CI_L and CI_U show bounds of the 95% credible interval. Values of 1 imply null effect. Estimates are based on a Bayesian hierarchical generalized linear model with a Poisson distribution and weakly informative (slightly regularizing) priors. See methods for details") %>%
kable_styling("striped",full_width = T) %>%
add_header_above(c(" " = 1, "Frequency >50%" = 3,
"Frequency >5%" = 3, "Frequency >0.5%" = 3)) %>%
add_header_above(c("Spike protein ICVs" = 10)) %>%
column_spec(1, width_min = '2.4in')
spike_table_html| FC | CI_L | CI_U | FC | CI_L | CI_U | FC | CI_L | CI_U | |
|---|---|---|---|---|---|---|---|---|---|
| 1-week FC in molnupiravir | 0.99 | 0.87 | 1.12 | 1.07 | 0.96 | 1.19 | 1.22 | 1.00 | 1.49 |
| 1-week FC in remdesivir | 1.00 | 0.95 | 1.06 | 1.01 | 0.96 | 1.06 | 1.02 | 0.93 | 1.12 |
| molnupiravir/remdesivir (ratio of FCs) | 0.98 | 0.86 | 1.13 | 1.06 | 0.94 | 1.19 | 1.20 | 0.96 | 1.48 |
2.6.9 Individual figures
2.6.9.1 Whole genome, >50%
Open code
time <- seq(0, 5, length.out = 100)
draws <- as_draws_df(mutfreBM_all50_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
select(week_molnupiravir, week_remdesivir, b_molnupiravir, b_Intercept)
remd <- data.frame()
for(i in 1:length(time)){
remd[1:40000,i] <- draws$b_Intercept + draws$week_remdesivir*time[i]}
remd <- sapply(remd , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
moln <- data.frame()
for(i in 1:length(time)){
moln[1:40000,i] <- draws$b_Intercept + draws$b_molnupiravir +
draws$week_molnupiravir*time[i]}
moln <- sapply(moln , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
predict <- data.frame(
prediction =
unlist(c(
exp(remd[3,]), exp(moln[3,])
)),
cil =
unlist(c(
exp(remd[1,]), exp(moln[1,])
)),
ciu =
unlist(c(
exp(remd[2,]), exp(moln[2,])
)),
time = rep(time, 2),
treatment = c(
rep("remdesivir", 100), rep("molnupiravir",100)
))
## figure
cole <- c("#FF00FF", "#01AF40")
fig_01a <- ggplot() +
geom_line(data = predict, aes(x = time,
y = prediction,
group = treatment,
color = treatment), size=1.05) +
geom_ribbon(data = predict,
aes(x = time, ymin = cil, ymax = ciu, fill = treatment),
alpha = 0.25, color = NA) +
geom_line(data = data_mutations_all,
aes(x = week, y = f_50, group = Patient.ID),
alpha = 0.8, color = 'grey30', size = 0.15,
method = 'lm', se=FALSE) +
scale_color_manual(values = cole) +
scale_fill_manual(values = cole) +
labs(x = "Weeks", y = 'Mutations counts') +
facet_wrap(~treatment, ncol = 2) +
ggtitle("Whole genome, >50%") +
theme(plot.title = element_text(size = 11), axis.text.y = element_text(size = 9))
fig_01a2.6.9.2 Whole genome, >5%
Open code
time <- seq(0, 5, length.out = 100)
draws <- as_draws_df(mutfreBM_all5_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
select(week_molnupiravir, week_remdesivir, b_molnupiravir, b_Intercept)
remd <- data.frame()
for(i in 1:length(time)){
remd[1:40000,i] <- draws$b_Intercept + draws$week_remdesivir*time[i]}
remd <- sapply(remd , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
moln <- data.frame()
for(i in 1:length(time)){
moln[1:40000,i] <- draws$b_Intercept + draws$b_molnupiravir +
draws$week_molnupiravir*time[i]}
moln <- sapply(moln , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
predict <- data.frame(
prediction =
unlist(c(
exp(remd[3,]), exp(moln[3,])
)),
cil =
unlist(c(
exp(remd[1,]), exp(moln[1,])
)),
ciu =
unlist(c(
exp(remd[2,]), exp(moln[2,])
)),
time = rep(time, 2),
treatment = c(
rep("remdesivir", 100), rep("molnupiravir",100)
))
## figure
cole <- c("#FF00FF", "#01AF40")
fig_01b <- ggplot() +
geom_line(data = predict, aes(x = time,
y = prediction,
group = treatment,
color = treatment), size=1.05) +
geom_ribbon(data = predict,
aes(x = time, ymin = cil, ymax = ciu, fill = treatment),
alpha = 0.25, color = NA) +
geom_line(data = data_mutations_all,
aes(x = week, y = f_5, group = Patient.ID),
alpha = 0.8, color = 'grey30', size = 0.15,
method = 'glm', se=FALSE,
method.args=list(family="poisson")) +
scale_color_manual(values = cole) +
scale_fill_manual(values = cole) +
labs(x = "Weeks", y = 'Mutations counts') +
facet_wrap(~treatment, ncol = 2) +
ggtitle("Whole genome, >5%") +
theme(plot.title = element_text(size = 11), axis.text.y = element_text(size = 9))
fig_01b2.6.9.3 Whole genome, >0.5%
Open code
time <- seq(0, 5, length.out = 100)
draws <- as_draws_df(mutfreBM_all05_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
select(week_molnupiravir, week_remdesivir, b_molnupiravir, b_Intercept)
remd <- data.frame()
for(i in 1:length(time)){
remd[1:40000,i] <- draws$b_Intercept + draws$week_remdesivir*time[i]}
remd <- sapply(remd , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
moln <- data.frame()
for(i in 1:length(time)){
moln[1:40000,i] <- draws$b_Intercept + draws$b_molnupiravir +
draws$week_molnupiravir*time[i]}
moln <- sapply(moln , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
predict <- data.frame(
prediction =
unlist(c(
exp(remd[3,]), exp(moln[3,])
)),
cil =
unlist(c(
exp(remd[1,]), exp(moln[1,])
)),
ciu =
unlist(c(
exp(remd[2,]), exp(moln[2,])
)),
time = rep(time, 2),
treatment = c(
rep("remdesivir", 100), rep("molnupiravir",100)
))
## figure
cole <- c("#FF00FF", "#01AF40")
fig_01c <- ggplot() +
geom_line(data = predict, aes(x = time,
y = prediction,
group = treatment,
color = treatment), size=1.05) +
geom_ribbon(data = predict,
aes(x = time, ymin = cil, ymax = ciu, fill = treatment),
alpha = 0.25, color = NA) +
geom_line(data = data_mutations_all,
aes(x = week, y = f_05, group = Patient.ID),
alpha = 0.8, color = 'grey30', size = 0.15,
method = 'glm', se=FALSE,
method.args=list(family="poisson")) +
scale_color_manual(values = cole) +
scale_fill_manual(values = cole) +
labs(x = "Weeks", y = 'Mutations counts') +
facet_wrap(~treatment, ncol = 2)+
ggtitle("Whole genome, >0.5%") +
theme(plot.title = element_text(size = 11), axis.text.y = element_text(size = 9))
fig_01c2.6.9.4 Spike protein, >50%
Open code
time <- seq(0, 5, length.out = 100)
draws <- as_draws_df(mutfreBM_spike50_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
select(week_molnupiravir, week_remdesivir, b_molnupiravir, b_Intercept)
remd <- data.frame()
for(i in 1:length(time)){
remd[1:40000,i] <- draws$b_Intercept + draws$week_remdesivir*time[i]}
remd <- sapply(remd , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
moln <- data.frame()
for(i in 1:length(time)){
moln[1:40000,i] <- draws$b_Intercept + draws$b_molnupiravir +
draws$week_molnupiravir*time[i]}
moln <- sapply(moln , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
predict <- data.frame(
prediction =
unlist(c(
exp(remd[3,]), exp(moln[3,])
)),
cil =
unlist(c(
exp(remd[1,]), exp(moln[1,])
)),
ciu =
unlist(c(
exp(remd[2,]), exp(moln[2,])
)),
time = rep(time, 2),
treatment = c(
rep("remdesivir", 100), rep("molnupiravir",100)
))
## figure
cole <- c("#FF00FF", "#01AF40")
fig_01d <- ggplot() +
geom_line(data = predict, aes(x = time,
y = prediction,
group = treatment,
color = treatment), size=1.05) +
geom_ribbon(data = predict,
aes(x = time, ymin = cil, ymax = ciu, fill = treatment),
alpha = 0.25, color = NA) +
geom_line(data = data_mutations_spike,
aes(x = week, y = f_50, group = Patient.ID),
alpha = 0.8, color = 'grey30', size = 0.15,
method = 'lm', se=FALSE) +
scale_color_manual(values = cole) +
scale_fill_manual(values = cole) +
labs(x = "Weeks", y = 'Mutations counts') +
facet_wrap(~treatment, ncol = 2) +
ggtitle("Spike protein, >50%") +
theme(plot.title = element_text(size = 11), axis.text.y = element_text(size = 9))
fig_01d2.6.9.5 Spike protein, >5%
Open code
time <- seq(0, 5, length.out = 100)
draws <- as_draws_df(mutfreBM_spike5_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
select(week_molnupiravir, week_remdesivir, b_molnupiravir, b_Intercept)
remd <- data.frame()
for(i in 1:length(time)){
remd[1:40000,i] <- draws$b_Intercept + draws$week_remdesivir*time[i]}
remd <- sapply(remd , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
moln <- data.frame()
for(i in 1:length(time)){
moln[1:40000,i] <- draws$b_Intercept + draws$b_molnupiravir +
draws$week_molnupiravir*time[i]}
moln <- sapply(moln , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
predict <- data.frame(
prediction =
unlist(c(
exp(remd[3,]), exp(moln[3,])
)),
cil =
unlist(c(
exp(remd[1,]), exp(moln[1,])
)),
ciu =
unlist(c(
exp(remd[2,]), exp(moln[2,])
)),
time = rep(time, 2),
treatment = c(
rep("remdesivir", 100), rep("molnupiravir",100)
))
## figure
cole <- c("#FF00FF", "#01AF40")
fig_01e <- ggplot() +
geom_line(data = predict, aes(x = time,
y = prediction,
group = treatment,
color = treatment), size=1.05) +
geom_ribbon(data = predict,
aes(x = time, ymin = cil, ymax = ciu, fill = treatment),
alpha = 0.25, color = NA) +
geom_line(data = data_mutations_spike,
aes(x = week, y = f_5, group = Patient.ID),
alpha = 0.8, color = 'grey30', size = 0.15,
method = 'glm', se=FALSE,
method.args=list(family="poisson")) +
scale_color_manual(values = cole) +
scale_fill_manual(values = cole) +
labs(x = "Weeks", y = 'Mutations counts') +
facet_wrap(~treatment, ncol = 2) +
ggtitle("Spike protein, >5%") +
theme(plot.title = element_text(size = 11), axis.text.y = element_text(size = 9))
fig_01e2.6.9.6 Spike protein, >0.5%
Open code
time <- seq(0, 5, length.out = 100)
draws <- as_draws_df(mutfreBM_spike05_wr) %>%
mutate(week_molnupiravir = b_week + `b_week:molnupiravir`,
week_remdesivir = b_week) %>%
select(week_molnupiravir, week_remdesivir, b_molnupiravir, b_Intercept)
remd <- data.frame()
for(i in 1:length(time)){
remd[1:40000,i] <- draws$b_Intercept + draws$week_remdesivir*time[i]}
remd <- sapply(remd , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
moln <- data.frame()
for(i in 1:length(time)){
moln[1:40000,i] <- draws$b_Intercept + draws$b_molnupiravir +
draws$week_molnupiravir*time[i]}
moln <- sapply(moln , function(p) quantile(p, probs = c(0.025,0.975,0.5)))
predict <- data.frame(
prediction =
unlist(c(
exp(remd[3,]), exp(moln[3,])
)),
cil =
unlist(c(
exp(remd[1,]), exp(moln[1,])
)),
ciu =
unlist(c(
exp(remd[2,]), exp(moln[2,])
)),
time = rep(time, 2),
treatment = c(
rep("remdesivir", 100), rep("molnupiravir",100)
))
## figure
cole <- c("#FF00FF", "#01AF40")
fig_01f <- ggplot() +
geom_line(data = predict, aes(x = time,
y = prediction,
group = treatment,
color = treatment), size=1.05) +
geom_ribbon(data = predict,
aes(x = time, ymin = cil, ymax = ciu, fill = treatment),
alpha = 0.25, color = NA) +
geom_line(data = data_mutations_spike,
aes(x = week, y = f_05, group = Patient.ID),
alpha = 0.8, color = 'grey30', size = 0.15,
method = 'glm', se=FALSE,
method.args=list(family="poisson")) +
scale_color_manual(values = cole) +
scale_fill_manual(values = cole) +
labs(x = "Weeks", y = 'Mutations counts') +
facet_wrap(~treatment, ncol = 2)+
ggtitle("Spike protein, >0.5%") +
theme(plot.title = element_text(size = 11), axis.text.y = element_text(size = 9))
fig_01f2.6.10 Combo figure
Open code
mutation_time_combo <- ggarrange(
fig_01a + theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank()),
fig_01b +
theme(axis.title.y = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank()),
fig_01c +
theme(axis.title.y = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank()),
fig_01d,
fig_01e +
theme(axis.title.y = element_blank()),
fig_01f +
theme(axis.title.y = element_blank()),
labels = c("A", "B", "C", "D", "E", "F"),
nrow = 2, ncol=3,
heights = c(1, 1.25),
widths = c(1.1, 1, 1),
common.legend = TRUE,
legend = "bottom"
)
plotac <- 'mutation_time_combo'
get(plotac)Open code
path = paste0('gitignore/figures/',plotac, '.pdf')
if(file.exists(path) == FALSE){
ggsave(path,
plot = get(plotac),
width = 8, height = 6, units = "in")
}3 Results description
3.1 Probability maps
A probabilistic model confirmed that the probability of a viable virus depends on both the time from the onset of symptoms and Ct. Specifically, a model with informed priors suggests that a one-unit change in Ct is related to a 60% decrease (95% CI: 36 to 80% decrease) in the odds of viability (OR: 0.4, 95% CI: 0.2 to 0.64). Similarly, the odds of viability decrease with each additional week by 69% (95% CI: 24 to 88%; odds ratio: 0.31, 95% CI: 0.12 to 0.76).
Figure XXX shows a probability map indicating the probability of the viable virus given the Ct and time from symptom onset, accounting for the uncertainty in the estimated model parameters and adjusting for repeated measurements of the same individuals. The results thus indicate that there is less than a 5% risk of a viable virus when the Ct exceeds 32.6 (1 week post-symptoms onset), 30.9 (two weeks), 29.6 (three weeks), 28.5 (four weeks), and 27.4 (five weeks post-symptoms onset).
Sensitivity analysis with very weakly informative priors shows similar results (Suppl. Figure XX1). Supplementary Figures XX2 and XX3 show the same map but assume that the median posterior estimates are correct (i.e., the uncertainty in the estimated parameters is not accounted for).
3.2 Covid mutations over time
We used Bayesian hierarchical models to evaluate changes in mutation numbers over time and their relation to treatment. Across various ISNVs (spike vs. whole genome, frequencies >0.5%, >5%, and >50%), results were relatively consistent: negligible to small changes in Remdesivir-treated patients (very likely <20% change per week across all ISNVs) and a possibly substantial increase in the Molnupiravir group (Table X, Figure X). For instance, the model for whole genome ISNVs with a frequency >5% showed a probably noticeable increase in the Molnupiravir group (95% CrI: 1.13 to 1.38) but indicated negligible to small change in the Remdesivir group (95% CrI for 1-week fold-change: 0.995 to 1.09). Models of other ISNVs showed similar yet statistically less clear patterns.
4 Reproducibility
Open code
sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=cs_CZ.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=cs_CZ.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=cs_CZ.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Prague
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] dagitty_0.3-4 ggdag_0.2.13 ggbeeswarm_0.6.0 quantreg_5.98
## [5] SparseM_1.81 coxed_0.3.3 survival_3.7-0 rms_6.8-1
## [9] Hmisc_5.1-3 bayesplot_1.8.1 ggdist_3.3.2 kableExtra_1.4.0
## [13] lubridate_1.8.0 corrplot_0.92 arm_1.12-2 lme4_1.1-35.5
## [17] MASS_7.3-61 janitor_2.2.0 projpred_2.0.2 brms_2.21.0
## [21] Rcpp_1.0.13 glmnet_4.1-8 Matrix_1.7-0 boot_1.3-30
## [25] cowplot_1.1.1 pROC_1.18.0 mgcv_1.9-1 nlme_3.1-165
## [29] openxlsx_4.2.5 flextable_0.9.6 sjPlot_2.8.16 RJDBC_0.2-10
## [33] rJava_1.0-6 DBI_1.1.2 car_3.1-2 carData_3.0-5
## [37] gtsummary_2.0.2 emmeans_1.10.4 ggpubr_0.4.0 stringi_1.7.6
## [41] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
## [45] readr_2.1.2 tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
## [49] tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] fs_1.6.4 matrixStats_1.3.0 httr_1.4.2
## [4] insight_0.20.2 tools_4.4.2 backports_1.5.0
## [7] sjlabelled_1.2.0 utf8_1.2.4 R6_2.5.1
## [10] withr_3.0.1 Brobdingnag_1.2-7 prettyunits_1.1.1
## [13] gridExtra_2.3 cli_3.6.3 textshaping_0.3.6
## [16] performance_0.12.2 gt_0.11.0 isoband_0.2.5
## [19] officer_0.6.6 sandwich_3.0-1 sass_0.4.9
## [22] labeling_0.4.2 mvtnorm_1.1-3 polspline_1.1.25
## [25] ggridges_0.5.3 askpass_1.1 QuickJSR_1.3.1
## [28] commonmark_1.9.1 systemfonts_1.0.4 StanHeaders_2.32.10
## [31] foreign_0.8-86 gfonts_0.2.0 svglite_2.1.0
## [34] readxl_1.3.1 rstudioapi_0.16.0 httpcode_0.3.0
## [37] generics_0.1.3 shape_1.4.6 distributional_0.4.0
## [40] zip_2.2.0 inline_0.3.19 loo_2.4.1
## [43] fansi_1.0.6 abind_1.4-5 lifecycle_1.0.4
## [46] multcomp_1.4-18 yaml_2.3.5 snakecase_0.11.1
## [49] grid_4.4.2 promises_1.2.0.1 crayon_1.5.0
## [52] lattice_0.22-5 haven_2.4.3 pillar_1.9.0
## [55] knitr_1.48 estimability_1.5.1 codetools_0.2-19
## [58] glue_1.7.0 V8_4.4.2 fontLiberation_0.1.0
## [61] data.table_1.15.4 vctrs_0.6.5 cellranger_1.1.0
## [64] gtable_0.3.0 assertthat_0.2.1 datawizard_0.12.2
## [67] cachem_1.1.0 xfun_0.46 mime_0.12
## [70] tidygraph_1.3.1 coda_0.19-4 iterators_1.0.14
## [73] ellipsis_0.3.2 TH.data_1.1-0 fontquiver_0.2.1
## [76] rstan_2.32.6 tensorA_0.36.2.1 vipor_0.4.5
## [79] rpart_4.1.23 colorspace_2.0-2 nnet_7.3-19
## [82] tidyselect_1.2.1 processx_3.8.4 compiler_4.4.2
## [85] curl_4.3.2 rvest_1.0.2 htmlTable_2.4.0
## [88] xml2_1.3.3 fontBitstreamVera_0.1.1 posterior_1.6.0
## [91] checkmate_2.3.2 scales_1.3.0 callr_3.7.6
## [94] digest_0.6.37 minqa_1.2.4 rmarkdown_2.27
## [97] htmltools_0.5.8.1 pkgconfig_2.0.3 base64enc_0.1-3
## [100] highr_0.11 dbplyr_2.1.1 fastmap_1.2.0
## [103] rlang_1.1.4 htmlwidgets_1.6.4 shiny_1.9.1
## [106] farver_2.1.0 zoo_1.8-9 jsonlite_1.8.8
## [109] magrittr_2.0.3 Formula_1.2-4 munsell_0.5.0
## [112] viridis_0.6.2 gdtools_0.3.7 ggraph_2.2.1
## [115] plyr_1.8.6 pkgbuild_1.3.1 parallel_4.4.2
## [118] ggrepel_0.9.5 sjmisc_2.8.10 graphlayouts_1.1.1
## [121] ggeffects_1.7.0 splines_4.4.2 hms_1.1.1
## [124] sjstats_0.19.0 ps_1.7.7 igraph_2.0.3
## [127] uuid_1.0-3 markdown_1.13 ggsignif_0.6.3
## [130] reshape2_1.4.4 stats4_4.4.2 rstantools_2.1.1
## [133] crul_1.5.0 reprex_2.0.1 evaluate_1.0.0
## [136] RcppParallel_5.1.8 modelr_0.1.8 nloptr_2.0.0
## [139] tzdb_0.2.0 foreach_1.5.2 tweenr_2.0.3
## [142] httpuv_1.6.5 cards_0.2.2 MatrixModels_0.5-3
## [145] openssl_1.4.6 polyclip_1.10-0 ggforce_0.4.2
## [148] broom_1.0.6 xtable_1.8-4 rstatix_0.7.0
## [151] later_1.3.0 viridisLite_0.4.0 ragg_1.2.1
## [154] memoise_2.0.1 beeswarm_0.4.0 cluster_2.1.6
## [157] gamm4_0.2-6 cmdstanr_0.8.0.9000 bridgesampling_1.1-2