Generating PRS of clocks from mQTLs

Author

Gibran Hemani

Published

January 4, 2024

Background

Can we generate a PRS for an epigenetic clock using just mQTLs?

Model

The clock is a weighted sum of CpGs. Each CpG has an mQTL. No other genetic factors influence the clock.

Simulation

library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(simulateGP)
ncpg <- 100
nid <- 10000

w <- rnorm(ncpg)
g <- matrix(rbinom(ncpg * nid, 2, 0.5), nid, ncpg)
b <- rnorm(ncpg, sd=0.1)
e <- matrix(rnorm(ncpg * nid), nid, ncpg)
cpgg <- t(t(g) * b)
cpg <- cpgg + e
clock <- cpg %*% w
clockgwas <- gwas(clock, g)
clockprs_direct <- g %*% clockgwas$bhat

bhat <- sapply(1:ncpg, \(i) {
    fast_assoc(cpg[,i], g[,i])$bhat
})
cpghat <- t(t(g) * bhat)
clockprs_mqtl <- cpghat %*% w
plot(clockprs_direct, clockprs_mqtl)

Which is better powered?

summary(lm(clock ~ clockprs_direct))

Call:
lm(formula = clock ~ clockprs_direct)

Residuals:
    Min      1Q  Median      3Q     Max 
-37.279  -7.035  -0.015   6.933  44.983 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -0.92219    0.13001  -7.093  1.4e-12 ***
clockprs_direct  1.01061    0.07803  12.951  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.35 on 9998 degrees of freedom
Multiple R-squared:  0.0165,    Adjusted R-squared:  0.0164 
F-statistic: 167.7 on 1 and 9998 DF,  p-value: < 2.2e-16
summary(lm(clock ~ clockprs_mqtl))

Call:
lm(formula = clock ~ clockprs_mqtl)

Residuals:
    Min      1Q  Median      3Q     Max 
-37.123  -7.076   0.032   6.962  45.099 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.2571     0.1059   2.428   0.0152 *  
clockprs_mqtl   1.0943     0.1332   8.215 2.39e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.41 on 9998 degrees of freedom
Multiple R-squared:  0.006704,  Adjusted R-squared:  0.006605 
F-statistic: 67.48 on 1 and 9998 DF,  p-value: 2.387e-16

Summary

  • Generating clock PRS using GWAS of clock is equivalent to generating the clock PRS indirectly from mQTLs
  • It’s better powered to GWAS the clock directly than to use mQTLs, assuming same sample sizes for clock GWAS and mQTL
  • There may be latent heritable factors that influence CpGs that are not the known mQTLs, and which in aggregate are better powered to be detected by the clock GWAS. But these are likely to be a minority of the genetic variation for the clock.

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] simulateGP_0.1.2 dplyr_1.1.4     

loaded via a namespace (and not attached):
 [1] digest_0.6.33     utf8_1.2.4        R6_2.5.1          fastmap_1.1.1    
 [5] tidyselect_1.2.0  xfun_0.41         magrittr_2.0.3    glue_1.6.2       
 [9] tibble_3.2.1      knitr_1.45        pkgconfig_2.0.3   htmltools_0.5.7  
[13] rmarkdown_2.25    generics_0.1.3    lifecycle_1.0.4   cli_3.6.1        
[17] fansi_1.0.5       vctrs_0.6.4       compiler_4.3.2    tools_4.3.2      
[21] pillar_1.9.0      evaluate_0.23     yaml_2.3.7        rlang_1.1.2      
[25] jsonlite_1.8.7    htmlwidgets_1.6.3