# Decomposing drug x side effect matrices

Author: Gibran Hemani

Published: October 31, 2023

## Background

Aim: predict drug side effects using Mendelian randomisation (MR).

We have a matrix of side effects x drugs:

``A = n_se X m_drugs``

Each drug binds some genes:

``B = p_genes X m_drugs``

MR gives the effect of each gene on all traits:

``C = q_traits X p_genes``

Finally, a matrix links trait terms to side effect terms:

``D = q_traits X n_se``

## Basic simulation

- m = 3 drugs
- n = 5 side effects
- p = 6 genes
- q = 10 traits

``````# gene x drug - e.g. based on binding affinities
B <- matrix(c(
0, 1, 1,
1, 0, 0,
0, 1, 0,
0, 0, 1,
0, 0, 0,
0, 1, 1
), 6, 3)

# trait x se - matches trait names to side effect terms
D <- matrix(c(
1, 0, 0, 0, 0,
1, 0, 0, 0, 0,
0, 1, 0, 0, 0,
0, 1, 0, 0, 0,
0, 0, 1, 0, 0,
0, 0, 1, 0, 0,
0, 0, 1, 0, 0,
0, 0, 1, 0, 0,
0, 0, 0, 1, 0,
0, 0, 0, 1, 0
), 10, 5)

# True mapping of genes to side effects - we don't observe this
gse <- matrix(c(
1, 0, 0, 0, 0, 0,
1, 1, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0,
0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 1, 0
), 5, 6)

# True drug x side effect matrix is generated from gene side effects by gene drug binding
A <- gse %*% B
A``````
``````     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    1    1    1
[3,]    1    1    1
[4,]    0    1    1
[5,]    1    0    0``````
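One detail of the code above is worth checking when reading the matrices: `matrix()` fills column-wise by default, so the rows as typed in the `c(...)` calls are not the rows of the resulting matrix. The sketch below shows this for `B`; the names `vals`, `B_colwise` and `B_rowwise` are illustrative only. Whether the row-wise layout was intended is not stated, and the outputs shown in this post correspond to the column-wise fill.

```r
# Same values, in the same order, as the B defined above
vals <- c(
  0, 1, 1,
  1, 0, 0,
  0, 1, 0,
  0, 0, 1,
  0, 0, 0,
  0, 1, 1
)

# Default: fill down the columns (this is what the simulation uses)
B_colwise <- matrix(vals, 6, 3)

# byrow = TRUE: fill along the rows (this matches the typed layout)
B_rowwise <- matrix(vals, 6, 3, byrow = TRUE)

B_colwise
B_rowwise
identical(B_colwise, B_rowwise)
```

The same applies to `D` and `gse`. Switching to `byrow = TRUE` would change `A`, `C` and every result that follows, so the two versions are not interchangeable.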

We don’t actually see the gse matrix. If everything works as we hypothesise then the trait x gene matrix that we observe would follow:

``````C <- D %*% gse
C``````
``````      [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    0    0    0    0
[2,]    0    1    0    0    1    0
[3,]    0    1    0    0    1    1
[4,]    0    0    1    0    0    0
[5,]    0    0    0    0    0    0
[6,]    1    0    0    0    0    0
[7,]    0    1    0    0    1    0
[8,]    0    1    0    0    1    1
[9,]    0    0    1    0    0    0
[10,]    0    0    0    0    0    0``````

Now we have B, C and D. How do we get back to A? We need to invert D, which isn’t square, so we use the Moore-Penrose pseudoinverse.

``````library(pracma)
Ahat <- pinv(D) %*% C %*% B``````
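The reasoning behind this estimator can be written out from the two relationships used in the simulation. Here $G$ stands for the unobserved `gse` matrix and $D^{+}$ for the pseudoinverse of $D$; both symbols are introduced only for this sketch.

```latex
\begin{aligned}
A &= G B, \qquad C = D G \\
\hat{A} &= D^{+} C B = D^{+} D \, G B = (D^{+} D) \, A
\end{aligned}
```

So $\hat{A} = A$ is guaranteed when $D^{+} D = I$, which holds when $D$ has full column rank. If two side effects map to exactly the same traits, their columns in $D$ coincide, $D^{+} D$ is only a projection, and the two side effects cannot be separated.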

Does the prediction match the true A?

``cor(c(Ahat), c(A))``
``[1] 0.9279607``
``plot(Ahat, A)``

Quite close, but the pseudoinverse fails to recover some of the values exactly. An alternative to the pseudoinverse is to manually re-label the trait names with their side effect terms.
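A possible way to see where the mismatch comes from is to check the rank of `D` and whether the pseudoinverse times `D` gives the identity. The sketch below rebuilds the simulated matrices in compact form. It uses `pinv_svd`, a hypothetical base R helper that stands in for `pracma::pinv` so that the example has no package dependency; it is not part of the original analysis.

```r
# Rebuild the simulated matrices with the same values and fill order as above
B <- matrix(c(0,1,1, 1,0,0, 0,1,0, 0,0,1, 0,0,0, 0,1,1), 6, 3)
D <- matrix(c(
  1,0,0,0,0, 1,0,0,0,0, 0,1,0,0,0, 0,1,0,0,0, 0,0,1,0,0,
  0,0,1,0,0, 0,0,1,0,0, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,1,0
), 10, 5)
gse <- matrix(c(
  1,0,0,0,0,0, 1,1,0,0,0,0, 0,0,1,0,0,0, 0,0,0,1,1,0, 0,0,0,0,1,0
), 5, 6)
A <- gse %*% B
C <- D %*% gse

# Hypothetical helper: SVD-based Moore-Penrose pseudoinverse in base R
pinv_svd <- function(M, tol = 1e-10) {
  s <- svd(M)
  keep <- s$d > tol * max(s$d)  # drop (numerically) zero singular values
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Numerical rank of D; full column rank would equal ncol(D)
qr(D)$rank
ncol(D)

# Columns of D that repeat an earlier column
which(duplicated(t(D)))

# pinv(D) %*% D is the identity only if D has full column rank
P <- pinv_svd(D) %*% D
round(P, 2)

# Non-zero entries mark the drug x side effect cells that are not recovered
Ahat <- pinv_svd(D) %*% C %*% B
round(Ahat - A, 2)
```

In this simulation columns 3 and 4 of `D` are identical, so `D` has rank 4 rather than 5. The product `P` then averages side effects 3 and 4, and rows 3 and 4 of `Ahat` both become the mean of the corresponding rows of `A`. Every other row is recovered exactly.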

``sessionInfo()``
``````R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] pracma_2.4.2

loaded via a namespace (and not attached):
[1] htmlwidgets_1.6.2 compiler_4.3.0    fastmap_1.1.1     cli_3.6.1
[5] tools_4.3.0       htmltools_0.5.5   rstudioapi_0.14   yaml_2.3.7
[9] rmarkdown_2.22    knitr_1.43        xfun_0.39         digest_0.6.31
[13] jsonlite_1.8.7    rlang_1.1.1       evaluate_0.21    ``````