Checking PCA projection

Statistics
Author

Gibran Hemani

Published

June 6, 2023

Background

Need to be able to calculate genetic principal components in unrelateds and then project into relateds.

Simulate

library(MASS)
m <- 20
A <- matrix(runif(m^2)*2-1, ncol=m)
sigma <- t(A) %*% A
gen <- mvrnorm(1000, rep(0, m), sigma)

Generate in full sample

pcafull <- princomp(gen)
names(pcafull)
[1] "sdev"     "loadings" "center"   "scale"    "n.obs"    "scores"   "call"    

Generate in 90%

pca90 <- princomp(gen[1:900,])

Project into remaining 10%

pca10 <- gen[901:1000,] %*% pca90$loadings

correlate full with projected

diag(cor(pca10, pcafull$scores[901:1000,]))
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6    Comp.7    Comp.8 
0.9979686 0.9869467 0.9829454 0.9843170 0.9927274 0.9905525 0.9942163 0.9960611 
   Comp.9   Comp.10   Comp.11   Comp.12   Comp.13   Comp.14   Comp.15   Comp.16 
0.9994964 0.9965984 0.9981390 0.9818166 0.9838703 0.9984871 0.9972718 0.9968159 
  Comp.17   Comp.18   Comp.19   Comp.20 
0.9982290 0.9993551 0.9991197 0.9989832 

This works fine. Looks like GCTA does something similar here: https://yanglab.westlake.edu.cn/software/gcta/#PCloadingandprojection


sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6.2

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MASS_7.3-58.4

loaded via a namespace (and not attached):
 [1] htmlwidgets_1.6.2 compiler_4.3.0    fastmap_1.1.1     cli_3.6.1        
 [5] tools_4.3.0       htmltools_0.5.5   rstudioapi_0.14   yaml_2.3.7       
 [9] rmarkdown_2.22    knitr_1.43        xfun_0.39         digest_0.6.31    
[13] jsonlite_1.8.4    rlang_1.1.1       evaluate_0.21