Is there a problem of standard errors in MR methods being overly precise?
Simulate summary data where the exposure has an influence of bxy on the outcome, but the SNPs may have independent effects on the outcome as well.
Simulate 10k SNPs, with 50% of them contributing to heritability.
The pleiotropy effect is that for X and Y, 50% of SNPs are selected independently to contribute to some heritability.
The causal effect is that the the bgx effects on have an additional effect on y of bgx * bxy.
bxy = 0, pleiotropy = 0
bxy = 0, pleiotropy = 0.4
bxy = 0.4, pleiotropy = 0
bxy = 0.4, pleiotropy = 0.4
For scenarios where bxy = 0, expect that false positive rate is appropriately controlled for IVW, weighted median, weighted mode.
set.seed(12345)library(TwoSampleMR)
TwoSampleMR version 0.5.11
[>] New: Option to use non-European LD reference panels for clumping etc
[>] Some studies temporarily quarantined to verify effect allele
[>] See news(package = 'TwoSampleMR') and https://gwas.mrcieu.ac.uk for further details
library(simulateGP)
Attaching package: 'simulateGP'
The following objects are masked from 'package:TwoSampleMR':
allele_frequency, contingency, get_population_allele_frequency
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(ggplot2)library(knitr)sim <-function(nsnp, nid, plei, bxy, sim=1) {# Allele frequencies map <-arbitrary_map(runif(nsnp, 0.01, 0.99))# Effects for x paramsx <-generate_gwas_params(map=map, h2=0.4, S=-0.4, Pi=0.5)# Summary stats for x betax <-generate_gwas_ss(paramsx, nid=nid)# Effects for y paramsy <-generate_gwas_params(map=map, h2=plei, S=-0.4, Pi=0.5) %>%mutate(beta = beta + paramsx$beta * bxy)# summary stats for y betay <-generate_gwas_ss(paramsy, nid=nid) dat <-merge_exp_out(betax, betay) bind_rows(# Analysis using thresholded instruments dat %>%filter(pval.exposure <5e-8) %>%mr(., method_list=c("mr_ivw", "mr_weighted_median", "mr_weighted_mode")) %>%mutate(inst ="threshold"),# Analysis using all variants regardless of threshold dat %>%mr(., method_list=c("mr_ivw", "mr_weighted_median", "mr_weighted_mode")) %>%mutate(inst ="all"), ) %>%mutate(plei=plei, bxy=bxy, sim=sim)}sim(10000, 100000, 0.4, 0)
Analysing 'X' on 'Y'
Analysing 'X' on 'Y'
id.exposure id.outcome outcome exposure method nsnp
1 X Y Y X Inverse variance weighted 346
2 X Y Y X Weighted median 346
3 X Y Y X Weighted mode 346
4 X Y Y X Inverse variance weighted 10000
5 X Y Y X Weighted median 10000
6 X Y Y X Weighted mode 10000
b se pval inst plei bxy sim
1 -0.0132785411 1.862067e-02 0.4757792 threshold 0.4 0 1
2 0.0053904701 1.356861e-02 0.6911646 threshold 0.4 0 1
3 0.0325270703 4.133091e-02 0.4318272 threshold 0.4 0 1
4 -0.0089724040 1.006037e-02 0.3724701 all 0.4 0 1
5 -0.0003497783 8.313815e-03 0.9664413 all 0.4 0 1
6 4.6421866726 1.341177e+05 0.9999724 all 0.4 0 1
sim(10000, 100000, 0, 0.4)
Analysing 'X' on 'Y'
Analysing 'X' on 'Y'
id.exposure id.outcome outcome exposure method nsnp
1 X Y Y X Inverse variance weighted 338
2 X Y Y X Weighted median 338
3 X Y Y X Weighted mode 338
4 X Y Y X Inverse variance weighted 10000
5 X Y Y X Weighted median 10000
6 X Y Y X Weighted mode 10000
b se pval inst plei bxy sim
1 0.3646894 7.852256e-03 0.000000e+00 threshold 0 0.4 1
2 0.3515298 1.220235e-02 1.684991e-182 threshold 0 0.4 1
3 0.3313245 4.271241e-02 1.043069e-13 threshold 0 0.4 1
4 0.3242457 4.724237e-03 0.000000e+00 all 0 0.4 1
5 0.3190166 7.740202e-03 0.000000e+00 all 0 0.4 1
6 -2.2205335 2.014174e+03 9.991204e-01 all 0 0.4 1
`summarise()` has grouped output by 'plei', 'bxy', 'method'. You can override
using the `.groups` argument.
plei
bxy
method
inst
mean_se
mean_beta
power
minp
0.0
0.4
Inverse variance weighted
all
4.74690e-03
0.3203781
1
0.0000000
0.0
0.4
Inverse variance weighted
threshold
8.26810e-03
0.3607363
1
0.0000000
0.0
0.4
Weighted median
all
7.76120e-03
0.3170201
1
0.0000000
0.0
0.4
Weighted median
threshold
1.21785e-02
0.3478450
1
0.0000000
0.0
0.4
Weighted mode
all
5.44511e+05
-16.6974825
0
0.9988647
0.0
0.4
Weighted mode
threshold
4.15776e-02
0.3481614
1
0.0000000
0.4
0.4
Inverse variance weighted
all
1.01325e-02
0.3184288
1
0.0000000
0.4
0.4
Inverse variance weighted
threshold
1.83467e-02
0.3601333
1
0.0000000
0.4
0.4
Weighted median
all
8.99970e-03
0.2923377
1
0.0000000
0.4
0.4
Weighted median
threshold
1.41112e-02
0.3272884
1
0.0000000
0.4
0.4
Weighted mode
all
8.06599e+06
147.1660233
0
0.9984827
0.4
0.4
Weighted mode
threshold
4.08282e-02
0.3404088
1
0.0000000
Summary
Under no pleiotropy the false positive rate is controlled (actually median and mode are slightly over conservative)
Under pleiotropy the false positive rate is controlled for weighted mode but slightly inflated for weighted median
Not obvious that the bootstrap approach for obtaining standard errors here, which is used by weighted median and weighted mode, is performing particularly poorly.
This is only 100 replications so may be unstable but there isn’t something very obviously wrong here.