# Do odds ratios change for varying prevalence?

Author

Gibran Hemani

Published

November 5, 2023

## Background

Disease prevalence may differ across ancestries even if the effect size of a risk factor stays the same. Does this lead to different effect estimates?

First, suppose the mean of the risk factor changes. Does that influence the estimated effect (beta hat)?

``library(dplyr)``
``````
Attaching package: 'dplyr'``````
``````The following objects are masked from 'package:stats':

filter, lag``````
``````The following objects are masked from 'package:base':

intersect, setdiff, setequal, union``````
``````library(ggplot2)

p <- expand.grid(
  b=c(-1, 0.5, 0, 0.5, 1),  # 0.5 appears twice; -0.5 may have been intended
  m=seq(-0.8, 0.8, by=0.1),
  bhat=NA,
  prev=NA
)

n <- 10000
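for(i in 1:nrow(p)) {
  a <- rnorm(n, mean=p$m[i])                           # risk factor with mean m
  b <- a * p$b[i] + rnorm(n)                           # log odds of disease, with added noise
  d <- rbinom(n, 1, plogis(b))                         # binary disease status
  p$bhat[i] <- glm(d ~ a, family="binomial")$coef[2]   # estimated log odds ratio
  p$prev[i] <- mean(d)                                 # observed prevalence
}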

ggplot(p, aes(x=prev, y=bhat)) +
  geom_point(aes(colour=m)) +
  facet_grid(. ~ b)``````

No influence: the estimate does not change with prevalence as the mean of the risk factor shifts.
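One way to see why this is expected: in the fitted model, logit P(d = 1 | a) = α + βa can be rewritten as (α + βm) + β(a − m), so a shift in the mean of the risk factor is absorbed by the intercept and the slope is untouched. The following is a minimal sketch of that identity, not part of the original analysis. The seed and the values of `m` and `beta` are arbitrary assumptions chosen for illustration.

```r
set.seed(1)    # arbitrary seed, only for reproducibility of this sketch
n <- 10000
m <- 0.8       # hypothetical mean of the risk factor
beta <- 0.5    # hypothetical true effect (log odds ratio)

a <- rnorm(n, mean = m)
d <- rbinom(n, 1, plogis(a * beta + rnorm(n)))

# Same data, with the risk factor on its original and on a centred scale
a_centred <- a - m
fit_raw <- glm(d ~ a, family = "binomial")
fit_centred <- glm(d ~ a_centred, family = "binomial")

slope_raw <- unname(coef(fit_raw)[2])
slope_centred <- unname(coef(fit_centred)[2])
int_raw <- unname(coef(fit_raw)[1])
int_centred <- unname(coef(fit_centred)[1])

# The slopes agree; the intercepts differ by slope * m
print(c(slope_raw = slope_raw, slope_centred = slope_centred))
print(c(int_centred = int_centred, int_raw_plus_shift = int_raw + slope_raw * m))
```

The two printed pairs should match up to numerical tolerance, whatever seed is used.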

What if prevalence changes because of another factor?

``````p <- expand.grid(
  b=c(-1, 0.5, 0, 0.5, 1),  # 0.5 appears twice; -0.5 may have been intended
  m1=seq(-0.8, 0.8, by=0.1),
  m2=seq(-0.8, 0.8, by=0.1),
  bhat=NA,
  prev=NA
)

n <- 10000
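for(i in 1:nrow(p)) {
  a <- rnorm(n, mean=p$m1[i])                          # risk factor with mean m1
  a1 <- rnorm(n, mean=p$m2[i])                         # second factor with mean m2
  b <- a * p$b[i] + rnorm(n) + a1                      # log odds of disease, with added noise
  d <- rbinom(n, 1, plogis(b))                         # binary disease status
  p$bhat[i] <- glm(d ~ a, family="binomial")$coef[2]   # estimated log odds ratio for a
  p$prev[i] <- mean(d)                                 # observed prevalence
}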

ggplot(p, aes(x=prev, y=bhat)) +
  geom_point(aes(colour=m2)) +
  facet_grid(m1 ~ b)``````
``summary(lm(bhat ~ m1 + m2, p))``
``````
Call:
lm(formula = bhat ~ m1 + m2, data = p)

Residuals:
    Min      1Q  Median      3Q     Max
-0.9782 -0.1597  0.2058  0.2426  0.6615

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.146779   0.013228  11.096   <2e-16 ***
m1          -0.001546   0.027002  -0.057    0.954
m2          -0.001075   0.027002  -0.040    0.968
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5028 on 1442 degrees of freedom
Multiple R-squared:  3.373e-06, Adjusted R-squared:  -0.001384
F-statistic: 0.002432 on 2 and 1442 DF,  p-value: 0.9976``````

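No influence: neither the mean of the risk factor (m1) nor the mean of the other factor (m2) is associated with the estimate.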
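The regression above pools all values of b, so much of the residual variation in bhat reflects the different true effect sizes. A possible complementary check is to regress bhat on prevalence directly while adjusting for b. The sketch below does this on a smaller grid; the grid values, sample size, seed and the object names `q` and `fit` are assumptions for illustration, not the author's.

```r
set.seed(2)    # arbitrary seed, only for reproducibility of this sketch
q <- expand.grid(
  b = c(-1, 0, 1),         # hypothetical smaller set of true effects
  m1 = c(-0.8, 0, 0.8),    # mean of the risk factor
  m2 = c(-0.8, 0, 0.8),    # mean of the other factor
  bhat = NA,
  prev = NA
)

n <- 5000
for (i in seq_len(nrow(q))) {
  a <- rnorm(n, mean = q$m1[i])
  a1 <- rnorm(n, mean = q$m2[i])
  d <- rbinom(n, 1, plogis(a * q$b[i] + rnorm(n) + a1))
  q$bhat[i] <- glm(d ~ a, family = "binomial")$coef[2]
  q$prev[i] <- mean(d)
}

# Adjust for the true effect so the prevalence term is not swamped by it
fit <- lm(bhat ~ prev + factor(b), data = q)
summary(fit)$coefficients
```

A `prev` coefficient close to zero relative to its standard error would be in line with the conclusion above.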

``sessionInfo()``
``````R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggplot2_3.4.2 dplyr_1.1.2

loaded via a namespace (and not attached):
[1] vctrs_0.6.3       cli_3.6.1         knitr_1.43        rlang_1.1.1
[5] xfun_0.39         generics_0.1.3    jsonlite_1.8.7    labeling_0.4.2
[9] glue_1.6.2        colorspace_2.1-0  htmltools_0.5.5   scales_1.2.1
[13] fansi_1.0.4       rmarkdown_2.22    grid_4.3.0        munsell_0.5.0
[17] evaluate_0.21     tibble_3.2.1      fastmap_1.1.1     yaml_2.3.7
[21] lifecycle_1.0.3   compiler_4.3.0    htmlwidgets_1.6.2 pkgconfig_2.0.3
[25] rstudioapi_0.14   farver_2.1.1      digest_0.6.31     R6_2.5.1
[29] tidyselect_1.2.0  utf8_1.2.3        pillar_1.9.0      magrittr_2.0.3
[33] withr_2.5.0       tools_4.3.0       gtable_0.3.3     ``````