Frankel’s health attention index

Author

Gibran Hemani

Published

October 17, 2025

Background

Frankel (1989) argued that part of the waiting-list problem was that professionals prioritised treating conditions that were fashionable rather than those that were more urgent or prevalent.

This is the dataset:

library(dplyr)

library(data.table)

library(ggplot2)
library(ggrepel)
library(knitr)
dat <- fread("frankel-index.csv")
kable(dat)
| Diagnoses | Discharges and deaths | Index of interest (Papers/D&D × 1000) |
|:---|---:|---:|
| Slow virus diseases of CNS | 40 | 2000.0 |
| Myasthenia Gravis | 930 | 156.0 |
| Crohn’s disease | 6670 | 44.0 |
| Carcinoma of the breast | 41220 | 33.0 |
| Rheumatoid arthritis | 26060 | 27.0 |
| Carcinoma of the bronchus | 54440 | 20.0 |
| Myocardial infarction | 102720 | 10.0 |
| Cerebrovascular disease | 111250 | 7.7 |
| Irritable bowel etc.* | 19840 | 6.7 |
| Cataract | 54990 | 6.5 |
| Hip replacement | 37400 | 6.0 |
| Haemorrhoids | 20700 | 5.0 |
| Inguinal hernia | 64400 | 1.8 |
| Tonsils and adenoids | 76600 | 0.7 |
| Varicose veins | 47160 | 0.6 |

Work out the number of papers per disease by inverting the index:

names(dat) <- c("disease", "dd", "index")
# index = papers / dd * 1000, so papers = index * dd / 1000
dat$papers <- dat$index * dat$dd / 1000
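As a quick check on this back-calculation, the index definition in the table header can be inverted by hand. The symbols $P_i$ (papers) and $D_i$ (discharges and deaths) are notation introduced here for illustration and do not come from Frankel's paper; the two worked rows use the values in the table above.

```latex
\begin{align*}
\text{index}_i &= \frac{P_i}{D_i} \times 1000
  \quad\Longrightarrow\quad
  P_i = \frac{\text{index}_i \times D_i}{1000} \\[4pt]
P_{\text{slow virus diseases of CNS}} &= \frac{2000 \times 40}{1000} = 80 \\[4pt]
P_{\text{myocardial infarction}} &= \frac{10 \times 102720}{1000} = 1027.2
\end{align*}
```

Because the published index is rounded, the recovered paper counts are approximate and need not be whole numbers.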

The paper points out that there is a highly disproportionate number of papers on slow virus diseases of the CNS. Presented this way, it does look like a major misalignment between resources and needs, but the index here is a ratio of two small numbers.

ggplot(dat, aes(x = dd, y = index)) +
  geom_point() +
  geom_text_repel(aes(label = disease), size = 2)

Instead, plot the number of papers against the number of deaths and discharges. This may better reflect the degree of misalignment between resources and health needs.

ggplot(dat, aes(x = dd, y = papers)) +
  geom_point() +
  geom_text_repel(aes(label = disease), size = 2) +
  geom_smooth(method = "lm")

Slow virus diseases of the CNS account for only 0.012 (about 1.2%) of all papers in the dataset.
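The arithmetic behind that fraction can be sketched as follows. The total is an assumption in the sense that it is recomputed here from the rounded table values (index × D&D / 1000, summed over the 15 diagnoses), so it is approximate.

```latex
\begin{align*}
\sum_{i=1}^{15} P_i &\approx 6571.2 \\[4pt]
\frac{P_{\text{slow virus diseases of CNS}}}{\sum_i P_i}
  &\approx \frac{80}{6571.2} \approx 0.012
\end{align*}
```

So although this diagnosis has by far the highest index, it takes up only a small share of the total research output.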

Inequality analysis

The Gini index measures how unequally a resource is allocated. A high Gini index means that a large fraction of the resource goes to a small number of categories. The concentration index measures how much of that inequality is aligned with a second variable. A concentration index equal to the Gini index means that the second variable is perfectly aligned with the resource. A concentration index of 0 means that there is no alignment (it is essentially random).
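For reference, one common way to write both indices is the covariance form below. This is a sketch of the textbook definition, not necessarily the exact estimator used by any particular software routine. The notation is introduced here for illustration: $y_i$ is the outcome being accumulated, $\bar{y}$ its mean, and $r_i$ the fractional rank of unit $i$ after sorting by a ranking variable.

```latex
\begin{align*}
r_i &= \frac{i - 0.5}{n}
  && \text{(fractional rank, units sorted by the ranking variable)} \\[4pt]
C &= \frac{2}{\bar{y}} \operatorname{cov}(y_i, r_i)
  && \text{(concentration index)} \\[4pt]
G &= \frac{2}{\bar{y}} \operatorname{cov}\!\left(y_i, r_i^{(y)}\right)
  && \text{(Gini index: ranks } r_i^{(y)} \text{ taken from } y \text{ itself)} \\[4pt]
-G &\le C \le G
\end{align*}
```

The Gini index is therefore the special case in which a variable is ranked by itself, and the concentration index reaches that bound only when the ranking variable orders the units in exactly the same way as the outcome.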

The plot below shows that attention to diseases in the literature is unequal, and that this inequality matches, to some degree, the unequal distribution of disease burden.

library(rineq)

# Gini index of papers: rank by papers, accumulate papers
out1 <- rineq::ci(
    ineqvar = dat$papers,
    outcome = dat$papers, 
    method = "direct"
)
gini <- tibble(
    ci = out1$concentration_index,
    ci_se = sqrt(out1$variance),
    ci_lci = ci - 1.96 * ci_se,
    ci_uci = ci + 1.96 * ci_se
)

# Concentration index: rank by papers, accumulate deaths and discharges
out2 <- rineq::ci(
    ineqvar = dat$papers,
    outcome = dat$dd, 
    method = "direct"
)
ciout <- tibble(
    ci = out2$concentration_index,
    ci_se = sqrt(out2$variance),
    ci_lci = ci - 1.96 * ci_se,
    ci_uci = ci + 1.96 * ci_se
)

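# Build concentration-curve coordinates from an rineq::ci() result
make_plot_dat <- function(x) {
  # Sort observations by fractional rank of the ranking variable
  myOrder <- order(x$fractional_rank)
  xCoord <- x$fractional_rank[myOrder]
  y <- x$outcome[myOrder]
  # Cumulative share of the outcome at each rank
  cumdist <- cumsum(y) / sum(y)
  tibble(xCoord, cumdist)
}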

plot_dat <- bind_rows(
    make_plot_dat(out1) %>% mutate(group=paste0("Number of papers (Gini index = ", round(out1$concentration_index, 2), ")")), 
    make_plot_dat(out2) %>% mutate(group=paste0("Deaths and discharges (concentration index = ", round(out2$concentration_index, 2), ")"))
)

ggplot(aes(x = xCoord, y = cumdist, group = group), data = plot_dat) +
  geom_line(aes(colour = group)) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  theme_bw() +
  theme(legend.position = "inside", legend.position.inside=c(0.3,0.8)) +
  labs(
    x = "Fractional rank of number of papers",
    y = "Cumulative proportion of outcome",
    colour = "Outcome"
  )

However, the concentration index has a wide confidence interval that includes zero.

ciout %>% kable
| ci | ci_se | ci_lci | ci_uci |
|---:|---:|---:|---:|
| 0.1328638 | 0.1044544 | -0.0718669 | 0.3375944 |
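The interval limits follow from the usual normal approximation applied in the code above (estimate ± 1.96 standard errors). The symbols $\hat{C}$ and $\mathrm{SE}$ are notation introduced here; the numbers are those reported in the table.

```latex
\begin{align*}
\hat{C} \pm 1.96 \times \mathrm{SE}
  &= 0.1329 \pm 1.96 \times 0.1045 \\
  &= 0.1329 \pm 0.2047 \\
  &= (-0.0719,\ 0.3376)
\end{align*}
```

With only 15 diagnoses, a normal approximation is a rough guide, so the limits should be read as indicative.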

Summary

There is some concordance between the inequality in disease attention and disease burden (concentration index 0.13, 95% confidence interval -0.07 to 0.34), but the estimate is imprecise and the interval includes zero.


sessionInfo()
R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rineq_0.3.0       knitr_1.50        ggrepel_0.9.6     ggplot2_3.5.2    
[5] data.table_1.17.8 dplyr_1.1.4      

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5        nlme_3.1-168       cli_3.6.5          rlang_1.1.6       
 [5] xfun_0.52          generics_0.1.4     jsonlite_2.0.0     labeling_0.4.3    
 [9] glue_1.8.0         htmltools_0.5.8.1  scales_1.4.0       rmarkdown_2.29    
[13] grid_4.5.1         evaluate_1.0.4     tibble_3.3.0       fastmap_1.2.0     
[17] yaml_2.3.10        lifecycle_1.0.4    compiler_4.5.1     RColorBrewer_1.1-3
[21] Rcpp_1.1.0         htmlwidgets_1.6.4  pkgconfig_2.0.3    mgcv_1.9-3        
[25] lattice_0.22-7     farver_2.1.2       digest_0.6.37      R6_2.6.1          
[29] tidyselect_1.2.1   splines_4.5.1      pillar_1.11.0      magrittr_2.0.3    
[33] Matrix_1.7-3       withr_3.0.2        tools_4.5.1        gtable_0.3.6