Last year I linked to a series of perspectives in NEJM with contrasting views on the success or failure of GWAS, with David Goldstein's paper and Nick Wade's synopsis that soon followed in the New York Times being particularly pessimistic. Earlier this year I was swayed by an essay in Cell by Jon McClellan and Mary-Claire King condemning the common disease common variant hypothesis and chalking up most GWAS hits to population stratification. Their main argument, that the frequency of the risk allele is hypervariable even in European populations, was persuasive until Kai Wang pointed out on this blog, and recently expanded in a correspondence in Cell, that McClellan and King used a cohort with extremely small sample sizes to estimate allele frequencies, and that this locus was no more variable than the average SNP in European populations. (There is a series of three letters in Cell, including a response by McClellan and King, that are definitely worth reading here, here, and here.)
One of the main arguments in David Goldstein's 2009 NEJM paper is that doing GWAS with ever larger sample sizes will not yield meaningful discoveries, especially when the newly detected loci explain such a small proportion of the heritability of the trait being studied. A paper published last week in Nature provides empirical evidence refuting such claims. "Biological, clinical and population relevance of 95 loci for blood lipids" by Teslovich et al. (with Goncalo Abecasis, Cristen Willer, Sekar Kathiresan, Leena Peltonen, Kari Stefansson, Yurii Aulchenko, Chiara Sabatti, Robert Hegele, Francis Collins, and many, many other co-authors) presents a meta-analysis of blood lipid levels in over 100,000 samples in multiple ethnic groups. This study identified 95 loci (59 novel) associated with total cholesterol, LDL, HDL, or triglycerides, explaining 10-12% of the variation in these traits. A handful of these loci demonstrated clear clinical and/or biological significance: several are common variants in or near genes harboring rare variants known to cause extreme dyslipidemias, and for others the authors demonstrated altered lipid levels in mice after disturbing the regulation of several of the newly discovered genes. Furthermore, most of the newly discovered loci were also significant in non-European populations, with the same direction of association.
This study demonstrates that combining studies through meta-analysis, and thereby achieving sample sizes massive enough to detect extremely small effects, can result in both clinically and biologically meaningful discoveries using GWAS. This study also demonstrates that most of the significant results are in fact associated with lipid traits across global populations, which has implications for enabling personal genomics / personalized medicine in non-European populations. Furthermore, as Teri Manolio noted in her recent review in NEJM, one cannot equate variance explained with potential clinical importance: the type 2 diabetes-associated genes PPARG and KCNJ11 and the psoriasis-associated gene IL12B encode proteins that are drug targets for thiazolidinediones, sulfonylureas, and anti-p40 antibodies respectively, yet all of these associations have odds ratios less than 1.45. So a GWAS with >100,000 samples uncovers new loci with extremely small effects. While these loci alone may not be useful today for treatment or clinical risk stratification, it's difficult to judge their importance until you perturb the system with a pharmaceutical or some other environmental intervention.
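To make the sample size point concrete, here is a rough back-of-the-envelope sketch (not taken from the paper) of why small effects only become detectable with very large cohorts. It assumes a single additive SNP in Hardy-Weinberg equilibrium, an effect size expressed in trait standard deviations per allele, the conventional genome-wide significance threshold of 5e-8, and a normal approximation to the power of a 1-df association test. The allele frequency and effect size below are made-up illustrative values, and the function names are my own.

```python
from statistics import NormalDist


def variance_explained(maf, beta_sd):
    """Fraction of trait variance explained by one additive SNP.

    Assumes Hardy-Weinberg equilibrium; beta_sd is the per-allele
    effect in units of trait standard deviations.
    """
    return 2.0 * maf * (1.0 - maf) * beta_sd ** 2


def approx_power(n, q2, alpha=5e-8):
    """Approximate power of a 1-df association test (normal approximation).

    n is the sample size, q2 the fraction of variance explained by the SNP,
    alpha the two-sided significance threshold (5e-8 assumed by default).
    """
    ncp = n * q2 / (1.0 - q2)  # expected chi-square non-centrality
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Ignores the negligible probability of rejecting in the wrong tail.
    return NormalDist().cdf(ncp ** 0.5 - z_crit)


# Hypothetical SNP: 30% allele frequency, 0.03 SD per allele.
q2 = variance_explained(maf=0.3, beta_sd=0.03)
print(f"variance explained: {q2:.6f}")
for n in (5_000, 20_000, 100_000):
    print(f"n = {n:>7}: approximate power = {approx_power(n, q2):.4f}")
```

Under these assumptions, a variant explaining a few hundredths of a percent of trait variance is essentially invisible in a study of a few thousand people, but has a good chance of reaching genome-wide significance at 100,000 samples.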
A true testament to the success of GWAS, the paper is a pleasure to read (even though, unfortunately, the real substance of the paper is buried in the 19 tables and 3 figures of the 83-page supplement).
Biological, clinical and population relevance of 95 loci for blood lipids