Monday, August 22, 2011

Estimating Trait Heritability from GWAS Data

Peter Visscher and colleagues have recently published a flurry of papers employing a new software package called GCTA to estimate the heritability of traits using GWAS data (GCTA stands for Genome-wide Complex Trait Analysis -- clever acronymity!). The tool, supported (and presumably coded) by Jian Yang is remarkably easy to use, based in part on the familiar PLINK commandline interface. The GCTA Homepage provides an excellent walk-through of the available options.

The basic idea is to use GWAS data to estimate the degree of "genetic sharing" or relatedness among the samples, computing what the authors call a genetic relationship matrix (GRM). The degree of genetic sharing among samples is then related to the amount of phenotypic sharing using restricted maximum likelihood analysis (REML). The result is an estimate of the variance explained by the SNPs used to generate the GRM. Full details of the stats along with all the gory matrix notation can be found in their software publication.

The approach has been applied to several disorders studied by the WTCCC and to a recent study of human height. Interestingly, the developers have also used the approach to partition the trait variance across chromosomes, resulting in something similar to population-based variance-components linkage analysis. The approach works for both quantitative and dichotomous traits, however the authors warn that variance estimates of dichotomous trait liability are influenced by genotyping artifacts.

The package also includes several other handy features, including a relatively easy way to estimate principal components for population structure correction, a GWAS simulation tool, and a regression-based LD mapping tool. Download and play -- a binary is available for Linux, MacOS, and DOS/Windows.


  1. I've been using GCTA for a few months, definitely simple to use. Though, I would be careful saying it estimates heritability, as this is not necessarily true. Using "unrelated" individuals, it estimates the variance explained by common SNPs through LD with causal variants, which can in certain situations, reflect a lower bound for the heritability. Anyway, a subtle point, but probably an important distinction.

  2. Does V(1)/Vp give us the estimate for heritability? Did you guys come across to elevated standard errors for the estimates? I am getting a S.E. of 0.42. Maybe it is due to the sample size (n=695).


Note: Only a member of this blog may post a comment.

Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.