Monday, January 18, 2010

Genome-wide Manhattan Plots in STATA

I have used this chunk of code on numerous occasions to plot GWAS data, so I thought I'd share!
twoway scatter logpval absposMB, msize(tiny) ysize(2) xsize(7) xaxis(1 2) yline(16.881438 2.9957323, lcolor(red))
xline(245.522847 488.541076 688.046816 879.458034 1060.3159 1231.291599 1389.919738 1536.194564 1674.623832 1810.03746 1944.489844 2076.939655 2191.082635 2297.45122 2397.790135 2486.617389 2565.392131 2641.509284 2705.320935 2767.756899 2814.701222 2864.255932, lpattern(dash) lcolor(gs12))
xlabel(122 "1" 367 "2" 588 "3" 783 "4" 969 "5" 1145 "6" 1310 "7" 1463 "8" 1605 "9" 1742 "10" 1877 "11" 2010 "12" 2134
"13" 2244 "14" 2347 "15" 2442 "16" 2526 "17" 2603 "18" 2673 "19" 2736 "20" 2791 "21" 2839 "22" 2941 "X", axis(2))
xlabel( 0(50)3000, axis(1) alternate) subtitle(,bcolor(white) ring(30) orientation(vertical) position(9) nobexpand) ytitle("-Log p-value") xtitle("Physical Distance (Mb)", axis(1))
The variables needed are a log p-value (or some other statistic) and the absolute genomic position of each SNP (distance from the beginning of chromosome 1). If you need the offsets to compute this absolute position, they are listed in MB in the xline(---) portion of the plot. It makes something like this:



Enjoy!

3 comments:

  1. data codes is one of the most interesting things, I really like it, I would like to be a professional in this area!

    ReplyDelete
  2. how do I put the inputs of -log p value in y axis in stata command?
    Can you please explain the command in small pieces?

    ReplyDelete
  3. I'd highly recommend moving to use the qqman package. Here's a link to the blog post:

    http://www.gettinggeneticsdone.com/2014/05/qqman-r-package-for-qq-and-manhattan-plots-for-gwas-results.html

    ReplyDelete

Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.