Friday, October 29, 2010

Reproducible Research in the Omics Era: A Presentation and Panel Discussion

Seminar announcement for Vanderbilt folks:

Vanderbilt-Ingram Cancer Center
Quantitative Sciences Seminar Series


Reproducible Research in the Omics Era:
A Presentation and Panel Discussion
Kevin R. Coombes, PhD
Deputy Chair, Bioinformatics, and Professor of Bioinformatics and
Computational Biology
M.D. Anderson Cancer Center


Keith Baggerly, PhD
Associate Professor, Dept. of Bioinformatics and Computational Biology
M.D. Anderson Cancer Center

Panel Discussion at 1 p.m., following presentations:
Featuring Drs. Baggerly and Coombes, along with
Vanderbilt University School of Medicine’s
Dr. William Pao, Dr. Frank Harrell, and Dr. Yu Shyr

Friday, November 19, 2010
12 noon – 2 PM
214 Light Hall

Thursday, October 28, 2010

PacBio Film, Discussion & Reception/Dinner at ASHG 2010

Pacific Biosciences is hosting a reception and dinner and screening their film The New Biology at this year's ASHG meeting. According to a flyer they mailed me, the film will showcase their SMRT sequencing technology and how it can be used to "create predictive models of living systems and gain wisdom about the fundamental nature of life itself." While the last bit is perhaps an overstatement, the event should nonetheless be worth attending. It includes a reception, dinner, and a moderated discussion featuring individuals from the film. Unfortunately this conflicts with the previously mentioned 1000 Genomes Tutorial, but if you get waitlisted at the tutorial, sign up for this event at the link below!

Wednesday, November 3, 2010


Smithsonian National Air and Space Museum
Independence Ave at 6th St SW
Washington, DC 20560

RSVP here -

Wednesday, October 27, 2010

Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application

While writing my thesis I came across this nice review by Rita Cantor, Kenneth Lange, and Janet Sinsheimer at UCLA, "Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application." Skip the introduction unless you're new to GWAS, in which case you'll probably want to start with this more recent review by Teri Manolio. After the intro you'll find a succinct introduction to meta-analysis for GWAS with lots of very good references, including these among others:

DerSimonian R., Laird N. Meta-analysis in clinical trials. Control. Clin. Trials. 1986;7:177–188.

Fleiss J.L. The statistical basis of meta-analysis. Stat. Methods Med. Res. 1993;2:121–145.

Yesupriya A., Yu W., Clyne M., Gwinn M., Khoury M.J. The continued need to synthesize the results of genetic associations across multiple studies. Genet. Med. 2008;10:633–635.

Lau J., Ioannidis J.P., Schmid C.H. Quantitative synthesis in systematic reviews. Ann. Intern. Med. 1997;127:820–826.

Allison D.B., Schork N.J. Selected methodological issues in meiotic mapping of obesity genes in humans: Issues of power and efficiency. Behav. Genet. 1997;27:401–421.

Ioannidis J.P., Gwinn M., Little J., Higgins J.P., Bernstein J.L., Boffetta P., Bondy M., Bray M.S., Brenchley P.E., Buffler P.A., Human Genome Epidemiology Network and the Network of Investigator Networks. A road map for efficient and reliable human genome epidemiology. Nat. Genet. 2006;38:3–5.

de Bakker P.I., Ferreira M.A., Jia X., Neale B.M., Raychaudhuri S., Voight B.F. Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum. Mol. Genet. 2008;17(R2):R122–R128.

Sagoo G.S., Little J., Higgins J.P., Human Genome Epidemiology Network. Systematic reviews of genetic association studies. PLoS Med. 2009;6:e28.

Zeggini E., Ioannidis J.P. Meta-analysis in genome-wide association studies. Pharmacogenomics. 2009;10:191–201.

Egger M., Smith G.D., Phillips A.N. Meta-analysis: Principles and procedures. BMJ. 1997;315:1533–1537.

Ioannidis J.P., Patsopoulos N.A., Evangelou E. Heterogeneity in meta-analyses of genome-wide association investigations. PLoS ONE. 2007;2:e841.

This section covers using imputation in meta-analysis, fixed effects versus random effects meta-analysis, canned software for meta-analysis (such as METAL), Bayesian hierarchical approaches, and references to many applications of meta-analysis in GWAS.
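If the fixed effects versus random effects distinction is new to you, here's a minimal sketch of the two estimators in plain Python. This is my own illustration, not code from the review or from METAL: the function names and the effect sizes are made up, and the random-effects version uses the DerSimonian-Laird moment estimator for the between-study variance (the approach from the first reference above).

```python
import math

def fixed_effect(betas, ses):
    """Inverse-variance weighted fixed-effects estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)

def random_effects_dl(betas, ses):
    """DerSimonian-Laird random-effects estimate, standard error, and tau^2."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta_fe, _ = fixed_effect(betas, ses)
    # Cochran's Q: weighted squared deviations from the fixed-effects estimate
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = total - sum(w ** 2 for w in weights) / total
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Re-weight each study by its within-study plus between-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    total_star = sum(w_star)
    beta_re = sum(w * b for w, b in zip(w_star, betas)) / total_star
    return beta_re, math.sqrt(1.0 / total_star), tau2

# Hypothetical per-study log odds ratios and standard errors for one SNP
betas = [0.10, 0.30, 0.20]
ses = [0.05, 0.05, 0.05]

beta_fe, se_fe = fixed_effect(betas, ses)
beta_re, se_re, tau2 = random_effects_dl(betas, ses)
print(beta_fe, se_fe)
print(beta_re, se_re, tau2)
```

With these made-up numbers both estimates land on the same pooled effect, but the random-effects standard error is wider because the studies disagree more than their standard errors alone would predict. When the studies are homogeneous, tau^2 is truncated to zero and the two approaches give the same answer.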

After the meta-analysis section there's a nice section on modeling epistasis, or gene-gene interactions, to prioritize associations, with links to other reviews of statistical methods and brief coverage of data mining procedures like CART, MDR, random forests, conditional entropy methods, neural networks, genetic programming, logic regression, pattern mining, Bayesian partitioning, and penalized regression approaches, again with lots of references. This section also covers parameterization of epistatic models and some of the computational and statistical issues you'll face with the dimensionality problem.
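To get a feel for the dimensionality problem, here's a rough back-of-the-envelope sketch. The numbers are my own assumptions, not figures from the review: a hypothetical panel of 500,000 SNPs, an exhaustive pairwise scan, and a plain Bonferroni correction at alpha = 0.05.

```python
import math

def pairwise_tests(n_snps):
    """Number of distinct SNP pairs in an exhaustive two-locus scan."""
    return math.comb(n_snps, 2)

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

n_snps = 500_000  # hypothetical genotyping panel size
n_pairs = pairwise_tests(n_snps)

print(f"{n_pairs:,} pairwise tests")
print(f"Bonferroni threshold: {bonferroni_threshold(n_pairs):.1e}")
```

That works out to roughly 125 billion pairwise tests and a per-test threshold around 4e-13, which is why filtering or prioritizing SNPs before testing for interactions is so attractive.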

Finally, the review concludes with a section on pathway analysis. As the review admits, pathway analysis in GWAS has no set of strict guidelines or best practices, and new approaches arise every day.

While this review is nearly a year old at this point, I think it's a real gem because of all the references it offers, especially in the meta-analysis and epistasis sections.

AJHG: Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application

Thursday, October 14, 2010

Tutorial on the 1000 Genomes Project Data

There will be a (free) tutorial on the 1000 Genomes Project at this year's ASHG meeting on Wednesday, November 3, 7:00 – 9:30 pm. You can register online at the link below. The tutorial will describe the 1000 Genomes data, how to access it, and what to do with it. The speakers and topics are:

1. Introduction
2. Description of the 1000 Genomes data -- Gabor Marth
3. How to access the data -- Steve Sherry
4. How to use the browser -- Paul Flicek
5. Structural variants -- Jan Korbel
6. How to use the data in disease studies -- Jeff Barrett
7. Q&A

Online registration for 1000 genomes tutorial

Hopefully I'll see some of you there. I'm not sure if imputation is covered in this tutorial. If not, I will cover it here in a future post. I'll soon be using Goncalo Abecasis's 1000 Genomes Imputation Cookbook to impute my own data to the 1kG SNPs, and I'll share any tips I discover along the way.

Wednesday, October 6, 2010

Random forests for high-dimensional genomics data

I know I've been MIA for a while. My defense date is December 3, and I've still got a thesis to write! I'll try to post more soon, but in the meantime follow me on Twitter for things that won't make it into a full blog post.

For those at Vanderbilt and the surrounding environs: I saw this announcement for the next cancer biostatistics workshop that looked interesting.

2010 Cancer Biostatistics Workshop

Friday, October 15, 2010
1:00 to 2:00 PM
898B Preston Research Building

Random forests for high-dimensional genomics data

Xi (Steven) Chen, PhD
Assistant Professor
Department of Biostatistics
Cancer Biostatistics Center, Vanderbilt-Ingram Cancer Center
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.