Comments on Getting Genetics Done: Split, apply, and combine in R using PLYR
Stephen Turner

2012-03-28 13:12

I enjoy your blog. It is helpful. Plyr is definitely cool and provides some much-needed consistency when applying functions across data frames, arrays, and lists. I plan to use it more. On the other hand, I just wanted to point out that this exercise is also very doable with a "standard" R function like aggregate, in case there were those who thought that one *had* to use plyr to solve a problem like this (mostly newcomers to R). This is off the top of my head, so there may even be more shortcuts, though I think this is readable. I used your data as it appears on this page.

> # mean and sd of a numeric vector, returned as a named vector
> f = function(x) c(mean = mean(x), sd = sd(x))
> aggregate(mydata[c('X1','X2')], by = list(SNP1 = mydata$SNP1, SNP2 = mydata$SNP2), f)

  SNP1 SNP2   X1.mean     X1.sd  X2.mean    X2.sd
1   AA   BB 0.4007287 0.8677048 6.013512 2.594847
2   Aa   Bb 0.1908830 0.8702564 5.810701 1.806888
3   aa   bb 0.4249260 1.4266786 5.496132 1.824754

Anonymous

2010-01-14 08:37

Hi there! Struggling with the same problem here. Coming from SQL and "group by" thinking, I had huge problems finding a similarly simple and elegant solution in R; tapply(), for example, works but is cumbersome.
I could just use the sqldf package, but that would be cheating :)

If you want the mean and sd of one variable only, a more "readable" version of the code can be made by combining plyr with the incredibly useful smean.sd() function from the Hmisc package.

ddply(mydata, .(SNP1), function(df) smean.sd(df$X1))

I love the fact that R has a million ways of solving a problem and that each user can find one that suits him or her.

BTW, the "data.frame" statement in your code is superfluous; a c() will suffice (as ddply always returns a data frame).

I'm sure there are other ways; anyhow, plyr is definitely in my toolbox to stay!

Jan

2009-12-04 11:57

Thanks for bringing this to light. As someone familiar with SQL and now being "forced" to work solely in R, I've been frustrated with the lack of group-by type functions. Your blog overall has been very helpful to me. I'm so glad I came by your poster at ASHG this fall; otherwise I would never have found out about it!

Anonymous
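
For anyone comparing the base R options raised in the comments above, here is a minimal sketch of the aggregate approach alongside tapply(). The data is simulated as a stand-in for the post's mydata (an assumption; the original values are not reproduced), and the names mean_sd, res, flat, and x1_means are illustrative only. One detail worth knowing: when the summary function returns a vector, aggregate() stores each result as a matrix column, so the printed header X1.mean is not an actual column name until the result is flattened.

```r
# Simulated stand-in for the post's `mydata` (assumption: not the original values)
set.seed(1)
mydata <- data.frame(
  SNP1 = rep(c("AA", "Aa", "aa"), each = 10),
  SNP2 = rep(c("BB", "Bb", "bb"), each = 10),
  X1   = rnorm(30),
  X2   = rnorm(30, mean = 6, sd = 2)
)

# Summary function returning a named vector (illustrative name)
mean_sd <- function(x) c(mean = mean(x), sd = sd(x))

# Group by both SNP columns using the formula interface of aggregate()
res <- aggregate(cbind(X1, X2) ~ SNP1 + SNP2, data = mydata, FUN = mean_sd)

# res$X1 and res$X2 are two-column matrices; flatten them into ordinary columns
flat <- do.call(data.frame, res)
print(names(flat))

# Single-variable case with tapply(), for comparison
x1_means <- tapply(mydata$X1, mydata$SNP1, mean)
print(x1_means)
```

The formula interface avoids repeating mydata$ for every grouping column, which may be closer to the SQL "group by" feel that two of the commenters were looking for.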