Monday, August 9, 2010

Quickly Find the Class of data.frame vectors in R

Aviad Klein over at My ContRibution wrote a convenient R function to list the classes of all the vectors that make up a data.frame. You would think apply(kyphosis,2,class) would do the job, but it doesn't: apply() coerces the data frame to a matrix first, so every column gets reported as "character". Aviad wrote an elegant little function that does the job without having to load any external package:

allClass <- function(x) {unlist(lapply(unclass(x),class))}

Here it is in action:

> # load the CO2 dataset
> data(CO2)
> 
> # look at the first few rows
> head(CO2)
  Plant   Type  Treatment conc uptake
1   Qn1 Quebec nonchilled   95   16.0
2   Qn1 Quebec nonchilled  175   30.4
3   Qn1 Quebec nonchilled  250   34.8
4   Qn1 Quebec nonchilled  350   37.2
5   Qn1 Quebec nonchilled  500   35.3
6   Qn1 Quebec nonchilled  675   39.2
> 
> # this doesn't work
> apply(CO2,2,class)
      Plant        Type   Treatment        conc      uptake 
"character" "character" "character" "character" "character" 
> 
> # this does
> allClass <- function(x) {unlist(lapply(unclass(x),class))}
> 
> allClass(CO2)
   Plant1    Plant2      Type Treatment      conc    uptake 
"ordered"  "factor"  "factor"  "factor" "numeric" "numeric" 
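
The Plant1 and Plant2 entries in that output show up because Plant is an ordered factor and so carries two classes, and unlist() gives each one its own name. If one entry per column is preferable, a minimal sketch is to paste the classes together. The name allClassCollapsed and the "/" separator are my own choices here, not part of Aviad's function:

```r
# Hypothetical variant (not from Aviad's post): return exactly one string
# per column, joining multiple classes with "/".
allClassCollapsed <- function(x) {
  vapply(x, function(col) paste(class(col), collapse = "/"), character(1))
}

data(CO2)

# Plant is reported once, as "ordered/factor"; the other columns
# keep their single class.
allClassCollapsed(CO2)
```

Because vapply() insists on a length-one character result for every column, the output always has the same length and names as the data frame.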

Nice tip, Aviad.
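
For anyone wondering why the apply() call reports "character" everywhere, here is a rough sketch of what seems to be going on, using the same CO2 data. apply() is built for arrays, so the data frame is converted with as.matrix() before class() ever sees a column:

```r
data(CO2)

# apply() operates on arrays, so a data frame is converted to a matrix first.
# With factor columns present, that matrix is a character matrix.
m <- as.matrix(CO2)
typeof(m)            # "character"

# Every column pulled out of a character matrix is a character vector,
# including the ones that were numeric in the data frame.
class(m[, "conc"])   # "character"

# lapply() and sapply() walk over the columns as list elements instead,
# so each column keeps its own class. Here sapply() returns a list,
# because Plant has two classes and the results differ in length.
sapply(CO2, class)
```

That is also why the lapply()-based allClass() needs the unlist() step to get back a plain named vector.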

4 comments:

  1. Another option is

    unlist(sapply(x, class))

  2. What about just 'str(CO2)'?

    You get a little bit more than what you were looking for in the first place, but it's probably a lot more useful anyway. And you don't mess up the 'Plant' column, which gets duplicated because of its two classes (factor, and on top of that, ordered).

    With the allClass function, it gets very messy with Plant1 and Plant2 as variables.

    > CO2$Plant1 <- CO2$Plant
    > CO2$Plant2 <- CO2$Plant
    > allClass(CO2)
       Plant1    Plant2      Type Treatment      conc    uptake
    "ordered"  "factor"  "factor"  "factor" "numeric" "numeric"
      Plant11   Plant12   Plant21   Plant22
    "ordered"  "factor" "ordered"  "factor"

    Still OK with 'str(CO2)'...

  3. What about:
    sapply(blast, class)

    much simpler.


Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.