## Thursday, February 17, 2011

### R: Given column name in a Data Frame, Get the Index

Had a mental block today trying to figure out how to get the indices of columns in a data frame given their names. It's a simple task, but a hard one to search Google for. Thanks to jashapiro, Matt, and Vince for the heads-up on the which() function, which returns the indices of the TRUE values in a logical vector.

If you're looking at the iris data:

```
data(iris)
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
```

And if you need to know the column numbers of "Sepal.Width" and "Species", use the which() function:

```
mycols <- c("Sepal.Width", "Species")
which(names(iris) %in% mycols)
[1] 2 5
which(names(iris) == "Petal.Length")
[1] 3
```

Simple.
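To see why this works, it can help to look at the intermediate logical vector that which() receives. The snippet below is a small sketch using the built-in iris data; the variable name `is_target` is just an illustrative choice, not something from the original example.

```r
data(iris)

# Comparing the column names against a single name gives one
# TRUE/FALSE value per column
is_target <- names(iris) == "Petal.Length"
is_target
# [1] FALSE FALSE  TRUE FALSE FALSE

# which() reports the positions of the TRUE values
which(is_target)
# [1] 3
```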

1. The 'match' function is even easier for this. For each element of its first argument, it returns the position of the first match in its second argument.

> match(mycols, colnames(iris))
[1] 2 5

2. Following Alex, note that if you had specified

> mycols <- c("Species", "Sepal.Width")

then your which() answer would still come back as 2 5, in column order, which is wrong if you care about retaining the order you asked for (I usually do). match() would give the right answer, 5 2.

Also, match has the potential to be faster if you are doing a lot of comparisons.

3. haha I totally spent a good 1/2 hr with this same question yesterday morning. Just saw the repost to R-bloggers

Thanks for both versions

4. brilliant, neat answer. thanks!
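The ordering difference raised in the comments can be sketched side by side. This is only an illustration on the built-in iris data; `idx_which`, `idx_match`, and the column name "Not.A.Column" are made-up names for the example.

```r
data(iris)
mycols <- c("Species", "Sepal.Width")

# which() with %in% walks the columns left to right, so the result
# follows the data frame's column order, not the order of mycols
idx_which <- which(names(iris) %in% mycols)
idx_which
# [1] 2 5

# match() looks up each element of mycols in turn, so the result
# follows the order of mycols
idx_match <- match(mycols, names(iris))
idx_match
# [1] 5 2

# A name that is not a column: match() gives NA,
# while which() gives an empty integer vector
match("Not.A.Column", names(iris))
# [1] NA
which(names(iris) == "Not.A.Column")
# integer(0)
```

Which behaviour is preferable depends on the use: the NA from match() flags a missing name explicitly, whereas which() silently returns fewer indices than names requested.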
