Basics
There are a number of operators that can be used to extract subsets of R objects.
[ - returns an object of the same class as the original (the exception is matrix subsetting, covered below, where dimensions are dropped by default); can be used to select more than one element
[[ - used to extract elements of a list or data frame; can only extract a single element, and the class of the returned object will not necessarily be a list or data frame
$ - used to extract elements of a list or data frame by name; semantics are similar to those of [[
> x <- c("a", "b", "c", "c", "d", "a")
> x[1]
[1] "a"
> x[2]
[1] "b"
> x[1:4]
[1] "a" "b" "c" "c"
> x[x > "a"]
[1] "b" "c" "c" "d"
> u <- x > "a"
> u
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE
> x[u]
[1] "b" "c" "c" "d"
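As a small sketch of what the logical index above is doing: the comparison is evaluated element by element, and the resulting TRUE/FALSE vector picks out positions. The variable names kept and kept_by_position are illustrative only, and which() is used here just to make the selected positions explicit.

```r
x <- c("a", "b", "c", "c", "d", "a")

# The comparison runs element by element and yields a logical vector
u <- x > "a"

# Subsetting with the logical vector keeps the positions where u is TRUE
kept <- x[u]

# which() turns the logical vector into the matching positions,
# so indexing by those positions selects the same elements
kept_by_position <- x[which(u)]

# The result of [ has the same class as the original vector
class(kept)
```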
Lists
> x <- list(foo = 1:4, bar = 0.6)
First element is foo. Second element is bar.
> x[1] # returns list with sequence
$foo
[1] 1 2 3 4

> x[[1]] # returns sequence from list
[1] 1 2 3 4
If you can't remember the position of "bar" in the list, you can access it using its name rather than its index.
> x$bar # returns element associated with "bar"
[1] 0.6
> x[["bar"]] # equivalent to above
[1] 0.6
> x["bar"] # returns list with element
$bar
[1] 0.6

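One way to see the difference between the three forms above is to check what kind of object each one returns. This is a minimal sketch; the names sub_list, element, and by_name are made up for illustration.

```r
x <- list(foo = 1:4, bar = 0.6)

sub_list <- x["bar"]    # [ keeps the container: a list of length 1
element  <- x[["bar"]]  # [[ pulls out the element itself
by_name  <- x$bar       # $ does the same using a literal name

class(sub_list)  # a list
class(element)   # a numeric vector
```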
To extract multiple elements from a list, use the [] operator.
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> x[c(1, 3)]
$foo
[1] 1 2 3 4

$baz
[1] "hello"

You can't use the [[]] or $ operators to extract multiple elements from a list.
The [[]] operator can be used with computed indices, such as a variable that holds a name; $ can only be used with literal names.
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> name <- "foo"
> x[[name]]
[1] 1 2 3 4
> x$name
NULL
> x$foo
[1] 1 2 3 4
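Computed indices matter most when the name is not known until the code runs, for example when looping over every element of a list. The sketch below assumes the same list as above; element_classes and nm are illustrative names, not part of the original notes.

```r
x <- list(foo = 1:4, bar = 0.6, baz = "hello")

# nm holds each name in turn, so [[ is needed; x$nm would look for
# an element literally called "nm" and return NULL
element_classes <- vapply(
  names(x),
  function(nm) class(x[[nm]]),
  character(1)
)

element_classes
```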
[[]] can take an integer sequence, which it applies recursively to reach elements of nested lists.
> x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
> x[[c(1, 3)]]
[1] 14
> x[[1]][[3]]
[1] 14
> x[[c(2, 1)]]
[1] 3.14
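The equivalence shown above can be checked directly. In this sketch, path is a hypothetical variable holding the sequence of positions to follow: first into x, then into the element found there.

```r
x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))

path <- c(1, 3)

# One call with the whole path ...
recursive <- x[[path]]

# ... gives the same element as applying [[ one level at a time
stepwise <- x[[path[1]]][[path[2]]]

identical(recursive, stepwise)
```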
Matrices
> x <- matrix(1:6, 2, 3)
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> x[1, 2]
[1] 3
> x[2, 1]
[1] 2
Indices can also be missing, which selects the entire row or column.
> x[1, ]
[1] 1 3 5
> x[, 2]
[1] 3 4
By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a 1 x 1 matrix. Likewise, a single row or column is returned as a vector rather than a matrix. This behavior can be turned off by setting drop = FALSE.
> x[1, 2, drop = FALSE]
     [,1]
[1,]    3
> x[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5
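A quick way to see what drop = FALSE changes is to inspect dim() on each result: a plain vector has no dimensions, while a matrix always has two. The sketch below uses the same matrix as above.

```r
x <- matrix(1:6, 2, 3)

# Default behavior: the row comes back as a plain vector
dim(x[1, ])                  # NULL, no dimensions

# With drop = FALSE the result stays a matrix
dim(x[1, , drop = FALSE])    # one row, three columns
dim(x[, 2, drop = FALSE])    # two rows, one column
dim(x[1, 2, drop = FALSE])   # a 1 x 1 matrix
```

Keeping the dimensions can be useful when later code expects a matrix regardless of how many rows or columns were selected.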
Partial Matching
Partial matching of names is allowed with [[]] and $.
$ looks for a name in the list that begins with "a".
> x <- list(aardvark = 1:5)
> x$a
[1] 1 2 3 4 5
By default, [[]] looks for a name that's an exact match.
> x[["a"]]
NULL
The exact = FALSE argument drops the exactness requirement.
> x[["a", exact = FALSE]]
[1] 1 2 3 4 5
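The three behaviors above can be put side by side. This sketch also adds one case that is not in the notes above: a second list, y, with a hypothetical extra element apple, to show that a partial match only succeeds when the prefix identifies a single name.

```r
x <- list(aardvark = 1:5)

unique_prefix <- x$a                       # partial match succeeds
exact_only    <- x[["a"]]                  # exact match required: NULL
relaxed       <- x[["a", exact = FALSE]]   # partial match allowed

# Assumed extra case: two names share the prefix "a",
# so the partial match is ambiguous and $ returns NULL
y <- list(aardvark = 1:5, apple = "fruit")
ambiguous <- y$a
```

Because an ambiguous or missing match silently yields NULL, spelling out full names is the safer habit in scripts.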
Removing Missing (NA) Values
> x <- c(1, 2, NA, 4, NA, 5)
> bad <- is.na(x)
> x[!bad]
[1] 1 2 4 5
> y <- c("a", "b", NA, "d", NA, "f")
> good <- complete.cases(x, y)
> good
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
> x[good]
[1] 1 2 4 5
> y[good]
[1] "a" "b" "d" "f"
You can also use complete.cases to remove missing values from data frames. To get the rows of a data frame in which no values are missing:
> good <- complete.cases(dataframename)
> dataframename[good, ]
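As a concrete sketch of the pattern above, here is a small made-up data frame; df, its columns, and its values are invented purely for illustration. A row is kept only if every column in that row is non-missing.

```r
# Hypothetical data: rows 2 and 5 lack a score, row 3 lacks a label
df <- data.frame(
  id    = 1:5,
  score = c(10.5, NA, 7.2, 8.8, NA),
  label = c("a", "b", NA, "d", "e")
)

# One logical value per row: TRUE when the row has no NA in any column
good <- complete.cases(df)

# Keep only the complete rows; the empty column index keeps all columns
clean <- df[good, ]
clean
```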