1 Data structures
Q: What are the six types of atomic vector? How does a list differ from an atomic vector?
A: The six types are logical, integer, double, character, complex and raw. The elements of a list don’t have to be of the same type.
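As a quick check of the answer above, the following sketch builds one value of each atomic type and asks typeof() for its storage type. The variable names (examples, mixed) are only illustrative.

```r
# One example value per atomic type; typeof() reports the storage type.
examples <- list(
  logical   = TRUE,
  integer   = 1L,
  double    = 1.5,
  character = "a",
  complex   = 1 + 2i,
  raw       = as.raw(1)
)
types <- vapply(examples, typeof, character(1))
print(types)

# A list can mix types; each element keeps its own type.
mixed <- list(TRUE, 1L, "a")
mixed_types <- vapply(mixed, typeof, character(1))
print(mixed_types)
```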
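Q: What makes is.vector() and is.numeric() fundamentally different to is.list() and is.character()?
A: The first two tests don’t check for a specific type. is.vector() returns TRUE for any atomic vector or list that has no attributes other than names, and is.numeric() returns TRUE for both integer and double vectors.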
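A small sketch of how these predicates behave; the attribute name my_attr is made up for the example.

```r
# is.vector() is TRUE for atomic vectors and lists alike,
# as long as there are no attributes other than names.
v_atomic <- is.vector(1:3)                          # TRUE
v_list   <- is.vector(list(1, "a"))                 # TRUE
v_attr   <- is.vector(structure(1:3, my_attr = "x"))  # FALSE

# is.numeric() is TRUE for both integer and double vectors.
n_int <- is.numeric(1L)    # TRUE
n_dbl <- is.numeric(1.5)   # TRUE

# is.list() and is.character() each test for one specific type.
l_int <- is.list(1:3)        # FALSE
c_chr <- is.character("a")   # TRUE
```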
Q: Test your knowledge of vector coercion rules by predicting the output of the following uses of c():
c(1, FALSE)      # will be coerced to numeric (double) -> 1 0
c("a", 1)        # will be coerced to character -> "a" "1"
c(list(1), "a")  # will be coerced to a list with two elements of type double and character
c(TRUE, 1L)      # will be coerced to integer -> 1 1
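Q: Why do you need to use unlist() to convert a list to an atomic vector? Why doesn’t as.vector() work?
A: A list is already a vector, so as.vector() returns it unchanged. unlist() is needed to get rid of (flatten) the nested structure.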
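The difference can be seen on a small nested list. This is only a sketch; the object names are illustrative.

```r
nested <- list(1, list(2, 3))

# as.vector() leaves a list as a list, because a list already is a vector.
via_as_vector <- as.vector(nested)
is.list(via_as_vector)   # TRUE

# unlist() flattens the nesting and returns an atomic vector.
via_unlist <- unlist(nested)
is.atomic(via_unlist)    # TRUE
print(via_unlist)
```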
Q: Why is 1 == "1" true? Why is -1 < FALSE true? Why is "one" < 2 false?
A: These operators are all functions that coerce their arguments to a common type: character, double and character in these three cases, respectively. To clarify the last case: "one" sorts after "2", because digits come before letters in ASCII.
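One way to convince yourself of the coercion is to perform it by hand and compare the results. A minimal sketch:

```r
# 1 is coerced to "1" before the comparison.
r1 <- 1 == "1"
same1 <- identical(r1, as.character(1) == "1")

# FALSE is coerced to 0, and -1 < 0 holds.
r2 <- -1 < FALSE
same2 <- identical(r2, -1 < as.double(FALSE))

# 2 is coerced to "2", so this is a comparison of two strings.
r3 <- "one" < 2
same3 <- identical(r3, "one" < as.character(2))
```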
Q: Why is the default missing value, NA, a logical vector? What’s special about logical vectors? (Hint: think about c(FALSE, NA_character_).)
A: It is a practical choice. When you combine NAs in c() with other atomic types, they are coerced like TRUE and FALSE to integer (NA_integer_), double (NA_real_) and character (NA_character_). Recall that in R there is a coercion hierarchy that goes logical -> integer -> double -> character. If NA were, for example, a character, including NA in a set of integers or logicals would coerce them to characters, which would be undesirable. Making NA a logical means that including an NA in a dataset (which happens often) will not result in coercion.
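The effect is easy to check with typeof(). The sketch below combines the logical NA with other types and then shows what a character NA would do instead.

```r
t_na  <- typeof(NA)            # "logical"
t_int <- typeof(c(NA, 1L))     # "integer": NA adapts to the integer
t_dbl <- typeof(c(NA, 1.5))    # "double"
t_chr <- typeof(c(NA, "a"))    # "character"

# A character NA drags the other elements up to character.
t_forced <- typeof(c(NA_character_, 1L))   # "character"
t_hint   <- typeof(c(FALSE, NA_character_))  # "character"
```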
Q: An early draft used this code to illustrate structure():
structure(1:5, comment = "my attribute")
#> [1] 1 2 3 4 5
But when you print that object you don’t see the comment attribute. Why? Is the attribute missing, or is there something else special about it? (Hint: try using help.)
A: The attribute is not missing; it is stored but skipped when printing. From the help of comment:
Contrary to other attributes, the comment is not printed (by print or print.default).
Also from the help of attributes:
Note that some attributes (namely class, comment, dim, dimnames, names, row.names and tsp) are treated specially and have restrictions on the values which can be set.
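A short sketch confirming that the attribute is stored even though print() does not show it:

```r
x <- structure(1:5, comment = "my attribute")

# The printed representation does not mention the comment.
printed <- capture.output(print(x))
print(printed)

# The attribute is still there and can be read back.
stored  <- comment(x)
via_attr <- attr(x, "comment")
print(stored)
```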
Q: What happens to a factor when you modify its levels?
f1 <- factor(letters)
levels(f1) <- rev(levels(f1))
A: Both the entries of the factor and its levels are reversed:
f1
#> [1] z y x w v u t s r q p o n m l k j i h g f e d c b a
#> Levels: z y x w v u t s r q p o n m l k j i h g f e d c b a
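The reason both appear reversed is that a factor stores integer codes plus a levels attribute, and assigning new levels only relabels the codes. A minimal sketch (codes_before and codes_after are illustrative names):

```r
f1 <- factor(letters)
codes_before <- as.integer(f1)

levels(f1) <- rev(levels(f1))
codes_after <- as.integer(f1)

# The underlying integer codes did not change ...
codes_unchanged <- identical(codes_before, codes_after)   # TRUE

# ... but code 1 is now labelled "z" instead of "a".
first_entry <- as.character(f1)[1]   # "z"
first_level <- levels(f1)[1]         # "z"
```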
Q: What does this code do? How do f2 and f3 differ from f1?
f2 <- rev(factor(letters))                    # changes only the entries of the factor
f3 <- factor(letters, levels = rev(letters))  # changes only the levels of the factor
A: Unlike f1, f2 and f3 change only one thing. They change the order of the factor’s entries or of its levels, but not both at the same time.
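Looking at the first few entries, levels and integer codes makes the difference concrete. This is a sketch for inspection only:

```r
f2 <- rev(factor(letters))
f3 <- factor(letters, levels = rev(letters))

# f2: entries are reversed, levels stay in alphabetical order.
f2_entries <- as.character(f2)[1:3]   # "z" "y" "x"
f2_levels  <- levels(f2)[1:3]         # "a" "b" "c"

# f3: entries stay in alphabetical order, levels are reversed.
f3_entries <- as.character(f3)[1:3]   # "a" "b" "c"
f3_levels  <- levels(f3)[1:3]         # "z" "y" "x"
f3_codes   <- as.integer(f3)[1:3]     # 26 25 24
```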
1.3 Matrices and arrays
Q: What does dim() return when applied to a vector?
A: NULL.
Q: If is.matrix(x) is TRUE, what will is.array(x) return?
A: TRUE, as also documented in the help of array:
A two-dimensional array is the same thing as a matrix.
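Both points can be checked in a few lines. The objects m and a3 are illustrative.

```r
# A plain vector has no dim attribute.
vec_dim <- dim(1:3)   # NULL

# A matrix is a two-dimensional array, so both tests succeed.
m <- matrix(1:6, nrow = 2)
m_is_matrix <- is.matrix(m)   # TRUE
m_is_array  <- is.array(m)    # TRUE

# The reverse does not hold for arrays with more than two dimensions.
a3 <- array(1:24, c(2, 3, 4))
a3_is_matrix <- is.matrix(a3)   # FALSE
a3_is_array  <- is.array(a3)    # TRUE
```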
Q: How would you describe the following three objects? What makes them different to 1:5?
x1 <- array(1:5, c(1, 1, 5))  # 1 row, 1 column, 5 in third dimension
x2 <- array(1:5, c(1, 5, 1))  # 1 row, 5 columns, 1 in third dimension
x3 <- array(1:5, c(5, 1, 1))  # 5 rows, 1 column, 1 in third dimension
A: They are of class array and so they have a dim attribute, which 1:5 does not.
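A brief sketch showing that the three arrays hold the same data as 1:5 and differ only in their dim attribute:

```r
x1 <- array(1:5, c(1, 1, 5))
x2 <- array(1:5, c(1, 5, 1))
x3 <- array(1:5, c(5, 1, 1))

d1 <- dim(x1)   # 1 1 5
d2 <- dim(x2)   # 1 5 1
d3 <- dim(x3)   # 5 1 1

# Stripping the attributes recovers the plain vector.
same_data <- identical(as.vector(x1), 1:5) &&
  identical(as.vector(x2), 1:5) &&
  identical(as.vector(x3), 1:5)
```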
1.4 Data frames
Q: What attributes does a data frame possess?
A: names, row.names and class.
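These attributes can be listed with attributes(). The data frame df below is a made-up example.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Names of all attributes stored on the data frame.
attr_names <- names(attributes(df))
print(attr_names)

df_class <- class(df)       # "data.frame"
df_cols  <- names(df)       # "x" "y"
df_rows  <- row.names(df)   # "1" "2" "3"
```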
Q: What does as.matrix() do when applied to a data frame with columns of different types?
A: From the help of as.matrix:
The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column, applying as.vector to factors and format to other non-character columns. Otherwise the usual coercion hierarchy (logical < integer < double < complex) will be used, e.g., all-logical data frames will be coerced to a logical matrix, mixed logical-integer will give a integer matrix, etc.
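The documented rules can be tried on a few small data frames. The data frames here are invented for illustration.

```r
# A character column forces a character matrix.
df_mixed <- data.frame(n = 1:2, s = c("a", "b"))
t_mixed <- typeof(as.matrix(df_mixed))       # "character"

# Logical and integer columns give an integer matrix.
df_lgl_int <- data.frame(l = c(TRUE, FALSE), i = 1:2)
t_lgl_int <- typeof(as.matrix(df_lgl_int))   # "integer"

# Integer and double columns give a double matrix.
df_int_dbl <- data.frame(i = 1:2, d = c(1.5, 2.5))
t_int_dbl <- typeof(as.matrix(df_int_dbl))   # "double"

# All-logical columns stay logical.
df_lgl <- data.frame(a = c(TRUE, FALSE), b = c(FALSE, NA))
t_lgl <- typeof(as.matrix(df_lgl))           # "logical"
```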
Q: Can you have a data frame with 0 rows? What about 0 columns?
A: Yes, you can create them easily. Both dimensions can also be 0:
# here we use the recycling rules for logical subsetting, but you could
# also subset with 0, a negative index or a zero-length atomic (i.e.
# logical(0), character(0), integer(0), double(0))
iris[FALSE, ]
#> [1] Sepal.Length Sepal.Width  Petal.Length Petal.Width  Species
#> <0 rows> (or 0-length row.names)
iris[, FALSE] # or iris[FALSE]
#> data frame with 0 columns and 150 rows
iris[FALSE, FALSE] # or just data.frame()
#> data frame with 0 columns and 0 rows
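Checking dim() on each result confirms which dimension is empty. A minimal sketch using the built-in iris data:

```r
d_no_rows <- dim(iris[FALSE, ])       # 0 5
d_no_cols <- dim(iris[, FALSE])       # 150 0
d_neither <- dim(iris[FALSE, FALSE])  # 0 0
d_empty   <- dim(data.frame())        # 0 0

# Subsetting with a zero-length index gives the same zero-row result.
d_zero_idx <- dim(iris[integer(0), ])  # 0 5
```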