6.2 Understanding Types, Classes, and Coercion
6.2.2 Object Class
An object’s class is one of the most useful attributes for describing an entity in R. Every object you create is identified, either implicitly or explicitly, with at least one class. R is an object-oriented programming language, meaning entities are stored as objects and have methods that act upon them. In such a language, class identification is formally referred to as inheritance.
NOTE This section will focus on the most common classing structure used in R, called S3.
There is another structure, S4, which is essentially a more formal set of rules for the identification and treatment of different objects. For most practical intents and certainly for beginners, understanding and using S3 will be sufficient. You can find further details in R’s online documentation.
The class of an object is explicit in situations where you have user-defined object structures, or an object such as a factor vector or data frame where other attributes play an important part in the handling of the object itself. For example, the level labels of a factor vector and the variable names in a data frame are modifiable attributes that play a primary role in accessing the observations of each object. Elementary R objects such as vectors, matrices, and arrays, on the other hand, are implicitly classed, which means the class is not identified with the attributes function. Whether implicit or explicit, the class of a given object can always be retrieved using the attribute-specific function class.
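The difference between implicit and explicit classing can be sketched as follows. The object names plain.vec and colour.fac are hypothetical and are not used elsewhere in this section; the sketch assumes only base R.

```r
# Hypothetical example objects (names are illustrative only).
plain.vec <- c(2.5, 3.1, 4.7)                   # elementary vector: implicitly classed
colour.fac <- factor(c("Red", "Blue", "Red"))   # factor: explicitly classed

# attributes() shows only what is explicitly stored on the object.
attributes(plain.vec)    # NULL: the vector carries no attributes at all
attributes(colour.fac)   # a list with components $levels and $class

# class() works in both cases.
class(plain.vec)         # "numeric"
class(colour.fac)        # "factor"
```

The plain vector has no class attribute to inspect, yet class still reports a value for it, which is what "implicitly classed" means here.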
Stand-Alone Vectors
Let’s create some simple vectors to use as examples.
R> num.vec1 <- 1:4
R> num.vec1
[1] 1 2 3 4
R> num.vec2 <- seq(from=1,to=4,length=6)
R> num.vec2
[1] 1.0 1.6 2.2 2.8 3.4 4.0
R> char.vec <- c("a","few","strings","here")
R> char.vec
[1] "a"       "few"     "strings" "here"
R> logic.vec <- c(T,F,F,F,T,F,T,T)
R> logic.vec
[1]  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE
R> fac.vec <- factor(c("Blue","Blue","Green","Red","Green","Yellow"))
R> fac.vec
[1] Blue   Blue   Green  Red    Green  Yellow
Levels: Blue Green Red Yellow
You can pass any object to the class function, and it returns a character vector as output. Here are examples using the vectors just created:
R> class(num.vec1)
[1] "integer"
R> class(num.vec2)
[1] "numeric"
R> class(char.vec)
[1] "character"
R> class(logic.vec)
[1] "logical"
R> class(fac.vec)
[1] "factor"
The output from using class on the character vector, the logical vector, and the factor vector simply matches the kind of data that has been stored. The output from the number vectors is a little more intricate, however. So far, I’ve referred to any object with an arithmetically valid set of numbers as “numeric.” If all the numbers in a vector are whole and stored as integers, as is the case for the result of 1:4, then R identifies the vector as "integer". Numbers with decimal places (called floating-point numbers), on the other hand, are identified as "numeric". This distinction is necessary because some tasks strictly require integers, not floating-point numbers. Colloquially, I’ll continue to refer to both types as “numeric,” and in fact, the is.numeric function will return TRUE for both integer and floating-point structures, as you’ll see in Section 6.2.3.
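As a small illustration of the integer and floating-point distinction, the sketch below compares three ways of writing the same whole values. The object names are hypothetical. The point to note is that whole numbers typed without the L suffix are stored as floating-point values, so class reports "numeric" for them.

```r
# Hypothetical example objects (names are illustrative only).
whole.dbl <- c(1, 2, 3)      # whole values, but stored as floating-point
whole.int <- c(1L, 2L, 3L)   # the L suffix requests integer storage
colon.int <- 1:3             # the colon operator yields integers here

class(whole.dbl)   # "numeric"
class(whole.int)   # "integer"
class(colon.int)   # "integer"

# is.numeric() does not distinguish between the two.
is.numeric(whole.dbl)   # TRUE
is.numeric(whole.int)   # TRUE
```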
Other Data Structures
As mentioned earlier, R’s classes are essentially designed to facilitate object-oriented programming. As such, class usually reports on the nature of the data structure, rather than the type of data that’s stored; it returns the data type only when used on stand-alone vectors. Let’s try it on some matrices.
R> num.mat1 <- matrix(data=num.vec1,nrow=2,ncol=2)
R> num.mat1
     [,1] [,2]
[1,]    1    3
[2,]    2    4
R> num.mat2 <- matrix(data=num.vec2,nrow=2,ncol=3)
R> num.mat2
     [,1] [,2] [,3]
[1,]  1.0  2.2  3.4
[2,]  1.6  2.8  4.0
R> char.mat <- matrix(data=char.vec,nrow=2,ncol=2)
R> char.mat
     [,1]  [,2]
[1,] "a"   "strings"
[2,] "few" "here"
R> logic.mat <- matrix(data=logic.vec,nrow=4,ncol=2)
R> logic.mat
      [,1]  [,2]
[1,]  TRUE  TRUE
[2,] FALSE FALSE
[3,] FALSE  TRUE
[4,] FALSE  TRUE
Note from Section 4.3.1 that factors are used only in vector form, so fac.vec is not included here. Now check these matrices with class.
R> class(num.mat1)
[1] "matrix"
R> class(num.mat2)
[1] "matrix"
R> class(char.mat)
[1] "matrix"
R> class(logic.mat)
[1] "matrix"
You see that regardless of the data type, class reports the structure of the object itself: all matrices. The same is true for other object structures, like arrays, lists, and data frames.
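To illustrate the same behavior on the other structures just mentioned, here is a minimal sketch using hypothetical objects demo.arr, demo.list, and demo.df. One caveat: in R 4.0.0 and later, class applied to a matrix returns the two-element vector "matrix" "array" rather than "matrix" alone, so output for matrices may differ from an older R session.

```r
# Hypothetical example objects (names are illustrative only).
demo.arr <- array(data=1:24, dim=c(3,4,2))      # three-dimensional array
demo.list <- list(1:3, "a", TRUE)               # list with mixed contents
demo.df <- data.frame(x=1:3, y=c("a","b","c"))  # small data frame

# class() describes the structure, not the data stored inside it.
class(demo.arr)    # "array"
class(demo.list)   # "list"
class(demo.df)     # "data.frame"
```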
Multiple Classes
Certain objects will have multiple classes. A variant on a standard form of an object, such as an ordered factor vector, will inherit the usual factor class and also contain the additional ordered class. Both are returned if you use the class function.
R> ordfac.vec <- factor(x=c("Small","Large","Large","Regular","Small"),
                        levels=c("Small","Regular","Large"),
                        ordered=TRUE)
R> ordfac.vec
[1] Small   Large   Large   Regular Small
Levels: Small < Regular < Large
R> class(ordfac.vec)
[1] "ordered" "factor"
Earlier, fac.vec was identified as "factor" only, but the class of ordfac.vec has two components. It’s still identified as "factor", but it also includes "ordered", which identifies the variant of the "factor" class also present in the object. Here, you can think of "ordered" as a subclass of "factor". In other words, it is a special case that inherits from, and therefore behaves like, a "factor". For further technical details on R subclasses, I recommend Chapter 9 of The Art of R Programming by Matloff (2011).
NOTE I have focused on the class function here because it’s directly relevant to the object-oriented programming style exercised in this text, especially in Part II. There are other functions that show some of the complexities of R’s classing rules. For example, the function typeof reports the type of data contained within an object, not just for vectors but also for matrices and arrays. Note, however, that the terminology in the output of typeof doesn’t always match the output of class. See the help file ?typeof for details on the values it returns.
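A brief sketch of how typeof differs from class, using hypothetical objects int.mat, dbl.mat, and col.fac. Note the terminology mismatch: typeof reports "double" where class on a stand-alone vector would report "numeric".

```r
# Hypothetical example objects (names are illustrative only).
int.mat <- matrix(1:4, nrow=2, ncol=2)
dbl.mat <- matrix(c(1.5, 2.5, 3.5, 4.5), nrow=2, ncol=2)
col.fac <- factor(c("Blue","Green","Blue"))

# typeof() looks through the structure to the stored data type.
typeof(int.mat)   # "integer"
typeof(dbl.mat)   # "double"

# A factor stores integer codes internally, with the labels kept as levels.
typeof(col.fac)   # "integer"
class(col.fac)    # "factor"
```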
To summarize, an object’s class is first and foremost a descriptor of the data structure, though for simple vectors, the class function reports the type of data stored. If the vector entries are whole numbers stored as integers, then R classes the vector as "integer", whereas "numeric" is used to label a vector with floating-point numbers.