
Aesthetic Mapping with Geoms

In document The Book of R (Page 177-182)

7.4 The ggplot2 Package

7.4.3 Aesthetic Mapping with Geoms

Geoms and ggplot2 also provide efficient, automated ways to apply different styles to different subsets of a plot. If you split a data set into categories using a factor object, ggplot2 can automatically apply particular styles to different categories. In ggplot2’s documentation, the factor that holds these categories is called a variable, which ggplot2 can map to aesthetic values. This gets rid of much of the effort that goes into isolating subsets of data and plotting them separately using base R graphics (as you did in Section 7.3).

All this is best illustrated with an example. Let’s return to the 20 observations you manually plotted, step-by-step, to produce the elaborate plot in Figure 7-6.

R> x <- 1:20

R> y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23, -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)

In Section 7.3, you defined several categories that classified each observation as either “standard,” “sweet,” “too big,” or “too small” based on their x and y values. Using those same classification rules, let’s explicitly define a factor to correspond to x and y.

R> ptype <- rep(NA,length(x=x))
R> ptype[y>=5] <- "too_big"
R> ptype[y<=-5] <- "too_small"
R> ptype[(x>=5&x<=15)&(y>-5&y<5)] <- "sweet"
R> ptype[(x<5|x>15)&(y>-5&y<5)] <- "standard"
R> ptype <- factor(x=ptype)
R> ptype

 [1] standard  standard  standard  standard  sweet     sweet     too_big
 [8] too_big   sweet     too_small too_small too_small sweet     too_big
[15] too_big   standard  too_small too_small standard  too_big

Levels: standard sweet too_big too_small
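As a quick sanity check on the classification, you can confirm that every observation received a category and count how many fall into each level. This is only a sketch: the object name counts is illustrative and does not appear in the text.

```r
x <- 1:20
y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23,
       -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)

# Same classification rules as in the text
ptype <- rep(NA,length(x=x))
ptype[y>=5] <- "too_big"
ptype[y<=-5] <- "too_small"
ptype[(x>=5&x<=15)&(y>-5&y<5)] <- "sweet"
ptype[(x<5|x>15)&(y>-5&y<5)] <- "standard"
ptype <- factor(x=ptype)

# TRUE here would mean some observation matched none of the rules
anyNA(ptype)

# Observations per level (illustrative name): 6 standard, 4 sweet,
# 5 too_big, 5 too_small
counts <- table(ptype)
counts
```

If anyNA returned TRUE, the rules would not cover every (x, y) pair, and the unclassified points would show up as a separate NA entry in the plot legend.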

Now you have a factor with 20 values sorted into four levels. You’ll use this factor to tell qplot how to map your aesthetics. Here’s a simple way to do that:

R> qplot(x,y,color=ptype,shape=ptype)

This single line of code produces the left plot in Figure 7-11, which separates the four categories by color and point character and even provides a legend. This was all done by the aesthetic mapping in the call to qplot, where you set color and shape to be mapped to the ptype variable.
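The same mapping can be written with ggplot and aes, which may be useful because qplot is deprecated as of ggplot2 3.4.0. The following is a minimal sketch rather than the book’s own code: the data frame name pts is an assumption, and the nested ifelse calls are just a compact restatement of the classification rules above.

```r
library(ggplot2)

x <- 1:20
y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23,
       -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)

# Compact restatement of the four classification rules
ptype <- factor(ifelse(y >= 5, "too_big",
                ifelse(y <= -5, "too_small",
                ifelse(x >= 5 & x <= 15, "sweet", "standard"))))

# ggplot() expects a data frame; 'pts' is an illustrative name
pts <- data.frame(x = x, y = y, ptype = ptype)

# Map both color and shape to the factor, then draw the points
p <- ggplot(pts, aes(x = x, y = y, color = ptype, shape = ptype)) +
  geom_point()
print(p)
```

Placing the mapping in the ggplot call makes it the default for every geom added afterward, which mirrors how the mappings given to qplot behave.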

Figure 7-11: Demonstration of aesthetic mapping using qplot and geoms in ggplot2. Left: The initial call to qplot, which maps point character and color using ptype. Right: Augmenting the left plot using various geoms to override the default mappings.

Now, let’s replot these data using the same qplot object along with a suite of geom modifications in order to get something more like Figure 7-6. Executing the following produces the plot on the right of Figure 7-11:

R> qplot(x,y,color=ptype,shape=ptype) + geom_point(size=4) +
     geom_line(mapping=aes(group=1),color="black",lty=2) +
     geom_hline(mapping=aes(yintercept=c(-5,5)),color="red") +
     geom_segment(mapping=aes(x=5,y=-5,xend=5,yend=5),color="red",lty=3) +
     geom_segment(mapping=aes(x=15,y=-5,xend=15,yend=5),color="red",lty=3)

In the first line, you add geom_point(size=4) to increase the size of all the points on the graph. In the lines that follow, you add a line connecting all the points, plus horizontal and vertical lines to mark out the sweet spot. For those last four lines, you have to use aes to set alternate aesthetic mappings for the point categories. Let’s look a little closer at what’s going on there.

Since you used ptype for aesthetic mapping in the initial call to qplot, by default all other geoms will be mapped to each category in the same way, unless you override that default mapping with aes. For example, when you call geom_line to connect all the points, if you were to stick with the default mapping to ptype instead of including mapping=aes(group=1), this geom would draw lines connecting points within each category. You would see four separate dashed lines: one connecting all “standard” points, another connecting all “sweet” points, and so on. But that’s not what you want here; you want a line that connects all of the points, from left to right. So, you tell geom_line to treat all the observations as one group by entering aes(group=1).
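The effect of aes(group=1) can be checked by building both versions of the line layer and counting the groups ggplot2 computes for each. This is a sketch under assumptions: the names pts, by_type, one_line, and n_groups are illustrative, and ggplot is used in place of qplot.

```r
library(ggplot2)

x <- 1:20
y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23,
       -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)
ptype <- factor(ifelse(y >= 5, "too_big",
                ifelse(y <= -5, "too_small",
                ifelse(x >= 5 & x <= 15, "sweet", "standard"))))
pts <- data.frame(x = x, y = y, ptype = ptype)

# Default behavior: the line geom inherits the mapping to ptype,
# so points are connected only within their own category
by_type <- ggplot(pts, aes(x = x, y = y, color = ptype)) +
  geom_line(linetype = 2)

# Override: group=1 places every observation in a single group
one_line <- ggplot(pts, aes(x = x, y = y, color = ptype)) +
  geom_line(mapping = aes(group = 1), color = "black", linetype = 2)

# Illustrative helper: number of distinct groups in the first layer
n_groups <- function(p) length(unique(layer_data(p, 1)$group))

n_groups(by_type)   # 4, one line per category
n_groups(one_line)  # 1, a single line through all points
```

layer_data returns the data frame that ggplot2 computes for a layer, so its group column shows directly how the observations were partitioned before drawing.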

After that, you use the geom_hline function to draw horizontal lines at y = −5 and y = 5 using its yintercept argument, again passed to aes to redefine that geom’s mapping. In this case, you need to redefine the mapping to operate on the vector c(-5,5), rather than using the observed data in x and y. Similarly, you end by using geom_segment to draw the two vertical dotted line segments. geom_segment operates much like segments: you redefine the mapping based on a “from” coordinate (arguments x and y) and a “to” coordinate (xend and yend here). Since the first geom, geom_point(size=4), sets a constant enlarged size for every plotted point, it doesn’t matter how the geom is mapped because it simply makes a uniform change to each point.

Plotting in R, from base graphics to contributed packages like ggplot2, stays true to the nature of the language. The element-wise matching allows you to create intricate plots with a handful of straightforward and intuitive functions. Once you display a plot, you can save it to the hard drive by selecting the graphics device and choosing File → Save. However, you can also write plots to a file directly, as you’ll see momentarily in Section 8.3.
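Returning to the reference lines in the right-hand plot of Figure 7-11: because their positions are fixed constants rather than data, one alternative is to supply them as plain arguments instead of through aes. The sketch below is not the book’s code. It assumes a data frame named pts, passes yintercept directly to geom_hline, and uses annotate("segment", ...) for the two vertical segments; this form may be more robust in ggplot2 versions that require aesthetics inside aes to have length 1 or the same length as the data.

```r
library(ggplot2)

x <- 1:20
y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23,
       -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)
ptype <- factor(ifelse(y >= 5, "too_big",
                ifelse(y <= -5, "too_small",
                ifelse(x >= 5 & x <= 15, "sweet", "standard"))))
pts <- data.frame(x = x, y = y, ptype = ptype)

p <- ggplot(pts, aes(x = x, y = y, color = ptype, shape = ptype)) +
  geom_point(size = 4) +
  # One dashed black line through all points, left to right
  geom_line(mapping = aes(group = 1), color = "black", linetype = 2) +
  # Horizontal boundaries given as a parameter, not as a mapping
  geom_hline(yintercept = c(-5, 5), color = "red") +
  # Both vertical dotted segments in a single annotation layer
  annotate("segment", x = c(5, 15), xend = c(5, 15), y = -5, yend = 5,
           color = "red", linetype = 3)
print(p)
```

Constants passed this way are not tied to the plot’s default mapping, so the reference lines are drawn once in red and do not add entries to the legend.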

The graphical capabilities explored in this section are merely the tip of the iceberg, and you’ll continue to use data visualizations from this point onward.

Exercise 7.2

In Exercise 7.1 (b), you used base R graphics to plot some weight and height data, distinguishing males and females using different points or colors. Repeat this task using ggplot2.

Important Code in This Chapter

Function/operator    Brief description              First occurrence
plot                 Create/display base R plot     Section 7.1, p. 128
type                 Set plot type                  Section 7.2.1, p. 130
main, xlab, ylab     Set axis labels                Section 7.2.2, p. 130
col                  Set point/line color           Section 7.2.3, p. 131
pch, cex             Set point type/size            Section 7.2.4, p. 133
lty, lwd             Set line type/width            Section 7.2.4, p. 133
xlim, ylim           Set plot region limits         Section 7.2.5, p. 134
abline               Add vertical/horizontal line   Section 7.3, p. 137
segments             Add specific line segments     Section 7.3, p. 137
points               Add points                     Section 7.3, p. 137
lines                Add lines following coords     Section 7.3, p. 138
arrows               Add arrows                     Section 7.3, p. 138
text                 Add text                       Section 7.3, p. 138
legend               Add/control legend             Section 7.3, p. 138
qplot                Create ggplot2 “quick plot”    Section 7.4.1, p. 140
geom_point           Add points geom                Section 7.4.2, p. 141
geom_line            Add lines geom                 Section 7.4.2, p. 141
size, shape, color   Set geom constants             Section 7.4.2, p. 142
linetype             Set geom line type             Section 7.4.2, p. 142
mapping, aes         Geom aesthetic mapping         Section 7.4.3, p. 145
geom_hline           Add horizontal lines geom      Section 7.4.3, p. 145
geom_segment         Add line segments geom         Section 7.4.3, p. 145

8 Reading and Writing Files

Now I’ll cover one more fundamental aspect of working with R: loading and saving data in an active workspace by reading and writing files. Typically, to work with a large data set, you’ll need to read in the data from an external file, whether it’s stored as plain text, in a spreadsheet file, or on a website. R provides command line functions you can use to import these data sets, usually as a data frame object. You can also export data frames from R by writing a new file on your computer, plus you can save any plots you create as image files. In this chapter, I’ll go over some useful command-based read and write operations for importing and exporting data.
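To give a flavor of the export and import round trip described here, the following sketch writes a small data frame to a temporary CSV file and reads it back using base R’s write.csv and read.csv. The data values and the names scores, path, and scores_in are made up for illustration.

```r
# A small, made-up data frame to export
scores <- data.frame(id = 1:3, score = c(2.5, 3.7, 1.2))

# Write to a temporary file so nothing permanent is left on disk
path <- tempfile(fileext = ".csv")
write.csv(scores, file = path, row.names = FALSE)

# Read the file back in; the result is again a data frame
scores_in <- read.csv(file = path)
str(scores_in)

# Remove the temporary file
unlink(path)
```

Setting row.names = FALSE keeps R from writing its row labels as an extra column, so the file read back has the same columns as the original.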

8.1 R-Ready Data Sets

First, let’s take a brief look at some of the data sets that are built into the software or are part of user-contributed packages. These data sets are useful samples to practice with and to experiment with functionality.

Enter data() at the prompt to bring up a window listing these ready-to-use data sets along with a one-line description. These data sets are organized in alphabetical order by name and grouped by package (the exact list that appears will depend on what contributed packages have been installed from CRAN; see Section A.2).
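The same listing can also be inspected programmatically rather than in a window. As a sketch, the value returned by data has a results component, a character matrix with Item and Title columns among others; the names ds and catalog below are illustrative, and the listing is restricted to the built-in datasets package.

```r
# Restrict the listing to the built-in 'datasets' package
ds <- data(package = "datasets")

# Keep the data set names and their one-line descriptions
catalog <- as.data.frame(ds$results[, c("Item", "Title")])

# First few entries and the total number of listed data sets
head(catalog)
nrow(catalog)
```

Changing the package argument to the name of an installed contributed package lists that package’s data sets instead.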
