Generally speaking, each call to plot will refresh the active graphics device for a new plotting region. But this is not always desired. To build more complicated plots, it’s easiest to start with an empty plotting region and progressively add any required points, lines, text, and legends to this canvas. Here are some useful, ready-to-use functions in R that will add to a plot without refreshing or clearing the window:
points Adds points
lines, abline, segments Adds lines
text Writes text
arrows Adds arrows
legend Adds a legend
The syntax for calling and setting parameters for these functions is the same as for plot. The best way to see how these work is through an extended example, which I’ll base on some hypothetical data made up of 20 (x, y) locations.
R> x <- 1:20
R> y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23, -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)
Using these data, you’ll build up the plot shown in Figure 7-6 (note that you may need to manually enlarge your graphics device and replot to ensure the legend doesn’t overlap other features of the image). It’s worth remembering a generally accepted rule in plotting: “keep it clear and simple.” Figure 7-6 is an exception for the sake of demonstrating the R commands used.
Figure 7-6: An elaborate final plot of some hypothetical data
In Figure 7-6, the data points will be plotted differently according to their x and y locations, depending on their relation to the “sweet spot” pointed out in the figure. Points with a y value of 5 or more are marked with a purple ×; points with a y value of −5 or less are marked with a green +. Points between these two y values but still outside of the sweet spot are marked with a ◦. Finally, points in the sweet spot (with x between 5 and 15 and with y between −5 and 5) are marked as a blue •. Red horizontal and vertical lines delineate the sweet spot, which is labeled with an arrow, and there’s also a legend.
Ten lines of code were used to build this plot in its entirety (plus one additional line to add the legend). The plot, as it looks at each step, is given in Figure 7-7. The lines of code are detailed next.
1. The first step is to create the empty plotting region where you can add points and draw lines. This first line tells R to plot the data in x and y, though the option type is set to "n". As mentioned in Section 7.2, this opens or refreshes the graphics device and sets the axes to the appropriate lengths (with labels and axes), but it doesn’t plot any points or lines.
R> plot(x,y,type="n",main="")
2. The abline function is a simple way to add straight lines spanning a plot. The line (or lines) can be specified with slope and intercept values (see the later discussions on regression in Chapter 20). You can also simply add horizontal or vertical lines. This line of code adds two separate horizontal lines, one at y = −5 and the other at y = 5, using h=c(-5,5). The three parameters (covered in Section 7.2) make these two lines red, dashed, and double-thickness. For vertical lines, you could have written v=c(-5,5), which would have drawn them at x = −5 and x = 5.
R> abline(h=c(-5,5),col="red",lty=2,lwd=2)
Figure 7-7: Building the final plot given in Figure 7-6. The plots (1) through (10) correspond to the itemized lines of code in the text.
3. The third line of code adds shorter vertical lines between the horizontal ones drawn in step 2 to form a box. For this you use segments, not abline, since you don’t want these lines to span the entire plotting region. The segments command takes a “from” coordinate (given as x0 and y0) and a “to” coordinate (as x1 and y1) and draws the corresponding line. The vector-oriented behavior of R matches up the two sets of “from” and “to” coordinates. Both lines are red and dotted and have double-thickness. (You could also supply vectors of length 2 to these parameters, in which case the first segment would use the first parameter value and the second segment would use the second value.)
R> segments(x0=c(5,15),y0=c(-5,-5),x1=c(5,15),y1=c(5,5),col="red",lty=3, lwd=2)
4. As step 4, you use points to begin adding specific coordinates from x and y to the plot. Just like plot, points takes two vectors of equal lengths with x and y values. In this case, you want points plotted differently according to their location, so you use logical vector subsetting (see Section 4.1.5) to identify and extract elements of x and y where the y value is greater than or equal to 5. These (and only these) points are added as purple × symbols and are enlarged by a factor of 2 with cex.
R> points(x[y>=5],y[y>=5],pch=4,col="darkmagenta",cex=2)
5. The fifth line of code is much like the fourth; this time it extracts the coordinates where y values are less than or equal to −5. A + point character is used, and you set the color to dark green.
R> points(x[y<=-5],y[y<=-5],pch=3,col="darkgreen",cex=2)
6. The sixth step adds the blue “sweet spot” points, which are identified with (x>=5&x<=15)&(y>-5&y<5). This slightly more complicated set of conditions extracts the points whose x location lies between 5 and 15 (inclusive) AND whose y location lies between −5 and 5 (exclusive). Note that this line uses the “short” form of the logical operator & throughout since you want element-wise comparisons here (see Section 4.1.3).
R> points(x[(x>=5&x<=15)&(y>-5&y<5)],y[(x>=5&x<=15)&(y>-5&y<5)],pch=19, col="blue")
7. This next command identifies the remaining points in the data set: those with an x value that is either less than 5 OR greater than 15, AND a y value between −5 and 5 (exclusive). No graphical parameters are specified, so these points are plotted with the default black ◦.
R> points(x[(x<5|x>15)&(y>-5&y<5)],y[(x<5|x>15)&(y>-5&y<5)])
8. To draw lines connecting the coordinates in x and y, you use lines. Here you’ve also set lty to 4, which draws a dash-dot-dash style line.
R> lines(x,y,lty=4)
9. The ninth line of code adds the arrow pointing to the sweet spot. The function arrows is used just like segments, where you provide a “from” coordinate (x0, y0) and a “to” coordinate (x1, y1). By default, the head of the arrow is located at the “to” coordinate, though this (and other options such as the angle and length of the head) can be altered using optional arguments described in the help file ?arrows.
R> arrows(x0=8,y0=14,x1=11,y1=2.5)
10. The tenth line prints a label on the plot at the top of the arrow. As per the default behavior of text, the string supplied as labels is centered on the coordinates provided with the arguments x and y.
R> text(x=8,y=15,labels="sweet spot")
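The logical conditions used in steps 4 through 7 are repeated inline in each call to points. As a minimal sketch of the same subsetting, the conditions can be stored once in named logical vectors and then checked to confirm that every point lands in exactly one group. The names too.big, too.small, mid.y, sweet, and standard are hypothetical helpers introduced here for illustration; they do not appear in the original code.

```r
x <- 1:20
y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23,
       -12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)

# Hypothetical helper vectors (not part of the original code);
# each one is a logical vector of length 20
too.big   <- y >= 5                       # step 4
too.small <- y <= -5                      # step 5
mid.y     <- y > -5 & y < 5               # shared by steps 6 and 7
sweet     <- (x >= 5 & x <= 15) & mid.y   # step 6
standard  <- (x < 5 | x > 15) & mid.y     # step 7

# TRUE/FALSE are treated as 1/0 when added, so this counts the
# number of groups each point belongs to
group.count <- too.big + too.small + sweet + standard
all(group.count == 1)

# The same four calls to points, now using the stored vectors
plot(x, y, type="n", main="")
points(x[too.big], y[too.big], pch=4, col="darkmagenta", cex=2)
points(x[too.small], y[too.small], pch=3, col="darkgreen", cex=2)
points(x[sweet], y[sweet], pch=19, col="blue")
points(x[standard], y[standard])
```

Storing the conditions this way is only a convenience; the plotted result should match what steps 4 through 7 produce on an otherwise empty plotting region.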
As a finishing touch, you can add the legend with the legend function, which gives you the final product shown in Figure 7-6.
R> legend("bottomleft",
          legend=c("overall process","sweet","standard",
                   "too big","too small","sweet y range","sweet x range"),
          pch=c(NA,19,1,4,3,NA,NA),lty=c(4,NA,NA,NA,NA,2,3),
          col=c("black","blue","black","darkmagenta","darkgreen","red","red"),
          lwd=c(1,NA,NA,NA,NA,2,2),pt.cex=c(NA,1,1,2,2,NA,NA))
The first argument sets where the legend should be placed. There are various ways to do this (including setting exact x- and y-coordinates), but it often suffices to pick a corner using one of the four following character strings: "topleft", "topright", "bottomleft", or "bottomright". Next you supply the labels as a vector of character strings to the legend argument. Then you need to supply the remaining argument values in vectors of the same length so that the right elements match up with each label.
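The exact-coordinate placement mentioned above can be sketched as follows. This is a toy example on an empty 1-to-10 plotting region rather than the data for Figure 7-6, and the position (1, 10) and the two labels are arbitrary choices made for illustration.

```r
# Toy plotting region purely for illustration (not the chapter's data)
plot(1:10, 1:10, type="n", xlab="x", ylab="y")

# With the default justification, x and y give the top-left corner
# of the legend box in plot coordinates; (1, 10) is an arbitrary choice
pos <- legend(x=1, y=10, legend=c("too big","too small"),
              pch=c(4,3), col=c("darkmagenta","darkgreen"))

# legend invisibly returns the geometry of the box it drew
pos$rect$left
pos$rect$top
```

Back in the full call for Figure 7-6, the argument vectors line up with the labels position by position.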
For example, for the first label ("overall process"), you want a line of type 4 with default thickness and color. So, in the first positions of the remaining argument vectors, you set pch=NA, lty=4, col="black", lwd=1, and pt.cex=NA (all of these are default values, except for lty). Here, pt.cex simply refers to the cex parameter when calling points (using just cex in legend would expand the text used, not the points).
Note that you have to fill in some elements in these vectors with NA when you don’t want to set the corresponding graphical parameter. This is just to preserve the equal lengths of the vectors supplied so R can track which parameter values correspond to each particular reference. As you work through this book, you’ll see plenty more examples using legend.
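One way to keep these vectors aligned, offered here only as a sketch, is to lay the legend entries out as rows of a data frame before calling legend. The object name key and its column names are hypothetical choices for this illustration; the values are the ones used in the legend call above. Because data.frame refuses columns of differing lengths, a forgotten NA shows up as an error rather than as a misaligned legend.

```r
# Hypothetical helper table: one row per legend entry, with NA
# wherever a graphical parameter does not apply to that entry
key <- data.frame(
  label  = c("overall process","sweet","standard",
             "too big","too small","sweet y range","sweet x range"),
  pch    = c(NA,19,1,4,3,NA,NA),
  lty    = c(4,NA,NA,NA,NA,2,3),
  col    = c("black","blue","black","darkmagenta","darkgreen","red","red"),
  lwd    = c(1,NA,NA,NA,NA,2,2),
  pt.cex = c(NA,1,1,2,2,NA,NA),
  stringsAsFactors = FALSE
)

# An empty region standing in for the finished plot
plot(1:20, seq(-15, 15, length.out=20), type="n", xlab="x", ylab="y")

# Each column is passed to the legend argument of the same name
legend("bottomleft", legend=key$label, pch=key$pch, lty=key$lty,
       col=key$col, lwd=key$lwd, pt.cex=key$pt.cex)
```

Printing key also gives a quick way to read across a row and confirm which symbol, line type, and color belong to each label.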
Exercise 7.1
a. As closely as you can, re-create the following plot:
b. With the following data, create a plot of weight on the x-axis and height on the y-axis. Use different point characters or colors to distinguish between males and females and provide a matching legend. Label the axes and give the plot a title.
Weight (kg)   Height (cm)   Sex
55            161           female
85            185           male
75            174           male
42            154           female
93            188           male
63            178           male
58            170           female
75            167           male
89            181           male
67            178           female