NMSA230: Software for Mathematics and Stochastics
Sweave Example file
1 Some Sweave examples
This document was prepared using Sweave (Leisch, 2002) in R (R Core Team, 2015), version 3.2.0 (2015-04-16). Additionally, we use the following extension packages: xtable (Dahl, 2014) and colorspace (Ihaka et al., 2015; Zeileis et al., 2009).
Here we define our working directory.
Students must change this!
> ROOT <- "/home/komarek/teach/mff_2014/nmsa230_SoftProstr/Rko/"
Read data:
> print(load(paste(ROOT, "Data/cars.RData", sep = "")))
[1] "cars"
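The loading step above relies on load() returning the names of the restored objects, which is why it is wrapped in print(). The following is a minimal sketch of that behaviour. It assumes a temporary directory in place of ROOT and a toy data frame named toy, both hypothetical and unrelated to the cars data. file.path() is used here instead of paste() so that the path separator does not have to be typed by hand.

```r
# Hypothetical project root: a temporary directory stands in for ROOT
root <- tempdir()
data_dir <- file.path(root, "Data")
dir.create(data_dir, showWarnings = FALSE)

# Toy data frame (not the cars data set), saved to an .RData file
toy <- data.frame(id = 1:3, weight = c(839, 1576, 3261))
rdata <- file.path(data_dir, "toy.RData")
save(toy, file = rdata)
rm(toy)

# load() restores the saved objects and returns their names invisibly;
# print() makes those names visible in the output
loaded <- load(rdata)
print(loaded)
```

After the call, the object toy exists again in the workspace and loaded holds its name as a character vector.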
Subset of data containing only selected variables:
> subcars <- subset(cars, select = c("vname", "ftype", "fdrive",
+     "weight", "consumption"))
Basic descriptive statistics:
> summary(subcars)
    vname               ftype         fdrive        weight      consumption
 Length:428         osobni :245   front:226   Min.   : 839   Min.   : 3.75
 Class :character   combi  : 30   rear :110   1st Qu.:1407   1st Qu.: 9.65
 Mode  :character   SUV    : 60   4x4  : 92   Median :1576   Median :10.65
                    pickup : 24               Mean   :1623   Mean   :10.71
                    sport  : 49               3rd Qu.:1803   3rd Qu.:11.64
                    minivan: 20               Max.   :3261   Max.   :21.55
                                              NA's   :2      NA's   :14
Here, descriptive statistics are calculated but not shown:
> ssubcars <- summary(subcars)
Here, descriptive statistics are calculated, results shown but the code is not shown:
    vname               ftype         fdrive        weight      consumption
 Length:428         osobni :245   front:226   Min.   : 839   Min.   : 3.75
 Class :character   combi  : 30   rear :110   1st Qu.:1407   1st Qu.: 9.65
 Mode  :character   SUV    : 60   4x4  : 92   Median :1576   Median :10.65
                    pickup : 24               Mean   :1623   Mean   :10.71
                    sport  : 49               3rd Qu.:1803   3rd Qu.:11.64
                    minivan: 20               Max.   :3261   Max.   :21.55
                                              NA's   :2      NA's   :14
Here, descriptive statistics are calculated but neither results nor the code are shown:
Here, only code is shown but nothing calculated:
> summary(subcars)
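The variants above (code hidden, results hidden, code shown without evaluation) are usually obtained with the Sweave chunk options echo, results and eval. The source of this handout is not reproduced here, so the following is only a sketch under that assumption: it writes a tiny hypothetical file chunks.Rnw into a temporary directory, runs Sweave() on it and reads back the generated LaTeX.

```r
# A minimal, hypothetical Sweave source illustrating three chunk options
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<>>=",                           # default: code and results are shown
  "x <- c(1, 2, 3, 10)",
  "@",
  "<<echo=FALSE>>=",                 # results shown, code hidden
  "median(x)",
  "@",
  "<<echo=FALSE, results=hide>>=",   # evaluated, but nothing is shown
  "y <- max(x)",
  "@",
  "<<eval=FALSE>>=",                 # code shown, but not evaluated
  "stop(\"never run\")",
  "@",
  "\\end{document}"
)

# Sweave writes its output into the working directory
old_wd <- setwd(tempdir())
writeLines(rnw, "chunks.Rnw")
utils::Sweave("chunks.Rnw", quiet = TRUE)
tex <- readLines("chunks.tex")
setwd(old_wd)

# Inspect which pieces reached the LaTeX file
cat(tex, sep = "\n")
```

In the generated file, the median appears without its code, the max(x) chunk leaves no trace, and the stop() call is printed but never executed.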
It is also possible to use one or more calculated numbers in the body of the text:
> meanConsump <- mean(subcars$consumption, na.rm = TRUE)
> meanConsump <- format(round(meanConsump, 2), nsmall = 2)
> print(meanConsump)
[1] "10.71"
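The call to format() with nsmall = 2 matters when the rounded value ends in a zero. The sketch below uses a hypothetical value m = 10.7 (not taken from the data) to show the difference. In a Sweave source, such a formatted string is typically placed in the running text with \Sexpr{}, although the source of this handout is not shown here.

```r
m <- 10.7   # hypothetical value, chosen so that the second decimal is zero

# Rounding alone drops the trailing zero when converted to text
plain <- as.character(round(m, 2))
# nsmall = 2 forces two decimal places
padded <- format(round(m, 2), nsmall = 2)
# sprintf() is a common alternative with the same effect
printed <- sprintf("%.2f", m)

print(c(plain = plain, padded = padded, printed = printed))
```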
2 Tables
Descriptive statistics of consumption by drive type:
> attach(cars)
> sconsumpt <- data.frame(
+     Mean   = tapply(consumption, fdrive, mean, na.rm=TRUE),
+     SD     = tapply(consumption, fdrive, sd, na.rm=TRUE),
+     Median = tapply(consumption, fdrive, median, na.rm=TRUE),
+     Q1     = tapply(consumption, fdrive, quantile, probs=0.25, na.rm=TRUE),
+     Q3     = tapply(consumption, fdrive, quantile, probs=0.75, na.rm=TRUE),
+     n      = tapply(!is.na(consumption), fdrive, sum),
+     NAs    = tapply(is.na(consumption), fdrive, sum))
> detach(cars)
> print(sconsumpt)
           Mean       SD Median    Q1    Q3   n NAs
front  9.674306 1.888841  9.800  8.45 10.70 216  10
rear  11.293981 1.293581 11.250 10.55 11.85 108   2
4x4   12.477222 2.339009 11.725 10.70 14.05  90   2
Table created using the xtable package:
> rownames(sconsumpt) <- c("Front", "Rear", "4x4")
> colnames(sconsumpt) <- c("Mean", "SD", "Median", "Q1", "Q3", "n", "NA's")
> #
> library("xtable")
> tconsumpt <- xtable(sconsumpt, align=c("l", rep("r", 7)),
+     digits=c(0, rep(2, 5), 0, 0),
+     caption="Table of descriptive statistics of consumption (l/100 km).",
+     label="tab:descrConsumpt01")
> print(tconsumpt)
          Mean     SD   Median      Q1      Q3     n   NA's
Front     9.67   1.89     9.80    8.45   10.70   216     10
Rear     11.29   1.29    11.25   10.55   11.85   108      2
4x4      12.48   2.34    11.72   10.70   14.05    90      2
Table 1: Table of descriptive statistics of consumption (l/100 km).
See Table 1 for descriptive statistics of consumption.
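The workflow of this section (per-group statistics, then xtable) can be tried on a small scale without the cars data. The sketch below is an illustration on a hypothetical toy data frame. It uses with() instead of attach()/detach(), which is a substitution made here and not the approach of the text, and it calls xtable only if that package happens to be installed.

```r
# Toy data (hypothetical values, not the cars data set)
toy <- data.frame(
  drive = factor(c("front", "front", "front", "rear", "rear", "rear"),
                 levels = c("front", "rear")),
  consumption = c(9.0, 10.0, NA, 11.0, 12.0, 13.0)
)

# with() evaluates the expressions inside the data frame,
# so nothing has to be attached to the search path
stats <- with(toy, data.frame(
  Mean = tapply(consumption, drive, mean, na.rm = TRUE),
  SD   = tapply(consumption, drive, sd, na.rm = TRUE),
  n    = tapply(!is.na(consumption), drive, sum),
  NAs  = tapply(is.na(consumption), drive, sum)
))
print(stats)

# Optional: LaTeX output, only if the xtable package is available.
# align and digits need one entry more than there are columns
# (the first entry belongs to the row names).
if (requireNamespace("xtable", quietly = TRUE)) {
  tab <- xtable::xtable(stats, align = c("l", rep("r", 4)),
                        digits = c(0, 2, 2, 0, 0),
                        caption = "Toy summary by drive.",
                        label = "tab:toy")
  print(tab)
}
```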
A table created by hand, made as nice as I wish:
Table 2: Table of descriptive statistics of consumption (l/100 km).
Category   Mean (Std. Dev.)   Median (Q1 – Q3)           n   Missing
Front       9.67 (1.89)        9.80 (8.45 – 10.70)     216        10
Rear       11.29 (1.29)       11.25 (10.55 – 11.85)    108         2
4x4        12.48 (2.34)       11.72 (10.70 – 14.05)     90         2
Small improvement:
Table 3: Second table of descriptive statistics of consumption (l/100 km).
Category   Mean (Std. Dev.)   Median (Q1 – Q3)           n   Missing
Front       9.67 (1.89)        9.80 ( 8.45 – 10.70)    216        10
Rear       11.29 (1.29)       11.25 (10.55 – 11.85)    108         2
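Cells such as "9.67 (1.89)" and the padded "( 8.45 – 10.70)" do not have to be typed by hand. The following sketch builds them with sprintf() from the Front and Rear values printed earlier in this section. The exact LaTeX row layout (columns separated by &, rows ended by \\) is an assumption about how such a hand-made table might be written, not a copy of the author's source.

```r
# Values for Front and Rear as printed in the sconsumpt output above
stats <- data.frame(
  Category = c("Front", "Rear"),
  Mean = c(9.674306, 11.293981),
  SD   = c(1.888841, 1.293581),
  Q1   = c(8.45, 10.55),
  Q3   = c(10.70, 11.85)
)

# "Mean (SD)" with two decimals each
mean_sd <- sprintf("%.2f (%.2f)", stats$Mean, stats$SD)

# A field width of 5 pads shorter numbers with a leading space,
# so that the quartiles line up between rows
quartiles <- sprintf("(%5.2f -- %5.2f)", stats$Q1, stats$Q3)

# Assumed LaTeX row layout: cells joined by " & ", row ended by "\\"
rows <- paste0(paste(stats$Category, mean_sd, quartiles, sep = " & "), " \\\\")
cat(rows, sep = "\n")
```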
3 Figures
Define what should be executed before each plot is drawn.
> figSweave <- function(){
+     par(bty = "n", mar = c(5, 4, 4, 1) + 0.1)
+     ## WHATEVER OTHER R COMMANDS
+ }
> options(SweaveHooks = list(fig = figSweave))
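To see what such a hook changes, it can be called by hand on an open graphics device. The sketch below assumes a throwaway PDF file in a temporary directory (the file name hook-demo.pdf is hypothetical) and reads the graphical parameters back after the call.

```r
# Same hook body as in the text
figSweave <- function() {
  par(bty = "n", mar = c(5, 4, 4, 1) + 0.1)
}

# Open a throwaway device, run the hook, and read the parameters back
pdf(file.path(tempdir(), "hook-demo.pdf"))
figSweave()
bty <- par("bty")   # box type around the plot region
mar <- par("mar")   # margins: bottom, left, top, right (in lines)
boxplot(c(1, 2, 3, 10))
invisible(dev.off())

print(bty)
print(mar)
```

When registered through options(SweaveHooks = ...), Sweave calls the function itself before each chunk that has the fig option set, so the explicit call is needed only in this stand-alone illustration.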
A figure that is drawn, saved as PDF and automatically placed in the text (pdfLaTeX must then be used):
> library("colorspace")
> COL <- rainbow_hcl(1, start = 90)
> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
[Figure: boxplot; axis label "Consumption [l/100 km]", ticks at 5, 10, 15, 20]
Figure 1: Boxplot of consumption.
A figure that is drawn and saved as PDF but not placed anywhere automatically. Placing the figure in the document is the author's responsibility.
> COL <- rainbow_hcl(1, start = 150)
> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
[Figure: boxplot; axis label "Consumption [l/100 km]", ticks at 5, 10, 15, 20]
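The difference between the two figures above is commonly expressed with the Sweave chunk options fig and include. Since the source of this handout is not shown, the following is a sketch under that assumption. It runs Sweave() on a small hypothetical file figdemo.Rnw in a temporary directory and checks which figures are referenced in the generated LaTeX.

```r
# Hypothetical Sweave source with two figure chunks
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<label=placed, fig=TRUE, echo=FALSE>>=",                   # saved and included
  "boxplot(c(1, 2, 3, 10))",
  "@",
  "<<label=manual, fig=TRUE, include=FALSE, echo=FALSE>>=",    # saved only
  "boxplot(c(1, 2, 3, 10))",
  "@",
  "\\end{document}"
)

# Sweave writes the .tex file and the figures into the working directory
old_wd <- setwd(tempdir())
writeLines(rnw, "figdemo.Rnw")
utils::Sweave("figdemo.Rnw", quiet = TRUE)
tex <- readLines("figdemo.tex")
pdfs <- list.files(pattern = "^figdemo-.*\\.pdf$")
setwd(old_wd)

# Both PDF files exist, but only one is referenced in the LaTeX output
print(pdfs)
print(grep("includegraphics", tex, value = TRUE))
```

For the chunk with include=FALSE, the author would add the \includegraphics command (or a figure environment) by hand at the desired place.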
It is also possible to use the standard functions pdf(), postscript(), png() etc. to save a plot in an arbitrary format, in an arbitrary location and under an arbitrary filename:
> COL <- rainbow_hcl(1, start = 180)
> postscript("./Obrazky/boxplot-consumpt.eps", width = 6, height = 6,
+     horizontal = FALSE, paper = "special")
> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
> dev.off()
> #
> RES <- 500
> png("./Obrazky/boxplot-consumpt.png", width = 6*RES, height = 6*RES, res = RES)
> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
> dev.off()
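Graphics devices do not create missing directories, so a target directory such as ./Obrazky/ must exist before the device is opened. The sketch below shows the same pattern with a hypothetical output directory inside tempdir() and toy values instead of the cars data. The PNG part runs only where capabilities("png") reports support.

```r
# Hypothetical output directory; create it before opening any device
out_dir <- file.path(tempdir(), "figures")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

x <- c(8.2, 9.5, 10.1, 10.7, 11.3, 12.0, 15.4)   # toy values

# PDF: width and height are given in inches
pdf_file <- file.path(out_dir, "boxplot-toy.pdf")
pdf(pdf_file, width = 6, height = 6)
boxplot(x, ylab = "Consumption [l/100 km]")
invisible(dev.off())

# PNG: width and height are in pixels by default, so 6 * RES pixels
# at a resolution of RES pixels per inch corresponds to 6 inches
png_file <- file.path(out_dir, "boxplot-toy.png")
if (capabilities("png")) {
  RES <- 100
  png(png_file, width = 6 * RES, height = 6 * RES, res = RES)
  boxplot(x, ylab = "Consumption [l/100 km]")
  invisible(dev.off())
}

print(file.exists(pdf_file))
```

Each device call is paired with dev.off(); the file is only complete once the device has been closed.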
References
Dahl, D. B. (2014). xtable: Export tables to LaTeX or HTML. URL http://CRAN.R-project.org/package=xtable. R package version 1.7-4.
Ihaka, R., Murrell, P., Hornik, K., Fisher, J. C., and Zeileis, A. (2015). colorspace: Color Space Manipulation. URL http://CRAN.R-project.org/package=colorspace. R package version 1.2-6.
Leisch, F. (2002). Dynamic generation of statistical reports using literate data analysis. In Härdle, W. and Rönz, B., editors, COMPSTAT 2002 – Proceedings in Computational Statistics, pages 575–580, Heidelberg, 2002. Physica-Verlag.
R Core Team (2015). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
Zeileis, A., Hornik, K., and Murrell, P. (2009). Escaping RGBland: Selecting colors for statistical graphics. Computational Statistics and Data Analysis, 53(9), 3259–3270. doi:10.1016/j.csda.2008.11.033.