NMSA230: Software for Mathematics and Stochastics Sweave Example file

1 Some Sweave examples

This document was prepared using Sweave (Leisch, 2002) in R (R Core Team, 2015), version 3.2.0 (2015-04-16). Additionally, we use the following extension packages: xtable (Dahl, 2014) and colorspace (Ihaka et al., 2015; Zeileis et al., 2009).

Here we define our working directory.

Students must change this!

> ROOT <- "/home/komarek/teach/mff_2014/nmsa230_SoftProstr/Rko/"

Read data:

> print(load(paste(ROOT, "Data/cars.RData", sep = "")))
[1] "cars"
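Because the root directory differs from machine to machine, it can help to build the path from its components and to check that the file exists before loading it. The following is a minimal sketch rather than the author's code: `dataFile` is a name introduced here, and `tempdir()` merely stands in for your own course directory.

```r
## Stand-in for your own course directory (assumption: replace tempdir()
## with the folder that contains the "Data" subdirectory).
ROOT <- tempdir()

## file.path() inserts the separators, so ROOT needs no trailing slash.
dataFile <- file.path(ROOT, "Data", "cars.RData")

if (file.exists(dataFile)) {
  ## load() returns the names of the restored objects; print() shows them.
  print(load(dataFile))
} else {
  message("Data file not found: ", dataFile)
}
```

With this variant a wrong `ROOT` produces a readable message instead of an error from `load()`.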

Subset of data containing only selected variables:

> subcars <- subset(cars, select = c("vname", "ftype", "fdrive",
+     "weight", "consumption"))

Basic descriptive statistics:

> summary(subcars)
     vname             ftype       fdrive        weight       consumption
 Length:428         osobni :245   front:226   Min.   : 839   Min.   : 3.75
 Class :character   combi  : 30   rear :110   1st Qu.:1407   1st Qu.: 9.65
 Mode  :character   SUV    : 60   4x4  : 92   Median :1576   Median :10.65
                    pickup : 24               Mean   :1623   Mean   :10.71
                    sport  : 49               3rd Qu.:1803   3rd Qu.:11.64
                    minivan: 20               Max.   :3261   Max.   :21.55
                                              NA's   :2      NA's   :14

Here, descriptive statistics are calculated but not shown:


> ssubcars <- summary(subcars)

Here, descriptive statistics are calculated and the results are shown, but the code is not:

     vname             ftype       fdrive        weight       consumption
 Length:428         osobni :245   front:226   Min.   : 839   Min.   : 3.75
 Class :character   combi  : 30   rear :110   1st Qu.:1407   1st Qu.: 9.65
 Mode  :character   SUV    : 60   4x4  : 92   Median :1576   Median :10.65
                    pickup : 24               Mean   :1623   Mean   :10.71
                    sport  : 49               3rd Qu.:1803   3rd Qu.:11.64
                    minivan: 20               Max.   :3261   Max.   :21.55
                                              NA's   :2      NA's   :14

Here, descriptive statistics are calculated, but neither the results nor the code are shown:

Here, only the code is shown and nothing is calculated:

> summary(subcars)
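The variants above differ only in the options given in the chunk header of the Sweave source. The chunk headers used for this document are not visible in its output, so the following is a sketch of how such chunks are typically written with the standard Sweave options `echo`, `results`, and `eval`, not a copy of the original source.

```latex
% Code and results both shown (the default):
<<>>=
summary(subcars)
@

% Results shown, code hidden:
<<echo=FALSE>>=
summary(subcars)
@

% Evaluated, but neither code nor results shown:
<<echo=FALSE, results=hide>>=
ssubcars <- summary(subcars)
@

% Code shown, nothing evaluated:
<<eval=FALSE>>=
summary(subcars)
@
```

An assignment such as `ssubcars <- summary(subcars)` prints nothing by itself, so a default chunk already gives the "calculated but not shown" behaviour; `results=hide` is needed only when the code would otherwise print.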

It is also possible to use a calculated number (or several calculated numbers) in the body of the text:

> meanConsump <- mean(subcars$consumption, na.rm = TRUE)
> meanConsump <- format(round(meanConsump, 2), nsmall = 2)
> print(meanConsump)

[1] "10.71"
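In the Sweave source, a value prepared this way is placed into running text with `\Sexpr{}`. The sentence below is a hypothetical example of such a use (its wording is an assumption); the chunk repeats the calculation silently so that the fragment is complete.

```latex
<<echo=FALSE, results=hide>>=
meanConsump <- mean(subcars$consumption, na.rm = TRUE)
meanConsump <- format(round(meanConsump, 2), nsmall = 2)
@

The mean consumption is \Sexpr{meanConsump} l/100 km.
```

Formatting the number to a character string beforehand, with `nsmall = 2`, keeps trailing zeros in the text and keeps the expression inside `\Sexpr{}` free of curly brackets, which Sweave does not allow there.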


2 Tables

Descriptive statistics of consumption by drive type:

> attach(cars)
> sconsumpt <- data.frame(
+   Mean = tapply(consumption, fdrive, mean, na.rm=TRUE),
+   SD = tapply(consumption, fdrive, sd, na.rm=TRUE),
+   Median = tapply(consumption, fdrive, median, na.rm=TRUE),
+   Q1 = tapply(consumption, fdrive, quantile, probs=0.25, na.rm=TRUE),
+   Q3 = tapply(consumption, fdrive, quantile, probs=0.75, na.rm=TRUE),
+   n = tapply(!is.na(consumption), fdrive, sum),
+   NAs = tapply(is.na(consumption), fdrive, sum))
> detach(cars)
> print(sconsumpt)
           Mean       SD Median    Q1    Q3   n NAs
front  9.674306 1.888841  9.800  8.45 10.70 216  10
rear  11.293981 1.293581 11.250 10.55 11.85 108   2
4x4   12.477222 2.339009 11.725 10.70 14.05  90   2

Table created using the xtable package:

> rownames(sconsumpt) <- c("Front", "Rear", "4x4")
> colnames(sconsumpt) <- c("Mean", "SD", "Median", "Q1", "Q3", "n", "NA's")
> #
> library("xtable")
> tconsumpt <- xtable(sconsumpt, align=c("l", rep("r", 7)),
+   digits=c(0, rep(2, 5), 0, 0),
+   caption="Table of descriptive statistics of consumption (l/100 km).",
+   label="tab:descrConsumpt01")
> print(tconsumpt)

         Mean     SD   Median      Q1      Q3     n   NA's
Front    9.67   1.89     9.80    8.45   10.70   216     10
Rear    11.29   1.29    11.25   10.55   11.85   108      2
4x4     12.48   2.34    11.72   10.70   14.05    90      2

Table 1: Table of descriptive statistics of consumption (l/100 km).

See Table 1 for descriptive statistics of consumption.
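For the table to be typeset rather than printed as raw LaTeX code, the chunk that prints the xtable object has to pass its output through unchanged. A minimal sketch of such a chunk, assuming the standard Sweave option `results=tex` (the original chunk header is not visible in the output):

```latex
<<echo=FALSE, results=tex>>=
print(tconsumpt)
@

See Table~\ref{tab:descrConsumpt01} for descriptive statistics of consumption.
```

The caption and the label come from the `caption` and `label` arguments of `xtable()`, so the cross-reference uses the same label string, `tab:descrConsumpt01`.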


Table created by hand, formatted exactly as desired:

Table 2: Table of descriptive statistics of consumption (l/100 km).

Category   Mean (Std. Dev.)   Median (Q1 – Q3)          n   Missing
Front      9.67 (1.89)        9.80 (8.45 – 10.70)     216        10
Rear       11.29 (1.29)       11.25 (10.55 – 11.85)   108         2
4x4        12.48 (2.34)       11.72 (10.70 – 14.05)    90         2

Small improvement:

Table 3: Second table of descriptive statistics of consumption (l/100 km).

Category   Mean (Std. Dev.)   Median (Q1 – Q3)          n   Missing
Front       9.67 (1.89)        9.80 ( 8.45 – 10.70)   216        10
Rear       11.29 (1.29)       11.25 (10.55 – 11.85)   108         2
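The improvement in Table 3 is that one-digit and two-digit numbers line up. One common way to achieve this in a hand-written LaTeX table is `\phantom{}`, which reserves the width of a digit without printing it. The sketch below uses that technique; whether the original source does the same is an assumption, and the label `tab:descrConsumpt03` is made up for illustration.

```latex
\begin{table}[htb]
\centering
\caption{Second table of descriptive statistics of consumption (l/100 km).}
\label{tab:descrConsumpt03} % hypothetical label
\begin{tabular}{lccrr}
\hline
Category & Mean (Std. Dev.) & Median (Q1 -- Q3) & $n$ & Missing \\
\hline
% \phantom{1} takes up the width of one digit, so 9.67 aligns with 11.29
Front & \phantom{1}9.67 (1.89) & \phantom{1}9.80 (\phantom{1}8.45 -- 10.70) & 216 & 10 \\
Rear  & 11.29 (1.29) & 11.25 (10.55 -- 11.85) & 108 & 2 \\
\hline
\end{tabular}
\end{table}
```

In a Sweave document the numbers in such a table could also be filled in with `\Sexpr{}` instead of being typed by hand.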


3 Figures

Define what should be done before each plot is drawn:

> figSweave <- function(){
+   par(bty = "n", mar = c(5, 4, 4, 1) + 0.1)
+   ## WHATEVER OTHER R COMMANDS
+ }
> options(SweaveHooks = list(fig = figSweave))
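Sweave runs the function registered under the name `fig` before the code of every chunk that has the option `fig=TRUE`. To see what the hook does, it can also be called by hand on an open graphics device. The sketch below does this on a PDF device that writes no file; `hookPar` is a name introduced here for illustration.

```r
figSweave <- function() {
  par(bty = "n", mar = c(5, 4, 4, 1) + 0.1)
}

## Open a PDF device that creates no file, so par() has a device to act on.
pdf(file = NULL)
figSweave()
hookPar <- par("bty", "mar")   # settings now in force on this device
dev.off()

print(hookPar)
```

The settings made by `par()` belong to the device that is open at the time, which is why the hook has to run again for each figure chunk.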

Figure which is drawn, saved as PDF, and automatically placed in the text (pdfLaTeX must then be used):

> library("colorspace")

> COL <- rainbow_hcl(1, start = 90)

> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)

[Figure: boxplot of consumption; vertical axis "Consumption [l/100 km]", ticks at 5, 10, 15, 20]

Figure 1: Boxplot of consumption.
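A figure is placed automatically when the chunk that draws it has the option `fig=TRUE`: Sweave saves the plot to a file and writes the matching `\includegraphics` command at the position of the chunk. A minimal sketch of such a chunk inside a float, assuming the standard options `fig`, `width`, and `height`; the label `fig:boxConsumpt` is hypothetical and `echo=FALSE` is chosen here so that only the plot appears inside the float.

```latex
\begin{figure}[htb]
\centering
<<fig=TRUE, echo=FALSE, width=6, height=6>>=
boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
@
\caption{Boxplot of consumption.}
\label{fig:boxConsumpt} % hypothetical label
\end{figure}
```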


Figure which is drawn and saved as PDF, but not placed anywhere automatically. Placing the figure into the document is the author's responsibility.

> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)

[Figure: boxplot of consumption; vertical axis "Consumption [l/100 km]", ticks at 5, 10, 15, 20]
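Automatic placement is switched off with the chunk option `include=FALSE`; the file is still written and can be included by hand wherever it is needed. With the default settings Sweave names the file after the source file and the chunk label. The sketch below assumes a source file called `example.Rnw` and a chunk label `boxConsumpt`, both made up for illustration.

```latex
<<boxConsumpt, fig=TRUE, include=FALSE, width=6, height=6>>=
boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
@

% Later, at a place of the author's choosing:
\begin{figure}[htb]
\centering
\includegraphics[width=0.6\textwidth]{example-boxConsumpt}
\caption{Boxplot of consumption, placed by hand.}
\end{figure}
```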


It is also possible to use the standard functions pdf(), postscript(), png(), etc. to save a plot in an arbitrary format, in an arbitrary location, and under an arbitrary file name:

> postscript("./Obrazky/boxplot-consumpt.eps", width = 6, height = 6,
+   horizontal = FALSE, paper = "special")
> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
> dev.off()
> #
> RES <- 500
> png("./Obrazky/boxplot-consumpt.png", width = 6*RES, height = 6*RES, res = RES)
> boxplot(cars$consumption, ylab = "Consumption [l/100 km]", col = COL)
> dev.off()
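The same open, draw, close pattern applies to pdf(), which is mentioned above but not demonstrated. The sketch below is self-contained under two assumptions: a few made-up consumption values stand in for the cars data, and the file goes to a temporary directory instead of ./Obrazky/.

```r
## Stand-in values (assumption: the cars data are not loaded here).
consumption <- c(7.2, 9.8, 10.4, 11.1, 12.6, 15.3)

## Hypothetical output location; the document itself writes to ./Obrazky/.
outFile <- file.path(tempdir(), "boxplot-consumpt.pdf")

pdf(outFile, width = 6, height = 6)   # size in inches
boxplot(consumption, ylab = "Consumption [l/100 km]")
dev.off()                             # closes the device and writes the file

file.exists(outFile)
```

For png() the size is given in pixels by default, which is why the call above multiplies by the resolution: 6 * 500 = 3000 pixels at 500 pixels per inch corresponds to the same 6 inches.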


References

Dahl, D. B. (2014). xtable: Export tables to LaTeX or HTML. URL http://CRAN.R-project.org/package=xtable. R package version 1.7-4.

Ihaka, R., Murrell, P., Hornik, K., Fisher, J. C., and Zeileis, A. (2015). colorspace: Color Space Manipulation. URL http://CRAN.R-project.org/package=colorspace. R package version 1.2-6.

Leisch, F. (2002). Dynamic generation of statistical reports using literate data analysis. In Härdle, W. and Rönz, B., editors, COMPSTAT 2002 – Proceedings in Computational Statistics, pages 575–580, Heidelberg, 2002. Physica-Verlag.

R Core Team (2015). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

Zeileis, A., Hornik, K., and Murrell, P. (2009). Escaping RGBland: Selecting colors for statistical graphics. Computational Statistics and Data Analysis, 53(9), 3259–3270. doi:10.1016/j.csda.2008.11.033.