R installation and analysis notes

Murtaza Haider June 1, 2010

Installing R and related Add‐ins

You can learn about R at the R Project website: http://www.r-project.org/

To download R from the University of Toronto’s site, please click on http://probability.ca/cran/. R is available for Linux, Windows, and MacOS X.

To download the Windows version, please visit:

http://cran.stat.sfu.ca/, click on base, and then click on Download R 2.9.0 for Windows (36 megabytes).

Save the file and then double‐click it to install R.

The following dialogue will appear:

Select OK.

Click Next on the following dialogue box:

Select the directory to install the software. For default location, click Next.

Click Next on the following dialogue box:

Click Next on the following ensuring that you have selected (Yes customize startup):

Select SDI (separate Windows) on the following dialogue box.

Select NEXT on the following dialogue boxes:

If more dialogues appear, please click Next. R will install and you’ll notice the following symbol on your desktop after installation:

Double click to run R.

R will launch and appear as follows:

Researchers around the world have contributed approximately 2000 packages for R. You can search for a package from the R website and download/install it directly from within R using:

Packages>install packages.

R will prompt you to select a mirror site from which to download the packages. I often use the Ontario site.

Select a site and click OK.

R with point‐and‐click GUI feature

Until recently, R was command-driven software. John Fox at McMaster University has added GUI capabilities to R in a package called R Commander (Rcmdr). For details, please see:

http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/

To install R Commander, please type the following at the red cursor in R:

>install.packages("Rcmdr", dependencies=TRUE)

This command will download R commander.

Once downloaded, load R Commander by typing the following at the red prompt:

>library(Rcmdr)

This will launch Rcmdr, a GUI-like environment shown on the next page.

Also, update packages by selecting:

Packages>Update packages

Various R packages provide plug-ins that add further point-and-click functionality to R Cmdr. These plug-ins can also be installed from within R.

Use Packages>Install Packages to select Rcmdr plug‐ins.

Other packages required

• AER
• Hmisc
• ISwR
• mlogit
• Zelig
• psych
• R2HTML
• Ecdat
• estout and apsrtable
• betareg
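The packages listed above can also be installed from the R prompt in a single call. The following is a minimal sketch, not part of the original workflow: the object names pkgs and missing_pkgs are hypothetical, and some of the listed packages may no longer be available from the mirror you select.

```r
# Packages named in the list above (package names are case-sensitive)
pkgs <- c("AER", "Hmisc", "ISwR", "mlogit", "Zelig", "psych",
          "R2HTML", "Ecdat", "estout", "apsrtable", "betareg")

# Keep only the packages that are not installed yet
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Install the missing ones; guarded so nothing is downloaded in batch runs
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, dependencies = TRUE)
}
```

Checking for missing packages first avoids re-downloading what is already installed.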

Learning R and R Cmdr

If you are migrating from SPSS or SAS, you’ll find the Quick R site extremely helpful.

http://www.statmethods.net/

Other sources include:

• An Introduction to R: http://cran.r-project.org/doc/manuals/R-intro.pdf
• The UCLA website is a great resource for learning stats/econometrics in R: http://www.ats.ucla.edu/stat/R/
  o Multinomial Logit in R: http://www.ats.ucla.edu/stat/R/dae/mlogit.htm
• Using R: http://cran.r-project.org/doc/contrib/usingR.pdf
• For Discrete Choice Models, you need to install the mlogit package: http://cran.r-project.org/web/packages/mlogit/index.html

Please note that data analysis and modeling are far more convenient in R Cmdr. See the following page for an example.

A sample session in R Cmdr

I have loaded a data set from a peer-reviewed publication that documents the determinants of extramarital affairs.

Fair's Extramarital Affairs Data

Infidelity data, known as Fair's Affairs. Cross‐section data from a survey conducted by Psychology Today in 1969.

Usage

data("Affairs")

Format

A data frame containing 601 observations on 9 variables.

affairs

numeric. How often engaged in extramarital sexual intercourse during the past year?

gender

factor indicating gender.

age

numeric variable coding age in years: 17.5 = under 20, 22 = 20–24, 27 = 25–29, 32 = 30–34, 37 = 35–39, 42 = 40–44, 47 = 45–49, 52 = 50–54, 57 = 55 or over.

yearsmarried

numeric variable coding number of years married: 0.125 = 3 months or less, 0.417 = 4–6 months, 0.75 = 6 months–1 year, 1.5 = 1–2 years, 4 = 3–5 years, 7 = 6–8 years, 10 = 9–11 years, 15 = 12 or more years.

children

factor. Are there children in the marriage?

religiousness

numeric variable coding religiousness: 1 = anti, 2 = not at all, 3 = slightly, 4 = somewhat, 5 = very.

education

numeric variable coding level of education: 9 = grade school, 12 = high school graduate, 14 = some college, 16 = college graduate, 17 = some graduate work, 18 = master's degree, 20 = Ph.D., M.D., or other advanced degree.

occupation

numeric variable coding occupation according to Hollingshead classification (reverse numbering).

rating

numeric variable coding self rating of marriage: 1 = very unhappy, 2 = somewhat unhappy, 3 = average, 4 = happier than average, 5 = very happy.

Source

Online complements to Greene (2003). Table F22.2.

http://pages.stern.nyu.edu/~wgreene/Text/tables/tablelist5.htm

In R Cmdr, I clicked on the following:

Statistics>Summaries>Active Dataset

The command produces the following syntax:

summary(Affairs)

and the following output:

The average number of affairs is 1.46. There are 315 women and 286 men in the data set. Average age is 32.5 years. The average number of years married is 8.2 years. 430 out of the 601 respondents had children. On a scale of 1 to 5, the mean religiousness equaled 3.1. The average years of schooling was 16.17 years.

Using an ordinal logit model, we ask: all else being equal, what is the impact of marital bliss on the propensity to have an affair?

In R Cmdr, I first typed the following in the script window:

fac.affairs <- factor(affairs)

I clicked on the following:

The following output is generated:

polr(formula = fac.affairs ~ age + children + gender + factor(religiousness) + factor(rating), data = Affairs, subset = affairs < 7, Hess = TRUE, method = "logistic")

Coefficients:

                                 Value Std. Error    t value
age                        -0.02587355 0.01669516 -1.5497639
children[T.yes]             0.78015980 0.35433369  2.2017658
gender[T.male]              0.52169948 0.27329031  1.9089571
factor(religiousness)[T.2] -1.06156173 0.49774279 -2.1327516
factor(religiousness)[T.3] -0.46820651 0.48288700 -0.9695985
factor(religiousness)[T.4] -0.99231507 0.47187055 -2.1029392
factor(religiousness)[T.5] -0.96170741 0.58225090 -1.6517062
factor(rating)[T.2]        -0.42747211 0.77127502 -0.5542408
factor(rating)[T.3]        -0.92664592 0.74809036 -1.2386818
factor(rating)[T.4]        -1.33084546 0.72230890 -1.8424880
factor(rating)[T.5]        -1.73421084 0.73402207 -2.3626140

Intercepts:

       Value Std. Error t value
0|1  -0.1773     0.9512 -0.1864
1|2   0.5975     0.9521  0.6276
2|3   1.2950     0.9603  1.3485
3|7  17.0637     0.9603 17.7690
7|12 17.9800     0.9603 18.7232

AIC: 561.8889

The model suggests that men are more likely to report an affair, the presence of children correlates with more affairs (the coefficient on children is positive), religious-minded individuals reported fewer affairs, and yes, those who reported that they were happily married were also less likely to have affairs.

R Data Analysis

Objects

Listing objects:

objects()

Removing objects:

rm(x,y)

Storing R session

Stored as .RData and .Rhistory

Summary tables

• Summarising a dummy variable

summary(walk.jun3$NSL)

• Without quantiles, by FACTOR

numSummary(Housing_VPT[,"total"], groups=Housing_VPT$nud, statistics=c("mean", "sd", "length"), quantiles=c(0,.25,.5,.75,1))

• With quantiles

numSummary(Housing_VPT[,"n.cars"], groups=Housing_VPT$attach.nbhd, statistics=c("mean", "sd", "quantiles"), quantiles=c(0,.25,.5,.75,1))

• More than one variable

numSummary(Housing_VPT[,c(".est.work", ".In.cars.1")], statistics=c("mean"), quantiles=c(0,.25,.5,.75,1))

• More than one variable by FACTOR

numSummary(Housing_VPT[,c(".est.work", ".In.cars.1")], groups=Housing_VPT$club.sport, statistics=c("mean", "sd", "quantiles"), quantiles=c(0,.25,.5,.75,1))

Aggregate

tapply(dep.var, list(cat.var1,cat.var2),mean)

tapply(TeachingRatings$eval, list(gender=TeachingRatings$gender, minority=TeachingRatings$minority), mean, na.rm=TRUE)

aggregate(variables, list(Urban=urban,Auto=auto.cat),mean,na.rm=T)

as.data.frame(aggregate(variables,

t(aggregate(variables, list(Urban=urban,Auto=auto.cat),mean,na.rm=T))

aggregate(walktrip$n.park, by=list(NHD=walktrip$nhd), FUN=c("count","sum","mean","median","sd","min","max"), na.rm=TRUE,length.warning=TRUE)

Writing your own functions

Often one needs more than just the mean or sd in the calculations, but functions such as tapply accept only one statistic at a time, e.g., mean. One can write a custom function to avoid this limitation. Consider a function that returns the mean, sd, min, and max together.

meansd <- function(x) { c(mean(x), sd(x), min(x), max(x)) }

meansd(x)

tapply(x, factor, meansd)
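As a minimal sketch of the idea above, the function can return a named vector so the output is easier to read. The data (x and g) are made up for illustration, and the split/sapply step is an alternative way to get a tidy matrix rather than something the notes prescribe.

```r
# Custom summary returning several named statistics at once
meansd <- function(x) {
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))
}

# Hypothetical example data: a numeric variable and a grouping factor
x <- c(2, 4, 6, 10, 20, 30)
g <- factor(c("a", "a", "a", "b", "b", "b"))

meansd(x)              # statistics for the whole vector
tapply(x, g, meansd)   # one vector of statistics per group

# Same computation arranged as a matrix: one row per group
stats <- t(sapply(split(x, g), meansd))
stats
```

Naming the elements inside c() carries the labels through to every table built from the function.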

Conditional transformation of a variable

Let's say we are interested in creating a variable subject to a condition, e.g., set the new variable to 1 if another variable crosses a threshold and to 0 otherwise. Here is how the code works in R for a new variable that classifies birth rates as low or high based on a threshold of 2.

Data set: demog

Existing variable: brate

New variable: brate.cat (0 if brate<2, and 1 otherwise)

demog$brate.cat <- 1

demog$brate.cat[demog$brate < 2] <- 0
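The same recode can be written in one step with ifelse(). This is a sketch on made-up data: the demog data frame below is a hypothetical stand-in with five invented birth rates, and the labelled factor brate.f is an optional extra.

```r
# Hypothetical data frame standing in for demog
demog <- data.frame(brate = c(1.2, 1.9, 2.0, 2.7, 3.4))

# 0 if brate < 2, and 1 otherwise
demog$brate.cat <- ifelse(demog$brate < 2, 0, 1)

# Optional: a labelled factor version of the same indicator
demog$brate.f <- factor(demog$brate.cat, levels = c(0, 1),
                        labels = c("low", "high"))
table(demog$brate.f)
```

Referring to the column as demog$brate avoids relying on attach().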

Cross Tabs

mytable <- table(walktrip$atelkey.maitre,walktrip$nhd) # A will be rows, B will be columns

#mytable # print table

margin.table(mytable, 1) # A frequencies (summed over B)

margin.table(mytable, 2) # B frequencies (summed over A)
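A small self-contained sketch of the cross-tab calls, using two made-up categorical vectors (A and B are hypothetical) so that the margins can be checked by eye:

```r
# Hypothetical data: two categorical variables
A <- c("walk", "car", "car", "walk", "bus", "car")
B <- c("north", "north", "south", "south", "north", "south")

mytable <- table(A, B)                 # A in rows, B in columns
row_tot <- margin.table(mytable, 1)    # totals for each level of A
col_tot <- margin.table(mytable, 2)    # totals for each level of B
row_prop <- prop.table(mytable, 1)     # row proportions: each row sums to 1

mytable
row_tot
col_tot
```

prop.table() with margin 1 or 2 gives row or column proportions from the same table object.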

Recoding characters into Numeric Factor

risk.profile is a factor variable with very long string definitions:

table(rmg$risk.profile)

You flip a fair coin. If head occurs, you receive $100 and if tail occurs, you receive nothing.   32
You receive $50 for sure.   167

rmg$risk.prone <- 1

rmg$risk.prone[rmg$risk.profile == "You receive $50 for sure."] <- 0

rmg$risk.prone <- factor(rmg$risk.prone, labels=c('no','yes'))

table(rmg$risk.prone)

2nd Approach

rmg$risk.3 <- 1

rmg$risk.3[rmg$risk.2 == "You r"] <- 0
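Both approaches can be tried on a tiny made-up data frame. In this sketch rmg is a hypothetical stand-in with three invented responses, and the use of substr() to match on the first few characters is an assumption about what the second approach intends.

```r
# Hypothetical data frame standing in for rmg
rmg <- data.frame(risk.profile = c(
  "You receive $50 for sure.",
  "You flip a fair coin. If head occurs, you receive $100 and if tail occurs, you receive nothing.",
  "You receive $50 for sure."
))

# 1 = chose the coin flip (risk prone), 0 = chose the sure payment
rmg$risk.prone <- ifelse(rmg$risk.profile == "You receive $50 for sure.", 0, 1)
rmg$risk.prone <- factor(rmg$risk.prone, levels = c(0, 1),
                         labels = c("no", "yes"))

# Alternative: match on the first few characters instead of the full string
rmg$risk.3 <- ifelse(substr(rmg$risk.profile, 1, 11) == "You receive", 0, 1)

table(rmg$risk.prone)
```

Giving factor() explicit levels keeps the labels correct even if one category happens to be absent from the data.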

Missing values

# recode 99 to missing for variable v1

# select rows where v1 is 99 and recode column v1

mydata[mydata$v1==99,"v1"] <- NA

Although it is probably easier to use replace():

test <- c(1, 1, 2, 1, 1, 8, 1, 2, 1, 10, 1, 8, 2, 1, 9, 1, 2, 9, 10, 1)

test

test <- replace(test, test == 8 | test == 9 | test == 10, NA)

test <- replace(test, test == 1, 0)

test <- replace(test, test == 2, 1)
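The two recoding styles give the same result, which a short sketch on made-up values can confirm. The vectors v1, a and b are hypothetical, and %in% is shown as a compact alternative to chaining several == comparisons.

```r
# Hypothetical survey item where 99 codes a missing answer
v1 <- c(3, 99, 5, 1, 99, 4)

# Option 1: logical indexing
a <- v1
a[a == 99] <- NA

# Option 2: replace() returns a modified copy
b <- replace(v1, v1 == 99, NA)

# Several missing-value codes at once with %in%
test <- c(1, 8, 2, 9, 10, 1)
test <- replace(test, test %in% c(8, 9, 10), NA)

# Summary statistics must then skip the NAs explicitly
avg <- mean(a, na.rm = TRUE)
```

Without na.rm = TRUE, mean() returns NA as soon as the vector contains a missing value.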

#specify the name and address of the remote file
datafilename <- "http://personality-project.org/r/datasets/maps.mixx.msq1.epi.bf.txt"
data <- read.table(datafilename,header=TRUE) #read the data file
msq <- data[,2:73] #select the subset of items in the MSQ
msq[msq=="9"] <- NA #change all occurrences of 9 to be missing values
msq <- data.frame(msq) #convert the input matrix into a data frame for easier manipulation
names(msq) #what are the variables?
summary(msq) #basic summary statistics -- check for miscodings
cleaned <- na.omit(msq) #remove the cases with missing values

Factor Analysis

f2 <- factanal(cleaned,2,rotation="varimax") #factor analyze the resulting items
#(f2) #show the result
load <- loadings(f2)
print(load,sort=TRUE,digits=2,cutoff=0.01) #show the loadings
plot(load) #plot factor 1 by 2
identify(load,labels=names(msq)) #put names of selected points onto the figure

Further Factor Analysis

?prcomp
summary(total)
library(Hmisc)

summary(cbind(dwelling.density, employment.with.5.km, street.density.length.area, housing.mix, pedestrian.connectivity, stlanedens, commerce.landuse))

summary(cbind(sidewalks, street.width.ft, pathways.m.1km, greenspace.m2.1km, retail.num.1km, net.to.sl, greenspace.per))

built <- na.exclude(cbind(sidewalks, street.width.ft, pathways.m.1km, greenspace.m2.1km, retail.num.1km, net.to.sl, greenspace.per))
pca1 <- prcomp(built,scale=TRUE)
pca1
summary(pca1)
plot(pca1,main="")
biplot(pca1)

walk.aug18$green.pca1 <- predict(pca1)[,1]
w4$f3.pca1 <- predict(pca1)[,3]
names(walk.aug18)
rcorr(cbind(f1.pca1, f2.pca1, f3.pca1))

# prcomp fails to deal with categorical data
w5 <- na.omit(w2)
pca2 <- prcomp(w5,scale=TRUE) # ERROR: 'x' must be numeric
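prcomp() requires numeric input, which is why a data frame containing factors triggers the error above. One possible workaround, sketched here on the iris data that ships with R (an assumption, since the walk data are not available), is to run the PCA on the numeric columns only.

```r
# iris has four numeric columns plus the factor Species
num_cols <- sapply(iris, is.numeric)

# PCA on the numeric columns only; scale. = TRUE standardises each variable
pca <- prcomp(iris[, num_cols], scale. = TRUE)
summary(pca)

# Scores on the first component, one value per row of the data
pc1 <- predict(pca)[, 1]
```

Whether dropping the categorical variables is acceptable depends on the analysis; recoding them as dummy variables is another option.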

# FACTOR ANALYSIS
?factanal
walk.sep14 <- na.exclude(walk.aug18)
fac2 <- factanal(x=built,factors=2,scores="regression",na.action=na.exclude)
fac2
print(fac2, digits=2, cutoff=.3, sort=TRUE)
walk.aug18$fac.green <- fac2$scores[,1]
walk.aug18$fac.green <- napredict(na.act,walk.aug18)
?as.data.frame
?na.omit
loadings(fac1)[,1]
fac.w$f1 <- fac3$scores[,1]
fit <- factanal(fac.w, 2, rotation="varimax")
print(fit, digits=2, cutoff=.3, sort=TRUE)

Modelling Categorical Dependent Variables

The following models are covered:

1. Binary Logit
2. Multinomial Logit
3. Conditional Logit
4. Nested Logit
5. Mixed Logit
6. Grouped Logit
7. Probit Models

Logistic Regression and Exponential Coefficients

# where F is a binary factor and
# x1-x3 are continuous predictors
fit <- glm(F~x1+x2+x3,data=mydata,family=binomial())
summary(fit) # display results
names(fit)
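A runnable sketch of the same pattern, using the mtcars data that ships with R in place of mydata. The choice of am as the binary outcome and of wt and hp as predictors is purely illustrative.

```r
# mtcars ships with R; am is a 0/1 indicator (manual transmission)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())
summary(fit)

# Exponentiated coefficients can be read as odds ratios
odds_ratios <- exp(coef(fit))

# Predicted probabilities for the observed data
p_hat <- predict(fit, type = "response")
```

An odds ratio above 1 means the odds of the outcome rise with the predictor, and below 1 that they fall.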

Conditional Logit

Using package mlogit

load("H:/Research/Projects/Workshop/Logit/R/Hensher.rda")
data <- mlogit.data(h09,choice="mc",shape="long",id.var="id",alt.levels=c("air","train","bus","car"))
summary(mod1 <- mlogit(mc~invc+invt|hinc,data,reflevel="car"))
summary(mod2 <- mlogit(mc~invc+invt|hinc,data,reflevel="train"))

Forecasts

fcast <- fitted(mod2) ### Forecasts only the dominant mode
fcast.w <- fitted(mod2,outcome=FALSE)
colMeans(fcast.w)
colMeans(fitted(mod2,outcome=FALSE))
h09b <- subset(h09, subset=mc=="yes")
fcastw <- data.frame(cbind(h09b$alt,fcast.w))
fcastw$mode <- factor(fcastw$V1, labels=c('air','bus','car','train'))
colMeans(fcastw[2:5])

With Zelig

setwd("H:/Research/Projects/Workshop/Logit/R/CLogit")
load("H:/Research/Projects/Workshop/Logit/R/CLogit/h09.rda")
attach(h09)

h09$t <- 2 - mc
h09$choice <- 2 - h09$t

h09$alt <- factor(h09$alt, labels=c('air','bus','car','train'))
h09$mc <- factor(h09$mc, labels=c('no','yes'))

# Where mode is 1 if chosen and 0 otherwise.

library(mlogit)
library(Zelig)
names(h09)
table(mc,alt)

z1 <- zelig(Surv(t,choice) ~ invt+twait+gc+aasc+tasc+basc+hinca+strata(id), model = "coxph", data = h09, na.action=na.exclude)

z2 <- zelig(Surv(t,choice) ~ invt+twait+gc+hinc*alt+strata(id), model = "coxph", data = h09, na.action=na.exclude)

summary(z2)

# this gets the forecasts:
h09$fexpect <- predict(z2,type="expected")

# Not sure about the following
h09$flp <- predict(z2,type="lp")
h09$frisk <- predict(z2,type="risk")
h09$fterms <- predict(z2,type="terms")

numSummary(h09[,c("flp", "fexpect")], groups=h09$alt, statistics=c("mean", "sd"), quantiles=c(0,.25,.5,.75,1))

> numSummary(h09[,"fexpect2"], groups=h09$alt, statistics=c("mean", "sd", "quantiles"), quantiles=c(0,.25,.5,.75,1))

        mean     sd        0%     25%    50%    75%   100%   n
air   0.2762 0.2979 0.0009807 0.05426 0.1344 0.4561 0.9833 210
bus   0.1429 0.1989 0.0007001 0.02740 0.0616 0.1801 0.8860 210
car   0.2810 0.2313 0.0080268 0.07236 0.2315 0.4758 0.8549 210
train 0.3000 0.2982 0.0005396 0.06754 0.1879 0.4400 0.9832 210

. tab alt,sum( fcastnew)

Grouped Logit

Same as glogit and blogit in Stata

data("womensrole", package = "HSAUR")

summary(mod1 <- glm(cbind(agree, disagree) ~ sex + education, data = womensrole,family = binomial(),trace=T))

or

fm1 <- cbind(agree, disagree) ~ sex + education

womensrole_glm_1 <- glm(fm1, data = womensrole,family = binomial())
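The grouped logit relies on a two-column response, cbind(successes, failures), with one row per group. The sketch below uses a small made-up data frame (grp and its counts are hypothetical) so that it runs without the HSAUR package.

```r
# Hypothetical grouped data: one row per group, with counts of each outcome
grp <- data.frame(
  education = c(0, 4, 8, 12, 16, 20),
  agree     = c(40, 35, 30, 20, 12, 5),
  disagree  = c(10, 15, 20, 30, 38, 45)
)

# cbind(successes, failures) on the left-hand side makes this a grouped logit
mod <- glm(cbind(agree, disagree) ~ education, data = grp,
           family = binomial())
summary(mod)

# Fitted values are the predicted proportion agreeing in each group
p_agree <- fitted(mod)
```

Each row is weighted by its total count, so larger groups influence the estimates more.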

Beta reg

Description

Beta regression for modeling beta-distributed dependent variables, e.g., rates and proportions. Estimation is performed by maximum likelihood (ML) via optim using analytical gradients and (by default) starting values from an auxiliary linear regression of the transformed response.

betareg(formula, data, subset, na.action, weights, offset,

link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"), link.phi = NULL, control = betareg.control(...),

model = TRUE, y = TRUE, x = FALSE, ...)

betareg.fit(x, y, z = NULL, weights = NULL, offset = NULL, link = "logit", link.phi = "log", control = betareg.control())

Model estimates and coefficients

confint(fit) # 95% CI for the coefficients
exp(coef(fit)) # exponentiated coefficients
exp(confint(fit)) # 95% CI for exponentiated coefficients
predict(fit, type="response") # predicted values
residuals(fit, type="deviance") # residuals

library(AER)

coeftest(MLM.1)

T‐Test for NHD

Create subsets so that the variable includes only two categories.

Mon <- Housing_VPT[Housing_VPT$city=="Montreal", ]

Mck <- Housing_VPT[Housing_VPT$nhd %in% c("MT","ML"),]

t.test(total~nhd, alternative='two.sided', conf.level=.95, var.equal=FALSE, data=Mck)
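When selecting two categories, %in% is the safer operator: comparing a column with == against a two-element vector recycles that vector along the column and silently drops rows. The sketch below shows the difference on made-up data (housing is a hypothetical stand-in for Housing_VPT) and then runs the t-test.

```r
# Hypothetical data standing in for Housing_VPT
housing <- data.frame(
  nhd   = c("ML", "MT", "ML", "MT", "XX", "ML", "MT", "XX"),
  total = c(10, 14, 11, 15, 30, 9, 16, 28)
)

# == recycles c("MT", "ML") along the column and misses most matching rows
wrong <- housing[housing$nhd == c("MT", "ML"), ]

# %in% tests membership row by row
mck <- housing[housing$nhd %in% c("MT", "ML"), ]

c(nrow(wrong), nrow(mck))   # 2 rows versus 6 rows

# Welch two-sample t-test on the two remaining categories
tt <- t.test(total ~ nhd, data = mck, alternative = "two.sided",
             conf.level = 0.95, var.equal = FALSE)
tt
```

The == version raises no error here because the column length is a multiple of two, which makes the mistake easy to overlook.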

Data Input

Reading data

hs1 <- read.table("http://www.ats.ucla.edu/stat/R/notes/hs1.csv", header=T, sep=",")

attach(hs1)

Keeping certain variables or records

Keeping only the observations where the reading score is 60 or higher.

hs1.read.well <- hs1[read >= 60, ]

Keeping only the variables female, id, read and write. We use the column indices corresponding to these four variables to indicate that we want only these variables in the new data frame called hs1.kept. We use the names function again to verify that hs1.kept consists of only the four variables that we wanted to keep.

names(hs1.read.well)

[1] "female" "id" "race" "ses" "schtyp" "prgtype" "read"

[8] "write" "math" "science" "socst" "prog"

hs1.kept <- hs1.read.well[ , c(1, 2, 7, 8)]

names(hs1.kept)

[1] "female" "id" "read" "write"

Dropping the variables ses and prog from the hs1.read.well data frame by using the column indices corresponding to these two variables with a negative sign.

names(hs1.read.well)

[1] "female" "id" "race" "ses" "schtyp" "prgtype" "read"

[8] "write" "math" "science" "socst" "prog"

hs1.drop <- hs1.read.well[ , -c(4, 12)]

names(hs1.drop)

[1] "female" "id" "race" "schtyp" "prgtype" "read" "write"

[8] "math" "science" "socst"
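Column positions can shift when a data set changes, so selecting by name is often more robust than selecting by index. A minimal sketch on a made-up data frame (hs is hypothetical and borrows only a few of the hs1 column names):

```r
# Hypothetical data frame with a few of the hs1 column names
hs <- data.frame(id = 1:3, female = c(0, 1, 1), ses = c(1, 2, 3),
                 read = c(57, 68, 44), write = c(52, 59, 33))

# Keep columns by name
kept <- hs[, c("id", "female", "read", "write")]

# Drop columns by name: keep everything whose name is not in the list
dropped <- hs[, !(names(hs) %in% c("ses", "write"))]

names(kept)
names(dropped)
```

The negative-index form used above works only with numeric positions, not with names, which is why the name-based drop uses %in%.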

Merge

The merge function allows us to merge two data frames on a variable (or a list of variables). In this case the variable in common is id which has the same name in both data sets. Specifying T in the all argument indicates that we want to keep all the observations from each data set rather than only keeping the observations that came from both data sets.

hsdiss <- merge(hstest, hsdem, by="id", all=T)

If the variable that we were merging on had different names in each data frame then we could use the by.x and by.y arguments. In the by.x argument we would list the name of the variable(s) that was in the data frame listed first in the merge function (in this case in hstest) and in the by.y argument we would name the variable(s) that was in the data frame listed second (in this case hsdem).
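A short sketch of by.x and by.y on two made-up data frames. The contents are hypothetical, and the key is deliberately called id in one and student in the other.

```r
# Hypothetical data frames whose key columns have different names
hstest <- data.frame(id = c(1, 2, 3), read = c(57, 68, 44))
hsdem  <- data.frame(student = c(2, 3, 4), female = c(1, 0, 1))

# by.x names the key in the first data frame, by.y the key in the second
merged <- merge(hstest, hsdem, by.x = "id", by.y = "student", all = TRUE)
merged
```

The merged key column takes its name from the first data frame, and rows found in only one source get NA in the other source's columns.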

Creating an indicator of which data set the observations came from is a little more complicated. We would first create an indicator variable called from, in each data frame to be merged. Then we merge the two data sets. Finally, we create a variable both which would indicate which data frame or both the observation came from. It is generally easier to note that when a data frame did not contribute to the observation in the combined data frame then the variables from that data frame will have missing values (NA's) for that observation.

from <- data.frame(rep(1, length(hsdem$id)))
dimnames(from)[[2]] <- "from"
hsdem.1 <- cbind(hsdem, from)

from <- data.frame(rep(1, length(hstest$id)))
dimnames(from)[[2]] <- "from"
hstest.1 <- cbind(hstest, from)

hsdiss.2 <- merge(hstest.1, hsdem.1, by.x="id", by.y="id", all=T, suffixes=c("test", "dem"))
attach(hsdiss.2)

hsdiss.2$both[!is.na(fromtest) & !is.na(fromdem)] <- "both"
hsdiss.2$both[is.na(fromtest)] <- "dem"
hsdiss.2$both[is.na(fromdem)] <- "test"
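The same origin indicator can be built a little more compactly by adding a flag column to each data frame before merging. This is a sketch on made-up data; the flag names in.test and in.dem and the column source are hypothetical.

```r
# Hypothetical data frames sharing the key id
hstest <- data.frame(id = c(1, 2, 3), read = c(57, 68, 44))
hsdem  <- data.frame(id = c(2, 3, 4), female = c(1, 0, 1))

# Flag each source before merging
hstest$in.test <- TRUE
hsdem$in.dem   <- TRUE

both <- merge(hstest, hsdem, by = "id", all = TRUE)

# After an outer merge, a missing flag means that source did not contribute
both$source <- ifelse(is.na(both$in.test), "dem",
               ifelse(is.na(both$in.dem), "test", "both"))
both[, c("id", "source")]
```

Using distinct flag names avoids the need for suffixes and for attach().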

String AS FACTORS

Correlation Analysis

rcorr (Hmisc package)

Matrix of Correlations and P‐values

rcorr(cbind(x,y,z,v))

Panel Data and Robust Standard Errors

### TEACHING EVALUATION OF PROFESSORS, Jan 24, 2010

lm.1 <- lm(eval ~ beauty + gender + minority + native + tenure + division + credits, data=TeachingRatings)

summary(lm.1)

### Weighted LEAST SQUARES

lm.2 <- lm(eval ~ beauty + gender + minority + native + tenure + division + credits, weights=students,data=TeachingRatings)

summary(lm.2)

###PANEL DATA MODELS

library(plm)

### Convert variable from factor to integer and sort data
tr.clean$prof.2 <- with(tr.clean, as.integer(prof))
attach(tr.clean)
tr.sort2 <- tr.clean[order(prof.2),]
attach(tr.sort2)

### Declare data as Panel data
tr.3 <- plm.data(tr.sort2,index=c("prof.2"))
attach(tr.3)

cbind(coef(plm.pool),coef(plm.rand),coef(lm.2))

### ROBUST STANDARD ERRORS

### The following didn't work.

summary(lm.2)

coeftest(lm.2, vcov = vcovHC)

coeftest(lm.2, vcov = vcovHAC)

###DESIGN LIBRARY

library(Design)

### Use OLS
lm.3 <- ols(eval ~ beauty + gender + minority + native + tenure + division + credits, weights=students, data=TeachingRatings, x=TRUE, y=TRUE)

### Doesn't work --> summary(lm.3)
lm.3
robcov(lm.3, prof.2)

### Other options
robcov(lm.3, prof.2, method="efron")
adjfit <- robcov(lm.6, prof.2)
sqrt(diag(adjfit$var))

Weighted Arithmetic Mean

Compute a weighted mean of a numeric vector.

Usage

weighted.mean(x, w, na.rm = FALSE)

Arguments

x

a numeric vector containing the values whose mean is to be computed.

w

a vector of weights the same length as x giving the weights to use for each element of x.

na.rm

a logical value indicating whether NA values in x should be stripped before the computation proceeds.

Missing values in w are not handled.

Examples

## GPA from Siegel 1994
wt <- c(5, 5, 4, 1)/15
x <- c(3.7,3.3,3.5,2.8)
xm <- weighted.mean(x,wt)
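The weighted mean is simply sum(w * x) / sum(w), which can be checked against the example above. In this sketch the names xm_manual and xm_raw are introduced only for the comparison.

```r
wt <- c(5, 5, 4, 1) / 15      # weights from the example above
x  <- c(3.7, 3.3, 3.5, 2.8)

xm <- weighted.mean(x, wt)

# The same quantity computed by hand
xm_manual <- sum(wt * x) / sum(wt)

# Weights need not sum to one; unscaled weights give the same answer
xm_raw <- weighted.mean(x, c(5, 5, 4, 1))

c(xm, xm_manual, xm_raw)
```

Because the weights are divided by their sum, rescaling them leaves the weighted mean unchanged.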

Weighted Cross tabs using Koppelman Intercity example

If the weights are available for each observation as a variable(wt):

.Table <- xtabs(wt~altnum+type, data=via.tot, subset=choice==1)
.Table
colPercents(.Table) # Column Percentages
.Test <- chisq.test(.Table, correct=FALSE)
.Test
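Putting the weight variable on the left-hand side of the xtabs formula makes each cell the sum of the weights rather than a count of rows. A minimal sketch on made-up data follows: via is a hypothetical stand-in for via.tot, and base R's prop.table() is used in place of colPercents(), which comes with R Commander.

```r
# Hypothetical data standing in for via.tot
via <- data.frame(
  altnum = c("car", "car", "train", "train", "air"),
  type   = c("business", "leisure", "business", "leisure", "business"),
  wt     = c(1.5, 0.5, 2.0, 1.0, 3.0)
)

# Each cell holds the sum of wt, not the number of rows
wtab <- xtabs(wt ~ altnum + type, data = via)
wtab

# Column percentages using base R
col_pct <- 100 * prop.table(wtab, margin = 2)
col_pct
```

The grand total of the table equals the sum of the weights, which is a quick check that the weighting was applied.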

Estimate Table

Uses the data stored in the "ccl" object to create a formatted table. The default is LaTeX but since version 0.5 export to CSV is possible. Therefore it is possible to import the output into a spreadsheet program and edit it for a wordprocessor.

Usage

esttab(t.value = FALSE, p.value = FALSE, round.dec = 3, caption = NULL, label = NULL, sig.levels = c(0.1, 0.05, 0.01), sig.sym=c("*","**","***"), filename=NULL, csv=FALSE, dcolumn=NULL, table="table",

table.pos="htbp", caption.top=FALSE, booktabs=FALSE, var.order=NULL, sub.sections=NULL,var.rename=NULL)

Arguments

t.value

if set to TRUE the table will contain t-values instead of the default standard errors

p.value

if set to TRUE the table will contain p-values instead of the default standard errors

round.dec

number of decimals to round to

caption

to be used in the LaTeX output table

label

to be used in the LaTeX output table

sig.levels

to change the way the stars are calculated. The values must be given as a vector from largest to smallest p-value

sig.sym

vector of symbols to depict significance levels in TeX tables. Insert the TeX command between the "". The vector corresponds to the sig.levels vector. Please note that due to the cat command the backslash needs to be inserted twice in order to appear in the TeX document (e.g. "\\alpha").

filename

determines the filename of the output. Default is NULL, in which case output is printed to screen.

csv

for output to CSV (comma-separated text file) for direct import into a spreadsheet program. The default is TeX output.

dcolumn

a string can be inserted that corresponds to a predefined column type in the TeX document's head.

table

a string for choosing a different table type like sideways or tablex.

parameters insert 'NULL'.

caption.top

if set to TRUE the caption will be inserted above the table.

booktabs

if set to TRUE the \hline commands are replaced by their corresponding booktabs commands.

var.order

by default the order of variables is determined by their appearance in the models. Providing a vector of variables here in a different order will change the order of variables in the output table. Note that '(Intercept)' is enclosed in parentheses.

sub.sections

if one needs to subdivide the table into several sections using 'subtitles', this can be done here by providing a vector of the form c(linenumber,"subtitle",2ndlinenumber,"2ndsubtitle") and so forth.

var.rename

vector of names to replace variable abbreviations of the model with real names. The vector has to look like this:

var.rename=c("old.name1","new.name1","old.name2","new.name2")

Example

names(swm)

lm.10 <- lm(ln.waste.g ~ nhd + hhld.members + kids.bin + ln.area + own + grade12.plus, data=swm, na.action=na.omit)

summary(lm.10)

lm.11 <- lm(ln.waste.g ~ nhd * hhld.members + kids.bin + ln.area + own + grade12.plus, data=swm, na.action=na.omit)

summary(lm.11)

estclear()

eststo(lm.10); eststo(lm.11)

esttab(t.value=TRUE,round.dec=3,csv=T)

For Standard errors:

esttab(t.value=F,round.dec=3,csv=T)

                                  ln.waste.g   ln.waste.g
(Intercept)                       4.145***     4.261***
                                  [8.136]      [8.046]
nhd[T.Naseerabad]                 0.591***     0.678*
                                  [5.003]      [1.796]
nhd[T.MC]                         0.771***     0.667
                                  [5.285]      [1.291]
nhd[T.PIA Colony]                 0.236*       -0.064
                                  [1.949]      [-0.168]
nhd[T.Nisar Road]                 1.212***     0.96***
                                  [8.454]      [2.75]
nhd[T.Valley Road]                1.065***     1.339***
                                  [6.912]      [3.698]
hhld.members                      0.142***     0.134***
                                  [8.698]      [3.425]
kids.bin[T.yes]                   -0.045       -0.042
                                  [-0.604]     [-0.558]
ln.area                           0.161**      0.152**
                                  [2.307]      [2.144]
own[T.yes]                        -0.101       -0.119
                                  [-1.219]     [-1.388]
grade12.plus                      0.02         0.014
                                  [0.261]      [0.179]
nhd[T.Naseerabad]:hhld.members                 -0.01
                                               [-0.191]
nhd[T.MC]:hhld.members                         0.02
                                               [0.236]
nhd[T.PIA Colony]:hhld.members                 0.047
                                               [0.838]
nhd[T.Nisar Road]:hhld.members                 0.044
                                               [0.877]
nhd[T.Valley Road]:hhld.members                -0.043
                                               [-0.834]
R^2                               0.254        0.258
adj.R^2                           0.248        0.248
N                                 1092         1092

t-values in brackets

Description Table

Description

Uses the data stored in the "dcl" object to create a standard formatted table. The default is LaTeX; export to CSV is optional. It is therefore possible to import the output into a spreadsheet program and edit it for a word processor.

Usage

desctab(filename=NULL,caption = NULL, label = NULL,csv=FALSE, dcolumn=NULL,booktabs=FALSE)

descsto(swm)
