2. Regression and Correlation. Simple Linear Regression Software: R


Computer Illustration R Simple Linear Regression

Create txt file from SAS data set

data _null_;

file 'C:\Documents and Settings\sphlab\Desktop\slr1.txt';

set temp;

put day :date7. calls fhigh flow high low rain snow weekday

year sunday subzero;

run;

##### Before reading the text file into R, delete the dots at the beginning of each line #####

1. Read in data from text file

data<-read.table("C:/Documents and Settings/liyuan/Desktop/640TA/slr2.txt",header=TRUE)
attach(data)
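The read-in step can also be done without attach(). The following is a minimal sketch, not the handout's own method: it writes three rows (day, calls and low values copied from the data listing in this handout) to a temporary file, which is a hypothetical stand-in for slr2.txt, and reads them back with read.table. The object names tmp and dat are assumptions made for illustration.

```r
# Hypothetical three-row stand-in for slr2.txt, written to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("day calls low",
             "12069 2298 31",
             "12070 1709 30",
             "12071 2395 24"), tmp)

# header = TRUE tells read.table that the first line holds the column names.
dat <- read.table(tmp, header = TRUE)

# Columns can be reached without attach() by qualifying them with the data frame.
str(dat)
mean_calls <- mean(dat$calls)
mean_low   <- with(dat, mean(low))
```

Qualifying columns with dat$ or with() avoids name clashes between attached columns and other objects in the workspace.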

2. Partial listing of data

list(data)
[[1]]
     day calls fhigh flow high low rain snow weekday year sunday subzero
1  12069  2298    38   31   39  31    0    0       0    0      0       0
2  12070  1709    41   27   41  30    0    0       0    0      1       0
3  12071  2395    33   26   38  24    0    0       0    0      0       0
4  12072  2486    29   19   36  21    0    0       1    0      0       0
5  12073  1849    40   19   43  27    0    0       1    0      0       0
6  12074  1842    44   30   43  29    0    0       1    0      0       0
7  12075  2100    46   40   53  41    1    0       1    0      0       0
8  12076  1752    47   35   46  40    0    0       0    0      0       0
9  12077  1776    53   34   55  38    1    0       0    0      1       0
10 12078  1812    38   32   43  31    0    0       1    0      0       0
11 12079  1842    35   21   35  25    0    0       1    0      0       0
12 12080  1674    39   27   44  31    1    1       1    0      0       0
13 12081  1692    34   28   40  27    0    0       1    0      0       0

3. Plot of calls over time

par(mfrow=c(2,2))

plot(day,calls, xlim=c(12000,12500), ylim=c(1000,9000), xlab="Day",ylab="Calls", main="Calls to NY Auto Club 1993-1994",col="black")


4. Tests of Assumption of Normality on Y=calls

> mean(calls)
[1] 4318.75
> length(calls)
[1] 28
> sum(calls)
[1] 120925
> var(calls)
[1] 7249901
> sum(calls^2) ##uncorrected ss##
[1] 717992159
> sum(((calls-mean(calls))^2) ) ##corrected ss##
[1] 195747315
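The summary numbers above are tied together by two identities: corrected SS = uncorrected SS - (sum)^2/n, and variance = corrected SS/(n - 1). As a quick check, the sketch below plugs in the printed values; the variable names n, s, uss, css and v are introduced here only for illustration.

```r
# Values copied from the console output above.
n   <- 28          # length(calls)
s   <- 120925      # sum(calls)
uss <- 717992159   # sum(calls^2), the uncorrected sum of squares

# Corrected SS = uncorrected SS - (sum)^2 / n
css <- uss - s^2 / n

# Sample variance = corrected SS / (n - 1)
v <- css / (n - 1)

c(mean = s / n, css = css, var = v)
```

Up to rounding in the printed output, these reproduce the mean, corrected sum of squares and variance reported by R.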


########## the package "fBasics" should be installed and loaded first for the following functions ##########
> skewness(calls)
[1] 0.4307614
attr(,"method")
[1] "moment"
> kurtosis(calls)
[1] -1.497417
attr(,"method")
[1] "excess"

###### the package "nortest" should be installed and loaded first for cvm.test and ad.test; shapiro.test is in "stats", which ships with R ######
> shapiro.test(calls)

Shapiro-Wilk normality test
data: calls
W = 0.829, p-value = 0.0003628

> cvm.test(calls)

Cramer-von Mises normality test
data: calls
W = 0.3112, p-value = 0.0002141

> ad.test(calls)

Anderson-Darling normality test
data: calls
A = 1.8673, p-value = 6.68e-05
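For readers without the add-on packages, moment-based skewness and excess kurtosis can be sketched in base R. This is one common definition and an assumption on our part; the packaged functions may scale by a slightly different denominator, so results need not match to the last digit. The function names and the toy data are hypothetical.

```r
# Moment-based skewness and excess kurtosis in base R (one common definition;
# packaged functions may use a slightly different denominator).
moment_skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}
moment_excess_kurtosis <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^4) - 3   # subtract 3 so that a normal distribution scores 0
}

x <- c(1, 2, 3, 4, 5)        # made-up symmetric data
moment_skewness(x)           # symmetric data gives (numerically) zero skewness
moment_excess_kurtosis(x)

# shapiro.test() is in the stats package that ships with R.
set.seed(1)
sw <- shapiro.test(rnorm(50))   # simulated normal sample, not the calls data
sw$statistic
sw$p.value
```

A small p-value from shapiro.test, as seen for calls above, is evidence against normality; the simulated sample here is only a stand-in to show the call.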

5. Graphical Assessments of Normality of Y=calls

Histogram with overlay normal

hist(calls,col='lightblue', main='Histogram of calls', breaks=5, include.lowest = TRUE, right = TRUE,freq=F)
points(calls,dnorm(calls,mean=mean(calls),sd=sqrt(var(calls))),col='red',lty=6)


Quantile-Quantile Plot

qqnorm(calls,datax=TRUE, main="Simple Normal QQplot for Y=calls", ylab="Calls", xlab="Normal quantiles")
qqline(calls,datax=TRUE)


qqnorm(calls,datax=TRUE, main="Simple Normal QQplot for Y=calls")
qqline(calls,datax=TRUE)

[Figure: Simple Normal QQplot for Y=calls. Horizontal axis: Sample Quantiles (2000 to 8000); vertical axis: Theoretical Quantiles (-2 to 2).]

6. Scatterplot of Y=calls vs X=low

calls0<-calls[year==0]
calls1<-calls[year==1]
low0<-low[year==0]
low1<-low[year==1]

plot(low0,calls0, main="Calls to NY Auto Club 1993-1994",xlim=c(-10,50),ylim=c(1000,9000), xlab="Low", ylab="Calls", col="green")

points(low1,calls1, col="red")


7. Least Squares Estimation and Analysis of Variance Table

lm1<-lm(calls~low)
summary(lm1)
coef(lm1)
anova(lm1)

Call:
lm(formula = calls ~ low)

Residuals:
    Min      1Q  Median      3Q     Max
-3112.1 -1467.6  -214.0  1143.9  3587.9

Parameter Estimates

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  7475.85     704.63  10.610 6.10e-11 ***
low          -145.15      27.79  -5.223 1.86e-05 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1917 on 26 degrees of freedom
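Each t value in the coefficient table is the estimate divided by its standard error, and the p-value is the two-sided tail area of a t distribution with the residual degrees of freedom. A minimal sketch using the printed numbers follows; the vector names est, se, t_val and p_val are introduced here for illustration only.

```r
# Estimates and standard errors copied from the coefficient table above.
est <- c(intercept = 7475.85, low = -145.15)
se  <- c(intercept = 704.63,  low = 27.79)
df  <- 26   # residual degrees of freedom (n - 2 = 28 - 2)

t_val <- est / se                       # t statistic for H0: coefficient = 0
p_val <- 2 * pt(-abs(t_val), df = df)   # two-sided p-value

round(t_val, 3)
signif(p_val, 3)
```

Because the inputs are rounded as printed, the recomputed values agree with the table only up to rounding.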


8. Overlay of straight line fit onto scatterplot of Y=calls vs X=low

abline(lm1)

9. Residuals analysis-Assessment of Normality of Residuals

qqnorm(lm1$residuals, main="Normality of Residuals Y=CALLS v X=LOW")


10. Residuals Analysis - Detection of Outliers Using Cook's Distance

plot(lm1,which=4, main="Cook's Distance Values for Straight Line Y=Calls v X=Low")

Diag<-ls.diag(lm1)

plot(lm1$fitted,Diag$stud.res,ylim=c(-2.0,2.5),xlab="Predicted Value",ylab="Studentized Residual",main="Jackknife Residuals versus Predicted")

abline(h=0,lty=c(3))

[Figure: Jackknife Residuals versus Predicted. Vertical axis: Studentized Residual (-1 to 2).]
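The same diagnostics can also be obtained from the extractor functions rstudent() and cooks.distance() in the stats package. The sketch below is an alternative to the ls.diag approach, not the handout's method. It uses the built-in cars data as a stand-in because the auto club data are not reproduced in full here, and the 4/n cutoff for Cook's distance is a common rule of thumb that we assume for illustration.

```r
# Built-in 'cars' data used as a stand-in; it is not the auto club data.
fit <- lm(dist ~ speed, data = cars)

jack <- rstudent(fit)         # externally studentized (jackknife) residuals
cook <- cooks.distance(fit)   # Cook's distance for each observation

# Flag observations with a rule of thumb, Cook's D > 4/n (an assumption).
n <- nrow(cars)
flagged <- which(cook > 4 / n)

plot(fitted(fit), jack,
     xlab = "Predicted Value", ylab = "Studentized Residual",
     main = "Jackknife Residuals versus Predicted")
abline(h = 0, lty = 3)
```

Observations listed in flagged are candidates for a closer look, not automatic deletions.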
