QPM Lab 2:
Data Visualization & Descriptive Statistics in R and R Commander
Viktoryia Schnose & Betul Demirkaya
Department of Political Science
Washington University, St. Louis
September 4, 2013
A Reminder From Last Class
1. Open R.
2. Write the following in the command line:
   library(Rcmdr)
3. This way, you open the package from the R library. It does not open automatically.
4. A new window should have opened. This is R Commander.
5. Go to the website for QPM.
6. Open the Class Datasets.
Goals for the Class
• Reading a dataset & viewing the dataset with R Commander
• Re-coding a variable to a factor
• Descriptive statistics with R Commander
• Bar graphs with R Commander
Reading Data with R: Answering Two Questions
1. What type of file do you have?
   • .csv = comma-separated values
   • .txt = text file
   • .dta = Stata file
   • .sav = SPSS file
2. Where is the file?
   • Saved on your computer
   • Online, at a URL
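The two questions map onto R's reading functions roughly as follows. This is a minimal sketch, not the course's own code: the file names and values are made up for illustration, and the files are written to a temporary folder so the example runs on its own. For .dta and SPSS files, the foreign package provides read.dta() and read.spss(), which are not shown here.

```r
# Create two small illustrative files (made-up values, not the course data).
toy <- data.frame(id = 1:4, vote = c(1, 0, 1, 1))
csv_path <- file.path(tempdir(), "toy.csv")
txt_path <- file.path(tempdir(), "toy.txt")
write.csv(toy, csv_path, row.names = FALSE)
write.table(toy, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Question 1 (file type) picks the function;
# question 2 (location) gives the path or URL passed to it.
from_csv <- read.csv(csv_path)                               # .csv file
from_txt <- read.table(txt_path, header = TRUE, sep = "\t")  # tab-delimited .txt file

# Inspect what was read.
str(from_csv)
str(from_txt)
```

In R Commander, the menus generate commands of this kind for you in the Script Window.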
Reading Data with R: .csv or .txt
STEP 1: What type of file are we using for the homework?
Reading Data with R: .csv or .txt
STEP 2: Where is the file for the dataset for this week’s HW?
See Appendix for how to enter the URL.
Brazil Dataset: Viewing the Dataset
[Screenshot of the dataset viewer. Callouts: active dataset, variable name, observation number, "Click here".]
Brazil Dataset: Converting vote variable into a factor = qualitative variable
Brazil Dataset: Converting vote variable into a factor = qualitative variable II
Select the vote variable.
Brazil Dataset: Converting vote variable into a factor = qualitative variable III
Specify what each numeric value means (read the codebook).
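In R code, the conversion might look like the sketch below. The data values are made up, and the coding 0 = "no", 1 = "yes" is an assumption for illustration only; the codebook gives the actual meaning of each number. The data frame name brasil follows the code in the Appendix.

```r
# Illustrative data only; the real Brazil dataset has many more rows.
brasil <- data.frame(vote = c(1, 0, 1, 1, 0, 1))

# ASSUMPTION: 0 = "no", 1 = "yes". Check the codebook for the actual coding.
brasil$vote <- factor(brasil$vote, levels = c(0, 1), labels = c("no", "yes"))

levels(brasil$vote)     # the category names
is.factor(brasil$vote)  # confirms the variable is now a factor
```

Once vote is a factor, R treats it as a qualitative variable, so summaries report counts per category instead of a mean.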
Brazil Dataset: Frequency Distribution for vote variable
Statistics → Summaries → Frequency distributions
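The menu path above produces a table of counts and percentages. A minimal sketch of the same computation in plain R, using made-up values in place of the Brazil data:

```r
# Illustrative factor (made-up values, not the course data).
vote <- factor(c("yes", "no", "yes", "yes", "no", "yes"),
               levels = c("no", "yes"))

counts <- table(vote)                              # observations per category
percentages <- round(100 * prop.table(counts), 2)  # share of each category, in percent

counts
percentages
```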
Brazil Dataset: Bar Graph for vote variable (frequency)
Graphs → Bar Graph → vote
[Bar graph: x-axis "vote" with categories no and yes; y-axis "Frequency", from 0 to 800.]
See Appendix for another way of presenting the same data.
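For comparison with the menu route, a frequency bar graph can be sketched in plain R as follows. The values are made up for illustration, so the bar heights will not match the Brazil data.

```r
# Illustrative factor (made-up values, not the course data).
vote <- factor(c("yes", "no", "yes", "yes", "no", "yes"),
               levels = c("no", "yes"))

# barplot() draws one bar per category; it also returns the bar midpoints.
mids <- barplot(table(vote), xlab = "vote", ylab = "Frequency")
```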
In class assignment
• Break into groups of 3 or 4 individuals.
• Fill out the sheet. ASK QUESTIONS.
• If you finish early, try to draw a bar graph for the vote variable that shows probabilities. See Appendix for help.
Appendix: Table of Contents
• Using R & R Commander
• Reading .csv or .txt files from the internet
• Frequency distribution: Difference between R script and R output
• Another graph for vote variable showing probability
• R code to obtain the bar graph with probability
Using R & R Commander
[Screenshot of the R Commander window with the following parts labeled.]
• Drop-down menus
• Toolbar
• Script Window: Here you will see R commands generated by the GUI. You can write commands here. Select them by highlighting them and press ‘Submit’.
• Output Window: dark blue = printed output; red = command used.
• Message Window: red = error message; green = warning; dark blue = other information.
Reading Data with R: .csv or .txt From the Internet
STEP 3: Writing down the URL for the Brazil dataset.
[Screenshot: Prof. Montgomery’s website]
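In code, reading from a URL uses the same function as reading a local file. The sketch below is an illustration under stated assumptions: the course URL is not reproduced here, so a file:// address pointing at a temporary file stands in for it, the values are made up, and the path handling assumes a Unix-like system.

```r
# Stand-in for the course URL: a file:// address pointing at a temporary file.
# For the homework, replace data_url with the address from the course website.
toy <- data.frame(id = 1:3, vote = c(1, 0, 1))  # made-up values
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)
data_url <- paste0("file://", normalizePath(path, winslash = "/"))

# read.csv() accepts a URL in place of a file name.
brasil <- read.csv(data_url)
head(brasil)
```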
Difference between R Script and R Output
[Screenshot with the Output and Script areas labeled.]
Brazil Dataset: Bar graph for vote variable (probability)
[Bar graph: x-axis "Did you vote in the last presidential election?" with categories no and yes; y-axis "Probability", from 0 to 80.]
Brazil Dataset: Bar graph for vote variable (probability)
This is how to modify the code to obtain the previous plot.
# Convert the counts for vote into percentages (0 to 100), then plot them.
barplot(table(brasil$vote) * 100 / sum(table(brasil$vote)),
        xlab = "vote", ylab = "Probability")
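The same percentages can also be computed with prop.table(), which divides each count by the total. The sketch below uses made-up values in place of the Brazil data and checks that both routes agree; the y-axis label "Percent" is a suggestion, since the plotted values run from 0 to 100.

```r
# Illustrative factor (made-up values, not the course data).
vote <- factor(c("yes", "no", "yes", "yes", "no", "yes"),
               levels = c("no", "yes"))

manual <- table(vote) * 100 / sum(table(vote))  # approach used on the slide
short  <- 100 * prop.table(table(vote))         # equivalent, using prop.table()

all.equal(as.numeric(manual), as.numeric(short))  # compare the two results

barplot(short,
        xlab = "Did you vote in the last presidential election?",
        ylab = "Percent")
```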