• No results found

Business Statistics: Intorduction

N/A
N/A
Protected

Academic year: 2021

Share "Business Statistics: Intorduction"

Copied!
45
0
0

Loading.... (view fulltext now)

Full text

(1)

Business Statistics: Intorduction

Donglei Du

([email protected])

Faculty of Business Administration, University of New Brunswick, NB Canada Fredericton E3B 9Y2

(2)

Table of contents

1 Why Statistics? 2 What is Statistics?

3 Two methodologies in Science 4 Variables and types

5 Sources of Statistical Data

6 Software for Statistical Analysis

7 Materials to learn R 8 A brief tutorial of R with a

case study

(3)

Section 1

Why Statistics?

(4)

Figure: Source: http://mechanicalforex.com/2015/05/

building-algorithmic-trading-systems-for-the-forex-market-part-2-where-to-look. html

(5)

Why Statistics?

This is a required course for your degree. It is a prerequisite for many other topics.

Data everywhere, particularly in this big data era. Sampling vs censusing.

Decision Making: Statistics will help you make important decision.

(6)

Data everywhere: Example 1: productivity and

standard of living of a nation

wages Labor productivity Japan US U.K. Canada France Germany Italy India Indonesia Bulgaria Egypt Pakistan 80 8 0.8 2 20 200

Figure: High productivity is a key to high standard of living

Two important statistics that a nation is most concerned are the

productivity andstandard of

living. Productivity is usually

measured in terms of output per worker, and standard of living is measured in terms of wages per worker. They are usually strongly related. Countries with high productivity in general are seen with high standard of living (Left figure)

(7)

Data everywhere: Example 2: Consumer Price

Index (CPI) in Canada

The Consumer Price Index measures the rate of price change for goods and services bought by consumers in a country.

It is a statistic constructed using the prices of a sample of representative items whose prices are collected periodically. For instance, the CPI All-items for Canada for the month of July 2013 was 123.1 (2000=100), meaning that consumer prices were 23.1% higher in July 2013 than in 2000. [ hyphens]http://www.statcan.gc.ca/tables-tableaux/sum-som/l01/cst01/cpis01a-eng.htm

(8)

Data everywhere: Example 2: Consumer Price

Index (CPI) in Canada

The CPI directly or indirectly affects nearly all Canadians.

Old Age Security pensions, Canada Pension Plan payments, and other forms of social and welfare payments are adjusted

periodically to take account of changes in the CPI.

Rental agreements, spousal and child support payments and other forms of contractual and price-setting arrangements are frequently tied in some manner to movements in the CPI. Cost-of-living adjustment (COLA) clauses link wage increases to movements in the CPI. Labour contracts governing the wages of many Canadian workers include COLA clauses.

(9)

Sampling vs censusing?

Costs of surveying the entire population may be too large or prohibitive

e.g., Television networks monitor the popularity of their programs;

Destruction of elements during investigation

e.g., Manufacturers estimate the average lifetime of light bulbs; doctors take a blood sample to check for disease.

Unknown future

(10)

Decision Making

How do large retailers (like COSTO, Walmart) fill their store shelves so as to meet the customer demand while minimizing their operating cost?

How do doctors in the hospital make diagnose? How do political leaders run their campaign?

How do Investment Banks in Wall Street (or Bay Street) decide which stock (or stocks) to invest?

How do insurance companies decide the premium for a particular client?

How do car dealers decide how many car models of each brand to be kept in their locations?

(11)

Section 2

What is Statistics?

(12)

What is Statistics?

It is the science and art

of collecting, organizing, and representing data in such a way that the characteristics and patterns of the data can be easily captured (Descriptive Statistics);

also, of estimating attributes and drawing inference from a sample about the entire population (Inferential Statistics).

(13)

Examples: descriptive or inferential?

In 1995, 45% of Canadian households owned a computer and 25% were connected to internet; On average, Canadians spend 1.3 hours per day commuting, and 1.5 hours per day with their children.

descriptive

The accounting department of a firm will select a sample of invoices to check for accuracy of all the invoices of the company.

(14)

Terminologies

Population vs sample

Population: a set of all elements of interests in an investigation Sample: a subset of all elements of a population

Parameter vs Statistic

Parameter: a measurable characteristic of a Population (such as average, extreme, variation, proportion)

Statistic: a measurable characteristic of a Sample

(15)

Section 3

(16)

Two methodologies in Science

Deductive: general −→ particular Mathematics: Axioms + logic Inductive: particular −→ general

Most natural sciences, like physics, chemistry, and Statistics... a.k.a, the method of Experimentation

(17)

The method of Experimentation

1 Define the experimental goal or a working hypothesis 2 Design an experiment

3 Collect and represent data 4 Estimate the values/relations 5 Draw inferences

(18)

Section 4

Variables and types

(19)

Variable

1 A variable is a characteristic/attribute of a population/sample that is interest in a particular investigation and the value of the variable can "vary" from one entity to another.

A person’s gender is a variable, which could have the value of "Male" for one person and "Female" for another.

The rank of faculty members in Business Administration is a variable, which could have the value of "Full Professor" for one person, "Associate Professor" for another, and "‘Assistant Professor"’ for yet another.

Temperatures in this classroom is a variable which could have the value of "20" or "100".

(20)

Types of Data: Qualitative vs. Quantitative

Variables

Qualitative (a.k.a. categorical)

Qualitative variables take on values that are names or labels. Examples: gender, country names, color

Quantitative (a.k.a, numeric)

Quantitative variables are numeric.

Examples: number of bedrooms in houses, number of minutes to the end of this class, distance between two cities

(21)

Types of Data: Discrete vs. Continuous Variables

Quantitative variables can be further classified as discrete or continuous

Continuous Variable: a variable can take on any value within a range;

Examples: the number of minutes to the end of this class, distance between two cities, pressure in a tire, weight of a pork chop, height of students in a class

Discrete variable: a variable can take only certain value (finite or countably infinite) within a range.

(22)

Data/variable representation

Language such as English, French, Chinese: This is the natural and direct way

Such as "All the people in Canada"’

Numbers: This is the mathematical and indirect way

such as, data representation in computer: binary numbers: 0,1

(23)

Level of measurement

Suppose now we represent all data /variable using numbers, then Level of measurement (a.k.a. scales of measure) are the different ways numbers can be used.

There are four levels of measurements Nominal

Ordinal Interval Ratio

However, representing variables as numbers does not give you the license to perform the regular logical/arithmetic operations all the time (such as comparison, addition, subtraction,

(24)

Nominal level

A variable is at the nominal level if none of the five operations (namely comparison, addition, subtraction, multiplication, and division) is meaningful.

At the nominal level of measurement, numbers are assigned to a set of mutually exclusive and exhaustive categories for the purpose of naming, labeling, or classifying the observations, but no arithmetic operation is meaningful;

where

Mutually exclusive: any individual object is included in ONLY ONE CATEGORY

Exhaustive: any individual object MUST APPEAR in one of the categories

(25)

Examples

Examples Barcodes

Social insurance numbers (SIN) Student IDs

Phones numbers

The fact that the barcode for one product is higher than that for another, or that your SIN is higher than mine tells us nothing. In surveys we often use arbitrary numbers to code variables such as religion, ethnicity, major in college or gender.

(26)

Ordinal level

A variable is at the ordinal level if only comparison is meaningful.

(27)

Examples

Examples

Rank of faculty members

Maclean ranking of Canadian colleges:

[hyphens]http://oncampus.macleans.ca/education/2012/11/01/2013-comprehensive/

US News Ranking of US/world colleges http://www.usnews.com/rankings

The differences between data values cannot be determined or are meaningless.

For instance, first class is better than economy and that business is in between. Just how much better first class is

(28)

Interval

A variable is at the interval level if only division is meaningless. Reason: Interval data have meaningful intervals between measurements, but there is no true starting point (zero).

(29)

Examples

Examples

Temperatures in Celsius and Fahrenheit are interval data Certainly order is important and intervals are meaningful. However, a20◦Cdashboard is not twice as hot as the10◦C

outside.

A conversion formula between Celsius and Fahrenheit F= 9 5C+32 So10◦C=50◦Fand20◦C=68◦F. Obviously 20◦C 10◦C 6= 68◦F 50◦F

(30)

Ratio

A variable is at the ratio level if all logical/arithmetic operations are meaningful.

Reason: Ratios between measurements as well as intervals are meaningful because there is a non-arbitrary zero point.

(31)

Examples

Examples Income Distance Height

(32)

Summary of level of measurement

comparison addition subtraction multiplication division

nominal x x x x x

ordinal √ x x x x

interval √ √ √ x x

ratio √ √ √ √ √

(33)

A general method for identifying the level of

measurement

Ask yourself the following three questions: Is order meaningful?

No! then the data is nominal Is difference meaningful?

No! then the data is ordinal Is zero meaningful

No! then the data is interval Yes! then the data is ratio

(34)

Section 5

Sources of Statistical Data

(35)

Sources of Statistical Data

Statistics Canada: http://www.statcan.ca Twitter: www.twitter.com

Facebook: www.facebook.com . . ..

(36)

Section 6

Software for Statistical Analysis

(37)

Software for Statistical Analysis

Many, but the most popular ones are EXCEL (proprietary)

SAS (proprietary) R (open-source) . . ..

(38)

Section 7

Materials to learn R

(39)

Some online resources to learn R I

R in a nutshell An introduction to R R for Beginners Try R Quick-R

The art of R Programming

www.coursera.org offers some nice courses on R Such asthis one

(40)

Section 8

A brief tutorial of R with a case study

(41)

R for Statistical Analysis

We will retrieve some stock data via package quantmod from Yahoo Finance

(42)

Load package

rm(list=ls())

require("quantmod")

(43)

Parameters

startDate <- '1957-03-04' # start of data

endDate <- '2014-01-06' # end of data

(44)

Retrieve data

symbols<-c("^GSPC") #symbols<-c("AAPL","FB", "YHOO") if(file.exists("data/GSPC.RData")) { load("data/GSPC.RData") #GSPC_csv <- read.csv("data/GSPC.csv") } else { getSymbols(symbols,

src = "yahoo", # OHLC format from = startDate,

#to = endDate,

index.class=c("POSIXt","POSIXct"), # Recommended time series index warnings = FALSE,

adjust=TRUE)

dir.create(file.path('data'), showWarnings = TRUE) save(list="GSPC", file="data/GSPC.RData")

write.csv(GSPC,file="data/GSPC.csv")

}

(45)

Plot

options(width=60)

chartSeries(GSPC["2004::2014"]) addMACD()

http://mechanicalforex.com/2015/05/building-algorithmic-trading-systems-for-the-forex-market-part- http://www.usnews.com/rankings http://www.statcan.ca www.twitter.com www.facebook.com R in a nutshell An introduction to R R for Beginners Try R Quick-R The art of R Programming www.coursera.org offers some nice courses on R this one quantmod

References

Related documents

Preventative maintenance (PM) processes should be performed for a wide variety of items used in endoscopy, such as the endoscopes themselves, equipment on the tower, and automated

O halde, mitik geleneklerin ele§tirisi sorunu yanlı§ olarak ortaya konulmu§tur; bir Pausanias'ın, doğruyu yanlı§tan ayırmak bir yana, efsanelerdeki her §eyi yanlı§ sayanz6

Even in this case, however, the periods of activity are dictated by the suc- tion changes in the upper few metres which depend criti- cally on the slope geometry, flow

We developed a PLT inventory management program to supply PLTs more efficiently to patients requiring PLT transfusions within the expiration date, while reducing PLT dis- card

The Deputy Nursery Manager will be expected to work alongside the Nursery Manager as part of a team to build good working relationships at every level.. Working in partnership

- complete all sections of the WC-1, First report of Injury, and email to the Central Office Workers’ Compensation Coordinator immediately after the accident - notify the

ALTA Form 1 - Assessments: This endorsement is designed for use with the ALTA Loan Policy to provide the insured lender with protection against loss sustained by reason of

information you search harper county warrant for the county jail records in the public records section, please try again, and last arrest searches in la and more.. Browse by