
MBA 611 STATISTICS AND QUANTITATIVE METHODS


Academic year: 2021


Part I. Review of Basic Statistics (Chapters 1-11)

A. Introduction (Chapter 1)

Uncertainty: Decisions are often based on incomplete information about uncertain events. We use statistical methods and statistical analysis to make decisions in an uncertain environment.

Population: A population is the complete set of all items in which an investigator is interested.

Sample: A sample is a subset of population values.

Example:

Population
- High school students
- Households in the U.S.

Sample
- A sample of 30 students
- A Gallup poll of 1,000 consumers
- Nielsen survey of TV ratings

Random Sample: A random sample of n data values is one selected from the population in such a way that every different sample of size n has an equal chance of selection.

Example: Random Selection
- Lotto numbers
- Random numbers
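As a rough illustration of random selection, the sketch below draws a simple random sample of 30 from a made-up population of 500 student ID numbers using Python's standard `random` module. The population, its size, and the seed are assumptions chosen only for illustration.

```python
import random

# Hypothetical population: ID numbers for 500 high school students.
population = list(range(1, 501))

# A fixed seed only makes the illustration repeatable.
rng = random.Random(611)

# sample() draws without replacement, so every subset of
# size 30 has the same chance of being selected.
sample = rng.sample(population, 30)

print(len(sample))
```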

Random Variable: A random variable is a variable that takes different possible values for a given subject of study.

Numerical Variable: A numerical variable takes numerical values, either a countable (finite) set of numbers or infinitely many numbers.

Categorical Variable: A categorical variable takes values that belong to groups or categories.

Data: Data are measured values of the variable. There are two types of data: quantitative data and qualitative data.


Quantitative Data: Quantitative data are data measured on a numerical scale.

Qualitative Data: Qualitative data are non-numerical data that can only be classified into one of a group of categories.

Example:
1. Temperature
2. Height
3. Age in years
4. Income
5. Prices
6. Occupations
7. Race
8. Sales and Advertising
9. Consumption and Income

Statistics: Statistics is the science of data. This involves collecting, classifying, summarizing, analyzing data, and then making inferences and decisions based on the data collected.

Population Parameters: The numerical measures of a population are called parameters.

Example: Population average.

Sample Statistics: The numerical measures of a sample are called sample statistics.

Example: Sample average.

Descriptive Statistics: Descriptive statistics involves collecting, classifying, and summarizing data.

Inferential Statistics: Inferential statistics makes statistical inference about the population parameters based on sample information.

Business Decisions: From time to time, we use quantitative analysis to make business decisions.

Example:
- Economics: Price of a Good, Interest Rate, Mortgage Rate
- Finance: Returns, Stock Prices
- Marketing: Advertising, Sales
- Management: Quality Control


B. Descriptive Statistics (Chapters 2 and 3)

B.1 Describing Data Sets Graphically (Chapter 2)

The simplest way to describe data is to use graphs. The following shows two types of graphs: the frequency histogram and the line graph.

B.1.1 Relative Frequency Histogram

The relative frequency histogram shows the proportions of the total set of data values that fall in various numerical intervals.

Example: Sale Prices

The following data represent sale prices (in thousands of dollars) for a random sample of 25 residential properties sold.

66 59 106 50 63

89 129 74 82 84

71 95 72 57 76

109 77 68 101 65

42 36 148 94 112

Sort the data.

36 42 50 57 59 63 65 66 68 71 72 74 76 77 82 84 89 94 95 101 106 109 112 129 148

Organize the data and construct the following relative frequency distribution table.

Class i   Class Limits   Freq. (f_i)   Relative Frequency
1         (30, 49)       2             2/25 = 0.08
2         (50, 69)       7             7/25 = 0.28
3         (70, 89)       8             8/25 = 0.32
4         (90, 109)      5             5/25 = 0.20
5         (110, 129)     2             2/25 = 0.08
6         (130, 149)     1             1/25 = 0.04
Sum                      25            1
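The table can be reproduced with a few lines of Python. This is a minimal sketch, assuming the class limits are inclusive on both ends as listed; the variable names are illustrative.

```python
# Sale prices (in thousands of dollars) from the example.
prices = [66, 59, 106, 50, 63, 89, 129, 74, 82, 84, 71, 95, 72,
          57, 76, 109, 77, 68, 101, 65, 42, 36, 148, 94, 112]

# Class limits, treated as inclusive on both ends.
class_limits = [(30, 49), (50, 69), (70, 89),
                (90, 109), (110, 129), (130, 149)]

# Frequency: how many prices fall inside each class.
frequencies = [sum(1 for p in prices if low <= p <= high)
               for low, high in class_limits]

# Relative frequency: class frequency divided by the sample size.
relative = [f / len(prices) for f in frequencies]

for (low, high), f, r in zip(class_limits, frequencies, relative):
    print(f"({low}, {high})  {f:2d}  {r:.2f}")
```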

The relative frequency histogram is

[Figure: relative frequency histogram. Horizontal axis: Sale Price, with ticks at 49.5, 69.5, 89.5, 109.5, 129.5, 149.5; vertical axis: Relative Frequency, from 0 to 0.4.]

In this graph,

1. The data are classified into 6 classes.

2. Each class has the same width. The width is equal to 20.

3. The horizontal axis is labeled with the upper boundaries of these classes (49.5, 69.5, 89.5, 109.5, 129.5, and 149.5).

4. The vertical bar shows the relative frequency of sale prices falling in each class interval.

How to decide the class width:

$$\text{Width} = \frac{\text{the largest number} - \text{the smallest number}}{\text{the number of classes}}.$$

Example: Sale Prices

$$\text{Width} = \frac{148 - 36}{6} = 18.67 \approx 20.$$
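The width calculation for the sale prices can be sketched as follows. The notes round 18.67 to 20 by judgment; rounding up to the next multiple of 10 is an assumed rule used here only to reproduce that choice.

```python
import math

prices = [66, 59, 106, 50, 63, 89, 129, 74, 82, 84, 71, 95, 72,
          57, 76, 109, 77, 68, 101, 65, 42, 36, 148, 94, 112]
number_of_classes = 6

# Width = (largest number - smallest number) / number of classes.
raw_width = (max(prices) - min(prices)) / number_of_classes

# Assumed rounding rule: go up to the next multiple of 10.
rounded_width = math.ceil(raw_width / 10) * 10

print(round(raw_width, 2), rounded_width)  # 18.67 20
```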

Exercise:

The following data are year-to-date (YTD) returns for a sample of 30 mutual funds.

0.2 1.4 1 -4.2 3.8 2.5 0.6 0.9 -1.1 -0.1 0.6 5.1 0.9 0.9 -1 0.8 0.5 -4.3 5.5 2.7 3.4 1.1 -0.7 -1.1 1 0.6 0.5 3 -0.5 9.6

Then

$$\text{Width} = \frac{9.6 - (-4.3)}{6} = 2.32 \approx 2.5.$$


Sort the data as follows:

-4.3 -4.2 -1.1 -1.1 -1 -0.7 -0.5 -0.1 0.2 0.5 0.5 0.6 0.6 0.6 0.8 0.9 0.9 0.9 1 1 1.1 1.4 2.5 2.7 3 3.4 3.8 5.1 5.5 9.6

Organize the data and construct the following relative frequency distribution table.

Class i   Class Limits      Freq. (f_i)   Relative Frequency
1         (-5.00, -2.51)
2         (-2.50, -0.01)
3
4
5
6
Sum

Draw a relative frequency histogram.

Textbook Exercises: 2.5, 2.6, 2.8, 2.9, pages 22-23.

Excel: Create a Histogram
1. Click on Tools.
2. Click on Data Analysis. (If Data Analysis is not on the list, click "Tools" and "Add-Ins". Check "Analysis ToolPak" to install the add-in from the Microsoft Office CD.)
3. Select Histogram; click OK.
4. Complete the dialog box: Input range contains the data; Bin range contains the upper boundaries of the intervals; click OK.
5. Delete (using Edit) the last row, called More.


B.1.2 Line Graph (Time Plot)

A line graph is a graphic representation of a time series. A time series consists of data collected at different time periods.

Example: The following data are daily high temperatures from Monday through Friday:

70 74 72 78 75.

The line graph for the temperatures is

[Figure: line graph titled "Temperature". Horizontal axis: Date, Monday through Friday; vertical axis: Temperature, from 65 to 80.]

Example: The line graph of IBM stock price is

[Figure: line graph titled "IBM Stock Price". Horizontal axis: Date, with two-digit year ticks from 80 to 02; vertical axis: Price, from 0 to 14000.]

Textbook Exercises: 2.22-2.29, pages 22-23; 2.35-2.39, pages 30-31.

Excel: Create a Line Graph
1. Click on Insert.
2. Click on Chart.
3. In Chart-Wizard Step 1: Select Line and the top-left line chart; click Next.
4. In Chart-Wizard Step 2: Click the Series tab. Values contains the data; Category (X) axis labels contains the values of date or time. Click Next and complete the remaining steps.

B.2 Measures of Central Tendency (Section 3.1)

To describe data sets numerically, we use mean, median, range, and standard deviation.

B.2.1 Mean (Average)

The mean of a collection of n data values is the sum of the data values divided by n.

Example: Calculate the mean of the following daily high temperatures:

70 74 72 78 75.

The mean is

$$\frac{70 + 74 + 72 + 78 + 75}{5} = 73.8.$$

Notation: Sum and Mean

Suppose there is a collection of n data values. These values are represented by $x_1, x_2, \ldots, x_n$. The sum of these values is denoted as

$$\sum_{i=1}^{n} x_i.$$

The mean is equal to

$$\frac{\sum_{i=1}^{n} x_i}{n}.$$

Sample Mean, $\bar{X}$

The mean of a sample of n data values $x_1, x_2, \ldots, x_n$ is denoted as $\bar{X}$, and

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}.$$

Exercise: Prices of Product A

Suppose the prices of product A in the past five months are 6 4 2 3 5.

Calculate the mean.

Answer:

Population Mean, μ

The mean of a population is denoted as μ. If the data values of x are represented by $x_1, x_2, \ldots, x_N$, then the population mean is defined as

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}.$$

B.2.2 Median

The median of a collection of data values is the data value in the middle position for sorted data.

Example: Calculate the median of the following daily high temperatures:

70 74 72 78 75.

The sorted data are

70 72 74 75 78.

The median is 74.
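Both measures for the temperature example can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

temperatures = [70, 74, 72, 78, 75]

# Mean: the sum of the data values divided by n.
mean = statistics.mean(temperatures)

# Median: the middle value once the data are sorted.
median = statistics.median(temperatures)

print(mean, median)
```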

Textbook Exercises: 3.1-3.11, pages 50-51.


B.3 Measures of Variability (Section 3.2)

B.3.1 Range

The range of a collection of data values is the difference between the largest and the smallest values.

Example: Calculate the range of the following daily high temperatures:

70 74 72 78 75.

The range is 78 - 70 = 8.

Example: Sale Prices for Residential Properties

Calculate the range.

The range is 148 - 36 = 112.

Exercise: YTD Returns

Calculate the range.

The range is

B.3.2 Variance and Standard Deviation

The variance measures the variation of the data values from their mean. The variance of a collection of data values is defined to be the average of the squares of the deviations of the data values about their mean.

Sample Variance, $s^2$

The variance of a sample of n data values $x_1, x_2, \ldots, x_n$ is defined as

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}.$$

Example: Prices of Product A

Suppose the prices of product A in the past five months are 6 4 2 3 5.

Calculate the mean and the variance.

i     $x_i$   deviation $x_i - \bar{X}$   (deviation)^2 $(x_i - \bar{X})^2$
1     6       6 - 4 = 2                   4
2     4       4 - 4 = 0                   0
3     2       2 - 4 = -2                  4
4     3       3 - 4 = -1                  1
5     5       5 - 4 = 1                   1
Sum   20      0                           10

The sample mean is $\bar{X} = \frac{20}{5} = 4$.

The sample variance is $s^2 = \frac{10}{5 - 1} = 2.5$.

Alternative Formula: A shortcut formula to compute $s^2$ is

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n \bar{X}^2}{n - 1}.$$

Example: Prices of Product A

Suppose the prices of product A in the past five months are 6 4 2 3 5.

Use the shortcut formula to compute the sample variance.

i     $x_i$   $x_i^2$
1     6       36
2     4       16
3     2       4
4     3       9
5     5       25
Sum   20      90

The sample mean is $\bar{X} = \frac{20}{5} = 4$.

The sample variance is

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n \bar{X}^2}{n - 1} = \frac{90 - 5 \times 4^2}{5 - 1} = 2.5.$$
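The two routes to the sample variance can be compared directly. The sketch below computes both for the Product A prices; the variable names are illustrative.

```python
prices_a = [6, 4, 2, 3, 5]
n = len(prices_a)
mean = sum(prices_a) / n

# Definition: sum of squared deviations from the mean, divided by n - 1.
by_definition = sum((x - mean) ** 2 for x in prices_a) / (n - 1)

# Shortcut: (sum of squares - n * mean^2), divided by n - 1.
by_shortcut = (sum(x * x for x in prices_a) - n * mean ** 2) / (n - 1)

print(mean, by_definition, by_shortcut)  # 4.0 2.5 2.5
```

The shortcut needs only the column of squares, which is why it is convenient for hand calculation.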

Exercise: Prices of Product B

Suppose the prices for product B in the past five months are 3 7 5 4 1.

Calculate the sample mean and the sample variance.

i     $x_i$   $x_i^2$
1     3
2     7
3     5
4     4
5     1
Sum   20

The sample mean is

The sample variance is

Standard Deviation

The standard deviation of a collection of data values is equal to the square root of their variance.

Sample Standard Deviation, s

$$s = \sqrt{s^2}.$$

Example: Prices of Product A

The sample standard deviation is $s = \sqrt{2.5} = 1.58$.

Exercise: Prices of Product B

Calculate the sample standard deviation.

The sample standard deviation is

Population Variance $\sigma^2$ and Population Standard Deviation $\sigma$

The population variance is denoted as $\sigma^2$. For a population with the data values $x_1, x_2, \ldots, x_N$ and the mean μ, the population variance is defined as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}.$$

The population standard deviation is $\sigma = \sqrt{\sigma^2}$.

Note:

Sample mean $\bar{X}$, variance $s^2$, and standard deviation $s$ are sample statistics.

Population mean $\mu$, variance $\sigma^2$, and standard deviation $\sigma$ are population parameters.
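The difference between the two divisors (n - 1 for a sample, N for a population) can be seen with Python's standard `statistics` module. As an assumption for illustration only, the sketch treats the same five Product A prices first as a sample and then as a complete population.

```python
import statistics

prices_a = [6, 4, 2, 3, 5]

# Sample statistics: divide the sum of squared deviations by n - 1.
s2 = statistics.variance(prices_a)
s = statistics.stdev(prices_a)

# Population parameters: divide by N (assumes these five values
# are the entire population, which is only for illustration).
sigma2 = statistics.pvariance(prices_a)
sigma = statistics.pstdev(prices_a)

print(s2, round(s, 2), sigma2, round(sigma, 2))
```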

Textbook Exercises: 3.12-3.14, 3.20-3.25, pages 59-60.


B.4 Skewness and Kurtosis

We use skewness and kurtosis to show the shape of distribution.

B.4.1 Skewness

The skewness measures the amount of asymmetry in a distribution or in a relative frequency histogram. If a distribution is symmetric, skewness equals zero; the larger the absolute size of the skewness statistic, the more asymmetric is the distribution. The measure of sample skewness is defined as

$$\text{skewness} = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^3}{s^3}.$$

A large positive skewness indicates a long right tail.

A large negative skewness indicates a long left tail.

Example: Sale Prices

The data set has a positive skewness. Hence, the distribution has a long right tail.

Column1
Mean                 81
Standard Error       5.293707
Median               76
Mode                 #N/A
Standard Deviation   26.46853
Sample Variance      700.5833
Kurtosis             0.488226
Skewness             0.658795
Range                112
Minimum              36
Maximum              148
Sum                  2025
Count                25
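The skewness formula defined in this section can be applied to the sale prices as sketched below. Excel's Descriptive Statistics tool uses a small-sample adjusted version of the formula, so the value computed here is positive, like the Excel figure, but is not expected to equal 0.658795 exactly.

```python
prices = [66, 59, 106, 50, 63, 89, 129, 74, 82, 84, 71, 95, 72,
          57, 76, 109, 77, 68, 101, 65, 42, 36, 148, 94, 112]
n = len(prices)
mean = sum(prices) / n

# Sample standard deviation (divisor n - 1).
s = (sum((x - mean) ** 2 for x in prices) / (n - 1)) ** 0.5

# Skewness: average cubed deviation divided by s cubed.
skewness = (sum((x - mean) ** 3 for x in prices) / n) / s ** 3

print(round(skewness, 2))
```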

B.4.2 Kurtosis

The kurtosis is a measure of the thickness of the tails of a distribution (or relative frequency histogram) relative to those of a normal distribution. A normal distribution has a kurtosis of three.

A kurtosis above three indicates "fat tails." The measure of sample kurtosis is defined as

$$\text{kurtosis} = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^4}{s^4}.$$
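A minimal sketch of the kurtosis formula follows, using the five Product A prices (6 4 2 3 5) from the variance example as stand-in data. Note that Excel's Descriptive Statistics tool reports an adjusted excess kurtosis, measured relative to zero and not three, so its output is not directly comparable with this formula.

```python
prices_a = [6, 4, 2, 3, 5]
n = len(prices_a)
mean = sum(prices_a) / n

# Sample variance (divisor n - 1); s^4 is the variance squared.
s2 = sum((x - mean) ** 2 for x in prices_a) / (n - 1)

# Kurtosis: average fourth-power deviation divided by s^4.
kurtosis = (sum((x - mean) ** 4 for x in prices_a) / n) / s2 ** 2

print(round(kurtosis, 3))  # 1.088
```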


Exercise: YTD Returns

Column1
Mean                 1.12
Standard Error       0.490704
Median               0.85
Mode                 0.6
Standard Deviation   2.687699
Sample Variance      7.223724
Kurtosis             2.87748
Skewness             0.849036
Range                13.9
Minimum              -4.3
Maximum              9.6
Sum                  33.6
Count                30

Textbook Exercises: 3.4, 3.6, 3.9-3.11, pages 50, 51.

Excel: Descriptive Statistics (See Appendix)
1. Click on Tools.
2. Click on Data Analysis. (If Data Analysis is not on the list, click "Tools" and "Add-Ins". Check "Analysis ToolPak" to install the add-in from the Microsoft Office CD.)
3. Select Descriptive Statistics; click OK.
4. Complete the dialog box: Input range contains the data; Output range contains the starting cell for the descriptive statistics; select Summary statistics. Click OK.


C. Random Variables and Normal Distribution (Sections 5.1-5.3, 6.1, 6.3)

C.1 Random Variables (Section 5.1)

Random Experiment: A random experiment is a process leading to two or more possible outcomes with uncertainty as to which outcome will occur.

Random Variable: A random variable is a variable that takes on numerical values determined by the outcome of a random experiment. Usually, there are two usages of random variables.

Random Variables for a Population: We can use a random variable to represent different possible data values for a population. This random variable has a probability distribution.

Example:

The sale price can be represented by a variable X. Then different data values of sale price can also be represented by $(x_1, x_2, \ldots, x_n)$. The population mean is denoted as $\mu_X$ and the population standard deviation is $\sigma_X$.

Random Variables for Statistical Analysis: Some random variables have interesting probability distributions. These probability distributions are useful in statistical inference.

Example:

The random variable Z has a standard normal distribution.

There are two types of random variables: discrete random variables and continuous random variables.

Discrete Random Variable: A discrete random variable takes some countable number of values.

Continuous Random Variable: A continuous random variable is a random variable taking values on a line interval.

Example:
- Age in years: Discrete
- Income: Discrete
- Prices: Discrete
- Temperature: Continuous
- Height: Continuous
- Growth rates: Continuous

C.2 Discrete Random Variable (Sections 5.2, 5.3)

The probability distribution of a random variable X is denoted as $P(x)$. The properties of $P(x)$ are

a. $P(x) \ge 0$.
b. $\sum_x P(x) = 1$.

Example: New Products

Suppose the number of new products introduced each year is a random variable X. The values and the probabilities of X are

x     P(x)
3     0.1
4     0.4
5     0.3
6     0.2
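The two properties of $P(x)$ can be checked mechanically. In this sketch the distribution is stored as a dictionary from value to probability, and `is_valid_distribution` is a hypothetical helper name introduced for illustration.

```python
import math

# New Products distribution: value -> probability.
new_products = {3: 0.1, 4: 0.4, 5: 0.3, 6: 0.2}

def is_valid_distribution(pmf):
    """Check that every P(x) >= 0 and that the probabilities sum to 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0)
    return non_negative and sums_to_one

print(is_valid_distribution(new_products))  # True
```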

Mean and Standard Deviation

The mean of a discrete random variable X is

$$\mu_x = \sum_x x P(x).$$

The mean of X is also called the expected value of X,

$$E(X) = \sum_x x P(x).$$

The variance of a discrete random variable X is

$$\sigma_x^2 = \sum_x (x - \mu_x)^2 P(x).$$

Example: New Products

Calculate the mean and standard deviation.

x     P(x)   x × P(x)   $x - \mu_x$   $(x - \mu_x)^2$   $(x - \mu_x)^2 \times P(x)$
3     0.1    0.3        -1.6          2.56              0.256
4     0.4    1.6        -0.6          0.36              0.144
5     0.3    1.5        0.4           0.16              0.048
6     0.2    1.2        1.4           1.96              0.392
Sum   1      4.6                                        0.84

The mean is $\mu_x = 4.6$. The variance is $\sigma_x^2 = 0.84$.

The standard deviation is $\sigma_x = \sqrt{0.84} = 0.9165$.

Exercise: Returned Checks

Suppose the number of returned checks in a day for a department store is a random variable X . The values and the probabilities of X are

x     P(x)
0     0.3
1     0.4
2     0.2
3     0.1

Calculate the mean, variance, and standard deviation.

x     P(x)   x × P(x)   $x - \mu_x$   $(x - \mu_x)^2$   $(x - \mu_x)^2 \times P(x)$
0     0.3
1     0.4
2     0.2
3     0.1
Sum

The mean is

The variance is

The standard deviation is

Alternative Formula for Calculating Variance:

A shortcut formula to compute $\sigma_x^2$ is

$$\sigma_x^2 = \sum_x x^2 P(x) - \mu_x^2.$$

Example: New Products

x     P(x)   x × P(x)   $x^2$   $x^2 P(x)$
3     0.1    0.3        9       0.9
4     0.4    1.6        16      6.4
5     0.3    1.5        25      7.5
6     0.2    1.2        36      7.2
Sum   1      4.6                22.0

The variance is

$$\sigma_x^2 = \sum_x x^2 P(x) - \mu_x^2 = 22 - 4.6^2 = 0.84.$$
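The New Products calculations can be verified with a short script. This is a sketch that computes the variance both from the definition and from the shortcut formula; the variable names are illustrative.

```python
import math

# New Products distribution: value -> probability.
pmf = {3: 0.1, 4: 0.4, 5: 0.3, 6: 0.2}

# Mean (expected value): sum of x * P(x).
mean = sum(x * p for x, p in pmf.items())

# Variance from the definition: sum of (x - mean)^2 * P(x).
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Variance from the shortcut: sum of x^2 * P(x), minus mean^2.
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mean ** 2

std_dev = math.sqrt(var_definition)

print(round(mean, 4), round(var_definition, 4),
      round(var_shortcut, 4), round(std_dev, 4))
```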

Exercise: Returned Checks

x     P(x)   x × P(x)   $x^2$   $x^2 P(x)$
0     0.3
1     0.4
2     0.2
3     0.1
Sum

The variance is $\sigma_X^2 =$

Textbook Exercises: 5.1-5.8, pages 136, 137; 5.15-5.21, 5.25-5.29, pages 148-150.

C.3 Continuous Random Variable (Sections 6.1, 6.3, 8.3)

The probability distribution of a continuous random variable X can be denoted as $f(x)$. The probability distribution of X has the following properties:

a. $f(x) \ge 0$.

b. The total area under $f(x)$ is one.

c. The probability of x falling within an interval $(a, b)$ is denoted as $P(a < x < b)$. It is the area under the curve $f(x)$ between a and b.

One of the most commonly used continuous random variables is the normal random variable.

C.3.1 Normal Distribution (Section 6.3)

Normal Random Variable and Normal Probability Distribution

A normal random variable with a normal probability distribution has the following properties:

a. The probability distribution is bell-shaped.

b. The distribution is symmetric about its mean μ.

c. The spread of the distribution is determined by the standard deviation σ.

d. Any normal random variable X with mean μ and standard deviation σ can be standardized as a standard normal random variable:

$$Z = \frac{X - \mu}{\sigma}.$$

Standard Normal Random Variable

A standard normal random variable is a normal random variable with mean zero and standard deviation one. The probability table for the standard normal random variable shows the probability $P(0 < Z < a)$.

Using Standard Normal Probability Distribution Table

Case 1. Find $P(0 < Z < a)$.

Example:

$P(0 < Z < 1.2) = 0.3849$.

$P(0 < Z < 1.76) = 0.4608$.

Exercise:

$P(0 < Z < 1.64) =$

$P(0 < Z < 1.96) =$

Case 2. Find $P(a < Z < 0)$.

Example:

$P(-1.2 < Z < 0) = 0.3849$.

$P(-1.76 < Z < 0) = 0.4608$.

$P(Z < 0) = 0.5$.

$P(Z > 0) = 0.5$.

Exercise:

$P(-1.28 < Z < 0) =$

$P(-2.33 < Z < 0) =$

Case 3. Find $P(Z < a)$.

Example:

$P(Z < -1.2) = 0.5 - 0.3849 = 0.1151$.

$P(Z < -1.76) = 0.5 - 0.4608 = 0.0392$.

Note: We denote the cumulative probability as $F(a)$, such that $F(a) = P(Z < a)$.

Exercise:

$P(Z < -1.64) =$

$P(Z < -1.96) =$

Case 4. Find $P(Z > a)$.

Example:

$P(Z > 1.2) = 0.5 - 0.3849 = 0.1151$.

$P(Z > 1.76) = 0.5 - 0.4608 = 0.0392$.
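The table lookups in Cases 1 through 4 can be cross-checked with `statistics.NormalDist` from the Python standard library, whose `cdf(a)` method returns the cumulative probability $P(Z < a)$. This is a sketch using a = 1.2 from the examples; `table_value` is a hypothetical helper name.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_value(a):
    """P(0 < Z < a) for a >= 0, the quantity listed in the table."""
    return Z.cdf(a) - 0.5

case1 = table_value(1.2)          # P(0 < Z < 1.2)
case2 = Z.cdf(0) - Z.cdf(-1.2)    # P(-1.2 < Z < 0), equal by symmetry
case3 = Z.cdf(-1.2)               # P(Z < -1.2)
case4 = 1 - Z.cdf(1.2)            # P(Z > 1.2)

print(round(case1, 4), round(case2, 4), round(case3, 4), round(case4, 4))
```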

Exercise:

$P(Z > 1.64) =$

$P(Z > 1.96) =$

$P(Z > 1.28) =$

$P(Z > 2.33) =$

Case 5. The probability $P(0 < Z < a)$ is given. Find the value of a.

Example:

$P(0 < Z < a) = 30\%$. What is a? From the table, a = 0.84.

Exercise:

$P(0 < Z < a) = 40\%$. What is a?

Case 6. The probability $P(Z > a)$ is given. Find the value of a.

Example:

$P(Z > a) = 5\%$, find a.

The point a lies to the right of the origin, and $P(0 < Z < a) = 0.5 - 0.05 = 0.45$. With the given probability 0.45, we find a = 1.64 from the table.
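The reverse lookups in the Case 5 and Case 6 examples correspond to the inverse cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; it returns more decimal places than a two-decimal printed table, so the results agree with the table values only after rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Case 5 example: P(0 < Z < a) = 0.30 means P(Z < a) = 0.5 + 0.30.
a_case5 = Z.inv_cdf(0.5 + 0.30)

# Case 6 example: P(Z > a) = 0.05 means P(Z < a) = 1 - 0.05.
a_case6 = Z.inv_cdf(1 - 0.05)

print(round(a_case5, 2), round(a_case6, 3))
```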

Exercise:

$P(Z > a) = 0.10$, find a.

Answer:

Textbook Exercises: 6.17, 6.18, page 207.

Probabilities for Normal Random Variables

Let X be a normal random variable with mean μ and variance $\sigma^2$. Then the random variable $Z = \frac{X - \mu}{\sigma}$ is a standard normal random variable. Also,

$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right).$$

Example:

A company produces light bulbs whose life follows a normal distribution with mean 1,200 hours and standard deviation 250 hours. If we choose a light bulb at random, what is the probability that its lifetime will be between 900 and 1,300 hours?

Answer:

$$P(900 < X < 1300) = P\left(\frac{900 - 1200}{250} < \frac{X - 1200}{250} < \frac{1300 - 1200}{250}\right) = P(-1.2 < Z < 0.4) = 0.3849 + 0.1554 = 0.5403.$$
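The light bulb calculation can be checked in two ways: directly on X, and after standardizing to Z. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 1200, 250

# Route 1: work directly with X ~ Normal(1200, 250).
life = NormalDist(mu=mu, sigma=sigma)
direct = life.cdf(1300) - life.cdf(900)

# Route 2: standardize the endpoints, then use the standard normal.
z_low = (900 - mu) / sigma     # -1.2
z_high = (1300 - mu) / sigma   # 0.4
Z = NormalDist()
standardized = Z.cdf(z_high) - Z.cdf(z_low)

print(round(direct, 4), round(standardized, 4))
```

Both routes give the same probability, which agrees with the table-based answer 0.5403 up to the rounding of the table entries.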

Exercise:

Anticipated consumer demand for a product next month can be represented by a normal random variable with mean 1,200 units and standard deviation 100 units.

a. What is the probability that sales will be between 1,000 and 1,300 units?

b. What is the probability that sales will exceed 1,100 units?

Answers:

Textbook Exercises: 6.19 abc, 6.20 abc, 6.21 abc, 6.22 abd, 6.23 ab, 6.24 abc, 6.25, 6.26, 6.27 a, 6.31 ab, 6.35 ab, 6.36a, 6.37 ab, pages 208-210.


C.3.2 Student's t Distribution (Section 8.3)

Student's t Distribution (t-distribution)

Let t be a random variable with a t-distribution.

Properties of the t-distribution:

1. Bell-shaped.

2. Symmetrical about t = 0.

3. The probability distribution has tails that are more spread out than those of the standard normal distribution.

4. The shape of the probability distribution depends on a constant, the degrees of freedom (v).

5. When v is large, the t-distribution is close to the standard normal distribution.

t Statistical Table

The table shows the value of $t_\alpha$ such that $P(t > t_\alpha) = \alpha$.

For α = 0.01, α = 0.025, and α = 0.05, the values of $t_\alpha$ for different v are

v     t.05    t.025   t.01
5     2.015   2.571   3.365
10    1.812   2.228   2.764
15    1.753   2.131   2.602
20    1.725   2.086   2.528
∞     1.645   1.96    2.326
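For readers curious where such table entries come from, the sketch below reproduces the v = 5 row using only the Python standard library. It integrates the t density numerically with Simpson's rule and then finds $t_\alpha$ by bisection. The function names are hypothetical, and in practice a statistics package would normally be used instead.

```python
import math

def t_density(t, v):
    """Density of the t-distribution with v degrees of freedom."""
    c = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return c * (1 + t * t / v) ** (-(v + 1) / 2)

def upper_tail(a, v, steps=2000):
    """P(t > a) for a > 0: 0.5 minus the area from 0 to a (Simpson's rule)."""
    h = a / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, v) + t_density(a, v)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, v)
    return 0.5 - total * h / 3

def t_alpha(alpha, v):
    """Value a with P(t > a) = alpha, found by bisection on [0, 50]."""
    low, high = 0.0, 50.0
    for _ in range(50):
        mid = (low + high) / 2
        if upper_tail(mid, v) > alpha:
            low = mid   # tail area still too large: move right
        else:
            high = mid  # tail area too small: move left
    return (low + high) / 2

row_v5 = [round(t_alpha(alpha, 5), 3) for alpha in (0.05, 0.025, 0.01)]
print(row_v5)
```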

Example:

Find the value a such that

a. $P(t > a) = 0.05$ when v = 5.

b. $P(t < a) = 0.025$ when v = 10.

c. $P(t > a) = 0.01$ when v = 20.

Answer: a: a = 2.015; b: a = -2.228; c: a = 2.528.

Exercise: Find the value a such that

a. $P(t > a) = 0.01$ when v = 5.

b. $P(t < a) = 0.05$ when v = 10.

c. $P(t > a) = 0.025$ when v = 15.

Answer:
