Dr. Georgios Tsiotas
Mediterranean Agronomic Institute of Chania &
University of Crete
M.Sc. Program in Business Economics and Management
Bibliography
Description
1 Introduction
Basic Notions in Statistics Data
Statistical Measures
2 Probability
Basic Notions of Probability

Describing the Statistical Problem
Definition
Statistics is a branch of mathematical science that deals with uncertain (or random) phenomena with the help of sampling.
Phenomena
1 Random (or Uncertain): the outcome of tossing a coin, the outcome of a bet, the maximum speed of a car, daily rain precipitation, the number of daily births, the return of a stock index, the level of sales, the number of students attending a class, etc.
2 Non-Random (Certain): the sunrise, the sunset, gravity on Earth, etc.
3 Chaotic: extreme financial events, earthquakes, tsunamis, etc.

Quotations on randomness
1 Aristotle: “The probable is what usually happens”
2 Democritus: “Everything existing in the universe is the fruit of chance”
3 Plato (in Phaedo): “I know too well that these arguments from probabilities are imposters, and unless great caution is observed in the use of them, they are apt to be deceptive”
4 Heraclitus: “There is nothing permanent except change”
5 Descartes (in Discourse on Method): “It is a truth very certain that when it is not in our power to determine what is true we ought to follow what is most probable”

Random Variable (r.v.)
A random variable is the result of a random experiment that is characterised by uncertainty. Examples:
1 The number of “heads” when tossing a coin ten (10) times
2 The daily consumption of a person
3 The maximum speed of a car
4 A stock return today
5 Rain precipitation on a given day
6 The number of students attending a class
Sampling
Statistical sampling, that is, the collection of a representative number of observations of a random variable, lets us learn about the statistical characteristics of that random variable.

Defining Sample and Population
Sample
A sample is a smaller number (a subset) of the people or objects that exist within a population.
Population
The population, also referred to as the universe, is the entire set of people or objects of interest. It could be:
1 An infinite number of coin-tossing experiments
2 All adult citizens in a country
3 All cars driven in a country
4 All stock returns
5 All places where rain falls
6 All students in a country

Defining Sample and Population (cont.)
Important!!
A sample is said to be representative if its members tend to have the same characteristics (e.g., region, shopping behaviour, age, income, educational level) as the population from which they were selected.
1 For example, if 45% of the population consists of female drivers, we would like our sample to also include 45% females.
2 When a sample is so large as to include all members of the population, it is referred to as a complete census.

Descriptive Statistics
A simple definition
In descriptive statistics, we simply summarize and describe the data we have collected. For example:
1 Observing car speeds at a specific location on an avenue, you find that the mean speed is 15.4 mph and that the (empirical) probability that a car travels at 20 mph or more is 24%. This is descriptive statistics: you are merely describing the data that you have recorded.
2 According to the Bureau of the Census, there has been an increase of 200% in average UK gas consumption after the year 1980.
3 Rain precipitation data from different locations are characterised by large variation.

Statistical Inference
A simple definition
In inferential statistics, sometimes referred to as inductive statistics, we go beyond mere description of the data and arrive at inferences regarding the phenomenon or phenomena for which sample data were obtained. For example:
1 Having observed the speeds of many cars, the traffic regulator may impose a more realistic speed limit.
2 Having observed average gas consumption, the ministry of energy may decide to redirect its energy production in order to match future needs.
3 Because of the large variation observed in rain precipitation data from different locations, the Meteorology Office decides to increase the number of locations where data are collected.

Data Types
Qualitative Data
Qualitative data are expressed in words (categories) and cannot be measured by numbers. Some of the variables associated with people or objects are qualitative in nature, indicating that the person or object belongs in a category. For example:
1 You are either male or female
2 You are less than 25 years old or not
3 You have a small or a large household
4 You are located in Crete or not

Data Types (cont.)
Quantitative Data
Quantitative data are data that can be measured by numbers. There are two types of quantitative variables: discrete and continuous.
1 Discrete quantitative variables can take on only certain values along an interval, with the possible values having gaps between them. Examples of discrete quantitative variables are the number of employees on the payroll of a manufacturing firm, the number of students attending a class, or the number of births on each calendar day. Discrete variables in business statistics usually consist of observations that we can count, taking integer values.
2 Continuous quantitative variables can take on a value at any point along an interval. For example, the stock index return at a given moment can take the value 0.0493 or 0.049372, depending on the accuracy with which the return can be measured. The possible values have no gaps between them. Other examples of continuous quantitative variables are car speed, rain precipitation, etc.

Data Types (cont.)
Dummy Data
Researchers sometimes convert qualitative data to quantitative data using so-called dummy data. Examples:
1 For car speed data, the qualitative variable sex can take the answer “YES” for male motorists and “NO” for female motorists. This can be quantified as “1” for male and “0” for female.
2 For rain precipitation data, the qualitative variable location can take the answer “YES” for a mountainous location and “NO” for a non-mountainous location. This can be quantified as “1” and “0” respectively.
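
The YES/NO to 1/0 conversion described on this slide can be sketched in a few lines of Python. The observations and the helper name `to_dummy` are hypothetical, invented only for illustration; they do not come from the lecture data.

```python
# Hypothetical qualitative observations, invented for illustration only.
locations = ["mountainous", "non-mountainous", "mountainous", "non-mountainous"]


def to_dummy(values, yes_category):
    """Map each qualitative value to 1 if it equals yes_category, else 0."""
    return [1 if value == yes_category else 0 for value in values]


location_dummy = to_dummy(locations, "mountainous")
print(location_dummy)
```

The resulting list of zeros and ones can then be treated like any other quantitative variable, for instance averaged to obtain the share of mountainous locations.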

Data Types (cont.)
Types of quantitative data
1 Cross-section data: these data might refer to people, companies, locations, or countries observed at a given time $t$:
$$x_1, \ldots, x_N$$
where $N$ is the total number of cross-sectional observations at time $t$. (Important!!: the ordering of the data does not matter.)
2 Time-series data: these data might refer to people, companies, locations, or countries observed over a sequence of time points at a given location $l$:
$$x_1, \ldots, x_T$$
where $T$ is the total number of time-series observations at location $l$. (Important!!: the ordering of the data does matter.)

Data Types (cont.)
Data transformation in quantitative data
With time-series data it is often necessary to take raw data from a source and transform them into a different form for the empirical analysis. Examples:
1 The percentage change of sales
2 The percentage change of a stock index
Suppose $x_1, \ldots, x_T$ represent a stock index time series. Then, for $t \in \{1, \ldots, T-1\}$, the percentage change is
$$\frac{x_{t+1} - x_t}{x_t} \times 100$$
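
As a minimal sketch of the percentage-change transformation, the Python function below applies the formula to each consecutive pair of observations. The index levels are hypothetical numbers chosen for illustration, and the function name `percentage_changes` is an assumption of this example.

```python
def percentage_changes(series):
    """Return 100 * (x[t+1] - x[t]) / x[t] for each consecutive pair."""
    return [100.0 * (nxt - cur) / cur for cur, nxt in zip(series, series[1:])]


# Hypothetical stock index levels, invented for illustration only.
index_levels = [100.0, 102.0, 99.96, 104.958]
changes = percentage_changes(index_levels)
print([round(change, 2) for change in changes])
```

A series of T observations yields T-1 percentage changes, so the transformed series is one observation shorter than the raw one.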

Data Values
Table: Car Speed Data (xi: car speed value, fi: frequency, Fi: cumulative frequency)
xi   fi   Fi
4    2    2
7    2    4
8    1    5
9    1    6
10   3    9
11   2    11
12   4    15
13   4    19
14   4    23
15   3    26
16   2    28
17   3    31
18   4    35
19   3    38
20   5    43
22   1    44
23   1    45
24   4    49
25   1    50

Classified Data Values
Table: Car Speed Data (xi: speed class, fi: frequency, Fi: cumulative frequency)
xi        fi   Fi
[0-5]     2    2
[6-10]    7    9
[11-15]   17   26
[16-20]   17   43
[21-25]   7    50

Car Speed
[Figure: car speed frequency plots; axis labels: car speed, speed, Frequency]

Advertisement Expenditure in 000,000’s Euro
[Figure: advertisement expenditure frequency plots; axis labels: advertisement in thous. Euro, adv, Frequency]

Gas consumption in UK (quarterly data for 1960-1985)
[Figure: time-series plot; x-axis: Time (1960 to 1985); y-axis: Gas consumption in UK]

Toyota car sales in Greece (monthly data for 1998-2003)
[Figure: time-series plot; x-axis: Time (1998 to 2003); y-axis: Toyota car sales in Greece (monthly)]

Main Objectives of Statistical Measures
1 Describe data using measures of location (central tendency)
2 Describe data using measures of dispersion
3 Describe data using probability measures
4 Compare data measured in different units using standardised techniques
5 Express the relationship between two (2) different random variables

Mean
Description
The mean (or average) is the arithmetic mean of a random variable for a given sample from our experiment.
How to estimate it?
1 For a sample of $N$ observations:
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
2 For a sample given as $k$ distinct values $x_i$ with frequencies $f_i$, so that $N = \sum_{i=1}^{k} f_i$:
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
3 For the population, the mean is denoted $\mu$.

Variance
Description
Variance (or volatility) is a measure of the dispersion of a random variable around its mean.
How to estimate it?
1 For a sample of $N$ observations with $N \ge 30$:
$$S^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2$$
2 For a sample with $N < 30$:
$$S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$

Variance
How to estimate it? (cont.)
1 For the population (and for samples with $N \ge 30$):
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 \equiv \frac{1}{N} \sum_{i=1}^{N} x_i^2 - \mu^2$$
Standard Deviation
1 For the sample:
$$S = \sqrt{S^2}$$
2 For the population:
$$\sigma = \sqrt{\sigma^2}$$
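
Car Speed data
Estimating the mean and variance using the Discrete Data Values (with the $1/N$ formula, since $N = 50 \ge 30$):
$$\bar{x} = 15.4, \qquad S^2 = \frac{1}{N} \sum_{i} f_i x_i^2 - \bar{x}^2 \approx 27.4$$
How to estimate the mean and variance using the Classified Data Values:
1 Set a representative (middle) value $z_i$ for each class $x_i$. These are: $\{3, 8, 13, 18, 23\}$
2 Compute the new $\bar{z}$ and $S^2$ values over the $k = 5$ classes:
$$\bar{z} = \frac{\sum_{i=1}^{k} f_i z_i}{\sum_{i=1}^{k} f_i} = 15, \qquad S^2 = \frac{1}{N} \sum_{i=1}^{k} z_i^2 f_i - \bar{z}^2 = 26$$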
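
The frequency-weighted formulas can be checked directly against the two car speed tables. The sketch below is only an illustration: the dictionary layout and the function names are assumptions of this example, and the variance uses the divisor N throughout. Under that convention the discrete table gives a mean of 15.4 and a variance of about 27.4, while the classified table with representatives 3, 8, 13, 18, 23 gives a mean of 15 and a variance of 26.

```python
# Car speed frequency table from the slides: value -> frequency.
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}
# Classified table: class representative z_i -> class frequency.
class_freq = {3: 2, 8: 7, 13: 17, 18: 17, 23: 7}


def weighted_mean(freq):
    """Frequency-weighted mean: sum(f * x) / sum(f)."""
    n = sum(freq.values())
    return sum(f * x for x, f in freq.items()) / n


def weighted_variance(freq):
    """Variance with divisor N: sum(f * x^2) / N - mean^2."""
    n = sum(freq.values())
    mean = weighted_mean(freq)
    return sum(f * x * x for x, f in freq.items()) / n - mean ** 2


for label, table in (("discrete", speed_freq), ("classified", class_freq)):
    print(label, round(weighted_mean(table), 2), round(weighted_variance(table), 2))
```

Grouping the data into classes changes both summaries slightly, because every observation in a class is replaced by the class representative.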

Median (M)
Description
The median is a measure of location (central tendency): it is the value of the random variable below which 50% of the probability (or frequency) of the whole sample lies.
How to estimate it?

Percentiles
Description
A percentile is a measure of location: it is the value of the random variable below which P% of the probability (or frequency) of the whole sample lies. The quartiles are the 25th, 50th and 75th percentiles.
How to estimate it? With $F(x)$ the cumulative frequency:
1 1st quartile: $Q_1 = \min\{x : F(x) \ge N/4\}$
2 2nd quartile (Median): $Q_2 = \min\{x : F(x) \ge N/2\}$
3 3rd quartile: $Q_3 = \min\{x : F(x) \ge 3 \times N/4\}$
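
The quartiles can be read off the cumulative frequencies of the car speed table. The sketch below assumes one common convention, namely that a quantile is the smallest value whose cumulative frequency reaches the target count; other conventions interpolate and can give slightly different answers. The function name `quantile_from_freq` is an assumption of this example.

```python
from itertools import accumulate

# Car speed frequency table from the slides: value -> frequency.
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}


def quantile_from_freq(freq, p):
    """Smallest value whose cumulative frequency reaches p * N (assumed rule)."""
    values = sorted(freq)
    n = sum(freq.values())
    cumulative = accumulate(freq[value] for value in values)
    for value, cum in zip(values, cumulative):
        if cum >= p * n:
            return value
    return values[-1]


q1, q2, q3 = (quantile_from_freq(speed_freq, p) for p in (0.25, 0.5, 0.75))
print(q1, q2, q3)
```

With N = 50 the target counts are 12.5, 25 and 37.5, which are first reached at the speeds 12, 15 and 19.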

Car Speed data
Estimating the Median using the Classified Data Values:
$$Q_2 = l_i + \frac{\delta_i}{f_i} \left[ \frac{N}{2} - F_{i-1} \right]$$
where
1 $l_i$ is the lower bound of the median class, i.e. the first class whose cumulative frequency reaches $N/2$
2 $\delta_i$ is the width of that class
3 $f_i$ is the frequency of that class
4 $F_{i-1}$ is the cumulative frequency up to class $i-1$, the class immediately before the median class
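
The interpolation formula for the median of classified data can be sketched as follows. This example makes two assumptions that the slides do not state explicitly: the median class is taken to be the first class whose cumulative frequency reaches N/2, and the class width is set to 5 with the lower bound taken as printed in the table. The function name `grouped_median` is also an assumption.

```python
# Classified car speed table: (lower bound, upper bound, frequency).
classes = [(0, 5, 2), (6, 10, 7), (11, 15, 17), (16, 20, 17), (21, 25, 7)]


def grouped_median(classes, width):
    """Q2 = l_i + (width / f_i) * (N / 2 - F_{i-1}) for the median class."""
    n = sum(freq for _, _, freq in classes)
    cumulative_before = 0  # F_{i-1}: cumulative frequency before the class
    for lower, _, freq in classes:
        if cumulative_before + freq >= n / 2:
            return lower + (width / freq) * (n / 2 - cumulative_before)
        cumulative_before += freq
    raise ValueError("empty frequency table")


median = grouped_median(classes, width=5)
print(round(median, 2))
```

Here the median class is [11-15], with 9 observations before it and 17 inside it, so the estimate is 11 + (5/17) × 16, roughly 15.71.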

Mode
Description
The mode is a location measure that informs us about the value of our random variable with the highest frequency (or probability).
How to estimate it?
$$m_x = \{x : f(x) = \max(f_1, \ldots, f_k)\}$$
where $f_1, \ldots, f_k$ are the frequencies of the $k$ distinct values (or classes).
Problems with estimating $m_x$
1 The estimate of the mode ($m_x$) may depend on the classes assigned to the random variable being analysed: changing the class width may change the mode. When no classes are assigned, the mode is an objective measure of central tendency.
2 The same highest frequency may appear for two (2) or more values or classes of the random variable. These are the bimodal and the multimodal cases respectively.

Car Speed data
Estimating the mode using the Discrete Data Values: $m_x = 20$
Estimating the mode using the Classified Data Values: $m_x \in [10, 20]$
1 This is not a precise estimate: the classes [11-15] and [16-20] share the highest frequency (17).
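
The dependence of the mode on the chosen classes is easy to see in code. The sketch below returns every value or class that shares the highest frequency; the function name `modes` and the string labels for the classes are assumptions of this example.

```python
def modes(freq):
    """Return every value (or class) that shares the highest frequency."""
    top = max(freq.values())
    return [key for key, f in freq.items() if f == top]


# Car speed frequency table from the slides: value -> frequency.
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}
# Classified table from the slides: class label -> frequency.
class_freq = {"[0-5]": 2, "[6-10]": 7, "[11-15]": 17, "[16-20]": 17, "[21-25]": 7}

print(modes(speed_freq))
print(modes(class_freq))
```

The discrete table has a single mode at 20, whereas the classified table is bimodal, which is why the classified estimate only locates the mode within a range.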

Absolute Measures of Linear Relationship (2 random variables)
Covariance
Covariance is a measure of how much two random variables move together. If the two variables tend to move in the same direction, the covariance between them is positive. If they tend to move in opposite directions, the covariance is negative. If there is no tendency for the two variables to move one way or the other, the covariance is zero.
How to estimate it?
1 For a sample of $N$ observations:
$$S_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N-1}$$
2 For the population:
$$\sigma_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$

Relative Measures of Linear Relationship (2 random variables)
Correlation
Correlation is the relative measure of the linear relationship between two (2) random variables.
How to estimate it?
1 For a sample of $N$ observations, with $S_{x,y}$ the sample covariance and $S_x$, $S_y$ the sample standard deviations:
$$r_{x,y} = \frac{S_{x,y}}{S_x \cdot S_y}$$
2 For the population:
$$\rho_{x,y} = \frac{\sigma_{x,y}}{\sigma_x \cdot \sigma_y}$$
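
A minimal sketch of the sample covariance and the sample correlation follows, using the N-1 divisor for both the covariance and the standard deviations. The paired observations are hypothetical numbers invented for illustration, and the function names are assumptions of this example.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-deviations from the means, divided by N - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


# Hypothetical paired observations, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
print(cov_xy, round(r_xy, 4))
```

Because the same divisor appears in the numerator and the denominator, the correlation does not depend on whether N or N-1 is used, unlike the covariance itself.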

Correlation, explained
1 The sign of the correlation depends on the sign of the covariance (e.g. a positive covariance leads to a positive correlation)
2 $-1 \le \rho_{x,y} \le 1$, $-1 \le r_{x,y} \le 1$
3 When $\rho_{x,y}, r_{x,y} \to 0$, we have nearly uncorrelated random variables
4 When $\rho_{x,y}, r_{x,y} > 0$ we have positively correlated, and when $< 0$ negatively correlated, random variables

Sales versus Advertisement (in thousand Euros)
[Figure: scatter plot; x-axis: Advert. (thous. Euro); y-axis: Sales (thous. Euro)]

Car speed (mph) versus Distance to stop (in ft)
[Figure: scatter plot; x-axis: Speed (mph); y-axis: Stopping distance (ft)]

Car weights versus Miles per gallon
[Figure: scatter plot; x-axis: Car Weight; y-axis: Miles Per Gallon]

Correlation using data
Correlation in Sales vs Advertisement: $r_{x,y} = 0.9409605$
Correlation in Car speed (mph) versus Distance to stop: $r_{x,y} = 0.8068949$
Correlation in Car weights versus Miles per gallon: $r_{x,y} = -0.8676594$

Coefficient of Variation (CV)
Description
The coefficient of variation is a relative dispersion measure for a random variable that expresses the standard deviation as a percentage of the arithmetic mean.
How to estimate it?
1 For a sample of $N$ observations:
$$CV = \frac{S}{\bar{x}} \times 100$$
2 For the population:
$$CV = \frac{\sigma}{\mu} \times 100$$

Why is the CV used?
We use it when we would like to compare the variation of two populations (or samples) measured in different units.
Example: When sales data from two (2) different populations (with different currencies) are analysed for their dispersion, one can use the variance. However, the variance is an absolute measure of variation, expressed in (squared) units of each population's currency. Here, the CV provides a relative measure of dispersion, so that the dispersions become comparable.
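
The currency example can be illustrated with a short sketch. Both sales series are hypothetical numbers invented for this example, and the function name `coefficient_of_variation` is an assumption; the sample standard deviation comes from Python's `statistics.stdev`, which uses the N-1 divisor.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample CV in percent: 100 * S / mean, with S using the N - 1 divisor."""
    return 100.0 * statistics.stdev(sample) / statistics.mean(sample)


# Hypothetical sales figures in two different currencies, for illustration only.
sales_currency_a = [100, 110, 90, 105, 95]
sales_currency_b = [10000, 13000, 7000, 11500, 8500]

cv_a = coefficient_of_variation(sales_currency_a)
cv_b = coefficient_of_variation(sales_currency_b)
print(round(cv_a, 2), round(cv_b, 2))
```

The raw variances differ by several orders of magnitude simply because of the currency scale, while the two CV values are unit-free percentages that can be compared directly.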

Skewness Coefficient
Description
The skewness coefficient is a measure of asymmetry for the frequency (or probability) distribution of a random variable.
How to estimate it?
1 Pearson’s first coefficient of skewness of the distribution of a random variable, with $m_x$ the mode:
$$a_1 = \frac{\mu_x - m_x}{\sigma_x}$$
2 Skewness coefficient:
$$a_1 = \frac{\mu^3_x}{\sigma^3_x} \quad \text{with} \quad \mu^3_x = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^3$$

Skewness, explained
1 When $a_1 > 0$ we have positive skewness with $\mu_x > m_x$ (positive asymmetry)
2 When $a_1 < 0$ we have negative skewness with $\mu_x < m_x$ (negative asymmetry)

Kurtosis Coefficient
Description
The kurtosis coefficient is a measure that assesses how flat or peaked the frequency (or probability) distribution of a random variable is.
How to estimate it?
Pearson’s coefficient of kurtosis:
$$a_2 = \frac{\mu^4_x}{\sigma^4_x} \quad \text{with} \quad \mu^4_x = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^4$$

Kurtosis, explained
1 When $a_2 - 3 > 0$ we have a more peaked distribution (leptokurtic or fat-tailed distribution)
2 When $a_2 - 3 < 0$ we have a less peaked distribution (platykurtic or thin-tailed distribution)
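
The moment-based skewness and kurtosis coefficients can be computed from a frequency table as sketched below. The car speed table is used here because it appears in full on the slides; the function name `moment_coefficients` is an assumption of this example, and all moments use the divisor N.

```python
def moment_coefficients(freq):
    """Return (a1, a2): moment-based skewness and kurtosis with divisor N."""
    n = sum(freq.values())
    mean = sum(f * x for x, f in freq.items()) / n

    def central_moment(order):
        return sum(f * (x - mean) ** order for x, f in freq.items()) / n

    variance = central_moment(2)
    a1 = central_moment(3) / variance ** 1.5  # third moment over sigma^3
    a2 = central_moment(4) / variance ** 2    # fourth moment over sigma^4
    return a1, a2


# Car speed frequency table from the slides: value -> frequency.
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}

a1, a2 = moment_coefficients(speed_freq)
print(f"a1 = {a1:.4f}, a2 = {a2:.4f}, a2 - 3 = {a2 - 3:.4f}")
```

The sign of a1 indicates the direction of the asymmetry, and the sign of a2 - 3 indicates whether the distribution is leptokurtic or platykurtic.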

Skewness and Kurtosis using Advertisement data
Estimating Skewness and Kurtosis using the Discrete Data Values:
$a_1 = 0.06949301$, Positive Asymmetry
$a_2 - 3 = -1.174533$, Platykurtic
Estimating Skewness and Kurtosis using the Classified Data Values:
$a_1 = 0.542137$, Positive Asymmetry
$a_2 - 3 = -0.7993088$, Platykurtic
1 Similar signs in Asymmetry and Kurtosis
2 Much stronger positive asymmetry in the Classified Data Values (look at the graph!!!)

Probability
Definition
Probability is the likelihood that a specific outcome of a random experiment will happen:
$$P(\text{event}) = \frac{\text{Number of possible outcomes in which the event occurs}}{\text{Total number of possible outcomes}}$$
1 Example I: The probability of getting “head” when tossing a coin (theoretical probability): $1/2$
2 Example III: The probability of getting a “King” when selecting a playing card (theoretical probability): $4/52$
3 Example IV: The probability of car speed 4 in the car speed data sampling experiment (empirical probability): $2/50$

Probability (cont.)
Types of Probability
1 Empirical Probability: the probability estimated as the outcome of an empirical experiment
2 Theoretical Probability: the probability derived as the outcome of a theoretical experiment
Important!!
1 Theoretical probabilities generally do not coincide exactly with the empirical ones
2 Researchers increase the sample size N in order to increase the accuracy of probability estimation
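
The gap between empirical and theoretical probability, and how it narrows as N grows, can be illustrated with a simulated fair coin. This is only a sketch: the function name `empirical_head_probability`, the sample sizes, and the fixed seed are assumptions chosen for the example.

```python
import random


def empirical_head_probability(n_tosses, seed=0):
    """Share of heads in n_tosses simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


theoretical = 0.5
for n in (10, 1_000, 100_000):
    estimate = empirical_head_probability(n)
    print(n, estimate, abs(estimate - theoretical))
```

The empirical share rarely equals 0.5 exactly, but its typical distance from 0.5 shrinks as the number of tosses increases.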

Car Speed using probabilities
[Figure: car speed relative-frequency plots; axis labels: car speed, speed, Density]

Car Speed using probabilities
Estimating Probabilities
1 $P(x = 4) = 0.04$
2 $P(x \le 7) = 0.04 + 0.04 = 0.08$
3 $P(x \ge 20) = 1 - P(x \le 19) = 0.24$
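
These empirical probabilities follow directly from the car speed frequency table, as the short sketch below shows. The variable names are assumptions of this example; the data are the frequencies listed in the Car Speed Data table.

```python
# Car speed frequency table from the slides: value -> frequency.
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}

n = sum(speed_freq.values())
prob = {x: f / n for x, f in speed_freq.items()}  # empirical probabilities

p_eq_4 = prob[4]
p_le_7 = sum(p for x, p in prob.items() if x <= 7)
p_ge_20 = 1 - sum(p for x, p in prob.items() if x <= 19)  # complement rule

print(round(p_eq_4, 2), round(p_le_7, 2), round(p_ge_20, 2))
```

The third probability uses the complement of the cumulative probability up to 19, exactly as in the slide.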

Probability (cont.)
Sample Space
The sample space ($\Omega$) is the collection of all possible outcomes of a random experiment:
$$\Omega = \{E_1, \ldots, E_k\}$$
Sample Space for Car Speed data
$$\Omega = \{4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25\}$$
Redefining Probability
For a given sample space $\Omega$, the probabilities satisfy the following:
1 $0 \le P(E_i) \le 1$, $i = 1, \ldots, k$
2 The sum of the probabilities is one: $\sum_{i=1}^{k} P(E_i) = 1$
beamer-tu-logo
Bibliography
Probability (cont.)
Laws of Probability
∪ symbolises the union of events
∩ symbolises the intersection of events
1 When E ∪ E′ = Ω and E ∩ E′ = ∅, the events E and E′ are complementary: they are mutually exclusive and P(E ∩ E′) = 0
2 For E1 and E2 with E1 ∩ E2 ≠ ∅, the probability of the union is P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2)
3 For E1 and E2 with E1 ∩ E2 = ∅, the probability of the union is P(E1 ∪ E2) = P(E1) + P(E2)
4 For independent events E1 and E2 we have P(E1 ∩ E2) = P(E1) × P(E2)
E.g.: E1 is the event of an “even” outcome when rolling one die, and E2 is the event of an “even” outcome when rolling another die
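The laws above can be checked by enumerating a small sample space. The sketch below assumes two fair six-sided dice, which is our choice of experiment; the helper `P` and the event names are hypothetical. Exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION: two fair six-sided dice; every ordered pair is equally likely.
omega = list(product(range(1, 7), repeat=2))


def P(event):
    """Probability of `event` (a predicate on outcomes) under equal likelihood."""
    return Fraction(sum(1 for o in omega if event(o)), len(omega))


def first_even(o):    # E1: first die shows an even number
    return o[0] % 2 == 0


def second_even(o):   # E2: second die shows an even number
    return o[1] % 2 == 0


def first_is_1(o):    # F1 and F2 cannot happen together
    return o[0] == 1


def first_is_2(o):
    return o[0] == 2


p_inter = P(lambda o: first_even(o) and second_even(o))
p_union = P(lambda o: first_even(o) or second_even(o))

# Law 2: general addition rule for overlapping events.
assert p_union == P(first_even) + P(second_even) - p_inter
# Law 4: E1 and E2 concern different dice, so they are independent.
assert p_inter == P(first_even) * P(second_even)
# Law 3: mutually exclusive events have an empty intersection.
assert P(lambda o: first_is_1(o) and first_is_2(o)) == 0
assert P(lambda o: first_is_1(o) or first_is_2(o)) == P(first_is_1) + P(first_is_2)

print(p_inter, p_union)
```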
Probability (cont.)
Conditional Probability
1 Marginal Probability: The probability that a given event will occur. No other events are taken into consideration. A typical expression is P(A) for the A event.
2 Joint Probability: The probability that two or more events will all occur. A typical expression is P(A∩B) for the A and B events.
3 Conditional Probability: The probability that an event will occur, given that another event has already happened. A typical expression is P(A|B), with the verbal description, “the probability of A, given B.”
Probability (cont.)
Conditional Probability
Suppose we have data on two (2) random variables. The first is the quality of a product, Ei, which is high, medium or low for i=1, i=2 and i=3 respectively. The second divides our sample space into A and B, for the A-market area and the B-market area respectively. We can assign joint frequencies (probabilities) to the 2-entry table as follows:
      A            B
E1    P(E1∩A)      P(E1∩B)      P(E1)
E2    P(E2∩A)      P(E2∩B)      P(E2)
E3    P(E3∩A)      P(E3∩B)      P(E3)
Probability (cont.)
Conditional Probability (cont.)
Using data we can derive the following 2-entry table:
      A                   B
E1    P(E1∩A) = 0.48      P(E1∩B) = 0.12      P(E1) = 0.60
E2    P(E2∩A) = 0.15      P(E2∩B) = 0.10      P(E2) = 0.25
E3    P(E3∩A) = 0.0225    P(E3∩B) = 0.1275    P(E3) = 0.15
      P(A) = 0.6525       P(B) = 0.3475       P(Ω) = 1
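The row and column totals of the table are marginal probabilities obtained by summing the joint probabilities. A minimal sketch of that bookkeeping follows; the dictionary layout and the variable names are our own choices, while the six joint probabilities are the ones in the table.

```python
# Joint probabilities P(Ei ∩ market) taken from the 2-entry table.
joint = {
    ("E1", "A"): 0.48,   ("E1", "B"): 0.12,
    ("E2", "A"): 0.15,   ("E2", "B"): 0.10,
    ("E3", "A"): 0.0225, ("E3", "B"): 0.1275,
}

quality_marginals = {}  # P(E1), P(E2), P(E3): row sums
market_marginals = {}   # P(A), P(B): column sums
for (quality, market), p in joint.items():
    quality_marginals[quality] = quality_marginals.get(quality, 0.0) + p
    market_marginals[market] = market_marginals.get(market, 0.0) + p

total = sum(joint.values())  # P(Ω), which should be 1 up to rounding

print({k: round(v, 4) for k, v in quality_marginals.items()})
print({k: round(v, 4) for k, v in market_marginals.items()})
print(round(total, 4))
```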
Probability (cont.)
Conditional Probability (cont.)
One can derive the probability of having a product of high quality conditional on being in the A market area as:
P(E1|A) = P(E1∩A) / P(A),
where
P(A) = P(E1∩A) + P(E2∩A) + P(E3∩A) is the marginal probability, and
P(E1∩A) = P(E1|A) × P(A)
P(E1∩B) = P(E1|B) × P(B)
are the joint probabilities.
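As a worked instance of the formula, the values from the 2-entry table can be substituted directly. The rounding to four decimals is ours.

```latex
\[
P(E_1 \mid A) = \frac{P(E_1 \cap A)}{P(A)} = \frac{0.48}{0.6525} \approx 0.7356,
\qquad
P(E_1 \mid B) = \frac{P(E_1 \cap B)}{P(B)} = \frac{0.12}{0.3475} \approx 0.3453 .
\]
```

Multiplying each result back by its marginal recovers the joint probabilities 0.48 and 0.12, up to rounding.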
Probability (cont.)
Conditional Probability (cont.)
One can derive the probability of having a product of at least medium quality conditional on being in the A market area as:
P(E1∪E2|A) = P((E1∪E2)∩A) / P(A),
where
P(A) = P(E1∩A) + P(E2∩A) + P(E3∩A) = 0.6525 is the marginal probability, and
P((E1∪E2)∩A) = P((E1∩A)∪(E2∩A)) = P(E1∩A) + P(E2∩A) = 0.63.
So P(E1∪E2|A) = 0.63 / 0.6525 ≈ 0.9655.
Question?
Is this probability bigger than that of having a product of at least medium quality conditional on being in the B market area?
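One way to check the question numerically is to compute the same conditional probability for both market areas from the joint probabilities of the 2-entry table. The sketch below is illustrative only; the function name `p_at_least_medium_given` is hypothetical.

```python
# Joint probabilities P(Ei ∩ market) taken from the 2-entry table.
joint = {
    ("E1", "A"): 0.48,   ("E1", "B"): 0.12,
    ("E2", "A"): 0.15,   ("E2", "B"): 0.10,
    ("E3", "A"): 0.0225, ("E3", "B"): 0.1275,
}


def p_at_least_medium_given(market):
    """P(E1 ∪ E2 | market) = P((E1 ∪ E2) ∩ market) / P(market)."""
    # Marginal probability of the market area: sum over all quality levels.
    p_market = sum(p for (_, m), p in joint.items() if m == market)
    # E1 and E2 are mutually exclusive, so their joint probabilities add up.
    p_joint = sum(p for (q, m), p in joint.items()
                  if m == market and q in ("E1", "E2"))
    return p_joint / p_market


p_a = p_at_least_medium_given("A")  # 0.63 / 0.6525
p_b = p_at_least_medium_given("B")  # 0.22 / 0.3475
print(round(p_a, 4), round(p_b, 4), p_a > p_b)
```

With these table values the conditional probability for area A comes out at about 0.9655 and the one for area B at about 0.6331.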
Bibliography
Anderson D.R., Sweeney D.J.,
Statistics for Business and Economics.
South-Western College Pub; 11th edition, 2010
Weiers R.M.,
Introduction to Business Statistics.
South-Western College Pub; 7th edition, 2010.
Koop G.,
Analysis of Economic Data.