I Descriptive Statistics (PART I)

Academic year: 2019

(1)

Introductory Statistics

ECON1005

(2)

I Descriptive Statistics (PART I)

Introduction

(3)

INTRODUCTION

What is Statistics?

Basic Definitions

(4)

What is Statistics?

Statistics is a group of methods used to

(5)

Branches of Statistics

STATISTICS

▫ DESCRIPTIVE STATISTICS (characterise attributes of sample and/or population)

(6)

Descriptive Statistics

Used to collect, organize, display and analyse data

There are two types:

1. Numerical: involves the computation of a statistic (e.g. the average)

2. Graphical: involves representing the data using pictures

(7)

Inferential Statistics

Uses sample results to make generalizations, inferences and predictions about a wider population

There are two types:

▫ Estimation: a sample is used to estimate a parameter

▫ Hypothesis Testing

(8)

Descriptive vs. Inferential Statistics

Descriptive Statistics

▫ Collect

▫ Organize

▫ Summarize

▫ Display

▫ Analyze

Inferential Statistics

▫ Predict and forecast values of a population

▫ Test hypotheses about values of a population

(9)

Basic Definitions I

Population:

▫ A population is the collection of all items whose characteristics are being studied. N represents the population size

▫ Values calculated using population data are called parameters

Sample:

▫ A sample is a portion of the population selected for study. n represents the sample size
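To make the population/sample distinction concrete, here is a minimal sketch in Python. The height values are made up for illustration, and the names `population`, `sample`, `parameter` and `statistic` are assumptions of this sketch rather than anything defined in the slides.

```python
import random
from statistics import mean

# Hypothetical population: heights (cm) of N = 10 individuals (made-up values).
population = [150, 155, 160, 162, 165, 168, 170, 172, 175, 180]
N = len(population)

# A sample is a portion of the population; here n = 4 items drawn at random.
rng = random.Random(0)  # fixed seed so the draw is repeatable
sample = rng.sample(population, 4)
n = len(sample)

parameter = mean(population)  # value calculated from population data
statistic = mean(sample)      # value calculated from sample data

print(f"N = {N}, population mean (parameter) = {parameter}")
print(f"n = {n}, sample mean (statistic) = {statistic}")
```

The sample mean will usually differ from the population mean, which is why inferential statistics is needed to reason from one to the other.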

(10)

Basic Definitions II

Data:

▫ numbers or measurements that are collected

Variables:

▫ characteristics or attributes that enable us to distinguish one individual from another

▫ they take on different values when different individuals are observed (e.g. height)

Element:

(11)

Summarising & Describing Data

Describing the observed patterns in data is an important part of statistics

Distribution of a single variable

(12)

Describing Data

S: Shape

What is the overall shape of the distribution? (symmetric or skewed / mounded or flat)

U: Unusual

Are there any unusual points? (errors, outliers or influential points)

M: Middle

Where is the centre of the distribution? (mean, median, mode)

(13)

Organizing and Graphing Data

Introduction

Frequency Distributions

Bar Charts

(14)

Introduction

There are two main types of data:

Quantitative

This is information presented in the form of numbers, percentages or statistics

It answers in numerical terms such questions as "how often" and "how many"

Qualitative

(15)

Organizing Data

A frequency distribution lists all categories/classes and the number of elements that belong to each

(16)

Frequency Distributions

A frequency distribution:

▫ a table in which measurements are tallied

▫ then the frequency, or total number of times that each item occurs, is recorded

Usually measurements are arranged in ascending or descending order

A frequency distribution has 3 columns:

▫ the data categories or classes

▫ the tally column (for raw data)

▫ the corresponding frequencies
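As an illustration of the three columns, the tallying step can be sketched in Python with `collections.Counter`. The raw responses below are hypothetical (made up for this sketch), not data from the course.

```python
from collections import Counter

# Hypothetical raw survey responses (made up for illustration).
responses = ["Yes", "No", "Yes", "Undecided", "Yes",
             "No", "Yes", "Yes", "No", "Yes"]

# Counter records how many times each category occurs.
frequency = Counter(responses)

# Print the three columns: category, tally marks, frequency.
print(f"{'CATEGORY':<10} {'TALLY':<8} {'FREQUENCY':>9}")
for category in ["Yes", "No", "Undecided"]:
    count = frequency[category]
    print(f"{category:<10} {'|' * count:<8} {count:>9}")
print(f"{'Total':<10} {'':<8} {sum(frequency.values()):>9}")
```

The tally column is only needed while counting raw data by hand; once the frequencies are known it is usually left blank.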

(17)

Examples

Qualitative

CATEGORY                TALLY   FREQUENCY
Yes                             23
No                              13
Undecided                       4
Total                           40

Quantitative

Ad. expenditure (J$M)   TALLY   FREQUENCY
5M - 9M                         10
10M - 14M                       25
15M - 19M                       15
20M - 24M                       19
25M - 29M                       8
Total                           77

(18)

Frequency Distribution Cont’d

Two main types of frequency distributions:

▫ Ungrouped data

▫ Grouped data

Ad. expenditure (J$M)   Tally   Number of Firms (Frequency)
5M - 9M                         10
10M - 14M                       25
15M - 19M                       15
20M - 24M                       19
25M - 29M                       8
Total                           77

Ages   Tally   Frequency

(19)

Class Intervals/Limits

▫ the largest or smallest numbers which can actually belong to each class

▫ each class has a lower class limit and an upper class limit

Ad. expenditure (J$M)   Tally   Number of Firms
5 – 9                           10
10 – 14                         25
15 – 19                         15
20 – 24                         19
25 – 29                         8

(20)

Class Boundaries

▫ the numbers which separate classes

▫ given by the midpoint of the upper limit of one class and the lower limit of the next class

Ad. expenditure (J$M)   Class Boundaries   Tally   Number of Firms
5 – 9                                              10
10 – 14                 9.5 – 14.5                 25
15 – 19                                            15
20 – 24                                            19
25 – 29                                            8
Total                                              77

Lower class boundary for 2nd class (10 – 14): (9 + 10) / 2 = 9.5
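The boundary rule can be applied to every class at once. The following is a minimal Python sketch using the class limits from the advertising-expenditure table. It assumes the gap between adjacent classes is the same throughout and that the outermost boundaries are extended by the same half-gap; the variable names are illustrative.

```python
# Class limits from the advertising-expenditure table (J$M).
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]

# Gap between the upper limit of one class and the lower limit of the next.
# Assumption: the gap is the same between every pair of adjacent classes.
gap = class_limits[1][0] - class_limits[0][1]  # 10 - 9 = 1

# A boundary is the midpoint of that gap, so each limit moves by gap / 2.
# The outermost boundaries are extended by the same amount.
class_boundaries = [(lower - gap / 2, upper + gap / 2)
                    for lower, upper in class_limits]

for (lower, upper), (lb, ub) in zip(class_limits, class_boundaries):
    print(f"{lower:>2} - {upper:<2}  ->  {lb} - {ub}")
```

Each upper boundary equals the lower boundary of the next class, so the classes touch without overlapping.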

(21)

Class Mark (Midpoint)

▫ found by taking the average of the class limits (or class boundaries)

Ad. expenditure (J$M)   Class Boundaries   Midpoints   Number of Firms
5 – 9                   4.5 – 9.5          7           10
10 – 14                 9.5 – 14.5                     25
15 – 19                 14.5 – 19.5                    15
20 – 24                 19.5 – 24.5                    19
25 – 29                 24.5 – 29.5                    8
Total                                                  77

Class 1 - Using Class Limits: (5 + 9) / 2 = 7
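A short Python sketch of the midpoint calculation for all five classes, done both ways to show that class limits and class boundaries give the same class mark. The helper name `midpoint` is an assumption of this sketch.

```python
# Class limits and class boundaries from the advertising-expenditure table.
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
class_boundaries = [(4.5, 9.5), (9.5, 14.5), (14.5, 19.5),
                    (19.5, 24.5), (24.5, 29.5)]


def midpoint(lower, upper):
    """Class mark: the average of the two ends of a class."""
    return (lower + upper) / 2


midpoints_from_limits = [midpoint(lo, hi) for lo, hi in class_limits]
midpoints_from_boundaries = [midpoint(lo, hi) for lo, hi in class_boundaries]

print(midpoints_from_limits)
print(midpoints_from_boundaries)
```

Both lists come out the same because each boundary is shifted by the same amount (0.5) in opposite directions from the limits.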

(22)

Class Width

▫ aka: class size, class length

▫ Two ways of calculating:

▫ Method 1: the difference between corresponding class limits of two consecutive classes (e.g. lower limit to lower limit)

▫ Method 2: the difference between the upper and lower class boundaries of a class

Ad. expenditure (J$M)   Class Boundaries   Midpoints   Number of Firms
5 – 9                   4.5 – 9.5          7           10
10 – 14                 9.5 – 14.5         12          25
15 – 19                 14.5 – 19.5        17          15
20 – 24                 19.5 – 24.5        22          19
25 – 29                 24.5 – 29.5        27          8
Total                                                  77

Using Class Limits
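The two methods can be checked against each other in a few lines of Python. This sketch reads "corresponding class limits" as the lower limits of two consecutive classes, which is an interpretation rather than something the slide spells out.

```python
# Class limits and class boundaries from the advertising-expenditure table.
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
class_boundaries = [(4.5, 9.5), (9.5, 14.5), (14.5, 19.5),
                    (19.5, 24.5), (24.5, 29.5)]

# Method 1 (assumed reading): lower limit of the next class minus the
# lower limit of this class.
width_from_limits = class_limits[1][0] - class_limits[0][0]  # 10 - 5

# Method 2: upper boundary minus lower boundary of the same class.
width_from_boundaries = class_boundaries[0][1] - class_boundaries[0][0]  # 9.5 - 4.5

# For comparison: upper limit minus lower limit of ONE class is smaller
# than the width, because it ignores the gap between classes.
limit_gap = class_limits[0][1] - class_limits[0][0]  # 9 - 5

print(width_from_limits, width_from_boundaries, limit_gap)
```

Both methods give a width of 5, whereas subtracting the limits of a single class gives 4.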

(23)

Relative Frequency

Found by dividing the frequency of a category/class by the sum of all frequencies

▫ The relative frequencies MUST sum to 1

▫ Sometimes expressed as a percentage

Ad. expenditure (J$M)   Class Boundaries   Number of Firms   Relative Frequency
5 – 9                   4.5 – 9.5          10                0.13
10 – 14                 9.5 – 14.5         25
15 – 19                 14.5 – 19.5        15
20 – 24                 19.5 – 24.5        19
25 – 29                 24.5 – 29.5        8
Total                                      77                1.00

General Formula: relative frequency of a class = frequency of that class / sum of all frequencies
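As a worked example, the relative frequencies for the five advertising-expenditure classes can be computed with a short Python sketch. The variable names are illustrative; only the class labels and frequencies come from the table.

```python
# Classes and frequencies from the advertising-expenditure table.
classes = ["5 - 9", "10 - 14", "15 - 19", "20 - 24", "25 - 29"]
frequencies = [10, 25, 15, 19, 8]

total = sum(frequencies)  # sum of all frequencies

# Relative frequency = class frequency / sum of all frequencies.
relative = [f / total for f in frequencies]

for label, f, r in zip(classes, frequencies, relative):
    # Show the value as a proportion (2 d.p.) and as a percentage.
    print(f"{label:<8} {f:>3}  {r:.2f}  {r:.0%}")
print(f"{'Total':<8} {total:>3}  {sum(relative):.2f}")
```

The unrounded relative frequencies sum to 1; values rounded to two decimal places may sum to slightly more or less because of rounding.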

(24)

Distributions

1. The classes must be "mutually exclusive": no element can belong to more than one class

2. Even if the frequency is zero, include each and every class

3. Make all classes the same width (open-ended classes may be inevitable)

4. Target between 5 and 20 classes, depending on the range and number of data points

(25)

Consider the following data set:

2.3 4.2 2.8 6.7 4.7 1.6 2.0 1.4 1.0 2.8
1.8 5.2 6.0 5.2 3.5 1.0 3.6 5.1 1.9 7.3
2.5 5.6 3.3 3.4 2.9 3.0 1.8 2.1 3.1 2.8
2.1 4.3 7.1 4.9 1.6 2.2 4.5 6.3 2.7 8.3

a. Group these figures into a frequency distribution having the classes: 1.0 – 1.9, 2.0 – 2.9, 3.0 – 3.9, 4.0 – 4.9, 5.0 – 5.9, 6.0 – 6.9, 7.0 – 7.9, and 8.0 – 8.9

b. Calculate the class boundaries

c. Calculate the class midpoints

d. Calculate the class width
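One way to check answers to this exercise is the Python sketch below. It assumes the 40 values are as listed on this slide (the order does not affect the counts) and treats the class limits as inclusive; the variable names are illustrative, and rounding is used only to avoid floating-point noise.

```python
# The 40 observations from this slide.
data = [2.3, 4.2, 2.8, 6.7, 4.7, 1.6, 2.0, 1.4, 1.0, 2.8,
        1.8, 5.2, 6.0, 5.2, 3.5, 1.0, 3.6, 5.1, 1.9, 7.3,
        2.5, 5.6, 3.3, 3.4, 2.9, 3.0, 1.8, 2.1, 3.1, 2.8,
        2.1, 4.3, 7.1, 4.9, 1.6, 2.2, 4.5, 6.3, 2.7, 8.3]

# Class limits given in part (a).
classes = [(1.0, 1.9), (2.0, 2.9), (3.0, 3.9), (4.0, 4.9),
           (5.0, 5.9), (6.0, 6.9), (7.0, 7.9), (8.0, 8.9)]

# a. Frequency of each class (limits are inclusive).
frequencies = [sum(lower <= x <= upper for x in data)
               for lower, upper in classes]

# b. Class boundaries: halfway across the 0.1 gap between adjacent classes.
half_gap = round(classes[1][0] - classes[0][1], 1) / 2
boundaries = [(round(lower - half_gap, 2), round(upper + half_gap, 2))
              for lower, upper in classes]

# c. Class midpoints: average of the class limits.
midpoints = [round((lower + upper) / 2, 2) for lower, upper in classes]

# d. Class width: lower limit of the next class minus this lower limit.
width = round(classes[1][0] - classes[0][0], 1)

for (lo, hi), f, (lb, ub), m in zip(classes, frequencies, boundaries, midpoints):
    print(f"{lo} - {hi}  f={f:>2}  boundaries {lb} - {ub}  midpoint {m}")
print("Total:", sum(frequencies), " Class width:", width)
```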

(26)
(27)

Graphical Representation

When presenting Quantitative Data use:

▫ histograms

▫ frequency polygons

▫ cumulative frequency polygons (ogives)

When presenting Qualitative Data use:

▫ bar charts

▫ pie charts

(28)

Bar Charts

▫ A graphical way of presenting qualitative data

Bars (columns) are separated from each other and have the same width

Categories are placed on the horizontal axis and frequencies (or relative frequencies) on the vertical axis

(29)

Pie Charts

▫ A graphical way of presenting qualitative data

A pie chart is a circle divided into portions that represent the relative frequencies of the different categories.

To construct a pie chart:

(30)

Qualitative Example

The following are the results for a third year statistics course:

A - 41

B+ - 12

B - 2

C - 22

F - 19

▫ Calculate the relative frequencies

▫ Construct a bar chart
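A possible way to check this exercise is sketched below in Python. The grade counts are those listed on this slide; the text rendering of the bar chart is an assumption of this sketch and draws the bars horizontally for convenience, whereas a drawn bar chart would put the categories on the horizontal axis.

```python
# Grade counts from this slide.
grades = {"A": 41, "B+": 12, "B": 2, "C": 22, "F": 19}

total = sum(grades.values())

# Relative frequency = count for the grade / total number of results.
relative = {grade: count / total for grade, count in grades.items()}

# Simple text bar chart: one '#' per student, bars separated line by line.
for grade, count in grades.items():
    print(f"{grade:<2} | {'#' * count} {count} ({relative[grade]:.3f})")
print(f"Total: {total}")
```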

(31)

Histograms

A graphical way of presenting quantitative data

The data are divided into classes of equal width and the number of observations in each class is counted (this information would be presented in a frequency table)

Class is on the x-axis (horizontal)

▫ Can plot using either:

Class Limits

Class Boundaries

Frequency (or relative frequency) is on the y-axis (vertical)

Bars are drawn so that the base of each bar covers the class and the height of each bar represents the frequency
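The bar geometry described here can be sketched in Python without a plotting library. The sketch uses the advertising-expenditure classes from the earlier slides and plots with class boundaries; the `bars` structure and the text rendering are assumptions made for illustration.

```python
# Class boundaries and frequencies from the advertising-expenditure table.
class_boundaries = [(4.5, 9.5), (9.5, 14.5), (14.5, 19.5),
                    (19.5, 24.5), (24.5, 29.5)]
frequencies = [10, 25, 15, 19, 8]

# Each bar: base runs from lower to upper boundary, height is the frequency.
bars = [(lb, ub, f) for (lb, ub), f in zip(class_boundaries, frequencies)]

# With class boundaries, adjacent bars share an edge (no gaps between bars).
touching = all(bars[i][1] == bars[i + 1][0] for i in range(len(bars) - 1))

# Text rendering: one row per class, bar length equal to the frequency.
for lb, ub, f in bars:
    print(f"{lb:>4} - {ub:<4} | {'#' * f} {f}")
print("Bars touch:", touching)
```

Plotting with class limits instead would leave a gap of one unit between neighbouring bars, since 9 and 10 are not the same point.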

(32)
Figure 2 – plotted using class limits
(33)

4. Frequency Polygons

A Frequency Polygon is a line graph joining the midpoints of the tops of the bars of a histogram

To construct a frequency polygon:

▫ Plot the midpoint of each class (on the horizontal axis) against its corresponding frequency/relative frequency (on the vertical axis)
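The points of a frequency polygon can be listed with a short Python sketch, using the midpoints and frequencies of the advertising-expenditure classes. Closing the polygon on the horizontal axis with an empty class on each side is a common convention assumed here, not something stated on the slide.

```python
# Midpoints and frequencies from the advertising-expenditure table.
midpoints = [7, 12, 17, 22, 27]
frequencies = [10, 25, 15, 19, 8]
width = 5  # class width

# One point per class: (midpoint, frequency).
points = list(zip(midpoints, frequencies))

# Assumed convention: add a zero-frequency point one class width before the
# first midpoint and one after the last, so the polygon meets the axis.
closed_points = ([(midpoints[0] - width, 0)]
                 + points
                 + [(midpoints[-1] + width, 0)])

for x, y in closed_points:
    print(f"midpoint {x:>2}: frequency {y}")
```

Joining these points in order with straight line segments gives the polygon.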

(34)
(35)

Less Than Cumulative Frequency

Examines how many observations lie below a certain class boundary

Plotted against the upper class boundaries

Using Frequencies

The first value in the distribution is ALWAYS zero

The last value in the distribution is ALWAYS the total number of observations

Using Relative Frequencies

The first value in the distribution is ALWAYS zero

The last value in the distribution is ALWAYS 1

Using Percentages

The first value in the distribution is ALWAYS zero

The last value in the distribution is ALWAYS 100

(36)

Ad. expenditure (J$M)   Upper Class Boundaries   Number of Firms   Cumulative Frequency
                        4.5
5 – 9                   9.5                      10
10 – 14                 14.5                     25
15 – 19                 19.5                     15
20 – 24                 24.5                     19
25 – 29                 29.5                     8
Total                   -                        77
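The empty cumulative frequency column can be filled in with a running total. Here is a minimal Python sketch using `itertools.accumulate`; the variable names are illustrative and the boundaries and frequencies are those in the table.

```python
from itertools import accumulate

# Boundaries (starting at the lowest lower boundary, 4.5) and frequencies.
boundaries = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5]
frequencies = [10, 25, 15, 19, 8]
total = sum(frequencies)

# "Less than" cumulative frequency: number of firms below each boundary.
# Nothing lies below 4.5, so the list starts at zero.
less_than = [0] + list(accumulate(frequencies))

# The same values as relative frequencies and as percentages.
less_than_relative = [c / total for c in less_than]
less_than_percent = [100 * c / total for c in less_than]

for b, c, p in zip(boundaries, less_than, less_than_percent):
    print(f"less than {b:>4}: {c:>2} firms ({p:.1f}%)")
```

The first value is zero and the last is the total (77), matching the rules stated for the less than distribution.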

(37)

More Than Cumulative Frequency

Examines how many observations lie above a certain class boundary

Plotted against the upper class boundaries

Using Frequencies

The first value in the distribution is ALWAYS the total number of observations

The last value in the distribution is ALWAYS zero

Using Relative Frequencies

The first value in the distribution is ALWAYS 1

The last value in the distribution is ALWAYS zero

Using Percentages

The first value in the distribution is ALWAYS 100

The last value in the distribution is ALWAYS zero

(38)

Ad. expenditure (J$M)   Upper Class Boundaries   Number of Firms   More Than Cumulative Frequency
                        4.5
5 – 9                   9.5                      10
10 – 14                 14.5                     25
15 – 19                 19.5                     15
20 – 24                 24.5                     19
25 – 29                 29.5                     8
Total                   -                        77
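The more than column can be obtained by subtracting the running total from the overall total. A minimal Python sketch under the same assumptions as the table (boundaries starting at 4.5, frequencies as listed); the variable names are illustrative.

```python
from itertools import accumulate

# Boundaries (starting at the lowest lower boundary, 4.5) and frequencies.
boundaries = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5]
frequencies = [10, 25, 15, 19, 8]
total = sum(frequencies)

# Number of firms below each boundary (running total, starting at zero).
below = [0] + list(accumulate(frequencies))

# "More than" cumulative frequency: firms above each boundary.
more_than = [total - c for c in below]

# The same values expressed as percentages of the total.
more_than_percent = [100 * c / total for c in more_than]

for b, c, p in zip(boundaries, more_than, more_than_percent):
    print(f"more than {b:>4}: {c:>2} firms ({p:.1f}%)")
```

The first value is the total (77) and the last is zero; at every boundary the less than and more than values add up to the total.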

(39)

Using the example on slide 25,

a. Construct a histogram with a superimposed frequency polygon

b. Calculate:

Less than cumulative frequencies

More than cumulative frequencies

c. Construct the:

Less than cumulative frequency ogive

