
(1)

Introduction to Statistics and Data Fall/2017-18

(2)

What is statistics?

Webster’s Third New International Dictionary

Statistics is a science dealing with the collection, analysis, interpretation, and presentation of numerical data

Uses mathematics and probability

(3)

Data

"facts or figures from which conclusions can be drawn"

−Before one can present and interpret information, there has to be a process of gathering and sorting data.

−Just as trees are the raw material from which paper is produced, so too can data be viewed as the raw material from which information is obtained.

Nihar Ranjan Roy

Data, Information and Statistics

Example: data collected on the weight of 20 individuals in your classroom

Data                      Information                               Statistics
20, 21, 21.5, 24, 25 kg   5 individuals in the 20-to-25-kg range    Mean weight = 22.5 kg
28 kg, 30 kg, etc.        15 individuals in the 26-to-30-kg range   Median weight = 28 kg
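The progression from data to information to statistics can be sketched in a few lines of Python. The weights below are hypothetical values chosen only for illustration; they are not the classroom dataset from the slide, so the resulting mean and median differ from the figures in the table.

```python
from statistics import mean, median

# Hypothetical weights in kg (illustrative only, not the slide's dataset).
weights = [20, 21, 21.5, 24, 25, 26, 26.5, 27, 27, 27.5,
           28, 28, 28.5, 29, 29, 29.5, 30, 30, 30, 30]

# Information: raw data organized into ranges so that meaning emerges.
in_low_range = sum(1 for w in weights if 20 <= w <= 25)
in_high_range = sum(1 for w in weights if 26 <= w <= 30)

# Statistics: information obtained through mathematical operations on the data.
mean_weight = mean(weights)
median_weight = median(weights)

print(f"{in_low_range} individuals in the 20-to-25-kg range")
print(f"{in_high_range} individuals in the 26-to-30-kg range")
print(f"Mean weight = {mean_weight} kg, median weight = {median_weight} kg")
```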

(4)

Information

A good definition of information is "data that have been recorded, classified, organized, related, or interpreted within a framework so that meaning emerges".

(5)

Statistics

"a type of information obtained through mathematical operations on numerical data".

(6)

Types of Statistics

Population:

−A collection of persons, objects or items of interest.

−When researchers gather data from the whole population for a given measurement of interest, it is called a census.

Sample:

−A portion of the whole

Statistics

−Descriptive Statistics

−Inferential Statistics

(7)

Descriptive Vs Inferential Statistics

Descriptive Statistics

Data are gathered on a group to describe or reach conclusions about that same group.

Example: Athletic statistics

Inferential Statistics

Data are gathered from a sample, and the resulting statistics are used to reach conclusions about the population from which the sample was taken.

Example: Market research
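The difference between the two branches can be illustrated with a small sketch. Everything here is an assumption made for illustration: the population is simulated, the sample size of 100 is arbitrary, and the interval uses the normal approximation (the 1.96 multiplier) rather than any method named in the slides.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 measurements (simulated, not real data).
population = [random.gauss(50, 10) for _ in range(10_000)]

# A census would measure every member of the population.
population_mean = mean(population)

# In practice we usually observe only a sample.
sample = random.sample(population, 100)

# Descriptive: summarize the sample itself.
sample_mean = mean(sample)

# Inferential: use the sample to say something about the population.
# Rough 95% interval for the population mean (normal approximation).
standard_error = stdev(sample) / len(sample) ** 0.5
interval = (sample_mean - 1.96 * standard_error,
            sample_mean + 1.96 * standard_error)

print(f"Sample mean (descriptive): {sample_mean:.2f}")
print(f"Estimated population mean (inferential): "
      f"{interval[0]:.2f} to {interval[1]:.2f}")
print(f"Population mean from a full census: {population_mean:.2f}")
```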

(8)

Descriptive Statistics

Descriptive statistics allow you to characterize your data based on its properties. There are four major types of descriptive statistics:

1. Measures of Frequency

− Count, Percent, Frequency

− Shows how often something occurs

− Use this when you want to show how often a response is given

2. Measures of Central Tendency

− Mean, Median, and Mode

− Locates the distribution by various points

− Use this when you want to show the average or most commonly indicated response

3. Measures of Dispersion or Variation

− Range, Variance, Standard Deviation

− Identifies the spread of scores by stating intervals

− Range = difference between the highest and lowest points

− Variance or Standard Deviation = based on the differences between each observed score and the mean

− Use this when you want to show how "spread out" the data are. It is helpful to know when your data are so spread out that the spread affects the mean

4. Measures of Position

− Percentile Ranks, Quartile Ranks

− Describes how scores fall in relation to one another. Relies on standardized scores

− Use this when you need to compare scores to a normalized score (e.g., a national norm)
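The four types of measure can each be computed with Python's standard library. This is a minimal sketch: the scores are hypothetical, population variance is used for simplicity, and the percentile rank follows one common definition (the percentage of scores strictly below a given score).

```python
from collections import Counter
from statistics import mean, median, mode, pstdev, pvariance, quantiles

# Hypothetical test scores (illustrative only).
scores = [70, 85, 85, 90, 60, 75, 85, 95, 80, 75]

# 1. Measures of frequency: how often each score occurs.
counts = Counter(scores)
percent_85 = 100 * counts[85] / len(scores)

# 2. Measures of central tendency.
average = mean(scores)
middle = median(scores)
most_common = mode(scores)

# 3. Measures of dispersion.
score_range = max(scores) - min(scores)
variance = pvariance(scores)   # mean squared deviation from the mean
std_dev = pstdev(scores)       # square root of the variance

# 4. Measures of position.
q1, q2, q3 = quantiles(scores, n=4)  # quartile cut points
percentile_rank_90 = 100 * sum(s < 90 for s in scores) / len(scores)

print(f"85 occurs {counts[85]} times ({percent_85:.0f}% of scores)")
print(f"mean={average}, median={middle}, mode={most_common}")
print(f"range={score_range}, variance={variance}, std dev={std_dev:.2f}")
print(f"quartiles={q1}, {q2}, {q3}; percentile rank of 90 = {percentile_rank_90:.0f}")
```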

(9)

Data???

What can be the possible form of data?

What operations can be performed on this data?

What does this data represent?

Relationships between two values? Interpretation

How to analyse this data?

(10)

Data Measurement

Not all measured data should be analysed the same way statistically.

Hence the need for levels of data measurement:

− Nominal

− Ordinal

− Interval

− Ratio

(11)

Nominal Level

Nominal — In nominal measurement the values just "name" the attribute uniquely.

−No ordering of the cases is implied.

−For example, a person's gender is nominal. It doesn't matter whether you call them boys vs. girls or males vs. females or XY vs. XX chromosomes.

−Another example is religion – Catholic, Protestant, Muslim, etc.

(12)

Ordinal Level

Ordinal - A variable is ordinally measurable if ranking is possible for values of the variable.

−For example, a gold medal reflects superior performance to a silver or bronze medal in the Olympics. You can’t say a gold and a bronze medal average out to a silver medal, though.

−Preference scales are typically ordinal. For example: how much do you like this cereal?

1 = Like it a lot

2 = Somewhat like it

3 = Neutral

4 = Somewhat dislike it

5 = Dislike it a lot
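Because ordinal values can be ranked but not averaged, the median and mode are the natural summaries. The sketch below assumes a set of hypothetical survey responses on the cereal scale; `median_low` is used so that the result is always one of the actual scale categories.

```python
from statistics import median_low, mode

# The five-point cereal preference scale.
SCALE = {
    1: "Like it a lot",
    2: "Somewhat like it",
    3: "Neutral",
    4: "Somewhat dislike it",
    5: "Dislike it a lot",
}

# Hypothetical responses from nine people (illustrative only).
responses = [1, 2, 2, 3, 5, 2, 4, 1, 2]

# Ranking is meaningful, so the middle response is a valid summary.
typical_response = SCALE[median_low(responses)]

# Counting is meaningful too, so the most frequent response is valid.
most_frequent_response = SCALE[mode(responses)]

# An arithmetic mean is not computed: the gaps between categories
# are not known to be equal, so averaging the codes has no clear meaning.
print(f"Median response: {typical_response}")
print(f"Most frequent response: {most_frequent_response}")
```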

(13)

Interval Level

Interval - In interval measurement the distance between attributes does have meaning.

−Numerical data typically fall into this category

−For example, when measuring temperature (in Fahrenheit), the distance from 30 to 40 is the same as the distance from 70 to 80. The interval between values is interpretable.
(14)

Ratio Level

Ratio — in ratio measurement there is always a reference point that is meaningful (either 0 for rates or 1 for ratios)

−This means that you can construct a meaningful fraction (or ratio) with a ratio variable.

−In applied social research most "count" variables are ratio, for example, the number of clients in past six months.

(15)

Cardinal Level

A Cardinal Number says how many of something there are, such as one, two, three, four, five.

−A Cardinal Number answers the question "How Many?"

−It does not have fractions or decimals, it is only used for counting.

Cardinal - A variable is cardinally measurable if a given interval between measures has a consistent meaning, i.e., if the measure corresponds to points along a straight line.

−For example, height, output, and income are cardinally measurable

(16)

Nominal level data

Numbers are used to classify or categorize

Example: Employment Classification

−1 for Educator

−2 for Construction Worker

−3 for Manufacturing Worker
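With nominal codes, counting and the mode are meaningful, but arithmetic on the code numbers is not. The sketch below uses hypothetical survey codes and shows that swapping two code numbers leaves the category counts unchanged while the "mean code" changes, which is why that mean carries no information.

```python
from collections import Counter
from statistics import mean, mode

# Employment classification codes from the example above.
LABELS = {1: "Educator", 2: "Construction Worker", 3: "Manufacturing Worker"}

# Hypothetical coded responses (illustrative only).
codes = [1, 3, 3, 2, 3, 1, 2, 3]

# Meaningful: classifying and counting.
counts = Counter(LABELS[c] for c in codes)
most_common_job = LABELS[mode(codes)]

# Not meaningful: arithmetic on the codes. Swapping the numbers assigned
# to Educator and Manufacturing Worker describes the same people...
RELABEL = {1: 3, 2: 2, 3: 1}
RELABELED_LABELS = {3: "Educator", 2: "Construction Worker", 1: "Manufacturing Worker"}
recoded = [RELABEL[c] for c in codes]
recoded_counts = Counter(RELABELED_LABELS[c] for c in recoded)

# ...yet the mean of the codes changes.
mean_before = mean(codes)
mean_after = mean(recoded)

print(counts)
print(f"Most common job: {most_common_job}")
print(f"Mean code before relabeling: {mean_before}, after: {mean_after}")
```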

(17)

Ordinal Level Data

Numbers are used to indicate rank or order

−Relative magnitude of numbers is meaningful

−Differences between numbers are not comparable

Example: Ranking productivity of employees

Example: Position within an organization

−1 for President

−2 for Vice President

−3 for Plant Manager

−4 for Department Supervisor

−5 for Employee
(18)

Ordinal Level Data

Faculty and staff should receive preferential treatment for parking space.

1 = Strongly Agree

2 = Agree

3 = Neutral

4 = Disagree

5 = Strongly Disagree

(19)

Interval Level Data

Interval Level data - Distances between consecutive integers are equal

−Relative magnitude of numbers is meaningful

−Differences between numbers are comparable

−Location of origin, zero, is arbitrary

−The function that converts between units of measure has a nonzero vertical intercept (for example, °F = 1.8 × °C + 32)

Example: Fahrenheit Temperature

Example: Monetary Utility
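The effect of an arbitrary zero can be checked numerically. This sketch uses the standard Celsius-to-Fahrenheit conversion; the specific temperatures are arbitrary illustrative values. Equal steps stay equal after conversion, but a ratio between two temperatures does not survive the change of units.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Linear unit conversion with a nonzero intercept (32)."""
    return 1.8 * celsius + 32


# Differences are comparable: equal Celsius steps give equal Fahrenheit steps.
step_a = celsius_to_fahrenheit(20.0) - celsius_to_fahrenheit(10.0)
step_b = celsius_to_fahrenheit(40.0) - celsius_to_fahrenheit(30.0)

# Ratios are not meaningful: "twice as many degrees" depends on the unit.
ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

print(f"Step sizes in Fahrenheit: {step_a:.1f} and {step_b:.1f}")
print(f"Ratio in Celsius: {ratio_celsius:.2f}, in Fahrenheit: {ratio_fahrenheit:.2f}")
```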

(20)

Ratio Level Data

Highest level of measurement

−Relative magnitude of numbers is meaningful

−Differences between numbers are comparable

−Location of origin, zero, is absolute (natural)

−The function that converts between units of measure has a zero vertical intercept (for example, pounds ≈ 2.2046 × kilograms)

Examples: Height, Weight, and Volume

Examples: Monetary variables such as profit and loss, revenues, and expenses; financial ratios such as P/E ratio, inventory turnover, and quick ratio.
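With an absolute zero, ratios keep their meaning under a change of units. The sketch below uses the approximate kilogram-to-pound factor and arbitrary illustrative weights: an object twice as heavy in kilograms is also twice as heavy in pounds, and zero maps to zero.

```python
KG_TO_LB = 2.20462  # approximate conversion factor


def kg_to_lb(kilograms: float) -> float:
    """Linear unit conversion with a zero intercept."""
    return KG_TO_LB * kilograms


# Ratios are preserved when the origin is absolute.
ratio_kg = 60.0 / 30.0
ratio_lb = kg_to_lb(60.0) / kg_to_lb(30.0)

# Zero means "none of the quantity" in every unit.
zero_in_lb = kg_to_lb(0.0)

print(f"Ratio in kg: {ratio_kg:.2f}, ratio in lb: {ratio_lb:.2f}")
print(f"0 kg is {zero_in_lb} lb")
```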

(21)

Ratio Level Data…

Parametric statistics – require that the data be interval or ratio

Nonparametric statistics – used if the data are nominal or ordinal

−Nonparametric statistics can also be used to analyze interval or ratio data
(22)

Data Level, Operations, and Statistical Methods

Data Level   Meaningful Operations
Nominal      Classifying and Counting
Ordinal      All of the above plus Ranking
Interval     All of the above plus Addition, Subtraction, Multiplication, and Division (including means, standard deviations, etc.)
Ratio        All of the above
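The cumulative structure of the table, where each level inherits the operations of the levels below it, can be expressed as a small lookup. This is only a sketch of the table as given; the names `LEVELS`, `NEW_OPERATIONS`, and `meaningful_operations` are hypothetical.

```python
# Levels of measurement, ordered from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Operations that each level adds on top of the levels below it,
# following the table above.
NEW_OPERATIONS = {
    "nominal": ["classifying", "counting"],
    "ordinal": ["ranking"],
    "interval": ["addition", "subtraction", "multiplication", "division"],
    "ratio": [],  # the table lists "all of the above" for ratio data
}


def meaningful_operations(level: str) -> list[str]:
    """Return every operation that is meaningful at the given level."""
    position = LEVELS.index(level.lower())  # raises ValueError if unknown
    operations = []
    for name in LEVELS[: position + 1]:
        operations.extend(NEW_OPERATIONS[name])
    return operations


for level in LEVELS:
    print(f"{level}: {', '.join(meaningful_operations(level))}")
```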

(23)

Problem

Classify each of the following as nominal, ordinal, interval, or ratio data.

1. The time required to produce each tire on an assembly line.

2. The number of quarts of milk a family drinks in a month.

3. The ranking of four machines in your plant after they have been designated as excellent, good, satisfactory, and poor.

4. The telephone area code of clients in the United States.

5. The age of each of your employees.

6. The dollar sales at the local pizza house each month.

7. An employee's ID number.

8. The response time of an emergency unit.

Answers: 1. Ratio 2. Ratio 3. Ordinal 4. Nominal 5. Ratio 6. Ratio 7. Nominal 8. Ratio

(25)

Problem

Classify the following as nominal, ordinal, interval or ratio data.

1. The ranking of a company by Fortune 500.

2. The number of tickets sold at a movie theatre on a given night.

3. The identification number of a questionnaire.

4. Per capita income.

5. The trade balance in dollars.

6. Profit/loss in dollars.

7. A company's tax identification.
