
Continuous Data

In document Brihaspathi Synopsis (Page 173-181)

COMMUNITY DENTISTRY Biostatistics

II. Quantitative Data or Continuous Data or Numerical Data

2. Continuous Data

 Occurs when there is no limitation on the values that the variable can take.

Ex: weight or height

Sources of Data

2. Primary data

 Obtained directly from the source

 It is first hand information

 Data can be obtained by means of questionnaires, interviews, or clinical examinations

3. Secondary data

 Obtained from pre-existing records

 It is second-hand information

 Data can be obtained from government records, hospital records, etc.

Methods of Collecting Data

1. Census

 Defined as the total process of collecting, compiling and publishing demographic, economic and social data pertaining, at a specified time or times, to all persons in a country or a delimited territory

 The first regular census in India was conducted in 1881.

 A census is conducted every 10 years in India (MAHE-99)

 The most recent census in India was conducted in February 2011.

 The Census Act was passed by the Parliament of India in 1948.

 The ‘Census Commissioner of India’ is the chief officer for census enumeration.

Advantages

o Complete information

Disadvantages

o Expensive, time consuming, needs more man-power, lesser accuracy.

© BRIHASPATHI ACADEMY ׀ SUBSCRIBER’S COPY ׀ NOT FOR SALE

2. Sampling

 Sample is a portion of a population, selected from the population in some manner

 A sampling unit is defined as each individual member of the sample. (AIPG-09)

Importance of Sampling

 The physical impossibility of checking all the items in the population

 Adequate accuracy of sampling results

 Cost of study in the entire population

 Saving the time

Types

1. Purposive Sampling

i. Judgment Sampling

 Selection of samples is left to the Judgment of investigator.

 In this sampling technique, the accuracy of results depends upon investigator.

Indications

o Employed mainly when population is small

o Employed to conduct pilot study

Limitations

o Accuracy of results depends upon the knowledge of the investigator.

o If investigator is biased, it affects the acceptance or rejection of a hypothesis

ii. Convenience Sampling (Chunk Sampling/ incidental sampling )

 A chunk is a fraction of the population which is selected because it is conveniently available to the investigator.

 Ex: In order to estimate the oral hygiene status of school children in a city, the investigator may select a few schools near his or her place of work. Results of this sampling are rarely representative because they are generally biased.

iii. Quota Sampling

 Each investigator is allotted quota of persons which are to be interviewed.

 Investigators are given instructions to interview persons within the quota with some specified characteristics.

 Ex: Persons within the quota of 10 house wives, 6 professionals.

2. Random Sampling

 The sample is selected using random techniques.

 Selection bias is avoided.

i. Simple Random Sampling (unrestricted random sampling )

 The procedure of selecting a sample in which, every item in a population has an equal chance of being included in the sample. (MAHE-97)

 Applicable when population is very small, homogeneous and readily available

 Lottery method
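The lottery idea can be illustrated with a short Python sketch. It assumes a hypothetical sampling frame of 50 numbered patients; the function name `simple_random_sample` and the `seed` argument are illustrative only and are not part of the source.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every unit has an equal chance of selection."""
    rng = random.Random(seed)  # seed is used only to make the illustration repeatable
    return rng.sample(population, n)  # sampling without replacement, like drawing lots

# Hypothetical sampling frame: patients numbered 1 to 50
patients = list(range(1, 51))
sample = simple_random_sample(patients, 5, seed=1)
print(sample)
```

Because the draw is made without replacement, no patient can appear twice in the sample.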

Advantages

o Eliminates selection bias

Disadvantages

o Selection of sample is costly and time consuming

Limitation

o Difficult to collect data for large samples

ii. Systematic Random Sampling

 The sample is formed by selecting one unit at random and then selecting additional units at evenly spaced intervals (the sample interval) until a sample of the required size has been formed.

 It is applied to field studies when the population is large, scattered and homogeneous.

 The sample interval is calculated by the formula K = N/n, where K is the sample interval (or sample ratio), N is the population size and n is the sample size.

 Ex: If 150 patients are to be included in the sample from a population of 3000, K = 3000/150 = 20
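A minimal Python sketch of the same example follows. It assumes the population is available as an ordered list and that N is an exact multiple of n; the function name `systematic_sample` and the patient numbering are hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start in the first interval, then every K-th unit after it."""
    N = len(population)
    k = N // n                 # sample interval K = N/n (assumes N is a multiple of n)
    rng = random.Random(seed)  # seed only makes the illustration repeatable
    start = rng.randrange(k)   # random starting position within the first K units
    return [population[start + i * k] for i in range(n)]

# Hypothetical list of 3000 patients numbered 1 to 3000; K = 3000/150 = 20
patients = list(range(1, 3001))
sample = systematic_sample(patients, 150, seed=0)
print(sample[:5])
```

Every selected patient is exactly 20 positions after the previous one, which is why a pre-formed list is required.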

Advantages

o Systematic design is simple, convenient to adopt

o The time & labor in collection of sample is relatively small

o It gives accurate results when population is large

Limitation

o Requires a pre-formed list

iii. Stratified Random Sampling (KAR-99)

 If the population is heterogeneous, simple random sampling is not effective.

 The purpose of this sampling is to increase the efficiency of sampling by dividing a heterogeneous population into homogeneous groups. These homogeneous groups are termed strata.

 Ex: Areas, classes, age groups, sexes etc.
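As a rough illustration, the sketch below divides a hypothetical population into age-group strata and draws a simple random sample from each stratum. Taking the same fraction from every stratum (proportional allocation) is an assumption made here for simplicity; the source does not specify how many units to take per stratum.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a simple random sample within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):                 # fixed order keeps the result repeatable
        members = strata[name]
        size = round(len(members) * fraction)   # assumed proportional allocation
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population of 100 people: 60 children and 40 adults
population = [(i, "child") for i in range(60)] + [(i, "adult") for i in range(60, 100)]
sample = stratified_sample(population, lambda unit: unit[1], 0.1, seed=0)
print(sample)
```

Each stratum is represented in the sample in the same proportion as in the population, which is the source of the improved representativeness.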

Advantages

o There is greater precision of results

o It gives better results when the population is scattered

o More representativeness and accuracy

Disadvantages

o It is a highly technical and time-consuming method.

iv. Cluster Sampling

 In this sampling, the required number of groups or clusters is selected by simple random sampling. Then all the individuals present in those clusters are included in the sample (KAR-04)
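The two steps can be sketched in Python as follows. The school names and pupil labels are hypothetical, and `cluster_sample` is an illustrative helper rather than an established routine.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random, then include every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # simple random sample of clusters
    individuals = [person for name in chosen for person in clusters[name]]
    return chosen, individuals

# Hypothetical clusters: schools and their pupils
schools = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1"],
}
chosen, sample = cluster_sample(schools, 2, seed=0)
print(chosen, sample)
```

Note that the random selection acts on the clusters, not on the individuals, so the final sample size depends on which clusters are drawn.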

Advantages

o Simpler

o Involves less time and cost

Indication

o When population is vast & scattered over a wide area and the population forms natural groups (called clusters), cluster sampling is applicable.

v. Multistage Sampling

 As the name implies this method refers to the sampling procedures carried out in several stages using random sampling technique.

Indication

o When the study involves very large population, like nationwide surveys

vi. Multiphase Sampling

 In this method, part of the information is collected from the whole sample and part from the sub-sample.

Advantages

o Economic, yet purposeful

o Saves time and manpower

Errors in Sampling

Sampling Errors

• Faulty sampling design

• Small sample size

Non sampling Errors

• Coverage error

• Observational error

• Processing error

PRESENTATION OF DATA

• Statistical data once collected should be systematically arranged and presented,

 To arouse interest of readers

 For data reduction

 To bring out important points clearly and strikingly

 For easy grasp and meaningful conclusions

 To facilitate further analysis

 To facilitate communication

• Two main types of data presentation are

I. Tabulation

II. Graphic representation with charts and diagrams

I. Tabulation

• It is the most common method

• Data presentation is in the form of columns and rows

• It can be of the following types

1. Simple tables

2. Frequency distribution tables

1. Simple Table

Month       Number of in-patients
Jan 06      2,800
Feb 06      1,900
March 06    1,750

2. Frequency distribution table

 In a frequency distribution table, the data are first split into convenient groups (class intervals) and the number of items (frequency) occurring in each group is shown in the adjacent column.

Number of Cavities    Number of Patients
0 to 3                78
3 to 6                67
6 to 9                32
9 and above           16
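The grouping step can be sketched in Python. The cavity counts below are hypothetical, and the sketch assumes that each class includes its lower limit but not its upper limit (so a value of 3 falls in "3 to 6"), a convention the table itself does not state.

```python
def frequency_table(values, width, n_classes):
    """Count how many values fall into each class interval of the given width."""
    counts = [0] * n_classes
    for value in values:
        index = min(value // width, n_classes - 1)  # last class is open-ended
        counts[index] += 1
    labels = [f"{i * width} to {(i + 1) * width}" for i in range(n_classes - 1)]
    labels.append(f"{(n_classes - 1) * width} and above")
    return list(zip(labels, counts))

# Hypothetical number of cavities recorded for 11 patients
cavities = [0, 1, 2, 2, 3, 4, 5, 6, 7, 9, 12]
table = frequency_table(cavities, width=3, n_classes=4)
for label, frequency in table:
    print(f"{label:<14}{frequency}")
```

The frequencies always add up to the number of observations, which is a quick check on any frequency distribution table.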

II. Charts and diagrams

Ideal requirements of charts and diagrams

• Self-explanatory

• Simple and consistent with the data

• Values of the variables should be shown on the horizontal or X-axis and frequency on the vertical or Y-axis

• Too many lines on the graph should be avoided

• The scale of presentation should be indicated at the right-hand top corner of the graph

• The scale of division of the two axes should be proportional

• The details of the variables and frequencies presented on the axes should be mentioned

Types of Diagrams

1. Simple Bar

 Represents qualitative data

 Only one variable can be represented

 Width of the bar remains the same

 The length varies according to the frequency

 Bars can be drawn vertically or horizontally

 It represents only one classification, thus cannot be used for comparison.

[Figure: simple bar diagram; categories 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr]

2. Multiple Bar

 It is used to compare qualitative data with respect to a single variable.

 Eg: with respect to sex, time or region.

 Each category of the variable has a set of bars of the same width, one for each section, drawn without any gap between them; the length of each bar corresponds to the frequency.

3. Component Bar

 It represents qualitative data.

 We can represent the number of cases in major groups as well as the subgroups simultaneously, using component bar diagram.

 First, rectangles are drawn, proportional to the number of cases of the major group. Then, each rectangle is divided into components, proportional to the numbers in the subgroups.

4. Histogram (AP-01, 03, KAR-10)

 Most widely used to represent quantitative data of continuous type.

 It is a bar diagram without gap between the bars.

 It represents a frequency distribution.

 X-axis: the size of an observation is marked. Starting from 0, the limit of each class interval is marked. The width of each bar corresponds to the width of the class interval.

 Y-axis: the frequencies are marked. A rectangle is drawn above each class interval with height proportional to the frequency of that class interval.

5. Frequency Polygon

 It represents frequency distribution of quantitative data

 It facilitates comparison of two or more frequency distributions.

 A point is marked over the mid-point of the class interval, corresponding to the frequency.

 The first and last points are joined to the midpoints of the previous and next class intervals respectively. All the points are connected by straight lines.

 To compare two or more frequency distributions, lines of different types are drawn on the same graph.
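The construction of the points can be sketched in Python. The class limits and frequencies are hypothetical, the classes are assumed to be of equal width, and the closing points on either side are assumed to have zero frequency, which is the usual convention but is not stated in the text.

```python
def polygon_points(class_limits, frequencies):
    """Return (midpoint, frequency) pairs, closed at both ends with zero frequency."""
    width = class_limits[1] - class_limits[0]  # assumes equal class widths
    midpoints = [(low + high) / 2 for low, high in zip(class_limits, class_limits[1:])]
    points = list(zip(midpoints, frequencies))
    before = (midpoints[0] - width, 0)   # midpoint of the class before the first one
    after = (midpoints[-1] + width, 0)   # midpoint of the class after the last one
    return [before] + points + [after]

# Hypothetical classes 10-20, 20-30, 30-40 with frequencies 5, 12 and 7
points = polygon_points([10, 20, 30, 40], [5, 12, 7])
print(points)
```

Joining these points in order with straight lines gives the polygon; a second distribution would be plotted the same way with a different line type.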

6. Line Diagram

 It is useful to study the changes of values in the variables over time. (AIIMS-01)

 Time is represented on X axis and frequency of the variable on Y axis.

 Facilitates comparison of data among different groups in a simple way


7. Pie Chart/Sector Diagram

 Used to present data, expressed in percentages (KAR-04, COMEDK

 The frequency of the group is shown in a circle.

 Degree of angle denotes the frequency.

 Instead of comparing the length of bar, the areas of segments are compared.
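Since the whole circle is 360 degrees, the angle of each sector can be taken as the group's share of the total multiplied by 360. A minimal Python sketch with hypothetical oral hygiene data:

```python
def sector_angles(frequencies):
    """Convert group frequencies into pie-chart sector angles in degrees."""
    total = sum(frequencies.values())
    return {group: 360 * count / total for group, count in frequencies.items()}

# Hypothetical oral hygiene status of 100 children
status = {"Good": 50, "Fair": 30, "Poor": 20}
angles = sector_angles(status)
print(angles)
```

The angles always add up to 360 degrees, whatever the total number of observations.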

8. Scatter/Dot Diagram

 It is used to show the association between two quantitative variables.

The imaginary line drawn through the center of the scatter shows the


9. Cartograms or Spot Map

 It shows geographical distribution of frequencies of a characteristic.

 Easy to understand and condenses a lot of information into a simple picture.

10. Pictogram

 The pictures representing the value of items are called pictograms.

 It is the most useful way of presenting data to lay groups.


MEASURES OF STATISTICAL AVERAGES OR CENTRAL TENDENCY

• A single estimate of a series of data that summarizes the data is known as a parameter, and one such parameter is the measure of central tendency.

• Objective: to condense the entire mass of data and to facilitate comparison with other data measured on the same grounds.

Ideal Properties of Central Tendency

• Should be easy to understand and compute

• Should be based on each and every item in the series

• Should not be affected by extreme values

• When the measure of central tendency is calculated for different samples, the results should not differ from each other markedly

Types

1. Arithmetic mean – mathematical estimate

2. Median – positional estimate

3. Mode – based on frequency

1. Arithmetic Mean/Mean (MAHE-95, 98, 99, 2K, AIIMS-01, PGI-02)

 The simplest measure of central tendency

 It is the summation of all the observations divided by the total number of observations (n)

 Denoted by X̄ for a sample and µ for a population

Mean = (Sum of all the observations of the data) / (Number of observations in the data)

$$\bar{X} = \frac{\sum X}{n}$$

Σ: means the sum of.

X: is the value of each observation in the data

n: is the number of observations in the data

Properties of the Mean

 Uniqueness – For a given set of data there is one and only one mean

 Simplicity – It is easy to understand and to compute

 Affected by extreme values, since all values enter into the computation

2. Median (UPSC-01, KAR-02, AIIMS-04)

 When the data are ordered, it is the observation that divides the set of observations into two equal parts, such that half of the data are before it and the other half are after it.

 If n is odd, the median will be the middle observation. It will be the [(n+1)/2]th ordered observation. When n = 11, the median is the 6th observation.

 If n is even, there are two middle observations. The median will be the mean of these two middle observations, that is, the mean of the (n/2)th and (n/2 + 1)th ordered observations. When n = 12, the median is the 6.5th observation, which is a value halfway between the 6th and 7th ordered observations.

Calculation of Median

o Observations are arranged in the ascending or descending order of magnitude & then the middle value of the observations is Median.

o In case of even number of observations, the average of the two middle values is Median
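The two cases can be written out as a short Python sketch. The function name `median` and the sample values are illustrative only.

```python
def median(observations):
    """Return the middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(observations)  # arrange in ascending order of magnitude
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]      # the (n+1)/2-th ordered observation
    return (ordered[middle - 1] + ordered[middle]) / 2  # mean of (n/2)-th and (n/2 + 1)-th

print(median([5, 1, 3]))           # odd number of observations
print(median(list(range(1, 13))))  # even number of observations (n = 12)
```

For the values 1 to 12 the result lies halfway between the 6th and 7th ordered observations.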

Properties of the Median

 Uniqueness – For a given set of data there is one and only one median

 Simplicity – It is easy to calculate

 It is not affected by extreme values as is the mean

3. Mode (KAR-03, AIIMS-08)

 The value in a series of observations which occurs with the greatest frequency

Example

o Number of decayed teeth in 10 children: 2, 2, 4, 1, 3, 0, 10, 2, 3, 8

Mean = 35 / 10 = 3.5

Median: ordered data are 0, 1, 2, 2, 2, 3, 3, 4, 8, 10, so median = (2 + 3) / 2 = 2.5

Mode = 2 (occurs 3 times)

Properties of the Mode

 Sometimes, it is not unique.

 It may be used for describing qualitative data.
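The decayed-teeth example can be checked with Python's standard `statistics` module. The second data set is hypothetical and is included only to show a case where the mode is not unique; `multimode` requires Python 3.8 or later.

```python
import statistics

# Number of decayed teeth in 10 children (data from the example)
teeth = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

mean_value = statistics.mean(teeth)      # sum of observations divided by n
median_value = statistics.median(teeth)  # mean of the two middle ordered values
modes = statistics.multimode(teeth)      # every value with the greatest frequency

print(mean_value, median_value, modes)

# Hypothetical series with two equally frequent values: the mode is not unique
print(statistics.multimode([1, 1, 2, 2, 3]))
```

Using `multimode` rather than `mode` makes the non-uniqueness visible, because it returns all the most frequent values as a list.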

TYPES OF VARIABILITY

• There are three types of variability

1. Biological variability

2. Real variability

3. Experimental variability

i. Observer Error

ii. Instrumental Error

iii. Sampling Error

1. Biological Variability

 It is the natural difference which occurs in individuals due to age, gender and other attributes which are inherent

 This difference is small and occurs by chance and is within certain accepted biological limits

 Ex: vertical dimension may vary from patient to patient
