Case Study 13.11 Glossary - Statistical Quality Control

Operations Management Unit 5

Unit 13 Statistical Quality Control


13.1 Introduction

By now you must be familiar with the concepts of Independent demand inventory control. This unit familiarises you with the concepts of Statistical Quality Control.

Statistical Quality Control (SQC) monitors a production process by sampling its output and analysing the measured quantities statistically. A process is said to be in a state of statistical control if the variation among the samples stays within set limits. When a process is out of control, it is necessary to locate the specific causes of the variation and take corrective action.

This information helps in controlling and improving the manufacturing process. Furthermore, statistics is the language that helps engineering, manufacturing, procurement, management, and other functional components of the business communicate effectively about quality.

Learning Objectives:

After studying this unit you will be able to:

• Define Statistical Quality Control and the various methods associated with it

• Explain descriptive statistics

• Define probability distribution

• Explain the various types of probability distribution

13.2 Statistical Quality Control

Statistical Quality Control (SQC) is a method that applies statistical sampling to the units produced by a production process. The sampled units are checked for defects, and the deviations found are called variances. The results determine whether the process is in control or not.

If the process is not in control, the necessary corrective actions are taken. The SQC chart (control chart) is the basic tool that formally distinguishes between normal and abnormal variances. Control charts thus help in distinguishing random variances from the variances that need managerial investigation.

The final analysis helps in improving the products and the processes.

Identifying chance variances also avoids unnecessary investigations and thereby eliminates frequent, unwarranted changes to the process.

Some of the tools and methods associated with Statistical Quality Control (SQC) are:

• Descriptive Statistics

• Stem-and-Leaf Plot

• Frequency Distribution and Histogram

Now let us discuss these tools and methods in detail.

13.3 Descriptive Statistics

Descriptive statistics is a process that is used to describe the features of data in terms of quantity.

It usually accompanies formal analyses. For example, a study involving human subjects typically includes a table giving the overall sample size, the subgroup sample sizes, and demographic or clinical characteristics such as the average age and the proportion of subjects of each gender. However, most statistics can be used either descriptively or in an inductive analysis.

For example, the average reading test score for the students in each classroom in a school can be reported. This gives a descriptive sense of typical scores and their variation. However, when a formal hypothesis test on the scores is performed, the analysis is inductive rather than descriptive. Common examples of descriptive statistical analysis include measures of central tendency, measures of dispersion, measures of association, cross tabulation, contingency tables, and histograms. Thus, descriptive statistics provides various numerical and graphic procedures that summarise a collection of data in a clear and understandable way.

13.3.1 Descriptiveness Measures

Descriptive statistics provides various numerical and graphic procedures. The main descriptive measures are as follows:

• Central Tendency Measures: They are computed to locate a "center" around which the measurements in the data are distributed. The measures of central tendency are:

o Mean: It is the sum of all the measurements divided by the number of measurements. For example, consider the quantities in Table 13.1 (footnote 1).

Table 13.1: Example of Mean

Measurements (X)    Deviation (X - Mean)
 3                  -1
 5                   1
 5                   1
 1                  -3
 7                   3
 2                  -2
 6                   2
 7                   3
 0                  -4
 4                   0
Total 40             0

1 en.wikipedia.org/wiki/Central_tendency

Therefore, the sum of all the quantities X is obtained and the mean is calculated as:

MEAN= 40/10 = 4

The mean of all quantities is 4, and the sum of deviations is 0.

o Median: It is the value for which half of the measurements are below it and half are above it. It is illustrated in Table 13.2.

Table 13.2: Example of Median

Measurements (X)    Ranked X
 3                   0
 5                   1
 5                   2
 1                   3
 7                   4
 2                   5
 6                   5
 7                   6
 0                   7
 4                   7
Total 40            40

Therefore, Median is (4+5)/2 = 4.5

Thus, only the two central ranked values are used in the computation. The median is not sensitive to extreme values.

o Mode: It is the most frequent measurement in the data.

Table 13.3: Example of Mode

Measurements (X): 3 5 5 1 7 2 6 7 0 4

In this case, the data has two modes, 5 and 7, because each of these measurements occurs twice.
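The three central tendency measures can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the unit.

```python
from statistics import mean, median, multimode

# Measurements X used in Tables 13.1 to 13.3
x = [3, 5, 5, 1, 7, 2, 6, 7, 0, 4]

x_mean = mean(x)                       # sum of measurements / number of measurements
deviations = [v - x_mean for v in x]   # the "X - Mean" column of Table 13.1
x_median = median(x)                   # n is even, so the two middle ranked values are averaged
x_modes = multimode(x)                 # every value tied for the highest frequency

print("mean:", x_mean)
print("sum of deviations:", sum(deviations))
print("median:", x_median)
print("modes:", x_modes)
```

Note that `multimode` returns all tied values, which suits data such as this with more than one mode.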

• Variation or Variability Measures: They measure how far the measurements are from the center.

For example, consider that a population has four observations {1, 3, 5, 7}.

What is the variance?

Solution: First, we need to compute the mean of the population. It is calculated as:

$$\mu = \frac{1 + 3 + 5 + 7}{4} = 4$$

Then, all the values are plugged into the formula for the variance of a population:

$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$

$$\sigma^2 = \frac{(1-4)^2 + (3-4)^2 + (5-4)^2 + (7-4)^2}{4}$$

$$\sigma^2 = \frac{9 + 1 + 1 + 9}{4} = \frac{20}{4} = 5$$

Thus, a variance of 5 is obtained for the population.
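The same calculation can be written out step by step. The sketch below follows the population formula directly and then cross-checks it against `statistics.pvariance`; the variable names are illustrative.

```python
from statistics import pvariance

population = [1, 3, 5, 7]
n = len(population)

# Step 1: population mean
mu = sum(population) / n

# Step 2: average of the squared deviations from the mean
squared_deviations = [(x - mu) ** 2 for x in population]
variance = sum(squared_deviations) / n

print("mean:", mu)
print("squared deviations:", squared_deviations)
print("variance:", variance)
print("library check:", pvariance(population))
```

The divisor here is N because the four observations are treated as the whole population; a sample variance would divide by N - 1 instead.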

• Relative Standing Measures: They describe the relative position of a specific measurement within the data.

For example, consider a scenario where the heights of two superstars are compared. NBA superstar Michael Jordan is 78 inches tall and WNBA basketball player Rebecca Lobo is 76 inches tall. Jordan is obviously taller than Lobo by 2 inches. But which player is relatively taller? Does Jordan's height among men exceed Lobo's height among women? Assume that men's heights have a mean of 69.0 inches and a standard deviation of 2.8 inches, and that women's heights have a mean of 63.6 inches and a standard deviation of 2.5 inches.

Solution: To compare the heights of Michael Jordan and Rebecca Lobo relative to the populations of men and women, we need to standardise the heights by converting them to z scores:

Jordan: $$z = \frac{x - \mu}{\sigma} = \frac{78 - 69.0}{2.8} = 3.21$$

Lobo: $$z = \frac{x - \mu}{\sigma} = \frac{76 - 63.6}{2.5} = 4.96$$

Thus, Michael Jordan's height is 3.21 standard deviations above the mean, whereas Rebecca Lobo's height is 4.96 standard deviations above the mean. This means that Rebecca Lobo's height among women is relatively greater than Michael Jordan's height among men.
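The comparison can be reproduced with a small helper. This is only a sketch; the function name `z_score` is illustrative, and the means and standard deviations are the ones assumed in the example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

jordan_z = z_score(78, 69.0, 2.8)   # height relative to men's heights
lobo_z = z_score(76, 63.6, 2.5)     # height relative to women's heights

print(f"Jordan: z = {jordan_z:.2f}")
print(f"Lobo:   z = {lobo_z:.2f}")
print("Relatively taller:", "Lobo" if lobo_z > jordan_z else "Jordan")
```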

Self Assessment Questions

1. _________ is a measure of how far the measurements are from the center.

2. _________ computes the most frequent measurement in the data.

3. How do we calculate median?

4. Define Descriptive statistics

5. Most statistics can be used either descriptively or in an inductive analysis. (True/False)

Activity 1

Analyse how the central tendency measures are calculated. List out the differences between mean, median and mode.

13.4 The Stem-and-Leaf plot

Statistics is the science of analysing data and drawing conclusions, taking variation in the data into account. No two units of a product produced by a manufacturing process are identical; some variation is inevitable. For example, the net content of a soft drink varies slightly from can to can, and the output voltage of a power supply is not exactly the same from one unit to another (footnote 2).

There are several graphical methods that are very useful for summarising and presenting data.

One of the most useful graphical techniques is the stem-and-leaf display.

Suppose that the data are represented by u1, u2, ..., un and that each number ui consists of at least two digits. To construct a stem-and-leaf plot, each number ui is divided into two parts:

A Stem: It consists of one or more of the leading digits.

A Leaf: It consists of the remaining digits.

For example: Consider the data that consists of percent defective information ranging between 0 and 100 on various semiconductor wafers. The value 76 can then be divided into the stem 7 and the leaf 6.

Once a set of stems has been selected, the stems are listed along the left-hand margin of the display. The leaves that correspond to the observed data values are then listed beside each stem, in the order in which they are encountered in the data set.

For example, the construction of a stem-and-leaf plot is illustrated using the data in Table 13.4. The table represents the weekly yield data from a semiconductor fabrication facility.

2 www.netmba.com/statistics/plot/stem/

Table 13.4: Weekly yields

Week  Yield    Week  Yield
 1    48       21    68
 2    53       22    65
 3    49       23    73
 4    52       24    88
 5    51       25    69
 6    52       26    83
 7    63       27    78
 8    60       28    81
 9    53       29    86
10    64       30    92
11    59       31    75
12    54       32    85
13    47       33    81
14    49       34    77
15    45       35    82
16    64       36    76
17    79       37    75
18    65       38    91
19    62       39    73
20    60       40    92

In order to construct a stem and leaf plot, the values 4, 5, 6, 7, 8 and 9 are selected as stems.

The resulting stem-and-leaf display is shown in Table 13.5.

Table 13.5: Stem-and-leaf display for the data in Table 13.4

Stem  Leaf                    Frequency
4     8 9 7 9 5               5
5     3 2 1 2 3 9 4           7
6     3 0 4 4 5 2 0 8 5 9     10
7     9 3 8 5 7 6 5 3         8
8     8 3 1 6 5 1 2           7
9     2 1 2                   3


Inspecting the display shows that the yield distribution is approximately symmetric, with a single peak.
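The construction of the display can be sketched in a few lines of Python. The function name `stem_and_leaf` and its `ordered` flag are illustrative choices, and the sketch assumes two-digit data, so the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

# Weekly yields for weeks 1 to 40, in time order (Table 13.4)
yields = [48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
          59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
          68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
          75, 85, 81, 77, 82, 76, 75, 91, 73, 92]

def stem_and_leaf(data, ordered=False):
    """Group two-digit values by stem, keeping leaves in the order encountered."""
    plot = defaultdict(list)
    for value in data:
        stem, leaf = divmod(value, 10)   # 76 -> stem 7, leaf 6
        plot[stem].append(leaf)
    if ordered:
        for leaves in plot.values():
            leaves.sort()                # arrange leaves by magnitude
    return dict(sorted(plot.items()))

display = stem_and_leaf(yields)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves), "|", len(leaves))
```

Calling `stem_and_leaf(yields, ordered=True)` sorts the leaves within each stem instead of keeping them in the order encountered.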

Variation of the Stem-and-Leaf Display:

An ordered stem-and-leaf display has the leaves arranged by magnitude, as shown in the table 13.6

Table 13.6: Variation of the stem-and-leaf display

Stem  Leaf                    Frequency
4     5 7 8 9 9               5
5     1 2 2 3 3 4 9           7
6     0 0 2 3 4 4 5 5 8 9     10
7     3 3 5 5 6 7 8 9         8
8     1 1 2 3 5 6 8           7
9     1 2 2                   3

The display makes it easier to find the percentiles of the data. The p-th percentile is a number such that at most p% of the measurements are below it and at most (100 - p)% of the measurements are above it. For example, if the 85th percentile of a certain data set is 340, then 15% of the measurements are above 340 and the remaining 85% are below 340.

The fiftieth percentile of the data distribution is called the sample median. The median is the value for which half of the measurements are below it and half are above it. Suppose the number of observations n is odd. The median is found by sorting the observations in ascending or descending order; it is the observation in rank position [(n-1)/2 + 1] on the list.

On the other hand, if n is even, the median is the average of the (n/2) and (n/2 + 1) ranked observations. For example, with n = 40, an even number, the median is the average of the 20th and 21st ranked observations. The tenth percentile is the observation with rank (0.1)(40) + 0.5 = 4.5 (halfway between the fourth and fifth observations), or (49+49)/2 = 49. The first quartile is the observation with rank (0.25)(40) + 0.5 = 10.5 (halfway between the tenth and eleventh observations), or (53+54)/2 = 53.5, and the third quartile is the observation with rank (0.75)(40) + 0.5 = 30.5 (halfway between the thirtieth and thirty-first observations), or (79+81)/2 = 80. The first and third quartiles are occasionally denoted by the symbols Q1 and Q3, respectively, and the inter-quartile range IQR = Q3 - Q1 is occasionally used as a measure of variability. For the semiconductor yield data, the inter-quartile range is IQR = Q3 - Q1 = 80 - 53.5 = 26.5.
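The rank rule used above (rank = p × n + 0.5) can be sketched as a small function. The name `percentile_by_rank` is illustrative. The text only covers ranks that fall exactly halfway between two observations; interpolating linearly for other fractional ranks is an assumption of this sketch, and library percentile functions may follow different conventions.

```python
# Weekly yields for weeks 1 to 40 (Table 13.4)
yields = [48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
          59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
          68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
          75, 85, 81, 77, 82, 76, 75, 91, 73, 92]

def percentile_by_rank(data, p):
    """Value at rank p*n + 0.5 in the sorted data (ranks start at 1)."""
    ranked = sorted(data)
    n = len(ranked)
    rank = min(max(p * n + 0.5, 1.0), float(n))   # keep the rank inside 1..n
    lower = int(rank)                              # rank of the observation at or below
    fraction = rank - lower                        # 0.5 means halfway to the next one
    if lower == n:
        return float(ranked[-1])
    return ranked[lower - 1] + fraction * (ranked[lower] - ranked[lower - 1])

p10 = percentile_by_rank(yields, 0.10)
q1 = percentile_by_rank(yields, 0.25)
med = percentile_by_rank(yields, 0.50)
q3 = percentile_by_rank(yields, 0.75)
iqr = q3 - q1

print("10th percentile:", p10)
print("Q1:", q1, "median:", med, "Q3:", q3, "IQR:", iqr)
```

With p = 0.5 the rule gives rank 20.5, which is the average of the 20th and 21st ranked observations, so it agrees with the even-n median rule.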

In some stem-and-leaf displays, it may be desirable to provide more classes or stems. One way is to modify the original stems as follows: divide the stem 5 (say) into two new stems, 5* and 5#.

The stem 5* has leaves 0, 1, 2, 3, and 4, and the stem 5# has leaves 5, 6, 7, 8, and 9. This doubles the number of original stems. We could increase the number of original stems fivefold by defining five new stems: 5* with leaves 0 and 1, 5t (for twos and threes) with leaves 2 and 3, 5f (for fours and fives) with leaves 4 and 5, 5s (for sixes and sevens) with leaves 6 and 7, and 5# with leaves 8 and 9.

Finally, although the stem-and-leaf display is an excellent way to show the variability in data visually, it does not take the time order of the observations into account. Time is often a very important factor that contributes to variability in quality improvement problems. We could, of course, simply plot the data values versus time; such a graph is called a time series graph or a run chart. However, a useful approach is to combine the time series graph with the stem-and-leaf display to produce a digidot plot.

Figure 13.1 shows the digidot plot for the semiconductor yield data. This display clearly indicates that time is an important source of variability in this production process. More specifically, yields in the first 20 weeks of production are substantially below the yields reported in the last 20 weeks.

Something may have changed in the process (or may have been deliberately changed by the operating personnel or the process engineers) that is responsible for the yield improvement.
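The time effect can also be seen numerically by splitting the weekly yields of Table 13.4 into the first and last 20 weeks. This is a rough sketch of the comparison, not a formal test; the variable names are illustrative.

```python
from statistics import mean

# Weekly yields for weeks 1 to 40, in time order (Table 13.4)
yields = [48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
          59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
          68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
          75, 85, 81, 77, 82, 76, 75, 91, 73, 92]

first_half = yields[:20]    # weeks 1 to 20
second_half = yields[20:]   # weeks 21 to 40

print("weeks 1-20  mean yield:", mean(first_half), "range:", min(first_half), "to", max(first_half))
print("weeks 21-40 mean yield:", mean(second_half), "range:", min(second_half), "to", max(second_half))
```

A stem-and-leaf display alone would hide this shift, because it pools all 40 weeks without regard to order.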

[Figure 13.1: A digidot plot of the data, combining the stem-and-leaf display (leaf, stem, frequency) with a time series plot (run chart)]

Self Assessment Questions

6. What do we understand by the term "percentile"?

7. What does a stem contain?

8. Stem-and-leaf display is a graphical technique. (True/False)

13.5 The Frequency Distribution and Histogram

A frequency distribution is an arrangement of the data by magnitude. It is a more compact summary of data than a stem-and-leaf display. Table 13.7 presents 125 observations on the inside diameter of forged piston rings used in an automobile engine. The data were collected in 25 samples of five observations each. Note that there is some variability in piston-ring diameter. However, it is very difficult to see any pattern in the variability or structure in the data with the observations arranged as they are in Table 13.7. A frequency distribution of the piston-ring data is shown in Table 13.8. From this table, we note that there was one ring with a diameter between 73.965 mm and 73.970 mm, eight rings with diameters between 73.980 mm and 73.985 mm, and so forth.

Table 13.7: Forged Piston-Ring Inside Diameter (mm)

Sample Number   Observations
 1              74.030  74.002  74.019  73.992  74.008
 2              73.995  73.992  74.001  74.011  74.004
 3              73.988  74.024  74.021  74.005  74.002
 4              74.002  73.996  73.993  74.015  74.009
 5              73.992  74.007  74.015  73.989  74.014
 6              74.009  73.994  73.997  73.985  73.993
 7              73.995  74.006  73.994  74.000  74.005
 8              73.985  74.003  73.993  74.015  73.998
 9              74.008  73.995  74.009  74.005  74.004
10              73.998  74.000  73.990  74.007  73.995
11              73.994  73.998  73.994  73.995  73.990
12              74.004  74.000  74.007  74.000  73.996
13              73.983  74.002  73.998  73.997  74.012
14              74.006  73.967  73.994  74.000  73.984
15              74.012  74.014  73.998  73.999  74.007
16              74.000  73.984  74.005  73.998  73.996
17              73.994  74.012  73.986  74.005  74.007
18              74.006  74.010  74.018  74.003  74.000
19              73.984  74.002  74.003  74.005  73.997
20              74.000  74.010  74.013  74.020  74.003
21              73.988  74.001  74.009  74.005  73.996
22              74.004  73.999  73.990  74.006  74.009
23              74.010  73.989  73.990  74.009  74.014
24              74.015  74.008  73.993  74.000  74.010
25              73.982  73.984  73.995  74.017  74.013

Table 13.8: Frequency Distribution for Piston-Ring Diameter

Ring Diameter, u (mm)    Frequency   Cumulative   Relative    Cumulative Relative
                                     Frequency    Frequency   Frequency
73.965 ≤ u < 73.970        1           1          0.008       0.008
73.970 ≤ u < 73.975        0           1          0.000       0.008
73.975 ≤ u < 73.980        0           1          0.000       0.008
73.980 ≤ u < 73.985        8           9          0.064       0.072
73.985 ≤ u < 73.990       10          19          0.080       0.152
73.990 ≤ u < 73.995       19          38          0.152       0.304
73.995 ≤ u < 74.000       23          61          0.184       0.488
74.000 ≤ u < 74.005       22          83          0.176       0.664
74.005 ≤ u < 74.010       22         105          0.176       0.840
74.010 ≤ u < 74.015       13         118          0.104       0.944
74.015 ≤ u < 74.020        4         122          0.032       0.976
74.020 ≤ u < 74.025        2         124          0.016       0.992
74.025 ≤ u < 74.030        1         125          0.008       1.000
Total                    125                      1.000
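The cumulative and relative columns of Table 13.8 follow mechanically from the Frequency column. The sketch below starts from the tabulated frequencies (not from the raw observations) and rebuilds the other three columns; the variable names are illustrative.

```python
from itertools import accumulate

# Lower limit of each 0.005 mm bin and the Frequency column of Table 13.8
lower_limits = [73.965 + 0.005 * i for i in range(13)]
frequencies = [1, 0, 0, 8, 10, 19, 23, 22, 22, 13, 4, 2, 1]

total = sum(frequencies)
cumulative = list(accumulate(frequencies))             # running total of frequencies
relative = [f / total for f in frequencies]            # share of all observations
cumulative_relative = [c / total for c in cumulative]  # running share

for lo, f, c, r, cr in zip(lower_limits, frequencies, cumulative,
                           relative, cumulative_relative):
    print(f"{lo:.3f} <= u < {lo + 0.005:.3f}  {f:3d}  {c:3d}  {r:.3f}  {cr:.3f}")
```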

A graph of the observed frequencies versus the ring diameter is shown in Figure 13.2. This display is called a histogram. The height of each bar in Figure 13.2 is equal to the frequency of occurrence of ring diameters in that interval. The histogram is a visual display of the data in which one may more easily see three properties. They are as follows:

• Shape

• Location or central tendency

• Scatter or spread

[Figure 13.2: Histogram for Piston-ring Diameter Data (vertical axis: frequency, 0 to 25)]

In the piston-ring diameter data, we see that the distribution of ring diameter is roughly symmetric, with the central tendency very close to 74 mm. The variability in ring diameter is apparently relatively high, as some rings are as small as 73.967 mm, while others are as large as 74.030 mm.

However, several factors have to be considered while constructing histograms. When the data set is large, it is essential to group the data into bins or cells, as in the piston-ring example. The guidelines to follow while constructing a histogram are as follows:

• Use between 4 and 20 bins; choosing the number of bins approximately equal to the square root of the sample size often works well.

• Make the bins of uniform width.

• Start the lower limit for the first bin just slightly below the smallest data value.

Grouping the data into bins condenses the original data, which results in some loss of detail. Thus, when the number of observations is relatively small, or when the observations only take a few values, the histogram may be constructed from a frequency distribution of the ungrouped data. Alternatively, a stem-and-leaf display could be used. A primary advantage of the stem-and-leaf display is that the individual observations are preserved, whereas they are lost in a histogram.
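The three binning guidelines can be turned into a small routine. This is a sketch under stated assumptions: the function name `make_bins` is illustrative, and the margin of 1% of the data range used to start "slightly below" the smallest value is an arbitrary choice, not something the unit prescribes. The sample data are the first five samples of Table 13.7.

```python
import math

def make_bins(data, min_bins=4, max_bins=20):
    """Return (bin edges, counts) following the histogram guidelines."""
    n = len(data)
    # Guideline 1: about sqrt(n) bins, kept between 4 and 20
    k = min(max(round(math.sqrt(n)), min_bins), max_bins)
    lo, hi = min(data), max(data)
    # Guideline 3: start slightly below the smallest value (assumed 1% margin)
    margin = (hi - lo) * 0.01 or 0.5
    start = lo - margin
    # Guideline 2: uniform bin width
    width = (hi + margin - start) / k
    counts = [0] * k
    for value in data:
        index = min(int((value - start) / width), k - 1)
        counts[index] += 1
    edges = [start + i * width for i in range(k + 1)]
    return edges, counts

# First five samples (25 observations) from Table 13.7
sample = [74.030, 74.002, 74.019, 73.992, 74.008,
          73.995, 73.992, 74.001, 74.011, 74.004,
          73.988, 74.024, 74.021, 74.005, 74.002,
          74.002, 73.996, 73.993, 74.015, 74.009,
          73.992, 74.007, 74.015, 73.989, 74.014]

edges, counts = make_bins(sample)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:.4f} to {right:.4f}: {'*' * count}")
```

With 25 observations the square-root rule gives 5 bins; with all 125 observations it would give about 11, close to the 13 bins used in Table 13.8.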

13.6 Probability Distribution

The histogram or stem-and-leaf plot is used to describe sample data. A sample is a collection of measurements selected from some larger source or population. For example, the 125 piston-ring diameters in Table 13.7 are a sample of piston-ring diameters, selected from the manufacturing process. The population in this example is the collection of all piston rings produced by that process. By using statistical methods, we may be able to analyse the sample piston-ring diameter data and draw certain conclusions about the process that manufactures the rings.

A probability distribution is a mathematical model that relates the value of the variable to the probability of occurrence of that value in the population. In other words, we might visualise piston-ring diameter as a random variable, because it takes on different values in the population according to some random mechanism. The probability distribution of ring diameter then describes the probability of occurrence of any value of ring diameter in the population.

13.6.1 Types of Probability Distribution

Generally, a probability distribution is called discrete if it is characterised by a probability mass function. Thus, the distribution of a random variable X is discrete, and X is called a discrete random variable, if:

$$\sum_{u} P(X = u) = 1$$

as u runs through the set of all possible values of X.

There are two types of Probability Distribution:

• Discrete Distribution

• Continuous Distribution

1. Discrete Distribution: When the parameter being measured can only take on certain values, such as the integers 0, 1, 2, ..., the probability distribution is called a discrete distribution. For example, the distribution of the number of nonconformities or defects in printed circuit boards would be a discrete distribution. A discrete random variable can take on only a limited number of values, which can be listed.

The function p_i = P(X = x_i), or p(x), is called the probability function or, more precisely, the probability mass function (p.m.f.) of the random variable X. The set of all possible ordered pairs {x, p(x)} is called the probability distribution of the random variable X. To summarise, the set of ordered pairs [x, f(x)] is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x, f(x) ≥ 0, Σ f(x) = 1, and P(X = x) = f(x).
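The two numerical conditions on a probability mass function are easy to check in code. In the sketch below, the function name `is_valid_pmf` and the defect probabilities are hypothetical and chosen only for illustration; they do not come from the unit.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check f(x) >= 0 for every outcome and that the f(x) values sum to 1."""
    probabilities = list(pmf.values())
    non_negative = all(p >= 0 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

# Hypothetical p.m.f. for the number of defects on a printed circuit board
defects_pmf = {0: 0.70, 1: 0.20, 2: 0.08, 3: 0.02}

print("valid p.m.f.:", is_valid_pmf(defects_pmf))
print("P(X = 1) =", defects_pmf[1])
print("P(X <= 1) =", defects_pmf[0] + defects_pmf[1])
```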

There are two types of Discrete Probability Distributions. They are as follows:

• Binomial (Bernoulli) Distribution

• Poisson Distribution

• Binomial (Bernoulli) Distribution: An experiment often consists of repeated trials, each with two possible outcomes that may be labelled success or failure. The most obvious application deals with the testing of items as they come off an assembly line, where each test or trial may indicate a defective or a non-defective item. We may choose to define either outcome as a success. The process is referred to as a Bernoulli process, and each trial is called a Bernoulli trial. The Bernoulli distribution can be used under the following conditions:

• The random experiment is performed repeatedly a finite and fixed number of times. In other words, n, the number of trials, is finite and fixed.

• The outcome of the random experiment (trial) results in the dichotomous classification of events. In other words, the outcome of each trial may be classified into two mutually disjoint categories, called success (the occurrence of the event) and failure (the non-occurrence of the event).

• All the trials are independent, i.e., the result of any trial is not affected in any way by the preceding trials and does not affect the result of succeeding trials.

• The probability of success (happening of an event) in any trial is p, and is constant for each trial.

q = 1 - p is termed the probability of failure, that is, the non-occurrence of the event, and is constant for each trial. The distribution is useful in experiments where there are only two outcomes: success or failure, good or defective, hit or miss, yes or no, and so on.

Assumptions:

The assumptions of the Bernoulli distribution are:

• Each trial has two mutually exclusive possible outcomes, that is, success or failure.

• Each trial is independent of other trials.

• The probability of a success (say p) remains constant from trial to trial.

• The number of trials is fixed.
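Under these four assumptions, the number of successes in n trials follows the binomial distribution. The sketch below uses the standard textbook binomial formula, P(X = k) = C(n, k) p^k q^(n-k); this formula is supplied here as an assumption for illustration and is not taken from the unit. The function name and the example figures (10 items, 10% defective) are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials (standard formula)."""
    q = 1 - p                       # probability of failure on each trial
    return comb(n, k) * p**k * q**(n - k)

# Hypothetical example: 10 items inspected, each defective with probability 0.1
n, p = 10, 0.1
for k in range(4):
    print(f"P(exactly {k} defective) = {binomial_pmf(k, n, p):.4f}")

# The probabilities over all possible k add up to 1, as a p.m.f. must
total_probability = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print("sum over all k:", round(total_probability, 10))
```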

Theory:

If the probability of success in any trial is p, and that of failure in any trial is q, then the probability
