Autoregressive Model
1.3.3.13. DEX Standard Deviation Plot
Purpose:
Detect Important Factors with Respect to Scale
The dex standard deviation plot is appropriate for analyzing data from a designed experiment, with respect to important factors, where the
factors are at two or more levels and there are repeated values at each level. The plot shows standard deviation values for the two or more levels of each factor plotted by factor. The standard deviations for a single factor are connected by a straight line. The dex standard deviation plot is a complement to the traditional analysis of variance of designed experiments.
This plot is typically generated for the standard deviation. However, it can also be generated for other scale statistics such as the range, the median absolute deviation, or the average absolute deviation.
Sample Plot
This sample dex standard deviation plot shows that:
1.3.3.13. DEX Standard Deviation Plot
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33d.htm (1 of 3) [5/1/2006 9:56:36 AM]
factor 1 has the greatest difference in standard deviations between factor levels;
1.
factor 4 has a significantly lower average standard deviation than the average standard deviations of other factors (but the level 1 standard deviation for factor 1 is about the same as the level 1 standard deviation for factor 4);
2.
for all factors, the level 1 standard deviation is smaller than the level 2 standard deviation.
3.
Dex standard deviation plots are formed by:
Vertical axis: Standard deviation of the response variable for each level of the factor
●
Horizontal axis: Factor variable
●
Questions The dex standard deviation plot can be used to answer the following questions:
How do the standard deviations vary across factors?
1.
How do the standard deviations vary within a factor?
2.
Which are the most important factors with respect to scale?
3.
What is the ranked list of the important factors with respect to scale?
4.
Importance:
Assess Variability
The goal with many designed experiments is to determine which factors are significant. This is usually determined from the means of the factor levels (which can be conveniently shown with a dex mean plot). A secondary goal is to assess the variability of the responses both within a factor and between factors. The dex standard deviation plot is a
convenient way to do this.
Related
Case Study The dex standard deviation plot is demonstrated in the ceramic strength data case study.
1.3.3.13. DEX Standard Deviation Plot
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33d.htm (2 of 3) [5/1/2006 9:56:36 AM]
Software Dex standard deviation plots are not available in most general purpose statistical software programs. It may be feasible to write macros for dex standard deviation plots in some statistical software programs that do not support them directly. Dataplot supports a dex standard deviation plot.
1.3.3.13. DEX Standard Deviation Plot
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33d.htm (3 of 3) [5/1/2006 9:56:36 AM]
1. Exploratory Data Analysis 1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
1.3.3.14. Histogram
Purpose:
Summarize a Univariate Data Set
The purpose of a histogram (Chambers) is to graphically summarize the distribution of a univariate data set.
The histogram graphically shows the following:
center (i.e., the location) of the data;
1.
spread (i.e., the scale) of the data;
2.
skewness of the data;
3.
presence of outliers; and 4.
presence of multiple modes in the data.
5.
These features provide strong indications of the proper distributional model for the data. The probability plot or a goodness-of-fit test can be used to verify the distributional model.
The examples section shows the appearance of a number of common features revealed by histograms.
Sample Plot
1.3.3.14. Histogram
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33e.htm (1 of 4) [5/1/2006 9:56:37 AM]
Definition The most common form of the histogram is obtained by splitting the range of the data into equal-sized bins (called classes). Then for each bin, the number of points from the data set that fall into each bin are counted. That is
Vertical axis: Frequency (i.e., counts for each bin)
●
Horizontal axis: Response variable
●
The classes can either be defined arbitrarily by the user or via some systematic rule. A number of theoretically derived rules have been proposed by Scott (Scott 1992).
The cumulative histogram is a variation of the histogram in which the vertical axis gives not just the counts for a single bin, but rather gives the counts for that bin plus all bins for smaller values of the response variable.
Both the histogram and cumulative histogram have an additional variant whereby the counts are replaced by the normalized counts. The names for these variants are the relative histogram and the relative cumulative histogram.
There are two common ways to normalize the counts.
The normalized count is the count in a class divided by the total number of observations. In this case the relative counts are normalized to sum to one (or 100 if a percentage scale is used).
This is the intuitive case where the height of the histogram bar represents the proportion of the data in each class.
1.
The normalized count is the count in the class divided by the 2.
1.3.3.14. Histogram
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33e.htm (2 of 4) [5/1/2006 9:56:37 AM]
number of observations times the class width. For this
normalization, the area (or integral) under the histogram is equal to one. From a probabilistic point of view, this normalization results in a relative histogram that is most akin to the probability density function and a relative cumulative histogram that is most akin to the cumulative distribution function. If you want to
overlay a probability density or cumulative distribution function on top of the histogram, use this normalization. Although this normalization is less intuitive (relative frequencies greater than 1 are quite permissible), it is the appropriate normalization if you are using the histogram to model a probability density function.
Questions The histogram can be used to answer the following questions:
What kind of population distribution do the data come from?
1.
Where are the data located?
2.
How spread out are the data?
3.
Are the data symmetric or skewed?
4.
Are there outliers in the data?
5.
Examples 1. Normal
Symmetric, Non-Normal, Short-Tailed
Bimodal Mixture of 2 Normals 5.
The techniques below are not discussed in the Handbook. However, they are similar in purpose to the histogram. Additional information on them is contained in the Chambers and Scott references.
Frequency Plot Stem and Leaf Plot Density Trace
Case Study The histogram is demonstrated in the heat flow meter data case study.
1.3.3.14. Histogram
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33e.htm (3 of 4) [5/1/2006 9:56:37 AM]
Software Histograms are available in most general purpose statistical software programs. They are also supported in most general purpose charting, spreadsheet, and business graphics programs. Dataplot supports histograms.
1.3.3.14. Histogram
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33e.htm (4 of 4) [5/1/2006 9:56:37 AM]
1. Exploratory Data Analysis 1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic 1.3.3.14. Histogram
1.3.3.14.1. Histogram Interpretation: Normal
Symmetric, Moderate-Tailed Histogram
Note the classical bell-shaped, symmetric histogram with most of the frequency counts bunched in the middle and with the counts dying off out in the tails. From a physical science/engineering point of view, the normal distribution is that distribution which occurs most often in nature (due in part to the central limit theorem).
Recommended Next Step
If the histogram indicates a symmetric, moderate tailed distribution, then the recommended next step is to do a normal probability plot to confirm approximate normality. If the normal probability plot is linear, then the normal distribution is a good model for the data.
1.3.3.14.1. Histogram Interpretation: Normal
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33e1.htm (1 of 2) [5/1/2006 9:56:37 AM]
1.3.3.14.1. Histogram Interpretation: Normal
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33e1.htm (2 of 2) [5/1/2006 9:56:37 AM]
1. Exploratory Data Analysis 1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic 1.3.3.14. Histogram