JMP, A Business Unit of SAS SAS Campus Drive
Cary, NC 27513
“The real voyage of discovery consists not in seeking new landscapes, but in having new eyes.” Marcel Proust
Basic Analysis
JMP® 11 Basic Analysis
Copyright © 2013, SAS Institute Inc., Cary, NC, USA ISBN 978-1-61290-670-6
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer
Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513. 1st printing, September 2013
SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA
registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
All Rights Reserved.
Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation.
NEILHODGSONDISCLAIMSALLWARRANTIESWITHREGARDTOTHISSOFTWARE, INCLUDING ALLIMPLIEDWARRANTIESOFMERCHANTABILITYANDFITNESS, INNOEVENTSHALLNEIL HODGSONBELIABLEFORANYSPECIAL, INDIRECTORCONSEQUENTIALDAMAGESORANY DAMAGESWHATSOEVERRESULTINGFROMLOSSOFUSE, DATAORPROFITS, WHETHERINAN ACTIONOFCONTRACT, NEGLIGENCEOROTHERTORTIOUSACTION, ARISINGOUTOFORIN CONNECTIONWITHTHEUSEORPERFORMANCEOFTHISSOFTWARE.
• Telerik RadControls: Copyright © 2002-2012, Telerik. Usage of the included Telerik RadControls outside of JMP is not permitted.
• ZLIB Compression Library - Copyright © 1995-2005, Jean-Loup Gailly and Mark Adler. • Made with Natural Earth. Free vector and raster map data @ naturalearthdata.com. • Packages - Copyright © 2009-2010, Stéphane Sudre (s.sudre.free.fr). All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the WhiteBox nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THISSOFTWAREISPROVIDEDBYTHECOPYRIGHTHOLDERSANDCONTRIBUTORS “ASIS” AND ANYEXPRESSORIMPLIEDWARRANTIES, INCLUDING, BUTNOTLIMITEDTO, THEIMPLIED WARRANTIESOFMERCHANTABILITYANDFITNESSFORAPARTICULARPURPOSEARE
DISCLAIMED. INNOEVENTSHALLTHECOPYRIGHTOWNERORCONTRIBUTORSBELIABLEFOR ANYDIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, ORCONSEQUENTIALDAMAGES
NEGLIGENCEOROTHERWISE) ARISINGINANYWAYOUTOFTHEUSEOFTHISSOFTWARE,
EVENIFADVISEDOFTHEPOSSIBILITYOFSUCHDAMAGE.
• iODBC software - Copyright © 1995-2006, OpenLink Software Inc and Ke Jin (www.iodbc.org). All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
‒ Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
‒ Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
‒ Neither the name of OpenLink Software Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THISSOFTWAREISPROVIDEDBYTHECOPYRIGHTHOLDERSANDCONTRIBUTORS “ASIS” AND ANYEXPRESSORIMPLIEDWARRANTIES, INCLUDING, BUTNOTLIMITEDTO, THEIMPLIED WARRANTIESOFMERCHANTABILITYANDFITNESSFORAPARTICULARPURPOSEARE
DISCLAIMED. INNOEVENTSHALLOPENLINKORCONTRIBUTORSBELIABLEFORANYDIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, ORCONSEQUENTIALDAMAGES (INCLUDING,
BUTNOTLIMITEDTO, PROCUREMENTOFSUBSTITUTEGOODSORSERVICES; LOSSOFUSE,
DATA, ORPROFITS; ORBUSINESSINTERRUPTION) HOWEVERCAUSEDANDONANYTHEORYOF LIABILITY, WHETHERINCONTRACT, STRICTLIABILITY, ORTORT (INCLUDINGNEGLIGENCEOR OTHERWISE) ARISINGINANYWAYOUTOFTHEUSEOFTHISSOFTWARE, EVENIFADVISEDOF THEPOSSIBILITYOFSUCHDAMAGE.
• bzip2, the associated library “libbzip2”, and all documentation, are Copyright © 1996-2010, Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.
THISSOFTWAREISPROVIDEDBYTHEAUTHOR “ASIS” ANDANYEXPRESSORIMPLIED WARRANTIES, INCLUDING, BUTNOTLIMITEDTO, THEIMPLIEDWARRANTIESOF
MERCHANTABILITYANDFITNESSFORAPARTICULARPURPOSEAREDISCLAIMED. INNOEVENT SHALLTHEAUTHORBELIABLEFORANYDIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, ORCONSEQUENTIALDAMAGES (INCLUDING, BUTNOTLIMITEDTO,
PROCUREMENTOFSUBSTITUTEGOODSORSERVICES; LOSSOFUSE, DATA, ORPROFITS; OR BUSINESSINTERRUPTION) HOWEVERCAUSEDANDONANYTHEORYOFLIABILITY, WHETHER INCONTRACT, STRICTLIABILITY, ORTORT (INCLUDINGNEGLIGENCEOROTHERWISE) ARISING INANYWAYOUTOFTHEUSEOFTHISSOFTWARE, EVENIFADVISEDOFTHEPOSSIBILITYOF SUCHDAMAGE.
• R software is Copyright © 1999-2012, R Foundation for Statistical Computing. • MATLAB software is Copyright © 1984-2012, The MathWorks, Inc. Protected by U.S.
and international patents. See www.mathworks.com/patents. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/
trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.
Whether you are a first-time or a long-time user, there is always something to learn about JMP.
Visit JMP.com to find the following:
• live and recorded webcasts about how to get started with JMP • video demos and webcasts of new features and advanced techniques • details on registering for JMP training
• schedules for seminars being held in your area • success stories showing how others use JMP • a blog with tips, tricks, and stories from JMP staff • a forum to discuss JMP with other users
Basic Analysis
1
Learn about JMP
Documentation and Additional Resources . . . 17
Formatting Conventions . . . 19
JMP Documentation . . . 19
JMP Documentation Library . . . 20
JMP Help . . . 24
Additional Resources for Learning JMP . . . 24
Tutorials . . . 25
Sample Data Tables . . . 25
Learn about Statistical and JSL Terms . . . 25
Learn JMP Tips and Tricks . . . 26
Tooltips . . . 26
JMP User Community . . . 26
JMPer Cable . . . 26
JMP Books by Users . . . 27
The JMP Starter Window . . . 27
2
Introduction to Basic Analysis
Overview of Fundamental Analysis Methods . . . 293
Distributions
Using the Distribution Platform . . . 31Overview of the Distribution Platform . . . 33
Categorical Variables . . . 33
Continuous Variables . . . 33
Example of the Distribution Platform . . . 33
Launch the Distribution Platform . . . 35
The Distribution Report . . . 36
The Frequencies Report . . . 41
The Quantiles Report . . . 41
The Summary Statistics Report . . . 41
Distribution Platform Options . . . 44
Options for Categorical Variables . . . 45
Options for Continuous Variables . . . 47
Normal Quantile Plot . . . 50
Outlier Box Plot . . . 51
Quantile Box Plot . . . 52
Stem and Leaf . . . 52
CDF Plot . . . 53
Test Mean . . . 54
Test Std Dev . . . 55
Confidence Intervals for Continuous Variables . . . 55
Save Commands for Continuous Variables . . . 56
Prediction Intervals . . . 57 Tolerance Intervals . . . 58 Capability Analysis . . . 58 Fit Distributions . . . 60 Continuous Fit . . . 60 Discrete Fit . . . 62
Fit Distribution Options . . . 62
Additional Examples of the Distribution Platform . . . 66
Example of Selecting Data in Multiple Histograms . . . 66
Examples of the Test Probabilities Option . . . 66
Example of Prediction Intervals . . . 69
Example of Tolerance Intervals . . . 70
Example of Capability Analysis . . . 71
Statistical Details for the Distribution Platform . . . 72
Statistical Details for Standard Error Bars . . . 72
Statistical Details for Quantiles . . . 72
Statistical Details for Summary Statistics . . . 73
Statistical Details for the Normal Quantile Plot . . . 74
Statistical Details for the Standard Deviation Test . . . 75
Statistical Details for Normal Quantiles . . . 75
Statistical Details for Saving Standardized Data . . . 76
Statistical Details for Prediction Intervals . . . 76
Statistical Details for Tolerance Intervals . . . 76
Statistical Details for Capability Analysis . . . 77
Statistical Details for Continuous Fit Distributions . . . 81
Statistical Details for Discrete Fit Distributions . . . 87
Statistical Details for Fitted Quantiles . . . 90
Statistical Details for Fit Distribution Options . . . 90
4
Introduction to Fit Y by X
Examine Relationships Between Two Variables . . . 93Overview of the Fit Y by X Platform . . . 95
Launch the Fit Y by X Platform . . . 95
Launch Specific Analyses from the JMP Starter Window . . . 96
5
Bivariate Analysis
Examine Relationships between Two Continuous Variables . . . 99Example of Bivariate Analysis . . . 101
Launch the Bivariate Platform . . . 101
The Bivariate Plot . . . 102
Fitting Commands and General Options . . . 103
Fitting Command Categories . . . 106
Fit the Same Command Multiple Times . . . 106
Fit Mean . . . 106
Fit Mean Report . . . 107
Fit Line and Fit Polynomial . . . 108
Linear Fit and Polynomial Fit Reports . . . 108
Fit Special . . . 115
Fit Special Reports and Menus . . . 116
Fit Spline . . . 117
Smoothing Spline Fit Report . . . 117
Kernel Smoother . . . 118
Fit Each Value . . . 119
Fit Each Value Report . . . 119
Fit Orthogonal . . . 120
Fit Orthogonal Options . . . 120
Orthogonal Regression Report . . . 120
Density Ellipse . . . 121
Correlation Report . . . 122
Nonpar Density . . . 123
Nonparametric Bivariate Density Report . . . 123
Fit Robust . . . 123
Histogram Borders . . . 124
Group By . . . 124
Fitting Menus . . . 125
Fitting Menu Options . . . 126
Diagnostics Plots . . . 129
Additional Examples of the Bivariate Platform . . . 129
Example of the Fit Special Command . . . 129
Example Using the Fit Orthogonal Command . . . 132
Example Using the Fit Robust Command . . . 133
Example of Group By Using Density Ellipses . . . 135
Example of Group By Using Regression Lines . . . 136
Statistical Details for the Bivariate Platform . . . 137
Statistical Details for Fit Line . . . 137
Statistical Details for Fit Spline . . . 137
Statistical Details for Fit Orthogonal . . . 137
Statistical Details for the Summary of Fit Report . . . 138
Statistical Details for the Lack of Fit Report . . . 139
Statistical Details for the Parameter Estimates Report . . . 139
Statistical Details for the Smoothing Fit Reports . . . 140
Statistical Details for the Correlation Report . . . 140
6
Oneway Analysis
Examine Relationships between a Continuous Y and a Categorical X Variable . . . . 141Example of Oneway Analysis . . . 143
Launch the Oneway Platform . . . 145
The Oneway Plot . . . 145
Oneway Platform Options . . . 146
Display Options . . . 150
Quantiles . . . 152
Outlier Box Plots . . . 153
Means/Anova and Means/Anova/Pooled t . . . 153
The Summary of Fit Report . . . 154
The t-test Report . . . 155
The Analysis of Variance Report . . . 156
The Means for Oneway Anova Report . . . 158
The Block Means Report . . . 158
Mean Diamonds and X-Axis Proportional . . . 158
Mean Lines, Error Bars, and Standard Deviation Lines . . . 159
Analysis of Means Methods . . . 160
Compare Means . . . 160
Compare Standard Deviations (or Variances) . . . 161
Analysis of Means Charts . . . 161
Analysis of Means Options . . . 162
Compare Means . . . 162
Using Comparison Circles . . . 164
Each Pair, Student’s t . . . 165
All Pairs, Tukey HSD . . . 166
With Best, Hsu MCB . . . 166
With Control, Dunnett’s . . . 167
Compare Means Options . . . 167
Nonparametric . . . 168
Nonparametric Report Descriptions . . . 170
Unequal Variances . . . 173
Tests That the Variances Are Equal Report . . . 174
Equivalence Test . . . 176
Power . . . 177
Power Details Window and Reports . . . 177
Normal Quantile Plot . . . 179
CDF Plot . . . 179
Densities . . . 179
Matching Column . . . 180
Additional Examples of the Oneway Platform . . . 180
Example of an Analysis of Means Chart . . . 180
Example of an Analysis of Means for Variances Chart . . . 181
Example of the Each Pair, Student’s t Test . . . 182
Example of the All Pairs, Tukey HSD Test . . . 184
Example of the With Best, Hsu MCB Test . . . 186
Example of the With Control, Dunnett’s Test . . . 187
Example Contrasting All of the Compare Means Tests . . . 188
Example of the Unequal Variances Option . . . 189
Example of an Equivalence Test . . . 190
Example of the Robust Fit Option . . . 191
Example of the Power Option . . . 192
Example of a Normal Quantile Plot . . . 194
Example of a CDF Plot . . . 195
Example of the Densities Options . . . 196
Example of the Matching Column Option . . . 197
Statistical Details for the Oneway Platform . . . 198
Statistical Details for Comparison Circles . . . 199
Statistical Details for Power . . . 200
Statistical Details for the Summary of Fit Report . . . 201
Statistical Details for the Tests That the Variances Are Equal Report . . . 202
7
Contingency Analysis
Examine Relationships between Two Categorical Variables . . . 205Example of Contingency Analysis . . . 207
Launch the Contingency Platform . . . 208
The Contingency Report . . . 208
Mosaic Plot . . . 212
Context Menu . . . 213
Contingency Table . . . 214
Tests . . . 216
Fisher’s Exact Test . . . 218
Analysis of Means for Proportions . . . 218
Correspondence Analysis . . . 218
Understanding Correspondence Analysis Plots . . . 219
Correspondence Analysis Options . . . 219
The Details Report . . . 219
Cochran-Mantel-Haenszel Test . . . 220
Agreement Statistic . . . 220
Relative Risk . . . 221
Two Sample Test for Proportions . . . 222
Measures of Association . . . 222
Cochran Armitage Trend Test . . . 224
Exact Test . . . 224
Additional Examples of the Contingency Platform . . . 226
Example of Analysis of Means for Proportions . . . 226
Example of Correspondence Analysis . . . 227
Example of a Cochran Mantel Haenszel Test . . . 231
Example of the Agreement Statistic Option . . . 232
Example of the Relative Risk Option . . . 233
Example of a Two Sample Test for Proportions . . . 235
Example of the Measures of Association Option . . . 236
Example of the Cochran Armitage Trend Test . . . 237
Statistical Details for the Contingency Platform . . . 238
Statistical Details for the Agreement Statistic Option . . . 238
Statistical Details for the Odds Ratio Option . . . 238
Statistical Details for the Tests Report . . . 239
8
Logistic Analysis
Examine Relationships between a Categorical Y and a Continuous X Variable . . . . 241
Overview of Logistic Regression . . . 243
Nominal Logistic Regression . . . 243
Ordinal Logistic Regression . . . 243
Example of Nominal Logistic Regression . . . 244
Launch the Logistic Platform . . . 246
The Logistic Report . . . 246
Logistic Plot . . . 247
Iterations . . . 248
Whole Model Test . . . 248
Parameter Estimates . . . 250
Logistic Platform Options . . . 251
ROC Curves . . . 252
Save Probability Formula . . . 253
Inverse Prediction . . . 254
Additional Examples of Logistic Regression . . . 254
Example of Ordinal Logistic Regression . . . 254
Additional Example of a Logistic Plot . . . 256
Example of ROC Curves . . . 258
Example of Inverse Prediction Using the Crosshair Tool . . . 259
Example of Inverse Prediction Using the Inverse Prediction Option . . . 260
Statistical Details for the Logistic Platform . . . 262
Statistical Details for the Whole Model Test Report . . . 262
9
Matched Pairs Analysis
Compare Measurements on the Same Subject . . . 263Overview of the Matched Pairs Platform . . . 265
Example of Comparing Matched Pairs . . . 265
Launch the Matched Pairs Platform . . . 266
Multiple Y Columns . . . 267
The Matched Pairs Report . . . 268
Difference Plot and Report . . . 268
Matched Pairs Platform Options . . . 270
Example Comparing Matched Pairs across Groups . . . 270
Statistical Details for the Matched Pairs Platform . . . 272
Graphics for Matched Pairs . . . 272
Correlation of Responses . . . 273
10
Bootstrapping
Approximate the Distribution of a Statistic through Resampling . . . 275Example of Bootstrapping . . . 277
Perform a Bootstrap Analysis . . . 278
Bootstrap Window Options . . . 278
Stacked Results Table . . . 279
Unstacked Results Table . . . 280
Analysis of Bootstrap Results . . . 280
11
Tabulate
Create Summary Tables Interactively . . . 283Example of the Tabulate Platform . . . 285
Launch the Tabulate Platform . . . 288
Use the Dialog . . . 290
Add Statistics . . . 292
The Tabulate Output . . . 294
Analysis Columns . . . 296
Grouping Columns . . . 296
Column and Row Tables . . . 297
Edit Tables . . . 298
Tabulate Platform Options . . . 298
Show Test Build Panel . . . 299
Right-Click Menu for Columns . . . 300
Additional Examples of the Tabulate Platform . . . 300
Example of Combining Columns into a Single Table . . . 304
A
References
Index
Chapter
1
Learn about JMP
Documentation and Additional Resources
This chapter includes the following information: • book conventions
• JMP documentation • JMP Help
• additional resources, such as the following: – other JMP documentation
– tutorials – indexes – Web resources
Formatting Conventions . . . 19
JMP Documentation . . . 19
JMP Documentation Library. . . 20
JMP Help . . . 24
Additional Resources for Learning JMP . . . 24
Tutorials . . . 25
Sample Data Tables. . . 25
Learn about Statistical and JSL Terms . . . 25
Learn JMP Tips and Tricks. . . 26
Tooltips. . . 26
JMP User Community . . . 26
JMPer Cable . . . 26
JMP Books by Users . . . 27
Formatting Conventions
The following conventions help you relate written material to information that you see on your screen.
• Sample data table names, column names, pathnames, filenames, file extensions, and folders appear in Helvetica font.
• Code appears in Lucida Sans Typewriter font.
• Code output appears in Lucida Sans Typewriter italic font and is indented farther than the preceding code.
• Helvetica bold formatting indicates items that you select to complete a task: – buttons
– check boxes – commands
– list names that are selectable – menus
– options – tab names – text boxes
• The following items appear in italics:
– words or phrases that are important or have definitions specific to JMP – book titles
– variables
• Features that are for JMP Pro only are noted with the JMP Pro icon . For an overview of JMP Pro features, visit http://www.jmp.com/software/pro/.
Note: Special information and limitations appear within a Note.
Tip: Helpful information appears within a Tip.
JMP Documentation
JMP offers documentation in various formats, from print books and Portable Document Format (PDF) to electronic books (e-books).
• All books are also combined into one PDF file, called JMP Documentation Library, for convenient searching. Open the JMP Documentation Library PDF file from the Help > Books menu.
• e-books are available at Amazon, Safari Books Online, and in the Apple iBookstore. • You can also purchase printed documentation on the SAS website:
http://support.sas.com/documentation/onlinedoc/jmp/index.html
JMP Documentation Library
The following table describes the purpose and content of each book in the JMP library.
Document Title Document Purpose Document Content
Discovering JMP If you are not familiar with JMP, start here.
Introduces you to JMP and gets you started creating and analyzing data.
Using JMP Learn about JMP data
tables and how to perform basic operations.
Covers general JMP concepts and features that span across all of JMP, including importing data, modifying columns properties, sorting data, and connecting to SAS.
Basic Analysis Perform basic analysis using this document.
Describes these Analyze menu platforms: • Distribution
• Fit Y by X • Matched Pairs • Tabulate
How to approximate sampling
distributions using bootstrapping is also included.
Essential Graphing Find the ideal graph for your data.
Describes these Graph menu platforms: • Graph Builder • Overlay Plot • Scatterplot 3D • Contour Plot • Bubble Plot • Parallel Plot • Cell Plot • Treemap • Scatterplot Matrix • Ternary Plot • Chart
Also covers how to create background and custom maps.
Profilers Learn how to use
interactive profiling tools, which enable you to view cross-sections of any response surface.
Covers all profilers listed in the Graph menu. Analyzing noise factors is
included along with running simulations using random inputs.
Design of
Experiments Guide
Learn how to design experiments and determine appropriate sample sizes.
Covers all topics in the DOE menu.
Fitting Linear Models Learn about Fit Model platform and many of its personalities.
Describes these personalities, all available within the Analyze menu Fit Model platform:
• Standard Least Squares • Stepwise • Generalized Regression • Mixed Model • MANOVA • Loglinear Variance • Nominal Logistic • Ordinal Logistic
• Generalized Linear Model Specialized Models Learn about additional
modeling techniques.
Describes these Analyze > Modeling menu platforms: • Partition • Neural • Model Comparison • Nonlinear • Gaussian Process • Time Series • Response Screening
The Screening platform in the Analyze > Modeling menu is described in Design of Experiments Guide.
Multivariate Methods
Read about techniques for analyzing several variables
simultaneously.
Describes these Analyze > Multivariate Methods menu platforms:
• Multivariate • Cluster
• Principal Components • Discriminant
• Partial Least Squares
Quality and Process Methods
Read about tools for evaluating and improving processes.
Describes these Analyze > Quality and Process menu platforms:
• Control Chart Builder and individual control charts
• Measurement Systems Analysis • Variability / Attribute Gauge Charts • Capability
• Pareto Plot • Diagram Reliability and
Survival Methods
Learn to evaluate and improve reliability in a product or system and analyze survival data for people and products.
Describes these Analyze > Reliability and Survival menu platforms:
• Life Distribution • Fit Life by X • Recurrence Analysis • Degradation • Reliability Forecast • Reliability Growth • Reliability Block Diagram • Survival
• Fit Parametric Survival • Fit Proportional Hazards Consumer Research Learn about methods
for studying consumer preferences and using that insight to create better products and services.
Describes these Analyze > Consumer Research menu platforms:
• Categorical • Factor Analysis • Choice
• Uplift • Item Analysis
Note: The Books menu also contains two reference cards that can be printed: The Menu Card describes JMP menus, and the Quick Reference describes JMP keyboard shortcuts.
JMP Help
JMP Help is an abbreviated version of the documentation library that provides targeted information. You can open JMP Help in several ways:
• On Windows, press the F1 key to open the Help system window.
• Get help on a specific part of a data table or report window. Select the Help tool from the Tools menu and then click anywhere in a data table or report window to see the Help for that area.
• Within a JMP window, click the Help button.
• Search and view JMP Help on Windows using the Help > Help Contents, Search Help, and
Help Index options. On Mac, select Help > JMP Help.
• Search the Help at http://jmp.com/support/help/ (English only).
Additional Resources for Learning JMP
In addition to JMP documentation and JMP Help, you can also learn about JMP using the following resources:
• Tutorials (see “Tutorials” on page 25)
• Sample data (see “Sample Data Tables” on page 25)
• Indexes (see “Learn about Statistical and JSL Terms” on page 25) Scripting Guide Learn about taking
advantage of the powerful JMP Scripting Language (JSL).
Covers a variety of topics, such as writing and debugging scripts, manipulating data tables, constructing display boxes, and creating JMP applications.
JSL Syntax Reference Read about many JSL functions on functions and their arguments, and messages that you send to objects and display boxes.
Includes syntax, examples, and notes for JSL commands.
• Tip of the Day (see “Learn JMP Tips and Tricks” on page 26) • Web resources (see “JMP User Community” on page 26)
• JMPer Cable technical publication (see “JMPer Cable” on page 26) • Books about JMP (see “JMP Books by Users” on page 27)
• JMP Starter (see “The JMP Starter Window” on page 27)
Tutorials
You can access JMP tutorials by selecting Help > Tutorials. The first item on the Tutorials menu is Tutorials Directory. This opens a new window with all the tutorials grouped by category. If you are not familiar with JMP, then start with the Beginners Tutorial. It steps you through the JMP interface and explains the basics of using JMP.
The rest of the tutorials help you with specific aspects of JMP, such as creating a pie chart, using Graph Builder, and so on.
Sample Data Tables
All of the examples in the JMP documentation suite use sample data. Select Help > Sample
Data to do the following actions: • Open the sample data directory.
• Open an alphabetized list of all sample data tables. • Find a sample data table within a category.
Sample data tables are installed in the following directory:
On Windows: C:\Program Files\SAS\JMP\<version_number>\Samples\Data On Macintosh: \Library\Application Support\JMP\<version_number>\Samples\Data In JMP Pro, sample data is installed in the JMPPRO (rather than JMP) directory.
Learn about Statistical and JSL Terms
The Help menu contains the following indexes:
Statistics Index Provides definitions of statistical terms.
Scripting Index Lets you search for information about JSL functions, objects, and display boxes. You can also edit and run sample scripts from the Scripting Index.
Learn JMP Tips and Tricks
When you first start JMP, you see the Tip of the Day window. This window provides tips for using JMP.
To turn off the Tip of the Day, clear the Show tips at startup check box. To view it again, select
Help > Tip of the Day. Or, you can turn it off using the Preferences window. See the Using JMP book for details.
Tooltips
JMP provides descriptive tooltips when you place your cursor over items, such as the following:
• Menu or toolbar options • Labels in graphs
• Text results in the report window (move your cursor in a circle to reveal) • Files or windows in the Home Window
• Code in the Script Editor
Tip: You can hide tooltips in the JMP Preferences. Select File > Preferences > General (or JMP
> Preferences > General on Macintosh) and then deselect Show menu tips.
JMP User Community
The JMP User Community provides a range of options to help you learn more about JMP and connect with other JMP users. The learning library of one-page guides, tutorials, and demos is a good place to start. And you can continue your education by registering for a variety of JMP training courses.
Other resources include a discussion forum, sample data and script file exchange, webcasts, and social networking groups.
To access JMP resources on the website, select Help > JMP User Community.
JMPer Cable
The JMPer Cable is a yearly technical publication targeted to users of JMP. The JMPer Cable is available on the JMP website:
JMP Books by Users
Additional books about using JMP that are written by JMP users are available on the JMP website:
http://www.jmp.com/support/books.shtml
The JMP Starter Window
The JMP Starter window is a good place to begin if you are not familiar with JMP or data analysis. Options are categorized and described, and you launch them by clicking a button. The JMP Starter window covers many of the options found in the Analyze, Graph, Tables, and
File menus.
• To open the JMP Starter window, select View (Window on the Macintosh) > JMP Starter. • To display the JMP Starter automatically when you open JMP on Windows, select File >
Preferences > General, and then select JMP Starter from the Initial JMP Window list. On Macintosh, select JMP > Preferences > Initial JMP Starter Window.
Chapter
2
Introduction to Basic Analysis
Overview of Fundamental Analysis Methods
This book describes the initial types of analyses that you often perform in JMP:
• The Distribution platform illustrates the distribution of a single variable using histograms, additional graphs, and reports. Once you know how your data is distributed, you can plan the appropriate type of analysis going forward. See Chapter 3, “Distributions”.
• The Fit Y by X platform analyzes the pair of X and Y variables that you specify, by context, based on modeling type. See Chapter 4, “Introduction to Fit Y by X”. The four types of analyses include:
‒ The Bivariate platform, which analyzes the relationship between two continuous X variables. See Chapter 5, “Bivariate Analysis”.
‒ The Oneway platform, which analyzes how the distribution of a continuous Y variable differs across groups defined by a categorical X variable. See Chapter 6, “Oneway Analysis”.
‒ The Contingency platform, which analyzes the distribution of a categorical response variable Y as conditioned by the values of a categorical X factor. See Chapter 7, “Contingency Analysis”.
‒ The Logistic platform, which fits the probabilities for response categories (Y) to a continuous X predictor. See Chapter 8, “Logistic Analysis”.
• The Matched Pairs platform compares the means between two or more correlated variables and assesses the differences. See Chapter 9, “Matched Pairs Analysis”. • Bootstrap analysis approximates the sampling distribution of a statistic. The data is
re-sampled with replacement and the statistic is computed. This process is repeated to produce a distribution of values for the statistic. See Chapter 10, “Bootstrapping”.
• The Tabulate platform interactively constructs tables of descriptive statistics. See Chapter 11, “Tabulate”.
Chapter
3
Distributions
Using the Distribution Platform
The Distribution platform illustrates the distribution of a single variable using histograms, additional graphs, and reports. The word univariate simply means involving one variable instead of two (bivariate) or many (multivariate). However, you can examine the distribution of several individual variables within a report. The report content for each variable changes depending on whether the variable is categorical (nominal or ordinal) or continuous. Once you know how your data is distributed, you can plan the appropriate type of analysis going forward.
The Distribution report window is interactive. Clicking on a histogram bar highlights the corresponding data in any other histograms and in the data table. See Figure 3.1.
Overview of the Distribution Platform . . . 33 Example of the Distribution Platform . . . 33 Launch the Distribution Platform. . . 35 The Distribution Report. . . 36 Histograms. . . 38 The Frequencies Report . . . 41 The Quantiles Report . . . 41 The Summary Statistics Report. . . 41 Distribution Platform Options . . . 44 Options for Categorical Variables. . . 45 Options for Continuous Variables . . . 47 Normal Quantile Plot. . . 50 Outlier Box Plot. . . 51 Quantile Box Plot . . . 52 Stem and Leaf . . . 52 CDF Plot. . . 53 Test Mean . . . 54 Test Std Dev. . . 55 Confidence Intervals for Continuous Variables . . . 55 Save Commands for Continuous Variables. . . 56 Prediction Intervals. . . 57 Tolerance Intervals . . . 58 Capability Analysis. . . 58 Fit Distributions. . . 60 Additional Examples of the Distribution Platform . . . 66 Statistical Details for the Distribution Platform . . . 72
Overview of the Distribution Platform
The treatment of variables in the Distribution platform is different, depending on the modeling type of variable, which can be categorical (nominal or ordinal) or continuous.
Categorical Variables
For categorical variables, the initial graph that appears is a histogram. The histogram shows a bar for each level of the ordinal or nominal variable. You can also add a divided (mosaic) bar chart.
The reports show counts and proportions. You can add confidence intervals and test the probabilities.
Continuous Variables
For numeric continuous variables, the initial graphs show a histogram and an outlier box plot. The histogram shows a bar for grouped values of the continuous variable. The following options are also available:
• quantile box plot • normal quantile plot • stem and leaf plot • CDF plot
The reports show selected quantiles and summary statistics. Report options are available for the following:
• saving ranks, probability scores, normal quantile values, and so on, as new columns in the data table
• testing the mean and standard deviation of the column against a constant you specify • fitting various distributions and nonparametric smoothing curves
• performing a capability analysis for a quality control application • confidence intervals, prediction intervals, and tolerance intervals
Example of the Distribution Platform
Suppose that you have data on 40 students, and you want to see the distribution of age and height among the students.
2. Select Analyze > Distribution.
3. Select age and height and click Y, Columns. 4. Click OK.
Figure 3.2 Example of the Distribution Platform
From the histograms, you notice the following: • The ages are not uniformly distributed.
• For height, there are two points with extreme values (that might be outliers).
Click on the bar for 50 in the height histogram to take a closer look at the potential outliers. • The corresponding ages are highlighted in the age histogram. The potential outliers are
age 12.
• The corresponding rows are highlighted in the data table. The names of the potential outliers are Lillie and Robert.
Add labels to the potential outliers in the height histogram. 1. Select both outliers.
2. Right-click on one of the outliers and select Row Label. Label icons are added to these rows in the data table. 3. (Optional) Resize the box plot wider to see the full labels.
Figure 3.3 Potential Outliers Labeled
Launch the Distribution Platform
Launch the Distribution platform by selecting Analyze > Distribution.
Figure 3.4 The Distribution Launch Window
Table 3.1 Description of the Distribution Launch Window
Y, Columns Assigns the variables that you want to analyze. A histogram and
associated reports appear for each variable.
Weight Assigns a variable to give the observations different weights. Any moment that is based on the Sum Wgts is affected by weights.
For general information about launch windows, see Using JMP.
The Distribution Report
Follow the instructions in “Example of the Distribution Platform” on page 33 to produce the report shown in Figure 3.5.
Freq Assigns a frequency variable to this role. This is useful if you have
summarized data. In this instance, you have one column for the Y values and another column for the frequency of occurrence of the Y values. The sum of this variable is included in the overall count appearing in the Summary Statistics report (represented by N). All other moment statistics (mean, standard deviation, and so on) are also affected by the Freq variable.
By Produces a separate report for each level of the By variable. If more
than one By variable is assigned, a separate report is produced for each possible combination of the levels of the By variables.
Histograms Only Removes everything except the histograms from the report
window.
Figure 3.5 The Initial Distribution Report Window
Note: Any rows that are excluded in the data table are also hidden in the histogram. The initial Distribution report contains a histogram and reports for each variable. Note the following:
• To replace a variable in a report, from the Columns panel of the associated data table, drag and drop the variable into the axis of the histogram.
• To insert a new variable into a report, creating a new histogram, drag and drop the variable outside of an existing histogram. The new variable can be placed before, between, or after the existing histograms.
Note: To remove a variable, select Remove from the red triangle menu.
• The red triangle menu next to Distributions contains options that affect all of the variables. See “Distribution Platform Options” on page 44.
• The red triangle menu next to each variable contains options that affect only that variable. See “Options for Categorical Variables” on page 45 or “Options for Continuous Variables” on page 47. If you hold down the CTRL key and select a variable option, the option applies to all of the variables that have the same modeling type.
• Histograms visually display your data. See “Histograms” on page 38.
• The initial report for a categorical variable contains a Frequencies report. See “The Frequencies Report” on page 41.
• The initial report for a continuous variable contains a Quantiles and a Summary Statistics report. See “The Quantiles Report” on page 41 and “The Summary Statistics Report” on page 41.
Histograms
Histograms visually display your data. For categorical (nominal or ordinal) variables, the histogram shows a bar for each level of the ordinal or nominal variable. For continuous variables, the histogram shows a bar for grouped values of the continuous variable.
Table 3.2 Histogram Actions
Highlighting data Click on a histogram bar or an outlying point in the graph. The corresponding rows are highlighted in the data table, and corresponding sections of other histograms are also
highlighted, if applicable. See “Highlight Bars and Select Rows” on page 40.
Creating a subset Double-click on a histogram bar, or right-click on a histogram bar and select Subset. A new data table is created that contains only the selected data.
Resizing the entire histogram
Hover over the histogram borders until you see a
double-sided arrow. Then click and drag the borders. For more details, see the Using JMP book.
Rescaling the axis (Continuous variables only) Click and drag on an axis to rescale it.
Alternatively, hover over the axis until you see a hand. Then, double-click on the axis and set the parameters in the Axis Specification window.
Resizing histogram bars (Continuous variables only) There are multiple options to resize histogram bars. See “Resize Histogram Bars for Continuous Variables” on page 39.
To see additional options for the histogram or the associated data table: • Right-click on a histogram. See the Using JMP book.
• Click on the red triangle next to the variable, and select Histogram Options. Options are slightly different depending on the variable modeling type. See “Options for Categorical Variables” on page 45 or “Options for Continuous Variables” on page 47.
Resize Histogram Bars for Continuous Variables
Resize histogram bars for continuous variables by using the following: • the Grabber (hand) tool
• the Set Bin Width option • the Increment option
Use the Grabber Tool
The Grabber tool is a quick way to explore your data. 1. Select Tools > Grabber.
Note: (Windows only) To see the menu bar, you might need to hover over the bar below the window title. You can also change this setting in File > Preferences > Windows Specific. 2. Place the grabber tool anywhere in the histogram.
3. Click and drag the histogram bars.
Think of each bar as a bin that holds a number of observations:
• Moving the hand to the left increases the bin width and combines intervals. The number of bars decreases as the bar size increases.
• Moving the hand to the right decreases the bin width, producing more bars.
• Moving the hand up or down shifts the bin locations on the axis, which changes the contents and size of each bin.
Use the Set Bin Width Option
The Set Bin Width option is a more precise way to set the width for all bars in a histogram. To use the Set Bin Width option, from the red triangle menu for the variable, select Histogram
Options > Set Bin Width. Change the bin width value.
Specifying your selection Specify the data that you select in multiple histograms. See “Specify Your Selection in Multiple Histograms” on page 40.
Use the Increment Option
The Increment option is another precise way to set the bar width. To use the Increment option, double-click on the axis, and change the Increment value.
Highlight Bars and Select Rows
Clicking on a histogram bar highlights the bar and selects the corresponding rows in the data table. The appropriate portions of all other graphical displays also highlight the selection. Figure 3.6 shows the results of highlighting a bar in the height histogram. The corresponding rows are selected in the data table.
Tip: To deselect histogram bars, press the CTRL key and click on the highlighted bars.
Figure 3.6 Highlighting Bars and Rows
Specify Your Selection in Multiple Histograms
Extend or narrow your selection in histograms as follows:
• To extend your selection, hold down the SHIFT key and select another bar. This is the equivalent of using an or operator.
• To narrow your selection, hold down the ALT key and select another bar. This is the equivalent of using an and operator.
Related Information
• “Example of Selecting Data in Multiple Histograms” on page 66
Select a bar to highlight rows and parts of other output.
The Frequencies Report
For nominal and ordinal variables, the Frequencies report lists the levels of the variables, along with the associated frequency of occurrence and probabilities.
For each level of a categorical (nominal or ordinal) variable, the Frequencies report contains the information described in Table 3.3. Missing values are omitted from the analysis.
The Quantiles Report
For continuous variables, the Quantiles report lists the values of selected quantiles (sometimes called percentiles).
Related Information
• “Statistical Details for Quantiles” on page 72
The Summary Statistics Report
For continuous variables, the Summary Statistics report displays the mean, standard
deviation, and other summary statistics. You can control which statistics appear in this report by selecting Customize Summary Statistics from the red triangle menu next to Summary Statistics.
Table 3.3 Description of the Frequencies Report
Level Lists each value found for a response variable.
Count Lists the number of rows found for each level of a response variable. If you use a Freq variable, the Count is the sum of the Freq variables for each level of the response variable.
Prob Lists the probability (or proportion) of occurrence for each level of a response variable. The probability is computed as the count divided by the total frequency of the variable, shown at the bottom of the table.
StdErr Prob Lists the standard error of the probabilities. This column might be hidden. To show the column, right-click in the table and select
Columns > StdErr Prob.
Cum Prob Contains the cumulative sum of the column of probabilities. This column might be hidden. To show the column, right-click in the table and select Columns > Cum Prob.
• Table 3.4 describes the statistics that appear by default.
• Table 3.5 describes additional statistics that you can add to the report using the Customize
Summary Statistics window.
Table 3.4 Description of the Summary Statistics Report
Mean Estimates the expected value of the underlying distribution for the response variable, which is the arithmetic average of the column’s values. It is the sum of the non-missing values divided by the number of non-missing values.
Std Dev The normal distribution is mainly defined by the mean and standard deviation. These parameters provide an easy way to summarize data as the sample becomes large:
• 68% of the values are within one standard deviation of the mean
• 95% of the values are within two standard deviations of the mean
• 99.7% of the values are within three standard deviations of the mean
Std Err Mean The standard error of the mean, which estimates the standard deviation of the distribution of the mean.
Upper 95% Mean and Lower 95% Mean
Are 95% confidence limits about the mean. They define an interval that is very likely to contain the true population mean.
N Is the total number of nonmissing values.
Table 3.5 Additional Summary Statistics
Sum Weight The sum of a column assigned to the role of Weight (in the launch window). Sum Wgt is used in the
denominator for computations of the mean instead of N.
Sum The sum of the response values.
Variance The sample variance, and the square of the sample standard deviation.
Skewness Measures sidedness or symmetry.
CV The percent coefficient of variation. It is computed as the standard deviation divided by the mean and multiplied by 100. The coefficient of variation can be used to assess relative variation, for example when comparing the variation in data measured in different units or with different magnitudes.
N Missing The number of missing observations.
N Zero The number of zero values.
N Unique The number of unique values.
Uncorrected SS The uncorrected sum of squares or sum of values squared.
Corrected SS The corrected sum of squares or sum of squares of deviations from the mean.
Autocorrelation (Appears only if you have not specified a Frequency variable.) First autocorrelation that tests if the residuals are correlated across the rows. This test helps detect non-randomness in the data.
Minimum Represents the 0 percentile of the data. Maximum Represents the 100 percentile of the data. Median Represents the 50th percentile of the data.
Mode The value that occurs most often in the data. If there are multiple modes, the smallest mode appears.
Trimmed Mean (Does not appear if you have specified a Weight variable.)The mean calculated after removing the smallest p% and the largest p% of the data. Geometric Mean The nth root of the product of the data.
Range The difference between the maximum and minimum of
the data.
Interquartile Range The difference between the 3rd and 1st quartiles. Median Absolute Deviation (Does not appear if you have specified a Weight
variable.) The median of the absolute deviations from the median.
Summary Statistics Options
The red triangle menu next to Summary Statistics contains these options:
Customize Summary Statistics Select which statistics you want to appear from the list. You can select or deselect all summary statistics. See Table 3.5.
Show All Modes Shows all of the modes if there are multiple modes.
Related Information
• “Statistical Details for Summary Statistics” on page 73
Distribution Platform Options
The red triangle menu next to Distributions contains options that affect all of the reports and graphs in the Distribution platform.
Robust Mean The robust mean, calculated in a way that is resistant to outliers, using Huber's M-estimation. See Huber and Ronchetti, 2009.
Robust Std Dev The robust standard deviation, calculated in a way that is resistant to outliers, using Huber's M-estimation. See Huber and Ronchetti, 2009.
Enter (1-alpha) for mean confidence interval
Specify the alpha level for the mean confidence interval.
Enter trimmed mean percent Specify the trimmed mean percentage. The percentage is trimmed off each side of the data.
Table 3.5 Additional Summary Statistics (Continued)
Table 3.6 Descriptions of Distribution Platform Options
Uniform Scaling Scales all axes with the same minimum, maximum, and intervals so that the distributions can be easily compared.
Stack Changes the orientation of the histogram and the reports to horizontal and stacks the individual distribution reports vertically. Deselect this option to return the report window to its original layout.
Options for Categorical Variables
The red triangle menus next to each variable in the report window contain additional options that apply to the variable. This section describes the options that are available for categorical (nominal or ordinal) variables.
To see the options that are available for continuous variables, see “Options for Continuous Variables” on page 47.
Arrange in Rows Enter the number of plots that appear in a row. This option helps you view plots vertically rather than in one wide row.
Save for Adobe Flash platform (.SWF)
Saves the histograms as .swf files that are Adobe Flash player compatible. Use these files in presentations and in Web pages. An HTML page is also saved that shows you the correct code for using the resulting .swf file.
For more information about this option, go to http:// www.jmp.com/support/swfhelp/en.
Script This menu contains options that are available to all platforms. They enable you to redo the analysis or save the JSL commands for the analysis to a window or a file. For more information, see Using JMP.
Table 3.6 Descriptions of Distribution Platform Options (Continued)
Table 3.7 Description of Options for Categorical Variables
The Display Options sub-menu contains the following options:
Frequencies Shows or hides the Frequencies report. See “The Frequencies
Report” on page 41.
Horizontal Layout Changes the orientation of the histogram and the reports to
vertical or horizontal.
Axes on Left Moves the Count, Prob, and Density axes to the left instead of the
right.
This option is applicable only if Horizontal Layout is selected. The Histograms sub-menu contains the following options:
Vertical Changes the orientation of the histogram from a vertical to a horizontal orientation.
Std Error Bars Draws the standard error bar on each level of the histogram.
Separate Bars Separates the histogram bars.
Histogram Color Changes the color of the histogram bars.
Count Axis Adds an axis that shows the frequency of column values represented by the histogram bars.
Prob Axis Adds an axis that shows the proportion of column values represented by histogram bars.
Density Axis Adds an axis that shows the length of the bars in the histogram. The count and probability axes are based on the following calculations:
prob = (bar width)*density
count = (bar width)*density*(total count)
Show Percents Labels the percent of column values represented by each histogram bar.
Show Counts Labels the frequency of column values represented by each histogram bar.
Mosaic Plot Displays a mosaic bar chart for each nominal or ordinal response
variable. A mosaic plot is a stacked bar chart where each segment is proportional to its group’s frequency count.
Order By Reorders the histogram, mosaic plot, and Frequencies report in
ascending or descending order, by count. To save the new order as a column property, use the Save > Value Ordering option.
Test Probabilities Displays a report that tests hypothesized probabilities. See
“Examples of the Test Probabilities Option” on page 66 for more details.
Confidence Interval This menu contains confidence levels. Select a value that is listed,
or select Other to enter your own. JMP computes score confidence intervals.
The Save sub-menu contains the following options:
Options for Continuous Variables
The red triangle menus next to each variable in the report window contain additional options that apply to the variable. This section describes the options that are available for continuous variables.
To see the options that are available for categorical (nominal and ordinal) variables, see “Options for Categorical Variables” on page 45.
Level Numbers Creates a new column in the data table called Level <colname>. The level number of each observation corresponds to the histogram bar that contains the observation.
Value Ordering (Use with the Order By option) Creates a new value ordering
column property in the data table, reflecting the new order.
Script to log Displays the script commands to generate the current report in the log window. Select View > Log to see the log window.
Remove Permanently removes the variable and all its reports from the
Distribution report.
Table 3.7 Description of Options for Categorical Variables (Continued)
Table 3.8 Description of Options for Continuous Variables
The Display Options sub-menu contains the following options:
Quantiles Shows or hides the Quantiles report. See “The Quantiles Report”
on page 41.
Set Quantile Increment Changes the quantile increment or revert back to the default
quantile increment.
Custom Quantiles Sets custom quantiles by values or by increments. You can also
specify the confidence level. Smoothed empirical likelihood quantile estimates, based on a kernel density estimate, are added to the report. The confidence intervals for these quantile estimates tend to contain the true quantile with the promised confidence level.
Summary Statistics Shows or hides the Summary Statistics report. See “The Summary
Customize Summary Statistics
Adds or removes statistics from the Summary Statistics report. See “The Summary Statistics Report” on page 41.
Horizontal Layout Changes the orientation of the histogram and the reports to
vertical or horizontal.
Axes on Left Moves the Count, Prob, Density, and Normal Quantile Plot axes to
the left instead of the right.
This option is applicable only if Horizontal Layout is selected. The Histograms sub-menu contains the following options:
Histogram Shows or hides the histogram. See “Histograms” on page 38.
Shadowgram Replaces the histogram with a shadowgram. To understand a shadowgram, consider that if the bin width of a histogram is changed, the appearance of the histogram changes. A shadowgram overlays histograms with different bin widths. Dominant features of a distribution are less transparent on the shadowgram.
Note that the following options are not available for shadowgrams:
• Std Error Bars • Show Counts • Show Percents
Vertical Changes the orientation of the histogram from a vertical to a horizontal orientation.
Std Error Bars Draws the standard error bar on each level of the histogram using the standard error. The standard error bar adjusts automatically when you adjust the number of bars with the hand tool. See “Resize Histogram Bars for Continuous Variables” on page 39. and “Statistical Details for Standard Error Bars” on page 72.
Set Bin Width Changes the bin width of the histogram bars. See “Resize Histogram Bars for Continuous Variables” on page 39.
Histogram Color Changes the color of the histogram bars.
Count Axis Adds an axis that shows the frequency of column values represented by the histogram bars.
Note: If you resize the histogram bars, the count axis also resizes.
Prob Axis Adds an axis that shows the proportion of column values represented by histogram bars.
Note: If you resize the histogram bars, the probability axis also resizes.
Density Axis The density is the length of the bars in the histogram. Both the count and probability are based on the following calculations:
prob = (bar width)*density
count = (bar width)*density*(total count)
When looking at density curves that are added by the Fit
Distribution option, the density axis shows the point estimates of the curves.
Note: If you resize the histogram bars, the density axis remains constant.
Show Percents Labels the proportion of column values represented by each histogram bar.
Show Counts Labels the frequency of column values represented by each histogram bar.
Normal Quantile Plot Adds a normal quantile plot that shows the extent to which the
variable is normally distributed. See “Normal Quantile Plot” on page 50.
Outlier Box Plot Adds an outlier box plot that shows the outliers in your data. See
“Outlier Box Plot” on page 51.
Stem and Leaf Adds a stem and leaf report, which is a variation of the histogram.
See “Stem and Leaf” on page 52.
CDF Plot Adds a plot of the empirical cumulative distribution function. See
“CDF Plot” on page 53.
Test Mean Performs a one-sample test for the mean. See “Test Mean” on
page 54.
Test Std Dev Performs a one-sample test for the standard deviation. See “Test
Std Dev” on page 55.
Confidence Interval Choose confidence intervals for the mean and standard deviation.
See “Confidence Intervals for Continuous Variables” on page 55.
Normal Quantile Plot
Use the Normal Quantile Plot option to visualize the extent to which the variable is normally distributed. If a variable is normally distributed, the normal quantile plot approximates a diagonal straight line. This type of plot is also called a quantile-quantile plot, or Q-Q plot. The normal quantile plot also shows Lilliefors confidence bounds (Conover 1980) and probability and normal quantile scales.
Figure 3.7 Normal Quantile Plot
Prediction Interval Choose prediction intervals for a single observation, or for the
mean and standard deviation of the next randomly selected sample. See “Prediction Intervals” on page 57.
Tolerance Interval Computes an interval to contain at least a specified proportion of
the population. See “Tolerance Intervals” on page 58.
Capability Analysis Measures the conformance of a process to given specification
limits. See “Capability Analysis” on page 58.
Continuous Fit Fits distributions to continuous variables. See “Fit Distributions”
on page 60.
Discrete Fit Fits distributions to discrete variables. See “Fit Distributions” on
page 60.
Save Saves information about continuous or categorical variables. See
“Save Commands for Continuous Variables” on page 56.
Remove Permanently removes the variable and all its reports from the
Distribution report.
Table 3.8 Description of Options for Continuous Variables (Continued)
normal quantile scale
probability scale Lilliefors confidence bounds
Note the following information:
• The y-axis shows the column values.
• The x-axis shows the empirical cumulative probability for each value.
Related Information
• “Statistical Details for the Normal Quantile Plot” on page 74
Outlier Box Plot
Use the outlier box plot (also called a Tukey outlier box plot) to see the distribution and identify possible outliers. Generally, box plots show selected quantiles of continuous distributions.
Figure 3.8 Outlier Box Plot
Note the following aspects about outlier box plots:
• The vertical line within the box represents the median sample value.
• The confidence diamond contains the mean and the upper and lower 95% of the mean. If you drew a line through the middle of the diamond, you would have the mean. The top and bottom points of the diamond represent the upper and lower 95% of the mean. • The ends of the box represent the 25th and 75th quantiles, also expressed as the 1st and 3rd
quartile, respectively.
• The difference between the 1st and 3rd quartiles is called the interquartile range. • The box has lines that extend from each end, sometimes called whiskers. The whiskers
extend from the ends of the box to the outermost data point that falls within the distances computed as follows:
1st quartile - 1.5*(interquartile range) 3rd quartile + 1.5*(interquartile range)
If the data points do not reach the computed ranges, then the whiskers are determined by the upper and lower data point values (not including outliers).
median sample value confidence diamond 1st quartile 3rd quartile whisker whisker shortest half
• The bracket outside of the box identifies the shortest half, which is the most dense 50% of the observations (Rousseuw and Leroy 1987).
Remove Objects from the Outlier Box Plot
To remove the confidence diamond or the shortest half, proceed as follows: 1. Right-click on the outlier box plot and select Customize.
2. Click Box Plot.
3. Deselect the check box next to Confidence Diamond or Shortest Half. For more details about the Customize Graph window, see the Using JMP book.
Quantile Box Plot
The Quantile Box Plot displays specific quantiles from the Quantiles report. If the distribution is symmetric, the quantiles in the box plot are approximately equidistant from each other. At a glance, you can see whether the distribution is symmetric. For example, if the quantile marks are grouped closely at one end, but have greater spacing at the other end, the distribution is skewed toward the end with more spacing. See Figure 3.9.
Figure 3.9 Quantile Box Plot
Quantiles are values where the pth quantile is larger than p% of the values. For example, 10% of the data lies below the 10th quantile, and 90% of the data lies below the 90th quantile.
Stem and Leaf
Each line of the plot has a Stem value that is the leading digit of a range of column values. The Leaf values are made from the next-in-line digits of the values. You can see the data point by joining the stem and leaf. In some cases, the numbers on the stem and leaf plot are rounded
90% quantile
versions of the actual data in the table. The stem-and-leaf plot actively responds to clicking and the brush tool.
Note: The stem-and-leaf plot does not support fractional frequencies.
CDF Plot
The CDF plot creates a plot of the empirical cumulative distribution function. Use the CDF plot to determine the percent of data that is at or below a given value on the x-axis.
Figure 3.10 CDF Plot
For example, in this CDF plot, approximately 30% of the data is less than a total fat value of 10 grams.