J.P. Verma
Data Analysis in Management
J.P. Verma
Research and Advanced Studies
Lakshmibai National University of Physical Education
Gwalior, MP, India
ISBN 978-81-322-0785-6 ISBN 978-81-322-0786-3 (eBook) DOI 10.1007/978-81-322-0786-3
Springer New Delhi Heidelberg New York Dordrecht London
Library of Congress Control Number: 2012954479
The IBM SPSS Statistics has been used in solving various applications in different chapters of the book with the permission of the International Business Machines Corporation, © SPSS, Inc., an IBM Company. The various screen images of the software are Reprinted Courtesy of International Business Machines Corporation, © SPSS. “SPSS was acquired by IBM in October, 2009.”
IBM, the IBM logo, ibm.com, and SPSS are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
© Springer India 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Printed on acid-free paper
To my elder sister Sandhya Mohan for
introducing me to statistics
Brother-in-law Rohit Mohan for his
helping gesture
Preface
While serving as a faculty member in statistics for the last 30 years, I have observed that non-statistics faculty and research scholars in different disciplines find it difficult to use statistical techniques in their research problems. Even if their theoretical concepts are sound, it is troublesome for them to use statistical software. This book provides readers with a greater understanding of a variety of statistical techniques, along with the procedure for using SPSS, the most popular statistical software package.
The book strengthens the intuitive understanding of the material, thereby increasing the ability to successfully analyze data in the future. It enhances readers' capability in applying data analysis techniques to a broader spectrum of research problems.
The book is intended for undergraduate and postgraduate courses, along with pre-doctoral and doctoral course work, on data analysis, statistics, and/or quantitative methods taught in management and allied disciplines like psychology, economics, education, nursing, medicine, or other behavioral and social sciences. This book is equally useful to advanced researchers in the humanities and the behavioral and social sciences in solving their research problems.
The book has been written to provide solutions to researchers in different disciplines for using SPSS, one of the most powerful statistical software packages. The book will serve students as a self-learning text on using SPSS to apply statistical techniques in their research problems.
In most research studies, data are analyzed using multivariate statistics, which poses an additional problem for beginners. These techniques cannot be understood without in-depth knowledge of statistical concepts. Further, several fields in science, engineering, and humanities have developed their own nomenclature, assigning different names to the same concepts. Thus, one has to gather sufficient knowledge and experience in order to analyze data efficiently. This book covers most of the statistical techniques, including some of the most powerful multivariate techniques, along with the detailed analysis and interpretation of the SPSS output that research scholars in different disciplines require to achieve their research objectives.
The USP of this book is that even without in-depth knowledge of statistics, one can learn various statistical techniques and their applications on one's own.
Each chapter is self-contained and starts with topics like introductory concepts, application areas, and the statistical techniques used in the chapter, followed by a step-by-step solved example with SPSS. In each chapter, the SPSS output has been interpreted in depth to help readers understand the application of statistical techniques in different situations. Since the SPSS output generated in different statistical applications is raw and cannot be used directly for reporting, a model way of writing the results has been shown wherever it is required.
This book focuses on providing readers with the knowledge and skills needed to carry out research in management, humanities, and social and behavioral sciences by using SPSS. Given its contents and the prospect of learning computing skills using SPSS, this book is a must for every researcher from graduate-level studies onward. Towards the end of each chapter, short-answer questions, multiple-choice questions, and assignments have been provided as practice exercises for the readers. Common mistakes, like using a two-tailed test for testing a one-tailed hypothesis, using the term “level of confidence” for defining the level of significance, or using a statement like “accepting the null hypothesis” instead of “not able to reject the null hypothesis,” have been explained extensively in the text so that readers may avoid such mistakes while organizing and conducting their research work.
Faculty members who use this book will find it very useful, as it presents many illustrations with either real or simulated data to discuss analytical techniques in different chapters. Some of the examples cited in the text are from my own and my colleagues’ research studies.
This book consists of 14 chapters. Chapter 1 deals with data types, data cleaning, and the procedure to start SPSS on the system. The notations used throughout the book for SPSS commands have been explained in this chapter. Chapter 2 deals with descriptive study. Different situations in which such studies can be undertaken have been discussed. The procedure of computing various descriptive statistics has been discussed in this chapter. Besides the computing procedure through SPSS, a new approach has been shown towards the end of the second chapter to develop the profile graph, which can be used for comparing different domains of the populations.
Chapter 3 explains the chi-square test and its different applications by means of solved examples. The step-by-step procedure of computing chi-square using SPSS has been discussed. Chi-square is a test of significance for association between attributes, but it also provides a comparison of two groups when the responses are measured on the nominal scale. This fact has been discussed for the benefit of the readers.
Chapter 4 explains the procedure of computing the correlation matrix and partial correlations using SPSS. Emphasis has been given to how to interpret the relationships.
In Chapter 5, computing multiple correlations and regression analysis has been discussed. Both approaches to regression analysis in SPSS, that is, the Stepwise and Enter methods, have been discussed for estimating any measurable phenomenon.
In Chapter 6, application of t-test in testing the significance of difference between groups in all the three situations, that is, in one sample, two independent samples, and two dependent samples, has been discussed in detail. Procedures of using one-tailed and two-tailed tests have been thoroughly detailed.
Chapter 7 explains the procedure of applying one-way analysis of variance (ANOVA) with equal and unequal groups for testing the significance of variability among group means. The graphical approach has been discussed for post hoc comparisons of means besides using the p-value concept.
In Chapter 8, two-way ANOVA for understanding the causes of variation has been discussed in detail by means of solved examples using SPSS. The model way of writing the results has been shown, which the students should note. Procedure for doing interaction analysis has been discussed in detail by using the SPSS output.
In Chapter 9, the application of ANCOVA to study the role of covariate in experimental research has been discussed by means of a research example. Students can find the procedure of analyzing their data much easier after going through this chapter.
In Chapter 10, the cluster analysis technique has been discussed in detail for market segmentation. Readers will come to know about the situations where cluster analysis can be used in their research studies. All its basic concepts have been elaborated so that even a non-statistician can appreciate and use it for their research data.
Chapter 11 deals with factor analysis, one of the most widely used multivariate statistical techniques in management research. By going through this chapter, readers can learn to study the characteristics of a group of data by means of a few underlying structures instead of a large number of parameters. The procedure of developing a test battery using the factor analysis technique has also been discussed in detail.
In Chapter 12, we have discussed discriminant analysis and its application in various research situations. By learning this technique, one can develop a classificatory model for classifying a customer into either of two categories based on their relevant profile parameters. The technique is very useful in classifying a customer as good or bad for offering various services in the area of banking and insurance.
Chapter 13 explains the application of logistic regression for probabilistic classification of cases into one of the two groups. Basics of this technique have been discussed before explaining the procedure in solving logistic regression with SPSS. Interpretations of each and every output have been very carefully explained for easy understanding of the readers.
In Chapter 14, multidimensional scaling has been discussed to find the brand positioning of different products. This technique is especially useful if the popularity of products is to be compared on different parameters.
At each and every step, care has been taken so that readers can learn to apply SPSS and understand the minutest possible detail of the analysis discussed in this book. The purpose of this book is to give a brief and clear description of how to apply a variety of statistical analyses using any version of SPSS. We hope that this book will provide students and researchers with self-learning material on using SPSS to analyze their data.
Students and other readers are welcome to e-mail me their queries related to any portion of the book at [email protected], to which a timely reply will be sent.
Professor (Statistics) J.P. Verma
Acknowledgements
I would like to extend my sincere thanks to my professional colleagues Prof. Y.P. Gupta, Prof. S. Sekhar, Dr. V.B. Singh, Prof. Jagdish Prasad, and Dr. J.P. Bhukar for their valuable inputs in completing this text. I must thank my research scholars, who always motivated me to solve a variety of complex research problems, which has contributed a lot to preparing this text. I must also appreciate the effort of my wife Hari Priya, who not only provided me with a peaceful environment in preparing this text but also helped me to a great extent in correcting the language and format of the manuscript. Finally, I owe my loving gesture to my children Prachi and Priyam, who have provided me with creative inputs in the preparation of this manuscript.
Professor (Statistics) J.P. Verma
Contents
1 Data Management
Introduction
Types of Data
Metric Data
Nonmetric Data
Important Definitions
Variable
Attribute
Mutually Exclusive Attributes
Independent Variable
Dependent Variable
Extraneous Variable
The Sources of Research Data
Primary Data
Secondary Data
Data Cleaning
Detection of Errors
Typographical Conventions Used in This Book
How to Start SPSS
Preparing Data File
Defining Variables and Their Properties Under Different Columns
Defining Variables for the Data in Table 1.1
Entering the Data
Importing Data in SPSS
Importing Data from an ASCII File
Importing Data File from Excel Format
Exercise
2 Descriptive Analysis
Introduction
Measures of Central Tendency
Mean
Median
Mode
Summary of When to Use the Mean, Median, and Mode
Measures of Variability
The Range
The Interquartile Range
The Standard Deviation
Variance
The Index of Qualitative Variation
Standard Error
Coefficient of Variation (CV)
Moments
Skewness
Kurtosis
Percentiles
Percentile Rank
Situation for Using Descriptive Study
Solved Example of Descriptive Statistics Using SPSS
Computation of Descriptive Statistics Using SPSS
Interpretation of the Outputs
Developing Profile Chart
Summary of the SPSS Commands
Exercise
3 Chi-Square Test and Its Application
Introduction
Advantages of Using Crosstabs
Statistics Used in Cross Tabulations
Chi-Square Statistic
Chi-Square Test
Application of Chi-Square Test
Contingency Coefficient
Lambda Coefficient
Phi Coefficient
Gamma
Cramer’s V
Kendall Tau
Situation for Using Chi-Square
Solved Examples of Chi-Square for Testing an Equal Occurrence Hypothesis
Computation of Chi-Square Using SPSS
Interpretation of the Outputs
Solved Examples of Chi-Square for Testing the Significance of Association Between Two Attributes
Computation of Chi-Square for Two Variables Using SPSS
Interpretation of the Outputs
Summary of the SPSS Commands
Exercise
4 Correlation Matrix and Partial Correlation: Explaining Relationships
Introduction
Details of Correlation Matrix and Partial Correlation
Product Moment Correlation Coefficient
Partial Correlation
Situation for Using Correlation Matrix and Partial Correlation
Research Hypotheses to Be Tested
Statistical Test
Solved Example of Correlation Matrix and Partial Correlations by SPSS
Computation of Correlation Matrix Using SPSS
Interpretation of the Outputs
Computation of Partial Correlations Using SPSS
Interpretation of Partial Correlation
Summary of the SPSS Commands
Exercise
5 Regression Analysis and Multiple Correlations: For Estimating a Measurable Phenomenon
Introduction
Terminologies Used in Regression Analysis
Multiple Correlation
Coefficient of Determination
The Regression Equation
Multiple Regression
Application of Regression Analysis
Solved Example of Multiple Regression Analysis Including Multiple Correlation
Computation of Regression Coefficients, Multiple Correlation, and Other Related Output in the Regression Analysis
Interpretation of the Outputs
Summary of the SPSS Commands for Regression Analysis
Exercise
6 Hypothesis Testing for Decision-Making
Introduction
Hypothesis Construction
Null Hypothesis
Alternative Hypothesis
Test Statistic
Rejection Region
Steps in Hypothesis Testing
Type I and Type II Errors
One-Tailed and Two-Tailed Tests
Criteria for Using One-Tailed and Two-Tailed Tests
Strategy in Testing One-Tailed and Two-Tailed Tests
What Is p Value?
Degrees of Freedom
One-Sample t-Test
Application of One-Sample Test
Two-Sample t-Test for Unrelated Groups
Assumptions in Using Two-Sample t-Test
Application of Two-Sampled t-Test
Assumptions in Using Paired t-Test
Testing Protocol in Using Paired t-Test
Solved Example of Testing Single Group Mean
Computation of t-Statistic and Related Outputs
Interpretation of the Outputs
Solved Example of Two-Sample t-Test for Unrelated Groups with SPSS
Computation of Two-Sample t-Test for Unrelated Groups
Interpretation of the Outputs
Solved Example of Paired t-Test with SPSS
Computation of Paired t-Test for Related Groups
Interpretation of the Outputs
Summary of SPSS Commands for t-Tests
Exercise
7 One-Way ANOVA: Comparing Means of More than Two Samples
Introduction
Principles of ANOVA Experiment
One-Way ANOVA
Factorial ANOVA
Repeated Measure ANOVA
Multivariate ANOVA
One-Way ANOVA Model and Hypotheses Testing
Assumptions in Using One-Way ANOVA
Effect of Using Several t-tests Instead of ANOVA
Application of One-Way ANOVA
Solved Example of One-Way ANOVA with Equal Sample Size Using SPSS
Computations in One-Way ANOVA with Equal Sample Size
Interpretations of the Outputs
Solved Example of One-Way ANOVA with Unequal Sample
Computations in One-Way ANOVA with Unequal Sample Size
Interpretation of the Outputs
Summary of the SPSS Commands for One-Way ANOVA (Example 7.2)
Exercise
8 Two-Way Analysis of Variance: Examining Influence of Two Factors on Criterion Variable
Introduction
Principles of ANOVA Experiment
Classification of ANOVA
Factorial Analysis of Variance
Repeated Measure Analysis of Variance
Multivariate Analysis of Variance (MANOVA)
Advantages of Two-Way ANOVA over One-Way ANOVA
Important Terminologies Used in Two-Way ANOVA
Factors
Treatment Groups
Main Effect
Interaction Effect
Within-Group Variation
Two-Way ANOVA Model and Hypotheses Testing
Assumptions in Two-Way Analysis of Variance
Situation Where Two-Way ANOVA Can Be Used
Solved Example of Two-Way ANOVA Using SPSS
Computation in Two-Way ANOVA Using SPSS
Model Way of Writing the Results of Two-Way ANOVA and Its Interpretations
Summary of the SPSS Commands for Two-Way ANOVA
Exercise
9 Analysis of Covariance: Increasing Precision in Comparison by Controlling Covariate
Introduction
Introductory Concepts of ANCOVA
Graphical Explanation of Analysis of Covariance
Analysis of Covariance Model
What We Do in Analysis of Covariance?
When to Use ANCOVA
Assumptions in ANCOVA
Efficiency in Using ANCOVA over ANOVA
Solved Example of ANCOVA Using SPSS
Computations in ANCOVA Using SPSS
Model Way of Writing the Results of ANCOVA and Their Interpretations
Summary of the SPSS Commands
Exercise
10 Cluster Analysis: For Segmenting the Population
Introduction
What Is Cluster Analysis?
Terminologies Used in Cluster Analysis
Distance Measure
Clustering Procedure
Standardizing the Variables
Icicle Plots
The Dendrogram
The Proximity Matrix
What We Do in Cluster Analysis
Assumptions in Cluster Analysis
Research Situations for Cluster Analysis Application
Steps in Cluster Analysis
Solved Example of Cluster Analysis Using SPSS
Stage 1
Stage 2
Stage 1: SPSS Commands for Hierarchical Cluster Analysis
Stage 2: SPSS Commands for K-Means Cluster Analysis
Interpretations of Findings
Exercise
11 Application of Factor Analysis: To Study the Factor Structure Among Variables
Introduction
What Is Factor Analysis?
Terminologies Used in Factor Analysis
Principal Component Analysis
Factor Loading
Communality
Eigenvalues
Kaiser Criteria
The Scree Plot
Varimax Rotation
What Do We Do in Factor Analysis?
Assumptions in Factor Analysis
Characteristics of Factor Analysis
Limitations of Factor Analysis
Research Situations for Factor Analysis
Solved Example of Factor Analysis Using SPSS
SPSS Commands for the Factor Analysis
Interpretation of Various Outputs Generated in Factor Analysis
Summary of the SPSS Commands for Factor Analysis
Exercise
12 Application of Discriminant Analysis: For Developing a Classification Model
Introduction
What Is Discriminant Analysis?
Terminologies Used in Discriminant Analysis
Variables in the Analysis
Discriminant Function
Classification Matrix
Stepwise Method of Discriminant Analysis
Power of Discriminating Variables
Box’s M Test
Eigenvalues
The Canonical Correlation
Wilks’ Lambda
What We Do in Discriminant Analysis
Assumptions in Using Discriminant Analysis
Research Situations for Discriminant Analysis
Solved Example of Discriminant Analysis Using SPSS
SPSS Commands for Discriminant Analysis
Interpretation of Various Outputs Generated in Discriminant Analysis
Summary of the SPSS Commands for Discriminant Analysis
Exercise
13 Logistic Regression: Developing a Model for Risk Analysis
Introduction
What Is Logistic Regression?
Important Terminologies in Logistic Regression
Outcome Variable
Natural Logarithms and the Exponent Function
Odds Ratio
Maximum Likelihood
Logit
Logistic Function
Logistic Regression Equation
Judging the Efficiency of the Logistic Model
Understanding Logistic Regression
Graphical Explanation of Logistic Model
Logistic Model with Mathematical Equation
Interpreting the Logistic Function
Assumptions in Logistic Regression
Important Features of Logistic Regression
Research Situations for Logistic Regression
Steps in Logistic Regression
Solved Example of Logistics Analysis Using SPSS
First Step
Second Step
SPSS Commands for the Logistic Regression
Interpretation of Various Outputs Generated in Logistic Regression
Explanation of Odds Ratios
Conclusion
Summary of the SPSS Commands for Logistic Regression
Exercise
14 Multidimensional Scaling for Product Positioning
Introduction
What Is Multidimensional Scaling
Terminologies Used in Multidimensional Scaling
Objects and Subjects
Distances
Similarity vs. Dissimilarity Matrices
Stress
Perceptual Mapping
Dimensions
What We Do in Multidimensional Scaling?
Procedure of Dissimilarity-Based Approach of Multidimensional Scaling
Procedure of Attribute-Based Approach of Multidimensional Scaling
Assumptions in Multidimensional Scaling
Limitations of Multidimensional Scaling
Solved Example of Multidimensional Scaling (Dissimilarity-Based Approach of Multidimensional Scaling) Using SPSS
SPSS Commands for Multidimensional Scaling
Interpretation of Various Outputs Generated in Multidimensional Scaling
Summary of the SPSS Commands for Multidimensional Scaling
Exercise
Appendix: Tables
References and Further Readings
Index