Quantitative Methods in the Humanities
and Social Sciences
Series Editors
Thomas DeFanti, Calit2, University of California San Diego, La Jolla, CA, USA
Anthony Grafton, Princeton University, Princeton, NJ, USA
Thomas E. Levy, Calit2, University of California San Diego, La Jolla, CA, USA
Lev Manovich, Graduate Center, Room 4319, The Graduate Center, CUNY, New York, NY, USA
Quantitative Methods in the Humanities and Social Sciences is a book series designed to foster research-based conversation with all parts of the university campus – from buildings of ivy-covered stone to technologically savvy walls of glass. Scholarship from international researchers and the esteemed editorial board represents the far-reaching applications of computational analysis, statistical models, computer-based programs, and other quantitative methods. Methods are integrated in a dialogue that is sensitive to the broader context of humanistic study and social science research. Scholars, including among others historians, archaeologists, new media specialists, classicists and linguists, promote this interdisciplinary approach. These texts teach new methodological approaches for contemporary research. Each volume exposes readers to a particular research method. Researchers and students then benefit from exposure to subtleties of the larger project or corpus of work in which the quantitative methods come to fruition.
Editorial Board:
Thomas DeFanti, University of California, San Diego & University of Illinois at Chicago
Anthony Grafton, Princeton University
Thomas E. Levy, University of California, San Diego
Lev Manovich, The Graduate Center, CUNY
Alyn Rockwood, King Abdullah University of Science and Technology
Publishing Editor for the series at Springer: Laura Briskman,
Pieter M. Kroonenberg
Multivariate Humanities
Pieter M. Kroonenberg Leiden University Leiden, Zuid-Holland The Netherlands
ISSN 2199-0956 ISSN 2199-0964 (electronic) Quantitative Methods in the Humanities and Social Sciences
ISBN 978-3-030-69149-3 ISBN 978-3-030-69150-9 (eBook)
https://doi.org/10.1007/978-3-030-69150-9
© Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
To Ineke
Preface
Why this book?
Rather than stating this myself, I leave it to voices in the humanities to explain the need for a book such as the present one.
Just 15% of students in England study mathematics beyond GCSE level. However, many of this non-mathematics studying majority find that they need mathematical skills for the advanced study of other subjects, including humanities and social science subjects at school or university or in their job. [...] Without mathematical and, in particular, statistical skills whole areas of the social sciences and humanities are inaccessible to research students and future academics. (Canning, 2014, p. vii)
My position is that multivariate analysis is to be thought of as nothing more than the analysis of tables of data. If it is worth putting together a table of data it is worth exploring it by multivariate methods (Wright, 1989, p. 1; quoted in Baxter, 1994).
On the other hand, Thomas (1978, p. 231) sounds a cautionary note in his paper ‘The awful truth about statistics in archaeology’:
There is a rapidly growing clutch of statistically-sophisticated archeologists who seemed perched, rotating factor analyses in hand, prepared to pounce on the first clump of unmanipulated data that should have the misfortune of stumbling into their path.
In writing this book it has been my explicit aim to present a link between the data collected to tackle specific research questions in the humanities, and appropriate multivariate statistical techniques to answer such questions. The central part of the book consists of case studies from different disciplines in the humanities. They are meant to encourage researchers to look differently at their data, and to consider various possibilities for analysis. At the same time it would be wise to take heed of the cautionary remarks in Thomas’s (1978) paper.
Thus, the general idea of the text is to provide guidance for researchers in making informed decisions on which approaches may be useful to answer their research questions using the data at hand. This book is not meant to teach them how to perform all the analysis methods presented here, but instead to show the kind of analyses that are available given the data and the research questions. My hope is that this approach will be useful in practical work. However, to quote Warwick (2004, p. 378), it should be realised that ‘the use of digital (and statistical) resources can only be truly meaningful when combined with old-fashioned critical judgement’, and with expert knowledge, I would like to add.
What is new?
Readers may wonder what new insights can be gained by using multivariate methods. Obviously there is no easy answer, as I am not an expert in all the fields touched upon in this book. Primarily I would be happy if results from applying multivariate methods are not contradictory to what the experts say their data tell them. The power of analysing many data in one go and presenting them in a coherent fashion, especially via well-designed tables and graphs, has given me, as a nonexpert, insights into various fields of endeavour which otherwise I could probably only have acquired via painstaking, detailed time-consuming studies of each field separately. At the same time I hope and expect that quantitative methods, in general, and multivariate techniques, in particular, will put powerful tools in the hands of experts in the humanities so that they can both expand their views and present their research results to colleagues and novices. Moreover, they may discover new facts and anomalies which otherwise might have gone unnoticed.
Apart from this, the book is primarily intended as a demonstration of analytic tools available for use in many fields of the humanities. The case studies are demonstrations of how statistical problems arising from a research project may be tackled. By seeing these tools applied in different disciplines researchers may discover how they may be applied in their own research projects.
How to use this book
The case study chapters all have a similar setup. First, the background and substantive research questions are introduced, as well as the data of the study. Next, the statistical methods used in the chapter are discussed at an elementary level, as well as the reasons for choosing them. Finally, the focal points of the chapters are the results of the analyses, which are presented at a medium statistical level.
To support the analyses in the case studies there are four introductory methodological chapters, so that the emphasis in the case study chapters could be on the application rather than the mathematical side of things. The fourth chapter contains more detailed explanations for readers who want to delve deeper into the statistical side of the methods. Moreover, for a brief explanation of statistical terms a Glossary is provided at the end of the book.
The emphasis in the text is on descriptive and exploratory methods, amply illustrated with many kinds of different graphics, rather than on test-based and mathematical statistical methods. It should be emphasised that the methods used are rather standard and have been found useful in many contexts. The main exceptions are methods for categorical data, and methods in which the data have been measured more than once. At times the same explanations will be given in several chapters to allow for (semi)independent reading, but there will also be many cross-references to other places in the text for additional explanation and information. However, researchers wishing to carry out meaningful analyses of their data should turn to statistical consultants and/or specialists in the various statistical techniques described.
Audience
The level of the exposition is aimed at readers with at least a graduate degree in the humanities and a basic course in methodology and statistics. I am assuming they understand such subjects as elementary descriptive analysis and hypothesis testing. For them I hope to show that many multivariate statistical techniques can be used to make more sense of their data.
On the other hand, I have also tried to make the text attractive for individuals who are primarily interested in the case studies themselves, and want to see what it is all about. I would advise them to read the first chapter, skim (or skip) the next three methodological chapters, and then start with the case studies that interest them, turning only to the methodology if they feel the need. In reading the case studies they should read the background, research questions, and the data descriptions. Then (again) skim or skip the methodological sections, and continue with the content summary at the end of the chapter.
The outcomes of the analyses should not be considered fundamental contributions to the subjects studied. After all I am a statistician, and not an expert on all the areas touched upon in the case studies.
Plagiarism?
Writing books such as this one is a daunting task, because so many methodology books have already been written. Copying texts from other authors who have presented topics so lucidly is a continuous temptation. The following quote from Crowder and Hand (1990, p. 4) was too beautiful and too close to the heart of the matter not to plagiarise: ‘There is very little “original” material here. Our inclination for barefaced plagiarization from as many sources as possible has been tempered only by the need to write a coherent account’.
Another quote which is close to my heart:
The development of statistical ideas and thinking in the 20th century is possibly one of that century’s more important intellectual achievements; the mathematics of statistics can be ugly and sometimes pointless but I think the justification for statistics ultimately lies in applicability to data analysis (...). Not everyone would agree with this. (Baxter, 2008)
Which is duly noted.
Origin
This book has its origins at the Netherlands Institute for Advanced Study in the Humanities and Social Sciences in Wassenaar, The Netherlands, where I stayed during the academic year 2003–2004. A collection of scholars from both the social sciences and the humanities worked and studied there in truly monastic tranquillity. Everyone was expected to give a presentation about their research. Given that my own research dealt with methods for data analysis but that most of the other scholars were working in substantive areas, the idea came to me to show what my kind of experience in analysing data could offer to their fields of endeavour. This resulted in a lecture Multivariate Analysis: Power to the Humanities with examples concerning the dating of graves from the artefacts they contained, searching for the relative time sequence of the works of Plato, and investigating the authorship of a book from the Wizard of Oz series. I presented this lecture on several subsequent occasions to various audiences. The idea grew to expand this lecture into a book. However, I felt that I would then need more different kinds of data. To this end I wrote to authors of humanities papers asking whether I could use their data for the purpose of a book which at the time was still more of a daydream than anything else. Many reacted favourably to my request, and these datasets form the core of the present book. I am extremely grateful to the authors for their willingness to share their data with me. At the outset I decided that I would reanalyse the data, not in competition with the original publications, but as an example of how one could answer research questions. Several times this resulted in a co-authorship, which is acknowledged in several chapters in this book. Nevertheless, I decided to take complete responsibility for the text presented here to ensure uniformity and purpose.
Acknowledgements
Many people have contributed to the realisation of this book, and I would like to acknowledge them, in alphabetical order. As co-authors and/or data providers: Michael Barry, Zachary Bleemer, Leonard Brandwood, Spike Bucklow, Dorien Herremans, Tore Janson, Takashi Murakami, Donald Polzella, Marilena Vecco, and Jeroen Vermunt. As critical readers and/or correctors: José Binongo, Laura Brinkman, Rachel Chrastil, Pieter de Coninck, Hugh Craig, Twyla Gibson, Ruud Halbertsma, Casper de Jonge, Annemarie Kets-Vree, Paul Keyser, Els Koeneman, Jo McDonald, Cory Mckay, Kaarle Nordenstreng, Adriaan Rademaker, Ralph Rippe, Ineke Smit, June Ross, Meg Southwell, Paul Taçon, Harold Tarrant, Holger Thesleff, Paul Vierthaler, Peter White, and Meredith Wilson.
Leiden, The Netherlands Pieter M. Kroonenberg
Global table of contents
Opening
Dedication
Preface
PART 1. The Actors
1. Introduction: Multivariate studies in the Humanities
2. Data inspection: The data are in. Now what?
3. Statistical framework
4. Statistical framework extended
PART 2. The Scenes
Theology / Bible studies
5. Similarity data: Bible translations (co-author: Zachary Bleemer)
6. Stylometry: Authorship of the Pauline Epistles
History & Archeology
7. Economic history: Agriculture development on Java
8. Seriation: Graves in the Münsingen-Rain burial site
Arts
9. Complex response data: Evaluating Marian art (co-author: Donald Polzella)
10. Rating scales: Craquelure and pictorial stylometry (co-author: Spike Bucklow)
11. Pictorial similarity: Rock art images across the world
12. Questionnaires: Public views on deaccessioning (co-author: Marilena Vecco)
Linguistics
13. Stylometry: The Royal Book of Oz: Baum or Thompson?
14. Linguistics: Accentual prose rhythm in mediæval Latin
15. Linguistics: Chronology of Plato’s works
16. Binary judgements: Reading preferences
Music
17. Music appreciation: The Chopin Preludes (co-author: Takashi Murakami)
18. Musical stylometry: Characterisation of music (co-author: Dorien Herremans)
PART 3. The Finale
19. Final Musings
Statistical glossary
References
Subject & Author index
Contents
Part I The Actors
1 Introduction: Multivariate studies in the Humanities . . . 3
1.1 Preliminaries. . . 3
1.1.1 Audience. . . 4
1.1.2 Before you start. . . 4
1.1.3 Multivariate analysis. . . 6
1.1.4 Case studies: Quantification and statistical analysis. . . 7
1.2 The humanities—What are they?. . . 8
1.3 Qualitative and quantitative research in the humanities . . . 9
1.4 Multivariate data analysis . . . 9
1.5 Data: Formats and types . . . 10
1.5.1 Data formats . . . 11
1.5.2 Data characteristics: Measurement levels . . . 11
1.5.3 Characteristics of data types . . . 14
1.5.4 From one data format to another. . . 15
1.6 General structure of the case study chapters. . . 16
1.7 Author references . . . 16
1.8 Wikipedia. . . 17
1.9 Web addresses . . . 17
2 Data inspection: The data are in. Now what?. . . 19
2.1 Background . . . 19
2.1.1 A researcher’s nightmare . . . 20
2.1.2 Getting the data right . . . 22
2.2 Data inspection: Overview. . . 23
2.2.1 The normal distribution . . . 24
2.2.2 Distributions: Individual numeric variables . . . 26
2.2.3 Inspecting several univariate distributions . . . 30
2.2.4 Bivariate inspection . . . 33
2.3 Missing data. . . 36
2.3.1 Unintentionally missing . . . 37
2.3.2 Systematically missing . . . 37
2.3.3 Handling missing data . . . 38
2.4 Outliers . . . 40
2.4.1 Characteristics of outliers . . . 40
2.4.2 Types of outliers . . . 40
2.4.3 Detection of outliers. . . 41
2.4.4 Handling outliers . . . 42
2.5 Testing assumptions of statistical techniques. . . 43
2.5.1 Null hypothesis testing . . . 43
2.5.2 Model testing. . . 43
2.6 Content summary . . . 44
3 Statistical framework . . . 45
3.1 Overview . . . 45
3.2 Data formats. . . 46
3.2.1 Matrices: The basic data format . . . 46
3.2.2 Contingency tables. . . 46
3.2.3 Correlations, covariances, similarities. . . 48
3.2.4 Three-way arrays: Several matrices . . . 48
3.2.5 Meaning of numbers in a matrix. . . 49
3.3 Chapter example. . . 49
3.4 Designs, statistical models, and techniques. . . 50
3.4.1 Data design . . . 50
3.4.2 Model . . . 51
3.5 From questions to statistical techniques . . . 53
3.5.1 Dependence designs versus internal structure designs . . . 54
3.5.2 Analysing variables, objects, or both. . . 55
3.6 Dependence designs: General linear model—GLM . . . 56
3.6.1 The t test . . . 57
3.6.2 Analysis of variance—ANOVA . . . 57
3.6.3 Multiple regression analysis—MRA. . . 58
3.6.4 Discriminant analysis . . . 60
3.6.5 Logistic regression . . . 61
3.6.6 Advanced analysis of variance models. . . 63
3.6.7 Nonlinear multivariate analysis . . . 64
3.7 Internal structure designs: General description . . . 65
3.8 Internal structure designs: Variables. . . 65
3.8.1 Principal component analysis—PCA . . . 66
3.8.2 Categorical principal component analysis—CatPCA . . . 71
3.8.3 Factor analysis—FA . . . 73
3.8.4 Structural equation modelling—SEM. . . 75
3.8.5 Loglinear models . . . 77
3.9 Internal structure designs: Objects, individuals, cases, etc.. . . 79
3.9.1 Similarities and dissimilarities. . . 79
3.9.2 Multidimensional scaling—MDS. . . 80
3.9.3 Cluster analysis . . . 81
3.10 Internal structure designs: Objects and variables. . . 84
3.10.1 Correspondence analysis: Analysis of tables. . . 84
3.10.2 Multiple correspondence analysis . . . 87
3.10.3 Principal component analysis for binary variables. . . 88
3.11 Internal structure designs: Three-way models . . . 88
3.11.1 Three-mode principal component analysis—TMPCA. . . . 89
3.12 Hypothesis testing versus descriptive analysis. . . 91
3.13 Model selection . . . 91
3.14 Model evaluation . . . 93
3.15 Designing tables and graphs . . . 93
3.15.1 How to improve a table . . . 93
3.15.2 Example of table rearrangement: a binary dataset. . . 94
3.15.3 Examples of table rearrangement: contingency tables. . . 94
3.15.4 How to improve graphs . . . 97
3.16 Software. . . 98
3.17 Overview of statistics in the case studies . . . 99
4 Statistical framework extended. . . 103
4.1 Contents and Keywords . . . 103
4.2 Introduction . . . 104
4.3 Analysis of variance designs . . . 104
4.4 Binning . . . 105
4.5 Biplots . . . 106
4.6 Centroids . . . 108
4.7 Contingency tables . . . 108
4.8 Convex hulls . . . 110
4.9 Deviance plots . . . 111
4.10 Discriminant analysis . . . 112
4.11 Distances . . . 113
4.12 Inner products and projection . . . 114
4.13 Joint biplots . . . 115
4.14 Means plot with error bars, line graph, interaction plot . . . 115
4.15 Missing rows and columns . . . 117
4.16 Multiple regression. . . 118
4.17 Multivariate, multiple, multigroup, multiset, and multiway . . . . 120
4.18 Quantification, optimal scaling, and measurement levels . . . 121
4.19 Robustness. . . 123
4.20 Scaling coordinates. . . 124
4.21 Singular value decomposition . . . 125
4.22 Structural equation modelling—SEM . . . 125
4.23 Supplementary points and variables . . . 127
4.24 Three-mode principal component analysis (TMPCA) . . . 127
4.25 X² test (χ² test) . . . 128
Part II The Scenes
5 Similarity data: Bible translations . . . 133
5.1 Background . . . 133
5.2 Research questions: Similarity of translations . . . 134
5.3 Data: English and German Bible translations . . . 135
5.4 Analysis methods . . . 137
5.4.1 Characteristics of multidimensional scaling and cluster analysis . . . 138
5.4.2 Multidimensional scaling . . . 138
5.4.3 Cluster analysis . . . 138
5.5 Bible translations: Statistical analysis. . . 139
5.5.1 Multidimensional scaling . . . 139
5.5.2 Cluster analysis . . . 140
5.6 Other approaches to analysing similarities . . . 140
5.7 Content summary . . . 140
6 Stylometry: Authorship of the Pauline Epistles. . . 143
6.1 Background . . . 143
6.2 Research questions: Authorship . . . 146
6.3 Data: Word frequencies in Pauline Epistles . . . 148
6.4 Analysis methods . . . 149
6.4.1 Choice of analysis method . . . 150
6.4.2 Using correspondence analysis . . . 150
6.5 The Pauline Epistles: Statistical analysis. . . 151
6.5.1 Inspecting Epistle profiles. . . 151
6.5.2 Inertia and dimensional fit . . . 152
6.5.3 Plotting the results . . . 153
6.5.4 Plotting the Epistles profiles . . . 153
6.5.5 Epistles and Word categories: Biplot. . . 155
6.5.6 Methodological summary . . . 156
6.6 Other approaches to authorship studies. . . 157
6.7 Content summary . . . 158
7 Economic history: Agricultural development on Java. . . 161
7.1 Background . . . 161
7.2 Research questions: Historical agricultural data. . . 162
7.3 Data: Agriculture development on Java . . . 163
7.4 Analysis methods . . . 166
7.4.1 Choice of analysis method . . . 166
7.4.2 CatPCA: Characteristics of the method. . . 167
7.5 Agricultural development on Java: Statistical analysis. . . 167
7.5.1 Categorical principal component analysis in a miniature example. . . 168
7.5.2 Main analysis. . . 171
7.5.3 Agricultural history of Java: Further methodological remarks. . . 175
7.6 Other approaches to historical data . . . 175
7.7 Content summary . . . 176
8 Seriation: Graves in the Münsingen-Rain burial site . . . 177
8.1 Background . . . 177
8.2 Research questions: A time line for graves. . . 178
8.3 Data: Grave contents. . . 179
8.4 Analysis methods . . . 180
8.5 Münsingen-Rain graves: Statistical analysis . . . 181
8.5.1 Fashion as an ordering principle . . . 181
8.5.2 Seriation . . . 182
8.5.3 Validation of seriation . . . 183
8.5.4 Other techniques . . . 185
8.6 Other approaches to seriation. . . 185
8.7 Content summary . . . 186
9 Complex response data: Evaluating Marian art . . . 187
9.1 Background . . . 187
9.2 Research questions: Appreciation of Marian art . . . 188
9.3 Data: Appreciation of Marian art across styles and contents . . . 189
9.4 Analysis method. . . 192
9.5 Marian art: Statistical analysis. . . 193
9.5.1 Basic data inspection . . . 193
9.5.2 A miniature example . . . 195
9.5.3 Evaluating differences in means . . . 197
9.5.4 Examining consistency of relations between the response variables. . . 201
9.5.5 Principal component analyses: All painting categories. . . 202
9.5.6 Principal component analysis: Per painting category. . . 205
9.5.7 Scale analysis: Cronbach’s alpha. . . 205
9.5.8 Structure of the questionnaire . . . 206
9.6 Other approaches to complex response data . . . 209
9.7 Content summary . . . 209
10 Rating scales: Craquelure and pictorial stylometry. . . 211
10.1 Background . . . 211
10.2 Research questions: Linking craquelure, paintings, and judges . . . 213
10.3 Data: Craquelure of European paintings. . . 213
10.4 Analysis methods . . . 215
10.5 Craquelure: Statistical analysis. . . 217
10.5.1 Art-historical categories: Scale means . . . 217
10.5.2 Scales, judges, and paintings: Three-mode component analysis. . . 218
10.5.3 Separation of art-historical categories. . . 223
10.6 Other approaches to pictorial stylometry. . . 225
10.7 Content summary . . . 225
11 Pictorial similarity: Rock art images across the world . . . 227
11.1 Background . . . 227
11.2 Research questions: Evaluating Rock Art. . . 229
11.2.1 The Kimberley versus Algerian images . . . 229
11.2.2 The Zimbabwean, Indian, and Algerian images . . . 229
11.2.3 The Kimberley, Arnhem Land, and Pilbara images. . . . 229
11.2.4 General considerations . . . 230
11.3 Data: Characteristics of Barry’s rock art images . . . 230
11.4 Analysis methods . . . 233
11.4.1 Comparison of proportions . . . 233
11.4.2 Principal component analyses for binary variables . . . . 234
11.5 Rock art: Statistical analysis . . . 235
11.5.1 Comparing rock art from Algeria and from the Kimberley . . . 235
11.5.2 Comparing rock art from Zimbabwe, India, and Algeria . . . 238
11.5.3 Comparing rock art images from within Australia . . . . 241
11.5.4 Further analytical considerations . . . 245
11.6 Other approaches to analysing rock art images. . . 246
11.7 Content summary . . . 246
12 Questionnaires: Public views on deaccessioning . . . 249
12.1 Background . . . 249
12.2 Research questions: Public views on deaccessioning. . . 251
12.3 Data: Public views about deaccessioning . . . 252
12.3.1 Questionnaire respondents. . . 252
12.3.2 Questionnaire structure. . . 252
12.3.3 Type of data design . . . 252
12.4 Analysis methods . . . 254
12.5 Public views on deaccessioning: Statistical analysis . . . 254
12.5.1 Item distributions. . . 254
12.5.2 Item means . . . 254
12.5.3 Item correlations . . . 256
12.5.4 Measurement models: Preliminaries. . . 258
12.5.5 Measurement models: Confirmatory factor analysis. . . 260
12.5.6 Measurement models: Deaccessioning data . . . 261
12.5.7 Item loadings. . . 263
12.5.8 Interpretation . . . 264
12.6 Other approaches in deaccessioning studies . . . 266
12.7 Content summary . . . 266
13 Stylometry: The Royal Book of Oz - Baum or Thompson? . . . 269
13.1 Background . . . 269
13.2 Research questions: Competitive authorship . . . 270
13.3 Data: Occurrence of function words. . . 271
13.3.1 Preprocessing. . . 271
13.3.2 Dataset . . . 272
13.4 Analysis methods . . . 273
13.4.1 Significance testing. . . 273
13.4.2 Distributions and graphics. . . 273
13.4.3 Principal component analysis and graphics. . . 275
13.4.4 Cluster analysis . . . 275
13.5 Wizard of Oz: Statistical analyses . . . 276
13.5.1 Principal component analysis . . . 276
13.5.2 Cluster analysis . . . 280
13.6 Other approaches in authorship studies. . . 281
13.7 Content summary . . . 282
14 Linguistics: Accentual prose rhythm in mediæval Latin. . . 285
14.1 Background . . . 285
14.2 Research questions: Accentual prose rhythm in mediæval Latin . . . 287
14.3 Data: Janson’s data tables. . . 288
14.4 Analysis methods . . . 288
14.4.1 Contingency tables. . . 289
14.4.2 Ordinal principal component analysis . . . 289
14.5 Accentual prose rhythm: Statistical analysis . . . 290
14.5.1 Internal structure of individual authors’ cadences. . . 291
14.5.2 Similarities in accentual prose rhythm . . . 297
14.6 Content summary . . . 301
15 Linguistics: Chronology of Plato’s works . . . 303
15.1 Background . . . 303
15.2 Research questions: Plato’s chronology . . . 304
15.3 Data: Kaluscha’s clausulae data. . . 305
15.4 Analysis methods . . . 305
15.5 Plato’s chronology: Statistical analysis. . . 306
15.5.1 Text similarities . . . 306
15.5.2 Clausulae and texts . . . 307
15.6 Other approaches to Plato’s chronology. . . 309
15.7 Content summary . . . 310
16 Binary judgments: Reading preferences . . . 313
16.1 Background . . . 313
16.2 Research questions: Binary variables . . . 314
16.3 Data: Reading preferences. . . 314
16.4 Analysis methods . . . 314
16.4.1 Loglinear modelling. . . 315
16.4.2 Multiple correspondence analysis . . . 315
16.4.3 Supplementary variables. . . 316
16.5 Reading preferences: Statistical analysis. . . 317
16.5.1 Co-occurrence of quality and popular reading . . . 318
16.5.2 Complexity of the relations. . . 318
16.5.3 Multiple correspondence analysis . . . 320
16.5.4 Supplementary background variables. . . 321
16.6 Other approaches to binary judgments . . . 322
16.7 Content summary . . . 324
17 Music appreciation: The Chopin Preludes . . . 327
17.1 Background . . . 327
17.2 Research questions: Appreciation and musical knowledge. . . 328
17.3 Data: Semantic differential scales. . . 329
17.3.1 Musical database: Semantic differential scales . . . 329
17.3.2 Design and data collection . . . 330
17.3.3 Data format: Students, Preludes, and Scales. . . 331
17.4 Analysis methods . . . 333
17.4.1 Two-way multivariate analysis of variance. . . 334
17.4.2 Tucker’s three-mode model. . . 334
17.4.3 Joint biplots. . . 336
17.5 The Chopin preludes: Statistical analysis . . . 336
17.5.1 Individual differences. . . 336
17.5.2 Three-mode principal component analysis—TMPCA. . . . 338
17.5.3 Scale components. . . 338
17.5.4 Prelude components . . . 339
17.5.5 Joint biplot—Consensus. . . 341
17.5.6 Preludes as characterised by the scales . . . 341
17.5.7 Circle of fifths: Judgments and keys . . . 341
17.5.8 Joint biplot—Individual differences. . . 343
17.6 Other approaches to evaluating music appreciation . . . 344
17.7 Content summary . . . 344
18 Musical stylometry: Characterisation of music. . . 347
18.1 Background . . . 347
18.2 Research questions: Differences in musical style. . . 348
18.3 Data: Melodic intervals and pitch . . . 349
18.3.1 Musical database . . . 349
18.3.2 Musical styles: Features . . . 349
18.3.3 Design. . . 351
18.4 Analysis methods . . . 351
18.4.1 Binary logistic regression . . . 351
18.4.2 Multinomial logistic regression . . . 353
18.4.3 Discriminant analysis . . . 354
18.5 Characterisation of music: Statistical analysis . . . 355
18.5.1 Data description. . . 355
18.5.2 Data inspection . . . 355
18.5.3 Preliminaries for the analyses . . . 360
18.5.4 Predicting Bach versus Haydn + Beethoven. . . 363
18.5.5 Genre: Heterogeneity of the Bach pieces. . . 365
18.5.6 Genre: Discriminating between Bach pieces. . . 367
18.6 Other approaches to analysing musical styles . . . 369
18.7 Content summary . . . 370
Part III The Finale
19 Final Musings . . . 373
Appendix A: Discipline-orientated statistics books. . . 375
Statistical Glossary. . . 377
References. . . 405
Index . . . 417