Quantitative Methods in the Humanities and Social Sciences

Series Editors

Thomas DeFanti, Calit2, University of California San Diego, La Jolla, CA, USA

Anthony Grafton, Princeton University, Princeton, NJ, USA

Thomas E. Levy, Calit2, University of California San Diego, La Jolla, CA, USA

Lev Manovich, Graduate Center, Room 4319, The Graduate Center, CUNY, New York, NY, USA

Quantitative Methods in the Humanities and Social Sciences is a book series designed to foster research-based conversation with all parts of the university campus – from buildings of ivy-covered stone to technologically savvy walls of glass. Scholarship from international researchers and the esteemed editorial board represents the far-reaching applications of computational analysis, statistical models, computer-based programs, and other quantitative methods. Methods are integrated in a dialogue that is sensitive to the broader context of humanistic study and social science research. Scholars, including among others historians, archaeologists, new media specialists, classicists and linguists, promote this interdisciplinary approach. These texts teach new methodological approaches for contemporary research. Each volume exposes readers to a particular research method. Researchers and students then benefit from exposure to subtleties of the larger project or corpus of work in which the quantitative methods come to fruition.

Editorial Board:

Thomas DeFanti, University of California, San Diego & University of Illinois at Chicago

Anthony Grafton, Princeton University

Thomas E. Levy, University of California, San Diego

Lev Manovich, The Graduate Center, CUNY

Alyn Rockwood, King Abdullah University of Science and Technology

Publishing Editor for the series at Springer: Laura Briskman

Pieter M. Kroonenberg

Multivariate Humanities

Pieter M. Kroonenberg
Leiden University
Leiden, Zuid-Holland
The Netherlands

ISSN 2199-0956 ISSN 2199-0964 (electronic)

Quantitative Methods in the Humanities and Social Sciences

ISBN 978-3-030-69149-3 ISBN 978-3-030-69150-9 (eBook)

https://doi.org/10.1007/978-3-030-69150-9

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

To Ineke

Preface

Why this book?

Rather than stating this myself, I leave it to voices in the humanities to explain the need for a book such as the present one.

Just 15% of students in England study mathematics beyond GCSE level. However, many of this non-mathematics studying majority find that they need mathematical skills for the advanced study of other subjects, including humanities and social science subjects at school or university or in their job. [...] Without mathematical and, in particular, statistical skills whole areas of the social sciences and humanities are inaccessible to research students and future academics. (Canning, 2014, p. vii)

My position is that multivariate analysis is to be thought of as nothing more than the analysis of tables of data. If it is worth putting together a table of data it is worth exploring it by multivariate methods (Wright, 1989, p. 1; quoted in Baxter, 1994).

On the other hand, Thomas (1978, p. 231) sounds a cautionary note in his paper ‘The awful truth about statistics in archaeology’:

There is a rapidly growing clutch of statistically-sophisticated archeologists who seemed perched, rotating factor analyses in hand, prepared to pounce on the first clump of unmanipulated data that should have the misfortune of stumbling into their path.

In writing this book it has been my explicit aim to present a link between the data collected to tackle specific research questions in the humanities, and appropriate multivariate statistical techniques to answer such questions. The central part of the book consists of case studies from different disciplines in the humanities. They are meant to encourage researchers to look differently at their data, and to consider various possibilities for analysis. At the same time it would be wise to take heed of the cautionary remarks in Thomas’s (1978) paper.

Thus, the general idea of the text is to provide guidance for researchers in making informed decisions on which approaches may be useful to answer their research questions using the data at hand. This book is not meant to teach them how to perform all the analysis methods presented here, but instead to show the kind of analyses that are available given the data and the research questions. My hope is that this approach will be useful in practical work. However, to quote Warwick (2004, p. 378), it should be realised that ‘the use of digital (and statistical) resources can only be truly meaningful when combined with old-fashioned critical judgement’, and with expert knowledge, I would like to add.

What is new?

Readers may wonder what new insights can be gained by using multivariate methods. Obviously there is no easy answer, as I am not an expert in all the fields touched upon in this book. Primarily I would be happy if results from applying multivariate methods do not contradict what the experts say their data tell them. The power of analysing many data in one go and presenting them in a coherent fashion, especially via well-designed tables and graphs, has given me, as a non-expert, insights into various fields of endeavour which otherwise I could probably only have acquired via painstaking, detailed, time-consuming studies of each field separately. At the same time I hope and expect that quantitative methods in general, and multivariate techniques in particular, will put powerful tools in the hands of experts in the humanities so that they can both expand their views and present their research results to colleagues and novices. Moreover, they may discover new facts and anomalies which otherwise might have gone unnoticed.

Apart from this, the book is primarily intended as a demonstration of analytic tools available for use in many fields of the humanities. The case studies are demonstrations of how statistical problems arising from a research project may be tackled. By seeing these tools applied in different disciplines researchers may discover how they may be applied in their own research projects.

How to use this book

The case study chapters all have a similar setup. First, the background and substantive research questions are introduced, as well as the data of the study. Next, the statistical methods used in the chapter are discussed at an elementary level, as well as the reasons for choosing them. Finally, the focal points of the chapters are the results of the analyses, which are presented at a medium statistical level.

To support the analyses in the case studies there are four introductory methodological chapters, so that the emphasis in the case study chapters could be on the application rather than the mathematical side of things. The fourth chapter contains more detailed explanations for readers who want to delve deeper into the statistical side of the methods. Moreover, for a brief explanation of statistical terms a Glossary is provided at the end of the book.

The emphasis in the text is on descriptive and exploratory methods, amply illustrated with many different kinds of graphics, rather than on test-based and mathematical statistical methods. It should be emphasised that the methods used are rather standard and have been found useful in many contexts. The main exceptions are methods for categorical data, and methods in which the data have been measured more than once. At times the same explanations will be given in several chapters to allow for (semi)independent reading, but there will also be many cross-references to other places in the text for additional explanation and information. However, researchers wishing to carry out meaningful analyses of their data should turn to statistical consultants and/or specialists in the various statistical techniques described.

Audience

The level of the exposition is aimed at readers with at least a graduate degree in the humanities and a basic course in methodology and statistics. I am assuming they understand such subjects as elementary descriptive analysis and hypothesis testing. For them I hope to show that many multivariate statistical techniques can be used to make more sense of their data.

On the other hand, I have also tried to make the text attractive for individuals who are primarily interested in the case studies themselves, and want to see what it is all about. I would advise them to read the first chapter, skim (or skip) the next three methodological chapters, and then start with the case studies that interest them, turning only to the methodology if they feel the need. In reading the case studies they should read the background, research questions, and the data descriptions. Then (again) skim or skip the methodological sections, and continue with the content summary at the end of the chapter.

The outcomes of the analyses should not be considered fundamental contributions to the subjects studied. After all I am a statistician, and not an expert on all the areas touched upon in the case studies.

Plagiarism?

Writing books such as this one is a daunting task, because so many methodology books have already been written. Copying texts from other authors who have presented topics so lucidly is a continuous temptation. The following quote from Crowder and Hand (1990, p. 4) was too beautiful and too close to the heart of the matter not to plagiarise: ‘There is very little “original” material here. Our inclination for barefaced plagiarization from as many sources as possible has been tempered only by the need to write a coherent account’.

Another quote which is close to my heart:

The development of statistical ideas and thinking in the 20th century is possibly one of that century’s more important intellectual achievements; the mathematics of statistics can be ugly and sometimes pointless but I think the justification for statistics ultimately lies in applicability to data analysis (...). Not everyone would agree with this. (Baxter, 2008)

Which is duly noted.

Origin

This book has its origins at the Netherlands Institute for Advanced Study in the Humanities and Social Sciences in Wassenaar, The Netherlands, where I stayed during the academic year 2003–2004. A collection of scholars from both the social sciences and the humanities worked and studied there in truly monastic tranquillity. Everyone was expected to give a presentation about their research. Given that my own research dealt with methods for data analysis but that most of the other scholars were working in substantive areas, the idea came to me to show what my kind of experience in analysing data could offer to their fields of endeavour. This resulted in a lecture, ‘Multivariate Analysis: Power to the Humanities’, with examples concerning the dating of graves from the artefacts they contained, searching for the relative time sequence of the works of Plato, and investigating the authorship of a book from the Wizard of Oz series. I presented this lecture on several subsequent occasions to various audiences. The idea grew to expand this lecture into a book. However, I felt that I would then need more, and more varied, kinds of data. To this end I wrote to authors of humanities papers asking whether I could use their data for the purpose of a book which at the time was still more of a daydream than anything else. Many reacted favourably to my request, and these datasets form the core of the present book. I am extremely grateful to the authors for their willingness to share their data with me. At the outset I decided that I would reanalyse the data, not in competition with the original publications, but as an example of how one could answer research questions. Several times this resulted in a co-authorship, which is acknowledged in several chapters in this book. Nevertheless, I decided to take complete responsibility for the text presented here to ensure uniformity and purpose.

Acknowledgements

Many people have contributed to the realisation of this book, and I would like to acknowledge them, in alphabetical order. As co-authors and/or data providers: Michael Barry, Zachary Bleemer, Leonard Brandwood, Spike Bucklow, Dorien Herremans, Tore Janson, Takashi Murakami, Donald Polzella, Marilena Vecco, and Jeroen Vermunt. As critical readers and/or correctors: José Binongo, Laura Brinkman, Rachel Chrastil, Pieter de Coninck, Hugh Craig, Twyla Gibson, Ruud Halbertsma, Casper de Jonge, Annemarie Kets-Vree, Paul Keyser, Els Koeneman, Jo McDonald, Cory Mckay, Kaarle Nordenstreng, Adriaan Rademaker, Ralph Rippe, Ineke Smit, June Ross, Meg Southwell, Paul Taçon, Harold Tarrant, Holger Thesleff, Paul Vierthaler, Peter White, and Meredith Wilson.

Leiden, The Netherlands

Pieter M. Kroonenberg

Global table of contents

Opening

Dedication
Preface

PART 1. The Actors

1. Introduction: Multivariate studies in the Humanities
2. Data inspection: The data are in. Now what?
3. Statistical framework
4. Statistical framework extended

PART 2. The Scenes

Theology / Bible studies

5. Similarity data: Bible translations (co-author: Zachary Bleemer)
6. Stylometry: Authorship of the Pauline Epistles

History & Archeology

7. Economic history: Agriculture development on Java
8. Seriation: Graves in the Münsingen-Rain burial site

Arts

9. Complex response data: Evaluating Marian art (co-author: Donald Polzella)
10. Rating scales: Craquelure and pictorial stylometry (co-author: Spike Bucklow)
11. Pictorial similarity: Rock art images across the world
12. Questionnaires: Public views on deaccessioning (co-author: Marilena Vecco)

Linguistics

13. Stylometry: The Royal Book of Oz: Baum or Thompson?
14. Linguistics: Accentual prose rhythm in mediæval Latin
15. Linguistics: Chronology of Plato’s works
16. Binary judgements: Reading preferences

Music

17. Music appreciation: The Chopin Preludes (co-author: Takashi Murakami)
18. Musical stylometry: Characterisation of music (co-author: Dorien Herremans)

PART 3. The Finale

19. Final Musings

Statistical glossary
References
Subject & Author index
