THE CHALLENGES OF BIG DATA
The AHRC has an important leadership role in supporting the development of the new or enhanced skills and competencies in order to fully exploit the potential of ‘big data’ across the full range of arts and humanities disciplines.
BIG DATA
More data is being generated than ever before. According to a recent analysis by CISCO, 100 gigabytes of data were transmitted across the internet every day in 1992; by 2012, this had grown to 12,000 gigabytes a second. CISCO estimates that this data will treble by 2017, with the equivalent of all the films ever made crossing the global internet every four minutes.
The generation of such data poses great technical challenges, but it also presents the possibility of using analytic techniques which allow human society and behaviour to be investigated in greater detail than ever before. Commercial companies have been quick to seize the marketing opportunities offered by this data, but it also offers exciting new possibilities for researchers in many academic disciplines.
It is commonly assumed that the most challenging big data problems are presented by scientific projects such as the Large Hadron Collider. However, CISCO points out that the bulk of traffic on the internet consists of video. Family history is one of the most popular pursuits on the internet, and some of the largest servers on the planet contain historical records.
It is becoming increasingly clear that there are important, distinctive and creative contributions that the arts and humanities can make to the development of approaches to the use of such ‘big data’, for example in terms of developing new types of visualisation and representation, exploring different contexts in which it might be used or inspiring creative ways to engage with data users. It is also clear that there are immense opportunities for transformative research in the arts and humanities offered by developments in the capacity to develop, exploit and re-use very large and complex datasets and to link together large and varied forms of data in increasingly sophisticated ways.
The challenges of big data
However, for researchers to engage meaningfully with information on such a large scale, it is necessary to develop new tools and methods or to draw on innovative existing ones. In approaching such data, researchers may need to develop new methods by, for example, making greater use of visualisation and quantitative techniques and adopting a ‘data-driven’ approach, which looks for anomalous patterns in data. As researchers deal with ever larger quantities of data, the need for a powerful and flexible computing infrastructure becomes more pressing than ever.
Big data approaches can also create new cross-disciplinary and international collaborative research opportunities and stimulate novel boundary-crossing research across different forms of physical and virtual materials, spanning wide geographical and temporal scales and combining data from different sources.
The AHRC and big data
There are challenges which arise from working with any kind of data, such as privacy and trust as well as intellectual property and copyright. Understanding and contextualising big data is also important, and issues of complexity can be as challenging as those of scale. In this regard there are close connections to the AHRC’s Digital Transformations theme, which is exploring such issues in the broader context of developments in digital technologies and approaches.
The AHRC has an important leadership role in supporting the development of the new or enhanced skills and competencies that will be needed in, for example, innovative technologies and data analysis, in order to fully exploit the potential of ‘big data’ across the full range of arts and humanities disciplines.
“These projects illustrate how the arts and humanities can help exploit the opportunities offered by these vast data resources. They cover an amazing range of subject areas, from classical history and more efficient retrieval of information about music to the use of online gambling data for more accurate political analysis. By developing better tools for the visualisation and analysis of data, these projects will have significant impact beyond the arts and humanities and will assist the UK in grasping the economic and social opportunities offered by big data.”
Professor Andrew Prescott, Theme Leadership Fellow, Digital Transformations
Big data potentially connects to a number of AHRC’s wider priority areas, including the creative and cultural economy, heritage and its leadership of the cross-Council Connected Communities Programme. As a part of this we are keen to explore the potential to deepen and expand the way in which the people, skills and research we support in relation to big data can contribute to creativity and innovation, and to community and public life, to bring cultural, intellectual and economic benefits to the UK and the wider world. Effective use of big data has the potential to drive a step-change in the way the creative and cultural economy engages with data which could deliver significant direct benefits for the sector as well as supporting advances in arts and humanities research.
AHRC-funded projects
In February of this year the Minister for Universities and Science, David Willetts MP, announced funding of £4.6 million for 21 Digital Transformations in the Arts and Humanities projects as part of the AHRC’s investment in Big Data.
The twenty-one new research projects will address the challenges of working with big data and of making the information more accessible and easier for a lay audience to interpret. As part of the Digital Transformations in the Arts and Humanities theme, each of these research projects will produce a tangible asset that endures beyond the life of the project. These will include open source tools to analyse election poll data and an online teaching resource holding a collection of pronunciations of words by speakers of different varieties of English.
The projects include:
• Frameworks and tools for statistical big data in the humanities led by Dr Humphrey Southall at the University of Portsmouth
• What are the odds? Capturing and exploring data created by online political gambling markets led by Dr Matthew Wall at Swansea University
• Standards for Networking Ancient Prosopographies: Data and Relations in Greco-Roman Names led by Dr Gabriel Bodard at King’s College London
• The Secret Life of a Weather Datum led by Dr Jo Bates at the University of Sheffield
• Optical Music Recognition from Multiple Sources led by Dr Alan Marsden at Lancaster University
• DEEP FILM Access Project led by Dr Sarah Atkinson at the University of Brighton
• Visualising European Crime Fiction: New Digital Tools and Approaches to the Study of Transnational Popular Culture led by Dr Dominique Jeannerod at the Queen’s University of Belfast
• Understanding the annotation process: annotation for Big data by
• A Big Data History of Music led by Dr Stephen Rose at Royal Holloway, University of London
• A Pilot Historical Thesaurus of Scots led by Dr Susan Rennie at the University of Glasgow
• Big Data for Law led by Mr John Sheridan at The National Archives
• Lost Visions: retrieving the visual element of printed books from the nineteenth century led by Professor Julia Thomas at Cardiff University
• Traces through Time: Prosopography in practice across Big Data led by Dr Sonia Ranade at The National Archives
• Digital Music Lab - Analysing Big Music Data led by Dr Tillman Weyde at City University London
• Semantic Annotation and Mark Up for Enhancing Lexical Searches led by Dr Marc Alexander at the University of Glasgow
• Palimpsest: an Edinburgh Literary Cityscape led by Professor James Loxley at the University of Edinburgh
• Mining the History of Medicine led by Professor Sophia Ananiadou at The University of Manchester
• Proteus: Capturing the Big Data Problem of Ancient Literary Fragments led by Dr Dirk Obbink at the University of Oxford
• Seeing Data: are good big data visualisations possible? led by Dr Helen Kennedy at the University of Leeds
• Big UK Domain Data for the Arts and Humanities (BUDDAH) led by Dr Jane Winters at the University of London
• Dynamic dialects: integrating articulatory video to reveal the complexity of speech led by Professor Jane Stuart Smith at the University of Glasgow
The projects have been funded under the £4.6 million ‘Digital Transformations in the Arts and Humanities: Big Data Research’, funded by the Arts and Humanities Research Council with support from the Economic and Social Research Council.
For further information, please go to: www.ahrc.ac.uk
“Getting quality data out of the hands of a few and into the public domain is an important goal for this Government. This funding will help to overcome the challenge of making vast amounts of rich data more accessible and easier to interpret by the public. These 21 projects promise to come up with innovative long-lasting solutions.”
Rt Hon David Willetts MP
Published by Arts and Humanities Research Council
Polaris House, North Star Avenue, Swindon, Wiltshire, SN2 1FL www.ahrc.ac.uk
©Arts and Humanities Research Council 2014. Published July 2014 Design by Rumba. Printed by JRS on paper containing