CHAPTER 2: LITERATURE REVIEW
2.5. Parallel Problems in Conservation Biology
I have identified the gap in current digital preservation practice wherein there is
presently no applied file format risk monitoring and analysis method. In my exploration of
methods to best address this gap, I discovered similar problems in the research area of
conservation biology. After further exploration, it was clear that the methods used to monitor,
analyze, and warn against impending species extinction are relevant to my research in file [...]
Biologists and allied researchers have invested enormous time and effort to identify,
categorize, and count the organisms of the Earth. Consequently, many methods for
monitoring species have been created, tested and refined. There is much relevant research in
this area and there are several possibilities to adapt these methods for use in monitoring file
format endangerment.
At the heart of species endangerment monitoring is population viability analysis
(PVA). Doak defines PVA as “the use of quantitative methods to predict the likely future
status of populations of conservation concern and also to predict how best to manage these
populations” (2009, p. 522). While there are many variations of PVA, the method generally
involves the collection of data for specified factors that inform an understanding of a species’
health in the wild. The data is then statistically analyzed, often using analysis and simulation
software, to predict the likelihood of a species’ extinction. The lists of factors vary depending
on the species and have evolved over time as a product of increased understanding of what
kind of data can be collected and what does and does not affect the predictability of species
endangerment.
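The simulation step of a PVA can be illustrated with a minimal Monte Carlo sketch. It is not drawn from any of the works cited here: the growth model (a random multiplicative change each year), the function name estimate_extinction_probability, and every parameter name and value are assumptions made for illustration only.

```python
import math
import random


def estimate_extinction_probability(
    initial_count: float,
    mean_growth: float,
    growth_sd: float,
    years: int,
    quasi_extinction_threshold: float = 1.0,
    runs: int = 2000,
    seed: int = 0,
) -> float:
    """Estimate the share of simulated futures in which a population
    drops below a threshold within the given number of years.

    Assumed model (hypothetical, for illustration): each year the count
    is multiplied by exp(r), where r is drawn from a normal distribution
    with mean `mean_growth` and standard deviation `growth_sd`.
    """
    rng = random.Random(seed)  # fixed seed so results are repeatable
    extinct_runs = 0
    for _ in range(runs):
        count = initial_count
        for _ in range(years):
            count *= math.exp(rng.gauss(mean_growth, growth_sd))
            if count < quasi_extinction_threshold:
                extinct_runs += 1
                break  # this simulated population is lost; stop the run
    return extinct_runs / runs
```

Under these assumptions, the same function could be run with a count of file format instances in place of a population count, which is one way that competing preservation strategies might be compared: each strategy would be expressed as a different growth rate or variability.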
The roots of PVA are found in the work of Shaffer (1981) and Soulé (1985). Shaffer’s
work in minimum population sizes helped shape models for predicting the possibility of
extinction that later became part of contemporary PVA models. Soulé has been instrumental
in shaping and defining the field of conservation biology, the field in which researchers most
commonly use PVA methods.
In their overview of PVA, Gerber and González-Suárez wrote that,
PVA represents one of the most valuable approaches that has emerged from the burgeoning field of conservation biology. While it is impossible to make precise [...]
estimate the relative risk of extinction, and to compare the efficacy of alternate management strategies (2010, Summary, ¶1).
Applying a PVA-type approach to file format endangerment analysis may not be able to
predict an exact date when a file format will become endangered, but it can be used to alert
the community to potential file format endangerment. Additionally, creating PVA-style
simulations may be useful in comparing file format preservation strategies.
In 1995, Paul Angermeier reported on a study in which he examined the attributes of
extinction-prone species of freshwater fish in Virginia. Based on the results of this study,
Angermeier was able to make focused suggestions on the prevention of extirpation and
broader extinction for these fish species. The method Angermeier used involved analyzing
data collected for individual factors of extirpated species. It was through correlation analysis
of these factors between species that he was able to determine the similarities between the
extirpated species. He stated that “because direct observation and experimentation are
generally infeasible, correlative analyses are the primary tools available to study large-scale
extinction processes” (p. 154). Angermeier acknowledged the difficulty in detecting patterns
in systems with complex dynamics. In order to overcome the resulting “statistical noise,” he
performed multiple complementary analyses. Because the “ecosystem” of file formats is also
complex, similar triangulation of methods may be useful in analyzing factors that contribute
to file format endangerment.
The notion of extirpation, or the local extinction of species, is useful in considering
research methods for file format endangerment. By extension, the phrase “file format
extirpation” would mean that a local institution or its regular users could not access [...]
cataclysmic. Rather, it is incremental, with total extinction preceded by local or regional
extinctions” (p. 144). He said that knowledge of local extinction of a species could help to
fuel proactive measures to prevent further extirpation and widespread extinction.
Understanding the causes of local extinction can help in the creation of more specific
solutions to the problem. By extension, understanding localized file format endangerment
can help in creating more effective solutions, both locally and worldwide.
Mace and Lande (1991) provided some guidelines for designing an effective
extinction threat assessment system. They suggested the following six characteristics of an
ideal system:
• The system should be simple. There should only be a few categories for assessing risk; they should have a clear relationship with each other and “should be based around a probabilistic assessment of extinction risk.”
• The categorization system should be flexible about the quality and quantity of data required.
• The system should work with any species.
• The terminology used should be clear.
• The system should include some assessment of uncertainty.
• A timescale should be used for each category of extinction, i.e., number of years until extinction (p. 150).
All six of these characteristics of an ideal extinction threat assessment system could be
relevant to the development of an ideal file format endangerment assessment system.
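One way these characteristics might translate into a file format assessment system is sketched below. The sketch pairs each category with a probability and a timescale and tolerates missing data. The category names, the threshold values, and the names RiskCategory, CATEGORIES, and categorize are placeholders of my own and are not taken from Mace and Lande; the sketch also leaves out the assessment of uncertainty that they recommend.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RiskCategory:
    name: str
    min_probability: float  # minimum probability of losing access
    within_years: int       # timescale over which that probability applies


# Placeholder thresholds (assumed for illustration), ordered from most
# to least severe so that the first match is the most serious category.
CATEGORIES = (
    RiskCategory("critical", 0.50, 5),
    RiskCategory("endangered", 0.20, 20),
    RiskCategory("vulnerable", 0.10, 100),
)


def categorize(probability_by_horizon: Dict[int, float]) -> Optional[str]:
    """Return the most severe category whose threshold is met, else None.

    `probability_by_horizon` maps a time horizon in years to an estimated
    probability of loss within that horizon. Horizons with no estimate
    are skipped, so the function still works with partial data.
    """
    for category in CATEGORIES:
        probability = probability_by_horizon.get(category.within_years)
        if probability is not None and probability >= category.min_probability:
            return category.name
    return None
```

For example, under these placeholder thresholds, categorize({5: 0.1, 20: 0.25}) returns "endangered", because the five-year estimate falls short of the "critical" threshold while the twenty-year estimate meets the "endangered" one.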
O’Grady, Reed, Brook, and Frankham (2004) examined the correlation between
sixteen criteria used in determining species extinction risk. They performed stepwise multiple
regression analysis on each of the factors and found that “population size and percent change
in population size are the best predictors of extinction risk” (p. 519). O’Grady, Reed, Brook
[...] predictors of extinction risk were population size and change in population size. This is
significant in that O’Grady and his colleagues showed that monitoring the population size of a
species, together with its change over time, may be sufficient to predict impending extinction. Once file format endangerment
monitoring factors have been selected and sufficient data has been collected, similar tests of
correlation should be performed to assess effectiveness. If the models are truly similar, it
could mean that monitoring the number of instances of a file format and the change in this
number may be sufficient for effective endangerment prediction.
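A first test of correlation of this kind could be a simple screening step that ranks candidate factors by how strongly each one correlates with an observed risk measure. The sketch below uses Pearson correlation, which is a simpler technique than the stepwise multiple regression used by O’Grady and his colleagues. The factor names and all of the numbers are made up purely for illustration and are not data from any study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sample carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_factors(factors, risk):
    """Sort factors by the absolute strength of their correlation with risk."""
    scored = [(name, pearson(values, risk)) for name, values in factors.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical observations for five file formats (illustrative values only).
factors = {
    "instance_count": [1000, 800, 600, 400, 200],
    "tool_count": [3, 5, 2, 4, 3],
}
risk = [0.1, 0.2, 0.3, 0.4, 0.5]  # hypothetical endangerment scores

ranking = rank_factors(factors, risk)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

A factor that ranks consistently at the top across such tests would be a candidate for the reduced monitoring set, although correlation alone does not establish that the factor causes endangerment.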
2.5.2. Applications in File Format Endangerment Research
The research area of conservation biology provides a strong foundation and a useful
framework on which to base file format endangerment research. The conservation biology
field presents decades of research in methodologies to track and preemptively detect threats
to the continued existence of living species. Examining the frameworks and methods used in
conservation biology has helped me to define the steps that need to be taken to effectively
assess file format endangerment.
First, conservation biology offers tried-and-tested methods for collecting and analyzing
data for monitoring threats in complex systems. In particular, the methods of population
viability analysis have been used, tested, and improved and provide a strong foundation
on which to base file format endangerment analysis. Second, conservation biology presents
a useful framework of threat evaluation and terminology, some of which I have appropriated
for this research. In particular, I am using the term “endangerment” to refer to the possibility
that information encoded in a particular file format will become inaccessible within a certain [...]
Lande’s (1991) recommendations to base threat systems on “probabilistic assessment of risk”
and to include a timescale in each category of risk. Third, the results presented in O’Grady,
Reed, Brook, and Frankham (2004) served as a basis and strong motivator for disambiguating
and reducing the wide array of file format evaluation factors discussed in the literature to
only the most relevant.