1.3 Foundations in Information Science
1.3.4 Information Retrieval in Libraries
Until recently, the typical information retrieval system in a library was the Online Public Access Catalog (OPAC). The first OPACs either attempted to emulate the familiar card catalog in its new online form or adopted the model of commercial search services that was familiar to online database searchers (Hildreth, 1995). They were soon replaced by so-called second-generation OPACs that combined these two approaches and enhanced the search possibilities significantly, but at the same time increased the complexity for the user.
Borgman (1996) performed several studies on the problems of OPAC users between 1986 and 1996. She concludes that few improvements, if any, were made to the usability of catalog systems in this interval. Users still needed assistance to translate their questions into a structured query that can be interpreted by the retrieval system. The catalogs were mainly usable by expert librarians, not by the typical library user. The system design should follow the users’ search behavior, not the other way round.
Regarding the use of KOSs for information retrieval, the discrepancy between expert searchers and typical library users is evident. Fidel (1991a, 1991b, 1991c) examined in detail the search behavior of professional searchers and found that they relied heavily on thesauri: they “consulted them for 75% of the search keys they selected” (Fidel, 1991a, p. 512). In contrast, studies of traditional OPACs showed that subject search was often not successful or satisfying, mainly because only few users could take advantage of the controlled subject headings that were available in the library catalog (Sridhar, 2004; Yu & Young, 2004). Greenberg (2004) found in an admittedly limited study with typical library users (educationally advanced students pursuing MBA degrees) that the users’ comprehension of thesauri is extremely limited and that, given a basic thesaurus introduction, users indicate a desire to use thesauri. These studies suggest that the use of KOSs for information retrieval has to be more intuitive for the users.
With the widespread use of Internet search engines, a new component came into play: users were now not only inexperienced with OPACs, they also had expectations toward an OPAC that resulted from their experiences with search engines like Google. Yu and Young (2004) describe this development in depth and suggest that OPACs have to implement search engine features like natural language search with keywords, relevance feedback, spelling corrections, and relevance-ranked output. Similar statements are made by Campbell and Fast (2004), who see a huge potential for innovation in the complementary relationship between catalogs and search engines.
Search engines not only have an impact on usability expectations; today they are an inherent part of any information search. According to Rosa (2006, p. 1-7), “89% of college student information searches begin with a search engine.” But what are the differences between OPACs and search engines? Eversberg (2002, p. 122) stated that “catalogs and search engines are juxtaposed in a pears vs. apples comparison.”18 However, he also admits that “there are, however, widening ‘grey’ areas: Genuine Internet resources are being cataloged to enrich catalogs. And search engines index files that contain book reviews, abstracts, whole chapters, descriptions, etc.”
Since 2002, the transitions between catalogs and search engines have become increasingly smooth. In 2004, Google launched two new services: Google Scholar19 and Google Books, formerly known as Google Print.20 Both provide access to documents that until then had been available only via library catalogs. There are visions of digital libraries in which the user can search and browse the whole inventory and access all documents (and audio files, movies, ...) with one click at any time and any place in the world, for instance the Open Library project21 of the Internet Archive.22
On the side of the OPACs, the transition towards search engines is highly visible, too. In 2007, the Mannheim University Library introduced Primo, a commercial solution by Ex Libris,23 that integrates various sources of bibliographic data: not only data about the books that are physically available at the library, but also data about single articles in subscribed journals and huge amounts of data for articles and e-books that are available to the library users through other channels. The interface is familiar and intuitive for search engine users, and the result lists can easily be sorted and filtered by various aspects (drill down, faceted search), which, in turn, is a feature that Google just recently added to its standard search interface. Other commercial vendors of library solutions offer products similar to Primo. They are commonly referred to as Resource Discovery Systems and are employed in an increasing number of libraries worldwide.
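The drill-down idea mentioned above can be illustrated with a minimal sketch. It is not how Primo or any other Resource Discovery System is implemented; the records, the field names, and the helper functions facet_counts and drill_down are hypothetical and serve only to show the principle: the system shows how many results fall under each facet value, and the user narrows the result list by selecting values.

```python
from collections import Counter

# Hypothetical result list; titles, fields, and values are illustrative only.
RECORDS = [
    {"title": "Record A", "format": "book", "language": "en", "year": 2004},
    {"title": "Record B", "format": "article", "language": "en", "year": 2004},
    {"title": "Record C", "format": "e-book", "language": "de", "year": 2007},
    {"title": "Record D", "format": "article", "language": "de", "year": 2006},
    {"title": "Record E", "format": "book", "language": "en", "year": 1996},
]


def facet_counts(records, field):
    """Count how many records carry each value of the given facet field."""
    return Counter(record[field] for record in records)


def drill_down(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]


if __name__ == "__main__":
    # Facet overview for the full result list.
    print(dict(facet_counts(RECORDS, "format")))

    # The user selects the facet value format=article ...
    articles = drill_down(RECORDS, format="article")
    # ... and sees the remaining language facet for the narrowed list.
    print(dict(facet_counts(articles, "language")))

    # A second selection narrows the list further.
    print([r["title"] for r in drill_down(articles, language="en")])
```

Each selection is applied to the already narrowed list, so the facet counts always describe the results that are currently visible. Production systems typically compute such counts inside the search index rather than in application code.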
Another solution worth mentioning is the library resource portal VuFind,24 which is developed and maintained by Villanova University, PA, USA, and provided free of charge under an open source license. Regarding the features, VuFind is sim-
18 Quoted from the English version of the author, available at http://www.allegro-c.de/formate/tlcse.htm
19 http://scholar.google.com/
20 http://books.google.com/
21 http://openlibrary.org/
22 http://www.archive.org/
23 http://www.exlibrisgroup.com
24 http://vufind.org/