• No results found

An Experimental Framework for Comparative Digital Library Evaluation: The Logging Scheme

N/A
N/A
Protected

Academic year: 2021

Share "An Experimental Framework for Comparative Digital Library Evaluation: The Logging Scheme"

Copied!
10
0
0

Loading.... (view fulltext now)

Full text

(1)

An Experimental Framework for Comparative Digital

Library Evaluation: The Logging Scheme

Claus-Peter Klas

Norbert Fuhr

Sascha Kriewel

University of Duisburg-Essen

Hanne Albrechtsen

Institute of Knowledge Sharing

Laszlo Kovacs

Andras Micsik

SZTAKI

Preben Hansen

Swedish Institute of Computer

Science

Giannis Tsakonas

Sarantos Kapidakis

Christos Papatheodorou

Ionian University

Elin Jacob

Indiana University Bloomington

ABSTRACT

Evaluation of digital libraries assesses their effectiveness, quality and overall impact. To facilitate the comparison of different evaluations and for supporting the re-use of evalu-ation data, we are proposing a new logging schema that will account for all kinds of data about users, systems and the user-system interactions. We present a novel, multi-level logging framework that will provide complete coverage of the different aspects of DL usage. The main focus of this paper is the concept level, for which we describe the logging scheme in some detail. Based on this specification, we show how the various DL stakeholders can analyze the logging data according to their specific interests. In addition, spe-cific analysis tools and a freely accessable log data repository will yield synergies and sustainablity in DL evaluation.

Categories and Subject Descriptors

H.3.7 [Information Storage and Retrieval]: Digital Li-braries—standards, user issues

General Terms

digital library standard evaluation logging

1.

INTRODUCTION

Evaluation of digital libraries (DLs) aims at assessing their effectiveness, quality and overall impact. Analysis of trans-action logs is one evaluation method that has provided DL stakeholders with substantial input for making managerial decisions and establishing priorities, as well as indicating the need for system enhancements. However, the quantita-tive nature of this method is often criticized for its inabil-ity to provide in-depth information about user interactions

with the DL being evaluated. Because the results of a log-ging study are often localized and not easily interpretable outside the DL being investigated, recommendations arising from such an analysis are not easily generalizable to other DLs. The problem of generalizability is compounded by the absence of a standardized logging scheme that could map across the various logging formats being used, thus presenting them in a commonly understandable language. The development of such a scheme would facilitate compar-isons across DL evaluation activities and provide the means for highlighting critical events in user behaviour and system performance.

The logging scheme proposed in this paper is framed within the DL evaluation activities of the DELOS Network of Ex-cellence. In our whitepaper on DL evaluation [8], we have identified the long-term goal of building a community for DL evaluation. Under the umbrella of an experimental frame-work that will serve as a joint theoretical and practical plat-form for the evaluation of DLs, the proposed logging scheme will allow for the meaningful interpretation and comparison of DL transactions. By using this scheme, evaluation re-searchers will be able to extract re-usable data from the results of previous DL evaluations and to identify useful benchmarks, thereby facilitating more efficient and effective design of evaluation studies.

In order to support the re-use of all possible evaluation data, we intend that the proposed scheme will account for all man-ner of data that can be collected from the user, the system and the user-system interaction. To this end, we are propos-ing a novel, multi-level loggpropos-ing framework that will provide complete coverage of the different aspects of DL usage. The main focus of this paper is the user behaviour level, for which we describe the logging scheme in some detail. Based on this specification, we show how the various DL stakeholders can analyze the logging data according to their specific interests. We also describe some tools supporting this task.

The remainder of this paper is structured as follows: A brief survey of related work on DL logging is given in the Section 2. The levels of the proposed logging scheme are presented

(2)

in Section 3, while Section 4 addresses the events that com-prise the conceptual aspects of the user’s actions. Section 5 presents an application of the logging scheme in the Daf-fodil DL and describes the technical side of the scheme.

The views of the various DL stakeholders on logging data are discussed in Section 6. In conclusion, Section 7 sum-marizes the arguments presented in this paper and outlines future work with logging schemes.

2.

RELATED WORK

In DL research, there is a tendency to identify patterns of information searching behaviour through the use of system features such as Boolean operators, use of ranking and sort-ing mechanisms, identification of the type and nature of queries submitted and analysis of users’ distribution of these features [12],[22], [13].

Logs also provide a useful resource for the evaluation of web-based information systems due to their ability to record ev-ery action that occurs during the user’s interaction with a digital resource. They are generally used to gather quanti-tative data and to optimize the generation of statistical indi-cators. Logs are widely employed in the DL domain because of the consistency they provide with respect to the condi-tions of data collection; but logging should be understood as only one aspect of the overall experimental framework, as suggested by [4]. For websites and for bibliographic sys-tems such as OPACs, logs provide dependable and coherent information about the usage, traffic and performance of a given system. However, because logs can only partially re-veal both the behaviour of the user and her level of satisfac-tion [6], richer informasatisfac-tion is frequently derived by applying other methods of analysis such as sequential pattern analysis and web surveys.

A first proposal for a standard logging scheme for DLs was presented in [10] and [9].

However, when considering not only web based digital li-braries, but a fully developed bibliographic application with a rich set of services (cf. Section 5), the proposal has to be seen as a starting point. Therefore we push the issue further.

First, we distinguish between different levels of logging, as described in Section 3. Each of these levels is supported by a standard XML schema that defines the events of interest according to the special needs of the different stakeholders interested in the logging data.

Second, we extend the original set of events that have been proposed because the Daffodil framework offers a more comprehensive set of DL services, thus requiring the identi-fication of more possible event types.

For the analysis of logged events, we also differentiate be-tween the various stakeholders of a DL system, including, for example, system owners, librarians, endusers, developers and researchers. Each stakeholder group requires a particu-lar view on the logging data in order to address its specific information needs.

3.

LEVELS OF LOGGING

To classify the different logging events, we categorize them into different levels, each of which has its own agreed-upon XML logging schema.

We have identified five levels on which user and system events occur:

• User behaviour level

• Concept level for comparative evaluations • Service level

• Human computer interaction (keystroke) level • System level

Through logging events by level, we have a horizontal view that tracks a sequence of events dealing with a single aspect of the DL. For example, by focusing on events that occur on the concept level, we can identify the user’s moves and tactics as she works her way through the document space. In contrast, a vertical view across levels gives us information about the impact of a specific event across the DL system, from information about user behaviour on the highest level to system specific data on the lowest level.

3.1

User Behavior Level

The users and their behaviours are located on the first or highest level. Each user has a task to accomplish, within a certain social environment, and brings to that task her in-dividual knowledge. Information about the task, the social environment and the user’s knowledge is usually not accessi-ble through logging and can only be gathered by asking the user either before, during or after the execution of an infor-mation seeking task. The data gathered from observing and interviewing the user can be analyzed into various attributes that describe the user’s characteristics (e.g., age, gender, profession, cognitive resources and social environment), the user’s search strategies and preferences, and individual or collaborative task situations and decision tasks.

3.2

Concept Level

On the concept level, we locate generalized events gener-ated by the DL user. By logging these generalized events, user evaluation can be backed up with statistical data and a comparative evaluation of different users, systems and sys-tem content can be undertaken. These generalized events are described in greater detail in Section 4.

3.3

Service Level

Compared to the concept level, the service level represents a more specific set of system-dependent DL services. Because of differing input and output parameters, which are gen-eralized on the concept level, service level events are more difficult to assess. Examples of service level events are:

• Metadata search

• Annotation of digital library objects • Search for references or citations

(3)

This level also includes logging of the graphical tools pre-sented to the user, allowing the comparison of different views of a single service.

3.4

Human Computer Interaction (Keystroke)

Level

The keystroke or human computer interaction (HCI) level consists of input events by the user searching in the DL. The input is usually entered via keyboard, mouse or some other input device such as a braille keyboard. This level corresponds to the keystroke level of the GOMS model [11] (http://www.usabilityfirst.com/methods/goms.txl). Other physical reactions of the user such as eye movement can also be tracked on the keystroke level.

3.5

System Level

The lowest level is the system level consisting of events that happen on the computer or in the computer network where DL services are executed. This level aggregates very specific information concerning the state of the DL (e.g., database conditions, server load, or amount of network traffic) and its response (e.g., response time).

4.

EVENTS ON THE CONCEPT LEVEL

On the concept level, we have identified several general event types that support comparative evaluation across DLs. Our focus on the concept level represents the centrality of these events for log analysis and interpretation. Events that occur on the concept level indicate critical aspects of the user’s in-teraction with the DL system and supply valuable data for rich interpretation of user behaviour. As is highlighted in other DL logging studies [17], current approaches are often inadequate for capturing complex or abstract actions by the user and are therefore unable to elicit meaningful conclu-sions.

By logging data about general event types at the concept level, we provide a basis for comparative evaluation across DLs.

The event types and event properties that we have identi-fied are neither fixed nor complete and should be viewed as recommendations that can also serve as discussion points. Each event will consist of its own set of properties modelled in XML as sub-elements and attributes of an event. These properties might include, for example:

• a unique session-id • a unique event-id

• timestamps for start and end of the event • a service name

• possible errors during the event

• an indictator of cancelation by the user or the system

In addition each event also has an event-specific set of prop-erties. If it is neccessary to collect more detailed data about a given type of event, we suggest extending standard event

definitions through reference to an XML namespace that fines the new properties. For example, in the following de-scription of a search event, the input property is defined as a list of terms. However, many DLs allow for more complex query formulations such as boolean queries, which could be stored in a specific field defined in the extended namespace as a child or sub-element of the original list-of-terms input element. An example is given in Section 4.11. Although comparison across DLs will only work on the list-of-terms property defined in the original namespace, the researcher who requires an extended view of logging events is guar-anteed that no information will be lost in the process of comparison.

4.1

Search

In asearch event a user formulates aquery orfilter con-dition that is to be processed by a given DL service against acollection. The input for a search event with a textually given query would be a list of terms, but to capture a query by example search event, the input would be one or more existing DL object(s). The corresponding DL service (e.g., a metadata search or a journal search) responds by returning a set or ranked list of DL objects that satisfies the specified search condition. The properties of a search event are:

• Input: A query as list of terms or digital library objects • Output: One or a set of a digital library object(s) • Collections: The collection(s) to be searched or

fil-tered, e.g., ACM digital library

4.2

Navigate

Thenavigate eventis fired if a user selects a specific item from a set of possibilities. This is similar to hypertext brows-ing, where a user follows a link or moves from one point to another point. If the link points to a leaf node (e.g., a doc-ument detail), aninspect eventfollows. The properties of a navigate event are:

• Input: A digital library object • Output: One or a set of DL object(s)

• Collections: The underlying collection(s), e.g., ACM digital library

4.3

Inspect

Theinspect eventis fired when the user accesses the de-tails of a single object. Examples of the inspect event are viewing a detail out of a result list or semantic information about a term in a thesaurus. The proposed property of an inspect event is:

• Input: A single DL object

4.4

Display

Any event can be followed by adisplay event. The display event describes a specific visualization of the information presented to the user. Sorting of the DL objects in a set or list will also be captured here. To present the different dis-play events, we propose the use of a standardized taxonomy or classification of visualization techniques, e.g., Shneider-man’s task by data type texonomy [23] or the classification of visualization techniques in [3]. The properties of a display event are:

(4)

• Visualization technique: e.g., standard 2D/3D, geo-metrical transformation or icon based

• Sort criteria: attributes (author, year, etc.) and pred-icates (descending, ascending)

• Input: A set of DL objects to be displayed

4.5

Browse

Thebrowse eventdescribes user actions that involve view-ing a set of DL objects (e.g., viewview-ing a result list followview-ing a search). In a browse event, the user will generally move a slider to view different portions of the result set or will click on a next button or link, as with many internet search en-gines. The browse event can be followed by adisplay event if the properties of the visualization change. The properties of the browse event are:

• Input: Method of user interaction (e.g., slider, button) • Output: A set of DL objects

• Collection: Set of DL objects the interaction involves • Browsing Method: A definition of the browsing method

(View)

4.6

Store

Thestore event files an object for later reuse, either at a generic location (e.g., a clipboard) or at a specific location (e.g., in a specific folder). This event also includes print-ing, exporting in a special format, or storing within the DL system, e.g. in a personal library such as provided by

Daf-fodil. The properties of the store event are: • Input: One or a set of DL object(s) as targets. • Destination: The destination of the target. This may

also include format options (e.g., export formats like BibTeX or endnote)

4.7

Annotate

Theannotate event adds information to an existing DL object, either by marking specific parts of it, by linking it to other digital library objects, or by adding inline or external comments. This event can be followed by an authoring eventif a new annotation, such as a summary or review is written. The properties of the annotate event are:

• Input: One or a set of DL object(s) • Annotation: The actual annotation

• Annotation-Type: The specific type of annotation (e.g., link, inline, comment, summary)

4.8

Authoring

Theauthoring event is fired if a user creates a new DL object or edits an existing object such as a document or annotation. This event is followed by astore event since the new or revised object will need to be saved. A good DL system should also include revision control so that all changes can be tracked. The property of an authoring event is:

• Input: The new or existing DL object along with the new content.

4.9

Help

Thehelp eventrecords a user request for help information. The help event may be general or context-specific and can include introductory overviews or tutorials about the DL system. A help event may also occur if a DL system proac-tively provides help information based on the current state of the system or the context in which the user is working. If the user reacts to a system-generated suggestion, this event should also be stored. The proposed properties of a help event are:

• Input: The current context, service or graphical tool for which help is requested, or the explicit help request • Help Content: The explicit content of the request or a

link to it

• Type of Help: The type of help given (e.g., a fixed help text, a context specific suggestion or an interac-tive video)

4.10

Communicate

Thecommunicate eventfires when users collaborate. Com-munication can include posing a simple question or full col-laboration through the use of specific tools or services. Such communication or collaboration can be synchronous or asyn-chronous, depending on the service provided by the system. Examples of communication services are internal message boards, shared library folders, instant messaging systems or voice-over-IP services, or a whiteboard tool such as that available inDaffodil. The properties of the communicate event are:

• Type Communication: Synchronous or asynchronous, local or distributed

• Communication medium: The form of communication (e.g., chat protocol or a whiteboard)

• Communication content: The content of the commu-nication

4.11

XML schema description

The description of events will be formalized in an XML schema. Below is an abbreviated example that includes two events from one session: a search event followed by a display event. (Seehttp://www.is.informatik.uni-duisburg.de/ wiki/index.php/Schema_description_of_eventsfor the cur-rent schema proposal.)

<delos:event beginTimeStamp="1119346409235" endTimeStamp="1119346409235" eventid="31.c3e82b:1049e70ade2" service="daff:metadatasearch"> <delos:search service="Search Tool" numberOfResults="487">

<delos:input source="2a.c3e82b:1049e70ade2"> <delos:query>

information visualization shneiderman </delos:query> <daffodil:query> title=information visualization and author=Shneiderman </daffodil:query> </delos:input> <delos:output>

(5)

<dlo id="..."</dlo> <dlo id="..."</dlo> </delos:output> <delos:collection>

acm, dblp, citeseer, achilles, hcibib </delos:collection> </delos:search> </delos:event> <delos:event beginTimeStamp="1119346512939" endTimeStamp="1119346512939" eventid="38.c3e82b:1049e70ade2" service="daff:metadatasearch"> <delos:display> <delos:visual>1D, list</delos:visual> <delos:sort attribute="year" predicate="descending" </delos:sort> </delos:display> </delos:event>

We have useddelosas the namespace for all proposed event schemes. In addition to a value for thedelos:inputelement involved in the search event, we have includeddaffodil:query

as an extension of the delos scheme in order to capture the boolean query to be processed byDaffodil.

5.

LOGGING IN DAFFODIL

Daffodil1 (see [21], [16] and [7]) is a virtual DL targeted

at strategic support of users during the information search process. For searching, exploring, and managing DL ob-jects,Daffodil provides information seeking patterns that can be customized by the user for searching over a federa-tion of heterogeneous digital libraries. Searching with Daf-fodil makes a broad range of information sources easily

accessible and enables quick access to a rich information space.

TheDaffodil framework consists of two major parts: the graphical front-end client, depicted in Figure 1 and a set of back-end, agent-based services.

5.1

Graphical front-end client

The graphical client combines a set of high-level search ac-tivities as integrated tools according to the WOB model for user interface design (see [14]). The currentDaffodil pro-totype for the domain of computer science provides a variaty of tools and functions such as:

• Asearch toolthat allows the user to specify the search domain, set filters and compose queries. The queries are broadcasted to a set of distributed information ser-vices via agents and wrappers. Integrated result lists are displayed for navigation and detail inspection by the user.

• A collaborativepersonal librarythat stores DL objects in personal or group folders and supports sharing and annotation of stored objects.

• Aclassification browserthat provides hierarchical, topic-driven access to the information space and enables browsing of classification schemes such as the ACM Computing Classification System.

1

http://www.daffodil.de

• Athesaurus browser that can transform search terms into broader or narrower terms. In addition, subject-specific or web-based thesauri such as WordNet are provided for identifiying related terms.

• A author network browser that supports browsing of social networks based on the co-author relation. • Ajournal and conference browser that searches for a

journal or conference title by browsing many directo-ries, often with direct access either to the metadata or the full-text of articles.

The front-end client is deployed to the user via Java Web-start technology; there are no limitations on integration of new tools as long as they are written or wrapped in the Java programming language.

5.2

The backend services

On the back-end side, each front-end tool is represented by one or more agent-based services that provide the actual functionality. Currently, more than 30 services and 15 wrap-per agents are used by theDaffodilsystem. For efficiency reasons, these services communicate via CORBA, but they can also be accessed via SOAP. For communication, XML messages are used. The agent framework itself is modelled very simplistically for high performance and provides paral-lel threads for each user.

5.3

Logging

With a framework like that of Daffodil, which consists of the graphical user interface and the back-end services writ-ten in Java, we have full control of all user and system trig-gered events, in constrast to a web-based DL system. The logging service itself, depicted in Figure 2, is simplistic: the

Daffodil user interface client and eachDaffodil back-end service can sback-end an event to the event logging service, which then stores the event.

Figure 2: Logging Service Model

Currently,Daffodilhandles over 40 different events. The main groups of events are search, navigate and browse events; result events are generated by each of the system services (e.g., the thesaurus, the journal and conference browser, or the main search tool). The personal library supports store events as well as events involving annotation or authoring of an object.

(6)

Figure 1: Current desktop of Daffodil

Through the log schema converter, we are able to provide more than 100 MB of actual log data in the format of the proposed XML schema. The data will be anonymized and will soon be accessible for comparative analysis.

6.

ANALYSIS OF LOG EVENTS

In order to analyze the logged events, we have assumed that different stakeholders need different views of the log-ging data; thus, a variety of analysis tools is required. We have identified the following DL stakeholders: system own-ers,content providers,system administration,librarians, de-velopers,scientific researchers, andend-users of a DL.

6.1

General interests

There are general statistics that are of interest to all stake-holders. The most common of these is interest in simply viewing the logged sessions of users of a DL. For this pur-pose, Daffodil provides a special tool that allows any stakeholder to visualize and analyze user behavior in a graph view (depicted in Figure 3) or in a tree view (depicted in Figure 4). This tool can also be used by end-users to anal-yse their own search tactics and to look for open or missed search directions that could be pursued.

Several general questions are of interest to all stakeholders:

• When and how long is the DL used?

• What services are used, how often and for how long? • Who uses the DL?

Although strict rights management should be instituted to prevent misuse of logging data, this is of particular impor-tance when attempting to identify the characteristics of DL end-users.

6.2

System Owners

The management of a DL system may not only have mone-tary interests, but may also be interested in collecting user and other statistics in order to establish and execute a busi-ness plan for the DL system. For example, one concern is the need to develop access and acquisitions policies that will en-courage increased usage. The primary questions of interest to management that can be addressed by analysis of logging data include:

• How many users access the DL system in what time period?

• How many DL objects are delivered or sold?

• Which objects are most requested and which of these are not available?

6.3

Developers

The developers of a DL system are usually keen to provide a rich set of services in a user-friendly manner. From a system perspective, these services should be effective, efficient and error free. In addition, users should be supported in their at-tempts to achieve their information goals. Thus developers need information about the system that includes:

(7)

Figure 3: Example of user session in a graph visualization.

(8)

• How long is the DL used?

• Which services of the DL are used?

• How many users canceled a specific service? • Which services are flawed?

• Are response times satisfactory?

6.4

Content providers

Content providers frequently need to assess the way users manipulate information objects in order to mine current pat-terns of usage; based on this information, they can decide about necessary revisions of collection or pricing policies. Although content providers may focus on events that in-dicate inspection and storage of information objects, other research questions may also be posed. For example:

• What formats are most frequently used?

• What levels of information are preferred (e.g., cita-tions, abstracts, full texts)?

• What levels of access are preferred (e.g., open, autho-rized)?

• What items of information, in terms of titles or subject categories, are preferred?

6.5

Librarians

As discussed above, the integrated visualization tools allow even inexperienced users of log files to understand crucial events and user tendencies that affect the performance of their activities. For librarians, these activities will include reference services as well as instructional programs [5]. Typ-ical questions for librarians are:

• What classes of users exist?

• From where do they gain access (e.g., in the library, on campus, from home)?

• What are the predominant patterns of use?

• Do these patterns of use indicate progress in the de-velopment of information skills and/or growing under-standing of appropriate search behaviours?

• Are users communicating with library personnel and to what degree?

6.6

End users

As mentioned previously, log analysis may provide sugges-tions for users in need of help. Transaction logs may also remind the user of certain work patterns she had previ-ously performed. Collaborative filtering and recommenda-tion tools also find collected log data valuable. Besides doc-uments, also work patterns and result screens may be rec-ommended to a user based on logging data.

6.7

Researcher

The experimental framework of Daffodil , including the logging scheme, gives researchers a baseline and a powerful toolbox for evaluation work. For example, at the system level, the efficiency of algorithms or the appropriateness of DL architectures can be evaluated and compared. Research on usability, including the GUI development, will concen-trate on both the HCI level and the system levels.

The main focus of theDaffodilproject has always been to place the user in the center of digital library and information retrieval research. Daffodilis based on the infovrmation search model by Marcia Bates ([2] and [1]) which classifies search activities of users as moves, tactics, stratagems and strategies. The proposed logging scheme and the concept level is a natural extension of this original aim, as it provides a method to analyse the sequence of moves and tactics. As Wildermuth stated in [24]:

While significant work has examined the indi-vidual moves that searchers make (Bates, 1979, 1987; Fidel, 1985), it is equally important to ex-amine the sequences of moves made by searchers in order to understand the cognitive processes they use in formulating and reformulating their searches.

With regard to this goal, the logging scheme allows for studying the usage of the diverse services; furthermore, the whole search process/sequence can be analyzed, from its be-ginning with the formulation of the original search query to its conclusion with the storage of a DL object for further use.

In addition to capturing and understanding of the user’s search activities through analysis of logging data, recom-mendations can be made for reuse of search patterns. Thus, Kriewel [15] suggests that search paths discovered in anal-ysis of logging data can be used as a basis for suggesting potential search steps to other users.

User strategies could be captured at the concept level as well, by looking at the sequence of sessions for one task. Pejtersen’s framework for analysis of search strategies is di-rected to the design of interfaces that provide consistent support of the user’s dynamic ”workflow” during a search session, instead of prescribing any particular steps or phases for the user’s search. The strategies proposed by her are: analytical search, browsing, search by analogy, bibliographic search strategy and empirical search strategy [19], [20], [18].

6.8

Recommended practice

It is obvious that (with the exception of the classification of users) most of the questions listed above can be answered in sufficient detail using the suggested logging scheme, and that the concept level log is most useful in providing these answers. While all stakeholders are interested in the concept level, the system and HCI levels are perhaps most relevant for researchers and developers. In contrast, the logging level of user behaviour is most interesting for librarians and end-users, while the service level can be used to answer questions

(9)

Stakeholder Level General Concept, Service System Owners Concept

Developers Concept, Service, HCI, System Content Owners Concept

Librarianss User behavior, Concept

Researchers All

End-users User behavior, Concept Table 1: Levels per Stakeholder

of general interest as well as those of developers. Interest-ingly, the only stakeholder group that uses all levels of log-ging is researchers. See table 1 for a complete list of the relationship between stakeholders and levels.

Logs of a single system may always be compared to each other, but in such comparative evaluations some caution is required when planning the logging or conversion tools for the standard scheme to ensure that no possible compari-son are overlooked. To avoid the loss of potential points of comparison, a set of recommendations, or “best practices”, should be established for translating system-provided log data into a standardized format.

An essential prerequisite for comparison is a common nam-ing scheme that is not restricted to the names provided by the XML logging schema (e.g., names of events). Digi-tal objects in the log should be referred to using the most widely used identifiers; the use of DOIs or URIs is also rec-ommended, especially when comparing statistics on content usage. Furthermore, because natural language messages in the log can cause comparison problems in a multilingual en-vironment, error messages should be mapped to a language neutral taxonomy of possible errors. In addition, while the identity of users should be hidden, it should still be possible to track the same user in different logs of the same system. It is very likely that many DLs will need to extend the XML logging schema for their own use. Typical extensions would be the addition of new event subtypes. The proposed XML schema has been planned to be extensible, using a concept similar to Dublin Core application profiles: each newly in-troduced element should be derived from the core schema or mapped to some local schema definition. For events, er-rors and other concepts the use of a core ontology providing clear relationships between various newly introduced terms is encouraged.

7.

SUMMARY AND OUTLOOK

In this paper, we have presented the first efforts to de-velop a standardized experimental framework for digital li-brary evaluation. If the various DL stakeholders can form a community and agree upon models, dimensions, defini-tions and common sense criteria for the evaluation of DL systems, then the process of evaluation will gain impetus among DL researchers. As an experimental platform for evaluation both within and across DL systems, application of theDaffodilframework can substantially advance this research area, since it currently provides functions and ser-vices that are based on a solid theoretical foundation and well known models. The proposed logging scheme is a first

step intended to encourage evaluation of individual systems as well as comparisons across systems.

Most evaluation techniques require a great deal of prepara-tion and effort and are thus not easily replicated. In case of online DL systems, this means that the results of an evalu-ation often reflects a past snapshot of the system. It is nec-essary to find ways for continuous, cost effective and (more or less) automated evaluation of digital libraries. The sug-gested logging scheme is a first step in this direction. Our group aims at establishing a community forum for evalua-tors in order to promote the propagation of various tools and approaches, and the exchange of experience.

As part of the effort to encourage a community forum for researchers interested in DL evaluation, we have published documentation for the logging scheme on the forum website athttp://www.is.informatik.uni-duisburg.de/wiki/index. php/JPA_2_-_WP7. The log analysis tools will follow soon, as they are under preparation. The visualizations from the figures above are currently integrated as a new tool into the

Daffodiluser interface. As a further step, large amounts of anonymized logging data from two DL services will be made available. Such primary data will help the commu-nity to improve the application of transaction logging and to compare and experiment with sample data. In the long term, we envisage that this effort will evolve into a primary data repository, providing help for evaluators who want to find similar scenarios together with logging data. In other fields of research, such primary data repositories are already well established and play an important role in the conduct of research. (In fact, the provision of primary data frequently counts as a publication, creating further incentives for this kind of work.) An infrastructure of independent evalua-tor services, primary data reposievalua-tories, logging tools and on-line questionnaires may provide computer-based support for some of the cost-effective and time-consuming tasks in evaluation and the community will gain sustainability.

8.

ACKNOWLEDGMENTS

This work was funded by the DELOS Network of Excellence on Digital Libraries (EU 6. FP IST, G038-507618).

9.

REFERENCES

[1] M. J. Bates. Idea tactics.Journal of the American Society for Information Science, 30(5):280–289, 1979. [2] M. J. Bates. Information search tactics.Journal of the

American Society for Information Science, 30(4):205–214, 1979.

[3] E. Bertini, T. Catarci, S. Kimani, and L. D. Bello. Visualization in digital libraries. InFrom Integrated Publication and Information Systems to Virtual Information and Knowledge Environments, pages 183–196, 2005.

[4] J. Bertot, C. McClure, W. Moen, and J. Rubin. Web usage statistics: measurement issues and analytical techniques.Goverment Information Quarterly, 14(4):373–3951, 1997.

[5] M. Chau. Web mining technology and academic librarianship.First Monday, 4(6), 1999.http://www. firstmonday.dk/issues/issue4_6/chau/index.html.

(10)

[6] D. T. Covey. Usage and usability assessment: Library practices and concerns. Technical report, Digital Library Federation, 2002.

[7] N. Fuhr, C.-P. Klas, A. Schaefer, and P. Mutschke. Daffodil: An integrated desktop for supporting high-level search activities in federated digital libraries. InResearch and Advanced Technology for Digital Libraries. 6th European Conference, ECDL 2002, pages 597–612, Heidelberg et al., 2002. Springer. [8] N. Fuhr, G. Tsakonas, T. Aalberg, M. Agosti,

P. Hansen, S. Kapidakis, C.-P. Klas, L. Kovcs, M. Landoni, A. Micsik, C. Papatheodorou, C. Peters, and I. Solvberg. Evaluation of digital libraries. (To appear), 2006.

[9] M. A. Goncalves, E. A. Fox, L. Cassel, A. Krowne, U. Ravindranathan, G. Panchanathan, and

F. Jagodzinski. Standards, mark-up, and metadata: The xml log standard for digital libraries: analysis, evolution, and deployment. InProceedings of the third ACM/IEEE-CS joint conference on Digital libraries, 2003.

[10] M. A. Goncalves, R. Shen, E. A. Fox, M. F. Ali, and M. Luo. An xml log standard and tool for digital library logging analysis. InAgosti, Maristella (ed.) et al., Research and advanced technology for digital libraries. 6th European conference, ECDL 2002, Rome, Italy, September 16-18, 2002. Proceedings. Berlin: Springer. Lect. Notes Comput. Sci. 2458, 129-143, 2002.

[11] B. E. John and D. E. Kieras. Using goms for user interface design and evaluation: Which technique? acm trans.ACM Trans. Comput.-Hum. Interact. 3, 1996.

[12] S. Jones, S. J. Cunningham, R. McNab, and S. Boddie. A transaction log analysis of a digital library.International Journal on Digital Libraries, 3(2):152–169, 2000.

[13] H. Ke, R. Kwakkelaar, T. Tai, and L. Chen. Exploring behavior of e-journal users in science and technology: Transaction log analysis of elsevier’s sciencedirect onsite in taiwan.Library & Information Science Research, 24(3):265–291, 2002.

[14] J. Krause. Visualisation, multimodality and traditional graphical user interfaces.Review of Information Science, 2(2), 1997.

[15] S. Kriewel. Finding and using strategies for search situations in digital libraries. Bulletin of the IEEE Technical Committee on Digital Libraries (to appear), 2005. http://www.dei.ist.utl.pt/ jlb/ECDL2005-DC/14-SaschaKriewel/14-Sascha-final.pdf. [16] S. Kriewel, C.-P. Klas, A. Schaefer, and N. Fuhr.

Daffodil - strategic support for user-oriented access to heterogeneous digital libraries.D-Lib Magazine, 10(6), June 2004.http://www.dlib.org/dlib/june04/ kriewel/06kriewel.html.

[17] B. Pan. Capturing users behavior in the national science digital library (nsdl). Technical report, NSDL, 2003. http://dlist.sir.arizona.edu/848/.

[18] A. M. Pejtersen, H. Albrechtsen, B. Cleal, C. Hansen, and M. Hertzum. A webbased multimedia

-collaboratory: Empirical work studies in film archives. Technical report, Riso National Laboratory, Denmark, 2001.

[19] A. M. Pejtersen and R. Fidel. A framework for work centered evaluation and design: A case study of IR on the Web. Technical report, Riso National Laboratory, Denmark, 1998.

[20] A. M. Pejtersen and J. Rasmussen.Ecological

information systems and support of learning: Coupling work domain information to user characteristics. Elsevier, Amsterdam (NL), 1997.

[21] A. Schaefer, M. Jordan, C.-P. Klas, and N. Fuhr. Active support for query formulation in virtual digital libraries: A case study with DAFFODIL. In

A. Rauber, C. Christodoulakis, and A. M. Tjoa, editors,Research and Advanced Technology for Digital Libraries. Proc. European Conference on Digital Libraries (ECDL 2005), Lecture Notes in Computer Science, Heidelberg et al., 2005. Springer.

[22] M. Sfakakis and S. Kapidakis.User Behavior Tendencies on Data Collections in a Digital Library, volume 2458 ofLecture Notes In Computer Science, pages 550–559. Springer-Verlag, Berlin; Heidelberg, 2002. Research and Advanced Technology for Digital Libraries: 6th European Conference, ECDL 2002, Rome, Italy, September 16-18, 2002. Proceedings. [23] B. Shneiderman. The eyes have it: A task by data type

taxonomy for information visualizations. Technical Report CS-TR-3665, University of Maryland, Department of Computer Science, July 1996.

[24] B. M. Wildemuth. The effects of domain knowledge on search tactic formulation.Journal of the American Society for Information Science and Technology, 55(3):246–258, 2004.

References

Related documents

+ Stagnant CapEx, Reduced Operating Expenses Expected: Analysts are predicting capital expenditures of public energy companies to grow at 2% or less year-over-year

Shewanella oneidensis Flavodoxin fldA 175 Long Proteobacteria Negative Shigella flexneri Uncharacterized protein YqcA yqcA 149 Short Proteobacteria Negative Shigella flexneri

Imperial Valley College Pasadena City College College of the Desert Victor Valley College Rio Hondo College Long Beach City College College of the Sequoias Glendale

But in the materialistic age in which we live today, man’ s great danger is forgetting that he is not a creature of time, but of eternity - that within his body there dwells a

In this study, the effect of subcritical water extraction (SWE) was employed for the extraction of bioactive compounds from zingiber officinale namely 6-gingerol,

First, the model computes variable costs per vehicle-km based on average fuel consumption factors for each technology and country- specific fuel prices provided by the POLES

We show how to break both the challenge/response authentication protocol (Microsoft CHAP) and the RC4 encryption protocol (MPPE), as well as how to attack the control channel

Como dicho popular, dentro de la cultura del pueblo ecuatoriano, existe una división marcada sobre todo por razones geográficas, y así se comenta que las personas