
(1)

1

Information access through information technology

Created to support an invited lecture at the International Conference MDGICT 2009 in Tamil Nadu, India, December 2009

by [email protected]

(2)

2

These slides should be available from the WWW site

http://www.vub.ac.be/BIBLIO/nieuwenhuysen/presentations/

(note: BIBLIO and not biblio)

(3)

3

Contents = summary = structure = overview of this presentation

1. General: Access to information: the evolution

2. More specific: So much information, so little time

(4)

4

Information access

through information technology

Access to information:

the evolution

(5)

5

Information is important for development

Research / Education / Journalism

Access to information = important

(6)

6

[Timeline figure, axis 1970 to 2010 in steps of 5 years; labels in the order they appear:]

Electronic mail
Internet
CD-ROM with cheap 600 MB memory
Library automation with integrated library management systems
CDS/ISIS free cataloguing software by UNESCO
Personal computers
Online databases on DIALOG host computer
Libraries

(7)

7

[Timeline figure, axis 1970 to 2010 in steps of 5 years; labels in the order they appear:]

Google WWW search becomes most popular information discovery tool
Electronic mail
WWW based on Internet (without search engine)
Microsoft Windows 95 PC operating system
WWW search engines to uncover the WWW
DVD with more than 4000 MB memory
CDS/ISIS for Windows with user-friendly interfaces
Internet
CD-ROM with cheap 600 MB memory
Library automation with integrated library management systems
CDS/ISIS free cataloguing software by UNESCO
Personal computers
Online databases on DIALOG host computer
Libraries

(8)

8

[Timeline figure, axis 1970 to 2010 in steps of 5 years; labels in the order they appear:]

Authentication and authorization to access proprietary digital content
Electronic full-text journals
Local link generators
Digital libraries, including repositories

(9)

9

[Timeline figure, axis 1970 to 2010 in steps of 5 years; labels in the order they appear:]

Library automation with free software such as CDS/ISIS = ABCD
Computing + memory for storage in the “Cloud”
Social WWW, including Facebook, Flickr and YouTube
Authentication and authorization to access proprietary digital content
Electronic full-text journals
Local link generators
Google Scholar allows discovery of academic information free of charge
Libraries implement Electronic Resource Management Systems
Electronic books
Digital libraries, including repositories

(10)

10

[Timeline figure, axis 1970 to 2010 in steps of 5 years; labels in the order they appear:]

Library automation with free software such as CDS/ISIS = ABCD
Computing + memory for storage in the “Cloud”
Social WWW, including Facebook, Flickr and YouTube
Authentication and authorization to access proprietary digital content
Electronic full-text journals
Local link generators
Google Scholar allows discovery of academic information free of charge
Libraries implement Electronic Resource Management Systems
Better integration, aggregation, federation ??????????
Electronic books
Digital libraries, including repositories

(11)

11

Information access:

difficulties and bottlenecks

Cost of content

»Books

»Journals

»Bibliographical databases

Cost of computer hardware

»At server side

»At client side

Cost of computer software

(12)

12

Information access:

difficulties and bottlenecks

Need to train personnel

Fast evolution

(13)

13

Information access:

difficulties and bottlenecks in developing countries

Poor infrastructure

»Power supply

»Internet access

(14)

14

Information access

through information technology

So much information, so little time

(15)

15

Introduction:

scattering of sources

Users want to exploit information sources quickly and effectively.

• This is hindered by the fact that digital information sources that may contain relevant information are created on, and scattered over, numerous computers all over the Internet and the WWW.

(16)

16

Introduction:

scattering of sources

• In other words:

integration / aggregation is still far from perfect.

(17)

17

Introduction:

scattering of sources difficulties

• Using many information retrieval systems costs time:

1. They must be used one after the other which requires many decisions and actions

(18)

18

Introduction:

scattering of sources difficulties

• Using many information retrieval systems costs time:

2. They offer different user interfaces in the retrieval phase, which is confusing

(19)

19

Introduction:

scattering of sources difficulties

• Using many information retrieval systems costs time:

3. They offer found information items in various data formats

(20)

20

Information access

through information technology

Problem statements

(21)

21

Problem statements

1. Which methods have been developed and applied to cope with this reality?

(22)

22

Problem statements

2. Which concrete applications are available and how can an end-user exploit systems created in this domain?

(23)

23

Problem statements

3. How can information intermediaries evaluate and apply these methods to bring information more efficiently to end-users?

(24)

24

Information access

through information technology

Various methods

for information retrieval from scattered sources

(25)

25

Method 1: Merging = aggregating into a searchable database

[Diagram: Users → Search engine → Aggregated database ← several sources (database or web site or…)]

(26)

26

Method 2: Federated searching through scattered databases

[Diagram: Users → Federated search engine → one search engine per database → Database (several databases)]
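As a rough illustration of Method 2, the following minimal Python sketch sends one query to several sources at the same time and merges the hits into one uniformly sorted list. Everything in it is an assumption made for illustration: the source names, the sample titles, and the functions `search_source` and `federated_search` are hypothetical, and the in-memory lists merely stand in for remote databases that each have their own search engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "databases"; a real system would query remote search engines.
SOURCES = {
    "library_a": ["Water quality in rivers", "History of printing"],
    "library_b": ["Marine biology handbook", "Water management in India"],
    "repository_c": ["Open access and development"],
}

def search_source(name, query):
    """Stand-in for the search engine of one database: case-insensitive substring match."""
    q = query.lower()
    return [{"source": name, "title": t} for t in SOURCES[name] if q in t.lower()]

def federated_search(query):
    """Send the same query to every source in parallel and merge the hits."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        hits = [hit for future in futures for hit in future.result()]
    # Uniform display: one sort order, regardless of where a hit came from.
    return sorted(hits, key=lambda h: h["title"].lower())

results = federated_search("water")
for hit in results:
    print(f'{hit["title"]}  [{hit["source"]}]')
```

The user formulates the query once; the sketch does the repeated work of contacting each source. Note that nothing is indexed in advance, so the answer can only be as fast as the slowest source.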

(27)

27

Both methods

offer benefits to the users

+ Users save the time they would otherwise need to send queries to various servers or to browse through various systems.

(28)

28

Both methods

offer benefits to the users

+ The users have to learn only 1 user interface for searching and only 1 search syntax,

instead of a user interface and a search syntax for each database.

(29)

29

Both methods

offer benefits to the users

+ The system offers a uniform / consistent display of results in the output phase.

(30)

30

Method 1: Merging = aggregating into a searchable database

[Diagram repeated from slide 25: Users → Search engine → Aggregated database ← several sources (database or web site or…)]

(31)

31

Method 2: Federated searching through scattered databases

[Diagram repeated from slide 26: Users → Federated search engine → one search engine per database → Database (several databases)]

(32)

32

Federated searching

through scattered databases: why?

The perfect trip:

1. A cheap and nice flight

2. A cheap and nice hotel

3. A visit to a nice museum

4. Something nice to read (free via your library)

(33)

33

Federated searching: application:

finding a suitable flight

Example:

• http://CheapTickets.com/ for the USA

(34)

34

Federated searching: application:

finding a hotel room in some city

Example

(35)

35

Federated searching:

searching in a museum

Example

(36)

36

Federated searching:

searching in a library

Example

(37)

37

So many digital libraries

through information technology

The various methods applied for end-users

(38)

38

Method 1: Merging = aggregating into a searchable database

[Diagram repeated from slide 25: Users → Search engine → Aggregated database ← several sources (database or web site or…)]

(39)

39

Internet global subject directories:

introduction

They are virtual libraries with open shelves, for browsing.

They are manually generated, man-made by many people.

They can be browsed following a tree structure or a more complicated variation.

(40)

40

Internet global subject directories:

Yahoo!

A hypertext global subject directory can be found at http://dir.yahoo.com/

Entries are NOT rated.

Accessible free of charge.

Example

(41)

41

Internet global subject directories:

BUBL LINK

A hypertext global subject directory to more than 10 000 WWW sites for the higher education community can be found at

http://bubl.ac.uk/link/ [accessed 2008]

Accessible free of charge.

The categories are based on the well-known general Dewey classification system.

Example

(42)

42

Internet global subject directories:

dmoz Open Directory Project

A hypertext global subject directory can be found at http://www.dmoz.org/

Accessible free of charge.

The contents may also be reused in other systems;

this is indeed done in Webbrain.

Example

(43)

43

Internet global subject directories:

Librarians' Internet Index

A hypertext global subject directory can be found at http://www.lii.org/ [accessed 2008]

Accessible free of charge.

Librarians select the sites and build the overview.

The name ‘Internet Index’ may cause some confusion, because in many cases this term means an index that is part of a full-text searchable database, that is, an Internet search engine.

This is NOT the case here.

Example

(44)

44

Internet global subject directories:

Intute

• http://www.intute.ac.uk/ [accessed 2008]

From 2006

Accessible free of charge.

Offers a collection of hypertext subject directories that focus on academic information sources.

Tutorials are also offered on how to find information in specific subject domains.

Example

(45)

45

Internet indexes:

automated search tools

The basic, fundamental architecture of the WWW does NOT include a system to discover relevant information resources.

Thus search systems / engines have been implemented alongside the WWW, mainly by commercial companies.

(46)

46

Internet indexes:

automated search tools

Since the 1990s the situation has been as follows:

»WWW

= decentralised without central control

= good

»Search through the WWW

= centralised in a few systems,

each one managed by a commercial US company

= NOT (?) good

(47)

47

Internet indexes:

Google

• http://www.google.com/

Available since 2001 with most of its features.

The most popular search system since 2003.

Example

(48)

48

Internet indexes:

Google Scholar

Google Scholar allows us to search for more scholarly information sources, including journal articles.

A beta (test) version has been available since November 2004.

The system is accessible starting from the home page of Google as one of the additional services,

or more directly from http://scholar.google.com/

Example

(49)

49

Internet indexes:

Bing

• http://www.bing.com/

Available in 2009 as a beta (= test) version.

Replaces Microsoft Live

Example

(50)

50

Internet indexes:

Scirus

The search interface: http://www.scirus.com/

Since 2001.

Offers access not only to files in HTML format, but also to files in PDF.

Example

(51)

51

Internet indexes:

Ask

Available from: http://www.ask.com/

• Offers a feature that most other search systems do not offer:

categorization = classification = refinement = clustering of search results,

to help the user cope with ambiguity in the meaning of the search query.

Example

(52)

52

Internet indexes:

Yahoo!

An Internet search system is offered through http://www.yahoo.com/

This is offered BESIDES the well-established, classical Yahoo! subject directory.

Example

(53)

53

Current awareness services focusing on WWW pages: Google Alerts

Available at http://www.google.com/ and then see the page with additional services

or more directly from http://www.google.com/alerts/

Since 2004.

Can discover relevant changed or new WWW pages for you in the future.

Is based on the popular Internet index Google.

Works with search queries given by you that are stored on their server computer.
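The general idea of a current awareness service with stored queries can be sketched as follows. This is only a toy model under stated assumptions and says nothing about how Google Alerts itself is implemented: the user names, queries, URLs and the function `run_alerts` are all hypothetical, and matching is reduced to a simple substring test.

```python
# Hypothetical stored queries, one per user, kept on the "server".
STORED_QUERIES = {"alice": "open access", "bob": "federated search"}

def run_alerts(stored_queries, new_pages, already_seen):
    """Return {user: [urls]} for pages not reported before that match the user's query."""
    alerts = {}
    for user, query in stored_queries.items():
        hits = [url for url, text in new_pages.items()
                if url not in already_seen and query.lower() in text.lower()]
        if hits:
            alerts[user] = sorted(hits)
    # Remember what has been reported so that the next run only signals new pages.
    already_seen.update(new_pages)
    return alerts

seen = set()
pages = {
    "http://example.org/a": "New open access repository launched",
    "http://example.org/b": "Weather report",
}
first_run = run_alerts(STORED_QUERIES, pages, seen)
second_run = run_alerts(STORED_QUERIES, pages, seen)
print(first_run, second_run)
```

The second run reports nothing, because the same pages were already signalled in the first run.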

(54)

54

Current awareness services focusing on WWW pages: Google Alerts

(55)

55

Databases accessible over the Internet:

example: scientificcommons

Example

• http://www.scientificcommons.org/

Since 2007

Similar to OAISTER:

Allows you to search the full texts in scientific open access repositories all over the world.

(56)

56

Databases accessible over the Internet: example: Medline

Medline/PubMed offers bibliographic descriptions of publications on

medicine, free of charge.

Example

(57)

57

Internet with WWW and printed books

For a few years now, the Internet with the WWW has been the primary information source for many people.

However:

»A lot of information is still distributed only in the form of printed books

»The content of old printed books can still be interesting.

»The content of most printed books is (still) not available on the Internet.

(58)

58

Public access book databases provided by bookshops

To find currently available books, the bibliographic databases assembled by big bookshops are interesting.

Several offer a good coverage.

Many are accessible free of charge.

The added price information can be useful for the

acquisition and accounting department of a library or if an individual user wants to buy a book.

Some provide a current awareness service, also free of charge.

Take into account delivery costs: postage + import tax

(59)

59

Book databases accessible free of charge: examples in U.S.A.

Amazon.com (US):

http://www.amazon.com/

This company also offers other, more local versions that offer books in other languages, such as http://www.amazon.co.uk/

http://www.amazon.fr/

note: amazon, NOT amazone

Subject description is poor.

Take into account delivery costs: postage + import tax

Examples

(60)

60

Book databases accessible free of charge: examples in U.S.A.

Barnes and Noble (US):

http://www.barnesandnoble.com/ or http://www.bn.com/

Examples

(61)

61

Search systems for books that are made available by dealers

[Diagram elements: User; Book dealer catalog database; descriptions of books & real books for sale]

(62)

62

Search systems for books that are made available by dealers

[Diagram elements: User; Book dealer catalog databases; descriptions of books & real books for sale]

(63)

63

Search systems for books that are made available by dealers

[Diagram elements: User; Book dealer catalog databases; descriptions of books & real books for sale]

(64)

64

Free public access search systems for books: multi-dealer databases

» Multi-dealer database = database obtained by merging several existing catalogue / inventory databases, each of which is managed and updated by an individual dealer/shop/seller.

» Such a system can include from a few to more than 10 000 shops/dealers.
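The merging step described above can be sketched in a few lines of Python. This is a minimal sketch, assuming two invented dealer catalogues with made-up titles and prices; the names `DEALER_CATALOGUES`, `build_merged_index` and `search` are hypothetical and do not describe any of the systems mentioned in these slides.

```python
from collections import defaultdict

# Hypothetical dealer catalogues (invented records), each maintained by one dealer.
DEALER_CATALOGUES = {
    "dealer_1": [{"title": "Indian Ocean Charts", "price": 40.0}],
    "dealer_2": [
        {"title": "Charts of the Indian coast", "price": 25.0},
        {"title": "Library automation", "price": 12.5},
    ],
}

def build_merged_index(catalogues):
    """Merge all catalogues once, ahead of any search, into a word -> records index."""
    index = defaultdict(list)
    for dealer, records in catalogues.items():
        for record in records:
            entry = {**record, "dealer": dealer}
            for word in set(record["title"].lower().split()):
                index[word].append(entry)
    return index

def search(index, word):
    """Look up one word in the merged index; cheapest offers first."""
    return sorted(index.get(word.lower(), []), key=lambda r: r["price"])

INDEX = build_merged_index(DEALER_CATALOGUES)
offers = search(INDEX, "charts")
for offer in offers:
    print(offer["price"], offer["title"], offer["dealer"])
```

Because the index is built before any query arrives, a search is a single lookup; the price is that the merged database is only as current as the last merge.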

(65)

65

Search systems for books that are made available by dealers

[Diagram elements: User; Book dealer catalog databases; descriptions of books & real books for sale]

(66)

66

Search systems for books that are made available by dealers

[Diagram elements: User; Multi-dealer database = merged book dealer databases; Book dealer catalog databases; descriptions of books & real books for sale]

(67)

67

Search systems for books that are made available by dealers

[Diagram elements: User; Multi-dealer databases = merged book dealer databases; Book dealer catalog databases; descriptions of books & real books for sale]

(68)

68

Search systems for books that are made available by dealers

[Diagram elements: User; Multi-dealer databases = merged book dealer databases; Book dealer catalog databases; descriptions of books & real books for sale]

(69)

69

Free public access multi-dealer book databases: examples

Amazon Marketplace:

http://www.amazon.com/

[accessed 2009]

In synergy with the online bookshop Amazon on 1 WWW site:

Used books are displayed alongside Amazon’s new books.

“the world’s biggest online book bazaar”

Subject description is poor.

Take into account delivery costs: postage + tax

(70)

70

Free public access multi-dealer book databases: examples

• http://www.antiqbook.com/books/ [accessed 2009]

(NOT www.antiqbooks.com)

“ANTIQBOOK unites more than 400 independent booksellers from all over the world. You can use our search pages for a free search of over 3.8 million books, and order them directly from your bookseller. Strong areas in our database are books from European booksellers, many of them specialist antiquarian booksellers.

While ANTIQBOOK takes care that you can order safely from our

booksellers we do not take part in their sales. We just bring you in touch with some of the finest booksellers in the world. You can order your books straight from the source, at their original prices and no hidden costs or markup fees.”

(71)

71

Full-text databases of books:

introduction

Some organisations have scanned the contents of thousands of books,

to make them full-text searchable through the Internet.

(72)

72

Full-text databases of books:

Google Books

• http://www.books.google

Since 2005

(73)

73

Online Public Access Catalogues:

union catalogues of libraries

Some systems offer access to the merged catalogues of several libraries, so-called ‘union catalogues’.

Example:

Copac

http://www.copac.ac.uk/

is accessible free of charge.

Example

(74)

74

Online access databases

about journal articles: overview

Thousands of fee-based online access databases offer bibliographies or full-texts of journal articles in

particular subject domains and published by many publishers.

Many publishers offer searchable bibliographies, but only of their own publications.

(for instance Elsevier, Emerald, Sage)

Only a few large databases offer free access to bibliographies of articles published in journals from many publishers.

(75)

75

Online access databases about journal articles: Ingenta

Available from: http://www.ingentaconnect.com/

Ingenta allows you to search a bibliographic database of millions of journal articles,

including titles, authors, in many cases abstracts.

The organisation claims to be

“The most comprehensive collection of academic and professional publications”

Example

(76)

76

Online access databases about journal articles: Infotrieve ArticleFinder

Available from: http://www.infotrieve.com/

Infotrieve allows you to search free of charge in a bibliographic database of the articles

of more than 20 000 journal titles and conference proceedings,

NOT full-text.

Payment is required to receive the full text of a document.

Example

(77)

77

Online access databases about journal articles: Scirus

The search interface: http://www.scirus.com

This is a specialised Internet index that allows you to search for selected scientific information (only) on the WWW.

This includes the peer-reviewed articles in the journals that are published in ScienceDirect by Elsevier.

Offered free of charge by Elsevier.

An article can be downloaded in full-text format only when a fee has been paid to the publisher.

Example

(78)

78

Online access databases about journal articles: Google Scholar

Google Scholar allows us to search for more scholarly information sources, including journal articles.

A beta (= test) version has been available since November 2004.

The system is accessible starting from the home page of Google as one of the additional services besides the

normal, classical WWW search.

Example

(79)

79

Finding images on the Internet:

introduction

+ Several public access search systems are available free of charge to search for images / pictures (artwork, photos, or both) on the Internet.

+ When searching for images, the search results from such a system offer not only links to the image files on the Internet, but also small versions of the images (so-called “thumbnails”).

(80)

Example 80

Finding images on the Internet:

screen shot of a Google image search

(81)

Example 81

Finding images on the Internet:

examples of search engines

• http://images.google.com/ !

or through http://www.google.com/

[accessed in 2009]

The largest database in this category (at least in 2002…2008).

For each result, not only a thumbnail is offered, but also directly the origin with the readable URL;

this makes it easier to guess the relevance of the document.

(82)

Example 82

Finding images on the Internet:

examples of search engines

• http://images.search.yahoo.com/

[accessed in 2007]

or

http://yahoo.com/ or http://www.yahoo.com and then go to searching ‘Images’

(83)

Example 83

Finding images on the Internet:

examples of search engines

• http://www.ask.com/

[accessed in 2007]

Ask

Offers no indication of the number of images retrieved, which is a disadvantage when many pictures are found but only a few can be seen at a time.

(84)

84

Finding images on the Internet:

examples of search engines

• http://www.bing.com/

Available in 2009 as a beta (= test) version.

Replacing Microsoft Live and Yahoo Search?

Example

(85)

85

Method 2: Federated searching through scattered databases

Federated search engine

Database Database Database

UserUser UserUser

Search engine

Search engine Search engine

(86)

86

Databases accessible over the Internet:

example

Example

• http://WorldWideScience.org/

“A global science gateway connecting you to national and international scientific databases and portals.

Accelerates scientific discovery and progress by providing one-stop searching of global science sources.”

(87)

87

Databases accessible over the Internet: example

• http://www.scitopia.org/scitopia/

Federated searching through various scientific databases

Example

(88)

Examples 88

Meta-search systems on a server computer

http://aftervote.com/

http://draze.com/

http://www.all4one.com

http://www.bytesearch.com

http://clusty.com/

http://www.cyber411.com

http://www.dogpile.com = http://dogpile.com/

http://www.go2net.com = http://www.metacrawler.com

http://jux2.com

http://www.kartoo.com

http://www.mamma.com

http://www.museseek.com

http://www.profusion.com

http://www.search.com

http://www.vivisimo.com = http://vivisimo.com/

(89)

Example 89

Meta-search systems: server-based:

example: Clusty

Adds value by analysing the retrieved results / hits / links / WWW documents, in order to

cluster / group / categorize / classify / map these under headings / classes / categories,

to make further selections by the user / searcher easier and faster.

Can accomplish this on the fly, that is WITHOUT pre-processing the documents before the search.
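To make "clustering on the fly" concrete, here is a deliberately naive Python sketch that groups result titles under headings at query time, using only words that several results share. It is an assumption-laden toy and not the method used by Clusty: the stopword list, the sample titles and the function `cluster_results` are all hypothetical.

```python
from collections import defaultdict

# Small, hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to"}

def cluster_results(query, titles, min_size=2):
    """Group result titles under headings taken from words they share (computed at query time)."""
    query_words = set(query.lower().split())
    groups = defaultdict(list)
    for title in titles:
        for word in sorted(set(title.lower().split())):
            if word not in STOPWORDS and word not in query_words:
                groups[word].append(title)
    # Keep only headings that are shared by several results.
    return {word: hits for word, hits in groups.items() if len(hits) >= min_size}

titles = [
    "Jaguar car dealers",
    "Jaguar car reviews",
    "Jaguar habitat in the Amazon",
    "Amazon jaguar conservation",
]
clusters = cluster_results("jaguar", titles)
for heading, hits in clusters.items():
    print(heading, "->", hits)
```

For the ambiguous query "jaguar" the sketch separates the results about cars from those about the Amazon, without any processing of the documents before the search.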

(90)

Example 90

Meta-search systems: server-based:

example: Clusty screenshot in 2009

(91)

91

Free public access book meta-search systems: types

We can make the following distinction between various types of meta-systems for searching:

1. Database resulting from merging several existing smaller databases = aggregator database

In this case of books:

multi-dealer database = “listing service”

2. Federated search system

= cross-database search system

(92)

92

Search systems for books that are made available by dealers

[Diagram elements: User; Federated book search systems; Multi-dealer databases = merged book dealer databases; Book dealer catalog databases; descriptions of books & real books for sale]

(93)

93

Search systems for books that are made available by dealers

[Diagram elements: User; Federated book search systems; Multi-dealer databases = merged book dealer databases; Book dealer catalogue databases; descriptions of books & real books for sale]

(94)

94

Free public access federated search systems for books: examples

• http://www.addall.com/ [accessed 2009]

Covers many book dealer databases and multi-dealer databases, including unique databases that are not covered by competing search systems.

Can calculate the cost to ship/send a book to you, taking into account your country and currency.

Searches only new books;

to find used books, a companion system should be used.

This is inconvenient if the user is interested in both types of books.

(95)

95

Free public access federated search systems for books: examples

• http://www.bookfinder.com/ [accessed 2007, 2008, 2009]

BookFinder

Covers many book dealer databases and multi-dealer databases, including unique databases that are not covered by competing search systems.

It is efficient that new and used books are searched in 1 action;

the results are presented in 2 columns: new | used.

(96)

96

Online Public Access Catalogues:

simultaneous searching: examples

Example

Simultaneous access to catalogues of libraries related to water, organised by IAMSLIC, using Z39.50

(97)

97

Federated searching

offered by a university library

Main goal of such a system is offering easy and fast access to various information sources and NOT

sophisticated and complicated searching.

The user interface is simple, in agreement with the aim of such a system.

(98)

98

Information access

through information technology

Comparison of methods

for efficient information retrieval

(99)

99

Comparison of methods

for efficient information retrieval

Criterion                          1. Merging databases   2. Federated searching
Presearch analysis of all data              +                      -
Size of the coverage                        -                      +
Independent of Internet / WWW               +                      -
Up-to-date information                      -                      +
Speed of retrieval and display              +                      -

(104)

104

Comparison of methods

for efficient information retrieval

+ The evolution of information and communication technology makes systems more powerful, easier to implement and use, and cheaper:

+ Merging information sources is pushed forward mainly by the decreasing costs of hard disks and of computer

memory in general.

+ Federated searching is pushed mainly by the evolution of the Internet.

(105)

105

Both methods bring

difficulties / challenges / problems

- In many cases there are differences among the merged sources in the formatting/structuring of their database records in fields.

This hinders

- searching limited to a field

- displaying selected fields only (such as title)

- sorting of the displayed records on the contents of a particular selected field (such as author or date)
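One common way to soften this problem is to map the fields of each source onto a shared schema before searching, displaying or sorting. The Python sketch below illustrates the idea under stated assumptions: the two sources, their field names, the records and the function `normalise` are all invented for illustration.

```python
# Hypothetical field names used by two sources, mapped to one common schema.
FIELD_MAPS = {
    "source_a": {"ti": "title", "au": "author", "py": "year"},
    "source_b": {"Title": "title", "Creator": "author", "Date": "year"},
}

def normalise(record, source):
    """Rename source-specific fields to the common schema; unmapped fields are dropped."""
    mapping = FIELD_MAPS[source]
    return {common: record[local] for local, common in mapping.items() if local in record}

records = [
    normalise({"ti": "Library automation", "au": "Rao", "py": 2005}, "source_a"),
    normalise({"Title": "Digital libraries", "Creator": "Jones", "Date": 1999}, "source_b"),
]

# With common field names, sorting on a field and displaying a selected field become possible.
by_year = sorted(records, key=lambda r: r["year"])
titles_only = [r["title"] for r in by_year]
print(titles_only)
```

Such a mapping only works where the sources actually contain comparable fields; a field that is missing in one source stays missing after the mapping.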

(106)

106

Both methods bring

difficulties / challenges / problems

- In many cases there are differences among sources in the metadata schemes that are applied in the databases to improve retrieval, such as

»classifications

»taxonomies

»thesaurus systems

»ontologies

- This hinders the exploitation of the added value of such metadata.

(107)

107

Information access

through information technology

Conclusions

(108)

108

Introduction:

scattering of sources difficulties

(109)

109

Introduction:

scattering of sources difficulties

(110)

110

Methods for efficient information

retrieval:

conclusions

The examples given show at least that progress in this field is impressive.

(111)

111

Questions? Suggestions? Remarks?

(112)

112

You are free to copy, distribute, display this work under the following conditions:

»Attribution:

You must mention the author.

»Noncommercial:

You may not use this work for commercial purposes.

»No Derivative Works:

You may not change, modify, alter, transform, or build upon this work.

For any reuse or distribution, you must make clear to others the license terms of this work.
