Information Retrieval (IR)

Sense Disambiguation in Information Retrieval

Due to the global exchange of information, the availability of online texts has expanded rapidly. Managing such a vast repository of text and giving end users access to it is a considerable challenge. Users expect to get the most appropriate results. Searching is done by search engines, which adopt various methods and algorithms to rank results according to user needs. But what happens when the search string is ambiguous? For example, "check" can refer to the term checkmate, and it can also refer to the verification of something. The task of Information Retrieval (IR) then becomes quite complicated, and the user may not get what he or she actually wants. Hence it is important to resolve the ambiguity so that the user gets accurate results. In this paper we discuss the ambiguity problem faced by search engines and propose an algorithm to resolve it.

A Modern Information Retrieval at a Glance

An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevance. IR refers to extracting information resources, in a predefined automated manner, from an available pool of information resources. Search operations can be formulated on the basis of metadata, full text, or other indexing techniques. The main aim of an IR system (IRS) is to obtain relevant information by comparing the query with the associated and available documents [2].

Effective Information Retrieval System

Data retrieval, in the context of an IR system, consists mainly of determining which documents of a collection contain the keywords in the user query which, most frequently, is not enough to satisfy the user information need. In fact, the user of an IR system is concerned more with retrieving information about a subject than with retrieving data which satisfies a given query. A data retrieval language aims at retrieving all objects which satisfy clearly defined conditions such as those in a regular expression or in a relational algebra expression. Thus, for a data retrieval system, a single erroneous object among a thousand retrieved objects means total failure. For an information retrieval system, however, the retrieved objects might be inaccurate and small errors are likely to go unnoticed. The main reason for this difference is that information retrieval usually deals with natural language text which is not always well structured and could be semantically ambiguous. On the other hand, a data retrieval system (such as a relational database) deals with data that has a well defined structure and semantics. One may want to criticise this dichotomy on the grounds that the boundary between the two is a vague one.

Transfer learning for information retrieval

Ranking has laid the foundations of many fields, for example, Information Retrieval (IR) and Recommender Systems, as well as Question Answering (QA). For IR applications like search engines, the ranking system looks to return a permutation of documents ordered by their relevance to an information request, expressed in queries, submitted to the system. However, the relevance of a document to an information need is not straightforwardly expressed in the document. Instead, various ranking models, which include BM25 [1] and language models [2, 3], have been developed to predict relevance via a set of signals extracted from both the document and the query. However, it has repeatedly been demonstrated that the ranking effectiveness of ranking models varies across different test collections [4–6]. The majority of existing ranking models have been developed based on empirical studies and require parameter tuning for a specific corpus. Recent research on Learning to Rank (L2R) [7] has made significant strides towards training ranking models via machine learning techniques. Note that L2R is not learning to optimise the parameters of existing models, but to train a ranking model that can achieve an optimised ranking function for a specific task.
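
To make the idea of a ranking model with corpus-dependent parameters concrete, the following is a minimal sketch of a common textbook form of Okapi BM25. It is not taken from the paper: the function name `bm25_scores`, the toy documents, and the default values `k1=1.2` and `b=0.75` are illustrative assumptions, and the smoothed IDF used here is only one of several variants in circulation.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document against the query with a BM25 variant.

    k1 and b are the tunable parameters; the defaults are assumptions,
    not values recommended by the source.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy collection, already tokenised.
docs = [
    "information retrieval ranks documents".split(),
    "recommender systems rank items".split(),
    "retrieval of documents by relevance to a query".split(),
]
query = "documents retrieval".split()
scores = bm25_scores(query, docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Changing `k1` or `b` changes how strongly term frequency and document length affect the score, which is one reason such models need tuning per corpus.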

Survey: Temporal Information Retrieval

Information retrieval is defined as “process of providing most relevant documents to the users from an existing collection”. Users request data in the form of a query, typically a short piece of text. In recent years, time has been acquiring increasing importance within search contexts, giving rise to a new research area known as temporal information retrieval (TIR) that contains a number of different challenges and has attracted the interest of many researchers. Its aim is to improve the effectiveness of information retrieval methods by exploiting temporal information in documents and queries. TIR aims to fulfil search needs by merging the traditional notion of document relevance with temporal relevance. For example, users may request documents that contain past information (e.g., information about historical figures); documents with the newest, most up-to-date information (e.g., information about weather forecasts or currency rates); or even future-related information (e.g., information about planned events to be held in a certain area).

Cross-lingual Information Retrieval

The area of information access has evolved to include many sophisticated tasks such as information retrieval, question answering, summarization, multimedia information retrieval, text mining, text clustering and web information retrieval. Information retrieval (IR) is “the act of finding materials, usually documents of an unstructured form that satisfies an information need within large collections stored in computers” [1]. These tasks are not restricted to documents in one language but extend to multiple languages. Classical IR normally regards documents in a foreign language as unwanted “noise” [2]. The need to handle multiple languages introduces a new area of IR that takes into account all documents regardless of the language used, covering cross-lingual and multilingual aspects. Whilst in classical IR search engines the query language and the language of the retrieved documents are the same, in a cross-lingual IR system they can be different. In an enhanced version of cross-lingual IR, where the retrieved documents are in multiple languages, many problems can arise in implementation. This paper focuses on these challenges and on current approaches to overcoming them.

The Study of Information Retrieval

The meaning of the term information retrieval can be very broad. Just getting a credit card out of your wallet so that you can type in the card number is a form of information retrieval. An information retrieval system is a system through which end users extract information from the WWW. However, as an academic field of study, information retrieval might be defined as follows: information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).

NOVEL APPROACH FOR INFORMATION RETRIEVAL

Nowadays, all information available on the World Wide Web is present in digital form. With the gradual increase in the amount of information on the World Wide Web, there is a need for much more efficient techniques for Web search. Sometimes traditional keyword matching and standard statistical techniques are insufficient to retrieve the more relevant Web pages. Users expect the required information to be retrieved in response to a simple, short query issued through a generic web search over huge heaps of information, but it is very difficult to retrieve relevant information according to user preferences. So, we need to enhance the power of web search to retrieve relevant information. Different information retrieval (IR) techniques are available. This paper is an attempt to present different methods for retrieving the most relevant information, satisfying the user's need, from such a huge collection. Thus it presents the combination of various methods for information retrieval.

Teaching and learning in information retrieval

Focusing on the first (specialised) perspective, the use of complete IR systems to teach the IR process, some tools are used in the classroom. One example is the IR Toolbox (Efthimiadis & Freier, 2007), “an experiential teaching tool for learning about information retrieval systems”. The student can learn the whole IR process (document analysis, indexing, searching and evaluation), without having to program, at different levels of complexity, and the tool contains individual and group exercises. A second example is IR Base (Calado et al., 2007). This is an object-oriented toolkit whose aim is the “integration of components, documentation and services, focused on the rapid development of prototypes for research and teaching”. In this research, students are presented with a wide range of existing classes that show how models could be implemented. The knowledge gained is useful for implementing IR models and for performing experiments with standard test collections. An earlier, similar tool was an object-oriented IR platform, produced with the aim of providing core functionality to develop new models and algorithms (Wade & Braeckevelt, 1994). The learning outcomes in these last two examples are to be able to build a search engine from a set of classes with basic functions, and to develop new modules that meet new requirements (for example, new retrieval methods, indexing techniques, etc.).

Effective Information Retrieval System

And so it is, but it is a useful one in that it illustrates the range of complexity associated with each mode of retrieval. Let us now take each item in the table in turn and look at it more closely. In data retrieval we are normally looking for an exact match, that is, we are checking to see whether an item is or is not present in the file. In information retrieval this may sometimes be of interest but more generally we want to find those items which partially match the request and then select from those a few of the best matching ones. The inference used in data retrieval is of the simple deductive kind, that is, aRb and bRc then aRc. In information retrieval it is far more common to use inductive inference; relations are only specified with a degree of certainty or uncertainty and hence our confidence in the inference is variable. This distinction leads one to describe data retrieval as deterministic but information retrieval as probabilistic. Frequently Bayes' Theorem is invoked to carry out inferences in IR, but in DR probabilities do not enter into the processing.
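
The contrast between exact match and partial (best) match can be sketched in a few lines. This is an illustration only: the function names `exact_match` and `best_match`, the overlap-based score, and the toy documents are assumptions introduced here, not part of the source text.

```python
def exact_match(query_terms, docs):
    """Data-retrieval style: a document either satisfies the condition or not.

    Returns the indices of documents that contain every query term.
    """
    q = set(query_terms)
    return [i for i, d in enumerate(docs) if q <= set(d)]

def best_match(query_terms, docs, top_k=2):
    """IR style: keep partially matching documents and return the best few.

    The score is the fraction of query terms found in the document,
    a deliberately crude stand-in for a real relevance estimate.
    """
    q = set(query_terms)
    scored = [(len(q & set(d)) / len(q), i) for i, d in enumerate(docs)]
    scored = [(s, i) for s, i in scored if s > 0]
    # Highest score first; ties broken by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, s) for s, i in scored[:top_k]]

# Hypothetical toy collection, already tokenised.
docs = [
    "probabilistic models for information retrieval".split(),
    "relational database query processing".split(),
    "information systems".split(),
]
query = "probabilistic information retrieval evaluation".split()

exact_hits = exact_match(query, docs)   # no document contains all four terms
ranked_hits = best_match(query, docs)   # partial matches, best first
```

Here the exact-match function returns nothing because no document contains every term, while the best-match function still surfaces the closest documents with a graded score.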

Information Retrieval on the Internet

The main components of a search engine are the Web crawler, which has the task of collecting webpages, and the Information Retrieval system, which has the task of retrieving text documents that answer a user query. In this chapter we present approaches to Web crawling, Information Retrieval models, and methods used to evaluate retrieval performance. Practical considerations include information about existing IR systems and a detailed example of a large-scale search engine (Google), including the idea of ranking webpages by their importance (the Hubs and Authorities algorithm, and Google’s PageRank algorithm). Then we discuss the Invisible Web, the part of the Web that is not indexed by search engines. We briefly present other types of IR systems: digital libraries, multimedia retrieval systems (music, video, etc.), and distributed IR systems. We conclude with a discussion of the Semantic Web and future trends in visualizing search results and inputting queries in natural language.
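
The idea of ranking webpages by importance can be illustrated with a simplified power-iteration version of PageRank. This is a textbook-style sketch, not the chapter's own presentation or Google's production algorithm: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the four-page link graph are all assumptions. Every link target is assumed to appear as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Pages without outlinks spread their rank evenly over all pages.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a baseline share from the random-jump term.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: distribute its rank uniformly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: D links to C, but nothing links to D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
most_important = max(ranks, key=ranks.get)
```

In this toy graph, page C collects links from three pages and ends up with the highest score, while D, which has no incoming links, keeps only the baseline share.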

Learning representations for Information Retrieval

The pioneering work by van Rijsbergen (2004) officially formalized the idea that Quantum Theory (QT) could be seen as a “formal language that can be used to describe the objects and processes in information retrieval”. The idea of QT as a framework for manipulating vector spaces and probability is appealing. However, the methods that stem from this initial intuition provided only limited evidence about the usefulness and effectiveness of the framework for IR tasks. For example, Piwowarski et al. (2010) test if acceptable performance for ad-hoc tasks can be achieved with a quantum approach to IR. The authors represent documents as subspaces and queries as density operators. However, both document and query representations are estimated through passage-retrieval-like heuristics, i.e. a document is divided into passages and is associated with a subspace spanned by the vectors corresponding to document passages. Different representations for the query density matrix are tested but none of them led to good retrieval performance. Subsequently, a number of works took inspiration from quantum phenomena in order to relax some common assumptions in IR (Zhao et al., 2011; Zuccon et al., 2010). Zuccon et al. (2010) introduce interference effects into the Probability Ranking Principle (PRP) in order to rank interdependent documents. Although this method achieves good results, it does not make principled use of the quantum probability space and cannot be considered as evidence towards the usefulness of the enlarged probabilistic space. In general, these methods made heuristic use of the concepts of the theory and no clear probabilistic interpretation can be given.

INFORMATION RETRIEVAL

I should like to acknowledge my considerable debt to many people and institutions that have helped me. Let me say first that they are responsible for many of the ideas in this book but that only I wish to be held responsible. My greatest debt is to Karen Sparck Jones who taught me to research information retrieval as an experimental science. Nick Jardine and Robin Sibson taught me about the theory of automatic classification. Cyril Cleverdon is responsible for forcing me to think about evaluation. Mike Keen helped by providing data. Gerry Salton has influenced my thinking about IR considerably, mainly through his published work. Ken Moody had the knack of bailing me out when the going was rough and encouraging me to continue experimenting. Juliet Gundry is responsible for making the text more readable and clear. Bruce Croft, who read the final draft, made many useful comments. Ness Barry takes all the credit for preparing the manuscript. Finally, I am grateful to the Office of Scientific and Technical Information for funding most of the early experimental work on which the book is based; to the Kings College Research Centre for providing me with an environment in which I could think, and to the Department of Information Science at Monash University for providing me with the facilities for writing.

Interactive information retrieval

These are the main sources of materials on IIR, the ones I have used primarily for this chapter, but most conferences in the wide areas of IR, information science, librarianship, HCI, and the Web, as well as other less obvious places, such as conferences on social computing, will include occasional papers reflecting the pervasive nature of information access. There is no single monograph dealing solely with IIR although there are a number of dedicated monographs or collections of edited works addressing related areas. Numerous “how-to” books on optimizing end-user searching strategies and awareness (e.g., Hill, 2004) indicate the need for user support in searching. Hearst’s (2000) chapter in Modern Information Retrieval is still worth reading. The Turn, by Ingwersen and Järvelin (2005), serves as a companion to Ingwersen’s (1992) earlier work, which set out to provide a cognitive account of interactive information seeking. Other contributions teach us about information seeking and behavior, which, in turn, help specify the role of IIR and define the broader context in which these systems are used. Examples include the two recent collections edited by Spink and Cole on human information behavior (Spink & Cole, 2005b) and cognitive information retrieval (Spink & Cole, 2005a). Cognitive information retrieval, in this context, is focused on the human’s role in information retrieval.

Inferencing in Information Retrieval

Inferencing in Information Retrieval. Alexa T. McCray, National Library of Medicine, Bethesda, Marylan[.]

Image based Information Retrieval

Image processing is the manipulation of images for purposes such as image quality improvement, noise reduction, object detection, object recognition and content-based image retrieval. With the growth of digitalization, the use of digital images has grown rapidly in fields such as forensic investigation and medical image analysis. There is yet another field that can find image processing a useful tool for recognition and information retrieval purposes. This research paper uses Content Based Image Retrieval (CBIR) as a core concept for similar feature matching.

On Sanskrit and Information Retrieval

submitted as query to its inflected forms and to alternate spellings of these forms, with the help of a morphological generator. This expansion process does not cover phonetic transformations that result from the application of sandhi, so that a number of results are typically missed. Nevertheless, the recall of the system is very high, on par with the DCS word retrieval facilities. The SARIT corpus also makes use of an information retrieval system, the most interesting feature of which is the support of document attributes. Its indexing strategy is not described anywhere, but we can reasonably assume, by looking at the website documentation and at search results, that the unit of indexing is a cluster. Searching for a string in such a way that all its occurrences are returned thus requires adding wildcards on each side of it, as in *mukha* ‘face,’ for instance. This somewhat defeats the purpose of using an information retrieval architecture, if only for efficiency reasons. Indeed, searching for a query string with a leading wildcard typically involves a full traversal of the terms dictionary, followed by a costly merge operation.
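
The efficiency point about leading wildcards can be illustrated with a sorted terms dictionary. This is a minimal sketch under stated assumptions: a plain sorted Python list stands in for the terms dictionary, the function names `prefix_terms` and `infix_terms` are hypothetical, and the small word list is a toy example rather than data from any of the systems discussed.

```python
import bisect

def prefix_terms(sorted_terms, prefix):
    """Terms starting with `prefix` (query of the form prefix*).

    A binary search jumps to the first candidate, and matching terms
    are contiguous, so only a small slice of the dictionary is read.
    """
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches

def infix_terms(sorted_terms, infix):
    """Terms containing `infix` anywhere (query of the form *infix*).

    The sort order does not help with a leading wildcard, so every
    term in the dictionary has to be inspected.
    """
    return [term for term in sorted_terms if infix in term]

# Toy terms dictionary, kept sorted as an index typically would.
terms = sorted(["mukha", "mukham", "mukhena", "sumukha", "abhimukha", "deva"])

with_prefix = prefix_terms(terms, "mukha")
with_infix = infix_terms(terms, "mukha")
```

After the full scan, the postings of every matching term would still have to be merged into one result list, which is the second cost the passage mentions.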

Survey Paper on Information Retrieval Algorithms and Personalized Information Retrieval Concept

The paper targets traditional and advanced algorithms that are in general use and under active research. Information Retrieval Systems are used in every field, and new algorithms are being developed to personalize retrieval. Different approaches are being researched to improve the performance and efficiency of Information Retrieval Systems. The goal of the paper is to discuss these algorithms in detail. The paper explains and compares the algorithms and their limitations, and explains why there is a need to focus on personalized information retrieval systems: retrieving information is a day-to-day activity, and making it accurate and precise is what needs to be done.

Private Stateful Information Retrieval

In this section, we define the notion of Private Stateful Information Retrieval (PSIR). Roughly speaking, PSIR is an extension of the classical notion of a single-server PIR [35] in which the client keeps a state between queries. Before any query can be issued, the client initializes its state by executing the Init protocol with the server. As in PIR, the query process of a PSIR consists of a single client-to-server message followed by a single server-to-client message. The client will use the server’s response and their current state to recover the record desired. For the sake of clarity, we split this last step into two distinct parts: an Extract algorithm executed by the client to extract the record queried from the server’s reply, and an UpdateState algorithm jointly executed by the client and server to update the client’s state. In our construction, the UpdateState protocol consumes part of the client’s state and, if needed, re-executes the Init protocol. We stress that the server of a PSIR does not have a state just like in the original notion of a single-server PIR.

Methods for Distributed Information Retrieval

Published methods for distributed information retrieval generally rely on cooperation from search servers. But most real servers, particularly the tens of thousands available on the Web, are not engineered for such cooperation. This means that the majority of methods proposed, and evaluated in simulated environments of homogeneous cooperating servers, are never applied in practice.
