fuzzy information retrieval system

Top PDF fuzzy information retrieval system:

Implementation of an efficient Fuzzy Logic based Information Retrieval System

This paper discusses the implementation and efficiency of an IR system that uses fuzzy-logic-based similarity measures. Experiments on the TREC Ohsumed data collection using Apache Lucene show that the proposed measure outperforms the cosine-based similarity measure against which it is compared. According to the authors, the advantage of the new technique over other information retrieval systems is that it handles vague, uncertain, and imprecise user queries well. The insight provided by this model is that fuzzy notions describe situations known through imprecise, uncertain, and vague information in a way that neither replaces nor is replaced by, but rather complements, the views produced by other approaches [14].
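
To make the comparison above concrete, here is a minimal sketch that scores one query against one document with cosine similarity and with a simple fuzzy-set similarity. The excerpt does not specify the paper's fuzzy measure, so `fuzzy_jaccard_similarity` (sum of minimum memberships over sum of maximum memberships) is only an illustrative stand-in, and the term weights are made-up values treated as membership degrees in [0, 1].

```python
import math

def cosine_similarity(q, d):
    """Cosine of the angle between two term-weight vectors (dicts: term -> weight)."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0

def fuzzy_jaccard_similarity(q, d):
    """Illustrative fuzzy-set overlap (an assumption, not the paper's measure):
    sum of min memberships divided by sum of max memberships."""
    terms = set(q) | set(d)
    intersection = sum(min(q.get(t, 0.0), d.get(t, 0.0)) for t in terms)
    union = sum(max(q.get(t, 0.0), d.get(t, 0.0)) for t in terms)
    return intersection / union if union else 0.0

# Hypothetical membership degrees of terms in a query and in a document.
query = {"heart": 1.0, "attack": 0.5}
doc = {"heart": 0.5, "attack": 0.5, "therapy": 1.0}

print(cosine_similarity(query, doc))
print(fuzzy_jaccard_similarity(query, doc))
```

Both functions return 1.0 for identical vectors and 0.0 for vectors with no shared terms; they differ in how partial overlap is scored.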

A fuzzy semantic information retrieval system for transactional applications

For the parameter identification process (identification of the premise and consequence parameters), the space of each input variable is taken in turn and partitioned into fuzzy subsets while the ranges of the other variables are left unpartitioned. Therefore, for the category module, when the ‘programming’ variable is partitioned, the variables ‘general’, ‘graphics’, ‘ai’, and ‘internet’ are not partitioned. Likewise, when the ‘general’ variable is partitioned, the variables ‘programming’, ‘graphics’, ‘ai’, and ‘internet’ are not partitioned. At the end of the identification process for the consequence and premise parameters, a set of rules describing the behaviour of the fuzzy inference system is produced. Looking at the membership functions depicted in Figure 2, the input variable ‘programming’ has seven sets of premises, the variable ‘general’ has five, the variables ‘graphics’ and ‘ai’ have four each, and the variable ‘internet’ has two sets of premises for each fuzzy subset. Hence, there are 7 × 5 × 4 × 4 × 2 = 1120 rules for each input variable. As there are five variables, the total number of rules amounts to 1120 × 5 = 5600. However, using a rule of thumb, or heuristic, concerning the relationship among the variables, it is possible to reduce the number of rules significantly (Zhang et al., 1997). It should be noted that removing a fuzzy subset from the clause of a rule reduces the number of rules by 25. After eliminating irrelevant rules, the total number of rules left in the category module is 540. Similar procedures were carried out for the feature and fsir modules. The feature module has 360 rules, while the fsir module has 720 rules.
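
The rule-count arithmetic above can be checked with a short sketch. The dictionary name `premise_sets` and the variable names are illustrative; the counts are the ones quoted in the text.

```python
from math import prod

# Number of premise sets per input variable of the category module (from the text).
premise_sets = {"programming": 7, "general": 5, "graphics": 4, "ai": 4, "internet": 2}

# Every combination of premise sets yields one rule when a variable is partitioned.
rules_per_variable = prod(premise_sets.values())

# The partitioning is repeated once for each of the five input variables.
total_rules = rules_per_variable * len(premise_sets)

# Rules remaining in the category module after pruning, as reported in the text.
rules_after_pruning = 540
rules_removed = total_rules - rules_after_pruning

print(rules_per_variable, total_rules, rules_removed)
```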

Online Full Text

The problem addressed also includes applying a fuzzy-techniques approach to model a flexible system for accessing information on the WWW. The aim is to design a system that can represent and manage the vagueness and uncertainty characteristic of the process of information searching and retrieval. When specific information is sought, the point-and-click access paradigm is impractical, and the effectiveness of the results depends strongly on the starting page. Defining systems that help users automatically access information relevant to their needs therefore plays an important role [1], [2]. The research aims at defining systems that are tolerant of imprecision and uncertainty in the elicitation of users' performances and able to learn them through interactive and adaptive behaviour. The fuzzy-techniques approach is the basis for defining a flexible system for locating and accessing information on the Web.

Information Retrieval: A Survey - Free Computer, Programming, Mathematics, Technical Books, Lecture Notes and Tutorials

Since Rocchio’s method is here being used for classification rather than query refinement, there is no “initial query” term, as noted above. Dumais et al. also elected to discard the negative examples, i.e., the documents that were not relevant to the given category for which the classifiers were being trained. Since the training set starts with relevance judgments for each category, there are no interactive relevance judgments. Hence, the Rocchio formula for a given category was reduced to computing the centroid (the average) of the documents labeled relevant to the given category. At test time, a new document was judged relevant to a given category if its similarity to the category’s centroid (as measured by the Jaccard similarity measure) exceeded a specified threshold.

Lewis and Gale [SIGIR ’94] use a variation on traditional relevance feedback which they call “uncertainty sampling.” In any situation where the volume of training data is too large for the user to rate all the documents, some sampling method is required. In traditional relevance feedback, the sample the user is asked to classify consists of those documents that the current classifier considers most relevant. Hence, Lewis and Gale call this approach “relevance sampling”. It has the notable virtue, especially if the relevance feedback is taking place while the system is operational, that the documents the user is asked to classify are the ones that (as far as the classifier can tell) he wants to see anyway. However, if the training is taking place before the system is operational (or in a very early stage of operation) and the primary objective is to perfect the classifier, then uncertainty sampling (derived from “results in computational learning theory”) may work better. The method assumes a “classifier that both predicts a class and provides a measurement of how certain that prediction is. Probabilistic, fuzzy, nearest neighbor, and neural classifiers, along with many others, satisfy this criterion or can be easily modified to do so.” The sample documents chosen for the user to rate are those about which the classifier is most uncertain, e.g., most uncertain whether to classify them as relevant or non-relevant. For a probabilistic classifier (such as the one they actually describe and test in their paper), the most uncertain documents are those that are classified with a probability of correct classification close to 0.5. Lewis and Gale obtained substantially better classification for a given sample size when the classifier was trained by uncertainty sampling of the training set than when it was trained by relevance sampling (and far better than with training on a random sample).
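
The two ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not the code of Dumais et al. or of Lewis and Gale: documents are plain lists of non-negative term weights, the Jaccard measure is taken in its weighted form (sum of minima over sum of maxima), the classifier output is assumed to be a probability of relevance per document, and all names, data, and the threshold value are hypothetical.

```python
def centroid(vectors):
    """Average of equal-length document vectors (the reduced Rocchio formula)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def weighted_jaccard(a, b):
    """Jaccard similarity extended to non-negative weights: sum(min) / sum(max)."""
    denominator = sum(max(x, y) for x, y in zip(a, b))
    return sum(min(x, y) for x, y in zip(a, b)) / denominator if denominator else 0.0

def is_relevant(doc, category_centroid, threshold):
    """A document is judged relevant if its similarity to the centroid exceeds the threshold."""
    return weighted_jaccard(doc, category_centroid) > threshold

def relevance_sample(probabilities, k):
    """Relevance sampling: the k documents the classifier considers most relevant."""
    return sorted(probabilities, key=probabilities.get, reverse=True)[:k]

def uncertainty_sample(probabilities, k):
    """Uncertainty sampling: the k documents whose probability is closest to 0.5."""
    return sorted(probabilities, key=lambda doc_id: abs(probabilities[doc_id] - 0.5))[:k]

# Hypothetical training documents labeled relevant to one category.
relevant_docs = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
category_centroid = centroid(relevant_docs)

# Hypothetical classifier outputs: probability that each document is relevant.
probabilities = {"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.45, "d5": 0.80}

print(is_relevant([1.0, 0.5, 0.0], category_centroid, threshold=0.5))
print(relevance_sample(probabilities, 2))
print(uncertainty_sample(probabilities, 2))
```

Relevance sampling asks the user about documents the classifier is already confident in, whereas uncertainty sampling asks about the borderline cases, which is why the two functions differ only in their sort key.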

FAIR: A Fuzzy ART Network Based Scheme for Retrieving Useful Information from Blogs

Following this trend, researchers have paid increasing attention to issues regarding blogs. Todoroki et al. [18] propose using a blog as an electronic research notebook, since a blog system provides a user-friendly interface compatible with web browsers, easy-to-use authoring tools, and full-text retrieval. Chau and Xu [13] present a semi-automated approach to facilitate the monitoring, study, and research of blogs of online hate groups. Lin and Huang [19] indicate that blogs can significantly influence browsers and indirectly promote tourism. Du and Wagner [23] explore blogs’ success factors from a technology perspective. Asano [24] investigated whether a ‘fiction novel’ on blogs describing a girl undergoing epilepsy surgery can foster familiarity with epilepsy surgery among general Internet users in Japan. Most related studies have measured the influence of the blogosphere. However, relatively few papers discuss extracting knowledge from blogs, such as usage mining [25] and structure mining [13], and relatively few works discuss blog content mining.

A Survey on Information Retrieval Techniques and Applications

The standard practice in IR is ad hoc retrieval: the user submits a query and the matching information is retrieved. Matching can be exact, based on Boolean logic, or, as a better alternative, ranked. At present, ranking is done by statistical analysis; however, the author is of the opinion that applying fuzzy logic might provide a better ranking system.
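
The difference between exact Boolean matching and ranked matching can be sketched as follows. The toy documents, the function names, and the use of raw term frequency as the statistical score are all assumptions made for illustration; the survey excerpt does not prescribe a particular ranking formula.

```python
from collections import Counter

# Hypothetical toy collection: document id -> text.
docs = {
    "d1": "fuzzy logic for ranking documents",
    "d2": "boolean retrieval of documents",
    "d3": "fuzzy fuzzy sets and fuzzy ranking",
}

def boolean_and(query, docs):
    """Exact match: ids of documents that contain every query term."""
    terms = query.split()
    return sorted(doc_id for doc_id, text in docs.items()
                  if all(term in text.split() for term in terms))

def ranked(query, docs):
    """Ranked match: documents ordered by summed frequency of the query terms."""
    terms = query.split()
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.split())
        score = sum(counts[term] for term in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

print(boolean_and("fuzzy documents", docs))
print(ranked("fuzzy documents", docs))
```

Boolean matching returns only the documents that satisfy the whole query, with no ordering among them, while ranked matching also returns partial matches and orders them by score.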

Multi-agent System for Documents Retrieval and Evaluation Using Fuzzy Inference Systems

The World Wide Web now holds huge quantities of information, and this growth makes it difficult for users to find the information relevant to them. This paper uses a multi-agent system built from intelligent agents to retrieve documents from the World Wide Web, so that users can easily obtain the relevant documents they need. The multi-agent system is combined with a fuzzy inference system for ranking documents. According to the authors, computing the document ranking score from cosine similarity with a fuzzy inference system is much simpler to develop and implement than the traditional method, which requires mathematical equations.

NLP for Information Retrieval using B Trees

Information retrieval is an emerging technology in the information technology field. Every application needs to store and retrieve application-specific data. The traditional way of doing this is through SQL queries, which requires considerable technical knowledge of SQL tools and of the structure of the relevant database schema. Such tools are hard to use for people without technical knowledge. For this reason, natural language processing (NLP) has evolved rapidly. NLP makes human-computer interaction possible through natural language: an application user can query the database in a human language such as English and get the relevant answer using NLP techniques.

Interface Design for Domain-Specific Image Retrieval: A Pilot Study. A master's paper for the M.S. in L.S. degree.

Even though the results from the pilot study show the advantage of the new design over the current search interface, the application of overview/preview in the prototype is only one possible way in which an online image collection can adopt these design guidelines. Design and testing of previews and overviews for other image retrieval systems with different user groups are needed in order to better answer the research questions. Besides, in the redesign, different approaches have been taken in order to deal with several new issues brought to our attention during the usability test, e.g.

On the implementation of E.R.I.K - Effective Retrieval of Information by Keyword: an information storage and retrieval research system

CONTENTS CHAPTER 1 1.1 1.2 1.3 1.4 1.5 QUERIES CHAPTER 2 2.1 2.2 2.3 OVERVIEW OF EXISTING SEARCH CHAPTER 3 3.1 3.2 OVERVIEW ERIK FUNCTIONAL 3.3 ERIK CONSTRAINTS CHAPTER 4 INTRODUCTION

Studying the History of Ideas Using Topic Models

understanding, computational semantics, WordNet, word sense disambiguation, semantic role labeling, RTE and paraphrase, MUC information extraction, and events/temporal. We then plotted p̂(z ∈ S | y), the sum of the proportions per year for these topics, as shown in Figure 3. The steep decrease in semantics is readily apparent. The last few years have shown a levelling off of the decline, and possibly a revival of this topic; this possibility will need to be confirmed as we add data from 2007 and 2008.
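
The quantity plotted above is, for each year, the sum of the per-year proportions of the topics in a chosen set S. A minimal sketch follows; the function name, the year labels, and every number are invented purely for illustration and are not results from the paper.

```python
def topic_group_share(p_topic_given_year, topic_group):
    """For each year, sum the proportions of the topics that belong to the group."""
    return {
        year: sum(distribution.get(topic, 0.0) for topic in topic_group)
        for year, distribution in p_topic_given_year.items()
    }

# Invented per-year topic proportions (each row sums to 1); not data from the paper.
p_topic_given_year = {
    "year A": {"computational semantics": 0.10, "word sense disambiguation": 0.05, "parsing": 0.85},
    "year B": {"computational semantics": 0.04, "word sense disambiguation": 0.03, "parsing": 0.93},
}

# The set S of topics treated as "semantics" in this toy example.
semantics_topics = {"computational semantics", "word sense disambiguation"}

share = topic_group_share(p_topic_given_year, semantics_topics)
print(share)
```

Plotting `share` against the year would give a curve of the kind the excerpt refers to as Figure 3.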

Information retrieval from image databases

As vast amounts of digital image data are archived by advanced libraries, efficient search methodologies are required to make them accessible according to clients' data requirements. To retrieve such images, it is imperative to recognize their contents. Current technologies for optical character recognition (OCR) and document analysis do not handle such documents adequately because of recognition errors. Because of these challenges, the computer is unable to recognize the characters while reading them. In this paper, we propose an effective word image matching scheme that achieves high performance in the presence of image noise, degradation, and word form variants. First, each image in the image database is pre-processed. Next, a contour-finding method is used to detect blobs, which are then passed to the Tesseract engine. Tesseract segments the characters from the image and stores them in a character database. Each word in the database is used to index a given set of images. During retrieval, the query word presented to the system is matched against the characters in the database, and all images containing instances of the query word are retrieved and presented to the user. Using this approach, our method successfully handles images with different font styles and sizes and with heavily touching characters. Experimental results on a variety of image databases show that the extraction of text from the images is mostly accurate and that position-based indexing of words works perfectly.

A Full Text Retrieval System in a Digital Library Environment

Many factors must be considered when designing the user interface of a piece of software, because the user must be able to interact with the system in a way that lets the system understand whatever input the user gives. The quality of the interface, and of the software in general, must therefore pass usability testing. Usability factors such as fitness for use, ease of learning, task efficiency, ease of remembering, subjective satisfaction, and understandability are all taken into consideration when designing the user interface (Figure 3).

Research Problems in Natural Language Processing – A brief Overview P. Selvaperumal

Information retrieval (IR) is a widely studied problem, but many areas of IR still need to be addressed. Since natural language is highly ambiguous, removing the intrinsic ambiguity in the query forms an inherent part of any information retrieval system. Ambiguity may lie in names (synonymy, polysemy, etc.) or in any other part of a sentence. Cross-lingual information retrieval [6] is another promising area, where the task is to retrieve documents written in languages other than that of the query. To be precise, search engines have to retrieve documents in any language, provided they are relevant to the query. Such search engines are generally regarded as semantic search engines, which retrieve documents that are semantically related to the query. An extension of the traditional information retrieval system is the web information retrieval system, which retrieves relevant web pages for an input query. Research areas in information retrieval include query expansion, index creation and maintenance, information retrieval models, etc.

diehu.pdf

It has been demonstrated that cluster-based information retrieval can help improve retrieval effectiveness (Kang, Na, Kim, & Lee, 2007), and that cluster-based document browsing is more effective than a single merged list (Crestani & Wu, 2006). Crestani and Wu’s 2006 study demonstrates that the cluster hypothesis continues to apply in heterogeneous distributed information retrieval environments, and that creating hierarchical clusters is highly effective for presenting retrieved results in such environments. However, findings on the use of cluster-based IR systems are not always clear-cut. Voorhees (1985) reported that cluster-based retrieval does not produce a full ranking of the document collection and is therefore not amenable to the creation of recall and precision graphs.

Hybrid algorithm for the control of technical objects

Synchronization was lost during the opening and closing of the valves in the air- and gas-supply channels. The concentration of the output flow changed constantly in the chemical purification channel, which led to the activation of the blocking system. The PI-controller model was tested on the example of regulating the speed of a DC motor (valve drive).

AN EXTENDING RECOMMENDATION SYSTEM FOR WEB INFORMATION RETRIEVAL

Abstract: The web is a huge source of information: many internet users visit different web sites and extract the data they require. That is a direct source of information used by the end client. In addition, data generated on the web server of the parked domain is used by the web site administrator for deciding future business trends and planning future services. That essential information is recovered from the web server log files; knowledge extraction from these raw files is also called web usage mining. In this work, web usage mining is investigated and a new data model for web recommendation is reported. To develop the proposed recommender system, the web access log data of user sessions is accessed and classified on a time basis. This kind of analysis shows the user's web browsing behaviour in different time slots. Based on the user behaviour analysis in different time domains, a predictive model, namely the hidden Markov model, is applied to the recovered data. It uses probability estimation techniques to find new navigational web access trends. The proposed data model is implemented in the Visual Studio environment, and the performance of the predictive algorithm is computed. The performance of the implemented system is evaluated in terms of accuracy, memory consumption, error rate, and time consumption. According to the results obtained, the performance of the presented technique improves as the amount of training data increases.

Bibliographic Information Retrieval System using FORTRAN

an attempt has been made to write the programs in as structured a form as possible, so that understanding of the basic philosophy behind the programs and the system is easy and clear and one can translate the programs into other languages without much difficulty. Or to increase the efficiency, one can write the programs in more than one language, and then have the load

An Identity based Information Retrieval System for MANET.

The exponential increase in the number of nodes in a MANET requires proper management; hence the MANET is organized into groups called clusters, each with its own leader, called the cluster head (CH) [22,23]. The cluster head works as a certificate authority [2,3] for its own cluster and manages all operations related to communication, such as information about each cluster node, node mobility, etc. From a security point of view, clustering plays an important role in a MANET. Traditional information retrieval systems share several drawbacks, such as delays in updating information. Securing communication in a MANET is extremely challenging because of the dynamic nature of the network and the lack of centralized management. A distributed certificate authority intended for a cluster-based architecture is discussed in this paper. Certificates are used to authenticate nodes, and the session key plays an important role in secure communication.

Development of Real time Naval Strategic Command and Control Systems Dec65 pdf

System Programs
4.1 Storage and Retrieval of Information
4.1.1 File System
4.1.2 File Organization
4.1.3 Retrieval of Items from the List Structured File
4.1.4 Real Time Updating of the
