Abstract: Traditional search engines retrieve both relevant and irrelevant data, which wastes the user’s time. Traditional ranking techniques operate at the level of keyword matching and do not consider the semantics behind the user’s query. Semantic page ranking aims to overcome these drawbacks. In this paper we analyze various semantic page ranking techniques and present a comparative survey of them with respect to factors they have in common.
Cloud computing presents an architecture that delivers computing services over the internet on demand, paid per use, from a shared pool of resources such as networks, storage, servers, services and applications, without the user having to acquire them. Cloud computing reduces management cost and time for organizations. Many industries, such as banking, healthcare and education, are shifting to the cloud because of the effectiveness of services delivered under the pay-per-use concept, where charges depend on the resources used, such as processing power utilized, transactions performed, bandwidth consumed, data transmitted or storage space occupied. Cloud computing is considered a technology that relies completely on the internet, where client data is saved and kept in the data center of a cloud supplier. The goal of this paper is to propose, implement and evaluate different methods for the allocation, scheduling and ranking of workflow tasks. These techniques were simulated, and the results were analyzed to find the best technique in terms of efficiency and performance in reducing completion time and cost.
Image search re-ranking refines search results by using the visual characteristics of images to reorder the initial search results. The retrieved results may include noisy images, which decreases the efficiency of image search, so the results have to be re-ranked. This reordering improves the user’s search experience in both accuracy and response time. Current image search re-ranking has two important steps, feature extraction and the ranking function, and it increases image retrieval performance. The design of the ranking function is the main challenge in image search re-ranking, so it has become an active area of research. Approaches can be classified into classification-based methods, learning-to-rank-based methods and graph-based methods. A classification-based method first trains a classifier on training data obtained from the initial search results and then reorders the results by their relevance scores. These methods treat ranking as a classification problem, so their performance is poor compared with the other techniques. Graph-based methods are implemented from a Bayesian perspective or as a random walk. In the random-walk formulation, the nodes represent the initial search results, and the stationary probability of the random walk is used to compute the final re-ranking scores. The main limitation of this method is that graph construction and ranking computation are expensive, which limits its application to large data sets. The learning-to-rank method utilizes two popular learning-to-rank approaches with a content-aware ranking model, so that both textual and visual information are used in the ranking learning process. Its main limitation is that it requires more training data to train a model, and it is not practical for re-ranking real images.
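The random-walk formulation described above can be sketched as follows. This is a minimal illustration, not the implementation of any surveyed method: the function name `random_walk_rerank`, the damping factor of 0.85, the restart distribution built from the initial scores, and the toy similarity matrix are all assumptions made for this example. Visual similarities between results act as edge weights, and the stationary probabilities of the walk become the re-ranking scores.

```python
def random_walk_rerank(similarity, initial_scores, damping=0.85, iterations=100):
    """Re-rank results by the stationary probability of a random walk.

    similarity[i][j] is an assumed visual similarity between results i and j;
    initial_scores are assumed positive scores from the initial (text) search.
    """
    n = len(similarity)
    # Row-normalize similarities into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([v / total if total > 0 else 1.0 / n for v in row])
    # The walk restarts according to the normalized initial scores.
    total_init = sum(initial_scores)
    restart = [s / total_init for s in initial_scores]
    prob = restart[:]
    # Power iteration towards the stationary distribution.
    for _ in range(iterations):
        prob = [
            damping * sum(prob[i] * transition[i][j] for i in range(n))
            + (1 - damping) * restart[j]
            for j in range(n)
        ]
    order = sorted(range(n), key=lambda j: prob[j], reverse=True)
    return order, prob


# Toy example: results 0 and 1 look alike, result 2 is a visual outlier
# that the initial search nevertheless scored highest.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
order, prob = random_walk_rerank(similarity, [1.0, 1.0, 1.2])
print(order, prob)
```

In this toy case the outlier drops to the bottom of the list because little probability mass flows to it from the other nodes. The dense matrix and the inner sum over all nodes also show why the text calls graph construction and ranking computation expensive: each iteration costs on the order of n² operations.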
We randomly divided the ten documents in each collection into a training set and a testing set, each consisting of 5 documents. The upper limit of the summary size was set to 250 words. The widely used summary evaluation tool ‘ROUGE’ (Lin, 2004) was used to compare the results with several other summarizers. Our competitors consist of summarization methods that competed in the TAC 2011 conference (UBSummarizer and UoEssex), a widely used summarizer that is integrated with Microsoft Word (AutoSummarize) and the recently proposed Association Mixture Text Summarization (AMTS) (Gross et al., 2014). The test results for Recall, Precision and F-measure using the ROUGE-2 and ROUGE-SU4 summary evaluation techniques are shown in Table 1 and Table 2 respectively.
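To make the reported measures concrete, the following is a simplified sketch of how ROUGE-2 Recall, Precision and F-measure can be computed from bigram overlap. It is not the ROUGE toolkit itself: it assumes a single reference summary, lowercased whitespace tokenization, and no stemming or stop-word handling, and the function names `bigrams` and `rouge_2` are chosen only for this illustration.

```python
from collections import Counter


def bigrams(text):
    """Count the bigrams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2(candidate, reference):
    """Return (recall, precision, f_measure) from clipped bigram overlap."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count of each shared bigram.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    if recall + precision == 0:
        return recall, precision, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


# Three of the five bigrams on each side are shared.
print(rouge_2("the cat sat on the mat", "the cat lay on the mat"))
```

ROUGE-SU4 follows the same pattern but counts unigrams and skip-bigrams with a gap of up to four words instead of adjacent bigrams.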
Abstract - The World Wide Web consists of billions of web pages, and a huge amount of information is available within them. To retrieve the required information from the World Wide Web, search engines perform a number of tasks based on their respective architectures. When a user submits a query to a search engine, it generally returns a large number of pages in response. To order the search results according to their importance and relevance to the user’s query, various ranking algorithms are applied to the search results. This paper gives a detailed comparison and analysis of three ranking algorithms: Text Based Ranking, PageRank (Google’s algorithm) and the Users Rank algorithm.
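As an illustration of link-based ranking, the sketch below computes PageRank scores by power iteration on a small hypothetical link graph. The damping factor of 0.85, the fixed iteration count, the uniform handling of pages without outgoing links, and the page names are assumptions of this example, not details taken from the paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    links maps each page to the list of distinct pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same small "random jump" share.
        new_rank = {p: (1 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked from A, B and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

Unlike text-based ranking, the score here depends only on the link structure and not on the query, which is why such scores are typically combined with a relevance measure when ordering search results.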
ABSTRACT: As cloud computing becomes prevalent, sensitive information is increasingly being centralized in the cloud. To protect data privacy, sensitive data has to be encrypted before outsourcing, which makes effective data utilization a very challenging task. Traditional searchable encryption schemes allow users to securely search over encrypted data through keywords, but they do not capture any relevance of the data files. Ranked search greatly enhances system usability by returning results ranked by relevance instead of sending undifferentiated results, and it further ensures file retrieval accuracy. The statistical measure approach from information retrieval, i.e. the relevance score, is used to build a secure searchable index. In the proposed system, stemming and synonym clustering are applied in the score calculation of each keyword in the file collection, which increases the score value and helps to retrieve more relevant files.
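The effect of stemming and synonym clustering on the relevance score can be sketched on plaintext as follows. This is only an illustration under stated assumptions: encryption and the secure index are omitted entirely, the suffix-stripping `stem` function is a toy stand-in for a real stemmer, the `SYNONYMS` table is hypothetical, and the TF-IDF-style formula is one common formulation rather than the one confirmed by the paper.

```python
import math
from collections import Counter

# Hypothetical synonym clusters: every member maps to one representative term.
SYNONYMS = {"car": "automobile", "auto": "automobile"}


def stem(word):
    """Toy suffix stripper (assumption); a real system would use a proper stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(word):
    """Lowercase, stem, then replace the stem by its synonym cluster representative."""
    stemmed = stem(word.lower())
    return SYNONYMS.get(stemmed, stemmed)


def build_index(files):
    """Map each file name to the counts of its normalized terms."""
    return {name: Counter(normalize(w) for w in text.split())
            for name, text in files.items()}


def ranked_search(keyword, index):
    """Return (file, score) pairs for files containing the keyword, best first."""
    term = normalize(keyword)
    matches = [name for name, counts in index.items() if counts[term] > 0]
    scores = {}
    for name in matches:
        counts = index[name]
        tf = 1 + math.log(counts[term])               # term frequency part
        idf = math.log(1 + len(index) / len(matches))  # rarity across files
        scores[name] = tf * idf / sum(counts.values())  # normalize by file length
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


files = {
    "a.txt": "cars car auto engine",
    "b.txt": "automobile engine repair manual",
    "c.txt": "cooking recipes",
}
print(ranked_search("cars", build_index(files)))
```

Because "cars", "car" and "auto" all collapse to the same term, the query matches `b.txt` even though it never contains the literal keyword, and the term frequency in `a.txt` rises from one to three, which raises its score.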
Kaufmann used an ME classifier to detect parallel sentences between any language pair with a small amount of training data. Other tools developed to automatically extract parallel data from non-parallel corpora use language-specific techniques or require large amounts of training data. The results showed that ME classifiers can generate useful results for almost any language pair, which can allow the formation of parallel corpora for many new languages.
Information security means protecting information and systems from security threats such as unauthorized access, use, disclosure, disruption, modification or destruction of information. Information security breaches are growing in frequency and are common among most organizations. The internet connection is increasingly cited as a frequent point of attack, and the likely sources of attacks are independent hackers and disgruntled employees. Despite the existence of firewalls and intrusion detection systems, network administrators must decide how to protect systems from malicious attacks and inadvertent cascading failures. Effective management of information security requires understanding the processes of discovery and exploitation used in attacks. An attack is the act of exploiting a vulnerability, that is, a weakness or a problem in software (a bug in the source code or a flaw in the design). Software exploits follow a few patterns; one example is buffer overflow. An attack pattern is defined as a “blueprint for creating a kind of attack” (Hoglund & McGraw, 2004, p. 26). Buffer overflow attacks follow several standard patterns, but they may differ in timing, resources used, techniques and so forth.
Both routing techniques were simulated in the same environment using Network Simulator (ns-2). Both AODV and DSDV were tested with TCP traffic. The algorithms were tested using 50 nodes. The simulation area is 1000 m by 1000 m, within which node locations change randomly. The number of connections used at a time is 30. Node speed varies from 1 m/s to 50 m/s. Using CBR traffic, we calculate the performance of these two protocols.
also be useful for short-range, multi-hop communication (instead of long-range communication) to conserve energy. We expect most sensor networks to be dedicated to a single application or a few collaborative applications; thus, rather than node-level fairness, we focus on maximizing system-wide application performance. Techniques such as data aggregation can reduce traffic, while collaborative signal processing can reduce traffic and improve sensing quality. With in-network processing, data will be processed as whole messages at a time in store-and-forward fashion, so packet or made-level interleaving from multiple sources only increases overall latency.
the many user-item subgroups each consisting of a subset of items and a group of like-minded users on these items. It was more natural to make preference predictions for a user via the correlated subgroups than via the entire user-item matrix. To find meaningful subgroups, they formulated the Multiclass Co-Clustering (MCoC) problem and proposed an effective solution to it. They then proposed a unified framework that extends traditional CF algorithms by utilizing the subgroup information to improve their top-N recommendation performance. Their approach can be seen as an extension of traditional clustering CF models. Systematic experiments on three real-world data sets, MovieLens-100K, MovieLens-1M and Lastfm, demonstrated the effectiveness of the proposed approach. They ran many state-of-the-art recommendation methods and checked whether their top-N recommendation performance improved after using the framework. The experimental results showed that using subgroups was a promising way to further improve the top-N recommendation performance of many popular CF methods. Gediminas Adomavicius et al. introduced and explored a number of item ranking techniques that can generate recommendations with substantially higher aggregate diversity across all users while maintaining comparable levels of recommendation accuracy. A comprehensive empirical evaluation consistently showed the diversity gains of the proposed techniques using several real-world rating datasets and different rating prediction algorithms. They conducted experiments on three datasets: MovieLens (data file available at grouplens.org), Netflix (data file available at netflixprize.com), and Yahoo! Movies (individual ratings collected from movie pages
In recent years, data mining techniques applied to network traffic data have provided a potential solution that helps develop better intrusion detection systems. Feature selection is an important pre-processing tool in data mining that helps increase the performance of classification models. The purpose of this paper is twofold: first, to select the most informative features using an ensemble feature ranking technique, in which a feature’s importance or score is determined from multiple feature ranking techniques that are combined to generate three ranking lists (fusion, selection and hybrid); second, to introduce a hybrid classifier model of Simplified Swarm Optimization (SSO) with Ant Colony Optimization (ACO) to maximize the classification accuracy. The purpose of this hybridization is to improve the performance of SSO for mining the intrusion patterns of network traffic.
As mentioned before, feature ranking techniques provide a ranked list of features. Different feature ranking techniques may produce different rankings according to their specific criteria for assessing features, and there is no universal ranking algorithm that considers all the measures. Therefore, motivated by ensemble methods in supervised learning, rank aggregation methods are proposed to combine different feature ranking methods and achieve more stable ranked feature lists with similar or even higher classification performance [13, 14]. To perform ensemble feature selection, one needs to decide on the method used to aggregate the results from the different ranking methods. There are many rank aggregation approaches, from very simple ones to more complex ones. To the best of our knowledge, no studies have been done on feature selection based on rank aggregation methods in the sleep stage classification area.
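One of the simple rank aggregation approaches is the Borda count, sketched below. It is shown here only as a representative example and is not claimed to be the method used in this work; the function name `borda_aggregate` and the feature names are hypothetical. Each ranking awards a feature more points the higher it places it, and the features are then ordered by their total points.

```python
def borda_aggregate(rankings):
    """Combine several ranked feature lists (best first) with the Borda count.

    Every ranking is assumed to contain the same set of features.
    """
    features = set(rankings[0])
    if any(set(r) != features or len(r) != len(features) for r in rankings):
        raise ValueError("all rankings must order the same set of features")
    n = len(features)
    points = {f: 0 for f in features}
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            # The top feature gets n points, the last one gets 1 point.
            points[feature] += n - position
    # Sort by total points; ties are broken alphabetically for determinism.
    return sorted(points, key=lambda f: (-points[f], f))


# Three hypothetical feature rankers that disagree on the order.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f4", "f3"],
    ["f1", "f3", "f2", "f4"],
]
print(borda_aggregate(rankings))
```

Because the aggregated position of a feature depends on all rankers at once, a single ranker that misplaces a feature has limited influence, which is the stability argument made above.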
that are relevant to discriminate between classes under imbalanced data conditions. The order of the algorithm for BFE-SVM with balanced loss is exactly the same as that of RFE-SVM, and it may lead to better results by defining an adequate loss function for classification performance with imbalanced data sets. Advantages are: The proposed approaches outperform other feature ranking techniques in terms of predictive performance for different SVM-based feature selection techniques, achieving particularly good results on highly imbalanced data sets, based on their ability to identify irrelevant variables using the classifier and to minimize the number of errors in the minority class, which is assumed to have a higher cost. The proposed methods allow for the explicit incorporation of misclassification costs in the assessment of each attribute’s contribution, leading to a feature selection process especially designed for a particular application. The strategies are very flexible and allow the use of different kernel functions for nonlinear feature selection and classification using SVM. Furthermore, they can also be generalized to classification methods other than SVM. Disadvantages are: There is a need for intelligent oversampling in extreme cases of class imbalance and overlap, in which no adequate classifier can be found, since embedded and wrapper feature selection strongly depend on the classification method.