Solving Mining Aspects Problem by Non Linear Regression Technique and FIRS


Volume 1, Issue 3, September 2012

Abstract


Data mining extracts knowledge from large, potentially infinite volumes of stream data while maintaining data quality under limited disk and memory capacity. In a traditional transaction environment, frequent-item mining is not feasible because it requires determining which items are frequent in a continuously arriving stream. Regression techniques for stream data include the Frequent Item Prediction Method (FIPM) and Frequent Temporal Pattern Data Stream (FTP-DS). FIPM is a linear regression method applied to one-dimensional stream data, and FTP-DS is a linear regression method applied to two-dimensional stream data. In this work we analyze the proposed FIPM and FTP-DS against the existing methods. To reduce the error in FIPM and FTP-DS, we propose a nonlinear regression method for stream data in place of linear regression. This reduces the errors of the proposed FIPM and FTP-DS in comparison with the existing FIPM and FTP-DS.

Recently, the application of soft computing to data mining and its aspects has drawn the attention of researchers. For any aspect of data mining, such as web mining or text mining, information retrieval becomes an important concern. Since obtaining precise data and removing uncertainty remain a great challenge, the objective of this work is to develop a new and practical system based on fuzzy logic, the proposed system “FIRS” [1]. The experimental results suggest that different methods may outperform the others in different situations.

Keywords: FIPM, FTP-DS, Data Mining, Web Mining, FIRS

I. INTRODUCTION

In the last few decades, the continuing advancement of modern technology has brought about a revolution in science and engineering. Data mining is the task of discovering interesting patterns from large amounts of data, where the data can be stored in databases, data warehouses, or other information repositories. Data mining functionalities include the discovery of concept/class descriptions, associations and correlations, classification, prediction, clustering, trend analysis, outlier and deviation analysis, and similarity analysis. In regression techniques, prediction is made on the basis of a predefined mathematical model, usually by linear regression, nonlinear regression, or multivariable linear regression. Data mining has several aspects, such as stream data mining, web mining, text mining, information mining, and multimedia mining.

One aspect of data mining is stream data mining. Stream data is generated continuously in a dynamic environment, with huge volume, infinite flow, and fast-changing behavior. Mining stream data is the process of extracting knowledge structures from large, continuous, rapidly changing data records. Stream data is continuous data in which variables occur repeatedly at different times; examples of data streams include computer network traffic, phone conversations, ATM transactions, web searches, and sensor data. In a traditional transaction environment, frequent-item mining is not feasible because it requires determining which items are frequent in a continuously arriving stream. Regression techniques for stream data include the Frequent Item Prediction Method (FIPM) and Frequent Temporal Pattern Data Stream (FTP-DS). FIPM is a linear regression method applied to one-dimensional stream data, and FTP-DS is a linear regression method applied to two-dimensional stream data.

Data mining refers to the extraction of useful information from a large set of data. It is a technique for discovering patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. Soft computing, on the other hand, deals with information processing. If these two key properties can be combined in a constructive way, the combination can be used effectively for knowledge discovery in large databases. The main constituents of soft computing, at this juncture, include fuzzy logic, neural networks, genetic algorithms, and rough sets. For any aspect of data mining, such as web mining or text mining, information retrieval becomes an important concern.

The main problems in information retrieval and data mining lie in the large scale of databases, especially when dealing with text or web resources, and in the heterogeneity of the data, which may be numerical or symbolic, precise or imprecise, ambiguous, approximate, incomplete, or uncertain because of the poor reliability of sources or the difficulty of measurement and observation. Fuzzy logic is therefore a suitable soft computing tool for removing uncertainty in data mining and its related aspects.

The rest of this paper is organized as follows. Section II describes the characteristics of data mining techniques and its related aspects. Section III introduces stream data together with FIPM and FTP-DS. Section IV covers, in detail, the proposed nonlinear regression method for FIPM and FTP-DS. Section V presents the soft computing approach to data mining aspects, a new model, FIRS, for handling imprecision and uncertainty in user databases and queries. Section VI gives the conclusion and the future scope of research on data mining and its aspects in soft computing.

II. DATA MINING ASPECTS AND TECHNIQUES

The term Knowledge Discovery in Databases, or KDD for short, refers to the broad process of finding knowledge in data, and emphasizes the "high-level" application of particular data mining methods. It is of interest to researchers in pattern recognition, machine learning, statistics, artificial intelligence, knowledge acquisition for expert systems, and data visualization. The unifying goal of the KDD process is to extract knowledge from data in the context of large databases.

Fig.1. KDD Process

A. TECHNIQUES

Data mining refers to extracting or “mining” knowledge from large amounts of data. The term is actually a misnomer. Data mining is used in many organizations and offices to extract data from large data warehouses, and it has a broad scope. Typically, a data mining algorithm constitutes some combination of the following three components.

• The model: The function of the model (e.g., classification, clustering) and its representational form (e.g., linear discriminants, neural networks). A model contains parameters that are to be determined from the data.

• The preference criterion: A basis for preference of one model or set of parameters over another, depending on the given data. The criterion is usually some form of goodness-of-fit function of the model to the data, perhaps tempered by a smoothing term to avoid overfitting, that is, generating a model with too many degrees of freedom to be constrained by the given data.

• The search algorithm: The specification of an algorithm for finding particular models and parameters, given the data, model(s), and a preference criterion.

A particular data mining algorithm is usually an instantiation of the model/preference/search components. The more common model functions or techniques in current data mining practice include the following.

1) Classification: classifies a data item into one of several predefined categorical classes.

2) Regression: maps a data item to a real valued prediction variable.

3) Clustering: maps a data item into one of several clusters, where clusters are natural groupings of data items based on similarity metrics or probability density models.

4) Rule generation: extracts classification rules from the data.

5) Discovering association rules: describes association relationships among different attributes.

6) Summarization: provides a compact description for a subset of data.

7) Sequence analysis: models sequential patterns, as in time-series analysis. The goal is to model the states of the process generating the sequence or to extract and report deviations and trends over time.

Fig.2. Data Mining Techniques

B. ASPECTS OF DATA MINING

The different aspects or categories of data mining are as follows.

1) Web Mining:

Web mining [11] refers to the use of data mining techniques to automatically retrieve, extract and evaluate (generalize/analyze) information for knowledge discovery from Web documents and services.

Three main axes of Web mining have been identified, according to the Web data used as input in the data mining process, namely Web structure, Web content and Web usage mining.

2) Text Mining:

Text mining, also known as knowledge discovery from text or document information mining, refers to the process of extracting interesting patterns from very large text corpora for the purpose of discovering knowledge. It is an interdisciplinary field involving Information Retrieval, Text Understanding, Information Extraction, Clustering, Categorization, Topic Tracking, Concept Linkage, Computational Linguistics, Visualization, Database Technology, Machine Learning, and Data Mining.

3) Information Mining:

Information mining [12] is the non-trivial process of identifying valid, novel, potentially useful, and understandable patterns in heterogeneous information sources. The term information is meant to indicate two things. First, it points out that the heterogeneous sources to be mined can already provide information, understood as expert background knowledge, textual descriptions, images, sounds, and so on, and not only raw data. Second, it emphasizes that the results must be comprehensible ("must provide a user with information"), so that a user can check their plausibility and gain insight into the domain the data comes from.

4) Stream data mining:

The foundations on which stream data mining solutions rely come from the fields of statistics, complexity, and computational theory [44]. The online nature of data streams and their potentially high arrival rates impose high resource requirements on data stream processing systems. To deal with resource constraints gracefully, many data summarization techniques have been adopted from statistics. They provide means to examine only a subset of the whole dataset, or to transform the data vertically or horizontally into an approximate, smaller representation so that known data mining techniques can be used. Techniques from computational theory have also been applied to achieve time- and space-efficient solutions.
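As one illustration of examining only a subset of a stream under a fixed memory budget, the sketch below uses reservoir sampling. This is a standard sampling technique that the paper does not name; it is offered only as an assumed example of a summarization step, and the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Illustrative sketch only: reservoir sampling is an assumed example of a
    stream summarization technique, not a method taken from the paper.
    """
    rng = rng or random.Random()
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a stored item with probability k / (index + 1).
            j = rng.randint(0, index)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Usage: sample 5 readings from a simulated stream of 10,000 readings.
sample = reservoir_sample(range(10_000), 5, random.Random(42))
```

Only `k` items are held in memory at any time, regardless of how long the stream runs.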

III. CONCEPT OF STREAM DATA WITH FIPM AND FTP-DS

Stream data is generated continuously in a dynamic environment, with huge volume, infinite flow, and fast-changing behavior. Mining stream data is the process of extracting knowledge structures from large, continuous, rapidly changing data records. Stream data is continuous data in which variables occur repeatedly at different times. Examples of data streams include computer network traffic, phone conversations, ATM transactions, web searches, and sensor data. In a traditional transaction environment, frequent-item mining is not feasible because it requires determining which items are frequent in a continuously arriving stream.

A. OUTLINES OF FIPM FOR STREAM DATA

Chai, Eun Hee Kim, and Long Jin [8] proposed FIPM (Frequent Item Prediction Method) to predict frequent items using a linear regression model, which can serve as a prediction model for stream data. Their algorithm predicts frequent items from one-dimensional stream data using a simple linear regression model. Stream data is continuous and complex in time, so it can only be accessed temporarily. It also has sequential characteristics and can therefore be treated as time series data. Prediction on time series data estimates future values through the analysis of past data. In FIPM, the one-dimensional stream data is first preprocessed to establish a simple linear regression model. Once the regression model is generated, the likelihood of frequent items is predicted from it. During preprocessing, the stream data is reorganized by the time at which each one-dimensional data item was input: we collect the input times of identical items and calculate the differences between the times at which each item occurred. At a later stage, pairs are formed from the time differences obtained in this reorganization.
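The preprocessing described above can be sketched as follows. This is a minimal illustration, not the original implementation: the function names are hypothetical, the stream is assumed to arrive as `(time, item)` tuples, and pairing each interval with the interval that follows it is only one assumed reading of the pairing step, which the text does not spell out.

```python
from collections import defaultdict


def arrival_times_by_item(stream):
    """Group the input times of identical items (the reorganization step)."""
    times = defaultdict(list)
    for time, item in stream:
        times[item].append(time)
    return dict(times)


def inter_arrival_intervals(times):
    """Differences between consecutive times at which one item occurred."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def interval_pairs(intervals):
    """Pair each interval (x) with the next one (y).

    Assumption: this pairing rule is a guess at the intended (x, y) layout.
    """
    return list(zip(intervals, intervals[1:]))


# Usage with a small made-up stream of (time, item) records.
stream = [(1, "a"), (2, "b"), (4, "a"), (5, "c"), (9, "a"), (10, "b"), (11, "a")]
times = arrival_times_by_item(stream)
gaps = inter_arrival_intervals(times["a"])   # intervals between occurrences of "a"
pairs = interval_pairs(gaps)                 # candidate (x, y) regression inputs
```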

B. STEPS OF FIPM

In the regression method for FIPM, we use the values of the dependent and independent variables to calculate the coefficients and substitute them into the linear regression model. The steps of FIPM are as follows:

Step 1: Calculate the coefficients $b_0$ and $b_1$ of the FIPM regression model.

Step 2: Fit the regression model of FIPM:

$$y_i = b_0 + b_1 x_i + \varepsilon_i$$

Step 3: Calculate the error.
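The three steps above can be sketched with ordinary least squares. The paper does not state which error measure is used in Step 3, so mean squared error is assumed here; the function names and the sample values are hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Return (b0, b1) for y = b0 + b1 * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx                 # Step 1: slope
    b0 = y_mean - b1 * x_mean      # Step 1: intercept
    return b0, b1


def mean_squared_error(xs, ys, b0, b1):
    """Step 3 (assumed metric): average squared residual of the fitted line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Usage with made-up interval pairs (x = previous interval, y = next interval).
xs = [3, 5, 2, 4, 6]
ys = [5, 2, 4, 6, 3]
b0, b1 = fit_simple_linear(xs, ys)          # Steps 1 and 2
error = mean_squared_error(xs, ys, b0, b1)  # Step 3
```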

C. FREQUENT TEMPORAL PATTERN DATA STREAM (FTP-DS)

Wei-Guang Teng, Feng Zhao, and Qing-Hua A Li [13], [14] introduced the FTP-DS (Frequent Temporal Pattern Data Stream) algorithm. FTP-DS uses linear regression to perform trend detection. The method is effective, but it omits some exceptions, and these can be pivotal enough to cause failure. As a regression-based algorithm, FTP-DS has proved effective for mining frequent temporal patterns in data streams. It has two major features: one data scan for online statistics collection, and regression-based compact pattern representation. To achieve the one-scan feature, data segmentation and pattern growth scenarios are explored for frequency counting.

D. FEATURES OF FTP-DS

The following are the features of FTP-DS:

1) Algorithm FTP-DS scans online transaction flows and generates candidate frequent patterns in real time.

2) The other important feature of FTP-DS is its regression-based compact pattern representation. Specifically, to meet the space constraint, patterns are represented in a compact ATF (Accumulated Time and Frequency) form that aggregates all the information required for regression analysis.

3) FTP-DS uses segmentation tuning and segment relaxation to enhance the regression technique.

4) FTP-DS performs trend detection very effectively in comparison with other linear regression models.

5) FTP-DS is based on two-dimensional stream data.

IV. PROPOSED NONLINEAR REGRESSION METHOD FOR FIPM AND FTPDS

The existing FIPM applies linear regression to the frequent stream data, which yields a prediction graph with many errors. In the proposed method, we apply nonlinear regression to the frequent stream data with the same preprocessing as the existing FIPM.

A. EXISTING FIPM

The existing FIPM consists of the following steps:

Step 1: Organize the data by grouping occurrences of the same variable.

Step 2: Calculate the time intervals between occurrences of each variable.

Step 3: Make pairs using the intervals.

Step 4: Arrange these pairs as dependent and independent variables.

Step 5: Apply the linear regression model to the data.

B. PROPOSED METHOD

The proposed FIPM replaces the linear regression of the existing FIPM with nonlinear regression on the frequent stream data, while keeping the same preprocessing. The proposed FIPM consists of the following steps:

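Step 1: Apply the same preprocessing to the frequent stream data as in the existing FIPM.

Step 2: Given the inputs $(x_i, y_i)$, $i = 1, 2, \dots, n$, calculate $Y_1$, $Y_2$, $Y_3$, $X_1$, $X_2$, $X_3$, and $X_4$:

$$Y_1 = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad Y_2 = \frac{1}{n}\sum_{i=1}^{n} x_i y_i, \qquad Y_3 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 y_i$$

$$X_1 = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad X_2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2, \qquad X_3 = \frac{1}{n}\sum_{i=1}^{n} x_i^3, \qquad X_4 = \frac{1}{n}\sum_{i=1}^{n} x_i^4$$

Step 3: Use $Y_1$, $Y_2$, $Y_3$, $X_1$, $X_2$, $X_3$, and $X_4$ to calculate the coefficients $b_0$, $b_1$, and $b_2$ of the nonlinear regression model:

$$b_2 = \frac{(Y_2 - X_1 Y_1)(X_3 - X_1 X_2) - (Y_3 - X_2 Y_1)(X_2 - X_1^2)}{(X_3 - X_1 X_2)^2 - (X_4 - X_2^2)(X_2 - X_1^2)}$$

$$b_1 = \frac{(Y_2 - X_1 Y_1) - b_2 (X_3 - X_1 X_2)}{X_2 - X_1^2}$$

$$b_0 = Y_1 - b_1 X_1 - b_2 X_2$$

The nonlinear model is

$$y_i = b_0 + b_1 x_i + b_2 x_i^2, \qquad i = 1, 2, \dots, n.$$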
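A minimal sketch of Steps 2 and 3 follows, computing the quadratic coefficients directly from the averaged sums. The second numerator term uses $Y_3 - X_2 Y_1$, which is what the least-squares normal equations for $y = b_0 + b_1 x + b_2 x^2$ give after eliminating $b_0$. The function name and the intermediate variable names are hypothetical.

```python
def fit_quadratic_by_moments(xs, ys):
    """Return (b0, b1, b2) for y = b0 + b1*x + b2*x^2 from averaged sums."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")

    # Step 2: averaged sums Y1..Y3 and X1..X4.
    y1 = sum(ys) / n
    y2 = sum(x * y for x, y in zip(xs, ys)) / n
    y3 = sum(x * x * y for x, y in zip(xs, ys)) / n
    x1 = sum(xs) / n
    x2 = sum(x ** 2 for x in xs) / n
    x3 = sum(x ** 3 for x in xs) / n
    x4 = sum(x ** 4 for x in xs) / n

    # Centered terms that appear in the coefficient formulas.
    sxx = x2 - x1 ** 2        # X2 - X1^2
    sxx2 = x3 - x1 * x2       # X3 - X1*X2
    sx2x2 = x4 - x2 ** 2      # X4 - X2^2
    sxy = y2 - x1 * y1        # Y2 - X1*Y1
    sx2y = y3 - x2 * y1       # Y3 - X2*Y1

    denominator = sxx2 ** 2 - sx2x2 * sxx
    if abs(denominator) < 1e-12:
        raise ValueError("need at least three distinct x values")

    # Step 3: coefficients of the quadratic model.
    b2 = (sxy * sxx2 - sx2y * sxx) / denominator
    b1 = (sxy - b2 * sxx2) / sxx
    b0 = y1 - b1 * x1 - b2 * x2
    return b0, b1, b2


def predict_quadratic(x, b0, b1, b2):
    """Evaluate the fitted model at x."""
    return b0 + b1 * x + b2 * x * x


# Usage: points lying exactly on y = 1 + x + x^2.
b0, b1, b2 = fit_quadratic_by_moments([0, 1, 2, 3], [1, 3, 7, 13])
```

For these four points the fit recovers coefficients very close to $b_0 = b_1 = b_2 = 1$, up to floating-point rounding.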

C. OUTLINES OF PROPOSED FTPDS

FTP-DS (Frequent Temporal Pattern Data Stream) is a linear regression method for two-dimensional frequent stream data. The method preprocesses the stream data to obtain the dependent and independent variables for regression analysis. Stream data consists of characters, symbols, or items that appear repeatedly in a sequential file. The data is organized according to the times and ids at which items appear. For example, suppose the sequence <ab> appears at ids 2, 6, and 9 in the sliding window from time 0 to 3, out of a total of 15 ids. Support is the number of ids at which the sequence appears divided by the total number of ids, so here the support is 3/15 = 0.2. In the proposed FTP-DS we apply nonlinear regression to the stream data in place of linear regression.

The algorithm is used for mining frequent data items. It offers a simple way to predict frequent items with a regression model for continuously arriving two-dimensional stream data. The method first preprocesses the two-dimensional stream data and transforms it into sampled values for regression, and then applies linear regression to the organized data. FTP-DS has two major features: one data scan for online statistics collection, and regression-based compact pattern representation. It can not only conduct mining with variable time intervals but also perform trend detection effectively.

D. EXISTING FTPDS

The existing FTPDS consists of the following steps:

Step 1: Identify the ids at which a particular stream data sequence is present in each specified sliding window.

Step 2: Calculate the support of the sequence for every sliding window using the following formula:

$$\text{support} = \frac{\text{number of ids at which the stream data sequence is present}}{\text{total number of ids in the stream sequence}}$$

Step 3: Take the support of each sliding window as the dependent variable and the ending time of the sliding window as the independent variable.

Step 4: Apply linear regression to the dependent and independent variables and calculate the predicted support.
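Steps 1 to 3 above can be sketched as follows. The input layout, a list of `(window_end_time, ids_where_sequence_is_present)` entries, and the function names are assumptions made for illustration; the first window reuses the <ab> example with ids 2, 6, and 9 out of 15, and the remaining windows are made-up values.

```python
def window_support(ids_with_sequence, total_ids):
    """Support = ids at which the sequence is present / total number of ids."""
    if total_ids <= 0:
        raise ValueError("total_ids must be positive")
    return len(set(ids_with_sequence)) / total_ids


def regression_points(windows, total_ids):
    """Build (x, y) points: x = window ending time, y = support in that window."""
    return [(end_time, window_support(ids, total_ids)) for end_time, ids in windows]


# Usage: the first window matches the <ab> example (ids 2, 6, 9 of 15 ids);
# the later windows are hypothetical.
windows = [
    (3, [2, 6, 9]),
    (6, [2, 6, 9, 11]),
    (9, [1, 2, 6, 9, 11, 14]),
]
points = regression_points(windows, total_ids=15)
xs = [x for x, _ in points]   # independent variable: window ending times
ys = [y for _, y in points]   # dependent variable: supports
```

The resulting `xs` and `ys` are the inputs that Step 4 passes to the regression.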

E. PROPOSED FTPDS USING NONLINEAR REGRESSION

In the proposed FTPDS, nonlinear regression is applied to the frequent stream data, and the predicted supports and errors for each stream data sequence are calculated with the nonlinear regression technique. The proposed FTPDS consists of the same steps as the proposed FIPM.

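Step 1: Apply the same preprocessing to the stream data as in the existing FTPDS.

Step 2: Given the inputs $(x_i, y_i)$, $i = 1, 2, \dots, n$, calculate $Y_1$, $Y_2$, $Y_3$, $X_1$, $X_2$, $X_3$, and $X_4$:

$$Y_1 = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad Y_2 = \frac{1}{n}\sum_{i=1}^{n} x_i y_i, \qquad Y_3 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 y_i$$

$$X_1 = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad X_2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2, \qquad X_3 = \frac{1}{n}\sum_{i=1}^{n} x_i^3, \qquad X_4 = \frac{1}{n}\sum_{i=1}^{n} x_i^4$$

Step 3: Use $Y_1$, $Y_2$, $Y_3$, $X_1$, $X_2$, $X_3$, and $X_4$ to calculate the coefficients $b_0$, $b_1$, and $b_2$ of the nonlinear regression model:

$$b_2 = \frac{(Y_2 - X_1 Y_1)(X_3 - X_1 X_2) - (Y_3 - X_2 Y_1)(X_2 - X_1^2)}{(X_3 - X_1 X_2)^2 - (X_4 - X_2^2)(X_2 - X_1^2)}$$

$$b_1 = \frac{(Y_2 - X_1 Y_1) - b_2 (X_3 - X_1 X_2)}{X_2 - X_1^2}$$

$$b_0 = Y_1 - b_1 X_1 - b_2 X_2$$

The nonlinear model is

$$y_i = b_0 + b_1 x_i + b_2 x_i^2, \qquad i = 1, 2, \dots, n.$$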
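To illustrate how the errors of the linear and nonlinear fits might be compared on a support series, the sketch below fits both models by solving the least-squares normal equations and reports the sum of squared errors of each. The support values are made up for illustration and are not experimental results from this work; the function names are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    m = degree + 1
    # Augmented matrix: row r is sum(x^(r+c)) for each c, then sum(x^r * y).
    rows = [
        [sum(x ** (r + c) for x in xs) for c in range(m)]
        + [sum((x ** r) * y for x, y in zip(xs, ys))]
        for r in range(m)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few distinct x values")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, m):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, m + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(rows[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (rows[r][m] - tail) / rows[r][r]
    return coeffs


def sum_squared_error(xs, ys, coeffs):
    """Sum of squared differences between observed and fitted values."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = sum(b * x ** k for k, b in enumerate(coeffs))
        total += (y - predicted) ** 2
    return total


# Hypothetical supports per sliding window (x = window ending time).
window_ends = [3, 6, 9, 12, 15]
supports = [0.20, 0.27, 0.40, 0.47, 0.67]

linear = fit_polynomial(window_ends, supports, 1)
quadratic = fit_polynomial(window_ends, supports, 2)
sse_linear = sum_squared_error(window_ends, supports, linear)
sse_quadratic = sum_squared_error(window_ends, supports, quadratic)
```

Because the linear model is the special case $b_2 = 0$ of the quadratic model, the quadratic fit cannot have a larger error on the fitted points. This in-sample comparison does not by itself show how either model predicts future windows.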

V. FIRS FOR MINING ASPECTS

Recently, the application of soft computing to data mining and its aspects has drawn the attention of researchers. For any aspect of data mining, such as web mining or text mining, information retrieval becomes an important concern. The main problems in information retrieval and data mining lie in the large scale of databases, especially when dealing with text or web resources, and in the heterogeneity of the data, which may be numerical or symbolic, precise or imprecise, ambiguous, approximate, incomplete, or uncertain because of the poor reliability of sources or the difficulty of measurement and observation. Fuzzy logic is very useful here because of its ability to represent miscellaneous data in a synthetic way, its robustness to changes in the parameters of the user environment, and its expressiveness. Here we propose a soft computing tool, FIRS, for data mining. It handles issues related to incomplete or imprecise data and queries, and takes a fuzzy approach to weighted queries in order to fulfill user information needs.

A. FUZZY INFORMATION RETRIEVAL SYSTEM (FIRS) FOR INFORMATION RETRIEVAL AND DATA MINING

Fuzzy IRSs (FIRSs) [2] are IRSs that use fuzzy tools to improve retrieval activities. I focus on fuzzy IR models that use fuzzy weighted queries to improve the representation of user information needs and fuzzy connectives to process such queries.

As it is pointed out in [23], existing training IR systems present several shortcomings, e.g., they do not give feedback about the performance or success of user queries, it is not possible to observe how a user query is evaluated, and it is not possible to compare the performance of different types of user queries and different evaluation procedures of user queries.

In this thesis I introduce a software tool that gives users a chance to acquire the complex skills that FIRSs based on weighted queries require. It is a Web-based, computer-supported learning application whose goal is to provide an environment for demonstrating the performance of fuzzy queries and their evaluation using different fuzzy connectives. It offers users the opportunity to see and compare the results achieved by different weighted queries. Users can choose different semantics (threshold, relative importance, ideal importance) [24,25] to formulate weighted queries, different fuzzy connectives (maximum, minimum, OWA operators, Induced OWA operators) [26,27] to evaluate these queries, and different expression domains (numerical or linguistic) [24,28] to assess the weights associated with queries.
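The fuzzy connectives named above can be sketched as follows. This is only an illustration of minimum, maximum, and OWA aggregation over the degrees to which a document matches each query term; it does not model the weighting semantics or the linguistic domain, and the function names and the sample degrees are hypothetical rather than taken from FIRS.

```python
def fuzzy_and(degrees):
    """Minimum connective: every query term must be matched."""
    return min(degrees)


def fuzzy_or(degrees):
    """Maximum connective: the best-matching query term decides."""
    return max(degrees)


def owa(degrees, weights):
    """Ordered weighted averaging (OWA) of matching degrees.

    The weights apply to positions after sorting the degrees in descending
    order, not to particular query terms.
    """
    if len(degrees) != len(weights):
        raise ValueError("degrees and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(degrees, reverse=True)
    return sum(w * d for w, d in zip(weights, ordered))


# Usage: made-up degrees to which one document matches three query terms.
degrees = [0.8, 0.3, 0.6]
strict = fuzzy_and(degrees)                  # behaves like a fuzzy AND
lenient = fuzzy_or(degrees)                  # behaves like a fuzzy OR
in_between = owa(degrees, [0.5, 0.3, 0.2])   # between the two extremes
```

With weights `[1, 0, 0]` the OWA operator reduces to the maximum, and with `[0, 0, 1]` it reduces to the minimum, so the weight vector moves the evaluation between OR-like and AND-like behavior.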

Fig.5. Fuzzy Information Retrieval System (FIRS)

Fig.6. Query Evaluation

VI. CONCLUSION

FIPM and FTP-DS are both algorithms for mining stream data, and both are based on linear regression. Our analysis finds that FIPM and FTP-DS with linear regression give predictions with high error. To reduce these errors, we proposed FIPM and FTP-DS using nonlinear regression, which gives improved results. The proposed FIPM, however, is less successful at reducing prediction errors, because its preprocessing does not consider every time at which the stream data appears, whereas FTP-DS considers each and every time for a given sequence by using sliding windows. The proposed FTP-DS gives predictions with lower errors than the existing FIPM and FTP-DS. The proposed FIRS is used to remove uncertainty in user information needs; it also provides an evaluation tree for comparing queries and gives feedback on each query. Future work will combine neuro-fuzzy applications with data mining and its aspects.

ACKNOWLEDGEMENTS

For this research work I acknowledge the role of my parent institution, the referenced research work and the valuable work of its authors, including any I may have failed to cite, and my friends.


REFERENCES

[1] Chai, Eun Hee Kim and Long Jin: Prediction of Frequent Items to One Dimensional Stream Data; Fifth International Conference on Computational Science and Applications; pp. 353-360, 2001.

[2] Divya Jyoti Shrivastav, Waseem Ahmad: Optimization of Information Retrieval by Advance Indexing Models and FIRS; International Journal of Advanced Research in Computer Engineering & Technology, Volume 1, Issue 3, May 2012.

[3] U. Fayyad and R. Uthurusamy, “Data mining and knowledge discovery in databases,” Commun. ACM, vol. 39, pp. 24–27, 1996.

[4] W. H. Inmon, “The data warehouse and data mining,” Commun. ACM, vol. 39, pp. 49–50, 1996.

[5] Zhengxin Chen: An integrated architecture for OLAP in Data Mining, volume 2, pp. 114-136, Stevenage, 2002.

[6] R. Hayward: A Basic Approach to Linear Regression; RWJ Clinical Scholars Program; pp. 1-3, University of Michigan, 2005.

[7] A. Arning, R. Agrawal, and P. Raghavan: A linear method for deviation detection in large databases. Data Mining and Knowledge Discovery; Proc. International Conf; Portland, Oregon, Aug. 1996.

[8] Chai, Eun Hee Kim and Long Jin: Prediction of Frequent Items to One Dimensional Stream Data; Fifth International Conference on Computational Science and Applications; pp. 353-360, 2001.

[9] R. Hayward: A Basic Approach to Linear Regression; RWJ Clinical Scholars Program; pp. 1-3, University of Michigan, 2005.

[10] A. Arning, R. Agrawal, and P. Raghavan: A linear method for deviation detection in large databases. Data Mining and Knowledge Discovery; Proc. International Conf; Portland, Oregon, Aug. 1996.

[11] Sankar K. Pal, Varun Talwar, and Pabitra Mitra: Web Mining in Soft Computing Framework: Relevance, State of the Art and Future Directions; IEEE Transaction on Neural N/w, Vol. 13, No. 5, Sep 2002.

[12] Rudolf Kruse, Detlef Nauck, and Christian Borgelt: Data Mining with Fuzzy Methods: Status and Perspective; Otto-von-Guericke-University of Magdeburg.

[13] Wei-Guang Teng, Ming-Syan Chen, Philip S. Yu, “A Regression-Based Temporal Pattern Mining Scheme for Data Stream”, Proceedings of the 29th International Conference on Very Large Data Base, Berlin, pp. 607-617, August 2003.

[14] Feng Zhao, Qing-Hua A Li: A Plane Regression-Based Sequence Forecast Algorithms for Stream Data; Proc. of the Fourth International Conference on Machine Learning and Cybernetics; pp. 1559-1562, Guangzhou, 18-21 August, 2005.

[15] Baeza-Yates R and Ribeiro-Neto B. (1999) Modern Information Retrieval. Addison-Wesley.

[16] P. Piatetsky-Shapiro and W. J. Frawley, Eds., Knowledge Discovery in Databases. Menlo Park, CA: AAAI/MIT Press, 1991.

[17] W. Pedrycz, “Conditional fuzzy c-means,” Pattern Recognition Lett., vol. 17, pp. 625–632, 1996.

[18] W. Pedrycz, “Fuzzy set technology in knowledge discovery,” Fuzzy Sets Syst., vol. 98, pp. 279–290, 1998.

[19] O. Nasraoui, R. Krishnapuram, and A. Joshi, “Relational clustering based on a new robust estimator with application to web mining,” in Proc. NAFIPS 99, New York, June 1999, pp. 705–709.

[20] R. Yager, “A framework for linguistic and hierarchical queries for document retrieval,” in Soft Computing in Information Retrieval: Techniques and Applications, F. Crestani and G. Pasi, Eds. Heidelberg: Physica-Verlag, 2000, vol. 50, pp. 3–20.

[21] T. Gedeon and L. Koczy, “A model of intelligent information retrieval using fuzzy tolerance relations based on hierarchical co-occurrence of words,” in Soft Computing in Information Retrieval: Techniques and Applications, F. Crestani and G. Pasi, Eds. Heidelberg, Germany: Physica-Verlag, 2000, vol. 50, pp. 48–74.

[22] G. Pasi and G. Bordogna, “Application of fuzzy set theory to extend boolean information retrieval,” in Soft Computing in Information Retrieval: Techniques and Applications, F. Crestani and G. Pasi, Eds. Heidelberg, Germany: Physica-Verlag, 2000, vol. 50, pp. 21–47.

[23] Halttunen K and Sormunen E. (2000) Learning information retrieval through an educational game. Is Gaming sufficient for learning?. Education for Information, 18(4), pp. 289-311.

[24] Herrera-Viedma E. (2001) Modeling the retrieval process for an information retrieval system using an ordinal fuzzy linguistic approach. Journal of the American Society for Information Science and Technology, 52(6), pp. 460-475.

[25] Kraft D H, Bordogna G and Pasi G. (1994) An Extended Fuzzy Linguistic Approach to Generalize Boolean Information Retrieval. Information Sciences, 2, pp. 119-134.

[26] Yager R R. (1988) On ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18, pp. 183-190.

[27] Yager R R and Filev D P. (1999) Induced ordered weighted averaging operators. IEEE Transaction on Systems, Man and Cybernetics, 29, pp. 141-150.

[28] Kraft D H and Buell D A. (1983) Fuzzy sets and generalized Boolean retrieval systems. International Journal of Man-Machine Studies, 19, pp. 45-56.

[29] Baeza-Yates R and Ribeiro-Neto B. (1999) Modern Information Retrieval. Addison-Wesley.

[30] Salton G and McGill M H. (1984) Introduction to modern information retrieval. New York: McGraw-Hill.

Divya Jyoti Shrivastav received the B.Tech. degree in computer science from B.I.T, Muzaffarnagar, India, in 2007.

Currently, she is a Senior Lecturer at N.I.E.T, Gr. Noida. Her research interests are in the areas of data mining and knowledge discovery, pattern recognition, learning theory, and soft computing. She has published three papers in national conferences.