M. Graña Romay et al. (Eds.): HAIS 2010, Part I, LNAI 6076, pp. 170–177, 2010. © Springer-Verlag Berlin Heidelberg 2010
Data Processing on Database Management Systems
with Fuzzy Query
İrfan Şimşek1
and Vedat Topuz2 1
Msc. Sultançiftliği Primary School, Çekmeköy, 34788, Istanbul, Turkey Ph.: (+90) 216 312 13 81; Fax: (+90) 216 429 29 10
2
Asst. Prof. Dr. Marmara University Vocational School of Technical Sciences, Göztepe, 34722, Istanbul, Turkey
Ph.: (+90) 216 418 25 04; Fax: (+90) 216 418 25 05
Abstract. In this study, a fuzzy query tool (SQLf) for non-fuzzy database man-agement systems was developed. In addition, samples of fuzzy queries were made by using real data with the tool developed in this study. Performance of SQLf was tested with the data about the Marmara University students' food grant. The food grant data were collected in MySQL database by using a form which had been filled on the web. The students filled a form on the web to de-scribe their social and economical conditions for the food grant request. This form consists of questions which have fuzzy and crisp answers. The main pur-pose of this fuzzy query is to determine the students who deserve the grant. The SQLf easily found the eligible students for the grant through predefined fuzzy values. The fuzzy query tool (SQLf) could be used easilywith other database system like ORACLE and SQL server.
Keywords: Fuzzy logic, fuzzy query, database.
1 Introduction
Database management systems have made a significant progress in terms of function-ality and performance since they were first designed in the 1960s. However, the query systems of relational database management systems, which are widespread today, are based on two-value logic. In this logic, an entry either meets the criteria or not. After the querying criteria, it creates sets whose boundaries are certain. This is in contradic-tion with our natural thinking method, because we are unable to differentiate some objects in our daily lives in such a certain way. For example, a person does not sud-denly become short or tall because of a couple of millimeters difference [1-6]. In order to define these situations, using fuzzy logic will be beneficial to simplify the query and get a more correct report.
Today's database management systems are advanced in terms of performance and functionality and almost all of them have their own high-level query systems. How-ever, these query systems work with precise values or value intervals. [7-9].
The fuzzy set theory, proposed by L.A. Zadeh, aims at processing the indefinite and vague information. In other words, the concept of fuzziness refers to the state of ambiguity which stems from the lack of certainty. The fuzzy logic and the fuzzy set theory play an important role for vague knowledge display and almost all of our ex-pressions in the daily language contain fuzziness. (cold-hot, rich-poor, short-long etc.) [10-13]. Ambiguity plays an important role in human thinking style, especially in communication, inference, and in identifying and abstracting figures; and the impor-tance of the fuzzy theory appears at this point. When we wish to transform the user interfaces which enable us to communicate with machines into a human-oriented style, the fuzzy theory becomes an effective tool at our hands [14].
The fuzzy query provides us with the ability to evaluate imprecise data and use expressions such as “old” or “rich” which do not imply certain quantities. The fuzzy query provides the nearest data to us, if what we search for does not exist. This is a very beneficial feature especially if we do not have absolute information or the infor-mation we have is not quantitative [9, 15-17].
2 “SQLf” Fuzzy Query Software
The SQLf Software was written by taking into consideration the software designed to make fuzzy queries on database management systems such as SummarySQL and FuzzyQuery. The Figure 1 shows the relationship between the SQLf software's browser, php and database server. The task of the SQLf software is to make both classical and fuzzy queries on non-fuzzy database systems and report the results. The software was encoded in PHP programming language. It is available on the web and the address is http://www.fuzzyquery.com.
Fig. 1. Relationship between the SQLf Software's Browser, Php andDatabase Server
The components of the software are as follows:
1. The graphical interface that interacts with the user
2. Making connection settings to the database management system 3. Defining the criteria necessary for the query (Criteria Definition) 4. Defining fuzzy sets for table fields (Fuzzy Sets)
5. Monitoring the impact of hedges on the current fuzzy sets (Hedges) 6. Creating precise and fuzzy queries from the defined criteria (Query Design)
7. Controlling the queries created and determining the desired fields on the re-sult table (Query Control)
8. Displaying the result table and the query statistics after running the con-trolled query according to the desired fields (Query Run)
The software is composed of two main sections as shown in figure 2, namely the fuzzying and query.
FUZZIFICATION MODULE
QUERY MODULE
Query Interpreter Query Processor
Fig. 2. Main Parts of Systems
The fuzzification module: Since the database folder on which query is made is not fuzzy, firstly a fuzzying operation is needed. To this end, the user is shown the fields of the desired folder and then s/he is enabled to define fuzzy sets for the fields s/he desired. There is no restriction about which fields the user can fuzzy. Fuzzying is generally made for the fields containing quantitative data. Fuzzy sets are defined as the sets of pairs of elements and degrees of membership. In order to reuse the definitions, a set database was formed in which all the entered information is stored.
The query module: The query processor steps in, after the defined queries are con-trolled. The general structure of the query processor is demonstrated in the Figure 3, finds the matching degree of each entry and produces a report accordingly.. Unlike the classical processing, the matching degree of the query is not either 0 or 1, but it is a number between 0 and 1.
Q u e r y M a t c h i n g A c c e t a b l e ? ( > t h r e s h o l d ? ) R e p o r t N e g l e c t R e c o r d M a t c h i n g D e g r e e E H D a t a b a s e R e c o r d
Fig. 3. Essence of Database Querying
3 Example: Marmara University Food Grant
Performance of SQLf was tested with Marmara University student’s food grant data. The students filled a form which describes their social and economical positions for
the food grant request on the web. This form consists of questions which have fuzzy and crisp answers. The SQLf easily found the eligible students for grants with prede-fined fuzzy values. Fuzzy query tool (SQLf) was designed to work not only with this database, but also with other databases.
3.1 The Assessment Table
The Food Grant Database consists of three tables; namely student information, family information and contact information. Since our aim is to find the students who de-serve the food grant, we will conduct the assessment on the family information table which contains the student's living conditions, the state of family, and the other re-ceived grants.
Table 1 shows the family information table's field names, field types and other fea-tures of fields. The fields which will be assessed in this table and the characteristics of these fields in terms of the information they contain are as follows:
Table 1. Family Information Table
Field Name Field Type Empty Default Explanation
Id int(11) No Student ID
stOfPa tinyint(4) No State of parents (1-3) as a numerical
value
numCh int(11) No Number of children in the family
numChAttSch int(11) No Number of children attending school.
fathOcc tinyint(11) No Father's occupation (1) Private, (2)
Self-Employed,(3)Public,(4) Unemployed.
mothOcc tinyint(11) No Mother's occupation
stFamHou tinyint(4) No State of house in which the family stay
(1-4) as a numerical value.
netInc decimal(10) No Sum of the family's net income
scho1 varchar(20) Yes NULL
scho2 varchar(20) Yes NULL
scho3 varchar(20) Yes NULL
Names of the scholarships that the student receive, if any
noScho tinyint(6) No Whether the student receives
scholarship from another institution as 1 and 0
stHouse tinyint(4) No State of the house in which the student
currently stay (1-6) as a numerical value
3.2 Defining Fuzzy Sets
Before defining the criteria, constitution of the fuzzy sets is needed for the fuzzy crite-ria. The fuzzy sets constituted are placed at the fsql_fsets table in the MySQL data-base to be used later. The fuzzy sets constituted for our application sample are shown in Table 2.
Table 2. Food Grant Fuzzy Sets Table Field (field name) fsname (fuzzy set name) fsetForm (formal information) fsalpha (Alpha cut coefficient) fmin (minimum data) Fmax (maximum data)
netInc poor decline 0 0 1000
stOfPa bad l.increasing 0 1 3
numChAttSch very growth 0 0 20
numCh very growth 0 0 20
fathOcc bad growth 0 1 4
mothOcc bad growth 0 1 4
stFamHou bad decline 0 1 4
estHouse bad decline 0 1 6
3.3 Preparing the Criteria
After constituting the fuzzy sets, the criteria should be prepared in order to use these fuzzy sets in our query. The criteria are divided into two categories; namely the pre-cise qualitative expressions and the fuzzy qualitative expressions. The processing steps for the fields for which criteria will be prepared are as follows:
1. The “Criteria Definition” section should be visited.
2. The relevant field should be selected from the fields section.
3. Since fuzzy qualitative expressions will be constituted, the fuzzy operator (@) should be selected from the operators section.
4. We do not need to select any switcher for our application sample. Thus, the expression of <none> should be selected from the Hedges section.
5. From the value section, the set, which we have constituted from the “Fuzzy Sets” section before, should be selected..
Figure 4 shows the fuzzy qualitative expressions prepared for the application sample. Totally four fuzzy qualitative expressions have been prepared.
3.4 Constituting the Queries from the Prepared Criteria
By connecting the simple fuzzy qualitative expressions prepared in the “Criteria Defi-nition” section with AND or OR in the “Create Query” section, complex fuzzy quali-tative expressions are constituted. The figure 5 shows the complex fuzzy qualiquali-tative expressions constituted for the application sample.
Fig. 5. Query Design Sections
The query sentence can either be a simple single sentence, or a complex sentence consisting of several simple sentences connected with AND or OR. If such a complex sentence is the case, the matching degrees of each sub-sentence are calculated for each entry and thus the overall matching degree is obtained. The entries whose matching degrees are above a defined lower limit are written on the output folder. The sentence or sub-sentences may not be fuzzy. In this case, the operators such as =, >, >=, <, <= etc. and constant values are used in the query, instead of switchers and fuzzy sets.
3.5 Controlling the Queries
Figure 6 shows Query Control sections. The “Query Control” section should be vis-ited in order to control the SQL and SQLf expressions which appear after approving the expressions connected in the “Query Design” section. In this section, the user not only controls the expressions but also defines the settings for the result report. The boundary value is also determined in this section. We defined it as 0.2 in our applica-tion sample. It means that, after the query, those whose degrees of membership are below 0.2 will be ignored during the reporting.
3.6 Running the Queries and the Result Table
The results shown by Table 3 are obtained from the “Run Query” section. After the query, 88 out of 3645 people are listed. In this table
µ
(grant) field shows the fuzzy deserve level of grant according student ID and other information which are used in fuzzy query.µ
(grant) values could be between 0 and 1 and 1 value means that student completely deserved the grant. A part of the result table is presented in Table 3.Table 3. Result Table
µ
(grant) ID stHouse stFamHou numCh numChAttSch netInc0.86 2282 1 1 3 3 265 0.82 2301 1 1 3 3 300 0.778 2610 2 1 4 4 170 0.777 1704 2 2 5 5 200 0.776 1590 2 2 3 3 245
4 Conclusion
This paper proposes a fuzzy query languages (fuzzy relational calculus and fuzzy relational algebra) based on the relational database query languages. This is an appli-cation of the fuzzy set theory and the fuzzy logic was carried out by developing an interface which renders possible to query on any relational database with query tences similar to the sentences used in the daily language. Complex fuzzy query sen-tences including hedges and crisp values could be constituted.
Efficiency of application is shown with student food grant problem. This is an ex-ample of relational database which have crisp and fuzzy fields. Hence, it is not con-venient to say who deserved food grant easily. Therefore all applicant student food grant deserved degree was found as a fuzzy membership value. Consequently this developed application could be used to query any relational database which has crisp or fuzzy fields.
References
1. Mutlu, T.: A Fuzzy Query Tool For Non-Fuzzy Databases, Master Thesis, Istanbul Tech-nical University Information Sciences Institute, Istanbul (1996)
2. Bahadır, A.: Flexible Querying in Standard Database Systems With Fuzzy Set Approach, Master Thesis, Istanbul Technical University Information Sciences Institute, Istanbul (1999)
3. Andersen, T., Christiansen, H., Larsen, H.L.: Flexible Query Answering System, pp. 45– 61, 187-209, 247-277. Kluwer Academic Publishers, Boston (1997)
4. Zadeh, L.A., Kacprzyk, J.: Fuzzy Logic for the Management of Uncertainty, pp. 645–672. Wiley, New York (1992)
5. Kacprzyk, J., Ziolkowski, A.: Database Queries with Fuzzy Linguistic Quantifiers. IEEE Transactions on Systems, Man and Cybernetics SMC-16(3), 474–478 (1986)
6. Takahashi, Y.: A Fuzzy Query Language for Relational Databases. IEE Transactions on Systems, Man and Cybernetics 21(6), 1576–1579 (1991)
7. Rasmussen, D., Yager, R.R.: SummarySQL–A Fuzzy Tool For Data Mining. Intelligent Data Analysis 1(1-4), 49–58 (1997)
8. Rasmani, K.A., Shen, Q.: A Data-Driven Fuzzy Rule-Based Approach for Student Aca-demic Performance Evaluation. Applied Intelligence 23(3), 305–319 (2006)
9. Zadeh, L.A.: Knowledge Representation in Fuzzy Logic. IEEE Transactions on Knowl-edge and Data Engineering 1(1), 89–100 (1989)
10. Klir, G.J., Yuan, B.: Fuzzy Sets and Fuzzy Logic Theory and Applications, pp. 379–388. Prentice Hall, New Jersey (1995)
11. Tanaka, K.: An Introduction to Fuzzy Logic for Practical Applications, pp. 68–75. Springer, New Jersey (1996)
12. Ross, J.T.: Fuzzy Logic with Engineering Applications, pp. 52–75. McGraw Hill Inc, New York (2004)
13. Kosko, B.: Fuzzy Engineering, pp. 18–24. Prentice Hall, New Jersey (1997)
14. Zongmin, M.: Fuzzy Database Modeling of Imprecise and Uncertain Engineering Informa-tion, pp. 137–155. Springer, New York (2006)
15. Zimmermann, H.J.: Fuzzy Sets, Decision Making, and Expert Systems, pp. 125–134. Kluwer Academic Publishers, Boston (1987)
16. Terano, T., Asai, K., Sugeno, M.: Fuzzy Systems Theory and Its Applications. Academic Press, San Diego (1992)
17. Şen, O.N.: Oracle SQL, SQL*PLUS, PL/SQL and Database Management, Beta Impression Publication Distributor, Istanbul, pp. 85–90 (2000)