Vol 8, No 11 (2018)

Research Article

November

2018

Computer Science and Software Engineering

ISSN: 2277-128X (Volume-8, Issue-11)

Rank Based Query Optimization for RDBMS Using Genetic Algorithm

Omkar Singh*, Dr. S. K. Singh, Hiteshri Rajput, Rajni Chandwani

Thakur College of Science and Commerce, Thakur Village, Kandivali – E, Mumbai, Maharashtra, India

Abstract— Rank based query optimization in SQL is a system that provides a systematic framework for the efficient evaluation of ranking queries in relational database systems (RDBMS) by extending relational algebra and query optimization. Previously, ranking query processing has been studied either in the middleware scenario or in RDBMS in a piecemeal fashion, i.e., focusing on a specific operator or sitting outside the core of the query engine. In contrast, we aim to support ranking as a first-class database construct. As a key insight, the ranking relationship can be viewed as another logical property of data, parallel to the membership property of the relational data model. While membership is fundamentally supported in RDBMS, the same support for ranking is clearly lacking. We address the integration of ranking in RDBMS in a way similar to how membership, i.e., Boolean filtering, is supported.

Keywords— Rank based SQL, RDBMS, Genetic Algorithms, Heuristic search.

I. INTRODUCTION

1.1 Ranking

Information systems of different types use various techniques to rank query answers. In many application domains, end users are mainly interested in the most important answers within a potentially huge answer space. In such cases, records can be assigned ranks based on their most important attributes and sorted by those ranks, so that the number of records in the output can be limited. Many emerging applications call for efficient support for such top-k records. Top-k query processing connects to many database research areas, including query optimization, indexing methods, and query languages. As a consequence, the impact of efficient top-k query processing is becoming evident in an increasing number of applications, including traditional database environments. Top-k query processing techniques can be classified along multiple design dimensions, described in the following subsections.
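Before turning to that taxonomy, the basic top-k idea can be sketched in a few lines. This is a minimal illustration only: the record layout, the field names and the helper top_k are assumptions made for the example and do not come from the system described in this paper.

```python
import heapq

# Hypothetical records; "id" and "score" are illustrative field names.
records = [
    {"id": 1, "score": 7},
    {"id": 2, "score": 3},
    {"id": 3, "score": 9},
    {"id": 4, "score": 5},
]


def top_k(rows, k, key):
    """Return the k highest-ranked rows without fully sorting the input."""
    return heapq.nlargest(k, rows, key=key)


best = top_k(records, 2, key=lambda r: r["score"])
print([r["id"] for r in best])  # ids of the two highest-scoring records
```

Keeping only the k best candidates in a heap avoids sorting the whole answer space, which is the main reason top-k processing is treated separately from a plain sort.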

1.2 Query model

Rank based SQL query processing techniques are classified according to the query model they assume. Some techniques assume a selection query model, where the query is specified with the selection operator. Other techniques assume a join query model, where selection and projection are applied together, or an aggregate query model, where aggregate functions are specified.

1.3 Data access methods

Rank based query processing techniques are also classified according to the data access methods they assume:

1) Random access to the data is available.

2) Access is restricted to sorted data only.
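The two access methods listed above can be contrasted with a small sketch. The data and the function names below are hypothetical and are meant only to show the difference between reading a pre-sorted list sequentially and looking up a single record directly.

```python
# Hypothetical data: record id -> score.
scores = {"a": 4, "b": 9, "c": 1, "d": 7}

# Sorted access: a list pre-ordered by descending score, read sequentially.
sorted_list = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def top_k_sorted_access(k):
    """Read the first k entries of the sorted list and stop."""
    return [record_id for record_id, _ in sorted_list[:k]]


def random_access(record_id):
    """Look up one record's score directly by its id."""
    return scores[record_id]


print(top_k_sorted_access(2))
print(random_access("c"))
```

With sorted access only, a technique can stop early once k entries have been read; with random access it can also probe the score of any record it has seen elsewhere.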

1.4 Data and query uncertainty

Rank based query processing techniques are classified based on the uncertainty involved in their data and query models. Some techniques produce exact answers, while others allow for approximate answers, or deal with uncertain data.

1.5 Ranking function

ISSN(E): 2277-128X, ISSN(P): 2277-6451, pp. 16-20

1.6 Genetic Algorithm

In the field of artificial intelligence, a genetic algorithm (GA) is a search heuristic that mimics the process of natural selection. This heuristic (also sometimes called a metaheuristic) is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution.

Genetic algorithms can be faster and more efficient than traditional methods when the search space is very large and many parameters are involved. They also provide a list of “good” solutions rather than a single solution. A genetic algorithm always has a candidate answer available, and that answer tends to improve over time.
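The selection, crossover and mutation cycle described above can be sketched as follows. This is a generic, minimal genetic algorithm over bit strings, not the implementation used in this work; the function name evolve, its parameters and the toy fitness function (count of 1-bits) are all assumptions chosen for illustration.

```python
import random


def evolve(fitness, length, pop_size=20, generations=40,
           mutation_rate=0.05, seed=0):
    """Minimal generational GA over bit strings (illustrative only)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def pick():
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_gen = []
        while len(next_gen) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, length - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation applied independently to each gene.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        gen_best = max(population, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best  # remember the best chromosome seen so far
    return best


best = evolve(fitness=sum, length=16)
print(best, sum(best))
```

The function returns the best chromosome seen over all generations, which matches the idea that an answer is always available and can only improve as the search continues.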

II. RELATED WORK

In the area of query optimization, various techniques have been suggested. The works in [1] and [2] emphasize ranking of queries based on genetic algorithms, while [3] and [4] emphasize ranking of queries based on object views. Our work is based on an implementation of genetic algorithms with heuristic search. The heuristic search technique provides a better approach for ranking records and sorting them by rank, which leads to faster execution of queries.

III. METHODOLOGY

This research introduces rank based SQL query, a system that provides a systematic and principled framework for the efficient evaluation of ranking queries in relational database systems (RDBMS) using the concept of the genetic algorithm. There are many methods for sorting and retrieving query results, but rank based SQL is efficient in that it takes less time and improves the results. The main focus is on assigning ranks, which our work does using the fitness function of the genetic algorithm. This optimizes the results and returns the top-k records faster.

3.1 Database Description

A database is basically a collection of information organized in such a way that a computer program can quickly select desired pieces of data. You can think of a database as an electronic filing system. Traditional databases are organized by fields, records, and files. A field is a single piece of information; a record is one complete set of fields; and a file is a collection of records.

3.2 Graphical Representation

The query and its ranks are shown graphically. The time taken by the SQL query and the time taken by the genetic algorithm are recorded and compared, and graphs are generated from these measurements.
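Recording the two execution times could be done with a small timing helper such as the one sketched here. The helper name timed is hypothetical, and the call being timed (sorting a short list) is only a stand-in for the SQL execution and the genetic algorithm execution, whose measurement code is not given in this paper.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; in practice fn would execute the query or the GA.
result, elapsed = timed(sorted, [3, 1, 2])
print(result, f"{elapsed:.6f} s")
```

Wrapping both approaches in the same helper keeps the two measurements comparable, and the resulting pairs of times are what a comparison graph would be drawn from.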

3.4 Records of User_Details Table:

This table holds the details of users who visit the bank on a daily, weekly, or yearly basis.

3.5 Execution of Normal Select Query

This is the execution of a normal select query that displays all the fields of the user_details table of the bank. A rank is generated for every user from his/her visits, balance, and transactions, and the rank is updated as these values change.

Query Used For Execution:

SELECT *, ((balance % (visits * transactions)) % 10) AS rank FROM user_details ORDER BY rank DESC LIMIT 10
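As a hedged illustration of how the query above behaves, the sketch below runs the same rank expression against an in-memory SQLite database. The column types, the sample rows and the alias rank_score (used instead of rank, which is a reserved word in some database systems) are assumptions made for this example; the paper does not state which RDBMS was used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_details ("
    "id INTEGER PRIMARY KEY, age INTEGER, balance INTEGER, "
    "visits INTEGER, transactions INTEGER)"
)
# Hypothetical sample rows: (id, age, balance, visits, transactions).
conn.executemany(
    "INSERT INTO user_details VALUES (?, ?, ?, ?, ?)",
    [
        (1, 35, 1000, 3, 12),
        (2, 42, 2500, 5, 12),
        (3, 28, 730, 2, 7),
        (4, 51, 999, 4, 12),
        (5, 33, 415, 1, 12),
    ],
)

# Same rank expression as in the text; only the alias name differs.
RANK_QUERY = (
    "SELECT id, ((balance % (visits * transactions)) % 10) AS rank_score "
    "FROM user_details ORDER BY rank_score DESC LIMIT ?"
)
top_rows = conn.execute(RANK_QUERY, (3,)).fetchall()
print(top_rows)  # [(4, 9), (1, 8), (5, 7)]
```

Because of the final modulo, the rank always lies between 0 and 9. A row whose visits or transactions value is zero would make the modulus undefined (SQLite returns NULL in that case), so real data may need a guard for it.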

3.6 Comparison between SQL and Genetic query execution:

3.7 SQL query execution

Query for execution:

SELECT *, ((balance % (visits * transactions)) % 10) AS rank FROM user_details WHERE age > 30 AND transactions = 12 ORDER BY rank DESC LIMIT 10

The time taken by the SQL query to generate the top-k records is 1201 s.

3.8 Genetic execution

In this mode, we select the parameters on which the genetic algorithm is run, and the top-k records are displayed. The time taken by the genetic algorithm to generate the top-k records is 16 s.

Two parameters are considered here. Parameters for execution:

i.) Age(age>30)

ii.) Transactions(transactions=12)
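The paper does not spell out the chromosome encoding or the genetic operators, so the following is only one possible reading, given as a sketch under stated assumptions: the two parameters above are applied as a filter, a chromosome is a set of k row positions, and the fitness is the sum of the rank expression over those rows. The sample rows and the names rank_score and ga_top_k are hypothetical.

```python
import random

# Hypothetical rows: (id, age, balance, visits, transactions).
ROWS = [
    (1, 35, 1000, 3, 12), (2, 42, 2500, 5, 12), (3, 28, 730, 2, 7),
    (4, 51, 999, 4, 12), (5, 33, 415, 1, 12), (6, 64, 1234, 6, 12),
    (7, 45, 877, 2, 12),
]


def rank_score(row):
    """The rank expression from the text, used as the per-row fitness.

    Assumes visits * transactions is non-zero.
    """
    _, _, balance, visits, transactions = row
    return (balance % (visits * transactions)) % 10


def ga_top_k(rows, k, generations=30, pop_size=12, seed=0):
    """Evolve a set of k row positions with a high summed rank score."""
    rng = random.Random(seed)
    # Assumption: age > 30 and transactions = 12 act as a pre-filter.
    pool = [r for r in rows if r[1] > 30 and r[4] == 12]
    k = min(k, len(pool))

    def fitness(chrom):
        return sum(rank_score(pool[i]) for i in chrom)

    population = [rng.sample(range(len(pool)), k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            genes = sorted(set(p1) | set(p2))
            child = rng.sample(genes, k)  # crossover: mix both parents' genes
            if rng.random() < 0.3:  # mutation: swap in an unused row
                unused = [i for i in range(len(pool)) if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = survivors + children
    best = max(population, key=fitness)
    return sorted((pool[i] for i in best), key=rank_score, reverse=True)


result = ga_top_k(ROWS, 3)
print([(row[0], rank_score(row)) for row in result])
```

On this toy table the exact top-3 answer is rows 4, 1 and 5 (scores 9, 8 and 7). A genetic algorithm is a heuristic, so it is not guaranteed to return exactly that set; comparing its output with the SQL result is a simple way to check the quality of a run.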

| Query | Normal Exe. Time | G.A. based Exe. Time | No. of rows |
| --- | --- | --- | --- |
| select *,((balance % (visits*transactions))%10) as rank from user_details where age>30 and transactions=12 order by rank DESC limit 10 | 1201 secs | 16 secs | 1000 |

IV. CONCLUSION

The overall aim of this project is to generate top-k queries by implementing a genetic algorithm and to improve the result, as assessed by comparison with plain SQL queries. The system is intended to reduce the probability of error while producing the best result. It provides a systematic framework to support efficient evaluation of top-k queries in a DBMS, and it has been designed so that query execution is extended to handle ranking queries. The simulation was run for a number of generations, and its result is the best-fit chromosome found so far according to the fitness function. Genetic algorithms are highly effective at searching a large, poorly defined search space, even in the presence of difficulties such as high dimensionality, multi-modality, discontinuity, and noise. Genetic algorithms are applied to a wide variety of search and optimization problems in science, engineering, and technology.

ACKNOWLEDGEMENTS

We have taken efforts in building this software. However, it would not have been possible without the kind support and help of many individuals. We would like to extend our sincere thanks to all of them.

We are extremely thankful to our institution for providing such a great opportunity. It has been a great learning experience in both technical and personal aspects.

REFERENCES

[1] Richa Garg & Saurabh Mittal, Optimization by Genetic Algorithm, International Journal of Advanced Research in Computer Science and Software Engineering, Volume 4, Issue 4, April 2014, ISSN: 2277-128X

[2] Pushpendra Kumar Yadav & Dr. N. L. Prajapati, An Overview of Genetic Algorithm and Modeling, International Journal of Scientific and Research Publications, Volume 2, Issue 9, September 2012, ISSN 2250-3153

[3] Matthias Jarke & Jurgen Koch, Query Optimization in Database Systems