Implementation of Pattern Matching Algorithm to Defend SQLIA


Procedia Computer Science 45 ( 2015 ) 453 – 459

Peer-review under responsibility of scientific committee of International Conference on Advanced Computing Technologies and Applications (ICACTA-2015).

doi: 10.1016/j.procs.2015.03.078

ScienceDirect

International Conference on Advanced Computing Technologies and Applications (ICACTA-2015)


Nency Patel (a), Narendra Shekokar (b)

(a) Department of Computer Engineering, D.J.Sanghavi College of engineering, Mumbai, India
(b) Department of Computer Engineering, D.J.Sanghavi College of engineering, Mumbai, India

Abstract

SQL injection is a type of web application security vulnerability in which an attacker is able to submit a database SQL command that is executed by a web application, exposing the back-end database. It is one of the techniques by which a malicious user alters SQL statements to serve a purpose other than the one originally intended. In network security, pattern matching is used to detect malicious packets. Most pattern-based techniques use static analysis, and the patterns are generated from attacked statements. The algorithm used in the existing system is not memory efficient. We propose a detection and prevention technique for SQL Injection Attack (SQLIA) using a modified Aho–Corasick pattern matching algorithm. In the proposed system, user-generated SQL queries are first checked for SQL injection using the SQLMAP tool and the AIIDA-SQL technique. The queries are then checked by applying a static pattern matching algorithm. If any new form of anomaly occurs, a new anomaly pattern is added to the existing static pattern list. In addition, repeated keywords are stored only once, which reduces overall memory consumption. © 2015 The Authors. Published by Elsevier B.V.

Keywords: SQLIA; SQLMAP; AIIDA-SQL; Aho–Corasick pattern matching algorithm.

1. Introduction

SQL injection is one of the security vulnerabilities at the application layer. It is one of the most common attack strategies employed by attackers to steal identities and other sensitive information from Web sites. A Web application can read user input in several ways, depending on the environment in which the application is deployed. An SQL injection attack ultimately aims to deceive the server into running an unwanted SQL command by inserting it into the query string of web input or a domain name [8]. Depending on the specific goals of the attacker, the different types of attacks are used together or sequentially. To address this problem, developers have to use defensive coding practices, such as encoding and validating user input. A systematic application of these techniques is an effective solution for preventing SQL injection vulnerabilities [20]. In practice, however, the application of such techniques is human-based and thus prone to errors [20].

Table 1 Types of SQL attacks

Type of attack: Working method

Tautology: An attack in which the injected condition always evaluates to true.

Logically incorrect query: Lets an attacker obtain information about the back-end database of a Web application from its error messages.

Stored procedure: A built-in stored procedure is used together with malicious SQL injection code.

Piggy-backed queries: Additional malicious queries are inserted into an original injected query.

Union query: The UNION keyword is used to obtain information by joining the injected query with a safe query.
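The tautology entry in Table 1, and the defensive coding practices mentioned earlier, can be illustrated with a minimal sketch. It is not taken from the paper: the `users` table, the function names, and the sample credentials are hypothetical, and Python's standard `sqlite3` module with an in-memory database stands in for the back-end database.

```python
import sqlite3


def make_db():
    # Hypothetical one-table database used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_unsafe(conn, user, password):
    # Vulnerable: user input is concatenated directly into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        "WHERE name = '%s' AND password = '%s'" % (user, password)
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(conn, user, password):
    # Parameterized query: the input is bound as data, never parsed as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (user, password)).fetchone()[0] > 0


if __name__ == "__main__":
    conn = make_db()
    tautology = "' OR '1'='1"  # turns the WHERE clause into an always-true condition
    print(login_unsafe(conn, "nobody", tautology))
    print(login_safe(conn, "nobody", tautology))
```

With the concatenated query, the injected input makes the condition end in `OR '1'='1'`, so the check succeeds for an unknown user, while the parameterized version treats the same input as an ordinary (wrong) password.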

So far we have introduced SQL injection and the types of SQLIA. The rest of this paper is organized as follows: Section 2 discusses the existing techniques for SQLIA prevention and the issues with those techniques. Section 3 introduces the key idea behind the proposed architecture. Experimental results are presented in Section 4. Finally, Section 5 concludes the paper, followed by the references.

2. Existing Techniques

Researchers have proposed various methods to address the SQL injection problem. Among web-based security problems, SQL injection attack (SQLIA) has the highest priority. Detection and prevention techniques can be classified into two broad categories. The first detects SQLIA by checking for anomalous SQL query structure using string matching, pattern matching and query processing [9]. The other approach uses data dependencies among data items, which are less likely to change, to identify malicious database activities [9]. Many researchers have proposed schemes that integrate data mining with intrusion detection systems [9]. Approaches of this type reduce false positive alerts, minimize human intervention and improve attack detection. AMNESIA is a model-based technique that combines static analysis and runtime monitoring. In the static phase, it uses static analysis to build models of the SQL queries that an application can legally generate at each point of access to the database [6]. In the runtime monitoring, or dynamic, phase, it intercepts all SQL queries before they are sent to the database and checks each query against the statically built models; queries that violate the model are identified as SQL injection attacks [6].

To solve security problems related to input validation, Halfond et al. [7] proposed SQLIA prevention based on dynamic tainting. In dynamic tainting approaches, malicious data are marked as tainted; the flow of tainted data is then tracked at runtime, and this data is prevented from being used in potentially harmful ways [7]. No security tool can be considered effective until its effectiveness has been evaluated in a real-time environment. The WASP tool was not tested on already deployed web applications, so its ability to detect real-time attacks was not evaluated.


A general framework for detecting malicious database transaction patterns using data mining was proposed by Bertino et al [13][14]. It mines database logs to form user profiles that model normal behavior and identifies anomalous transactions in databases with a role-based access control mechanism. The system is able to identify an intruder by detecting behavior that differs from the normal behavior. Kamra et al [13] proposed an enhanced model that can identify intruders in databases where no roles are associated with the users. Bertino et al [15] proposed a framework based on anomaly detection and association rule mining to identify queries that deviate from the normal behavior of the database application. Bandhakavi et al [20] proposed a misuse detection technique that detects SQLIA by dynamically discovering the intent of a query and then comparing the structure of the identified query with that of normal queries, based on the user input and the discovered intent. Srivastava et al [21] offered a weighted sequence mining approach for detecting database attacks.

We have proposed a technique for detecting and preventing SQLIA using both a static phase and a dynamic phase. Another technique for the same purpose uses static anomaly detection based on the Aho–Corasick pattern matching algorithm. Anomalous SQL queries are detected in the static phase. In the dynamic phase, if any query is identified as anomalous, a new pattern is created from the SQL query and added to the Static Pattern List (SPL) [1]. The main disadvantage of Aho–Corasick is that it consumes a lot of memory.

Fig.1: Proposed Architecture

3. Proposed System

We propose an efficient algorithm for detecting and preventing SQL injection attacks using a modified Aho–Corasick pattern matching algorithm [2]. The proposed architecture is given in figure 1.

In figure 1, the first phase takes the input query and checks whether it is injected or not using the SQLMAP tool and the AIIDA-sql technique.

In industry, the SQLMAP tool is used to detect SQL-injected queries. SQLMAP is able to detect and exploit five different SQL injection types:

• Boolean-based blind queries: SQLMAP replaces or appends to the affected parameter in the HTTP request a syntactically valid SQL statement string containing a SELECT sub-statement, or any other SQL statement whose output the user wants to retrieve. For each HTTP response, the response headers/body are compared with those of the original request, and the tool infers the output of the injected statement character by character. The user can also provide a string or regular expression to match on True pages. The bisection algorithm implemented in SQLMAP for this technique is able to fetch each character of the output with a maximum of seven HTTP requests [3].

• Time-based blind queries: SQLMAP replaces or appends to the affected parameter in the HTTP request a syntactically valid SQL statement string containing a query that puts the back-end DBMS on hold for a certain number of seconds before it returns. For each HTTP response, the response time is compared with that of the original request, and the tool infers the output of the injected statement character by character. As in the Boolean-based technique, the bisection algorithm is applied [3].

• Error-based queries: SQLMAP replaces or appends to the affected parameter a statement that provokes a database-specific error message, and parses the HTTP response headers and body in search of DBMS error messages. These error messages contain the injected pre-defined chain of characters and, within it, the output of the subquery statement. This technique is useful for web applications that have been configured to disclose back-end database management system error messages [3].

• UNION based queries: SQLMAP appends to the affected parameter a syntactically valid SQL statement starting with UNION ALL SELECT [23]. This technique applies when the web page passes the output of the SELECT statement directly within a for loop, so that each line of the query output is printed on the page content. SQLMAP is also able to exploit partial UNION query SQL injection vulnerabilities, which occur when the output of the statement is not cycled in a for construct and only the first entry of the query output is displayed [3].

• Stacked queries: SQLMAP first tests whether the web application supports stacked queries. If it does, SQLMAP appends to the affected parameter in the HTTP request a semi-colon (;) followed by the SQL statement to be executed. This technique is useful for running SQL statements other than SELECT, such as data definition or data manipulation statements.
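The bound of seven requests per character in the blind techniques follows from binary search over the 128 possible 7-bit ASCII codes. The sketch below illustrates only that arithmetic and is not SQLMAP code: the oracle `is_greater_than` is a hypothetical stand-in for one True/False page comparison, and it is simulated locally against a known string rather than a server.

```python
def infer_char(is_greater_than):
    """Recover one 7-bit ASCII code by bisection.

    is_greater_than(n) stands in for a single request whose True/False
    page answers the question "is the hidden character code > n?".
    """
    low, high = 0, 127
    requests = 0
    while low < high:
        mid = (low + high) // 2
        requests += 1
        if is_greater_than(mid):
            low = mid + 1   # code lies in the upper half
        else:
            high = mid      # code lies in the lower half (mid included)
    return chr(low), requests


def infer_string(secret):
    # Local simulation: the oracle closes over the secret instead of a server.
    recovered, worst_case = [], 0
    for ch in secret:
        found, used = infer_char(lambda n, ch=ch: ord(ch) > n)
        recovered.append(found)
        worst_case = max(worst_case, used)
    return "".join(recovered), worst_case


if __name__ == "__main__":
    print(infer_string("admin"))
```

Each comparison halves the candidate range (128, 64, 32, ...), so seven answers are enough to isolate a single code.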

The main disadvantage of the SQLMAP tool is that it takes almost 15 to 20 minutes to evaluate one query, which is not feasible for our system. To overcome this disadvantage we use another technique to detect SQL-injected queries, which is based on a neural network and is called AIIDA-sql [4].

We have used the multilayer perceptron neural network shown in figure 2, which has 4 inputs, 5 hidden neurons and 3 outputs. We used the Java-based Neuroph API to design the neural network. Inputs are normalized before they are provided to the network for training and prediction.

The transfer function is used in adjusting the weights of the neurons so that they converge to an optimum weight, which helps in predicting the results in later stages. Weights propagate across the layers and are adjusted in every iteration over the dataset provided for training. To train the network, the dataset is read from an Excel sheet, shown in table 2, and passed to the Neuroph API. This is the learning phase of the neural network. For weight adjustment we have used the following formulas.

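\[ \text{Infected} = \left(\frac{A}{D} + \frac{C}{D}\right) \times B \]

\[ \text{Not Infected} = \left(\frac{A}{D} + \frac{C}{D}\right) \times (1 - B) \]

\[ \text{Cannot Determine} = \frac{\text{Infected} + \text{Not Infected}}{2} \]

where
A - number of escape/special characters
B - search intensity [0.0-1.0]
C - number of words from the skip list
D - number of literals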

For the output we have considered three cases: Infected, Not Infected and Cannot Determine. We have to consider the third case because characters such as <, > and % are sometimes used legitimately in database queries; if the training data contains these characters, we manually raise the Cannot Determine output to a high value.
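As a worked illustration of the three formulas, the following sketch computes the target values for one query. The function name and the sample numbers are hypothetical; only the formulas themselves come from the text.

```python
def label_targets(escape_chars, search_intensity, skip_words, literals):
    """Compute (Infected, Not Infected, Cannot Determine) from A, B, C, D."""
    if literals <= 0:
        raise ValueError("D (number of literals) must be positive")
    if not 0.0 <= search_intensity <= 1.0:
        raise ValueError("B (search intensity) must lie in [0.0, 1.0]")
    base = escape_chars / literals + skip_words / literals   # A/D + C/D
    infected = base * search_intensity                       # (A/D + C/D) * B
    not_infected = base * (1.0 - search_intensity)           # (A/D + C/D) * (1 - B)
    cannot_determine = (infected + not_infected) / 2.0
    return infected, not_infected, cannot_determine


if __name__ == "__main__":
    # Hypothetical query: A = 2 special characters, B = 0.75,
    # C = 1 skip-list word, D = 6 literals.
    print(label_targets(2, 0.75, 1, 6))
```

For these sample values the base term is 3/6 = 0.5, giving 0.375, 0.125 and 0.25. Since Infected and Not Infected always sum to the base term, Cannot Determine as defined is always half of it.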


Table 2 Excel of training

Columns:
No of Escape/special Character
Search Intensity [0.0-1.0]
Number of words from Skip List (no of words matching with our keys)
No of Literals (No of words in the string)
Infected
Not Infected
Cannot Determine

Once the ANN is trained, it is ready for use. Inputs given to the system are normalized and then provided to the network for analysis. The network then emits its outputs based on the weights adjusted at the various perceptrons. If more than one output is activated, the highest output is taken as the decision.

After checking whether the query is infected, we apply the pattern matching algorithm to check the query against the static pattern list. The algorithm is given below.

In this algorithm we create a tree for the static pattern list, known as a trie structure.
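The decision rule, taking the highest of the three outputs, can be sketched as follows. The paper does not state which normalization is used, so min-max scaling is an assumption here, and the function names and sample values are hypothetical.

```python
LABELS = ("Infected", "Not Infected", "Cannot Determine")


def min_max_normalize(values, lows, highs):
    # Assumed normalization: scale each input into [0, 1] using known bounds.
    scaled = []
    for value, low, high in zip(values, lows, highs):
        scaled.append(0.0 if high == low else (value - low) / (high - low))
    return scaled


def decide(outputs):
    # The output neuron with the highest activation gives the decision.
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return LABELS[best]


if __name__ == "__main__":
    print(min_max_normalize([5, 0.5], [0, 0.0], [10, 1.0]))
    print(decide([0.2, 0.7, 0.1]))
```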
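A minimal sketch of such a trie is shown here, assuming that each pattern is split into whitespace-separated keywords and that each node keeps a usage counter. The class, the field names and the sample patterns are hypothetical; this is not the authors' implementation.

```python
class TrieNode:
    def __init__(self, word=None):
        self.word = word        # keyword stored at this node
        self.children = {}      # next keyword -> child node
        self.uses = 0           # number of patterns passing through this node
        self.terminal = False   # True if a pattern ends here


def build_trie(patterns):
    root = TrieNode()
    for pattern in patterns:
        node = root
        for word in pattern.lower().split():
            if word not in node.children:
                node.children[word] = TrieNode(word)  # stored once per shared prefix
            node = node.children[word]
            node.uses += 1
        node.terminal = True
    return root


def count_nodes(node):
    return sum(1 + count_nodes(child) for child in node.children.values())


if __name__ == "__main__":
    patterns = ["' or 1 = 1", "' or 1 = 1 --", "union all select", "union select"]
    root = build_trie(patterns)
    total_words = sum(len(p.split()) for p in patterns)
    print(total_words, count_nodes(root))
```

In this example the four patterns contain 16 keywords in total, but the trie holds only 10 nodes, because keywords on a shared leading path are stored once and only their usage counter grows.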

Fig. 2 Neural Schema

Search Algorithm:

Step 1: Take as input the start node, which is the first word in the input parameter, and the number of the element to be analyzed.

Word_first: WORD
Element Number: Number

Step 2: If the element number is -1, then set the start element number equal to the input element number.

Step 3: If the count of elements in the array of words is less than or equal to the input element number, then calculate the match percentage based on the difference between the total elements in the array and the start element index, and return null.

Step 4: After verifying the initial conditions, move to the next element:
Increment the counter.
Fetch all the children of the start node.
Iterate over all children to find whether any of them matches the provided pattern.

Step 5: If there is a match with an element in the provided array, recursively call the Search method with the child element and index.

Step 6: If there is no further match, then calculate the match percentage based on the difference between the total elements in the array and the start element index.

Step 7: Return the start node element.
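The steps above can be sketched as a short recursive search. This is a simplified illustration under stated assumptions: the trie is a nested dictionary of keywords, the query is split on whitespace, and the match percentage is taken as the share of query words that follow a path in the trie, which is one possible reading of the percentage rule. All names are hypothetical.

```python
def build_trie(patterns):
    # Nested dictionaries: each key is a keyword, each value the sub-trie below it.
    root = {}
    for pattern in patterns:
        node = root
        for word in pattern.lower().split():
            node = node.setdefault(word, {})
    return root


def search(node, words, index=0):
    """Return how many leading words of `words` follow a path in the trie."""
    if index >= len(words):            # Step 3: no elements left to analyze
        return index
    child = node.get(words[index])     # Step 4: look among the children
    if child is None:                  # Step 6: no further match
        return index
    return search(child, words, index + 1)   # Step 5: recurse with the child


def match_percentage(root, query):
    words = query.lower().split()
    if not words:
        return 0.0
    return 100.0 * search(root, words) / len(words)


if __name__ == "__main__":
    root = build_trie(["union all select", "' or 1 = 1"])
    print(match_percentage(root, "UNION ALL SELECT"))
    print(match_percentage(root, "union all drop table"))
```

A query that follows a stored pattern completely scores 100, while a query that diverges after two of four words scores 50.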

After the query is checked against the static pattern list, if the query is in the list it is rejected and the match percentage is reported. If the query is not in the list but is still infected, it is shown on the admin page, and the admin then adds the query to the static pattern list.


4. Result Analysis

In this section we present our result analysis, which covers memory efficiency and a comparison of the accuracy of the two SQLIA detection techniques. In our system the static pattern list is stored in the database in the form of a trie structure, which records the usage of each key, that is, the number of times the key is used in the trie. All repeated keywords are stored only once, as shown in figure 3. The graph compares the memory used by the existing system and the proposed system. In figure 3 the X-axis shows the keys stored in the database and the Y-axis shows the memory consumption of those keys in the database. The dotted line shows the memory consumption of the existing system and the solid line shows that of the proposed system, which clearly shows that overall memory consumption is optimized.

Fig.3 Memory Consumption

Fig.4 Accuracy Graph

The second result analysis concerns the accuracy of the SQLMAP tool and the AIIDA-sql technique, shown in figure 4. The SQLMAP tool is taken to be 100% accurate, and the predictions received from AIIDA-sql are measured against it, because AIIDA-sql includes the third output, Cannot Determine, which may sometimes give an inaccurate result. In figure 4 the X-axis shows the queries given as input to both techniques and the Y-axis shows the calculated accuracy percentage for both techniques. The solid line shows the accuracy of the SQLMAP tool and the dotted line shows the accuracy of the AIIDA-sql technique.

5. Conclusion

SQL injection is a technique by which a malicious user alters SQL statements to serve a purpose other than the one originally intended, and it is one of the most serious security threats to Web applications. We have proposed a technique that performs static anomaly detection using a modified Aho–Corasick pattern matching algorithm. For SQLIA detection we use two techniques: one is the SQLMAP tool, which is considered to give 100% accurate results, and the other is an ANN scheme. After detection, we apply the search algorithm to check whether the query is in the static pattern list or not. In our proposed technique the memory consumption is lower than in the existing system.

References

1. Dr.M.Amutha Prabakar, M.KarthiKeyan, Prof.K. Marimuthu. An efficient technique for preventing Sql injection attack using pattern Matching algorithm, IEEE International Conference on Emerging Trends in Computing, Communication and Nanotechnology (ICECCN 2013)

2. Weizhe Zhang, Yuanjing Zhang, Hongli Zhang, Xuemai Gu, Albert M.K. Cheng. A Memory-Efficient Multi-Pattern Matching Algorithm Based on the Bitmap, 2009 Fourth International Conference on Internet Computing for Science and Engineering

3. SQLMAP user’s manual [online]. Available:

4. Cristian Pinzone, Juan F. De Paz, Javier Bajo, Alvaro Herrero, Emilio Corchado. AIIDA-SQL: An adaptive intelligent intrusion detector agent for detecting SQL injection attacks, 10th International Conference on Hybrid Intelligent Systems, IEEE, 2010.

5. David Litchfield. Web Application Disassembly with ODBC Error Messages, http://www.nextgenss.com/papers/Webappdis.doc, 2012.

6. Halfond, W. G. and Orso. AMNESIA: Analysis and Monitoring for Neutralizing SQL-Injection Attacks, in Proceedings of the 20th IEEE/ACM international Conference on Automated Software Engineering, 2005

7. WASP: Protecting Web Applications Using Positive Tainting and Syntax -Aware Evaluation, William G.J. Halfond, Alessandro Orso, Member, IEEE Computer Society, and Panagiotis Manolios, Member, IEEE Computer Society, 2008.

8. N. W. Group. RFC 2616 – Hypertext Transfer Protocol – HTTP/1.1. Request for comments, The Internet Society, 1999.

9. M. Dornseif. Common Failures in Internet Applications May 2005.

http://md.hudora.de/presentations/2005-common-failures/dornseif-common-failures-2005-05-5.pdf.

10. T. M. D. Network. Technical report, Microsoft Corporation [online].Available:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/iissdk/html/9768ecfe-8280-4407-b9c0-844f75508752.asp.

11. Amit Kumar Pandey. SECURING WEB APPLICATIONS FROM APPLICATION-LEVEL ATTACK, master thesis, 2007

12. C.J. Ezeife, J. Dong, A.K. Aggarwal. SensorWebIDS: A Web Mining Intrusion Detection System, International Journal of Web Information Systems, volume 4, pp. 97-120, 2007

13. Bertino, E., Kamra, A, Terzi, E., and Vakali, A. Intrusion detection in RBAC-administered databases, in the Proceedings of the 21st Annual Computer Security Applications Conference, 2005

14. Kamra A, Bertino, E., and Lebanon. Mechanisms for Database Intrusion Detection and Response, in the Proceedings of the 2nd SIGMOD PhD Workshop on Innovative Database Research, 2008

15. Bertino, E., Kamra, A, and Early, J. Profiling Database Application to Detect SQL Injection Attacks, In the Proceedings of 2007 IEEE International Performance, Computing, and Communications Conference, 2007

16. Buehrer, G., Weide, B. w., and Sivilotti, P. A. Using Parse Tree Validation to Prevent SQL Injection Attacks, in Proceedings of the 5th international Workshop on Software Engineering and Middleware, 2005

17. A. V. Aho and M. J. Corasick. Efficient String Matching: An Aid to Bibliographic Search, Communications of the ACM, Vol. 18, Issue 6, 1975, pp. 333-340.

18. S. Wu and U. Manber. A Fast Algorithms for Multi-pattern Searching, Report TR-94-17, Department of Computer Science, University of Arizona, 1994.

19. N. Tuck, T. Sherwood, B. Calder, and G. Varghese. Deterministic memory-efficient string matching for intrusion detection, in Proceeding of the 23rd IEEE Infocom Conference, 2004, pp.333-340 .

20. Bandhakavi, S., Bisht, P., Madhusudan, P., and Venkatakrishnan V., CANDID: Preventing sql injection attacks using dynamic candidate evaluations, in the Proceedings of the 14th ACM Conference on Computer and Communications Security, 2007

21. Srivastava, A, Sural S., and Majumdar, AK..Database Intrusion Detection Using Weighted Sequence Mining, Journal of Computers, vol.1, no. 4 (2006)

22. Mohammed Firdos Alam Sheikh, Secure query processing by blocking SQL injection attack, International journal of research in management, vol.3(November-2011), ISSN2249-5908.