Bayesian Classification for SQL Injection Detection

(1)

Bayesian Classification for SQL Injection Detection

Brandon Skari

College of Engineering and Applied Science University of Wyoming

Laramie, Wyoming 82070 [email protected]

(2)

Overview

SQL Injection Types of attacks Prevention techniques SQLassie

(3)

Dynamic web sites

Change based on user interaction

I Web mail

I Online stores

I Search engines

I Social networks

I Banking sites

(4)

SQL Injection

Injection attacks are one of the foremost threats facing dynamic web applications[23, 13, 24, 5]

Allow attackers to:

I Access, modify and delete data

I Bypass authentication

I Denial of service

(5)

Queries

Queries usually come in 4 types:

I Select

I Insert

I Update

I Delete

A typical query may look like the following[6]:

SELECT name FROM U s e r s

(6)

(7)

Queries

A developer has written:

WHERE name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’

What happens when the user enters the following for the username?

name =

The query is run as the following:

(8)

Types of attacks

Halfond[13] classified attacks based on their intention Identifying injectable parameters

Performing database fingerprinting Determining database schema Adding or modifying data Denial of service

Accessing data Avoiding detection Bypassing authentication Executing remote commands Performing privilege escalation

(9)

Attack techniques

There are a number of techniques that attackers use to attack a database[6, 13, 20, 21] Multiple queries Unions Invalid queries Tautologies Inference Detection evasion Second order attacks

(10)

Multiple queries

The easiest attacks to implement

SQL allows multiple queries to be run in sequence, if they are separated by semicolons

Attackers can run essentially query, allowing full access to the database

Most database engines include special commands to access files or other information from the operating system

(11)

Multiple queries

SELECT name FROM U s e r s WHERE

name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’

What if the user enters:

’ ; DROP TABLE U s e r s −−

The query is run as:

(12)

Unions

Normal SELECT commands retrieve a number of rows from a database

UNION can be used to append data from other tables and even from unrelated applications

(13)

Unions

What if the user enters:

’ UNION SELECT ’ Admin ’ −−

The query is run as

(14)

Unions

SELECT a . f o r u m i d , a . a u t h v i e w , a . a u t h m o d FROM p h p b b a u t h a c c e s s a , p h p b b u s e r g r o u p ug WHERE ug . u s e r i d = ’ $ i d ’

AND ug . u s e r p e n d i n g = 0 AND a . g r o u p i d = ug . g r o u p i d

What if the user enters the following:

0 ’ UNION SELECT TABLE SCHEMA , TABLE NAME , COLUMN NAME FROM i n f o r m a t i o n s c h e m a . COLUMNS

This would return the database name, table name, and column name of everything that’s stored in the database engine.

wiki - Users - name wiki - Users - password

(15)

Invalid queries

When presented with an invalid query, most database engines will provide detailed information that can aid in debugging

Attackers can use this information to determine:

I Database vendor and version

I Operating system

I Data types

I Schema information

Most databases are configured by default to run in developer mode You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ’’’ AND password = ’’

(16)

Tautologies

Tautology attacks work by modifying conditionals to make the statement always true

Either remove conditionals or add new conditionals

SELECT name FROM U s e r s WHERE name = ’ Brandon’ OR ’1’ = ’1 ’ AND p a s s w o r d = ’ ’

(17)

Inference

Applications that are in production typically disable debugging to the point that if attacks succeed, no usable feedback is provided

Used against mysql.com on March 27 2011 Normal blind attacks

I Cannot see error messages but can see the results of modified queries

I Use IF statements to selectively change a value to infer information

Totally blind attacks

I Attackers get no visual feedback

I _{Use statements that selectively slow a query to infer information}

name = ’ Brandon’ AND IF((SELECT SUBSTR(password, 0, 1) FROM Users WHERE name = ’admin’) < ASCII(’L’), 1, 0) -- ’

(18)

Detection evasion

Many intrusion detection systems rely on string comparisons or check for illegal characters

Detection evasion tries to evade these detection patterns Build strings using concatenation, insertion, substring Build strings from hashes or constant string function MySQL lets you specify strings in hex:

0x4272616E646F6E

(19)

Second order attacks

All of the attacks shown thus far have taken input directly from the an attacker and used it to modify queries

Second order attacks work by inserting data into a database and then retrieving it later and using it in attacks

These are much more difficult to predict because the database and previous results are usually considered a safe source

All data that could have ever originated from a user need to be sanitized every time it’s used in a query

(20)

Second order attacks

Searching for posts by users whose name matches a particular pattern

SELECT u s e r n a m e FROM p h p b b u s e r s WHERE u s e r n a m e LIKE ’ $ r e g e x ’

SELECT p o s t i d , p o s t n a m e FROM p h p b b p o s t s WHERE u s e r n a m e = ’ ’

What if a user registers the following username?

(21)

Prevention

So what can you, as a developer, do to prevent injection?[13]

I Sanitize input

I Parameterizing input

I Input type checking

I Positive pattern matching

The problem with all these techniques is that they must be applied consistently by hand

(22)

Input sanitization

$ c o n n e c t i o n = new m y s q l i ( ’ 1 2 7 . 0 . 0 . 1 ’ , ’ Brandon ’ , ’ 1234 ’ , ’ w i k i ’ ) ; $name = $ POST [ ’ name ’ ] ;

$ p a s s w o r d = $ POST [ ’ p a s s w o r d ’ ] ;

$ q u e r y = ”SELECT name FROM U s e r s WHERE name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’ ” ;

$ r e s u l t = $ c o n n e c t i o n −>q u e r y ( $ q u e r y ) o r d i e ( $ c o n n e c t i o n −>c o n n e c t e r r o r ( ) ) ;

So what happens when the user enters: Brandon’ ; DROP TABLE Users

--The problem is that single quote changes the meaning of the query. By escaping that quote, we can remove its significance.

$name = $ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ name ’ ] ) ; $ p a s s w o r d =

$ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ p a s s w o r d ’ ] ) ;

Now, name will be set to

(23)

--Input sanitization

Unfortunately, you have to do more than just escape strings! SQL lets you leave numeric constants unquoted, so this is ok: SELECT username FROM Users WHERE id = 5

What happens if we do this with PHP?

$ i d = $ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ i d ’ ] ) ; $ q u e r y = ”SELECT u s e r n a m e FROM U s e r s WHERE i d = $ i d ” ; $ r e s u l t = $ c o n n e c t i o n −>q u e r y ( $ q u e r y ) ;

What if the user enters: 5 ; DROP TABLE Users

When this string is escaped, it is not changed! Always quote your variables inside of queries

(24)

Input sanitization

User input can come from unexpected places[19] HTTP POST requests and GET requests Hidden fields

HTTP headers (referrer, user-agent, etc.) Cookies

In PHP, arrays are stored as key-value pairs: ’id’ → ’1’, ’email’ → ’foo@bar’. The extract function moves an array’s key-value pairs into variable scope e x t r a c t ( $ POST ) ; is equivalent to $ i d = $ POST [ ’ i d ’ ] ; $ e m a i l = $ POST [ ’ e m a i l ’ ] ; $ n e w P a s s w o r d = ’ ’ ; f o r ( $ i = 0 ; $ i < 8 ; $ i ++) $ n e w P a s s w o r d .= c h r ( r a n d ( 9 7 , 1 2 2 ) ) ;

(25)

Parameterizing input

Easiest, most reliable way to prevent injection attacks Queries are prepared with placeholders where data will go User input is bound and sent to the database separately The database treats input as values so there is no possibility of injection

(26)

Parameterizing input

In PHP, when you bind a parameter, you have to specify its type i → integer d → double s → string b → binary blob $ c o n n e c t i o n = new m y s q l i ( ’ 1 2 7 . 0 . 0 . 1 ’ , ’ Brandon ’ , ’ 1234 ’ , ’ w i k i ’ ) ; $ i d = $ POST [ ’ i d ’ ] ; $ t o p i c = $ POST [ ’ t o p i c ’ ] ;

$ s t m t = $ c o n n e c t i o n −>p r e p a r e ( ”SELECT p o s t C o u n t FROM p o s t s WHERE u s e r i d = ? AND t o p i c = ? ” ) ;

$ s t m t−>b i n d p a r a m ( ” i s ” , $ i d , $ t o p i c ) ; $ s t m t−>e x e c u t e ( ) ;

(27)

Input type checking

$ i d = $ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ i d ’ ] ) ; $ q u e r y = ”SELECT u s e r n a m e FROM U s e r s WHERE i d = $ i d ” ; $ r e s u l t = $ c o n n e c t i o n −>q u e r y ( $ q u e r y ) ;

An attacker could enter: 5 ; DROP TABLE USERS

--What if you just typecast as you receive it from the user?

$ i d = ( i n t ) $ POST [ ’ i d ’ ] ;

Your framework will either reject the input, or ignore the bad part No injection is possible through integers!

(28)

Positive pattern matching

It’s difficult to predict all the bad things that an attacker can do It’s easy to predict what kinds of things are normal

Use regular expressions to match what is expected

(29)

Black box testing

Black box testing tests an application without examining source code

I SQL Inject Me[25]

I WAVES[15]

I SQLiX[7]

I HP WebInspect[14]

I IBM Rational AppScan[16]

Black box testing is limited by its list of prebuilt attacks Will probably not find second order attacks

(30)

Static analysis

Analyzes the source code for problems No runtime overhead

I JDBC checker[8]

I LAPSE[19]

I PHP-Checker[27]

I Pixy[17]

Source code is not always available

Can present false positives because they can’t determine if a source is safe[27]

(31)

Static and dynamic analysis

Source code is analyzed and then queries are monitored at runtime A model of normal queries is built and queries that don’t match are rejected

I AMNESIA[11]

I DFA web filter[22]

Source code is not always available Fixing the source can be difficult

If queries are not identified during the static analysis, there will be false positives

(32)

Taint tracking

User input is marked as tainted as it enters an application Tainted data are tracked

Queries that are generated using input are checked to make sure that tainted data do not modify the query

I SQLGuard[4]

I Java String class[10]

I WASP[12]

I SQLrand[3]

I CANDID[2]

I Metacharacter tracking[26]

I Hamming distance[18]

Ignore second order attacks False sense of security

(33)

Blacklisting

Reject queries based on certain features

Effectiveness is proportional to the sophistication of the detection technique

Doesn’t require access to source code

Requires no interaction from the user or modification of code

I GreenSQL[9]

I _SQLassie

Blacklists must be updated to deal with new attacks Runtime overhead

(34)

Permissions

MySQL lets you define permissions on a per user/host basis Each user/host combination can have a separate password and permission set

Most web applications shouldn’t need to DROP TABLES, so remove those permissions

Give each application a separate account and only give that user access to one particular database

Make sure the passwords are strong! Limit connections to machines you trust

(35)

[1] Alexa.

Alexa top 500 global sites.

http://www.alexa.com/topsites, December 2010.

[2] Bandhakavi, S., Bisht, P., Madhusudan, P., and Venkatakrishnan, V. N.

CANDID: preventing SQL injection attacks using dynamic candidate evaluations.

In CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security (New York, NY, USA, 2007), ACM, pp. 12–24.

[3] Boyd, S. W., and Keromytis, A. D.

SQLrand: Preventing SQL injection attacks.

In In Proceedings of the 2nd Applied Cryptography and Network Security (ACNS) Conference (2004), pp. 292–302.

(36)

In Proceedings of the 5th international workshop on Software engineering and middleware (New York, NY, USA, 2005), SEM ’05, ACM, pp. 106–113.

[5] Christey, S., and Martin, R. A.

CVE - vulnerability type distributions in CVE.

http://cve.mitre.org/docs/vuln-trends/, May 2007.

[6] Clarke, J.

SQL Injection Attacks and Defense, first ed.

Syngrass, May 2009.

[7] Cochin, C.

SQLiX project.

http://www.owasp.org/index.php/Category: OWASP_SQLiX_Project, August 2006.

[8] Gould, C., Su, Z., and Devanbu, P.

(37)

[9] GreenSQL.

GreenSQL — open source SQL database security, SQL injection prevention.

http://www.greensql.net, December 2010.

[10] Haldar, V., Chandra, D., and Franz, M.

Dynamic taint propagation for Java.

In In Proceedings of the 21st Annual Computer Security Applications Conference (2005), pp. 303–311.

[11] Halfond, W. G. J., and Orso, A.

Combining static analysis and runtime monitoring to counter SQL-injection attacks.

In Proceedings of the third international workshop on Dynamic analysis (New York, NY, USA, 2005), WODA ’05, ACM, pp. 1–7.

[12] Halfond, W. G. J., Orso, A., and Manolios, P.

(38)

In Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (New York, NY, USA, 2006), SIGSOFT ’06/FSE-14, ACM, pp. 175–185.

[13] Halfond, W. G. J., Viegas, J., and Orso, A.

A classification of SQL-injection attacks and countermeasures.

In Proc. of the International Symposium on Secure Software Engineering (2006).

[14] Hewlett Packard.

HP WebInspect Software.

http://www.hp.com/go/appsec, December 2010.

[15] Huang, Y.-W., Huang, S.-K., Lin, T.-P., and Tsai, C.-H.

Web application security assessment by fault injection and behavior monitoring.

In Proceedings of the 12th international conference on World Wide Web (New York, NY, USA, 2003), WWW ’03, ACM, pp. 148–159.

(39)

http://www-01.ibm.com/software/awdtools/appscan/, December 2010.

[17] Jovanovic, N.

Pixy: XSS and SQLI scanner for PHP.

http://pixybox.seclab.tuwien.ac.at/pixy/, July 2007.

[18] Liu, A., Yuan, Y., Wijesekera, D., and Stavrou, A.

SQLProb: a proxy-based architecture towards preventing SQL injection attacks.

In Proceedings of the 2009 ACM symposium on Applied Computing (New York, NY, USA, 2009), SAC ’09, ACM, pp. 2054–2061.

[19] Livshits, V. B., and Lam, M. S.

Finding security vulnerabilities in Java applications with static analysis.

In Proceedings of the 14th conference on USENIX Security Symposium - Volume 14 (Berkeley, CA, USA, 2005), USENIX

(40)

SQL injection attacks and some tips on how to prevent them.

http://www.codeproject.com/KB/database/ SqlInjectionAttacks.aspx, January 2005.

[21] Mavituna, F.

SQL injection cheat sheet.

http:

//ferruh.mavituna.com/sql-injection-cheatsheet-oku, March 2007.

[22] Muthuprasanna, M., Wei, K., and Kothari, S.

Eliminating SQL injection attacks - a transparent defense mechanism.

In Web Site Evolution, 2006. WSE ’06. Eighth IEEE International Symposium on (2006), pp. 22 –32.

[23] OWASP.

Open web application security project (OWASP) top ten project.

(41)

[24] SANS Institute.

SANS: The top cyber security risks.

http://www.sans.org/top-cyber-security-risks/, September 2009. [25] SecCom. SQL inject me 0.4.5. http://labs.securitycompass.com/index.php/exploit-me/ sql-inject-me/, May 2010.

[26] Su, Z., and Wasserman, G.

The essence of command injection attacks in web applications.

ACM Press, pp. 372–382.

[27] Xie, Y., and Aiken, A.

Static detection of security vulnerabilities in scripting languages.

In Proceedings of the 15th conference on USENIX Security Symposium - Volume 15 (Berkeley, CA, USA, 2006), USENIX