Bayesian Classification for SQL Injection Detection
Brandon Skari
College of Engineering and Applied Science University of Wyoming
Laramie, Wyoming 82070 [email protected]
Overview
SQL Injection Types of attacks Prevention techniques SQLassieDynamic web sites
Change based on user interaction
I Web mail
I Online stores
I Search engines
I Social networks
I Banking sites
SQL Injection
Injection attacks are one of the foremost threats facing dynamic web applications[23, 13, 24, 5]
Allow attackers to:
I Access, modify and delete data
I Bypass authentication
I Denial of service
Queries
Queries usually come in 4 types:
I Select
I Insert
I Update
I Delete
A typical query may look like the following[6]:
SELECT name FROM U s e r s
Queries
A developer has written:
SELECT name FROM U s e r s
WHERE name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’
What happens when the user enters the following for the username?
name =
The query is run as the following:
SELECT name FROM U s e r s
Types of attacks
Halfond[13] classified attacks based on their intention Identifying injectable parameters
Performing database fingerprinting Determining database schema Adding or modifying data Denial of service
Accessing data Avoiding detection Bypassing authentication Executing remote commands Performing privilege escalation
Attack techniques
There are a number of techniques that attackers use to attack a database[6, 13, 20, 21] Multiple queries Unions Invalid queries Tautologies Inference Detection evasion Second order attacks
Multiple queries
The easiest attacks to implement
SQL allows multiple queries to be run in sequence, if they are separated by semicolons
Attackers can run essentially query, allowing full access to the database
Most database engines include special commands to access files or other information from the operating system
Multiple queries
SELECT name FROM U s e r s WHERE
name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’
What if the user enters:
’ ; DROP TABLE U s e r s −−
The query is run as:
SELECT name FROM U s e r s WHERE
Unions
Normal SELECT commands retrieve a number of rows from a database
UNION can be used to append data from other tables and even from unrelated applications
Unions
SELECT name FROM U s e r s WHERE
name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’
What if the user enters:
’ UNION SELECT ’ Admin ’ −−
The query is run as
SELECT name FROM U s e r s WHERE
Unions
SELECT a . f o r u m i d , a . a u t h v i e w , a . a u t h m o d FROM p h p b b a u t h a c c e s s a , p h p b b u s e r g r o u p ug WHERE ug . u s e r i d = ’ $ i d ’
AND ug . u s e r p e n d i n g = 0 AND a . g r o u p i d = ug . g r o u p i d
What if the user enters the following:
0 ’ UNION SELECT TABLE SCHEMA , TABLE NAME , COLUMN NAME FROM i n f o r m a t i o n s c h e m a . COLUMNS
This would return the database name, table name, and column name of everything that’s stored in the database engine.
wiki - Users - name wiki - Users - password
Invalid queries
When presented with an invalid query, most database engines will provide detailed information that can aid in debugging
Attackers can use this information to determine:
I Database vendor and version
I Operating system
I Data types
I Schema information
Most databases are configured by default to run in developer mode You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ’’’ AND password = ’’
Tautologies
Tautology attacks work by modifying conditionals to make the statement always true
Either remove conditionals or add new conditionals
SELECT name FROM U s e r s WHERE
name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’
SELECT name FROM U s e r s WHERE name = ’ Brandon’ OR ’1’ = ’1 ’ AND p a s s w o r d = ’ ’
Inference
Applications that are in production typically disable debugging to the point that if attacks succeed, no usable feedback is provided
Used against mysql.com on March 27 2011 Normal blind attacks
I Cannot see error messages but can see the results of modified queries
I Use IF statements to selectively change a value to infer information
Totally blind attacks
I Attackers get no visual feedback
I Use statements that selectively slow a query to infer information
SELECT name FROM U s e r s WHERE
name = ’ Brandon’ AND IF((SELECT SUBSTR(password, 0, 1) FROM Users WHERE name = ’admin’) < ASCII(’L’), 1, 0) -- ’
Detection evasion
Many intrusion detection systems rely on string comparisons or check for illegal characters
Detection evasion tries to evade these detection patterns Build strings using concatenation, insertion, substring Build strings from hashes or constant string function MySQL lets you specify strings in hex:
0x4272616E646F6E
Second order attacks
All of the attacks shown thus far have taken input directly from the an attacker and used it to modify queries
Second order attacks work by inserting data into a database and then retrieving it later and using it in attacks
These are much more difficult to predict because the database and previous results are usually considered a safe source
All data that could have ever originated from a user need to be sanitized every time it’s used in a query
Second order attacks
Searching for posts by users whose name matches a particular pattern
SELECT u s e r n a m e FROM p h p b b u s e r s WHERE u s e r n a m e LIKE ’ $ r e g e x ’
SELECT p o s t i d , p o s t n a m e FROM p h p b b p o s t s WHERE u s e r n a m e = ’ ’
What if a user registers the following username?
Prevention
So what can you, as a developer, do to prevent injection?[13]
I Sanitize input
I Parameterizing input
I Input type checking
I Positive pattern matching
The problem with all these techniques is that they must be applied consistently by hand
Input sanitization
$ c o n n e c t i o n = new m y s q l i ( ’ 1 2 7 . 0 . 0 . 1 ’ , ’ Brandon ’ , ’ 1234 ’ , ’ w i k i ’ ) ; $name = $ POST [ ’ name ’ ] ;
$ p a s s w o r d = $ POST [ ’ p a s s w o r d ’ ] ;
$ q u e r y = ”SELECT name FROM U s e r s WHERE name = ’ $name ’ AND p a s s w o r d = ’ $ p a s s w o r d ’ ” ;
$ r e s u l t = $ c o n n e c t i o n −>q u e r y ( $ q u e r y ) o r d i e ( $ c o n n e c t i o n −>c o n n e c t e r r o r ( ) ) ;
So what happens when the user enters: Brandon’ ; DROP TABLE Users
--The problem is that single quote changes the meaning of the query. By escaping that quote, we can remove its significance.
$name = $ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ name ’ ] ) ; $ p a s s w o r d =
$ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ p a s s w o r d ’ ] ) ;
Now, name will be set to
--Input sanitization
Unfortunately, you have to do more than just escape strings! SQL lets you leave numeric constants unquoted, so this is ok: SELECT username FROM Users WHERE id = 5
What happens if we do this with PHP?
$ i d = $ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ i d ’ ] ) ; $ q u e r y = ”SELECT u s e r n a m e FROM U s e r s WHERE i d = $ i d ” ; $ r e s u l t = $ c o n n e c t i o n −>q u e r y ( $ q u e r y ) ;
What if the user enters: 5 ; DROP TABLE Users
When this string is escaped, it is not changed! Always quote your variables inside of queries
Input sanitization
User input can come from unexpected places[19] HTTP POST requests and GET requests Hidden fields
HTTP headers (referrer, user-agent, etc.) Cookies
In PHP, arrays are stored as key-value pairs: ’id’ → ’1’, ’email’ → ’foo@bar’. The extract function moves an array’s key-value pairs into variable scope e x t r a c t ( $ POST ) ; is equivalent to $ i d = $ POST [ ’ i d ’ ] ; $ e m a i l = $ POST [ ’ e m a i l ’ ] ; $ n e w P a s s w o r d = ’ ’ ; f o r ( $ i = 0 ; $ i < 8 ; $ i ++) $ n e w P a s s w o r d .= c h r ( r a n d ( 9 7 , 1 2 2 ) ) ;
Parameterizing input
Easiest, most reliable way to prevent injection attacks Queries are prepared with placeholders where data will go User input is bound and sent to the database separately The database treats input as values so there is no possibility of injection
Parameterizing input
In PHP, when you bind a parameter, you have to specify its type i → integer d → double s → string b → binary blob $ c o n n e c t i o n = new m y s q l i ( ’ 1 2 7 . 0 . 0 . 1 ’ , ’ Brandon ’ , ’ 1234 ’ , ’ w i k i ’ ) ; $ i d = $ POST [ ’ i d ’ ] ; $ t o p i c = $ POST [ ’ t o p i c ’ ] ;
$ s t m t = $ c o n n e c t i o n −>p r e p a r e ( ”SELECT p o s t C o u n t FROM p o s t s WHERE u s e r i d = ? AND t o p i c = ? ” ) ;
$ s t m t−>b i n d p a r a m ( ” i s ” , $ i d , $ t o p i c ) ; $ s t m t−>e x e c u t e ( ) ;
Input type checking
$ i d = $ c o n n e c t i o n −> m y s q l r e a l e s c a p e s t r i n g ( $ POST [ ’ i d ’ ] ) ; $ q u e r y = ”SELECT u s e r n a m e FROM U s e r s WHERE i d = $ i d ” ; $ r e s u l t = $ c o n n e c t i o n −>q u e r y ( $ q u e r y ) ;
An attacker could enter: 5 ; DROP TABLE USERS
--What if you just typecast as you receive it from the user?
$ i d = ( i n t ) $ POST [ ’ i d ’ ] ;
Your framework will either reject the input, or ignore the bad part No injection is possible through integers!
Positive pattern matching
It’s difficult to predict all the bad things that an attacker can do It’s easy to predict what kinds of things are normal
Use regular expressions to match what is expected
Black box testing
Black box testing tests an application without examining source code
I SQL Inject Me[25]
I WAVES[15]
I SQLiX[7]
I HP WebInspect[14]
I IBM Rational AppScan[16]
Black box testing is limited by its list of prebuilt attacks Will probably not find second order attacks
Static analysis
Analyzes the source code for problems No runtime overhead
I JDBC checker[8]
I LAPSE[19]
I PHP-Checker[27]
I Pixy[17]
Source code is not always available
Can present false positives because they can’t determine if a source is safe[27]
Static and dynamic analysis
Source code is analyzed and then queries are monitored at runtime A model of normal queries is built and queries that don’t match are rejected
I AMNESIA[11]
I DFA web filter[22]
Source code is not always available Fixing the source can be difficult
If queries are not identified during the static analysis, there will be false positives
Taint tracking
User input is marked as tainted as it enters an application Tainted data are tracked
Queries that are generated using input are checked to make sure that tainted data do not modify the query
I SQLGuard[4]
I Java String class[10]
I WASP[12]
I SQLrand[3]
I CANDID[2]
I Metacharacter tracking[26]
I Hamming distance[18]
Ignore second order attacks False sense of security
Blacklisting
Reject queries based on certain features
Effectiveness is proportional to the sophistication of the detection technique
Doesn’t require access to source code
Requires no interaction from the user or modification of code
I GreenSQL[9]
I SQLassie
Blacklists must be updated to deal with new attacks Runtime overhead
Permissions
MySQL lets you define permissions on a per user/host basis Each user/host combination can have a separate password and permission set
Most web applications shouldn’t need to DROP TABLES, so remove those permissions
Give each application a separate account and only give that user access to one particular database
Make sure the passwords are strong! Limit connections to machines you trust
[1] Alexa.
Alexa top 500 global sites.
http://www.alexa.com/topsites, December 2010.
[2] Bandhakavi, S., Bisht, P., Madhusudan, P., and Venkatakrishnan, V. N.
CANDID: preventing SQL injection attacks using dynamic candidate evaluations.
In CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security (New York, NY, USA, 2007), ACM, pp. 12–24.
[3] Boyd, S. W., and Keromytis, A. D.
SQLrand: Preventing SQL injection attacks.
In In Proceedings of the 2nd Applied Cryptography and Network Security (ACNS) Conference (2004), pp. 292–302.
In Proceedings of the 5th international workshop on Software engineering and middleware (New York, NY, USA, 2005), SEM ’05, ACM, pp. 106–113.
[5] Christey, S., and Martin, R. A.
CVE - vulnerability type distributions in CVE.
http://cve.mitre.org/docs/vuln-trends/, May 2007.
[6] Clarke, J.
SQL Injection Attacks and Defense, first ed.
Syngrass, May 2009.
[7] Cochin, C.
SQLiX project.
http://www.owasp.org/index.php/Category: OWASP_SQLiX_Project, August 2006.
[8] Gould, C., Su, Z., and Devanbu, P.
[9] GreenSQL.
GreenSQL — open source SQL database security, SQL injection prevention.
http://www.greensql.net, December 2010.
[10] Haldar, V., Chandra, D., and Franz, M.
Dynamic taint propagation for Java.
In In Proceedings of the 21st Annual Computer Security Applications Conference (2005), pp. 303–311.
[11] Halfond, W. G. J., and Orso, A.
Combining static analysis and runtime monitoring to counter SQL-injection attacks.
In Proceedings of the third international workshop on Dynamic analysis (New York, NY, USA, 2005), WODA ’05, ACM, pp. 1–7.
[12] Halfond, W. G. J., Orso, A., and Manolios, P.
In Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (New York, NY, USA, 2006), SIGSOFT ’06/FSE-14, ACM, pp. 175–185.
[13] Halfond, W. G. J., Viegas, J., and Orso, A.
A classification of SQL-injection attacks and countermeasures.
In Proc. of the International Symposium on Secure Software Engineering (2006).
[14] Hewlett Packard.
HP WebInspect Software.
http://www.hp.com/go/appsec, December 2010.
[15] Huang, Y.-W., Huang, S.-K., Lin, T.-P., and Tsai, C.-H.
Web application security assessment by fault injection and behavior monitoring.
In Proceedings of the 12th international conference on World Wide Web (New York, NY, USA, 2003), WWW ’03, ACM, pp. 148–159.
http://www-01.ibm.com/software/awdtools/appscan/, December 2010.
[17] Jovanovic, N.
Pixy: XSS and SQLI scanner for PHP.
http://pixybox.seclab.tuwien.ac.at/pixy/, July 2007.
[18] Liu, A., Yuan, Y., Wijesekera, D., and Stavrou, A.
SQLProb: a proxy-based architecture towards preventing SQL injection attacks.
In Proceedings of the 2009 ACM symposium on Applied Computing (New York, NY, USA, 2009), SAC ’09, ACM, pp. 2054–2061.
[19] Livshits, V. B., and Lam, M. S.
Finding security vulnerabilities in Java applications with static analysis.
In Proceedings of the 14th conference on USENIX Security Symposium - Volume 14 (Berkeley, CA, USA, 2005), USENIX
SQL injection attacks and some tips on how to prevent them.
http://www.codeproject.com/KB/database/ SqlInjectionAttacks.aspx, January 2005.
[21] Mavituna, F.
SQL injection cheat sheet.
http:
//ferruh.mavituna.com/sql-injection-cheatsheet-oku, March 2007.
[22] Muthuprasanna, M., Wei, K., and Kothari, S.
Eliminating SQL injection attacks - a transparent defense mechanism.
In Web Site Evolution, 2006. WSE ’06. Eighth IEEE International Symposium on (2006), pp. 22 –32.
[23] OWASP.
Open web application security project (OWASP) top ten project.
[24] SANS Institute.
SANS: The top cyber security risks.
http://www.sans.org/top-cyber-security-risks/, September 2009. [25] SecCom. SQL inject me 0.4.5. http://labs.securitycompass.com/index.php/exploit-me/ sql-inject-me/, May 2010.
[26] Su, Z., and Wasserman, G.
The essence of command injection attacks in web applications.
ACM Press, pp. 372–382.
[27] Xie, Y., and Aiken, A.
Static detection of security vulnerabilities in scripting languages.
In Proceedings of the 15th conference on USENIX Security Symposium - Volume 15 (Berkeley, CA, USA, 2006), USENIX