Web Application Security 101 Real-world examples, tools and techniques for securing websites


A WhiteHat Security White Paper


Introduction

Over 700 million people worldwide bank, shop, buy airline tickets, and perform research using the World Wide Web. With each transaction, private information, including names, addresses, phone numbers, credit card numbers, and passwords, is routinely transferred and stored in a variety of locations. Billions of dollars and millions of personal identities are at stake every day. In the past, security professionals thought firewalls, Secure Sockets Layer (SSL), patching, and privacy policies were enough to protect websites from hackers (see 5 Myths of Web Application Security [1]). Today, with prominent Web attacks taking place seemingly every week, the industry knows better.

The Web Application Security Consortium has identified twenty-four classes of Web attacks, including Cross-Site Scripting [2] (XSS) and SQL Injection [3], used to prey upon corporations, their customers, and educational institutions. These attacks are forcing many organizations to take a hard look at their existing web application security posture. In many cases, web application security is a new concept with many facets. This paper will examine the fundamental components of a website, entry points of web attacks, attack methodologies, and suggested preventive measures.

The Basics

The best way to begin exploring web application security is by learning how the Web works. While most IT professionals are very comfortable using a web browser to surf the Web, few of us look behind the application at the client-server [4] structure that powers the Web. This structure governs the way web browsers (Firefox [5], Microsoft Internet Explorer [6]) must communicate with web servers [7] (Apache [8], Microsoft IIS [9]) to retrieve web pages [10]. To peer deeper into the world of the Web, we’ll begin by looking at the web browser location bar (see diagram 1).

All major web browsers possess a location bar that displays the web address [11] (URL) of the current web page. URL manipulation is one of


(location bars) are required to enable customers, partners, and hackers to view your website. URLs are used to uniquely identify the location of a web page or on-line resource. When traveling from one web page to the next, the displayed URL is updated. URLs, also referred to as links, are commonly embedded in web pages, where clicking on them leads to other pages. URLs also tell us a lot about a website. They tell us what type of communication it expects, what type of operating system it runs, the type of web application code being used, and more. We’ll be exploring the anatomy of URLs closely in the following section, and we’ll look at how each part can be vulnerable to attack.

Diagram 1: Location Bar

A critical point to note here is the lack of firewall protection on the Web. When visiting any one of the millions of websites that exist today, it’s unlikely that you will encounter firewall protection. It’s not that firewalls are absent or useless; they are neither. In fact, most websites have firewalls protecting them from network-based attacks (worms, viruses, hackers). But network-based attacks are fundamentally different from web-based attacks (XSS, SQL Injection), which pass straight through a firewall’s defenses.

A firewall’s job is to prevent unauthorized connections to protected network devices. For instance, you probably do not want intruders connecting to internal databases, workstations, printers, and so forth. For a website to be accessible to the public, firewalls must allow traffic to the web server. If they did not, no one would be able to visit the website. This means that if a website has a vulnerability, firewalls are powerless, since web traffic must be allowed in. Web application security sits above the network layer, in a world of its own.


Basic Web Architecture

Now that we’ve established a basic understanding of Web communication, let’s dive deeper into the technology by analyzing the anatomy of a URL.

Anatomy of Web Requests and URLs

HTTP and HTTPS

http://example.com/path/to/application.cgi?param1=value1&param2=value2

At the beginning of a URL is the designated communication protocol (in this case “http”). The protocol designates how the web browser and the web server communicate with each other. HTTP [12] is a stateless protocol. This means that when a user wants something, a connection to the web server is established, a request is sent, and a response is received. Afterwards the connection is severed. HTTPS is another common protocol specification, which is HTTP wrapped with Secure Sockets Layer [13] (SSL) encryption.

SSL connections, indicated by a tiny lock symbol in the web browser window, ensure that information sent to and from a website is encrypted. Anyone monitoring the network traffic will not be able to read the data. This is great for protecting credit card numbers, social security numbers, and other forms of sensitive data traveling across the network. Contrary to popular belief, however, SSL does not secure a website. SSL only protects data in transit, and does not protect information stored once it arrives. Sites using strong, 128-bit SSL have been hacked just as often as those that do not. When private data is stored on the web, the risk is at the server, not in between.


Web Server / Domain Name

http://example.com/path/to/application.cgi?param1=value1&param2=value2

The next section of the URL after the double forward slash, “example.com,” specifies the web server’s domain name. When we click on a URL, this is the web server to which the web browser will connect. Web servers handle the network communication between the website and the visitor.

Directory Path

http://example.com/path/to/application.cgi?param1=value1&param2=value2

Beginning with the forward slash after the domain name is the directory or file path. This points to the location on the web server where the resource is found. A resource could be an HTML file, web application, PowerPoint presentation, or almost any other file type. The URL tells the web server where to find it. Hackers are able to manipulate the directory path to find old files, such as lists of customer account numbers, that may have been updated and forgotten on a server.

Web Applications

http://example.com/path/to/application.cgi?param1=value1&param2=value2

Web application security owes its name to the next section of the URL. The portion of the URL after the final forward slash and before the question mark is the “web application [14].” Web applications are software that enables a website to serve up dynamic content. They can do just about anything and be programmed in just about any language. When they receive requests, these applications dynamically create web pages to return to the web browser. Web applications turn an ordinary web server into a Google (www.google.com), a Yahoo! (www.yahoo.com), an online auction, web bank, message board, blog, etc. Without web applications, the web would be filled with static content and none of the interactivity that drives innovation.


Query String

http://example.com/path/to/application.cgi?param1=value1&param2=value2

The last part of the URL, after the question mark, is referred to as the “query string,” and is filled with parameter name-value pairs. Web applications use parameter values as input to the program. For example, by reading the following URL, we can conclude that we are using http to connect to google.com, executing the web application “search,” and searching for “testing.”

http://www.google.com/search?q=testing

If we were to change the “q” parameter value, we could search for “security.”

http://www.google.com/search?q=security

In more severe cases, hackers use parameter tampering to gain unauthorized access to customer order numbers and other private data.
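The anatomy described in this section can be checked mechanically. The following is a minimal sketch using Python's standard urllib.parse module; the variable names are illustrative choices, and splitting the path into a directory and an application simply follows the description given above.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://example.com/path/to/application.cgi?param1=value1&param2=value2"
parts = urlsplit(url)

protocol = parts.scheme   # communication protocol
domain = parts.netloc     # web server / domain name

# Everything up to the final forward slash is the directory path;
# what follows it (before the question mark) is the web application.
directory, _, application = parts.path.rpartition("/")

# The query string becomes a mapping of parameter names to lists of values.
params = parse_qs(parts.query)

print(protocol, domain, directory, application, params)
```

Every one of these components arrives from the client, so a web application has to treat each of them as untrusted input.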

Cookies

As mentioned earlier, HTTP is a stateless protocol. From one HTTP request to the next, the web server cannot determine if the second request is from the same person. Without the ability to make this connection, there is no way to track a user on a website and it’s difficult to maintain user login state. Cookies provide a mechanism to keep state.

Cookies are small pieces of data supplied by the web server and stored by the web browser. With each new request the browser sends, the cookies it has stored for that website are returned to the web server. This allows a user to be uniquely identified and state to be maintained. Since cookies are used to identify users, they are an attractive target to hackers. Once someone has a user’s cookie, he can effectively become that user.


HTTP Response

HTTP/1.1 200 OK
Date: Thursday, 01 Dec 2006 23:37:18 GMT
Server: Apache
Set-Cookie: Name=Value; path=/; expires=Thursday, 01-Dec-06 23:12:40 GMT
Content-Type: text/html

HTTP Request

GET http://www.whitehatsec.com HTTP/1.1
Host: www.whitehatsec.com
Cookie: Name=Value
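The cookie round trip can be sketched with Python's standard http.cookies module. This is only an illustration: the "browser" here is simulated by a second SimpleCookie object, and the cookie name and value are the placeholders used in the headers above.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["Name"] = "Value"
outgoing["Name"]["path"] = "/"
set_cookie_header = outgoing["Name"].OutputString()

# Browser side (simulated): store the cookie, then return it on the next request.
jar = SimpleCookie()
jar.load(set_cookie_header)
cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())

# Server side again: read the Cookie request header to recognize the visitor.
incoming = SimpleCookie()
incoming.load(cookie_header)
print(incoming["Name"].value)
```

Nothing in this exchange proves that the returned value is the one the server issued, which is why a stolen or edited cookie is enough to impersonate a user.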

Post Data

Post Data is another portion of an HTTP request, and is typically populated with data from web forms. Post Data is most often used when larger amounts of data need to be sent from the browser to the web server. Post Data is more or less identical to a query string, except that it’s located in the body of the request, below the headers.


Post Data

POST http://www.whitehatsec.com HTTP/1.1
Host: www.whitehatsec.com
Content-Length: 32

username=johndoe&password=abc123
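A form submission like the one above can be reproduced with Python's standard urllib.parse module. This is a sketch only: the field names come from the example, while the Content-Type header is an assumption (it is the usual encoding for simple web forms and is not shown in the original request).

```python
from urllib.parse import urlencode, parse_qs

# Browser side: encode the form fields exactly like a query string.
form = {"username": "johndoe", "password": "abc123"}
body = urlencode(form)

# The body length in bytes is what the Content-Length header must carry.
headers = {
    "Content-Type": "application/x-www-form-urlencoded",  # assumed encoding
    "Content-Length": str(len(body.encode("ascii"))),
}

# Server side: the same parser used for query strings recovers the fields.
fields = parse_qs(body)

print(body, headers, fields)
```

Because the body uses the same name-value encoding as a query string, it can be tampered with in the same way.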

Three Places to Attack a Website

Statistically, over 90% of all websites have serious security issues, but the big question is “how are they attacked?” If we look at everything we’ve covered so far, there are basically only three places to attack a website: URLs, cookies, and Post Data. We’ll explore each of these attack points by analyzing real-world exploits. Recall that there are twenty-four classes of attack, and each of these points is vulnerable to all of them.

URL

In July of 2005, the University of Southern California (USC) was informed of a SQL Injection vulnerability found within one of their websites [15]. The website in question is used to accept applications from prospective students. According to sources, a lack of security checks on the login web-form text boxes allowed commands to be sent to the back-end database. These commands enabled public access to personal information including the names, birth dates, addresses, and social-security numbers of up to 280,000 users. While the specific technical details remain undisclosed, here is a likely scenario of what took place.

When filling out the login web-form with a username (johndoe) and password (abc123), the web browser would generate a URL similar to the following:


The data from the HTTP request is passed into a database SQL statement of the web application “login.asp.” Login.asp is responsible for performing user authentication.

string strQry = "SELECT Count(*) FROM Users WHERE UserName='" + username.Text + "' AND Password='" + password.Text + "'";

The resulting SQL command is sent to the database:

SELECT Count(*) FROM Users WHERE UserName='johndoe' AND Password='abc123'

If the SQL command succeeds (the username/password combination is correct), a database record is returned and the user is logged in. A security issue arises if meta-characters are submitted into the web-form instead of the expected alphanumeric usernames and passwords. Specifically, we’re referring to single quotes and semi-colons.

http://victim.com/login.asp?username=';&password=';

The resulting SQL command:

SELECT Count(*) FROM Users WHERE UserName='';' AND Password='';'

The above SQL command produces a database error because its syntax is invalid. An error similar to the following is often displayed within the resulting web page, and is a solid indication that a SQL Injection vulnerability exists.

Microsoft OLE DB Provider for SQL Server error '80040e14'
Unclosed quotation mark before the character string '; password.
/login.asp, line 39

In the case of the USC incident, here is what the hacker likely submitted into the web-form:

http://victim.com/login.asp?username='+OR+1=1&password='+OR+1=1;

Producing the following SQL command:


The previous SQL command always returns true, and produces the first record in the Users table. This allows authentication to be bypassed completely. From this point, the hacker is also able to send any valid SQL commands he can generate to pull information from the database.

This vulnerability could have been avoided with proper sanity checking of the incoming username and password values. If login.asp ONLY accepted alphanumeric characters, the incident would not have occurred.
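The difference between building a SQL statement by string concatenation and passing user input as bound parameters can be sketched as follows. Several things here are assumptions made for illustration: an in-memory SQLite database stands in for the SQL Server back end, the function names are hypothetical, and the payload is a generic always-true condition rather than the exact input used in the USC incident. Bound parameters are shown as a complementary defense alongside the input checking described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real back-end database
conn.execute("CREATE TABLE Users (UserName TEXT, Password TEXT)")
conn.execute("INSERT INTO Users VALUES ('johndoe', 'abc123')")

def login_concatenated(username, password):
    # Vulnerable: user input becomes part of the SQL text itself.
    query = ("SELECT Count(*) FROM Users WHERE UserName='" + username +
             "' AND Password='" + password + "'")
    return conn.execute(query).fetchone()[0] > 0

def login_parameterized(username, password):
    # Safer: the values are bound as data and never parsed as SQL.
    query = "SELECT Count(*) FROM Users WHERE UserName=? AND Password=?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0

payload = "' OR '1'='1"  # illustrative always-true condition

print(login_concatenated(payload, payload))
print(login_parameterized(payload, payload))
print(login_parameterized("johndoe", "abc123"))
```

With the concatenated query the payload rewrites the WHERE clause so that it matches every row; with bound parameters it is compared as an ordinary string and matches nothing.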

Cookie

Just before Valentine’s Day 2003, FTD.com (a large online florist) received word that hackers could illegally access customer information by simply changing a particular number within a cookie [16]. The number was a customer identifier that could be easily guessed and changed to access the sensitive information of other users. This class of attack is generally referred to as Credential/Session Prediction [17]. Security experts confirmed the problem existed and that customer billing records, names, addresses, and phone numbers were exposed. Here is an example of what likely took place:

When users log in to FTD.com or add products to their shopping carts, they are given a cookie to track the current session. The cookie contains a unique customer identifier; in this case “CustomerID”, with a value of “1001.”

Set-Cookie: CustomerID=1001; path=/; expires=Thursday, 01-Dec-06 23:12:40 GMT

On subsequent requests, the user’s browser returns the cookie so that he or she may be properly identified.

Cookie: CustomerID=1001

At this point, an easy assumption is that the next user on the FTD.com website would be given a cookie with a customer identifier of “1002”. A hacker would simply edit the customer identifier in his own FTD.com cookie to that of someone else who has already been to the website.


Cookie: CustomerID=1000

If successful, the hacker would automatically jump into another user’s session, with the ability to take over the account and access personal information.

In this case, the solution is to use random customer identifiers. FTD should have been using random, unique identifiers of at least 10 to 12 characters in length. This would prevent a customer identifier from being guessed, even after extended attempts.
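One way to obtain identifiers that cannot be guessed is to issue a random session token and keep the real customer identifier on the server. The sketch below uses Python's standard secrets module; the in-memory dictionary and the function names are hypothetical stand-ins for whatever session store a real site would use.

```python
import secrets

# Server-side session store: random token -> customer identifier.
# A real site would use a database or cache; a dict keeps the sketch short.
sessions = {}

def start_session(customer_id):
    # 32 random bytes, URL-safe encoded: far too large a space to enumerate.
    token = secrets.token_urlsafe(32)
    sessions[token] = customer_id
    return token

def customer_for(token):
    # Unknown or guessed tokens map to no customer at all.
    return sessions.get(token)

token = start_session(1001)
print(customer_for(token))
print(customer_for("1000"))
```

The cookie then carries only the token, so changing its value to a neighboring number no longer lands in another customer's session.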

Post Data

In October 2005, in an incident known as the Samy Worm, a hacker (Samy) used a common Cross-Site Scripting (XSS) vulnerability to exploit the MySpace social networking website [18]. Users’ web browsers were forced to send web requests that they did not expect to make. Approximately one million users unwittingly altered their profile web page with malicious JavaScript code when they visited other infected profiles.

Samy altered his MySpace profile web page to include some crafty JavaScript exploit code. When another MySpace user visited Samy’s profile, the JavaScript code would execute its payload in the user’s browser. At this point, Samy had complete control over the user’s browser. The payload forced the user to automatically befriend Samy, and also add him as a hero (“Samy is my hero” was appended to each profile page). To propagate site-wide, the exploit code would copy itself to the user’s profile and wait to exploit other unsuspecting users. To halt the infection, MySpace was forced to shut down its website for twenty hours. While the incident was severe, the payload of the Samy worm was relatively benign. It would have been trivial for Samy to force everyone to delete their profile and blog data, resulting in a far more damaging situation for users and for MySpace.


Defense-in-Depth

All it takes is a single security flaw, or one small oversight, for your company to make headlines. Experience tells us that no single protective measure is completely impenetrable. Everything has its weakness, and it’s only a matter of time before that weakness is found and exploited. With this real-world knowledge, most security experts subscribe to a philosophy called defense-in-depth to protect their systems. Defense-in-depth promotes a layered security approach, so that if any single control mechanism fails, other defensive measures are in place to ensure nothing is compromised.

Architecture Security

All secure e-commerce infrastructures must be built on a solid foundation. Without a solid foundation, no amount of security in web application code will be enough to defend a website. Below is a top-level checklist to use to assess your overall security. The Center for Internet Security [19] (CIS) has excellent resources for in-depth, system-specific knowledge of architecture security issues. Also, the Payment Card Industry [20] (PCI) Data Security Standard is another resource for a comprehensive security program.

o Networks are properly segmented to separate public, semi-private, and private systems.

o Perimeter firewalls are in place between network segments to only allow a limited set of network services to communicate.

o Operating systems are hardened and patches are kept up-to-date.

o Web servers are properly patched, configured, and have any


Secure Software Development Program

Secure software is quality software. Vulnerabilities are nothing more than software bugs. And the best way to squash these bugs before they become real problems is to tightly integrate security consideration at all points of the software development life cycle (SDLC), from architecture design, to development releases, to quality assurance phases.

While there are copious amounts of data available covering software security best practices, the primary caution to developers is “DO NOT TRUST CLIENT-SIDE DATA.” Lack of proper input validation is the number-one cause of web application security issues. The Web is a hostile environment; therefore it’s absolutely critical to validate all data you plan to utilize, whether it’s from HTTP requests or the database. Here is a checklist for how to perform proper input validation in any programming language:

o Character-set: Ensure the data only contains characters you expect to receive.

o Length: Ensure the data falls within a restricted minimum and maximum number of bytes.

o Data Format: Ensure the structure of the data is consistent with what is expected. Phone numbers should look like phone numbers, email addresses should look like email addresses, etc.

o Escape: Before data is passed on to sub-systems, especially database or operating system calls, all special characters should be escaped, so that none is allowed into the system unchecked.

o Filtering: Sanitize data to not include dangerous characters. Specifically, convert < and > characters into their equivalent HTML entities to prevent XSS issues.
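Several items of the checklist above can be sketched in a few lines of Python using only the standard library. The function names, the length limits, and the phone-number pattern are assumptions chosen for illustration; a real application would derive them from its own data requirements.

```python
import html
import re

# Character-set rule: usernames may contain only letters and digits.
USERNAME_RE = re.compile(r"[A-Za-z0-9]+")
# Data-format rule: one assumed phone layout, e.g. (408) 492-1817.
PHONE_RE = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

def valid_username(value, min_len=3, max_len=20):
    # Length check first, then the character set. For this ASCII-only
    # character set the character count equals the byte count.
    return (min_len <= len(value) <= max_len
            and USERNAME_RE.fullmatch(value) is not None)

def valid_phone(value):
    # The whole value must match the expected structure.
    return PHONE_RE.fullmatch(value) is not None

def to_html(value):
    # Filtering: convert <, >, &, and quotes into HTML entities before display.
    return html.escape(value)

print(valid_username("johndoe"), valid_username("';"))
print(valid_phone("(408) 492-1817"))
print(to_html("<script>alert(1)</script>"))
```

Each check accepts only what is expected and rejects everything else, rather than trying to list every dangerous input.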


Vulnerability Assessment and Management

No matter how many defensive measures are piled onto a system, the only way to tell if risk is being mitigated is to adequately measure security. For web application security this means comprehensively assessing websites for vulnerabilities (using the WASC Threat Classification) and managing the remediation process when issues are found. To ensure security, this process should be conducted with each change to the web application code.

There are three key points to consider when assessing web applications:

1. Web applications are inherently unique.

As we’ve discussed, each website, whether e-commerce, online banking or health information, contains custom code. Off-the-shelf products cannot fully identify web application vulnerabilities in custom code, and many website vulnerabilities cannot be found in an automated manner. Since each site has specialized functionality, the methods to exploit that application may be just as specialized. Software cannot respond to that level of customization. Security expertise is required.

2. Assessing your production website is essential.

Hackers enter through holes in a live site, not the development or QA environment. Unseen flaws can appear between the time development is completed and production is started. The credit card companies have realized this important distinction. That is why PCI mandates scanning all custom web applications in the production environment.

3. Communication between the development and security teams is critical.

90% of all websites have security issues. That’s a fact. It will take time to fix all the vulnerabilities in a website, which is why teamwork is important. The security organization needs updates on remediation progress so that they can adequately protect corporate applications. The development organization needs to work with security to prioritize fixes so that the most dangerous issues are resolved quickly.


In the past, website vulnerability assessment was a time-consuming and often expensive process. Today, WhiteHat Security offers WhiteHat Sentinel [21], the only continuous vulnerability assessment and management service for web applications.

WhiteHat Sentinel identifies, manages, and recommends remediation for website vulnerabilities. We supply the information needed for organizations to protect corporate websites from attack.

WhiteHat Sentinel is a must-have to secure valuable customer data, comply with industry standards and maintain brand integrity. WhiteHat Sentinel is a crucial component in a website security program and makes web application security easy.

Why choose WhiteHat Sentinel?

• Turnkey

WhiteHat Sentinel is a turnkey service. There’s nothing to install and no technology or personnel investment is required. We’ve simplified the complex world of web application security by delivering only actionable information to clients.

• Comprehensive

WhiteHat Sentinel is the only solution that finds all website flaws, both technical (SQL Injection and Cross Site Scripting) and logical (Insufficient Authorization), in your website. It covers the WASC twenty-four classes of attack for maximum coverage. And WhiteHat Sentinel can assess both production and development websites, an important requirement for PCI.

• Timely

WhiteHat Sentinel continuously assesses websites and provides web-based reporting, delivering 24/7 access to vulnerability status for development and security personnel.

• Cost-effective

It’s difficult to quantify the total cost of a security breach, but we know it’s a combination of diminished customer confidence, exposed data, technical repairs, customer notification and brand damage, all of which can have significant financial impact.


knowledge to repair potential trouble spots before they can be exploited. In addition, our customers save time by replacing hard-to-use tools with an easy-to-access, web-based management console. They also save money over quarterly or annual web assessments by consultants because one year of WhiteHat Sentinel is approximately the same cost as a single consultant assessment. With WhiteHat Sentinel, corporate security has the ongoing, big-picture view of the security of all web applications necessary to protect business from the growing number of web-based attacks.

For more information about WhiteHat Sentinel or to find more white papers on web application security, please contact WhiteHat Security:

Email: [email protected]
Website: www.whitehatsec.com
Telephone: (408) 492-1817

About WhiteHat Security, Inc.

Headquartered in Santa Clara, California, WhiteHat Security is a leading provider of web application security services. WhiteHat develops comprehensive, easy-to-use, cost-effective solutions that enable companies to secure valuable customer data, meet federal compliance standards, and maintain customer confidence. WhiteHat Sentinel, the company’s flagship service, provides continuous


Notes

[1] 5 Security Myths
http://www.varbusiness.com/showArticle.jhtml?articleID=22104030&flatPage=true

[2] Cross-site Scripting (XSS) is an attack technique that forces a web site to echo attacker-supplied executable code, which loads in a user's browser. The code itself is usually written in HTML/JavaScript, but may also extend to VBScript, ActiveX, Java, Flash, or any other browser-supported technology.
http://www.webappsec.org/projects/threat/classes/cross-site_scripting.shtml

[3] SQL Injection is an attack technique used to exploit web sites that construct SQL statements from user-supplied input.
http://www.webappsec.org/projects/threat/classes/sql_injection.shtml

[4] A common form of distributed system in which software is split between server tasks and client tasks. A client sends requests to a server, according to some protocol, asking for information or action, and the server responds.
http://dictionary.reference.com/search?q=client-server

[5] A popular open-source web browser.
http://www.getfirefox.com

[6] Microsoft’s web browser (Internet Explorer).
http://www.microsoft.com/windows/ie/

[7] A general-purpose software application used to handle HTTP requests. A web server may utilize a web application for dynamic web page content.
http://www.webappsec.org/projects/glossary/#WebServer

[8] A popular open-source web server by the Apache Software Foundation.
http://www.apache.org/

[9] Microsoft Internet Information Server (IIS)
http://www.microsoft.com/WindowsServer2003/iis/default.mspx

[10] A document on the World Wide Web, consisting of an HTML file and any related files for scripts and graphics, and often hyperlinked to other documents on the Web.
http://dictionary.reference.com/search?q=web%20page

[11] Uniform Resource Locator (URL). The location of an on-line web-based resource.
http://www.w3.org/Addressing/

[12] Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems.
http://www.ietf.org/rfc/rfc2616.txt

[13] An industry standard public-key protocol used to create encrypted tunnels between two network-connected devices
http://www.webappsec.org/projects/glossary/#SecureSocketsLayer

[14] A software application, executed by a web server, which responds to dynamic web page requests over HTTP.
http://www.webappsec.org/projects/glossary/#WebApplication

[15] Flawed USC admissions site allowed access to applicant data
http://online.securityfocus.com/news/11239

[16] FTD.com hole leaks personal information
http://news.com.com/2100-1017-984585.html

[17] Credential/Session Prediction is a method of hijacking or impersonating a web site user. Deducing or guessing the unique value that identifies a particular session or user accomplishes the attack. Also known as Session Hijacking, the consequences could allow attackers the ability to issue web site requests with the compromised
http://www.webappsec.org/projects/threat/classes/credential_session_prediction.shtml

[18] Teen uses worm to boost ratings on MySpace.com
http://www.computerworld.com/securitytopics/security/holes/story/0,10801,105484,00.html

[19] The Center for Internet Security (CIS) is a non-profit enterprise whose mission is to help organizations reduce the risk of business and e-commerce disruptions resulting from inadequate technical security controls.
http://www.cisecurity.com/

[20] Payment Card Industry (PCI) Data Security Requirements apply to all Members, merchants, and service providers that store, process or transmit cardholder data.
http://usa.visa.com/business/accepting_visa/ops_risk_management/cisp.html

[21] WhiteHat Sentinel is the only continuous vulnerability assessment and management service for web applications.
http://www.whitehatsec.com/services.shtml


Table of Contents

Introduction

The Basics

Anatomy of Web Requests and URLs

HTTP and HTTPS

Web Server / Domain Name

Directory Path

Web Applications

Query String

Cookies

Post Data

Three Places to Attack a Website

URL

Cookie

Post Data

Defense-in-Depth

Architecture Security

Secure Software Development Program

Vulnerability Assessment and Management