Website Security 101. Real-World Examples, Tools & Techniques for Protecting & Securing Websites. A WhiteHat Security Whitepaper



Website Security 101

Real-World Examples, Tools & Techniques

for Protecting & Securing Websites

June 2007 – updated

Jeremiah Grossman

Founder and CTO, WhiteHat Security


Over 700 million people worldwide bank, shop, buy airline tickets, and perform research using the World Wide Web. With each transaction, private information, including names, addresses, phone numbers, credit card numbers, and passwords, is routinely transferred and stored in a variety of locations. Billions of dollars and millions of personal identities are at stake every day. In the past, security professionals thought firewalls, Secure Sockets Layer (SSL), patching, and privacy policies were enough to protect websites from hackers (see 5 Myths of Website Security1). Today, with prominent Web attacks taking place seemingly every week, the industry knows better.

The Web Application Security Consortium (WASC) has identified twenty-four classes of Web attacks, including Cross-Site Scripting2 (XSS) and SQL Injection3, used to prey upon corporations, their customers, and educational institutions.

These attacks are forcing many organizations to take a hard look at their existing website security posture. In many cases, web application or website security is a new concept with many facets. This paper will examine the fundamental components of a website, entry points of Web attacks, attack methodologies, and suggested preventive measures for effective and complete website vulnerability management.

The Basics

The best way to begin exploring website security is by learning how the Web works. While most IT professionals are very comfortable with using a Web browser to surf the Web, few of us look behind the application, at the client-server4 structure that powers the Web. This structure governs the way Web browsers (Firefox5, Microsoft Internet Explorer6) must communicate with Web servers7 (Apache8, Microsoft IIS9) to retrieve Web pages10. To peer deeper into the world of the Web, we’ll begin by looking at the Web browser location bar (see diagram 1).

All major Web browsers possess a location bar that displays the Web address11 (URL) of the current Web page. URL manipulation is one of the many ways to launch a Web application attack. And yet, location bars are required to enable customers, partners, and hackers alike to view your website. URLs are used to uniquely identify the location of a Web page or on-line resource. When traveling from one Web page to the next, the displayed URL is updated. URLs, also referred to as links, are commonly embedded in Web pages so that visitors can click on them to reach other pages. URLs also tell us a lot about a website: what type of communication it expects, what type of operating system it runs, what type of Web application code is being used, and more. We’ll explore the anatomy of URLs closely in the following section and look at how each part can be vulnerable to attack.

Diagram 1: Location Bar

A critical point to note here is the lack of firewall protection on the Web. When visiting any one of the millions of websites that exist today, it’s unlikely that you will encounter firewall protection. It’s not that firewalls aren’t there or aren’t useful; they are. In fact, most websites have firewalls protecting them from network-based attacks (worms, viruses, hackers). But network-based attacks are fundamentally different from Web-based attacks (XSS, SQL Injection), which are immune to a firewall’s defenses.

A firewall’s job is to prevent unauthorized connections to protected network devices. For instance, you probably do not want intruders connecting to internal databases, workstations, printers, and so forth. For a website to be accessible to the public, firewalls must allow traffic to the Web server. If they did not, no one would be able to visit the website. This means that if a website has a vulnerability, firewalls are powerless, since Web traffic must be allowed in. Website security sits above the network layer, in a world of its own.


Basic Web Architecture.

Now that we’ve established a basic understanding of Web communication, let’s dive deeper into the technology by analyzing the anatomy of a URL.

Anatomy of Web Requests and URLs


At the beginning of a URL, there is the designated communication protocol (in this case “http”). The protocol designates how the Web browser and the Web server communicate with each other. HTTP12 is a stateless protocol. This means that when a user wants something, a connection to the Web server is established, a request is sent, and a response is received. Afterwards, the connection is severed. HTTPS is another common protocol specification, which is HTTP wrapped with Secure Sockets Layer13 (SSL) encryption.

SSL connections, indicated by a tiny lock symbol in the Web browser window, ensure that information sent to and from a website is encrypted. Anyone monitoring the network traffic will not be able to read the data. This is great for protecting credit card numbers, social security numbers, and other forms of sensitive data traveling across the network. Contrary to popular belief, however, SSL does not secure a website. SSL only protects data in transit, and does not protect information stored once it arrives. Sites using strong, 128-bit SSL have been hacked just as often as those that do not use it. When private data is stored on the Web, the risk is at the server, not in between.

Web Server / Domain Name

The next section of the URL, after the double forward slash, specifies the Web server’s domain name. When we click on a URL, this is the Web server to which the Web browser will connect. Web servers handle the network communication between the website and the visitor.

Directory Path

Beginning with the forward slash after the domain name is the directory or file path. This points to the location on the Web server where the resource is found. A resource could be an HTML file, a PowerPoint presentation, or almost any other file type. The URL tells the Web server where to find it. Hackers are able to manipulate the directory path to find old files, such as those containing customer account numbers, that may have been replaced and forgotten on a server.


Custom Web Applications

Web application (website) security owes its name to the next section of the URL. The portion of the URL after the final forward slash and before the question mark is the “Web application14.” Web applications are software that enables a website to serve up dynamic content. They can do just about anything and be programmed in just about any language. When they receive requests, these applications dynamically create Web pages to return to the Web browser. Web applications turn an ordinary Web server into a Google, a Yahoo!, an online auction, Web bank, message board, blog, etc. Without Web applications, the Web would be filled with static content and none of the interactivity that drives innovation.

Query String

The last part of the URL, after the question mark, is referred to as the “query string,” and is filled with parameter name-value pairs. Web applications use parameter values as input to the program. For example, by reading a search URL, we can conclude that we are using HTTP to connect to the Web server, executing the Web application “search,” and searching for “testing” through the “q” parameter.

If we were to change the “q” parameter value, we could search for “security.”

In more severe cases, hackers use parameter tampering to gain unauthorized access to customer order numbers and other private data.
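The URL anatomy described above can be sketched with Python's standard `urllib.parse` module. The address `http://www.example.com/search?q=testing` is a hypothetical stand-in chosen for illustration; it is not the URL from the original example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

# Hypothetical URL, used only for illustration.
url = "http://www.example.com/search?q=testing"

parts = urlsplit(url)
protocol = parts.scheme         # communication protocol
server = parts.netloc           # Web server / domain name
application = parts.path        # directory path and Web application
params = parse_qs(parts.query)  # query string as name -> list of values

# Parameter tampering: change the "q" value and rebuild the URL.
tampered_query = urlencode({"q": "security"})
tampered_url = urlunsplit(
    (parts.scheme, parts.netloc, parts.path, tampered_query, parts.fragment)
)
print(protocol, server, application, params)
print(tampered_url)
```

Nothing in the browser prevents this kind of edit, which is why a Web application has to treat every parameter value as untrusted input.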


Cookies

As mentioned earlier, HTTP is a stateless protocol. From one HTTP request to the next, the Web server cannot determine if the second request is from the same person. Without the ability to make this connection, there is no way to track a user on a website, and it’s difficult to maintain user login state. Cookies provide a mechanism to keep state. Cookies are small amounts of data supplied by the Web server and stored by the Web browser. With each new request the browser sends, the cookies stored by the browser are returned to the Web server. This allows a user to be uniquely identified and state to be maintained. Since cookies are used to identify users, they are an attractive target to hackers. Once someone has a user’s cookie, he can effectively become that user.

HTTP Response:

HTTP/1.1 200 OK
Date: Thursday, 01 Dec 2006 23:37:18 GMT
Server: Apache
Set-Cookie: Name=Value; path=/; expires=Thursday, 01-Dec-06 23:12:40 GMT
Content-Type: text/html

HTTP Request:

GET HTTP/1.1
Host:
Cookie: Name=Value
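The cookie round trip can be sketched in a few lines of Python. This is a simplified illustration, not a full cookie implementation: `parse_set_cookie` and `browser_cookies` are hypothetical names, and real browsers also honor attributes such as path and expiry when deciding which cookies to send.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie header value into (name, value, attributes)."""
    first, *rest = [part.strip() for part in header_value.split(";")]
    name, _, value = first.partition("=")
    attributes = {}
    for part in rest:
        key, _, val = part.partition("=")
        attributes[key.lower()] = val
    return name, value, attributes


# The browser stores what the server supplied in its response...
browser_cookies = {}
name, value, attributes = parse_set_cookie(
    "Name=Value; path=/; expires=Thursday, 01-Dec-06 23:12:40 GMT"
)
browser_cookies[name] = value

# ...and returns it on every later request to the same site.
cookie_header = "Cookie: " + "; ".join(
    f"{n}={v}" for n, v in browser_cookies.items()
)
print(cookie_header)
```

Because the server identifies the user purely by the value that comes back, anyone who can read or guess that value can present it as their own.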


Post Data

Post Data is another portion of an HTTP request, and is typically populated with data from Web forms. Post Data is most often used when larger amounts of data need to be sent from the browser to the Web server. Post Data is more or less identical to a query string, except that it’s located in the body of the request, below the headers.

Web Form request with Post Data:

POST HTTP/1.1
Host:
Content-Length: 32

username=johndoe&password=abc123
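As a rough sketch of how a browser packages form fields into Post Data and how a server reads them back, the following uses Python's `urllib.parse`. The path `/login` and host `www.example.com` are hypothetical placeholders, and the request is built as a plain string rather than sent over the network.

```python
from urllib.parse import urlencode, parse_qs

# Browser side: encode the form fields exactly like a query string.
form_fields = {"username": "johndoe", "password": "abc123"}
body = urlencode(form_fields)
content_length = len(body.encode("ascii"))  # byte length of the body

raw_request = "\r\n".join([
    "POST /login HTTP/1.1",   # hypothetical path
    "Host: www.example.com",  # hypothetical host
    "Content-Type: application/x-www-form-urlencoded",
    f"Content-Length: {content_length}",
    "",                       # blank line separates headers from the body
    body,
])

# Server side: everything after the blank line is the Post Data.
headers, _, received_body = raw_request.partition("\r\n\r\n")
fields = parse_qs(received_body)
print(fields)
```

The parsed fields arrive in the same name-value shape as query string parameters, so they deserve the same suspicion.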

Three Places to Attack a Website

Statistically, over 90% of all websites have serious security issues, but the big question is “how are they attacked?” If we look at everything we’ve covered so far, there are basically only three places to attack a website: URLs, cookies, and Post Data. We’ll explore each of these attack points by analyzing real-world exploits. Recall that there are twenty-four classes of attack, and each of these points is vulnerable to all of them.


URLs

In July of 2005, the University of Southern California (USC) was informed of a SQL Injection vulnerability found within one of its websites15. The website in question is used to accept applications from prospective students. According to sources, a lack of security checks on the login Web-form text boxes allowed commands to be sent to the back-end database. These commands enabled public access to personal information including the names, birth dates, addresses, and social-security numbers of up to 280,000 users. While the specific technical details remain undisclosed, here is a likely scenario of what took place.

When filling out the login Web-form with a username (johndoe) and password (abc123), the Web browser would generate a URL ending similar to the following:

login.asp?username=johndoe&password=abc123

The data from the HTTP request is passed into a SQL statement within the Web application “login.asp,” which is responsible for performing user authentication.

// Vulnerable: the form values are concatenated directly into the SQL text.
string strQry = "SELECT Count(*) FROM Users WHERE UserName='" + username.Text + "' AND Password='" + password.Text + "'";


The resulting SQL command is sent to the database:

SELECT Count(*) FROM Users WHERE UserName='johndoe' AND Password='abc123'

If the SQL command succeeds (the username/password combination is correct), a database record is returned and the user is logged in. A security issue arises if meta-characters are submitted into the Web-form instead of the expected alphanumeric usernames and passwords. Specifically, we’re referring to single quotes and semi-colons, submitted here as both the username and the password:

username=';&password=';

The resulting SQL command:

SELECT Count(*) FROM Users WHERE UserName='';' AND Password='';'

The above SQL command produces a database error because its syntax is invalid. An error similar to the following is often displayed within the resulting Web page, and is a solid indication that a SQL Injection vulnerability exists.

Microsoft OLE DB Provider for SQL Server error '80040e14'
Unclosed quotation mark before the character string '; password.
/login.asp, line 39

In the case of the USC incident, here is what the hacker likely submitted into the Web-form:

username='+OR+1=1&password='+OR+1=1;

Producing the following SQL command:

SELECT Count(*) FROM Users WHERE UserName='' OR 1=1 AND Password='' OR 1=1

The condition in the previous SQL command always evaluates to true, so the command returns the first record in the Users table. This allows authentication to be bypassed completely. From this point, the hacker is also able to send any valid SQL commands he can generate to pull information from the database.

This vulnerability could have been avoided with proper sanity checking of the incoming username and password values. If login.asp ONLY accepted alphanumeric characters, the incident would not have occurred.
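The scenario can be reproduced in miniature with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not a reconstruction of the USC system: the table contents and the function names `login_vulnerable` and `login_checked` are hypothetical, and the payload `' OR '1'='1` is a common variant chosen because it keeps the concatenated statement syntactically valid. The checked version applies the alphanumeric-only rule described above and, as an additional safeguard not named in the text, passes the values through a parameterized query.

```python
import re
import sqlite3

# Hypothetical in-memory database mirroring the Users table in the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserName TEXT, Password TEXT)")
conn.execute("INSERT INTO Users VALUES ('johndoe', 'abc123')")

ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def login_vulnerable(username, password):
    # Unsafe: user input becomes part of the SQL text itself.
    query = ("SELECT Count(*) FROM Users WHERE UserName='" + username +
             "' AND Password='" + password + "'")
    return conn.execute(query).fetchone()[0] > 0


def login_checked(username, password):
    # Reject anything that is not strictly alphanumeric.
    if not (ALPHANUMERIC.fullmatch(username) and ALPHANUMERIC.fullmatch(password)):
        return False
    # Placeholders keep the values separate from the SQL text.
    row = conn.execute(
        "SELECT Count(*) FROM Users WHERE UserName=? AND Password=?",
        (username, password),
    ).fetchone()
    return row[0] > 0


payload = "' OR '1'='1"
print(login_vulnerable(payload, payload))  # authentication bypassed
print(login_checked(payload, payload))     # rejected by the allowlist
```

Either safeguard alone stops this particular payload; using both means a mistake in one does not immediately expose the database.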


Cookies

Just before Valentine’s Day 2003, FTD (a large online florist) received word that hackers could illegally access customer information by simply changing a particular number within a cookie16. The number was a customer identifier that could be easily guessed and changed to access the sensitive information of other users. This class of attack is generally referred to as Credential/Session Prediction17. Security experts confirmed that the problem existed and that customer billing records, names, addresses, and phone numbers were exposed. Here is an example of what likely took place:

When users log in to the site or add products to their shopping carts, they are given a cookie to track the current session. The cookie contains a unique customer identifier, in this case “CustomerID” with a value of “1001.”

Set-Cookie: CustomerID=1001; path=/; expires=Thursday, 01-Dec-06 23:12:40 GMT

On subsequent requests, the user’s browser returns the cookie so that he or she may be properly identified.

Cookie: CustomerID=1001

At this point, an easy assumption is that the next user on the website would be given a cookie with a customer identifier of “1002.” A hacker would simply edit the customer identifier in his own cookie to match that of someone else who has already been to the website.


If successful, the hacker would automatically jump into another user’s session, with the ability to take over the account and access personal information.

In this case, the solution is to use random customer identifiers. FTD should have been using random, unique integers at least 10 to 12 digits in length. This would prevent a customer identifier from being guessed, even after extended attempts.
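One way to generate unguessable identifiers is sketched below with Python's standard `secrets` module. Note the substitution: instead of the 10-to-12-digit integers suggested above, this sketch issues a longer random token and keeps the real customer number on the server. The names `sessions`, `new_session`, and `customer_for` are hypothetical, and the in-memory dictionary stands in for whatever session store a real site would use.

```python
import secrets

# Hypothetical server-side store: random token -> real customer identifier.
sessions = {}


def new_session(customer_id):
    """Issue an unpredictable token instead of exposing the customer ID."""
    token = secrets.token_urlsafe(32)  # drawn from a cryptographic random source
    sessions[token] = customer_id
    return token


def customer_for(token):
    """Look up the customer for a cookie value; None if the token is unknown."""
    return sessions.get(token)


first = new_session(1001)
second = new_session(1002)
print(customer_for(first))   # the first customer's own record
print(customer_for("1002"))  # a guessed value matches nothing
```

The cookie then carries only the token, so incrementing or editing it does not land on another customer's session.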

Post Data

In October 2005, in an incident known as the Samy worm, a hacker (Samy) used a common Cross-Site Scripting (XSS) vulnerability to exploit the MySpace social networking website18. Users’ Web browsers were forced to send Web requests that they did not expect to make. Approximately one million users unwittingly altered their profile Web pages with malicious JavaScript code when they visited other infected profiles.

Samy altered his MySpace profile Web page to include some crafty JavaScript exploit code. When another MySpace user visited Samy’s profile, the JavaScript code would execute its payload in the user’s browser. At this point, Samy had complete control over the user’s browser. The payload forced the user to automatically befriend Samy, and also add him as a hero (“Samy is my hero” was appended to each profile page). To propagate site-wide, the exploit code would copy itself to the user’s profile and wait to exploit other unsuspecting users.

To halt the infection, MySpace was forced to shut down its entire website for twenty hours. While the incident was severe, the payload of the Samy worm was relatively benign. It would have been trivial for Samy to force everyone to delete their profile and blog data, resulting in a far more damaging situation for users and for MySpace.


All it takes is a single security flaw, or one small oversight, for your company to make headlines. Experience tells us that no single protective measure is completely impenetrable. Everything has its weakness, and it’s only a matter of time before that weakness is found and exploited. With this real-world knowledge, most security experts subscribe to a philosophy called defense-in-depth to protect their systems. Defense-in-depth promotes a layered security approach, so that if any single control mechanism fails, other defensive measures are in place to ensure nothing is compromised.

Architecture Security

All secure e-commerce infrastructures must be built on a solid foundation. Without a solid foundation, no amount of security in Web application code will be enough to defend a website. Below is a top-level checklist to use to assess your overall security. The Center for Internet Security19 (CIS) has excellent resources for in-depth, system-specific knowledge of architecture security issues. Also, the Payment Card Industry20 (PCI) Data Security Standard is another resource for a comprehensive security program.

– Networks are properly segmented to separate public, semi-private, and private systems.

– Perimeter firewalls are in place between network segments to only allow a limited set of network services to communicate.

– Operating systems are hardened and patches are kept up-to-date.

– Web servers are properly patched, configured, and have any additional security add-ons applied.

Secure Software Development Program

Secure software is quality software. Vulnerabilities are nothing more than software bugs. And the best way to squash these bugs before they become real problems is to tightly integrate security consideration at all points of the software development life cycle (SDLC), from architecture design, to development releases, to quality assurance phases. While there are copious amounts of data available covering software security best practices, the primary caution to developers is “DO NOT TRUST CLIENT-SIDE DATA.” Lack of proper input validation is the number-one cause of website security issues. The Web is a hostile environment; therefore it’s absolutely critical to validate all data you plan to utilize, whether it’s from HTTP requests or the database. Here is a checklist for how to perform proper input validation in any programming language:

– Character-set: Ensure the data only contains characters you expect to receive.

– Length: Ensure the data falls within a restricted minimum and maximum number of bytes.

– Data Format: Ensure the structure of the data is consistent with what is expected. Phone numbers should look like phone numbers, email addresses should look like email addresses, etc.

– Escape: Before data is passed on to sub-systems, especially database or operating system calls, all special characters should be escaped, meaning no special characters should be allowed into the system unchecked.

– Filtering: Sanitize data to not include dangerous characters. Specifically, convert < and > characters into their equivalent HTML entities to prevent XSS issues.
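The checklist above can be sketched as a handful of small Python functions. The specific rules here are assumptions made for illustration: the allowed username characters, the 3-to-20-byte length limits, and the US-style phone pattern are hypothetical policies, and the function names are not from any particular framework.

```python
import html
import re

# Character-set: only the characters we expect (assumed policy).
USERNAME_RE = re.compile(r"[A-Za-z0-9_]+")
# Data format: an assumed US-style phone number, e.g. 555-123-4567.
PHONE_RE = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")


def validate_username(value, min_len=3, max_len=20):
    """Check length in bytes first, then the allowed character set."""
    size = len(value.encode("utf-8"))
    if not (min_len <= size <= max_len):
        return False
    return USERNAME_RE.fullmatch(value) is not None


def validate_phone(value):
    """Check that the whole value matches the expected structure."""
    return PHONE_RE.fullmatch(value) is not None


def filter_for_html(value):
    """Convert <, >, &, and quotes to HTML entities before output."""
    return html.escape(value, quote=True)


print(validate_username("johndoe"))
print(validate_phone("555-123-4567"))
print(filter_for_html("<script>alert(1)</script>"))
```

Using `fullmatch` rather than a partial match matters here: the entire value has to fit the pattern, so a valid-looking prefix cannot smuggle extra characters through.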

Website Vulnerability Assessment and Management

No matter how many defensive measures are piled onto a system, the only way to tell if risk is being mitigated is to adequately measure security. For website security, this means comprehensively assessing websites for vulnerabilities (using the WASC Threat Classification) and managing the remediation process when issues are found. To ensure security, this process should be conducted with each change to the custom Web application code, NOT just once a year. There are three key points to consider when assessing Web applications:

1. Web applications are inherently unique.

As we’ve discussed earlier, each website, whether e-commerce, online banking or health information, contains custom code. Off-the-shelf website scanning products cannot fully identify Web application vulnerabilities in custom code, and many website vulnerabilities simply cannot be found in an automated manner. Since each site has specialized functionality, the methods to exploit that application may be just as specialized. Software cannot respond to that level of customization. Integrating and applying security expertise is required.

2. Assessing your production website is essential.

Hackers enter through holes in a “live” site, not the development or QA environment. Unseen flaws can appear between the time development is completed and production is started. Credit card companies have realized this important distinction. That is why PCI mandates scanning all custom Web applications in the production environment.

3. Communication between the development and security teams is critical.

90% of all websites have security issues. That’s a fact. It will take time to fix all the vulnerabilities in a website, which is why teamwork is important. The security organization needs updates on remediation progress so that they can adequately protect corporate applications. The development organization needs to work with security to prioritize fixes so that the most dangerous issues are resolved quickly.

In the past, website vulnerability management was a time-consuming and often expensive process. Today, WhiteHat Security offers WhiteHat Sentinel21, the only website vulnerability assessment and management service that combines proprietary scanning technology with security experts for complete protection.

WhiteHat Sentinel identifies, manages, and recommends remediation for website vulnerabilities. We supply the information needed for organizations to protect corporate websites from attack. The system is iterative, intelligent and organic: WhiteHat Sentinel applies what it has learned to continuously refine and retune its assessments in order to identify the latest attack vectors and protect customer sites.

WhiteHat Sentinel is a must-have to secure valuable customer data, comply with industry standards and maintain brand integrity. WhiteHat Sentinel is a crucial component for complete website vulnerability management.



1 5 Security Myths

2 Cross-site Scripting (XSS) is an attack technique that forces a web site to echo attacker-supplied executable code, which loads in a user’s browser. The code itself is usually written in HTML/JavaScript, but may also extend to VBScript, ActiveX, Java, Flash, or any other browser-supported technology.

3 SQL Injection is an attack technique used to exploit web sites that construct SQL statements from user-supplied input.

4 A common form of distributed system in which software is split between server tasks and client tasks. A client sends requests to a server, according to some protocol, asking for information or action, and the server responds.

5 A popular open-source web browser.

The WhiteHat Sentinel Service – Complete Website Vulnerability Management

Find Vulnerabilities, Protect Your Website – The WhiteHat Sentinel Service is a unique combination of expert analysis and proprietary automated scanning technology that delivers the most comprehensive website vulnerability coverage available. Worried about the OWASP Top Ten vulnerabilities or the WASC Threat Classification? Scanners alone cannot identify all the vulnerabilities defined by these standards. WhiteHat Sentinel can. Many of the most dangerous vulnerabilities reside in the business logic of an application and are only uncovered through expert human analysis.

Continuous Improvement and Refinement – WhiteHat Sentinel stays one step ahead of the latest website attack vectors with persistent updates and refinements to its service. Updates are continuous, arriving anywhere from daily to every few weeks, versus up to six months or longer for traditional software tools. And, Sentinel uses its unique “Inspector” technology to apply identified vulnerabilities across every website it evaluates. Ultimately, each site benefits from the protection of others.

Virtually Eliminate False Positives – No busy security team has time to deal with false positives. That’s why the WhiteHat Sentinel Security Operations Team verifies the results of all scans. Customers see only real, actionable vulnerabilities, saving time and money.

Total Control – WhiteHat Sentinel runs on the customer’s schedule, not ours. Scans can be manually or automatically scheduled to run daily, weekly, or as often as websites change. Whenever required, WhiteHat Sentinel provides a comprehensive assessment, plus prioritization recommendations based on threat and severity levels, to better arm security professionals with the knowledge needed to secure their websites.

Unlimited Assessments, Anytime Websites Change – With WhiteHat Sentinel, customers pay a single annual fee, with unlimited assessments per year. And, the more applications under management with WhiteHat Sentinel, the lower the annual cost per application. High volume e-commerce sites may have weekly code changes, while others change monthly. WhiteHat Sentinel offers the flexibility to assess sites as frequently as necessary.

Simplified Management – There is no cumbersome software installation and configuration. Initial vulnerability assessments can often be up-and-running in a matter of hours. With WhiteHat Sentinel’s Web interface, vulnerability data can be easily accessed, and scans or print reports can be scheduled at any time from any location. There are no outlays for software, hardware or an engineer to run the scanner and interpret results. With the WhiteHat Sentinel Service, website vulnerability management is simplified and under control.


6 Microsoft’s web browser (Internet Explorer).

7 A general-purpose software application used to handle HTTP requests. A web server may utilize a web application for dynamic web page content.

8 A popular open-source web server by the Apache Software Foundation.

9 Microsoft Internet Information Server (IIS)

10 A document on the World Wide Web, consisting of an HTML file and any related files for scripts and graphics, and often hyperlinked to other documents on the Web.

11 Uniform Resource Locator (URL). The location of an on-line web-based resource.

12 Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems.

13 An industry standard public-key protocol used to create encrypted tunnels between two network-connected devices

14 A software application, executed by a web server, which responds to dynamic web page requests over HTTP.

15 Flawed USC admissions site allowed access to applicant data

16 hole leaks personal information.

17 Credential/Session Prediction is a method of hijacking or impersonating a web site user. Deducing or guessing the unique value that identifies a particular session or user accomplishes the attack. Also known as Session Hijacking, the consequences could allow attackers the ability to issue web site requests with the compromised user’s privileges.

18 Teen uses worm to boost ratings on story/0,10801,105484,00.html

19 The Center for Internet Security (CIS) is a non-profit enterprise whose mission is to help organizations reduce the risk of business and e-commerce disruptions resulting from inadequate technical security controls.

20 Payment Card Industry (PCI) Data Security Requirements apply to all Members, merchants, and service providers that store, process or transmit cardholder data.

21 WhiteHat Sentinel is the only continuous vulnerability assessment and management service for web applications


About the Author

Jeremiah Grossman is the Founder and Chief Technology Officer of WhiteHat Security, where he is responsible for web application security R&D and industry evangelism. As an industry veteran and well-known security expert, Mr. Grossman is a frequent international conference speaker at the BlackHat Briefings, ISSA, ISACA, NASA, and many other industry events. Mr. Grossman’s research, writings, and discoveries have been featured in USA Today, VAR Business, NBC, ABC News (AU), ZDNet, eWeek, BetaNews, etc. Mr. Grossman is also a founder of the Web Application Security Consortium (WASC), as well as a contributing member of the Center for Internet Security Apache Benchmark Group. Prior to WhiteHat, Mr. Grossman was an information security officer at Yahoo!, responsible for performing security reviews on the company’s hundreds of websites.

About WhiteHat Security, Inc.

Headquartered in Santa Clara, California, WhiteHat Security is a leading provider of website vulnerability management services. WhiteHat delivers turnkey solutions that enable companies to secure valuable customer data, comply with industry standards and maintain brand integrity. WhiteHat Sentinel, the company’s flagship service, is the only solution that incorporates expert analysis and industry-leading technology to provide unparalleled coverage to protect critical data from attacks. For more information about WhiteHat Security, please visit

