
Supporting the Cookie Protocol

One of the early problems that plagued Web page designers was how to give information to the client browser for it to remember. If one million people access your Web server, you don't want to keep information for each one of them on your server when their browsers could just as easily store it. Fortunately, Netscape noticed this problem fairly early and came up with the notion of a cookie.

A cookie is really just a piece of information that has a name, a value, a domain, and a path. Whenever you open a URL in the cookie's domain and access any file along the cookie's path, the cookie's name and value are passed to the server. A typical use of this might be an access count or a user name. Netscape defined a request header tag called "Cookie:" that is used to pass cookie name-value pairs to the server. A server can set cookie values in a browser by sending a Set-cookie tag in the response header.
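To make the two header tags concrete, here is a minimal sketch of the lines that travel in each direction. The class and method names (CookieExchangeSketch, setCookieLine, cookieLine) are made up for illustration and are not part of this chapter's listings; the cookie name and value are arbitrary.

```java
// Hypothetical helper that only formats header lines; it does no networking.
final class CookieExchangeSketch {

    // The response header line a server might send to hand out one cookie.
    static String setCookieLine(String name, String value, String path) {
        return "Set-Cookie: " + name + "=" + value + "; path=" + path;
    }

    // The request header line a client sends back on later requests.
    static String cookieLine(String name, String value) {
        return "Cookie: " + name + "=" + value;
    }

    public static void main(String[] args) {
        // Server to browser, in the response header
        System.out.println(setCookieLine("visits", "3", "/me/stuff"));
        // Browser to server, in the request header
        System.out.println(cookieLine("visits", "3"));
    }
}
```

Notice that only the name and value go back to the server; the path and other settings stay with the client, which uses them to decide when to send the cookie.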

You should now be able to create Java applications that can open up URLs directly, without the interference of a browser, so you may want to support the cookie protocol. It would be nice if this protocol could be built right into the URL and URLConnection classes. You are welcome to tackle this problem. At first, it would seem like a simple thing to do, but you will find that the URLConnection class, although it has methods to set the desired fields in a request header, will not actually pass these fields to the server. This means that you can call setRequestProperty("Cookie", "Something=somevalue") all day long and the server will never see it. If you want to speak cookies, you'll have to speak HTTP over a socket. Luckily for you, this chapter contains code to do just that.

Listing 6.7 shows a Cookie class that represents the information associated with a cookie. It doesn't actually send or receive cookies; it is more like a Cookie data type. One interesting feature is that its constructor can create a cookie from the string returned by the cookie's toString method, making it easy to store cookies in a file and retrieve them.

Chapter 6 -- Communicating with a Web Server

Tip

It is often useful to create a string representation of an object that can be used to re-create the object at a later time. While you can use object serialization to read and write objects to a file, a string representation can be edited with a simple text editor.
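As a small illustration of this tip, the sketch below shows a class whose string constructor accepts exactly what its toString method produces. PointSketch is a made-up example class, not something from this chapter, and the "x,y" text format is simply an assumption chosen for brevity.

```java
// Hypothetical example class: its text form can rebuild the object.
final class PointSketch {
    final int x;
    final int y;

    PointSketch(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Re-create a point from the text produced by toString ("x,y").
    PointSketch(String text) {
        String[] parts = text.split(",");
        this.x = Integer.parseInt(parts[0].trim());
        this.y = Integer.parseInt(parts[1].trim());
    }

    // The text form is the only thing you need to save.
    @Override
    public String toString() {
        return x + "," + y;
    }

    public static void main(String[] args) {
        PointSketch original = new PointSketch(3, -4);
        PointSketch copy = new PointSketch(original.toString());
        System.out.println(copy);
    }
}
```

Because the saved form is plain text, you can also type a line such as 7,8 into the file by hand and have it load as a valid object.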

Listing 6.7 Source Code for Cookie.java

import java.net.*;
import java.util.*;

// This class represents a Netscape cookie. It can parse its
// values from the string from a Set-cookie: response (without
// the Set-cookie: portion, of course). It is little more than
// a fancy data structure.

public class Cookie {

    // Define the standard cookie fields
    public String name;
    public String value;
    public Date expires;
    public String domain;
    public String path;
    public boolean isSecure;

    // cookieString is the original string from the Set-cookie header.
    // Just save it rather than trying to regenerate for the toString
    // method. Note that since this class can initialize itself from this
    // string, it can be used to save a persistent copy of this class!
    public String cookieString;

    // Initialize the cookie based on the origin URL and the cookie string
    public Cookie(URL sourceURL, String cookieValue)
    {
        domain = sourceURL.getHost();
        path = sourceURL.getFile();
        parseCookieValue(cookieValue);
    }

    // Initialize the cookie based solely on its cookie string
    public Cookie(String cookieValue)
    {
        parseCookieValue(cookieValue);
    }

    // Parse a cookie string and initialize the values
    protected void parseCookieValue(String cookieValue)
    {
        cookieString = cookieValue;

        // A cookie is not secure unless it says so. Set this once, before
        // the loop, so a later field cannot reset an earlier "secure" flag.
        isSecure = false;

        // Separate out the various fields, which are separated by ;'s
        StringTokenizer tokenizer = new StringTokenizer(
            cookieValue, ";");

        while (tokenizer.hasMoreTokens()) {
            // Eliminate leading and trailing white space
            String token = tokenizer.nextToken().trim();

            // See if the field is of the form name=value or if it is just
            // a name by itself.
            int eqIndex = token.indexOf('=');
            String key, value;

            // If it is just a name by itself, set the field's value to null
            if (eqIndex == -1) {
                key = token;
                value = null;
            // Otherwise, the name is to the left of the '=', value is to the right
            } else {
                key = token.substring(0, eqIndex);
                value = token.substring(eqIndex+1);
            }

            // convert the key to lowercase for comparison with the standard field names
            String lcKey = key.toLowerCase();

            if (lcKey.equals("expires")) {
                // The Date(String) constructor is deprecated in later JDKs
                expires = new Date(value);
            } else if (lcKey.equals("domain")) {
                if (isValidDomain(value)) {
                    domain = value;
                }
            } else if (lcKey.equals("path")) {
                path = value;
            } else if (lcKey.equals("secure")) {
                isSecure = true;
            // If the key wasn't a standard field name, it must be the cookie's name
            // You don't use the lowercase version of the name here.
            } else {
                name = key;
                this.value = value;
            }
        }
    }

    // isValidDomain performs the standard cookie domain check. A cookie
    // domain must have at least two portions if it ends in
    // .com, .edu, .net, .org, .gov, .mil, or .int. If it ends in something
    // else, it must have 3 portions. In other words, you can't specify
    // .com as a domain, it has to be something.com, and you can't specify
    // .ga.us as a domain, it has to be something.ga.us.
    protected boolean isValidDomain(String domain)
    {
        // Eliminate the leading period for this check
        if (domain.charAt(0) == '.') domain = domain.substring(1);

        StringTokenizer tokenizer = new StringTokenizer(domain, ".");
        int nameCount = 0;

        // just count the number of names and save the last one you saw
        String lastName = "";
        while (tokenizer.hasMoreTokens()) {
            lastName = tokenizer.nextToken();
            nameCount++;
        }

        // At this point, nameCount is the number of sections of the domain
        // and lastName is the last section.

        // More than 2 sections is okay for everyone
        if (nameCount > 2) return true;

        // Less than 2 is bad for everyone
        if (nameCount < 2) return false;

        // Exactly two, you better match one of these 7 domain types
        if (lastName.equals("com") || lastName.equals("edu") ||
            lastName.equals("net") || lastName.equals("org") ||
            lastName.equals("gov") || lastName.equals("mil") ||
            lastName.equals("int")) return true;

        // Nope, you fail - bad domain!
        return false;
    }

    // Use the cookie string as originally set in the Set-cookie header
    // as the string representation of the cookie. If you save
    // this string to a file, you can completely regenerate this object from
    // this string, so you can read the cookie back out of a file.
    public String toString()
    {
        return cookieString;
    }

}

The Cookie class is basically a holder for cookie data. The only methods in the Cookie class deal with converting strings into cookies and vice versa. The parseCookieValue method in the Cookie class implements a crucial part of the cookie protocol. It takes a string containing the settings for a cookie. The settings are of the form name=value and are separated by semicolons. The settings include the name of the cookie, the cookie's value, its expiration date, and the path name for the cookie.
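If you just want to see the splitting step on its own, the following sketch breaks a cookie string into its settings and stores them in a map. It is not the chapter's parseCookieValue method: CookieSettingsSketch and parseSettings are hypothetical names, and the sketch uses String.split and the collections classes from later JDK releases rather than StringTokenizer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits a cookie string into its name=value settings.
final class CookieSettingsSketch {

    // Returns the settings in the order they appeared. A setting with
    // no '=' (such as "secure") is stored with a null value.
    static Map<String, String> parseSettings(String cookieString) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String token : cookieString.split(";")) {
            token = token.trim();
            if (token.isEmpty()) continue;   // skip stray semicolons
            int eqIndex = token.indexOf('=');
            if (eqIndex == -1) {
                settings.put(token, null);
            } else {
                settings.put(token.substring(0, eqIndex),
                             token.substring(eqIndex + 1));
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(parseSettings(
            "visits=3; path=/me/stuff; domain=mydomain.com; secure"));
    }
}
```

Splitting at the first '=' only means that a value containing its own '=' characters survives intact.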

The domain setting for a cookie specifies which hosts should receive the cookie. Whenever a URL in the cookie's domain is opened and the URL is in the cookie's path, the server for that URL is passed the cookie. For example, if you set the domain to mydomain.com and the path to /me/stuff, then the URL http://mydomain.com/me/stuff/mycgi will receive the cookie. A URL of http://mydomain.com/you/files would not receive the cookie, because the paths don't match.
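The matching rule can be sketched in a few lines. The names below (CookieMatchSketch, domainMatches, pathMatches, shouldSend) are hypothetical. One assumption goes beyond the text: the sketch requires the host either to equal the cookie's domain or to end with a period followed by the domain, because a bare tail comparison would also accept a host such as notmydomain.com.

```java
import java.util.Locale;

// Hypothetical helper that decides whether a cookie applies to a request.
final class CookieMatchSketch {

    // The host must equal the domain or be a subdomain of it.
    // (Assumption: stricter than a plain "ends with" comparison.)
    static boolean domainMatches(String host, String cookieDomain) {
        String h = host.toLowerCase(Locale.ROOT);
        String d = cookieDomain.toLowerCase(Locale.ROOT);
        if (d.startsWith(".")) d = d.substring(1);
        return h.equals(d) || h.endsWith("." + d);
    }

    // The requested file must lie along the cookie's path.
    static boolean pathMatches(String requestPath, String cookiePath) {
        return requestPath.startsWith(cookiePath);
    }

    static boolean shouldSend(String host, String requestPath,
                              String cookieDomain, String cookiePath) {
        return domainMatches(host, cookieDomain)
            && pathMatches(requestPath, cookiePath);
    }

    public static void main(String[] args) {
        // The two URLs from the example in the text
        System.out.println(shouldSend("mydomain.com", "/me/stuff/mycgi",
                                      "mydomain.com", "/me/stuff"));
        System.out.println(shouldSend("mydomain.com", "/you/files",
                                      "mydomain.com", "/me/stuff"));
    }
}
```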

There are some restrictions on the cookie's domain, too. If the domain ends in .com, .edu, .org, .net, .gov, .mil, or .int, you only need two components in the domain. In other words, you need one other name in addition to the ending. For example, mydomain.com is a valid domain.

If the domain ends with any other name, you must have at least three components in the domain. For example, mydomain.au would not be a valid cookie domain, but mydomain.outback.au would be valid.
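Here is one way to try the rule against the example domains. This is a sketch rather than the Cookie class's own method: DomainRuleSketch is a hypothetical name, and it uses a Set and String.split, which come from later JDK releases.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical restatement of the two-or-three component domain rule.
final class DomainRuleSketch {

    // Endings that need only one other name in front of them
    private static final Set<String> TWO_PART_ENDINGS = new HashSet<>(
        Arrays.asList("com", "edu", "net", "org", "gov", "mil", "int"));

    static boolean isValidDomain(String domain) {
        // Ignore a leading period for this check
        if (domain.startsWith(".")) domain = domain.substring(1);

        // Count the non-empty components and remember the last one
        int count = 0;
        String last = "";
        for (String part : domain.split("\\.")) {
            if (!part.isEmpty()) {
                count++;
                last = part;
            }
        }

        if (count > 2) return true;    // three or more is always enough
        if (count < 2) return false;   // a lone name is never enough
        return TWO_PART_ENDINGS.contains(last);
    }

    public static void main(String[] args) {
        String[] samples = {"mydomain.com", "mydomain.au", "mydomain.outback.au"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValidDomain(sample));
        }
    }
}
```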

Because cookies are supposed to be persistent, you need a class to manage your cookies, preferably by storing them in a file or a database. Listing 6.8 presents a portion of the CookieDatabase class that maintains a table of known cookies. The full source to the class is available on the CD-ROM that comes with this book. It has methods to store the table in a file and retrieve the table from a file. It can also examine a URL and return a string of cookie values for that URL.

The CookieDatabase class does not actually read cookies from a Web server or write them to the server. It simply keeps a table of known cookies. If presented with a host name and path name, the CookieDatabase class will determine which cookies are valid for that host name and path name and will return the appropriate cookie string.
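The file-handling methods are not shown in this chapter, so the following is only a guess at the general idea, not the CookieDatabase code from the CD-ROM. It assumes a file format of one cookie string per line, and CookieFileSketch, save, and load are hypothetical names. It also relies on the java.nio.file classes from later JDK releases.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical persistence helper: one cookie string per line of text.
final class CookieFileSketch {

    // Write every cookie string on its own line.
    static void save(Path file, List<String> cookieStrings) throws IOException {
        Files.write(file, cookieStrings, StandardCharsets.UTF_8);
    }

    // Read the cookie strings back, skipping blank lines.
    static List<String> load(Path file) throws IOException {
        List<String> cookieStrings = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.trim().isEmpty()) {
                cookieStrings.add(line);
            }
        }
        return cookieStrings;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("cookies", ".dat");
        save(file, Arrays.asList(
            "visits=3; path=/me/stuff; domain=mydomain.com",
            "user=sam; path=/"));
        System.out.println(load(file));
        Files.delete(file);
    }
}
```

Each line that comes back could be handed to a constructor that takes a cookie string. Keep in mind that a cookie whose domain and path came from its source URL, rather than from the cookie string itself, would need those values saved some other way.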

The getCookieString method from the CookieDatabase class, shown in Listing 6.8, performs the matching between a URL and a cookie. It decides what cookies should be sent for a particular URL and creates a string containing all the cookie values that need to be sent.

Listing 6.8 getCookieString Method from CookieDatabase

// getCookieString does some rather ugly things. First, it finds all the
// cookies that are supposed to be sent for a particular URL. Then
// it sorts them by path length, sending the longest path first (that's
// what Netscape's specs say to do - I'm only following orders).
public static String getCookieString(URL destURL)
{
    if (cookies == null) {
        cookies = new Vector();
    }

    // sendCookies will hold all the cookies you need to send
    Vector sendCookies = new Vector();

    // currDate will be used to prune out expired cookies as we go along
    Date currDate = new Date();

    for (int i=0; i < cookies.size();) {
        Cookie cookie = (Cookie) cookies.elementAt(i);

        // See if the current cookie has expired. If so, remove it
        if ((cookie.expires != null) && (currDate.after(
            cookie.expires))) {
            cookies.removeElementAt(i);
            continue;
        }

        // You only increment i if you haven't removed the current element
        i++;

        // If this cookie's domain doesn't match the URL's host, go to the next one
        if (!destURL.getHost().endsWith(cookie.domain)) {
            continue;
        }

        // If the paths don't match, go to the next one
        if (!destURL.getFile().startsWith(cookie.path)) {
            continue;
        }

        // Okay, you've determined that the current cookie matches the URL, now
        // add it to the sendCookies vector in the proper place (i.e. ensure
        // that the vector goes from longest to shortest path).
        int j;
        for (j=0; j < sendCookies.size(); j++) {
            Cookie currCookie = (Cookie) sendCookies.
                elementAt(j);

            // If this cookie's path is longer than the cookie[j], you should insert
            // it at position j.
            if (cookie.path.length() >
                currCookie.path.length()) {
                break;
            }
        }

        // If j is less than the array size, j represents the insertion point
        if (j < sendCookies.size()) {
            sendCookies.insertElementAt(cookie, j);
        // Otherwise, add the cookie to the end
        } else {
            sendCookies.addElement(cookie);
        }
    }

    // Now that the sendCookies array is nicely initialized and sorted, create
    // a string of name=value pairs for all the valid cookies
    String cookieString = "";

    Enumeration e = sendCookies.elements();
    boolean firstCookie = true;

    while (e.hasMoreElements()) {
        Cookie cookie = (Cookie) e.nextElement();
        if (!firstCookie) cookieString += "; ";
        cookieString += cookie.name + "=" + cookie.value;
        firstCookie = false;
    }

    // Return null if there are no valid cookies
    if (cookieString.length() == 0) return null;
    return cookieString;
}
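If the hand-rolled insertion loop is hard to follow, the ordering step can also be sketched with a comparator. This is an alternative written against later JDK releases, not the CookieDatabase code: CookieOrderSketch, Entry, and cookieHeaderValue are hypothetical names, and the sketch assumes the list it is given already contains only the cookies that match the URL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: orders matching cookies by path length, longest first.
final class CookieOrderSketch {

    // A stripped-down stand-in for a cookie; only what the ordering needs.
    static final class Entry {
        final String name;
        final String value;
        final String path;

        Entry(String name, String value, String path) {
            this.name = name;
            this.value = value;
            this.path = path;
        }
    }

    // Builds the text that follows "Cookie: ", or null if nothing matches.
    static String cookieHeaderValue(List<Entry> matching) {
        List<Entry> sorted = new ArrayList<>(matching);
        // Longest path first; the sort is stable, so ties keep their order
        sorted.sort(Comparator.comparingInt((Entry e) -> e.path.length()).reversed());

        StringBuilder result = new StringBuilder();
        for (Entry e : sorted) {
            if (result.length() > 0) result.append("; ");
            result.append(e.name).append('=').append(e.value);
        }
        return result.length() == 0 ? null : result.toString();
    }

    public static void main(String[] args) {
        System.out.println(cookieHeaderValue(Arrays.asList(
            new Entry("a", "1", "/"),
            new Entry("b", "2", "/me/stuff"),
            new Entry("c", "3", "/me"))));
    }
}
```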

Finally, Listing 6.9 shows you an example application that fetches a Web page that contains a cookie. Whenever the application runs, it loads its cookie table from a file called cookies.dat. After you run the program, you can look at the cookies.dat file. It is printable text. The program accesses a Web page called "Andy's Netscape HTTP Cookie Page" (http://www.illuminatus.com/cookie), which is a great resource for learning about cookies and seeing them in action.

Since the CookieDatabase class does not automatically look for cookies in a response from a Web server, and does not automatically send cookie data, you have to do that yourself. Cookies are sent to the server in the header portion of an HTTP command.

Note

You can set only a few specific header values in the URL class, and the cookie string is not one of them. This means that you have to use sockets to perform a GET or POST that supports cookies.

Whenever you open a URL, you can get the cookie string for the URL by calling getCookieString in the CookieDatabase class. When reading the response from the Web server, you must scan the header results for the Set-cookie command. Whenever you find this command, you pass the cookie string from the Set-cookie command to the addCookie method in the CookieDatabase class. The method will extract all the important information from the cookie string.
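The header-scanning step can be tried without a network connection. The sketch below reads a canned response from a string and collects the cookie strings it finds; HeaderScanSketch and setCookieValues are hypothetical names, and the response text is invented for the example. In a real program, each collected string would be handed to the cookie table.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: pulls cookie strings out of a response header.
final class HeaderScanSketch {

    // Reads header lines up to the blank line that ends the header area
    // and returns the text after each "Set-Cookie:" tag.
    static List<String> setCookieValues(BufferedReader in) throws IOException {
        List<String> values = new ArrayList<>();
        String tag = "set-cookie:";
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            // Header names are not case sensitive, so compare in lowercase
            if (line.toLowerCase(Locale.ROOT).startsWith(tag)) {
                values.add(line.substring(tag.length()).trim());
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // A made-up response; the body must not be scanned for cookies
        String response =
            "HTTP/1.0 200 OK\r\n" +
            "Content-type: text/html\r\n" +
            "Set-Cookie: visits=3; path=/me/stuff\r\n" +
            "\r\n" +
            "<html>Set-Cookie: fake=1</html>\r\n";
        BufferedReader in = new BufferedReader(new StringReader(response));
        System.out.println(setCookieValues(in));
    }
}
```

Stopping at the first blank line matters: anything after it is page content, even if it happens to look like a header.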

Listing 6.9 Source Code for TestCookie.java

import java.net.*;
import java.io.*;

// This application demonstrates the CookieDatabase and Cookie
// classes. It first loads the cookie database from cookies.dat,
// then it opens up Andy's Netscape HTTP Cookie Page, which happens
// to assign you a cookie.
// Because the Java URL classes do not let you set arbitrary header
// strings (GRR!!!), you have to do cookie stuff MANUALLY (double-GRR!!)
//
// Much of this code was taken from the example of doing a GET with
// raw sockets.

public class TestCookie extends Object {

    public static void main(String args[])
    {
        try {
            CookieDatabase.loadCookies("cookies.dat");
        } catch (IOException ignore) {
        }

        try {
            // URL to Andy's Netscape HTTP Cookie Page, it's quite helpful
            URL url = new URL("http://www.illuminatus.com/cookie");

            int port = url.getPort();
            if (port < 0) port = 80;

            // Open a socket to the server
            Socket socket = new Socket(url.getHost(), port);

            // Create an output stream so you can write out the request header
            DataOutputStream outStream = new DataOutputStream(
                socket.getOutputStream());

            // Write the GET command
            outStream.writeBytes(
                "GET "+url.getFile()+" HTTP/1.0\r\n");

            // See if there are any valid cookies for this URL
            String cookieString = CookieDatabase.
                getCookieString(url);

            // If so, write out a cookie header
            if (cookieString != null) {
                outStream.writeBytes("Cookie: "+
                    cookieString+"\r\n");
            }

            // Write out \r\n for the end of the header area
            outStream.writeBytes("\r\n");

            // Now read the response from the server
            DataInputStream inStream = new DataInputStream(
                socket.getInputStream());

            String line;

            // Read the header strings scanning for a set-cookie tag, which
            // means you have to update the cookie database
            while ((line = inStream.readLine()) != null) {
                if (line.length() == 0) break;

                // if you got a set-cookie, create a new cookie and add it to the database
                if (line.toLowerCase().startsWith(
                    "set-cookie: ")) {
                    CookieDatabase.addCookie(
                        new Cookie(url, line.substring(12)));
                }
            }

            // Now that you've finished with the header, just dump out the
            // contents of the page. This won't look too pretty, it's all pure
            // HTML.
            int ch;
            while ((ch = inStream.read()) >= 0) {
                System.out.print((char) ch);
            }

            // Save the cookie database for later use
            CookieDatabase.saveCookies("cookies.dat");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Chapter 7 -- Creating Smarter Forms

Chapter 7