7.1 The Need for Authentication
We are primarily concerned with one goal of the security package: the ability to authenticate classes that have been loaded from the network. The components of the Java API that provide authentication may have other uses in other contexts (including within your own Java applications), but their primary goal is to allow a Java application (and the Java Plug-in) to load a class from the network and be assured of two things:
• The identity of the site from which the class was loaded can be verified (author authentication).
• The class was not modified in transit over the network (data authentication).
As we've seen, Java applications typically assume that all classes loaded over the network are untrusted classes, and these untrusted classes are generally given permissions consistent with that assumption. Classes that meet the above two criteria, however, need not necessarily be so constrained. If you walk into your local software store and buy a shrink-wrapped piece of software, you're generally confident that the software will not contain viruses or anything else that's harmful. This is part of the implied contract between a commercial software producer and a commercial software buyer. If you download code from that same software producer's web site, you're probably just as confident that the code you're downloading is not harmful; perhaps it should be given the same access rights as the software you obtained from that company through a more traditional channel.
There's a small irony here because many computer viruses are spread through commercial software. That's one reason why the fact that a class has been authenticated does not necessarily mean it should be able to access anything on your machine that it wants to. It's also a reason why the fine-grained nature of the access controller is important: if you buy classes from acme.com but only give them access to certain things on your machine, you are still somewhat protected if by mistake acme.com includes a virus in their software.
Even if all commercial software were virus-free, however, there is a problem with assuming that code downloaded from a commercial site is safe to run on your machine. The problem with that assumption -- and the reason that Java by default does not allow that assumption to be made -- has to do with the way in which the code you execute makes its way through the Internet. If you load some code from www.xyz.com onto your machine, that code will pass through many machines that are responsible for routing the code between your site and XYZ's site. Typically, we like to think that the data that passes between our desktop and www.xyz.com enters some large network cloud; it's called a cloud because it contains a lot of details, and the details aren't usually important to us. In this case, however, the details are important. We're very interested to know that the data between our desktop and xyz.com passes through, for example, our Internet service provider, two other sites on the Internet backbone, and XYZ's Internet service provider. Such a transmission is shown in Figure 7-1. The two types of authentication that we mentioned above provide the necessary assurance that the data passing through all these sites is not compromised.
Figure 7-1. How data travels through a network
7.1.1 Author Authentication
First we must prove that the author of the data is who we expect it to be. When you send data that is destined for www.xyz.com, that data is forwarded to site2, which is supposed to forward it to site1, which should simply forward it to XYZ's Internet service provider. You trust site1 to forward the data to XYZ's Internet service provider unchanged; however, nothing forces site1 to fulfill its part of this contract. A hacker at site1 could arrange for all the data destined for www.xyz.com to be sent to the hacker's own machine, and the hacker could send back data through site2 that looked as if it originated from www.xyz.com. The hacker is now successfully impersonating the www.xyz.com site. Hence, although the URL in your browser says www.xyz.com, you've been fooled: you're actually receiving whatever data the impersonator of XYZ Corporation wants to send to you.
There are a number of ways to achieve this masquerade, the best known of which is DNS spoofing. When you want to surf to www.xyz.com, your desktop asks your DNS server (which is typically run by your Internet service provider) for the IP address of www.xyz.com, and you then send off the request to whatever address you receive. If your Internet service provider knows the IP address of www.xyz.com, it tells your desktop what the correct address is; otherwise, it has to ask another DNS server (e.g., site1) for the correct IP address. If a hacker has control of a machine anywhere along the chain of DNS servers, it is relatively simple for that hacker to send out his own address in response to a DNS request for www.xyz.com.
Now say that you surf to www.xyz.com and request a Java class (or set of classes) to run a spellchecker for your Java-based word processor. The request you send to www.xyz.com will be misaddressed by your machine -- your machine will erroneously send the request to the hacker's machine since that's the IP address your machine has associated with www.xyz.com. Now the hacker is able to send you back a Java class. If that Java class is suddenly trusted (because, after all, it allegedly came from a commercial site), it has access that you wouldn't necessarily approve: perhaps while it's spellchecking your document, it is also searching your hard disk to find the datafile of your financial planning software so that it can read that file and send its contents back over the network to the hacker's machine.
Yes, we've made this sound easier than it is -- the hacker would have to have intimate knowledge of the xyz.com site to send you back the classes you requested, and those classes would have to have the expected interface in order for any of their code to be executed. But such situations are not difficult to set up either; if the hackers stole the original class files from www.xyz.com -- which is usually extremely easy -- all they need to do is set themselves up at the right place in the DNS chain.
In the strict Java security model we explored earlier, this sort of situation is possible, but it is not dangerous. Because the classes loaded from the network are never trusted at all, the class that was substituted by the hackers is not able to damage anything on your machine. At worst, the substituted class does not behave as you expect and may in fact do something quite annoying -- like play loud music on your machine instead of spellchecking your document. But the class is not able to do anything dangerous, simply because all classes from the network are untrusted.
In order to trust a class that is loaded from the network, then, we must have some way to verify that the class actually came from the site it claims to come from. This authentication is provided by a digital signature that accompanies the class data -- an electronic verification that the class did indeed come from www.xyz.com.
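To make the idea of a signature accompanying class data concrete, here is a minimal sketch using the standard java.security.Signature class. The class name SignatureSketch, the helper methods, and the choice of the SHA256withRSA algorithm are assumptions made for illustration, and the key pairs are generated on the spot as stand-ins for keys that a real site and a real impersonator would already hold. How the recipient comes to trust XYZ's public key in the first place is outside the scope of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

public class SignatureSketch {
    // Algorithm choice is an assumption for this sketch.
    private static final String ALGORITHM = "SHA256withRSA";

    /** Signs the data with the sender's private key. */
    static byte[] sign(byte[] data, PrivateKey key) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(ALGORITHM);
        signer.initSign(key);
        signer.update(data);
        return signer.sign();
    }

    /** Returns true only if the signature was produced over this data by the matching private key. */
    static boolean verify(byte[] data, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance(ALGORITHM);
        verifier.initVerify(key);
        verifier.update(data);
        try {
            return verifier.verify(signature);
        } catch (SignatureException e) {
            // A malformed signature is treated as a failed verification.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair xyz = generator.generateKeyPair();      // stand-in for www.xyz.com's keys
        KeyPair impostor = generator.generateKeyPair(); // stand-in for the hacker's keys

        // Placeholder bytes; a real deployment would sign actual class file data.
        byte[] classData = "spellchecker class bytes".getBytes(StandardCharsets.UTF_8);

        byte[] genuine = sign(classData, xyz.getPrivate());
        byte[] forged = sign(classData, impostor.getPrivate());

        System.out.println("signed by xyz:      " + verify(classData, genuine, xyz.getPublic()));
        System.out.println("signed by impostor: " + verify(classData, forged, xyz.getPublic()));
    }
}
```

The recipient checks every signature against XYZ's public key, so a class signed with any other private key fails verification even if its bytes are identical to the genuine class.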
7.1.2 Data Authentication
The second problem introduced by the fact that our transmissions to www.xyz.com must pass through several hosts is the possibility of snooping. In this scenario, assume that site2 on the network is under control of a hacker. When you send data to www.xyz.com, the data passes through the machine on site2, where the hacker can modify it; when data is sent back to you, it travels the same path, which means that the hacker on site2 can again modify the data.
This lack of privacy in data transmission is one reason you might want data over the network to be encrypted -- certainly if the spellchecking software you're using from www.xyz.com is something you must pay for, you don't want to send your unencrypted credit card number through the network so that site2 can read it.
However, for authentication purposes, encrypting the data is not strictly necessary. All that is necessary is some sort of assurance that the data that has passed through the network has not been modified in transit. This can be achieved by various cryptographic algorithms even though the data itself is not encrypted. The simpler path is to use such an algorithm, known as a message digest algorithm, whose output serves as a digital fingerprint of the data, instead of encrypting the data.
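As a rough illustration of a digital fingerprint, the following sketch computes a message digest with the standard java.security.MessageDigest class and shows that changing a single bit of the data changes the fingerprint. The class name FingerprintSketch, the helper methods, and the choice of SHA-256 are assumptions made for this example, and the byte array is a placeholder for real class file data.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FingerprintSketch {
    /** Computes a SHA-256 message digest (the "fingerprint") of the data. */
    static byte[] fingerprint(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        return digest.digest(data);
    }

    /** Returns true if the data still produces the expected fingerprint. */
    static boolean matches(byte[] data, byte[] expected) throws NoSuchAlgorithmException {
        return MessageDigest.isEqual(fingerprint(data), expected);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder bytes standing in for a class file sent by www.xyz.com.
        byte[] original = "class bytes as sent".getBytes(StandardCharsets.UTF_8);
        byte[] expected = fingerprint(original);

        // Simulate a modification in transit by flipping one bit.
        byte[] tampered = original.clone();
        tampered[0] ^= 0x01;

        System.out.println("unmodified data matches: " + matches(original, expected));
        System.out.println("tampered data matches:   " + matches(tampered, expected));
    }
}
```

Note that the data itself is never encrypted here; anyone who can see it can still read it. A bare fingerprint also helps only if it reaches the recipient intact, since anyone able to replace the data could recompute a fingerprint for the replacement; that is one reason the fingerprint is combined with a signature.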
Encryption Versus Data Authentication
When you send data through a public network, you can use a digital fingerprint of that data to ensure that the data was not modified while it was in transit over the network. This fingerprint is sufficient to prevent a snooper from substituting new data (e.g., a new Java class file) for the original data in your transmission.
However, this authentication does not prevent a snooper from reading the data in your transmission; authenticated data is not encrypted data. If you are worried about someone stealing your data, the security provided by data authentication is insufficient. Data authentication guards against modification of the data, but not against reading it.
This can be a very important difference in countries that place import or export controls on encryption. Those restrictions do not apply to digital signatures, so the Java code that implements digital signatures is freely available. Hence, it is easier to deploy an application that requires digital signatures than one that requires encryption.
Without some cryptographic mechanism in place, the hacker at site2 has the option of modifying the classes that are sent from www.xyz.com. When the classes are read by the machine at site2, the hacker could modify them in memory before they are sent back onto the network to be read by site1 (and ultimately to be read by your machine). Hence, the classes that are sent need to have a digital fingerprint associated with them. As it turns out, the digital fingerprint is required to sign the class as well.
7.1.3 Java's Role in Authentication
When Java was first released and touted as being "secure," it surprised many people to discover that the types of attacks we've just discussed were still possible. As we've said, security means many things to many people, but a reasonable argument could be made that the scenarios we've just outlined should not be possible in a secure environment.
The reasons Java did not solve these problems in its first release are varied, but they essentially boil down to one practical reason and one philosophical reason.
The practical reason is that all the solutions we're about to explore depend to a high degree on technologies that are just beginning to become viable. As a practical matter, authentication relies on everyone having public keys available -- and as we'll discuss in Chapter 10, that's not necessarily the case. Without a robust mechanism to share public keys, Java had two options:
• Provide no security at all, and allow applets full use of the resources of the user's computer. By now, we know all the possible problems with that route.
• Provide the very strict security that was implemented in 1.0-based versions of Java, with a view toward ways of enhancing that model as technologies evolved. While not the best of all possible worlds, this compromise allowed Java to be adopted much sooner than it would otherwise have been.
On a philosophical level, however, there's another argument: Java shouldn't solve these problems because they are not confined to Java itself. Even if Java classes were always authenticated, that would not prevent the types of attacks we've outlined here from affecting non-Java-related transmissions. If you surf to www.xyz.com and that site is subject to DNS spoofing, you'll be served whatever pages the spoofer wants to substitute. If you engage in a standard non-Java, forms-based transmission with www.xyz.com, a snooper along the way can steal and modify the data you're sending over the standard HTTP protocol.
In other words, the attacks we've just outlined are inherent in the design of a public network, and they affect all traffic equally -- email traffic, web traffic, FTP traffic, Java traffic, and so on. In a perfect world, solving these problems at the Java level is inefficient, as it means that the same problem must still be solved for all the other traffic on the public network. Solving the problem at the network level, on the other hand, solves the problem once and for all, so that every protocol and every type of traffic is protected.
There are a number of popular technologies that solve this problem in a more general case. If all the traffic between your site and www.xyz.com occurs over SSL using an HTTPS-based URL, then your browser and the www.xyz.com web server will take care of the details of authentication of all web-based traffic, including the Java-related traffic. That solves the problem at the level of the web browser, but that still is not a complete solution. If the applet needs to open a connection back to www.xyz.com, it must use SSL for this communication as well. And we still have other, non-web-related traffic that is not authenticated.
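As one possible way for code to insist on SSL when it connects back to its host, the sketch below refuses any URL that is not HTTPS-based before handing back a connection object. The class name HttpsSketch, the open helper, and the /classes/ path are hypothetical; only the java.net and javax.net.ssl types are standard. Obtaining the connection object does not contact the server, so nothing is sent over the network until the caller actually connects.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class HttpsSketch {
    /**
     * Returns an unconnected HTTPS connection for the given address,
     * rejecting any address that does not use the https scheme.
     */
    static HttpsURLConnection open(String address) throws IOException {
        URL url = URI.create(address).toURL();
        if (!"https".equalsIgnoreCase(url.getProtocol())) {
            throw new IllegalArgumentException("Refusing non-HTTPS URL: " + address);
        }
        // For https URLs the standard protocol handler returns an HttpsURLConnection.
        return (HttpsURLConnection) url.openConnection();
    }

    public static void main(String[] args) throws Exception {
        // The path is hypothetical; no network traffic occurs until connect() is called.
        HttpsURLConnection connection = open("https://www.xyz.com/classes/");
        System.out.println("Prepared connection to " + connection.getURL());
    }
}
```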
It would be better still to solve this problem at the network level itself. There are many products from various vendors that allow you to authenticate (and encrypt) all data between your site and a remote site on the network. Using such a product is really the ideal from a design point of view; in that way, all data is protected, no matter what the source of the traffic. Either of these solutions makes authentication and fingerprinting of Java classes redundant (and may offer the benefit that the data is actually encrypted when it passes through the network).
Unfortunately, these solutions lead us back to practical considerations: if it's hard for Java environments to share digital keys and to manage cryptographic technology, it's harder still to depend on the network software to manage this process. So while it might be ideal for this problem to be solved for the network as a whole, it's impractical to expect such a solution. Hence, the Java security package offers a reasonable compromise: it allows you to deploy and use trusted (i.e., authenticated) classes, but their use is not mandated in case you prefer to employ a broader solution to this problem.