bulk transfer of data to a group of receivers, most send streaming data. Furthermore, data may need to be transmitted in real time, with implications on buffer space requirements at both the sender and the receivers. Finally, packets may be lost in transit or arrive out of order. Thus, source authentication and non-repudiation of group communications are difficult problems.
There are several documented solutions that provide various levels of authentication with trade-offs in computation and communication overhead, buffer space requirements, authentication delay, and verification probability. Some of these schemes are loss tolerant, while others require reliable delivery. Among the loss-tolerant schemes, some can tolerate any kind of loss, whereas others are optimized to tolerate bursty or random losses, with limitations such as verification probability.
Digitally signing each packet addresses the security needs of most, if not all, applications. But it is computationally expensive to generate or verify digital signatures. The per-packet communication overhead is also excessive. Thus, most schemes use amortization techniques to reduce both these costs. Block hashing, described in Section 3.2.1, amortizes the computational cost of a digital signature over a block of packets. In hash chaining–based schemes, each packet’s hash is sent as part of one or more other packets. Signature packets hold the hashes of the end points (packets) of the hash chains. Section 3.3 describes a variety of schemes based on hash chaining. Another class of protocols uses MAC-based (symmetric key–based) authentication, with keys derived from one-way function chains. These protocols get around the limitation of MAC keys for source authentication by using delayed key disclosure.
Note that irrespective of the authentication scheme, IPsec ESP [1], MESP [2] or AMESP protocols may be used to send authentication information or keys.
3.1 Issues in multicast data authentication
The problem of secure group communication can be divided into several building blocks for a better understanding of the requirements and simpler analysis of solutions. In that spirit, Section 2.6 lists several building blocks, two of which are addressed here: multicast source authentication and data integrity, and group authentication. Recall that these two building blocks belong to problem area 1 of the Reference Framework described in Chapter 2. The problem of multicast data authentication has three components: data integrity, authentication, and non-repudiation.
• Receivers must be able to determine that data has not been modified either by other members of the multicast group or by external adversaries. This property is referred to as data integrity protection.
• Receivers need to be able to establish the source of the data, at least for themselves. In other words, we need data origin authentication.
• A stronger version of the above property, referred to as non-repudiation, allows impartial third-party verification of the data source.
Data integrity and authentication go hand in hand. Notice that if data has been modified in transit, the source is no longer the legitimate origin of the data. Similarly, if a receiver can establish the source of the data (at least to itself), the data has not been modified en route. Therefore data integrity and authentication depend on each other. Non-repudiation is essentially a stronger version of data authentication. In other words, a protocol or mechanism that ensures non-repudiation also guarantees authentication. Additional security services such as confidentiality and access control are addressed in Chapters 4, 5, and 6.
There are two distinct types of applications, with varying requirements, to consider in authenticating multicast data. In the first, a sender transmits bulk data to receivers. Here, receivers can wait until they receive all the data sent before verifying the authenticity and integrity of that data. Examples of such applications include multicast ftp [3] and Web cache synchronization. The second category of applications streams multicast data. Receivers might want to verify the integrity and authenticity of each data packet as it arrives, and use it immediately. We also need to handle lossy communication channels, as well as out-of-order packet delivery. In other words, authentication information must be associated with each packet. Video-on-demand and multimedia conferencing are examples of applications that need multicast streaming.
Two different mechanisms are generally used for source authentication. The first is to use digital signatures for non-repudiation, and the second is to use MACs for authentication only. Recall that non-repudiation is a stronger form of authentication. However, while MACs cannot provide non-repudiation, they are more efficient than digital signatures. Digitally signing each packet of streaming data is prohibitively expensive (both computationally and with respect to communication overhead per packet).
In unicast communication, MACs support data authentication as follows. Consider two communicating peers, Alice and Bob, holding a shared secret key for authentication. Alice uses the key and a one-way function to compute the keyed hash (e.g., HMAC [4]) of the message, and sends the message along with the MAC to the receiver. Bob repeats the procedure to compute the MAC, and compares it with the received MAC. If the MACs are identical, Bob knows that the message has not been modified en route. He also knows that he did not send the message himself, and therefore Alice must have sent it, assuming the authentication key has not been compromised. However, a third party cannot verify whether the message was sent by Alice or by Bob, since both hold the same key. Therefore, MACs cannot provide non-repudiation.
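The unicast exchange between Alice and Bob can be sketched as follows. This is a minimal illustration using Python's standard hmac module; the choice of SHA-256 as the underlying hash, the key length, and the helper names compute_mac and verify_mac are assumptions made for the example, not part of any protocol described here.

```python
import hashlib
import hmac
import os

def compute_mac(key: bytes, message: bytes) -> bytes:
    """Keyed hash of the message (HMAC; SHA-256 is an illustrative choice)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Recompute the MAC and compare it with the received one in constant time."""
    return hmac.compare_digest(compute_mac(key, message), received_mac)

# Hypothetical shared secret held only by Alice and Bob.
shared_key = os.urandom(32)

# Alice computes the MAC and sends (message, tag).
message = b"meet at noon"
tag = compute_mac(shared_key, message)

# Bob repeats the computation on what he received.
accepted = verify_mac(shared_key, message, tag)
# A message modified en route no longer matches the tag.
tampered = verify_mac(shared_key, b"meet at nine", tag)
```

Because Bob could have produced the same tag himself, nothing in this exchange would convince a third party that Alice was the sender.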
We can use MACs for authenticating group communications following a similar procedure, but with a reduced level of security. Consider a group, consisting of Alice, Bob, and Cindy, holding an authentication key. Alice might use a MAC to authenticate a message sent to Bob and Cindy. Bob (or Cindy), however, does not know whether the message was sent or last modified by Alice or by Cindy (or Bob). In general, members of a group can verify only that nonmembers, that is, people who do not hold the group authentication key, have not changed the data in transit. This property, which guarantees only that a message was sent (or last modified) by a member of the group, is referred to as group authentication [5].
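The limitation can be made concrete with a small sketch. It assumes a hypothetical packet layout of (claimed sender, payload, MAC) and uses HMAC with SHA-256 purely for illustration. Bob accepts anything tagged with the group key, so a packet forged by Cindy in Alice's name is indistinguishable from a genuine one, while a packet from an outsider is rejected.

```python
import hashlib
import hmac
import os

def group_mac(key: bytes, sender: bytes, payload: bytes) -> bytes:
    """MAC over the claimed sender name and the payload (illustrative layout)."""
    return hmac.new(key, sender + b"|" + payload, hashlib.sha256).digest()

# Hypothetical group key held by Alice, Bob, and Cindy.
group_key = os.urandom(32)
# A nonmember can only guess at the key.
outsider_key = os.urandom(32)

# Packet really sent by Alice.
genuine = (b"alice", b"rate=5", group_mac(group_key, b"alice", b"rate=5"))
# Packet created by Cindy, who holds the same key, claiming to be Alice.
forged_by_member = (b"alice", b"rate=9", group_mac(group_key, b"alice", b"rate=9"))
# Packet created by a nonmember.
from_outsider = (b"alice", b"rate=0", group_mac(outsider_key, b"alice", b"rate=0"))

def bob_accepts(packet) -> bool:
    """Bob's check: does the MAC verify under the group key?"""
    sender, payload, tag = packet
    return hmac.compare_digest(group_mac(group_key, sender, payload), tag)
```

Here bob_accepts returns True for both genuine and forged_by_member, and False for from_outsider.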
In contrast, if a member can establish whether the data sender is legitimate or not, we refer to that property as source authentication. With source authentication, a member can verify the data source and know that data has not been modified en route. Solutions for source authentication in general are either expensive or complex, and often application dependent.
3.1.1 Providing group authentication
Group authentication of a message implies that the message originated within the group, and has not been modified by entities outside the group. A MAC is used for group authentication, and thus it is rather inexpensive to authenticate even streaming data in real time. Group authentication has some important applications. Consider, for example, secure communication over the public Internet between entities (e.g., gateways) that trust each other. Group authentication is sufficient in this case, since members holding the group keys are assumed to have no interest in modifying data sent by other members. Group authentication serves only a limited purpose, however, and may not be sufficient for most applications.
Members of the group need a common key for group authentication (for MAC computation). Thus, we need to be able to establish and update the authentication keys among the members securely. The group key distribution protocols and algorithms described in Chapters 4, 5, and 6 provide ways to establish a common key among members of a group. Along with the
encryption keys, a group manager may also distribute authentication keys [6, 7]. The registration protocol is used to send the keys initially, and the rekey protocol is used to send key updates [6]. A sender can use those keys in a data security protocol (e.g., IPsec ESP, MESP [2], or AMESP) for authenticated group communication.
3.1.2 Providing source authentication
It is often not sufficient to be able to verify that a message originated within the group. Members would like to establish, at least to themselves, the sender of multicast data. Recall that a stronger property is for any third party to be able to independently verify the sender of the data. This, as introduced earlier, is known as non-repudiation.
Application requirements greatly influence the solution space for source authentication. First, an application may require non-repudiation or only source authentication. Next, data transmission may be reliable or lossy. Furthermore, the sender or the receivers may have limited buffer space. Moreover, receivers could have limited computational power (e.g., mobile devices), and, in some cases, receivers’ computational capacity may be heterogeneous. Receivers may be at different distances from the sender. Finally, the application may involve bulk data transfer(s) or streaming. In the following, we discuss the source authentication requirements of several different types of applications.
Reliable bulk data transmission. We assume that the data transmission is reliable and the sender has the data available in advance. The receivers can use the data only after all of it has been received. Buffer space is not of concern either at the sender or at the receivers. These flows are referred to as all or nothing flows [8]. Multicast ftp and Web cache synchronization are examples of all or nothing flows.
A simple solution may be for the sender to compute the hash of the data and sign it. Notice, however, that an adversary can disrupt this authentication process by changing just a single bit in the flow. Receivers cannot detect this attack until all the data has been received. Furthermore, they cannot identify the portion of the data that has been changed. Thus, they have to request retransmission of the entire flow.
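The weakness of the hash-and-sign approach for whole flows can be sketched as follows. The example covers only the hashing step: the signature over the digest is omitted, SHA-256 is an illustrative choice, and the packet sizes and the helper name flow_digest are assumptions. A single flipped bit changes the digest, but the comparison is possible only after the last packet, and it gives no hint about which packet was altered.

```python
import hashlib

def flow_digest(packets) -> bytes:
    """Hash of the entire flow; the sender would sign this single value."""
    h = hashlib.sha256()
    for p in packets:
        h.update(p)
    return h.digest()

# Hypothetical flow of 50 packets of 1,000 bytes each.
packets = [bytes([i]) * 1000 for i in range(50)]
signed_digest = flow_digest(packets)  # signature over this value not shown

# An adversary flips one bit in one packet.
corrupted = list(packets)
corrupted[17] = bytes([corrupted[17][0] ^ 0x01]) + corrupted[17][1:]

# Receivers can run these checks only once the whole flow has arrived.
flow_ok = flow_digest(packets) == signed_digest
flow_bad = flow_digest(corrupted) == signed_digest
```

A failed comparison tells the receiver that something in the flow changed, which is why the entire flow has to be retransmitted.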
Reliable streaming of stored data. Consider the reliable transmission of data that the sender knows in advance. The sender needs a large buffer to store the data. Since the sender knows the data in advance, it can perform authentication transforms off-line. Receivers are expected to authenticate and use the data packets as they arrive. Thus, receiver-side buffering requirements are relatively modest.
Lossy streaming of (partially) stored data. This is similar to the above category, except that packets may be lost in transmission. Considering packet losses, receivers should be able to verify the authenticity of portions of data that are received. Video-on-demand is an application that falls into this category. These applications may be able to tolerate delayed (with fixed delay) verification.
Real-time streaming with packet loss. Real-time applications require the sender to transmit as soon as the data becomes available. Thus the sender needs to apply the authentication transforms in real time. Because transmission is lossy, each packet’s authenticity must be independently verifiable. Multimedia conferencing is an application that requires real-time streaming in the presence of packet losses.
3.2 Digital signatures for source authentication
Source authentication can be achieved using digital signatures. The sender divides the data into blocks. For each block, it computes a hash, signs the hash, and sends the signature along with the data block. There are several issues to address. As the block size increases, the sender needs to perform fewer digital signatures and members need to perform fewer signature verifications. However, a member needs to receive an entire block before verifying its authenticity. For smaller block sizes, the number of signatures and verifications increases. The advantage is that members need not wait long before verifying and using a block.
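The block-size trade-off can be illustrated with simple arithmetic. The sketch below assumes, for illustration only, a flow measured in equal-sized packets and a hypothetical helper signatures_needed. It counts one signature per block, while the block size itself is the number of packets a member must receive before it can verify anything.

```python
import math

def signatures_needed(total_packets: int, block_size: int) -> int:
    """One signature (and one verification per member) for each block."""
    return math.ceil(total_packets / block_size)

# Hypothetical flow length, in packets.
total_packets = 10_000

# Map each candidate block size (packets to wait for) to the signature count.
tradeoff = {m: signatures_needed(total_packets, m) for m in (1, 16, 256)}
```

With these assumed numbers, signing every packet costs 10,000 signatures, blocks of 16 packets cost 625, and blocks of 256 packets cost 40, at the price of waiting for 256 packets before verification.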
Note that this procedure has several useful properties. Each block is individually authenticated and thus independently verifiable. Furthermore, this technique provides block-level non-repudiation. But all this comes at a cost. Signing and verifying each block is computationally expensive. Moreover, each block needs to carry its own signature, which results in excessive communication overhead. Independent packet authentication makes signing each packet look attractive. However, in practice, signing each packet in a high data rate real-time stream may not be feasible. Using one-time signatures [9] is a slightly more efficient alternative to signing each packet. However, one-time signatures require a large number (60–80 [9]) of hash computations, and cannot handle packet losses.
Given that digital signatures can provide individual packet authentication, several solutions have been proposed that amortize their cost over a stream or a block. Some of the solutions are applicable to specific application scenarios, while others make assumptions about the relationship between the sender(s) and receivers (e.g., that they are synchronized). A few solutions simply trade the excessive computational cost of digital signatures for communication cost.
In the next section, we describe schemes that amortize a digital signature over a block of data. These mechanisms reduce the number of digital signatures at the expense of increased communication overhead per packet.
3.2.1 Block signatures and individual packet authentication
For delay-sensitive flows, signing each packet is too expensive. However, we still need authentication mechanisms so receivers can verify packets as they are received. In other words, each packet must be independently verifiable. In the rest of this section, we describe a couple of techniques called star hashing and the more efficient tree hashing [8]. These schemes amortize the cost of a signature over a block of packets, and they require a sender-side buffer that can hold the block of packets. Thus, star and tree hashing are sometimes referred to as block hashing.
Star hashing
The sender divides a block of data into $m$ packets. It signs a hash of the block (block hash), and thus amortizes the cost of the signature operation over $m$ packets. For individual packet authentication, it computes the hashes $h_1, h_2, \ldots, h_m$ of the $m$ packets. The block hash, $h$, is a hash of the concatenation of all the individual packet hashes. Thus, $h = \mathrm{hash}(h_1, h_2, \ldots, h_m)$, where $h_i = \mathrm{hash}(P_i)$. Note that $P_i$ represents packet $i$. With each packet, the sender includes the block hash and the hashes of all the packets in the block. It also sends the relative position of the packet in the block.
Figure 3.1 illustrates star hashing. The edges (or leaves) represent packet hashes, and the root represents the block hash. The hash dependency graph is a star and hence the name star hashing. The figure also illustrates the relationship between the packet hashes and the block hash; that is, a block hash is dependent on all of the packet hashes.
Upon reception of $P_i$, the receiver computes its hash, $h'_i$ (the prime indicates receiver-side computation). It repeats the block hash computation procedure as described earlier, but using $h'_i$ instead of $h_i$. If the signed block hash is identical to the computed block hash, the receiver knows that $P_i$ is authentic. Furthermore, it also knows that the rest of the hashes are authentic. Otherwise, the block hash comparison would have failed. For other packets in the same block, the receiver needs only to compute the packet hash and verify that it is equal to the received hash. In other words, there is only a single signature verification operation per block at the receivers.
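The sender and receiver steps of star hashing can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 stands in for the hash function, the digital signature over the block hash is not implemented (the receiver is simply handed a block hash that is assumed to have passed signature verification), and the names star_sender and StarReceiver are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    """Hash function used for packets and for the block (illustrative choice)."""
    return hashlib.sha256(data).digest()

def star_sender(packets):
    """Attach position, all packet hashes, and the block hash to each packet."""
    hashes = [H(p) for p in packets]
    block_hash = H(b"".join(hashes))  # a real sender would sign this value
    return [(i, p, hashes, block_hash) for i, p in enumerate(packets)]

class StarReceiver:
    def __init__(self):
        self.verified_hashes = None  # filled in by the first verified packet

    def accept(self, index, packet, hashes, signed_block_hash):
        """signed_block_hash is assumed to be signature-verified (not shown)."""
        h_prime = H(packet)
        if self.verified_hashes is None:
            # First packet of the block: substitute the computed hash for the
            # received one and recompute the block hash.
            candidate = list(hashes)
            candidate[index] = h_prime
            if H(b"".join(candidate)) != signed_block_hash:
                return False
            self.verified_hashes = candidate
            return True
        # Later packets: one hash computation and one comparison.
        return h_prime == self.verified_hashes[index]

packets = [b"pkt-%d" % i for i in range(4)]
sent = star_sender(packets)
rx = StarReceiver()
first_ok = rx.accept(*sent[2])   # packets may arrive in any order
later_ok = rx.accept(*sent[0])
i, _, hs, bh = sent[1]
forged_ok = rx.accept(i, b"forged payload", hs, bh)
```

In this sketch the first accepted packet costs two hash computations plus the omitted signature check, and every later packet in the block costs one hash computation.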
Receivers perform one signature verification operation and two hash computations to verify the authenticity of the first received packet of a block. For the other packets of that block, a single hash computation and comparison suffices. While the computational overhead in star hashing is minimal, the same cannot be said about the communication overhead. Recall that each packet needs to carry the hashes of all the packets (m) in a block, as well as a digital signature. A hash is typically 20 bytes (secure hash algorithm, SHA-1) [10] in length, whereas a digital signature is about 128 bytes in size (e.g., 1,024-bit RSA [11]).
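The per-packet communication overhead of star hashing follows directly from these sizes. The sketch below uses the 20-byte hash and 128-byte signature quoted above and ignores the small field carrying the packet's position; the block sizes are hypothetical examples.

```python
HASH_BYTES = 20        # SHA-1 output size, as quoted in the text
SIGNATURE_BYTES = 128  # 1,024-bit RSA signature, as quoted in the text

def star_overhead(m: int) -> int:
    """Bytes of authentication data per packet: m packet hashes plus a signature."""
    return m * HASH_BYTES + SIGNATURE_BYTES

# Overhead for a few hypothetical block sizes.
star_overhead_by_block = {m: star_overhead(m) for m in (8, 16, 64)}
```

With these figures a block of 16 packets adds 448 bytes to every packet, and a block of 64 packets adds 1,408 bytes, which shows how quickly the overhead grows with the block size.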
Tree hashing
Tree hashing [12] employs a different block hash computation mechanism from that of star hashing. While the hash computation itself is slightly more complicated and less efficient, this scheme reduces the communication overhead associated with hashing.
Figure 3.1 Star hashing.
The sender divides a block of data into $m$ packets and computes the individual packet hashes. For block hash computation, it associates each individual packet hash with a leaf node of the hash tree (see Figure 3.2). Each internal node’s hash is the hash of the concatenation of its children’s hashes. Thus $h_{12} = \mathrm{hash}(h_1, h_2)$. Using this function, the sender recursively computes the root node’s hash. With each packet, the sender includes the signed block hash, the packet ID, and the hashes of the siblings of all the nodes in the current packet’s path to the root.
Receivers follow a similar procedure to that in star hashing to verify the authenticity of each packet. A receiver first computes the hash of the received packet. It uses the computed hash and the received hashes to compute the root hash. If the computed root hash is identical to the signed block hash, the received packet is authentic.
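The tree construction and the receiver's root recomputation can be sketched as follows. The sketch assumes a binary tree over a block whose size is a power of two, uses SHA-256 as a stand-in hash, and leaves out the signature on the root; the function names build_tree, auth_path, and root_from_path are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(packets):
    """Return the tree as levels: levels[0] holds leaf hashes, levels[-1] the root.

    Assumes len(packets) is a power of two.
    """
    level = [H(p) for p in packets]
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def auth_path(levels, index):
    """Sibling hashes of every node on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling sits at the neighbouring position
        index //= 2
    return path

def root_from_path(packet, index, path):
    """Receiver side: recompute the root from the packet and the sibling hashes."""
    node = H(packet)
    for sibling in path:
        # Keep left/right order: odd positions are right children.
        node = H(sibling + node) if index & 1 else H(node + sibling)
        index //= 2
    return node

packets = [b"P%d" % i for i in range(1, 9)]   # block of eight packets
levels = build_tree(packets)
signed_root = levels[-1][0]                   # the sender would sign this value

path_p2 = auth_path(levels, 1)                # second packet has position 1
p2_ok = root_from_path(packets[1], 1, path_p2) == signed_root
forged_ok = root_from_path(b"forged", 1, path_p2) == signed_root
```

For a block of eight packets each packet carries three sibling hashes, one per level below the root, instead of the hashes of all packets in the block.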
We use Figure 3.3 to illustrate the computation process at a receiver. Let us say $P_2$ is received first. The receiver computes the hash $h'_2$ of the received packet. With $P_2$, the sender includes the hashes of the siblings of all nodes in $P_2$’s path to the root. In our example, that implies hashes $h_1$, $h_{34}$, and $h_{5:8}$. The receiver computes $h'_{12} = \mathrm{hash}(h_1, h'_2)$, $h'_{1:4} = \mathrm{hash}(h'_{12}, h_{34})$, and $h'_{1:8} = \mathrm{hash}(h'_{1:4}, h_{5:8})$. If $h'_{1:8}$ and the signed block hash, $h_{1:8}$, are identical, $P_2$ is authentic. Furthermore, the received and the computed hashes, that is, $h_1$, $h_{34}$, and $h_{5:8}$, and $h_2$, $h_{12}$, and $h_{1:4}$, are authentic as well. The receiver caches the verified nodes of the hash tree for efficient verification of the other packets in the same block.
Figure 3.4 illustrates the advantage of caching verified hash nodes. For example, if $P_4$ is received next, the receiver needs only to compute $h'_4$ followed by $h'_{34}$. Notice that $h_{34}$ is among the verified nodes; therefore it is sufficient to compare $h'_{34}$ to $h_{34}$. If they are identical, $P_4$ is authentic.
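The effect of caching verified nodes can be sketched with a small receiver that stops climbing the tree as soon as it reaches a node it has already verified. The sketch assumes a block of eight packets, SHA-256 as a stand-in hash, and a root hash that has already passed signature verification (not shown); the class and function names are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(packets):
    """Hash tree as levels, leaves first; assumes a power-of-two block size."""
    levels = [[H(p) for p in packets]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def auth_path(levels, index):
    """Sibling hashes along the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path

class CachingReceiver:
    def __init__(self, signed_root, height):
        # (level, position) -> verified hash; the root is trusted via the
        # signature check, which is not shown here.
        self.verified = {(height, 0): signed_root}

    def accept(self, packet, index, path):
        """Return (authentic, number_of_hash_computations)."""
        node, level, computed = H(packet), 0, 1
        pending = {}
        for sibling in path:
            if (level, index) in self.verified:
                break  # reached a cached node: no need to climb further
            pending[(level, index)] = node
            pending[(level, index ^ 1)] = sibling
            node = H(sibling + node) if index & 1 else H(node + sibling)
            level, index, computed = level + 1, index // 2, computed + 1
        if self.verified.get((level, index)) != node:
            return False, computed
        self.verified.update(pending)  # cache everything confirmed on the way
        return True, computed

packets = [b"P%d" % i for i in range(1, 9)]
levels = build_levels(packets)
rx = CachingReceiver(levels[-1][0], len(levels) - 1)
first = rx.accept(packets[1], 1, auth_path(levels, 1))    # second packet first
second = rx.accept(packets[3], 3, auth_path(levels, 3))   # then the fourth
forged = rx.accept(b"forged", 3, auth_path(levels, 3))
```

In this run the first packet needs four hash computations to reach the root, while the fourth packet needs only two, because the node covering packets three and four was cached during the first verification.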
Figure 3.2 Tree hashing.
Authenticity verification of the first received packet of a block consists of a digital signature verification operation, and computation of all hashes in the path from the packet’s position in the tree to the root. In all, the receiver