4. Bob calculates (g^b mod p)^a mod p

1.8.4 Data Integrity

An important aspect of cryptography is protecting data integrity from accidental or intentional modification. Accidental data integrity violations include transmission errors, media errors, and noise. Intentional modification generally refers to attacks that try to modify data content while it is stored on media or in transit.

To provide for data integrity, today’s cryptography typically suggests the use of message integrity codes (MICs) that can then be protected with a symmetric or asymmetric key to become message authentication codes (MACs).

AU5219_book.fm Page 59 Thursday, May 17, 2007 2:36 PM

60 Mechanics of User Identification and Authentication

Although they are often confused, MACs differ from user authentication.

MACs are meant to use cryptographic mechanisms to ascertain that user data is not modified improperly. The process of ascertaining the integrity of user data is often referred to as data integrity authentication, or message authentication, or data authentication, or even just authentication, which makes it even more confusing.

User authentication is the process of ascertaining the identity of a user and typically takes place before user data is transferred on the network. User authentication is also sometimes referred to as just authentication. Clearly, there is a difference between the two, and they should not be confused.

1.8.4.1 Message Integrity Code (MIC)

A message integrity code (MIC) is a one-way function of the message that must be protected. The MIC provides a hash (digest) of the entire message and can use a linear or nonlinear algorithm. It can be considered a checksum for the message. It is important to understand that the actual message cannot be restored from the MIC, which is why the MIC is called a one-way function.

If a sender needs to guarantee the integrity of a message, he can send the message followed by a MIC calculated for that specific message.

The recipient calculates the MIC for the received message and compares it with the MIC sent with the original message. If the two do not match, the message has been altered in transit.
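The send-and-verify exchange described above can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-1 as the MIC function; the helper names `compute_mic` and `verify_mic` and the sample messages are hypothetical and not taken from any standard.

```python
import hashlib


def compute_mic(message: bytes) -> bytes:
    """Return a digest of the message (SHA-1 is an assumed choice of MIC)."""
    return hashlib.sha1(message).digest()


def verify_mic(message: bytes, received_mic: bytes) -> bool:
    """Recompute the MIC over the received message and compare it."""
    return compute_mic(message) == received_mic


# Sender side: the message is sent followed by its MIC.
original = b"Pay Bob 100 dollars"
mic = compute_mic(original)

# Recipient side: an unmodified message passes, an altered one fails.
intact_ok = verify_mic(original, mic)
altered_ok = verify_mic(b"Pay Bob 900 dollars", mic)
print(intact_ok, altered_ok)
```

Note that this check only detects changes to the message; it does nothing if the MIC itself is replaced as well.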

Because a MIC maps messages of arbitrary length to a fixed-size value, more than one cleartext message corresponds to a specific MIC. Technically, this means that if a sender generates a message and a MIC for it, and the message gets modified in transit, then a recipient might receive a completely different message with the same MIC. The recipient can therefore be wrongly convinced that the message has not been modified. The fact that each MIC has more than one corresponding cleartext message is referred to as collision. MIC algorithms should make collisions as hard to find as possible.

It is not possible to eliminate all collisions, because there are far more possible messages than possible hash values.
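To make the notion of a collision concrete, the following sketch uses a deliberately weak 16-bit MIC (the first two bytes of an MD5 digest) and searches for two different messages with the same value. The function names and the message format are assumptions made for illustration only. With only 65,536 possible MIC values, the search must succeed within 65,537 messages at the latest.

```python
import hashlib


def tiny_mic(message: bytes) -> bytes:
    """A deliberately weak 16-bit MIC: the first 2 bytes of the MD5 digest."""
    return hashlib.md5(message).digest()[:2]


def find_collision():
    """Hash distinct messages until two of them share the same 16-bit MIC."""
    seen = {}
    counter = 0
    while True:
        message = b"message-%d" % counter
        digest = tiny_mic(message)
        if digest in seen:
            # Two different messages, one MIC value: a collision.
            return seen[digest], message
        seen[digest] = message
        counter += 1


first, second = find_collision()
print(first, second, tiny_mic(first).hex())
```

Real MIC functions use much longer outputs precisely so that such a search is impractical, even though collisions still exist in principle.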

There are a number of algorithms that can be used to calculate MICs. Some of the popular algorithms that are also used in user authentication mechanisms include the following.

CRC32. CRC32 is a linear MIC that generates a 32-bit hash from the cleartext message. CRC32 represents the message to be protected as a polynomial, and uses polynomial division to generate the MIC. CRC32 is capable of detecting specific changes in the message, but not all. For example, in a basic CRC with a zero initial register value, the addition or deletion of a string of leading 0’s cannot be detected. Furthermore, it is easy to modify data without changing the MIC for that data. Therefore, CRC32 is primarily used in hardware for error detection and is not recommended for data integrity protection.
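The linearity of CRC32 can be observed directly. The sketch below assumes the CRC-32 variant implemented by Python's `zlib.crc32`; the helper `xor_bytes` and the sample strings are hypothetical. For three messages of equal length, the CRC of their XOR equals the XOR of their CRCs, so the effect of a chosen bit-level change on the checksum can be predicted without any secret.

```python
import zlib


def xor_bytes(*parts: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(parts[0]))
    for part in parts:
        for i, byte in enumerate(part):
            out[i] ^= byte
    return bytes(out)


# Three arbitrary 16-byte messages.
a = b"transfer 100 EUR"
b = b"transfer 900 EUR"
c = b"0000000000000000"

combined = xor_bytes(a, b, c)

# Because CRC32 is linear (affine) over XOR, this prediction always holds
# for equal-length inputs.
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
print(zlib.crc32(combined) == predicted)
```

A cryptographic hash such as MD5 or SHA-1 has no comparable relation between inputs and outputs, which is one reason CRC32 is suited to error detection rather than protection against deliberate modification.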

MD2, MD4, and MD5. MD2 (Message Digest 2), MD4 (Message Digest 4), and MD5 (Message Digest 5) are MIC functions invented by Ronald Rivest (see [5], [125], [126]). All these functions generate a 128-bit hash from a cleartext message.

MD2 is now considered obsolete, and collisions have been found in this algorithm. It is rarely used anymore.

MD4 was designed for implementation in software on 32-bit computers. The original cleartext message is processed in three rounds. Not long after its release, however, it became clear that there are collisions in the MD4 algorithm, and that the one-way function is not really irreversible. Therefore, MD4 is currently considered weak and should be avoided.

MD5 was designed as a strengthened version of MD4. MD5 uses four rounds of processing. Compared to MD4, round 4 is a new function and the function in round 2 has been altered. Furthermore, MD5 uses additive constants and feedback between processing steps. Still, collisions have been found in the MD5 compression function.

The MD4 and MD5 algorithms are very similar, and both are widely used. Although collisions have been found in MD5, it is considered reasonably secure and is the most popular data integrity algorithm on the Internet.

SHA-1. SHA-1 was designed by the U.S. National Security Agency (NSA) and published by NIST as a federal standard for message integrity. SHA-1 generates a 160-bit output hash. SHA-1 uses an algorithm very similar to MD5 but is generally considered stronger than MD5, and unlike MD5 is a government agency published standard rather than an Internet standard.

Despite the fact that SHA-1 is not collision-free either (see [129]), it is currently considered very secure and is widely used on the Internet.
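The fixed output sizes quoted above (128 bits for MD5, 160 bits for SHA-1) can be checked with Python's standard `hashlib` module. This is only a quick sketch; the sample inputs are arbitrary, and MD2 and MD4 are left out because their availability in `hashlib` depends on the underlying library build.

```python
import hashlib

# Inputs of very different lengths, chosen arbitrarily.
samples = [b"", b"The quick brown fox", b"x" * 10_000]

# Digest length in bits for each sample.
md5_bits = [len(hashlib.md5(s).digest()) * 8 for s in samples]
sha1_bits = [len(hashlib.sha1(s).digest()) * 8 for s in samples]

print(md5_bits)   # [128, 128, 128]
print(sha1_bits)  # [160, 160, 160]
```

The digest length does not depend on the length of the message, which is what allows a MIC to act as a compact checksum.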

1.8.4.2 Message Authentication Code (MAC)

Message integrity codes (MICs) provide for error detection but are not suitable by themselves for message authentication or digital signatures.

This is primarily because the MIC might protect a message from modification, but nothing protects the MIC itself. If an attacker is able to modify the message and the attached MIC, then data integrity is no longer guaranteed. Therefore, the industry has come up with MACs that involve symmetric or public key cryptography to protect the integrity of messages.
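The weakness of an unprotected MIC can be shown with a short sketch. It assumes MD5 as the MIC and an attacker who can rewrite both the message and the attached MIC in transit; the function name `mic` and the messages are hypothetical.

```python
import hashlib


def mic(message: bytes) -> bytes:
    """Unkeyed MIC: anyone can compute it, including an attacker."""
    return hashlib.md5(message).digest()


# The sender transmits the message together with its MIC.
sent_message = b"Pay Bob 100 dollars"
sent_mic = mic(sent_message)

# The attacker replaces the message and simply recomputes the MIC.
forged_message = b"Pay Eve 100 dollars"
forged_mic = mic(forged_message)

# The recipient's check passes, although the message is not the original.
recipient_accepts = mic(forged_message) == forged_mic
print(recipient_accepts)
```

Because computing the MIC requires no secret, the recipient has no way of telling the forged pair from a genuine one. A MAC closes this gap by making the computation depend on a key.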

HMAC

HMAC (Keyed-Hash Message Authentication Code) is a NIST standard, defined in [130]. HMAC specifies how a key can be used to protect MIC functions.

HMAC protects the MIC function by mixing a secret key into the hash calculation: the key is combined (XORed) with two fixed padding constants and hashed together with the message.

HMAC is very simple to implement. If the underlying MIC function processes its input in blocks of b bytes (b = 64 for MD5 and SHA-1), then the HMAC specification defines the so-called inner and outer pads in the following way:

ipad = the byte 0x36 repeated b times
opad = the byte 0x5C repeated b times

HMAC equivalents can be defined for virtually all MIC functions. If H is the MIC function for which the HMAC equivalent is being calculated, and K is the secret key used to protect the MAC (padded with zero bytes to the length of the pads), then:

MAC(text) = HMAC(K, text)
          = H((K XOR opad) || H((K XOR ipad) || text))

Similar to MICs, HMACs always have a fixed size, which is the same as that of the corresponding MIC. If an application needs a shorter MAC, the result can be truncated, in which case only the leftmost part of it is used.

To denote the use of a particular MIC, HMACs are typically referred to as HMAC-MD5 (denoting the use of MD5) or HMAC-SHA1 (denoting the use of SHA-1).
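The HMAC construction can be written out by hand and compared against Python's standard `hmac` module. The sketch below is illustrative rather than a reference implementation: the function name `hmac_manual` and the sample key and text are assumptions, and the pad length is taken from the hash function's block size as reported by `hashlib`.

```python
import hashlib
import hmac


def hmac_manual(key: bytes, text: bytes, hash_name: str = "md5") -> bytes:
    """Compute H((K XOR opad) || H((K XOR ipad) || text)) step by step."""
    block_size = hashlib.new(hash_name).block_size  # 64 for MD5 and SHA-1

    # Keys longer than one block are hashed first, then zero-padded.
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")

    k_ipad = bytes(k ^ 0x36 for k in key)  # K XOR ipad
    k_opad = bytes(k ^ 0x5C for k in key)  # K XOR opad

    inner = hashlib.new(hash_name, k_ipad + text).digest()
    return hashlib.new(hash_name, k_opad + inner).digest()


key = b"shared secret"
text = b"Pay Bob 100 dollars"

manual_md5 = hmac_manual(key, text, "md5")
library_md5 = hmac.new(key, text, "md5").digest()
print(hmac.compare_digest(manual_md5, library_md5))
```

In practice the library function should be preferred over a hand-written version; the manual form is shown only to make the inner and outer hashing steps visible.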

MD2.5

RFC 1964 (see [51]) defines a keyed MD5 variant referred to as MD2.5. This MAC algorithm only uses half (i.e., 8 bytes) of the MIC generated by MD5. Before generating the MD5 hash, the MD2.5 algorithm also prepends to the cleartext to be hashed the result of the DES-CBC encryption of a message consisting of all 0’s, with a zero initial vector and the protection key in reverse order as the key. The MD2.5 algorithm is considered a MAC because the string prepended to the cleartext depends on the protection key.

DES-CBC MAC

As already discussed, block ciphers in cipher block chaining (CBC) mode use feedback between blocks, so that the cryptotext for each block depends on all previous blocks of the message. This has led researchers to use DES-CBC as a message authentication code. The last encrypted block of a message is generated with feedback from all previous blocks, and the DES algorithm uses a key to perform encryption. Therefore, if only the last block is taken as the MAC, it has three useful properties: the entire message cannot be restored from it (so the algorithm is irreversible, which is a requirement), it depends on all previous blocks of the message, and it is protected with a key (the DES key).
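The chaining structure of a CBC MAC can be sketched as follows. Because DES is not part of the Python standard library, the sketch substitutes a hypothetical toy block cipher (`toy_encrypt_block`, a four-round Feistel network on 8-byte blocks); it stands in for DES only to show the feedback and is not secure. The zero-byte padding is likewise an assumption made for brevity.

```python
import hashlib

BLOCK = 8  # block size in bytes, the same as DES


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for DES: a 4-round Feistel network (NOT secure)."""
    left, right = block[:4], block[4:]
    for round_no in range(4):
        f = hashlib.sha1(key + bytes([round_no]) + right).digest()[:4]
        left, right = right, bytes(l ^ x for l, x in zip(left, f))
    return left + right


def cbc_mac(key: bytes, message: bytes) -> bytes:
    """Encrypt in CBC mode with a zero IV and keep only the last block."""
    # Assumed padding: zero bytes up to a whole number of blocks.
    if len(message) % BLOCK:
        message += b"\x00" * (BLOCK - len(message) % BLOCK)
    state = bytes(BLOCK)  # zero initial vector
    for i in range(0, len(message), BLOCK):
        block = message[i:i + BLOCK]
        # Feedback: XOR the previous cryptotext block into the next input.
        state = toy_encrypt_block(key, bytes(s ^ b for s, b in zip(state, block)))
    return state


key = b"8bytekey"
mac_original = cbc_mac(key, b"transfer 100 to alice")
mac_modified = cbc_mac(key, b"transfer 900 to alice")
print(mac_original.hex(), mac_modified.hex())
```

Changing a byte in the middle of the message alters every later cryptotext block, so the final block, and with it the MAC, changes as well.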

DES-CBC MACs are not as widely used as HMAC functions but are a secure way of protecting message integrity.

A variant of DES-CBC MAC is DES-CBC MD5, whereby as a final round, a 16-byte MD5 hash is calculated for the DES-CBC MAC.

RSA Signature (Asymmetric Algorithm)

The RSA Signature algorithm is based on the RSA encryption algorithm.

If a message m needs to be signed, RSA Signature requires the sender of the message to sign it using his own private key (n, d) and the following formula:

s = m^d mod n

Essentially, the above formula represents RSA Encryption using the sender’s private key. Because only the sender possesses his private key, no one else can generate the signature s from the message m. A recipient can verify the signature s of the message by decrypting s using the sender’s public key (n, e) and comparing the result with m.
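The signing formula and its verification can be tried out with a toy key. The numbers below (p = 61, q = 53, e = 17, d = 2753) are a small textbook-style example chosen for illustration and are far too small for real use; the function names are hypothetical.

```python
# Toy RSA key built from the primes p = 61 and q = 53 (illustration only).
n = 61 * 53   # modulus, 3233
e = 17        # public exponent
d = 2753      # private exponent: (e * d) mod 3120 == 1, where 3120 = 60 * 52


def sign(m: int) -> int:
    """s = m^d mod n, computed with the private key (n, d)."""
    return pow(m, d, n)


def verify(m: int, s: int) -> bool:
    """Recover s^e mod n with the public key (n, e) and compare it with m."""
    return pow(s, e, n) == m


m = 65
s = sign(m)
genuine = verify(m, s)
tampered = verify(m + 1, s)
print(genuine, tampered)
```

Verification succeeds only for the message that was actually signed, because raising s to the public exponent undoes the private-key operation.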

The signature algorithm described above is based on encryption and provides for assurance that the sender and the message are genuine. In RSA implementations, however, the algorithm used to generate an RSA signature is a bit different.

The signature of a message in RSA implementations is not the actual message encrypted with the sender’s private key. To generate a signature, the sender first generates a hash of the message, using a MIC algorithm supported by both the sender and the recipient, such as MD5. The hash itself is then encrypted with the sender’s private key and attached to the actual message. Therefore, the message is not encrypted, but it has an encrypted hash (called a signature) attached to it that can validate both the sender and the message. Upon receipt of the message, the recipient needs to calculate the same hash for the cleartext message, and then decrypt the signature using the sender’s public key. If the calculated and the decrypted hash for this message match, the message and the sender are genuine. If they do not match, either the sender does not possess the private key (and therefore may be an attacker claiming to be the real sender of the message) or the message has been tampered with (an attacker has maliciously modified the message).
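The hash-then-sign procedure can be sketched as below. Several simplifications are assumed: the key is built from two well-known Mersenne primes (2^89 - 1 and 2^107 - 1), which gives a modulus far too small for real use; SHA-1 is chosen as the MIC; and the hash is signed without the padding that real RSA implementations add. The function names are hypothetical. Python 3.8 or later is needed for the modular inverse via `pow`.

```python
import hashlib

# Toy key from two Mersenne primes (illustration only, not a secure key size).
p = 2**89 - 1
q = 2**107 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and interpret the 160-bit digest as an integer < n."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")


def sign(message: bytes) -> int:
    """Encrypt the hash of the message with the private key (n, d)."""
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare the hashes."""
    return pow(signature, e, n) == digest_as_int(message)


message = b"Pay Bob 100 dollars"
signature = sign(message)
genuine = verify(message, signature)
tampered = verify(b"Pay Eve 100 dollars", signature)
print(genuine, tampered)
```

The message itself travels in the clear; only the short hash goes through the expensive private-key operation, which is why signing a hash is much cheaper than signing the whole message.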

DSA/DSS (Asymmetric Algorithm)

The Digital Signature Algorithm (DSA) is another public key cryptographic technique used for data signing. The use of the DSA for data signing is defined as the NIST Digital Signature Standard (DSS) in [131].

DSA is very similar to RSA but does not provide for message encryption, only for message signing. Key sizes for DSA are 512 or 1024 bits. DSA uses SHA-1 as the hashing algorithm.

It is important to note that RSA signature generation is a relatively slow process, whereas RSA signature verification is a fast process. With DSA, signature generation is a fast process while signature verification is a slow, processor-intensive process. Because a signature is generated only once and then verified many times, RSA supporters claim that the RSA signature is more efficient than DSS because it consumes fewer cycles. Considering the processing power of today’s computers, however, this is becoming less important.

DSA is currently considered a secure algorithm and is used in a number of applications and authentication schemes.

In document User Identification and Authentication Concepts (Page 59-64)