**CHAPTER 4 Cryptology**

4.5. Digital Signatures

Public-key cryptosystems allow several use-cases which symmetric cryptosystems do
not. One which has come to have more and more importance in the modern digital economy
is the creation of *digital signatures*: these are parts of electronic documents which are
supposed to have something of the qualities of a physical signature, in that they are hard for an
imposter to forge. Such a signed document might be needed, for example, if Bob from
the last section (whose RSA public key is on his website) wished to send a legally binding
contract via e-mail. Perhaps Alice and Bob wish to e-mail to their future landlord Larry a
signed lease for an apartment that they will share. When Larry gets an e-mail from Bob
saying “I agree to be bound by the terms of this lease,” Larry needs to have confidence that
this e-mail did originate from Bob, which he can have if the e-mail carries a digital signature.

Here’s what Bob can do: he takes a copy of the lease, adds a section at the end stating
his agreement to its terms and giving some personally identifying information (perhaps
a scan of his driver’s license). Call this whole chunk of data $m$. Then Bob applies his
*decryption* algorithm, using his private (decryption) key $k_d$, yielding $s = d_{k_d}(m)$; this $s$ is
called Bob’s signature on the message $m$. He then e-mails both $m$ and $s$ to Larry.

When Larry receives this signed message, the first thing he does is detach the signature $s$ and compute its encryption $e_{k_e}(s)$ using the public key he got off Bob’s website. Since $e_{k_e}$ and $d_{k_d}$ are inverses and it does not matter in which order they are applied, the result should be $m$. If that is so, Larry can be sure that whoever sent the message also had access to Bob’s secret key and so presumably is Bob himself.

Graphically:

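Basic digital signatures:

| Larry | on public network | Bob |
| --- | --- | --- |
| | | pick $k_d \in \mathcal{K}_d$ |
| | | compute $k_e = E(k_d)$ |
| download $k_e$ | public key $k_e$ | publish $k_e$ |
| | | message $m \in \mathcal{M}$ |
| | | compute $s = d_{k_d}(m)$ |
| receive $(m, s)$ | signed message $(m, s)$ | transmit $(m, s)$ |
| if $e_{k_e}(s) = m$, ACCEPT; otherwise, REJECT | | |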
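The exchange above can be sketched with textbook RSA, the cryptosystem Bob is said to use. This is only an illustration under loud assumptions: the primes 61 and 53, the exponent 17, the integer message 1234, and the function names `sign` and `verify` are all made up for this sketch, the key is far too small to be secure, and real signature schemes also pad and hash the message.

```python
# Toy RSA key for Bob (assumed values, far too small for real use).
p, q = 61, 53
n = p * q                    # public modulus, 3233
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public (encryption) exponent
d = pow(e, -1, phi)          # private (decryption) exponent; needs Python 3.8+


def sign(m: int) -> int:
    """Bob: apply the decryption map, with the private key, to the message."""
    return pow(m, d, n)


def verify(m: int, s: int) -> bool:
    """Larry: apply the encryption map, with the public key, to s and compare with m."""
    return pow(s, e, n) == m


m = 1234                     # the message, encoded as an integer smaller than n
s = sign(m)                  # Bob transmits the pair (m, s)

print(verify(m, s))          # True: ACCEPT
print(verify(m + 1, s))      # False: the message no longer matches the signature, REJECT
```

Only Bob knows `d`, so only he can produce an `s` that `verify` accepts, while anyone holding the public pair `(n, e)` can run the check.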

One problem with this scheme is that it has effectively doubled the size of the message. The way to make a smaller, more efficient signature is for it to consist of the decryption not of all of $m$ but instead of some function $h(m)$. Here the function $h$ should take a message of arbitrary size and produce a small, digested piece of data ... which nevertheless depends upon every part of the input $m$. After all, if $h(m)$ depended only upon the first 100 bits of $m$, for example, then a malicious Eve could alter the message in transit, and her change would go undetected as long as she did not change the first 100 bits of the message.
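To make the danger of a digest that ignores part of its input concrete, here is a small sketch. The function `prefix_digest` and the two lease sentences are invented for illustration; SHA-256 from Python's standard `hashlib` module stands in for a digest that depends on every part of the message.

```python
import hashlib


def prefix_digest(m: bytes) -> int:
    """A deliberately bad digest: only the first 100 bits of m."""
    padded = m[:13].ljust(13, b"\x00")          # 13 bytes = 104 bits
    return int.from_bytes(padded, "big") >> 4   # drop 4 bits, leaving 100


def full_digest(m: bytes) -> bytes:
    """A digest that depends on the whole message (SHA-256)."""
    return hashlib.sha256(m).digest()


original = b"I agree to pay rent of $1000 per month."
tampered = b"I agree to pay rent of $9000 per month."   # Eve's altered copy

# Eve's change lies beyond the first 100 bits, so the bad digest cannot see it.
print(prefix_digest(original) == prefix_digest(tampered))   # True
# A digest of the whole message changes.
print(full_digest(original) == full_digest(tampered))       # False
```

A signature computed on `prefix_digest(original)` would therefore remain valid for Eve's altered message, which is exactly the failure described above.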

Cryptologists have a name for functions like this $h$.

DEFINITION 4.5.1. A function $h$ which takes as input arbitrary length strings of bits and produces output bit strings of a fixed length is called a *cryptographic hash function* if it satisfies

- ease of computation: it is feasible to compute $h(m)$ for any $m$;
- pre-image resistance: given a hash value $t$, it is infeasible to find an $m$ such that $h(m) = t$;
- second pre-image resistance: given a specific input $m_1$, it is infeasible to find another $m_2 \neq m_1$ such that $h(m_2) = h(m_1)$;
- collision resistance: it is infeasible to find two distinct messages $m_1 \neq m_2$ such that $h(m_2) = h(m_1)$.

The words *feasible* and *infeasible* have the same meaning here as in the previous
section: that it is, or is not, possible to complete the computation in an amount of time
bounded by a polynomial function of the size of the inputs.
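The input and output shape described in the definition can be seen with any widely deployed hash function. The sketch below uses SHA-256 from Python's standard `hashlib` module purely as an example; the helper names `h` and `bit_difference` and the sample inputs are assumptions of this sketch, and nothing here proves any of the resistance properties.

```python
import hashlib


def h(m: bytes) -> bytes:
    """Example hash function: SHA-256, which always returns 32 bytes."""
    return hashlib.sha256(m).digest()


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


short = b"lease"
long_ = b"lease" * 100_000        # a 500,000-byte input

# Arbitrary-length inputs, fixed-length outputs (in bits).
print(len(h(short)) * 8, len(h(long_)) * 8)   # 256 256

# Changing one character of the input changes many bits of the output.
print(bit_difference(h(b"I agree"), h(b"I agrea")))
```

Both digests are computed almost instantly, which is the "ease of computation" requirement; the resistance requirements, by contrast, are claims about what no one knows how to do efficiently.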

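Notice that since a hash function takes inputs of arbitrary length but has a fixed output size, there will necessarily be an infinite number of collisions, by the pigeonhole principle. Collision resistance does not ask that collisions be absent, only that they be infeasible to find.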
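The counting argument can be run directly on a hash with a very short output. In this sketch, `tiny_hash` is a made-up function that keeps only the first 16 bits of SHA-256, so it has just 65536 possible outputs and any 65537 distinct inputs must contain a collision.

```python
import hashlib


def tiny_hash(m: bytes) -> bytes:
    """Hypothetical 16-bit hash: the first 2 bytes of SHA-256."""
    return hashlib.sha256(m).digest()[:2]


def find_collision():
    """Return two distinct inputs with the same tiny_hash value."""
    seen = {}                               # hash value -> first input that produced it
    for i in range(2**16 + 1):              # 65537 inputs, only 65536 possible outputs
        m = str(i).encode()
        t = tiny_hash(m)
        if t in seen:
            return seen[t], m
        seen[t] = m
    raise AssertionError("unreachable: the pigeonhole principle forces a collision")


m1, m2 = find_collision()
print(m1 != m2, tiny_hash(m1) == tiny_hash(m2))   # True True
```

With a 16-bit output the search finishes immediately. A practical hash function has an output long enough that the same brute-force search is infeasible, even though collisions still exist.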

The creation of cryptographic hash functions is something of a black art. It turns out
that if one builds a candidate hash function with some clear structure (usually mathematical),
particularly if it is one that is fast to compute, a way to break one of the resistance
requirements is usually found by the cryptological community. For this reason, the algorithms
currently in wide use tend to be very *ad hoc* computations that just seem messy and
have resisted attempts at inversion or breaking resistance.

EXAMPLE 4.5.2. For around a decade starting in the early 1990s, the most widely used cryptographic hash function was called md5. This algorithm was developed by Ron Rivest and published in 1992. The output size of md5 is 128 bits.

While md5 had been thought to be flawed since the mid-1990s, a real attack was not published until 2004, when it was shown not to be collision resistant [WY05]. However,
md5 is still used extensively today to verify that a large data transfer has not suffered a
transmission error, *i.e.,* it is still a useful tool to test for non-malicious data corruption.
(In this context of providing evidence for data integrity against non-malicious corruption,
a hash function is frequently called a *fingerprint*.)
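As a sketch of this fingerprint use, the following computes md5 values with Python's standard `hashlib` module and compares them before and after a simulated transfer. The data, the flipped bit, and the function name `fingerprint` are invented for illustration, and some restricted Python builds may refuse to provide md5 at all. This checks only for accidental corruption; it offers no protection against a deliberate attacker.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """md5 fingerprint as 32 hexadecimal digits (128 bits)."""
    return hashlib.md5(data).hexdigest()


sent = b"large data transfer " * 1000
expected = fingerprint(sent)          # published alongside the data

received_ok = bytes(sent)             # arrived intact
corrupted = bytearray(sent)
corrupted[500] ^= 0x01                # simulate a single flipped bit in transit
received_bad = bytes(corrupted)

print(len(expected) * 4)                       # 128
print(fingerprint(received_ok) == expected)    # True: transfer looks intact
print(fingerprint(received_bad) == expected)   # False: corruption detected
```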