Step 3 (Comparing the distributions)
7.3 Implementation and Efficiency
In this section, we present some consideration regarding the implementation. At the end of this section, we compare our algorithms with EIA1 and EIA3, two MACs deployed in the 4G LTE system.
7.3.1
Selection of Fields and APN Function
The length of a MAC tag is usually some power of two. Several decades ago, people believed that 32-bit tag was enough. For a 32-bit tag, the complexity of the birthday attack is 216,
which is too huge to be computed at that time. Therefore, the tag sizes of some legacy systems are still 32-bit, such as cellular, Wi-Fi, and etc. Since the computers become more powerful, 216 is easy to compute even for a personal laptop computer. Usually, a powerful
server has the capability to do the exhaustive search beyond 248. Therefore, nowadays even
128 or 256, for example TUAK [53]. We want our algorithm to work with both the legacy systems and the modern systems. For this reason, our design has four versions, 32-bit, 64-bit, 128-bit and 256-bit.
Selection of Finite Fields.
The multiplication over finite filed is a necessary block of our algorithm. Assume the defining polynomial of the finite field GF (2n) over GF (2) is
f (x) = xn+ g(x).
To make the multiplication more efficient, we want both the degree and the number of terms of g(x) to be as small as possible. Usually, for a USIM card, there is an 8-bit chip inside, which means it can compute 8-bit XOR simultaneously. The 8-bit platform is currently the smallest platform considered by us. Therefore, we restrict the degree of g(x) to be smaller than eight. By exhaustive search, we find the defining polynomial for each version of our design. The defining polynomials are listed in Table 7.3. Note that for the field GF (2256), we cannot find a polynomial satisfies our criterion. Thus, we loosen our
condition, and find a polynomial that the degree of g(x) is ten. It is not efficient on the 8-bit platform, because the XOR is computed in two clock cycles. But it is still efficient on the 16-bit and higher platforms.
Table 7.3: Defining Polynomials
Version Field Defining Polynomial
32-bit GF (232) x32+ x7+ x6+ x2 + 1
64-bit GF (264) x64+ x4+ x3+ x + 1
128-bit GF (2128) x128+ x7+ x2 + x + 1
256-bit GF (2256) x256+ x10+ x9+ x8+ x7+ x4+ x2+ x + 1
Selection of APN Function.
Another critical block is the APN function. There are several constructions of APN func- tions. Among all those constructions, we want the function has the following properties.
• Be efficient to compute.
Proposition 6 in [35] suggests one construction that has the form of x3+ αT r(βx3+ γx9),
where α, β, γ ∈ GF (2n), α 6= 0 and n is even. We know that this construction is EA-
equivalent to x3. Since the computation of x3 is more efficient than x3+ αT r(βx3+ γx9),
we choose x3 as our function.
If the field element is represented under a normal basis, the square is simply cyclic shifting one bit. To compute x3, we may first compute x2, and then compute x2· x.
7.3.2
Experiment Result of Efficiency
Since our MACs are based on APN function, we call it AMAC. We implement both Con- struction I and Construction II, which are called AMAC I and AMAC II respectively.
0 50 100 150 0 200 400 600 Length (10 bits) Clo ck cycle Efficiency Comparison EIA1 EIA3 AMAC AMACII
Figure 7.1: Efficiency comparison among EIA1, EIA3, and AMAC
Figure 7.1 shows the comparison of EIA1, EIA3, and our algorithms. In this test, we choose ZUC as our underlying cipher. From the figure we can clearly see that although our algorithms are slower than EIA3, they are overwhelmingly faster than EIA1.
0 50 100 150 0 500 1,000 1,500 2,000 Length (10 bits) Clo ck cycle Efficiency Comparison AMAC (ZUC) AMAC (Snow3G) AMACII (ZUC) AMACII (Snow3G)
Figure 7.2: Efficiency comparison when using ZUC and Snow3G as underlying ciphers Since we generate random numbers in each round, the efficiency of our algorithms highly depend on the underlying cipher. Figure 7.2 demonstrates the difference between different underlying ciphers. Obviously, ZUC is much faster than Snow3G.
7.4
Summary
We improved EIA1 such that the new variant is resistant to the linear forgery attack. Af- ter that, we proposed two new MAC constructions using APN function. Compared with previous works based on APN function, ours can take messages with any length as the input, and output a fixed-length tag as MAC. Such design is more flexible and practical. Both constructions are analog to XOR-MAC. However, both constructions have simpler round functions compared with XOR-MAC. Therefore, we have better performance than MACs based on block ciphers or keyed hash functions. Moreover, we have solid mathe- matical security proofs for both constructions. We compare the security and efficiency of our algorithm with two well known stream cipher based MACs, EIA1 and EIA3, which are deployed in 4G LTE cellular network. Our algorithms resist to the cycling attack and linear forgery attack, which can be applied to EIA1 and EIA3. The experiments show that our algorithms are slower than EIA3 but faster than EIA1.
Chapter 8
Conclusion and Future Work
8.1
Conclusion
This thesis first proposes a practical attack on EIA1 called linear forgery attack, which exploits the linear structure of EIA1. Known two message-MAC pairs, the opponent can forge up to 232 valid pairs.
Inspired by the linear forgery attack, this thesis assesses the security of EIA1 under different models. The security proof of EIA1 is only a special case of the analysis in this thesis. The assessment shows that when the assumption of one-time pad is broken, EIA1 is not safe at all. In addition to the linear forgery attack, attackers can even recover the key stream used in EIA1. Broking the one-time pad assumption is sometimes referred to as misuse. Several researchers have begun to study the misuse-resistant MACs, which have some level of security when misuse happens. After the security assessment, several optimizations of EIA1 are presented, and a more efficient polynomial evaluation algorithm is introduced to replace the Horner’s Rule. The experiments suggest that the optimized version is much faster than the official implementation of EIA1.
For TUAK, this thesis suggests to use different constant INSTANCE for function f2 -
f5 to follow the recommendation of Keccak. After this modification, we theoretically prove
that f1, f1∗ and f2 are secure MACs, and f3, f4, f5 and f5∗ are secure KDFs. Specifically,
f1, f1∗ and f2 resist the universal key recovery, substitution forgery, pre-image and second
pre-image attacks, and f3, f4, f5 and f5∗ resist universal key recovery, pre-image and second
pre-image attacks.
To assess TUAK, a new crypto analysis method called Multi-Output Filtering Model is proposed. This model is inspired by the system test using LFSR in communication
systems. An LFSR generates the input, and the cipher works as the filtering function. The output can be treated as a sequence or the output of the boolean function. MOFM reveals the randomness of the cryptographic primitive being tested. This thesis applies MOFM to TUAK, AES, PRESENT and KASUMI. The results suggest that TUAK and AES have very good randomness properties, but PRESENT and KASUMI show some non randomness.
After studying existing MACs, such as EIA1, EIA3, TUAK, this thesis proposes two novel MAC constructions. The first called WGIA-128 is a variant of EIA1. The addition over finite filed used in EIA1 is replaced by the addition over ring in order to resist the linear forgery attack. The second called AMAC is quite different from EIA1 and EIA3, and it is based on APN functions. In addition to the linear forgery attack, AMAC is also resistant to the cycling attack.