4.3 Polynomial time I: Algorithms in arithmetic
4.3.2 Gaussian elimination
The basic operations of linear algebra are polynomial: addition and inner product of vectors, multiplication and inversion of matrices, the computation of determinants. However, these facts are non-trivial in the last two cases, so we will deal with them in detail.
Let $A = (a_{ij})$ be an arbitrary $n \times n$ matrix consisting of integers.
Let us verify, first of all, that the polynomial computation of $\det(A)$ is not inherently impossible, in the sense that the result can be written down with polynomially many bits. Let $K = \max |a_{ij}|$; then to write down the matrix $A$ we obviously need at least $L = n^2 + \log K$ bits. On the other hand, the definition of determinants gives
\[ |\det(A)| \le n!\,K^n, \]
hence $\det(A)$ can be written down using
\[ \log(n!\,K^n) + O(1) \le n(\log n + \log K) + O(1) \]
bits. This is polynomial in $L$.
Linear algebra gives a formula for each element of $A^{-1}$ as the quotient of two subdeterminants of $A$. This shows that $A^{-1}$ can also be written down with polynomially many bits.
The usual procedure to compute the determinant is Gaussian elimination. We can view this as the transformation of the matrix into a lower triangular matrix with column operations. These transformations do not change the determinant, and in the final triangular matrix, the computation of the determinant is trivial: we just multiply the diagonal elements to obtain it. It is also easy to obtain the inverse matrix from this form; we will not deal with this issue separately.
Gaussian elimination. Suppose that for all $i$ such that $1 \le i \le t$, we have already achieved that in the $i$th row, only the first $i$ entries may be nonzero. Pick a nonzero element from the last $n-t$ columns (stop if there is no such element). Call this element the pivot element of this stage. Rearrange the rows and columns so that this element gets into position $(t+1, t+1)$. Subtract column $t+1$, multiplied by $a_{t+1,i}/a_{t+1,t+1}$, from column $i$ for all $i = t+2, \dots, n$, in order to get 0's in the positions $(t+1, t+2), \dots, (t+1, n)$. These subtractions do not change the value of the determinant, and the rearrangement changes at most the sign, which is easy to keep track of.
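The elimination step just described can be sketched in a few lines of Python. This is only an illustration under some assumptions of ours: the function name `det_by_column_elimination` is hypothetical, the entries are stored as exact `Fraction` values (which are always kept in simplified form), and the pivot is searched only in row $t+1$, so that column swaps alone suffice. If that row has no nonzero entry among the remaining columns, the first $t+1$ rows are linearly dependent and the determinant is 0.

```python
from fractions import Fraction


def det_by_column_elimination(matrix):
    """Determinant of a square integer matrix via column operations."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    sign = 1
    for t in range(n):
        # Simplification (our assumption): look for the pivot in row t only.
        pivot_col = next((j for j in range(t, n) if a[t][j] != 0), None)
        if pivot_col is None:
            # Rows 0..t live in the span of the first t coordinates.
            return Fraction(0)
        if pivot_col != t:
            # Swapping two columns changes the sign of the determinant.
            for row in a:
                row[t], row[pivot_col] = row[pivot_col], row[t]
            sign = -sign
        for i in range(t + 1, n):
            factor = a[t][i] / a[t][t]
            if factor != 0:
                # Subtract factor * (column t) from column i.
                # Rows above t already have a zero in column t.
                for r in range(t, n):
                    a[r][i] -= factor * a[r][t]
    result = Fraction(sign)
    for t in range(n):
        result *= a[t][t]
    return result


print(det_by_column_elimination([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

The sketch uses $O(n^3)$ arithmetic operations on rationals; it says nothing yet about how large the numerators and denominators become.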
Since one iteration of the Gaussian elimination uses $O(n^2)$ arithmetic operations and $n$ iterations must be performed, this procedure uses $O(n^3)$ arithmetic operations. But the problem is that we must also divide, and not with remainder. This does not cause a problem over a finite field, but it does in the case of the rational field. We assumed that the elements of the original matrix are integers; but during the run of the algorithm, matrices also occur that consist of rational numbers. In what form should these matrix elements be stored? The natural answer is: as pairs of integers (whose quotient is the rational number).
Do we require that the fractions be in simplified form, i.e., that their numerator and denominator be relatively prime to each other? We could do so; then we have to simplify each matrix element after each iteration, for which we would have to perform the Euclidean algorithm. This can be performed in polynomial time, but it is a lot of extra work, and it is desirable to avoid it. (Of course, we also have to show that in the simplified form, the occurring numerators and denominators have only polynomially many digits. This will follow from the discussions below.)
We could also choose not to require that the matrix elements be in simplified form. Then we define the sum and product of two rational numbers $a/b$ and $c/d$ by the following formulas: $(ad+bc)/(bd)$ and $(ac)/(bd)$. With this convention, the problem is that the numerators and denominators occurring in the course of the algorithm can become very large (have a nonpolynomial number of digits)!
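The growth can already be seen on a toy computation. The following sketch is not the elimination itself; it only applies the unsimplified addition rule $(ad+bc)/(bd)$ repeatedly to $x + x$, starting from $1/2$. The helper names are ours. The stored denominator is squared at every step, so its number of bits doubles, although the value of the fraction stays small.

```python
def add_unreduced(x, y):
    """Add two fractions stored as (numerator, denominator), no cancelling."""
    (a, b), (c, d) = x, y
    return (a * d + b * c, b * d)


def denominators_after_doubling(start, steps):
    """Replace x by x + x repeatedly and record the stored denominators."""
    x = start
    history = [x[1]]
    for _ in range(steps):
        x = add_unreduced(x, x)
        history.append(x[1])
    return x, history


final, history = denominators_after_doubling((1, 2), 5)
print(history)                 # each denominator is the square of the previous
print(final[1].bit_length())   # bits needed for the last denominator
```

After $s$ steps the denominator is $2^{2^s}$, which needs exponentially many bits in $s$, while the simplified value is only $2^{s-1}$.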
Fortunately, we can give a procedure that stores the fractions in partially simplified form, and avoids both the simplification and the excessive growth of the number of digits. For this, let us analyze a little the matrices occurring during Gaussian elimination. We can assume that the pivot elements are, as they come, in positions $(1,1), \dots, (n,n)$, i.e., we do not have to permute the rows and columns. Let $(a^{(k)}_{ij})$ ($1 \le i, j \le n$) be the matrix obtained after $k$ iterations. Let us denote the elements in the main diagonal of the final matrix, for simplicity, by $d_1, \dots, d_n$ (thus, $d_i = a^{(n)}_{ii}$). Let $D^{(k)}$ denote the submatrix determined by the first $k$ rows and columns of matrix $A$, and let $D^{(k)}_{ij}$, for $k+1 \le i, j \le n$, denote the submatrix determined by the first $k$ rows and the $i$th row and the first $k$ columns and the $j$th column. Let $d^{(k)}_{ij} = \det(D^{(k)}_{ij})$. Obviously, $\det(D^{(k)}) = d^{(k-1)}_{kk}$.
Lemma 4.3.3
\[ a^{(k)}_{ij} = \frac{d^{(k)}_{ij}}{\det(D^{(k)})}. \]
Proof. If we compute $\det(D^{(k)}_{ij})$ using Gaussian elimination, then in its main diagonal we obtain the elements $d_1, \dots, d_k, a^{(k)}_{ij}$. Thus
\[ d^{(k)}_{ij} = d_1 \cdots d_k \cdot a^{(k)}_{ij}. \]
Similarly,
\[ \det(D^{(k)}) = d_1 \cdots d_k. \]
Dividing these two equations by each other, we obtain the lemma. □
By this lemma, every number occurring in the Gaussian elimination can be represented as a fraction both the numerator and the denominator of which are determinants of submatrices of the original matrix $A$. In this way, a polynomial number of digits is certainly enough to represent all the fractions obtained.
However, it is not necessary to compute the simplifications of all fractions obtained in the process. By the definition of Gaussian elimination we have that
\[ a^{(k+1)}_{ij} = a^{(k)}_{ij} - \frac{a^{(k)}_{i,k+1}\, a^{(k)}_{k+1,j}}{a^{(k)}_{k+1,k+1}}, \]
and hence
\[ d^{(k+1)}_{ij} = \frac{d^{(k)}_{ij}\, d^{(k)}_{k+1,k+1} - d^{(k)}_{i,k+1}\, d^{(k)}_{k+1,j}}{d^{(k-1)}_{k,k}}. \]
This formula can be considered as a recurrence for computing the numbers $d^{(k)}_{ij}$. Since the left-hand side is an integer, the division can be carried out exactly. Using the above considerations, we find that the number of digits in the quotient is polynomial in terms of the size of the input.
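A minimal sketch of this recurrence in Python follows. The function name `det_fraction_free` is ours, and we make two assumptions that the text does not spell out: the recurrence starts from $d^{(0)}_{ij} = a_{ij}$ with the convention $\det(D^{(0)}) = 1$, and a row swap (with a sign change) is performed when a zero pivot is met, whereas the text assumes that no permutation is needed. All intermediate values are integers, so the exact division can be done with `//`.

```python
def det_fraction_free(matrix):
    """Determinant of an integer matrix via the integer-only recurrence."""
    n = len(matrix)
    if n == 0:
        return 1
    d = [list(row) for row in matrix]   # d^{(0)}_{ij} = a_{ij}
    prev_pivot = 1                      # convention: det(D^{(0)}) = 1
    sign = 1
    for k in range(n - 1):
        if d[k][k] == 0:
            # Our addition: swap in a later row with a nonzero entry.
            swap = next((r for r in range(k + 1, n) if d[r][k] != 0), None)
            if swap is None:
                return 0
            d[k], d[swap] = d[swap], d[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division is exact, since the result is a subdeterminant.
                d[i][j] = (d[i][j] * d[k][k] - d[i][k] * d[k][j]) // prev_pivot
        prev_pivot = d[k][k]
    return sign * d[n - 1][n - 1]


print(det_fraction_free([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

No greatest common divisor is ever computed, and every stored number is a subdeterminant of the input, so its size is bounded as in the discussion above.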
There are at least two further possibilities to remedy the problem of the fractions occurring in Gaussian elimination.
We can approximate the numbers by binary “decimals” of limited accuracy (as it seems natural from the point of view of computer implementation), allowing, say, $p$ bits after the binary “decimal point”. Then the result is only an approximation, but since the determinant is an integer, it is enough to compute it with an error smaller than 1/2. Using the methods of numerical analysis, it can be determined how large $p$ must be chosen to make the error in the end result smaller than 1/2. It turns out that a polynomial number of digits is enough, and this leads to a polynomial algorithm.
The third possibility is based on the remark that if $m > 2|\det(A)|$ then it is enough to determine the value of $\det(A)$ modulo $m$ (the factor 2 leaves room to recover the sign as well). If $m$ is a prime number then computing modulo $m$, we don't have to use fractions at all. Since we know that $|\det(A)| \le n!\,K^n$, it is enough to choose for $m$ a prime number greater than $2\,n!\,K^n$.
It is, however, not quite easy to select such a large prime (see the section on randomized algorithms). An easier method is to choose $m$ as the product of different small primes: $m = 2 \cdot 3 \cdots p_k$, where for $k$ we can choose, e.g., the total number of bits occurring in the representation of $A$. Then it is easy to compute the remainder of $\det(A)$ modulo $p_i$ for all $p_i$, using Gaussian elimination in the field of residue classes modulo $p_i$. Then we can compute the remainder of $\det(A)$ modulo $m$ using the Chinese Remainder Theorem. (Since $k$ is small we can afford to find the first $k$ primes simply by brute force. But the cost of this computation must be judged differently anyway since the same primes can then be used for the computation of arbitrarily many determinants.)
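The modular method with small primes might be sketched as follows. All function names are ours. As a design choice for this sketch, we keep taking primes until their product exceeds $2\,n!\,K^n$, so that the sign of the determinant can be recovered from the residue of smallest absolute value. Inverses modulo a prime $p$ are computed as $a^{p-2}$ by Fermat's theorem.

```python
from math import factorial, isqrt


def small_primes():
    """Yield 2, 3, 5, ... found by brute-force trial division."""
    candidate = 2
    while True:
        if all(candidate % d for d in range(2, isqrt(candidate) + 1)):
            yield candidate
        candidate += 1


def det_mod_prime(matrix, p):
    """Determinant modulo the prime p by elimination in the field Z_p."""
    n = len(matrix)
    a = [[x % p for x in row] for row in matrix]
    det = 1
    for t in range(n):
        pivot_row = next((r for r in range(t, n) if a[r][t]), None)
        if pivot_row is None:
            return 0
        if pivot_row != t:
            a[t], a[pivot_row] = a[pivot_row], a[t]
            det = -det
        det = det * a[t][t] % p
        inv = pow(a[t][t], p - 2, p)     # inverse of the pivot modulo p
        for r in range(t + 1, n):
            factor = a[r][t] * inv % p
            for c in range(t, n):
                a[r][c] = (a[r][c] - factor * a[t][c]) % p
    return det % p


def det_by_crt(matrix):
    """Combine the residues of det(A) modulo small primes."""
    n = len(matrix)
    K = max((abs(x) for row in matrix for x in row), default=0)
    bound = factorial(n) * K ** n        # |det(A)| <= n! K^n
    m, residue = 1, 0
    for p in small_primes():
        if m > 2 * bound:
            break
        r = det_mod_prime(matrix, p)
        # Chinese remaindering: keep residue mod m, force value r mod p.
        step = (r - residue) * pow(m % p, p - 2, p) % p
        residue += m * step
        m *= p
    # Pick the representative of smallest absolute value.
    return residue - m if residue > m // 2 else residue


print(det_by_crt([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

The eliminations modulo different primes are independent of each other, so they could also be run in parallel.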
Remark 4.3.3 The modular method is successfully applicable in a number of other cases. One way to look at this method is to consider it as an encoding of the integers in a way different from the binary (or decimal) number system: we code the integer $n$ by its remainders after division by the primes 2, 3, etc. This is an infinite number of bits, but if we know in advance that no number occurring in the computation is larger than $N$ then it is enough to consider the first $k$ primes whose product is larger than $N$. In this encoding, the arithmetic operations can be performed very simply, and even in parallel for the different primes. Comparison by magnitude is, however, awkward.
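As a small illustration of this encoding, the sketch below represents nonnegative integers below $2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 = 30030$ by their remainders. The names `PRIMES`, `encode`, `add`, `mul` and `decode` are ours, and the choice of six primes is arbitrary.

```python
PRIMES = (2, 3, 5, 7, 11, 13)    # product 30030; an arbitrary choice


def encode(n):
    """Code the integer n by its remainders modulo the chosen primes."""
    return tuple(n % p for p in PRIMES)


def add(u, v):
    """Componentwise addition; each coordinate is independent."""
    return tuple((a + b) % p for a, b, p in zip(u, v, PRIMES))


def mul(u, v):
    """Componentwise multiplication; each coordinate is independent."""
    return tuple((a * b) % p for a, b, p in zip(u, v, PRIMES))


def decode(residues):
    """Recover the integer in [0, 30030) by Chinese remaindering."""
    m, value = 1, 0
    for r, p in zip(residues, PRIMES):
        step = (r - value) * pow(m % p, p - 2, p) % p
        value += m * step
        m *= p
    return value


print(decode(mul(encode(123), encode(45))))
print(decode(add(encode(1000), encode(234))))
```

Addition and multiplication never look at more than one coordinate at a time. Deciding which of two encoded numbers is larger, on the other hand, seems to require decoding them first.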
4.3.3 *Discrete square roots
In this section we discuss the number theoretic algorithm to extract square roots.
We call the integers $0, 1, \dots, p-1$ residues (modulo $p$). Let $p$ be an odd prime. We say that $y$ is a square root of $x$ (modulo $p$), if
\[ y^2 \equiv x \pmod{p}. \]
If $x$ has a square root then it is called a quadratic residue.
Obviously, 0 has only one square root modulo $p$: if $y^2 \equiv 0 \pmod{p}$, then $p \mid y^2$, and since $p$ is a prime, this implies that $p \mid y$. For every other integer $x$, if $y$ is a square root of $x$, then so is $p - y \equiv -y \pmod{p}$. There are no further square roots: indeed, if $z^2 \equiv x$ for some residue $z$, then $p \mid y^2 - z^2 = (y-z)(y+z)$ and so either $p \mid y-z$ or $p \mid y+z$. Thus $z \equiv y$ or $z \equiv -y$ as claimed.
This implies that not every integer has a square root modulo $p$: squaring maps the nonzero residues onto a subset of size $(p-1)/2$, and the other $(p-1)/2$ have no square root.
The following lemma provides an easy way to decide if a residue has a square root.
Lemma 4.3.4 A nonzero residue $x$ has a square root if and only if
\[ x^{(p-1)/2} \equiv 1 \pmod{p}. \tag{4.1} \]
Proof. The “only if” part is easy: if $x$ has a square root $y$, then
\[ x^{(p-1)/2} \equiv y^{p-1} \equiv 1 \pmod{p} \]
by Fermat's “Little” Theorem. Conversely, the polynomial $x^{(p-1)/2} - 1$ has degree $(p-1)/2$, and hence it has at most $(p-1)/2$ “roots” modulo $p$ (this can be proved just like the well-known theorem that a polynomial of degree $n$ has at most $n$ real roots). Since all $(p-1)/2$ nonzero quadratic residues are roots of $x^{(p-1)/2} - 1$, none of the quadratic non-residues can be. □
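The test of the lemma is a single modular exponentiation, which Python's built-in three-argument `pow` carries out by repeated squaring. In the following sketch the function name is ours, $p$ is assumed to be an odd prime, and the residue 0 is treated separately, since it has the square root 0 but does not satisfy (4.1).

```python
def is_quadratic_residue(x, p):
    """Decide whether x has a square root modulo the odd prime p."""
    x %= p
    if x == 0:
        return True                     # 0 is the square of 0
    return pow(x, (p - 1) // 2, p) == 1


print([x for x in range(1, 13) if is_quadratic_residue(x, 13)])
```

For $p = 13$ exactly $(p-1)/2 = 6$ of the nonzero residues pass the test, in agreement with the counting argument above.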
But how to find this square root? For some primes, this is easy:
Lemma 4.3.5 Assume that $p \equiv 3 \pmod{4}$. Then for every quadratic residue $x$, $x^{(p+1)/4}$ is a square root of $x$.
Indeed,
\[ \left( x^{(p+1)/4} \right)^2 = x^{(p+1)/2} = x \cdot x^{(p-1)/2} \equiv x \pmod{p}. \]
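In code, this case is again one modular exponentiation. The sketch below uses a function name of our own and assumes that $p$ is a prime with $p \equiv 3 \pmod 4$; it checks the result, because for a non-residue $x$ the formula does not produce a square root.

```python
def sqrt_mod_p3(x, p):
    """Square root of x modulo a prime p with p % 4 == 3."""
    if p % 4 != 3:
        raise ValueError("this shortcut needs p congruent to 3 modulo 4")
    y = pow(x, (p + 1) // 4, p)
    if y * y % p != x % p:
        raise ValueError("x is not a quadratic residue modulo p")
    return y


print(sqrt_mod_p3(5, 11))   # the other square root is 11 minus this value
```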
The case when $p \equiv 1 \pmod{4}$ is more difficult, and the solution uses randomization. In fact, randomization is only needed in the following auxiliary algorithm:
Lemma 4.3.6 Let $p$ be an odd prime. Then we can find a quadratic non-residue modulo $p$ in randomized polynomial time.
This can be done by selecting a random residue $z \ne 0$, and then testing (using Lemma 4.3.4) whether it is a quadratic residue. If it is, we try another $z$. Since the chance of hitting a non-residue is 1/2, we find one in an average of two trials.
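A sketch of this random search is given below. The function name and the optional `rng` parameter (any object with a `randrange` method, by default the `random` module) are our own choices. For a nonzero residue $z$, the value $z^{(p-1)/2}$ is either $1$ or $-1$ modulo $p$, so the test for a non-residue compares it with $p - 1$.

```python
import random


def find_non_residue(p, rng=random):
    """Return a random quadratic non-residue modulo the odd prime p."""
    while True:
        z = rng.randrange(1, p)          # random nonzero residue
        if pow(z, (p - 1) // 2, p) == p - 1:
            return z                     # z fails the test (4.1)


z = find_non_residue(13)
print(z, pow(z, 6, 13))
```

Each trial succeeds with probability 1/2, so the loop ends after two trials on average, although no fixed bound on the number of trials can be given.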
One could, of course, try to avoid randomization by testing the residues 2, 3, 5, 7, ... to see if they have a square root. Sooner or later we will find a quadratic non-residue. However, it is not known whether the smallest quadratic non-residue will be found in polynomial time this way. It is conjectured that one never has to try more than $O(\log^2 p)$ numbers this way.
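The deterministic search can be written down just as easily; what is not known is a polynomial bound on the number of iterations. In the sketch below (the function name is ours) we simply run through all integers $2, 3, 4, \dots$ instead of the primes only. This finds the same number, since the smallest non-residue is always a prime: a product of quadratic residues is again a quadratic residue.

```python
def least_non_residue(p):
    """Smallest quadratic non-residue modulo the odd prime p, by search."""
    for z in range(2, p):
        if pow(z, (p - 1) // 2, p) == p - 1:
            return z
    raise ValueError("no non-residue found; is p an odd prime?")


print([least_non_residue(p) for p in (7, 13, 17, 41)])
```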
Now let us return to the problem of finding the square root of a residue $x$, in the case when $p$ is a prime satisfying $p \equiv 1 \pmod{4}$. We can write $p - 1 = 2^k q$, where $q$ is odd and $k \ge 2$.
We start with finding a quadratic non-residue $z$. The trick is to find an even power $z^{2t}$ such that $x^q z^{2t} \equiv 1 \pmod{p}$. Then we can take $y = x^{(q+1)/2} z^t \pmod{p}$. Indeed,
\[ y^2 \equiv x^{q+1} z^{2t} \equiv x \pmod{p}. \]
To construct such a power of $z$, we construct for all $j \le k-1$ an integer $t_j > 0$ such that
\[ x^{2^j q} z^{2^{j+1} t_j} \equiv 1 \pmod{p}. \tag{4.2} \]
For $j = 0$, this is just what we need. For $j = k-1$, we can take $t_{k-1} = q$:
\[ x^{2^{k-1} q} z^{2^k q} = x^{(p-1)/2} z^{p-1} \equiv 1 \pmod{p}, \]
since $x$ is a quadratic residue and $z^{p-1} \equiv 1 \pmod{p}$ by Fermat's “Little” Theorem. This suggests that we construct the numbers $t_j$ “backwards” for $j = k-2, k-3, \dots$
Suppose that we have $t_j$, $j > 0$, and we want to construct $t_{j-1}$. We know that
\[ p \mid x^{2^j q} z^{2^{j+1} t_j} - 1 = \left( x^{2^{j-1} q} z^{2^j t_j} - 1 \right) \left( x^{2^{j-1} q} z^{2^j t_j} + 1 \right). \]
We test which of the two factors is a multiple of $p$. If it is the first, we can simply take $t_{j-1} = t_j$. So suppose that it is the second. Now take
\[ t_{j-1} = t_j + 2^{k-j-1} q. \]
Then
\[ x^{2^{j-1} q} z^{2^j t_{j-1}} = x^{2^{j-1} q} z^{2^j t_j + 2^{k-1} q} = x^{2^{j-1} q} z^{2^j t_j} z^{(p-1)/2} \equiv (-1)(-1) = 1, \]
since $z$ is a quadratic non-residue.
This completes the description of the algorithm.
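The whole procedure might be put together as in the following sketch. The function name `sqrt_mod_prime` and the `rng` parameter are ours, and $p$ is assumed to be an odd prime. The variable `t` runs through $t_{k-1}, t_{k-2}, \dots, t_0$ as in (4.2). The exponents $2^j t_j$ are used as they stand, since the built-in `pow` handles large exponents by repeated squaring.

```python
import random


def sqrt_mod_prime(x, p, rng=random):
    """A square root of the quadratic residue x modulo the odd prime p."""
    x %= p
    if x == 0:
        return 0
    if pow(x, (p - 1) // 2, p) != 1:
        raise ValueError("x is not a quadratic residue modulo p")

    # Write p - 1 = 2^k * q with q odd.
    q, k = p - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1

    # Random search for a quadratic non-residue z.
    while True:
        z = rng.randrange(1, p)
        if pow(z, (p - 1) // 2, p) == p - 1:
            break

    t = q                                # t_{k-1} = q
    for j in range(k - 1, 0, -1):
        # The value below is +1 or -1 modulo p, by (4.2) for index j.
        value = pow(x, 2 ** (j - 1) * q, p) * pow(z, 2 ** j * t, p) % p
        if value != 1:
            # Second factor divisible by p: correct t by 2^{k-j-1} q.
            t += 2 ** (k - j - 1) * q
        # Now t equals t_{j-1}.

    return pow(x, (q + 1) // 2, p) * pow(z, t, p) % p


y = sqrt_mod_prime(3, 13)
print(y, y * y % 13)
```

For $k = 1$ the loop is empty and the function still returns a square root, so the sketch also covers the primes $p \equiv 3 \pmod 4$, although the direct formula for that case is simpler. Which of the two square roots is returned may depend on the random choice of $z$.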