A Computational Introduction to Number Theory
and Algebra
(Version 2)
Copyright © 2008 by Victor Shoup <victor@shoup.net>
Theelectronicversion of this work is distributed under the terms and conditions of a Creative Commons license (Attribution-NonCommercial-NoDerivs 3.0):
You are free to copy, distribute, and display the electronic version of this work under the following conditions:
Attribution. You must give the original author credit.
Noncommercial. You may not use the electronic version of this work for commercial purposes.
No Derivative Works. You may not alter, transform, or build upon the electronic version of this work.
For any reuse or distribution, you must make these license terms clear to others.
Any of these conditions can be waived if you get permission from the author.
For more information about the license, visit
creativecommons.org/licenses/by-nd-nc/3.0.
Contents
Preface pagex
Preliminaries xiv
1 Basic properties of the integers 1
1.1 Divisibility and primality 1
1.2 Ideals and greatest common divisors 5
1.3 Some consequences of unique factorization 10
2 Congruences 15
2.1 Equivalence relations 15
2.2 Definitions and basic properties of congruences 16
2.3 Solving linear congruences 19
2.4 The Chinese remainder theorem 22
2.5 Residue classes 25
2.6 Euler’s phi function 31
2.7 Euler’s theorem and Fermat’s little theorem 32
2.8 Quadratic residues 35
2.9 Summations over divisors 45
3 Computing with large integers 50
3.1 Asymptotic notation 50
3.2 Machine models and complexity theory 53
3.3 Basic integer arithmetic 55
3.4 Computing inZn 64
3.5 Faster integer arithmetic (∗) 69
3.6 Notes 71
4 Euclid’s algorithm 74
4.1 The basic Euclidean algorithm 74
4.2 The extended Euclidean algorithm 77
4.4 Speeding up algorithms via modular computation 84 4.5 An effective version of Fermat’s two squares theorem 86
4.6 Rational reconstruction and applications 89
4.7 The RSA cryptosystem 99
4.8 Notes 102
5 The distribution of primes 104
5.1 Chebyshev’s theorem on the density of primes 104
5.2 Bertrand’s postulate 108
5.3 Mertens’ theorem 110
5.4 The sieve of Eratosthenes 115
5.5 The prime number theorem . . . and beyond 116
5.6 Notes 124
6 Abelian groups 126
6.1 Definitions, basic properties, and examples 126
6.2 Subgroups 132
6.3 Cosets and quotient groups 137
6.4 Group homomorphisms and isomorphisms 142
6.5 Cyclic groups 153
6.6 The structure of finite abelian groups (∗) 163
7 Rings 166
7.1 Definitions, basic properties, and examples 166
7.2 Polynomial rings 176
7.3 Ideals and quotient rings 185
7.4 Ring homomorphisms and isomorphisms 192
7.5 The structure ofZ∗n 203
8 Finite and discrete probability distributions 207
8.1 Basic definitions 207
8.2 Conditional probability and independence 213
8.3 Random variables 221
8.4 Expectation and variance 233
8.5 Some useful bounds 241
8.6 Balls and bins 245
8.7 Hash functions 252
8.8 Statistical distance 260
8.9 Measures of randomness and the leftover hash lemma (∗) 266
8.10 Discrete probability distributions 270
Contents vii
9 Probabilistic algorithms 277
9.1 Basic definitions 278
9.2 Generating a random number from a given interval 285
9.3 The generate and test paradigm 287
9.4 Generating a random prime 292
9.5 Generating a random non-increasing sequence 295
9.6 Generating a random factored number 298
9.7 Some complexity theory 302
9.8 Notes 304
10 Probabilistic primality testing 306
10.1 Trial division 306
10.2 The Miller–Rabin test 307
10.3 Generating random primes using the Miller–Rabin test 311 10.4 Factoring and computing Euler’s phi function 320
10.5 Notes 324
11 Finding generators and discrete logarithms inZ∗p 327
11.1 Finding a generator forZ∗p 327
11.2 Computing discrete logarithms inZ∗p 329
11.3 The Diffie–Hellman key establishment protocol 334
11.4 Notes 340
12 Quadratic reciprocity and computing modular square roots 342
12.1 The Legendre symbol 342
12.2 The Jacobi symbol 346
12.3 Computing the Jacobi symbol 348
12.4 Testing quadratic residuosity 349
12.5 Computing modular square roots 350
12.6 The quadratic residuosity assumption 355
12.7 Notes 357
13 Modules and vector spaces 358
13.1 Definitions, basic properties, and examples 358
13.2 Submodules and quotient modules 360
13.3 Module homomorphisms and isomorphisms 363
13.4 Linear independence and bases 367
13.5 Vector spaces and dimension 370
14 Matrices 377
14.1 Basic definitions and properties 377
14.2 Matrices and linear maps 381
14.4 Gaussian elimination 388
14.5 Applications of Gaussian elimination 392
14.6 Notes 398
15 Subexponential-time discrete logarithms and factoring 399
15.1 Smooth numbers 399
15.2 An algorithm for discrete logarithms 400
15.3 An algorithm for factoring integers 407
15.4 Practical improvements 414
15.5 Notes 418
16 More rings 421
16.1 Algebras 421
16.2 The field of fractions of an integral domain 427
16.3 Unique factorization of polynomials 430
16.4 Polynomial congruences 435
16.5 Minimal polynomials 438
16.6 General properties of extension fields 440
16.7 Formal derivatives 444
16.8 Formal power series and Laurent series 446
16.9 Unique factorization domains (∗) 451
16.10 Notes 464
17 Polynomial arithmetic and applications 465
17.1 Basic arithmetic 465
17.2 Computing minimal polynomials inF[X]/(f)(I) 468
17.3 Euclid’s algorithm 469
17.4 Computing modular inverses and Chinese remaindering 472 17.5 Rational function reconstruction and applications 474
17.6 Faster polynomial arithmetic (∗) 478
17.7 Notes 484
18 Linearly generated sequences and applications 486
18.1 Basic definitions and properties 486
18.2 Computing minimal polynomials: a special case 490 18.3 Computing minimal polynomials: a more general case 492
18.4 Solving sparse linear systems 497
18.5 Computing minimal polynomials inF[X]/(f)(II) 500 18.6 The algebra of linear transformations (∗) 501
18.7 Notes 508
19 Finite fields 509
Contents ix
19.2 The existence of finite fields 511
19.3 The subfield structure and uniqueness of finite fields 515
19.4 Conjugates, norms and traces 516
20 Algorithms for finite fields 522
20.1 Tests for and constructing irreducible polynomials 522 20.2 Computing minimal polynomials inF[X]/(f)(III) 525 20.3 Factoring polynomials: square-free decomposition 526 20.4 Factoring polynomials: the Cantor–Zassenhaus algorithm 530 20.5 Factoring polynomials: Berlekamp’s algorithm 538 20.6 Deterministic factorization algorithms (∗) 544
20.7 Notes 546
21 Deterministic primality testing 548
21.1 The basic idea 548
21.2 The algorithm and its analysis 549
21.3 Notes 558
Appendix: Some useful facts 561
Bibliography 566
Index of notation 572
Preface
Number theory and algebra play an increasingly significant role in computing and communications, as evidenced by the striking applications of these subjects to such fields as cryptography and coding theory. My goal in writing this book was to provide an introduction to number theory and algebra, with an emphasis on algorithms and applications, that would be accessible to a broad audience. In particular, I wanted to write a book that would be appropriate for typical students in computer science or mathematics who have some amount of general mathematical experience, but without presuming too much specific mathematicalknowledge.
Prerequisites. The mathematical prerequisites are minimal: no particular math-ematical concepts beyond what is taught in a typical undergraduate calculus sequence are assumed.
The computer science prerequisites are also quite minimal: it is assumed that the reader is proficient in programming, and has had some exposure to the analysis of algorithms, essentially at the level of an undergraduate course on algorithms and data structures.
Even though it is mathematically quite self contained, the text does presup-pose that the reader is comfortable with mathematical formalism and also has some experience in reading and writing mathematical proofs. Readers may have gained such experience in computer science courses such as algorithms, automata or complexity theory, or some type of “discrete mathematics for computer science students” course. They also may have gained such experience in undergraduate mathematics courses, such as abstract or linear algebra. The material in these math-ematics courses may overlap with some of the material presented here; however, even if the reader already has had some exposure to this material, it nevertheless may be convenient to have all of the relevant topics easily accessible in one place; moreover, the emphasis and perspective here will no doubt be different from that in a traditional mathematical presentation of these subjects.
Preface xi
Structure of the text. All of the mathematics required beyond basic calculus is developed “from scratch.” Moreover, the book generally alternates between “theory” and “applications”: one or two chapters on a particular set of purely mathematical concepts are followed by one or two chapters on algorithms and applications; the mathematics provides the theoretical underpinnings for the appli-cations, while the applications both motivate and illustrate the mathematics. Of course, this dichotomy between theory and applications is not perfectly main-tained: the chapters that focus mainly on applications include the development of some of the mathematics that is specific to a particular application, and very occasionally, some of the chapters that focus mainly on mathematics include a discussion of related algorithmic ideas as well.
In developing the mathematics needed to discuss certain applications, I have tried to strike a reasonable balance between, on the one hand, presenting the abso-lute minimum required to understand and rigorously analyze the applications, and on the other hand, presenting a full-blown development of the relevant mathemat-ics. In striking this balance, I wanted to be fairly economical and concise, while at the same time, I wanted to develop enough of the theory so as to present a fairly well-rounded account, giving the reader more of a feeling for the mathematical “big picture.”
The mathematical material covered includes the basics of number theory (including unique factorization, congruences, the distribution of primes, and quadratic reciprocity) and of abstract algebra (including groups, rings, fields, and vector spaces). It also includes an introduction to discrete probability theory — this material is needed to properly treat the topics of probabilistic algorithms and cryp-tographic applications. The treatment of all these topics is more or less standard, except that the text only deals with commutative structures (i.e., abelian groups and commutative rings with unity) — this is all that is really needed for the purposes of this text, and the theory of these structures is much simpler and more transparent than that of more general, non-commutative structures.
The choice of topics covered in this book was motivated primarily by their applicability to computing and communications, especially to the specific areas of cryptography and coding theory. Thus, the book may be useful for reference or self-study by readers who want to learn about cryptography, or it could also be used as a textbook in a graduate or upper-division undergraduate course on (com-putational) number theory and algebra, perhaps geared towards computer science students.
integer and polynomial arithmetic — although some of the basic ideas of this topic are developed in the exercises, the main body of the text deals only with classical, quadratic-time algorithms for integer and polynomial arithmetic. However, there are more advanced texts that cover these topics perfectly well, and they should be readily accessible to students who have mastered the material in this book.
Note that while continued fractions are not discussed, the closely related prob-lem of “rational reconstruction” is covered, along with a number of interesting applications (which could also be solved using continued fractions).
Guidelines for using the text.
• There are a few sections that are marked with a “(∗),” indicating that the material covered in that section is a bit technical, and is not needed else-where.
• There are many examples in the text, which form an integral part of the book, and should not be skipped.
• There are a number of exercises in the text that serve to reinforce, as well as to develop important applications and generalizations of, the material presented in the text.
• Some exercises are underlined. These develop important (but usually sim-ple) facts, and should be viewed as an integral part of the book. It is highly recommended that the reader work these exercises, or at the very least, read and understand their statements.
• In solving exercises, the reader is free to use anypreviouslystated results in the text, including those in previous exercises. However, except where otherwise noted, any result in a section marked with a “(∗),” or in §5.5, need not and should not be used outside the section in which it appears.
• There is a very brief “Preliminaries” chapter, which fixes a bit of notation and recalls a few standard facts. This should be skimmed over by the reader.
• There is an appendix that contains a few useful facts; where such a fact is used in the text, there is a reference such as “see §An,” which refers to the item labeled “An” in the appendix.
Preface xiii
more fully developed. Also, a number of topics have been moved forward in the text, so as to enliven the material with exciting applications as soon as possible; for example, the RSA cryptosystem is now described right after Euclid’s algorithm is presented, and some basic results concerning quadratic residues are introduced right away, in the chapter on congruences. Finally, there are numerous changes in notation and terminology; for example, the notion of a family of objects is now used consistently throughout the book (e.g., a pairwise independent family of random variables, a linearly independent family of vectors, a pairwise relatively prime family of integers, etc.).
Feedback.I welcome comments on the book (suggestions for improvement, error reports, etc.) from readers. Please send your comments to
victor@shoup.net.
There is also a web site where further material and information relating to the book (including a list of errata and the latest electronic version of the book) may be found:
www.shoup.net/ntb.
Acknowledgments. I would like to thank a number of people who volunteered their time and energy in reviewing parts of the book at various stages: Joël Alwen, Siddhartha Annapureddy, John Black, Carl Bosley, Joshua Brody, Jan Camenisch, David Cash, Sherman Chow, Ronald Cramer, Marisa Debowsky, Alex Dent, Nelly Fazio, Rosario Gennaro, Mark Giesbrecht, Stuart Haber, Kristiyan Haralambiev, Gene Itkis, Charanjit Jutla, Jonathan Katz, Eike Kiltz, Alfred Menezes, Ilya Mironov, Phong Nguyen, Antonio Nicolosi, Roberto Oliveira, Leonid Reyzin, Louis Salvail, Berry Schoenmakers, Hovav Shacham, Yair Sovran, Panos Toulis, and Daniel Wichs. A very special thanks goes to George Stephanides, who trans-lated the first edition of the book into Greek and reviewed the entire book in prepa-ration for the second edition. I am also grateful to the National Science Foundation for their support provided under grants CCR-0310297 and CNS-0716690. Finally, thanks to David Tranah for all his help and advice, and to David and his colleagues at Cambridge University Press for their progressive attitudes regarding intellectual property and open access.
Preliminaries
We establish here some terminology, notation, and simple facts that will be used throughout the text.
Logarithms and exponentials
We write logxfor the natural logarithm ofx, and logbxfor the logarithm ofxto
the baseb.
We writeexfor the usual exponential function, wheree≈2.71828 is the base of the natural logarithm. We may also write exp[x] instead ofex.
Sets and families
We use standard set-theoretic notation:∅denotes the empty set;x∈Ameans that
x is an element, or member, of the setA; for two setsA,B, A ⊆ B means that
Ais a subset of B (withApossibly equal toB), and A ( B means thatAis a proper subset ofB (i.e.,A⊆B butA 6=B). Further,A∪Bdenotes the union of
AandB,A∩B the intersection ofAandB, andA\B the set of all elements of
Athat are not inB. IfAis a set with a finite number of elements, then we write
|A| for itssize, orcardinality. We use standard notation for describing sets; for example, if we define the setS:={−2,−1, 0, 1, 2}, then{x2:x∈S}={0, 1, 4}
and{x∈S:xis even}={−2, 0, 2}.
We writeS1× · · · ×Snfor theCartesian productof setsS1, . . . ,Sn, which is
the set of alln-tuples (a1, . . . ,an), whereai∈Sifori=1, . . . ,n. We writeS×nfor
the Cartesian product ofncopies of a setS, and forx ∈S, we writex×n for the element ofS×nconsisting of ncopies ofx. (This notation is a bit non-standard, but we reserve the more standard notation Sn for other purposes, so as to avoid ambiguity.)
Preliminaries xv
Afamilyis a collection of objects, indexed by some setI, called anindex set. If for eachi ∈ I we have an associated objectxi, the family of all such objects
is denoted by {xi}i∈I. Unlike a set, a family may contain duplicates; that is, we
may havexi = xjfor some pair of indicesi,jwithi6= j. Note that while{xi}i∈I
denotes a family,{xi : i ∈ I}denotes the setwhose members are the (distinct)
xi’s. If the index setIhas some natural order, then we may view the family{xi}i∈I
as being ordered in the same way; as a special case, a family indexed by a set of integers of the form{m, . . . ,n}or{m,m+1, . . .}is asequence, which we may write as{xi}ni=mor{xi}∞i=m. On occasion, if the choice of index set is not important, we may simply define a family by listing or describing its members, without explicitly describing an index set; for example, the phrase “the family of objectsa,b,c” may be interpreted as “the family{xi}3i=1, wherex1:=a,x2:=b, andx3:=c.”
Unions and intersections may be generalized to arbitrary families of sets. For a family{Si}i∈Iof sets, the union is
[
i∈I
Si:={x:x∈Sifor somei∈I}, and forI 6=∅, the intersection is
\
i∈I
Si:={x:x∈Sifor alli∈I}.
Note that ifI = ∅, the union is by definition∅, but the intersection is, in general, not well defined. However, in certain applications, one might define it by a spe-cial convention; for example, if all sets under consideration are subsets of some “ambient space,”Ω, then the empty intersection is usually taken to beΩ.
Two setsAandB are calleddisjointifA∩B = ∅. A family{Si}i∈I of sets is
calledpairwise disjointifSi∩Sj =∅for alli,j ∈Iwithi6=j. A pairwise disjoint
family of non-empty sets whose union isSis called apartitionofS; equivalently,
{Si}i∈I is a partition of a set S if eachSi is a non-empty subset ofS, and each
element ofSbelongs to exactly oneSi.
Numbers
We use standard notation for various sets of numbers:
Z:=the set of integers={. . . ,−2,−1, 0, 1, 2, . . .}, Q:=the set of rational numbers={a/b:a,b∈Z,b6=0}, R:=the set of real numbers,
We sometimes use the symbols ∞ and −∞ in simple arithmetic expressions involving real numbers. The interpretation given to such expressions should be obvious: for example, for every x ∈ R, we have −∞ < x < ∞, x+∞ = ∞,
x− ∞ = −∞, ∞+∞ = ∞, and (−∞) +(−∞) = −∞. Expressions such as
x·(±∞) also make sense, providedx6=0. However, the expressions∞ − ∞and 0· ∞have no sensible interpretation.
We use standard notation for specifying intervals of real numbers: fora,b∈R witha≤b,
[a,b] :={x∈R:a≤x≤b}, (a,b) :={x∈R:a < x < b}, [a,b) :={x∈R:a≤x < b}, (a,b] :={x∈R:a < x≤b}. As usual, this notation is extended to allowa = −∞ for the intervals (a,b] and (a,b), andb=∞for the intervals [a,b) and (a,b).
Functions
We writef : A → B to indicate that f is a function (also called a map) from a setA to a setB. IfA0 ⊆ A, then f(A0) := {f(a) : a ∈ A0} is the imageof
A0 under f, andf(A) is simply referred to as the imageof f; if B0 ⊆ B, then
f−1(B0) :={a∈A:f(a) ∈B0}is thepre-imageofB0underf.
A functionf :A→Bis calledone-to-oneorinjectiveiff(a) =f(b) implies
a = b. The functionf is calledontoorsurjectiveiff(A) = B. The functionf
is called bijectiveif it is both injective and surjective; in this case, f is called a bijection, or aone-to-one correspondence. Iff is bijective, then we may define the inverse function f−1 : B → A, where for b ∈ B, f−1(b) is defined to be the unique a ∈ Asuch that f(a) = b; in this case, f−1 is also a bijection, and (f−1)−1=f.
IfA0⊆A, then theinclusion mapfromA0toAis the functioni:A0→Agiven by i(a) := afor a ∈ A0; whenA0 = A, this is called theidentity mapon A. If
A0 ⊆ A,f0 : A0 → B,f : A→ B, andf0(a) = f(a) for alla ∈A0, then we say thatf0is therestrictionofftoA0, and thatf is anextensionoff0toA.
Iff :A → B andg :B → C are functions, theircompositionis the function
g◦f : A → C given by (g◦f)(a) := g(f(a)) for a ∈ A. Iff : A → B is a bijection, thenf−1◦fis the identity map onA, andf◦f−1is the identity map on
B. Conversely, iff :A → Bandg :B →Aare functions such thatg◦f is the identity map onAandf ◦gis the identity map onB, thenf andgare bijections, each being the inverse of the other. Iff :A → B andg :B → C are bijections, then so isg◦f, and (g◦f)−1=f−1◦g−1.
Function composition is associative; that is, for all functions f : A → B,
Preliminaries xvii
can simply write h◦g◦f without any ambiguity. More generally, if we have functionsfi : Ai → Ai+1fori = 1, . . . ,n, wheren ≥ 2, then we may write their
composition asfn◦ · · · ◦f1without any ambiguity. If eachfiis a bijection, then so
isfn◦ · · · ◦f1, its inverse beingf1−1◦ · · · ◦fn−1. As a special case of this, ifAi =A
andfi=f fori=1, . . . ,n, then we may writefn◦ · · · ◦f1asfn. It is understood
thatf1 =f, and thatf0is the identity map onA. Iff is a bijection, then so isfn
for every non-negative integer n, the inverse function offn being (f−1)n, which one may simply write asf−n.
Iff : I → S is a function, then we may viewf as the family {xi}i∈I, where xi :=f(i). Conversely, a family{xi}i∈I, where all of thexi’s belong to some set S, may be viewed as the functionf :I →Sgiven byf(i) :=xifori∈I. Really, functions and families are the same thing, the difference being just one of notation and emphasis.
Binary operations
Abinary operation?on a setSis a function fromS×S toS, where the value of the function at (a,b) ∈S×Sis denoteda ? b.
A binary operation ?onS is called associative if for alla,b,c ∈ S, we have (a ? b) ? c = a ?(b ? c). In this case, we can simply write a ? b ? c without any ambiguity. More generally, for a1, . . . ,an ∈ S, where n ≥ 2, we can write a1?· · ·? anwithout any ambiguity.
A binary operation?on S is calledcommutativeif for alla,b ∈ S, we have
a?b=b?a. If the binary operation?is both associative and commutative, then not only is the expressiona1?· · ·? anunambiguous, but its value remains unchanged
even if we re-order theai’s.
If?is a binary operation onS, andS0 ⊆S, thenS0is calledclosed under?if
1
Basic properties of the integers
This chapter discusses some of the basic properties of the integers, including the notions of divisibility and primality, unique factorization into primes, greatest com-mon divisors, and least comcom-mon multiples.
1.1 Divisibility and primality
A central concept in number theory isdivisibility.
Consider the integersZ= {. . . ,−2,−1, 0, 1, 2, . . .}. Fora,b∈Z, we say thata dividesbifaz=bfor somez∈Z. Ifadividesb, we writea|b, and we may say thatais adivisorofb, or thatbis amultipleofa, or thatbisdivisible bya. Ifa
does not divideb, then we writea-b.
We first state some simple facts about divisibility: Theorem 1.1. For all a,b,c∈Z, we have
(i) a|a,1|a, and a|0; (ii) 0|aif and only if a=0;
(iii) a|bif and only if −a|bif and only if a| −b; (iv) a|banda|cimpliesa|(b+c);
(v) a|bandb|cimplies a|c.
Proof. These properties can be easily derived from the definition of divisibility, using elementary algebraic properties of the integers. For example,a | abecause we can writea·1=a; 1|abecause we can write 1·a=a;a|0 because we can writea·0=0. We leave it as an easy exercise for the reader to verify the remaining properties. 2
We make a simple observation: ifa | bandb 6= 0, then 1 ≤ |a| ≤ |b|. Indeed, ifaz = b 6= 0 for some integerz, thena 6= 0 andz 6= 0; it follows that|a| ≥ 1,
|z| ≥1, and so|a| ≤ |a||z|=|b|.
Theorem 1.2. For all a,b∈Z, we havea| band b|aif and only if a =±b. In particular, for everya∈Z, we havea|1if and only if a=±1.
Proof. Clearly, ifa = ±b, thena | bandb | a. So let us assume that a | band
b|a, and prove thata=±b. If either ofaorbare zero, then the other must be zero as well. So assume that neither is zero. By the above observation,a | bimplies
|a| ≤ |b|, andb|aimplies|b| ≤ |a|; thus,|a|=|b|, and soa=±b. That proves the first statement. The second statement follows from the first by settingb:=1, and noting that 1|a. 2
The product of any two non-zero integers is again non-zero. This implies the usualcancellation law: ifa,b, andcare integers such thata6=0 andab=ac, then we must haveb= c; indeed,ab =acimpliesa(b−c) = 0, and soa 6= 0 implies
b−c=0, and henceb=c.
Primes and composites. Letnbe a positive integer. Trivially, 1 and ndividen. Ifn > 1 and no other positive integers besides 1 andndividen, then we saynis prime. Ifn >1 butnis not prime, then we say thatniscomposite. The number 1 is not considered to be either prime or composite. Evidently,nis composite if and only ifn= abfor some integersa,bwith 1< a < nand 1< b < n. The first few primes are
2, 3, 5, 7, 11, 13, 17, . . . .
While it is possible to extend the definition of prime and composite to negative integers, we shall not do so in this text:whenever we speak of a prime or composite number, we mean a positive integer.
A basic fact is that every non-zero integer can be expressed as a signed product of primes in an essentially unique way. More precisely:
Theorem 1.3 (Fundamental theorem of arithmetic). Every non-zero integer n
can be expressed as
n=±pe1
1 · · ·p er
r,
wherep1, . . . ,prare distinct primes and e1, . . . ,erare positive integers. Moreover,
this expression is unique, up to a reordering of the primes.
Note that ifn = ±1 in the above theorem, thenr = 0, and the product of zero terms is interpreted (as usual) as 1.
1.1 Divisibility and primality 3
of the rest of this section and the next are devoted to developing a proof of this theorem. We shall give a quite leisurely proof, introducing a number of other very important tools and concepts along the way that will be useful later.
To prove Theorem 1.3, we may clearly assume thatnis positive, since otherwise, we may multiplynby−1 and reduce to the case wherenis positive.
The proof of the existence part of Theorem 1.3 is easy. This amounts to showing that every positive integer n can be expressed as a product (possibly empty) of primes. We may prove this by induction onn. Ifn = 1, the statement is true, as
n is the product of zero primes. Now letn > 1, and assume that every positive integer smaller thann can be expressed as a product of primes. If nis a prime, then the statement is true, asnis the product of one prime. Assume, then, thatn
is composite, so that there exista,b ∈Zwith 1< a < n, 1< b < n, andn= ab. By the induction hypothesis, bothaandbcan be expressed as a product of primes, and so the same holds forn.
The uniqueness part of Theorem 1.3 is the hard part. An essential ingredient in this proof is the following:
Theorem 1.4 (Division with remainder property). Let a,b ∈ Z with b > 0.
Then there exist uniqueq,r∈Zsuch thata=bq+rand 0≤r < b.
Proof. Consider the setSof non-negative integers of the forma−btwitht ∈Z. This set is clearly non-empty; indeed, ifa≥0, sett:= 0, and ifa <0, sett:= a. Since every non-empty set of non-negative integers contains a minimum, we define
rto be the smallest element ofS. By definition,ris of the formr = a−bq for someq∈Z, andr≥0. Also, we must haver < b, since otherwise,r−bwould be an element ofSsmaller thanr, contradicting the minimality ofr; indeed, ifr≥b, then we would have 0≤r−b=a−b(q+1).
That proves the existence ofrandq. For uniqueness, suppose thata = bq+r
anda = bq0+r0, where 0 ≤ r < band 0 ≤ r0 < b. Then subtracting these two equations and rearranging terms, we obtain
r0−r=b(q−q0).
Thus, r0 −r is a multiple of b; however, 0 ≤ r < b and 0 ≤ r0 < b implies
|r0−r|< b; therefore, the only possibility isr0−r =0. Moreover, 0 =b(q−q0) andb6=0 impliesq−q0=0. 2
Theorem 1.4 can be visualized as follows:
Starting witha, we subtract (or add, ifais negative) the valuebuntil we end up with a number in the interval [0,b).
Floors and ceilings. Let us briefly recall the usual floorand ceiling functions, denotedb·candd·e, respectively. These are functions fromR(the real numbers) toZ. Forx∈R,bxcis the greatest integerm≤ x; equivalently,bxcis the unique integermsuch that m≤ x < m+1, or put another way, such thatx = m+εfor someε ∈[0, 1). Also,dxeis the smallest integerm ≥ x; equivalently,dxeis the unique integermsuch thatm−1< x≤m, or put another way, such thatx=m−ε
for someε∈[0, 1).
The mod operator.Now leta,b∈Zwithb >0. Ifqandrare the unique integers from Theorem 1.4 that satisfya=bq+rand 0≤r < b, we define
amodb:=r;
that is,amodbdenotes the remainder in dividingabyb. It is clear thatb | aif and only ifamodb= 0. Dividing both sides of the equationa= bq+rbyb, we obtaina/b=q+r/b. Sinceq ∈Zandr/b∈[0, 1), we see thatq=ba/bc. Thus,
(amodb)=a−bba/bc.
One can use this equation to extend the definition ofamodbto all integersaand
b, withb6=0; that is, forb <0, we simply defineamodbto bea−bba/bc. Theorem 1.4 may be generalized so that when dividing an integeraby a positive integer b, the remainder is placed in an interval other than [0,b). Let x be any real number, and consider the interval [x,x+b). As the reader may easily verify, this interval contains preciselybintegers, namely,dxe, . . . ,dxe+b−1. Applying Theorem 1.4 witha− dxein place ofa, we obtain:
Theorem 1.5. Let a,b ∈ Zwith b > 0, and let x ∈ R. Then there exist unique
q,r∈Zsuch thata=bq+rand r∈[x,x+b).
EXERCISE 1.1. Leta,b,d∈Zwithd6=0. Show thata|bif and only ifda|db. EXERCISE 1.2. Letn be a composite integer. Show that there exists a prime p
dividingn, withp≤n1/2.
EXERCISE 1.3. Letmbe a positive integer. Show that for every real numberx≥1, the number of multiples ofmin the interval [1,x] isbx/mc; in particular, for every integern≥1, the number of multiples ofmamong 1, . . . ,nisbn/mc.
1.2 Ideals and greatest common divisors 5
EXERCISE 1.5. Letx∈Randn∈Zwithn >0. Show thatbbxc/nc=bx/nc; in particular,bba/bc/cc=ba/bccfor all positive integersa,b,c.
EXERCISE 1.6. Leta,b∈Zwithb <0. Show that (amodb) ∈(b, 0].
EXERCISE1.7. Show that Theorem 1.5 also holds for the interval (x,x+b]. Does it hold in general for the intervals [x,x+b] or (x,x+b)?
1.2 Ideals and greatest common divisors
To carry on with the proof of Theorem 1.3, we introduce the notion of anideal of Z, which is a non-empty set of integers that is closed under addition, and closed under multiplication by an arbitrary integer. That is, a non-empty setI ⊆Zis an ideal if and only if for alla,b∈Iand allz∈Z, we have
a+b∈I and az∈I.
Besides its utility in proving Theorem 1.3, the notion of an ideal is quite useful in a number of contexts, which will be explored later.
It is easy to see that every idealI contains 0: sincea ∈ I for some integer a, we have 0 = a·0 ∈I. Also, note that if an idealI contains an integera, it also contains−a, since−a = a·(−1) ∈I. Thus, if an ideal containsaandb, it also containsa−b. It is clear that{0}andZare ideals. Moreover, an idealI is equal toZif and only if 1∈I; to see this, note that 1 ∈Iimplies that for everyz∈Z, we havez= 1·z ∈I, and henceI = Z; conversely, ifI = Z, then in particular, 1∈I.
Fora∈Z, defineaZ:={az:z∈Z}; that is,aZis the set of all multiples ofa. Ifa=0, then clearlyaZ={0}; otherwise,aZconsists of the distinct integers
. . . ,−3a,−2a,−a, 0,a, 2a, 3a, . . . .
It is easy to see that aZ is an ideal: for allaz,az0 ∈ aZ andz00 ∈ Z, we have
az+az0 = a(z+z0) ∈ aZ and (az)z00 = a(zz00) ∈ aZ. The idealaZis called theideal generated bya, and an ideal of the formaZfor somea ∈Zis called a principal ideal.
Observe that for all a,b ∈ Z, we have b ∈ aZ if and only if a | b. Also observe that for every ideal I, we haveb ∈ I if and only ifbZ ⊆ I. Both of these observations are simple consequences of the definitions, as the reader may verify. Combining these two observations, we see thatbZ⊆aZif and only ifa|b.
is also an ideal. Indeed, supposea1+a2∈I1+I2andb1+b2∈I1+I2. Then we have (a1+a2)+(b1+b2)=(a1+b1)+(a2+b2)∈I1+I2, and for everyz∈Z, we have (a1+a2)z=a1z+a2z∈I1+I2.
Example 1.1. Consider the principal ideal 3Z. This consists of all multiples of 3; that is, 3Z={. . . ,−9,−6,−3, 0, 3, 6, 9, . . .}. 2
Example 1.2. Consider the ideal 3Z+5Z. This ideal contains 3·2+5·(−1)=1. Since it contains 1, it contains all integers; that is, 3Z+5Z=Z. 2
Example 1.3. Consider the ideal 4Z+6Z. This ideal contains 4·(−1)+6·1=2, and therefore, it contains all even integers. It does not contain any odd integers, since the sum of two even integers is again even. Thus, 4Z+6Z=2Z. 2
In the previous two examples, we defined an ideal that turned out upon closer inspection to be a principal ideal. This was no accident: the following theorem says that all ideals ofZare principal.
Theorem 1.6. Let I be an ideal of Z. Then there exists a unique non-negative
integer dsuch that I=dZ.
Proof. We first prove the existence part of the theorem. IfI = {0}, thend = 0 does the job, so let us assume thatI 6= {0}. SinceIcontains non-zero integers, it must contain positive integers, since ifa ∈I then so is−a. Letdbe the smallest positive integer inI. We want to show thatI=dZ.
We first show that I ⊆ dZ. To this end, leta be any element inI. It suffices to show thatd | a. Using the division with remainder property, writea = dq+r, where 0≤r < d. Then by the closure properties of ideals, one sees thatr=a−dq
is also an element of I, and by the minimality of the choice ofd, we must have
r=0. Thus,d|a.
We have shown thatI ⊆ dZ. The fact thatdZ ⊆ I follows from the fact that
d∈I. Thus,I=dZ.
That proves the existence part of the theorem. For uniqueness, note that if
dZ = eZfor some non-negative integer e, thend | e ande | d, from which it follows by Theorem 1.2 thatd=±e; sincedandeare non-negative, we must have
d=e. 2
Greatest common divisors. For a,b∈Z, we calld∈ Zacommon divisorofa andbifd|aandd|b; moreover, we call such adagreatest common divisorof
aandbifdis non-negative and all other common divisors ofaandbdivided. Theorem 1.7. For all a,b∈Z, there exists a unique greatest common divisor dof
1.2 Ideals and greatest common divisors 7
Proof. We apply the previous theorem to the idealI :=aZ+bZ. Letd∈Zwith
I =dZ, as in that theorem. We wish to show thatdis a greatest common divisor ofaandb. Note thata,b,d∈Ianddis non-negative.
Since a ∈ I = dZ, we see that d | a; similarly,d | b. So we see that dis a common divisor ofaandb.
Sinced∈I =aZ+bZ, there exists,t∈Zsuch thatas+bt=d. Now suppose
a=a0d0andb=b0d0for somea0,b0,d0∈Z. Then the equationas+bt=dimplies thatd0(a0s+b0t)=d, which says thatd0 |d. Thus, any common divisord0ofaand
bdividesd.
That proves thatdis a greatest common divisor ofaandb. For uniqueness, note that ife is a greatest common divisor ofa andb, thend | eande | d, and hence
d=±e; since bothdandeare non-negative by definition, we haved=e. 2 Fora,b∈Z, we write gcd(a,b) for the greatest common divisor ofaandb. We say thata,b∈Zarerelatively primeif gcd(a,b) =1, which is the same as saying that the only common divisors ofaandbare±1.
The following is essentially just a restatement of Theorem 1.7, but we state it here for emphasis:
Theorem 1.8. Leta,b,r∈Zand letd:=gcd(a,b). Then there exist s,t∈Zsuch
that as+bt= rif and only if d| r. In particular,aand bare relatively prime if and only if there exist integerssandtsuch thatas+bt=1.
Proof. We have
as+bt=r for somes,t∈Z
⇐⇒ r∈aZ+bZ
⇐⇒ r∈dZ (by Theorem 1.7)
⇐⇒ d|r.
That proves the first statement. The second statement follows from the first, setting
r:=1. 2
Note that as we have defined it, gcd(0, 0) = 0. Also note that when at least one ofaorbare non-zero, gcd(a,b) may be characterized as thelargestpositive integer that divides bothaandb, and as thesmallestpositive integer that can be expressed asas+btfor integerssandt.
b, we obtain
abs+cbt=b. (1.1)
Sincecdividesabby hypothesis, and sincecclearly dividescbt, it follows thatc
divides the left-hand side of (1.1), and hence thatcdividesb. 2
Suppose thatpis a prime andais any integer. As the only divisors ofpare±1 and±p, we have
p|a =⇒ gcd(a,p)=p, and
p-a =⇒ gcd(a,p)=1.
Combining this observation with the previous theorem, we have:
Theorem 1.10. Let pbe prime, and let a,b∈Z. Thenp|abimplies that p|aor
p|b.
Proof. Assume thatp|ab. Ifp|a, we are done, so assume thatp-a. By the above observation, gcd(a,p)=1, and so by Theorem 1.9, we havep|b. 2
An obvious corollary to Theorem 1.10 is that ifa1, . . . ,ak are integers, and ifp
is a prime that divides the producta1· · ·ak, thenp|aifor somei=1, . . . ,k. This
is easily proved by induction onk. Fork =1, the statement is trivially true. Now letk >1, and assume that statement holds fork−1. Then by Theorem 1.10, either
p|a1orp|a2· · ·ak; ifp|a1, we are done; otherwise, by induction,pdivides one
ofa2, . . . ,ak.
Finishing the proof of Theorem 1.3.We are now in a position to prove the unique-ness part of Theorem 1.3, which we can state as follows: ifp1, . . . ,pr are primes
(not necessarily distinct), andq1, . . . ,qsare primes (also not necessarily distinct),
such that
p1· · ·pr =q1· · ·qs, (1.2)
then (p1, . . . ,pr) is just a reordering of (q1, . . . ,qs). We may prove this by induction
on r. Ifr = 0, we must haves = 0 and we are done. Now supposer > 0, and that the statement holds for r −1. Since r > 0, we clearly must have s > 0. Also, asp1 obviously divides the left-hand side of (1.2), it must also divide the
right-hand side of (1.2); that is, p1 | q1· · ·qs. It follows from (the corollary to)
Theorem 1.10 thatp1 | qjfor somej = 1, . . . ,s, and moreover, sinceqjis prime,
we must havep1 = qj. Thus, we may cancel p1 from the left-hand side of (1.2)
1.2 Ideals and greatest common divisors 9
EXERCISE 1.8. LetIbe a non-empty set of integers that is closed under addition (i.e.,a+b∈Ifor alla,b∈I). Show thatIis an ideal if and only if−a∈Ifor all
a∈I.
EXERCISE 1.9. Show that for all integersa,b,c, we have: (a) gcd(a,b)=gcd(b,a);
(b) gcd(a,b)=|a| ⇐⇒ a|b;
(c) gcd(a, 0)=gcd(a,a)=|a|and gcd(a, 1)=1; (d) gcd(ca,cb)=|c|gcd(a,b).
EXERCISE 1.10. Show that for all integersa,bwithd:= gcd(a,b) 6=0, we have gcd(a/d,b/d) =1.
EXERCISE 1.11. Letnbe an integer. Show that ifa,bare relatively prime integers, each of which dividesn, thenabdividesn.
EXERCISE 1.12. Show that two integers are relatively prime if and only if there is no one prime that divides both of them.
EXERCISE 1.13. Leta,b1, . . . ,bk be integers. Show that gcd(a,b1· · ·bk) = 1 if
and only if gcd(a,bi) =1 fori=1, . . . ,k.
EXERCISE 1.14. Letpbe a prime andkan integer, with 0< k < p. Show that the binomial coefficient
p k
= p!
k!(p−k)!, which is an integer (see §A2), is divisible byp.
EXERCISE 1.15. An integer a is calledsquare-free if it is not divisible by the square of any integer greater than 1. Show that:
(a) a is square-free if and only ifa = ±p1· · ·pr, where thepi’s are distinct
primes;
(b) every positive integerncan be expressed uniquely asn=ab2, whereaand
bare positive integers, andais square-free.
EXERCISE 1.16. For each positive integerm, letIm denote{0, . . . ,m−1}. Let a,bbe positive integers, and consider the map
τ : Ib×Ia→Iab
EXERCISE 1.17. Let a,b,c be positive integers satisfying gcd(a,b) = 1 and
c ≥ (a− 1)(b− 1). Show that there exist non-negative integers s,t such that
c=as+bt.
EXERCISE 1.18. For each positive integer n, let Dn denote the set of positive
divisors ofn. Letn1,n2be relatively prime, positive integers. Show that the sets Dn1 ×Dn2 and Dn1n2 are in one-to-one correspondence, via the map that sends
(d1,d2)∈Dn1 ×Dn2 tod1d2.
1.3 Some consequences of unique factorization
The following theorem is a consequence of just the existence part of Theorem 1.3: Theorem 1.11. There are infinitely many primes.
Proof. By way of contradiction, suppose that there were only finitely many primes; call themp1, . . . ,pk. Then setM := Q
k
i=1piandN :=M +1. Consider a prime p that divides N. There must be at least one such prime p, since N ≥ 2, and every positive integer can be written as a product of primes. Clearly, p cannot equal any of thepi’s, since if it did, thenpwould divideM, and hence also divide
N−M =1, which is impossible. Therefore, the primepis not amongp1, . . . ,pk,
which contradicts our assumption that these are the only primes. 2
For each primep, we may define the functionνp, mapping non-zero integers to non-negative integers, as follows: for every integern6=0, ifn=pem, wherep-m, thenνp(n) :=e. We may then write the factorization ofninto primes as
n=±Y
p pνp(n),
where the product is over all primes p; although syntactically this is an infinite product, all but finitely many of its terms are equal to 1, and so this expression makes sense.
Observe that ifaandbare non-zero integers, then
νp(a·b) =νp(a)+νp(b) for all primesp, (1.3) and
a|b ⇐⇒ νp(a)≤νp(b) for all primesp. (1.4) From this, it is clear that
gcd(a,b)=Y
p
1.3 Some consequences of unique factorization 11
Least common multiples. For a,b ∈ Z, acommon multiple of a andb is an integer msuch thata | mandb | m; moreover, such anmis the least common multipleofa andbifmis non-negative andmdivides all common multiples of
a and b. It is easy to see that the least common multiple exists and is unique, and we denote the least common multiple ofaandbby lcm(a,b). Indeed, for all
a,b∈Z, if eitheraorbare zero, the only common multiple ofaandbis 0, and so lcm(a,b)=0; otherwise, if neitheranorbare zero, we have
lcm(a,b) =Y
p
pmax(νp(a),νp(b)),
or equivalently, lcm(a,b) may be characterized as the smallest positive integer divisible by bothaandb.
It is convenient to extend the domain of definition of νp to include 0, defining
νp(0) :=∞. If we interpret expressions involving “∞” appropriately (see Prelimi-naries), then for arbitrarya,b∈Z, both (1.3) and (1.4) hold, and in addition,
νp(gcd(a,b))=min(νp(a),νp(b)) and νp(lcm(a,b))=max(νp(a),νp(b))
for all primesp.
Generalizing gcd’s and lcm’s to many integers. It is easy to generalize the notions of greatest common divisor and least common multiple from two integers to many integers. Leta1, . . . ,ak be integers. We call d ∈ Z a common divisor ofa1, . . . ,ak if d | ai fori = 1, . . . ,k; moreover, we call such a dthe greatest
common divisor of a1, . . . ,ak if d is non-negative and all other common
divi-sors ofa1, . . . ,ak divided. The greatest common divisor ofa1, . . . ,ak is denoted
gcd(a1, . . . ,ak) and is the unique non-negative integerdsatisfying νp(d)=min(νp(a1), . . . ,νp(ak)) for all primesp.
Analogously, we call m ∈ Z a common multiple of a1, . . . ,ak if ai | m for all i=1, . . . ,k; moreover, such anmis called the least common multiple ofa1, . . . ,ak
if mdivides all common multiples ofa1, . . . ,ak. The least common multiple of a1, . . . ,ak is denoted lcm(a1, . . . ,ak) and is the unique non-negative integerm
sat-isfying
νp(m)=max(νp(a1), . . . ,νp(ak)) for all primesp.
Finally, we say that the family{ai}ki=1ispairwise relatively primeif for all indices
i,jwithi6= j, we have gcd(ai,aj) =1. Certainly, if{ai}ki=1 is pairwise relatively prime, andk >1, then gcd(a1, . . . ,ak)=1; however, gcd(a1, . . . ,ak) =1 does not
Rational numbers. Consider the rational numbersQ= {a/b :a,b∈Z, b6= 0}. Given any rational numbera/b, if we set d := gcd(a,b), and define the integers
a0 := a/dandb0 :=b/d, then we havea/b= a0/b0 and gcd(a0,b0) = 1.
More-over, if a1/b1 = a0/b0, then we havea1b0 = a0b1, and sob0 | a0b1; also, since
gcd(a0,b0) =1, we see thatb0| b1; writingb1 =b0c, we see thata1=a0c. Thus,
we can represent every rational number as a fraction inlowest terms, which means a fraction of the form a0/b0 wherea0 andb0 are relatively prime; moreover, the
values of a0 andb0 are uniquely determined up to sign, and every other fraction
that represents the same rational number is of the forma0c/b0c, for some non-zero
integerc.
EXERCISE 1.19. Let n be an integer. Generalizing Exercise 1.11, show that if
{ai}ki=1 is a pairwise relatively prime family of integers, where eachaidividesn,
then their productQk
i=1aialso dividesn.
EXERCISE 1.20. Show that for all integersa,b,c, we have: (a) lcm(a,b)=lcm(b,a);
(b) lcm(a,b)=|a| ⇐⇒ b|a; (c) lcm(a,a) =lcm(a, 1)=|a|; (d) lcm(ca,cb)=|c|lcm(a,b).
EXERCISE 1.21. Show that for all integersa,b, we have: (a) gcd(a,b)·lcm(a,b) =|ab|;
(b) gcd(a,b)=1 =⇒ lcm(a,b) =|ab|.
EXERCISE 1.22. Leta1, . . . ,ak ∈Zwithk >1. Show that:
gcd(a1, . . . ,ak)=gcd(a1, gcd(a2, . . . ,ak))=gcd(gcd(a1, . . . ,ak−1),ak);
lcm(a1, . . . ,ak)=lcm(a1, lcm(a2, . . . ,ak))=lcm(lcm(a1, . . . ,ak−1),ak).
EXERCISE 1.23. Leta1, . . . ,ak ∈Zwithd := gcd(a1, . . . ,ak). Show thatdZ = a1Z+· · ·+akZ; in particular, there exist integersz1, . . . ,zk such thatd=a1z1+ · · ·+akzk.
EXERCISE 1.24. Show that if{ai}ki=1is a pairwise relatively prime family of inte-gers, then lcm(a1, . . . ,ak) =|a1· · ·ak|.
EXERCISE 1.25. Show that every non-zerox∈Qcan be expressed as
x=±pe1
1 · · ·p er
r ,
where thepi’s are distinct primes and the ei’s are non-zero integers, and that this
1.3 Some consequences of unique factorization 13
EXERCISE 1.26. Letnandkbe positive integers, and suppose x ∈ Qsuch that
xk =nfor somex ∈Q. Show thatx∈Z. In other words,√knis either an integer or is irrational.
EXERCISE 1.27. Show that gcd(a+b, lcm(a,b))=gcd(a,b) for alla,b∈Z. EXERCISE 1.28. Show that for every positive integerk, there existkconsecutive composite integers. Thus, there are arbitrarily large gaps between primes.
EXERCISE1.29. Letpbe a prime. Show that for alla,b∈Z, we haveνp(a+b) ≥ min{νp(a),νp(b)}, andνp(a+b) =νp(a) ifνp(a)< νp(b).
EXERCISE 1.30. For a given primep, we may extend the domain of definition of
νp fromZtoQ: for non-zero integersa,b, let us defineνp(a/b) := νp(a)−νp(b).
Show that:
(a) this definition of νp(a/b) is unambiguous, in the sense that it does not depend on the particular choice ofaandb;
(b) for allx,y∈Q, we haveνp(xy) =νp(x)+νp(y);
(c) for allx,y ∈Q, we haveνp(x+y) ≥ min{νp(x),νp(y)}, andνp(x+y) =
νp(x) ifνp(x)< νp(y);
(d) for all non-zerox∈Q, we havex=±Qppνp(x), where the product is over all primes, and all but a finite number of terms in the product are equal to 1; (e) for allx∈Q, we havex∈Zif and only ifνp(x)≥0 for all primesp.
EXERCISE 1.31. Letnbe a positive integer, and let 2k be the highest power of 2 in the setS:={1, . . . ,n}. Show that 2k does not divide any other element inS. EXERCISE 1.32. Letn∈Zwithn >1. Show thatPni=11/iis not an integer.
EXERCISE1.33. Letnbe a positive integer, and letCndenote the number of pairs of integers (a,b) witha,b∈ {1, . . . ,n}and gcd(a,b) =1, and letFnbe the number ofdistinctrational numbersa/b, where 0≤a < b≤n.
(a) Show thatFn=(Cn+1)/2.
(b) Show thatCn ≥ n2/4. Hint: first show thatCn ≥ n2(1−Pd≥21/d2), and
then show thatP
d≥21/d2≤3/4.
EXERCISE 1.34. This exercise develops a characterization of least common mul-tiples in terms of ideals.
(a) Arguing directly from the definition of an ideal, show that ifI andJ are ideals ofZ, then so isI∩J.
(a), we know that I ∩ J is an ideal. By Theorem 1.6, we know that
2
Congruences
This chapter introduces the basic properties of congruences modulon, along with the related notion of residue classes modulon. Other items discussed include the Chinese remainder theorem, Euler’s phi function, Euler’s theorem, Fermat’s little theorem, quadratic residues, and finally, summations over divisors.
2.1 Equivalence relations
Before discussing congruences, we review the definition and basic properties of equivalence relations.
LetSbe a set. A binary relation∼onSis called anequivalence relationif it is reflexive: a∼afor alla∈S,
symmetric: a∼bimpliesb∼afor alla,b∈S, and transitive: a∼bandb∼cimpliesa∼cfor alla,b,c∈S.
If∼is an equivalence relation onS, then fora∈Sone defines itsequivalence classas the set{x∈S:x∼a}.
Theorem 2.1. Let ∼be an equivalence relation on a set S, and fora ∈S, let [a]
denote its equivalence class. Then for all a,b∈S, we have: (i) a∈[a];
(ii) a∈[b]implies[a]=[b].
Proof. (i) follows immediately from reflexivity. For (ii), supposea∈ [b], so that
a ∼ bby definition. We want to show that [a] = [b]. To this end, consider any
x∈S. We have
x∈[a] =⇒ x∼a (by definition)
=⇒ x∼b (by transitivity, and sincex∼aanda∼b) =⇒ x∈[b].
Thus, [a]⊆[b]. By symmetry, we also haveb∼a, and reversing the roles ofaand
bin the above argument, we see that [b]⊆[a]. 2
This theorem implies that each equivalence class is non-empty, and that each element of S belongs to a unique equivalence class; in other words, the distinct equivalence classes form a partition of S (see Preliminaries). A member of an equivalence class is called arepresentativeof the class.
EXERCISE 2.1. Consider the relations=,≤, and<on the setR. Which of these are equivalence relations? Explain your answers.
EXERCISE 2.2. LetS := (R×R)\ {(0, 0)}. For (x,y), (x0,y0) ∈ S, let us say (x,y) ∼ (x0,y0) if there exists a real numberλ > 0 such that (x,y) = (λx0,λy0). Show that∼is an equivalence relation; moreover, show that each equivalence class contains a unique representative that lies on the unit circle (i.e., the set of points (x,y) such thatx2+y2 =1).
2.2 Definitions and basic properties of congruences
Letnbe a positive integer. For integersaandb, we say thatais congruent tob
modulonifn|(a−b), and we writea≡b(modn). Ifn-(a−b), then we write
a 6≡ b (modn). Equivalently,a ≡ b (modn) if and only ifa = b+nyfor some
y ∈ Z. The relationa ≡ b (modn) is called acongruence relation, or simply, a congruence. The numbernappearing in such congruences is called themodulus of the congruence. This usage of the “mod” notation as part of a congruence is not to be confused with the “mod” operation introduced in §1.1.
If we view the modulus n as fixed, then the following theorem says that the binary relation “· ≡ ·(modn)” is an equivalence relation on the setZ.
Theorem 2.2. Let nbe a positive integer. For all a,b,c∈Z, we have:
(i) a≡a(modn);
(ii) a≡b(modn)impliesb≡a (modn);
(iii) a≡b(modn)andb≡c(modn)implies a≡c(modn).
2.2 Definitions and basic properties of congruences 17
a−b, then it also divides−(a−b) =b−a. For (iii), observe that ifndividesa−b
andb−c, then it also divides (a−b)+(b−c)=a−c. 2
Another key property of congruences is that they are “compatible” with integer addition and multiplication, in the following sense:
Theorem 2.3. Let a,a0,b,b0,n∈Zwith n >0. If
a≡a0 (modn) and b≡b0 (modn),
then
a+b≡a0+b0 (modn) and a·b≡a0·b0 (modn).
Proof. Suppose thata ≡ a0 (modn) andb ≡ b0 (modn). This means that there exist integersxandysuch thata=a0+nxandb=b0+ny. Therefore,
a+b=a0+b0+n(x+y), which proves the first congruence of the theorem, and
ab=(a0+nx)(b0+ny) =a0b0+n(a0y+b0x+nxy), which proves the second congruence. 2
Theorems 2.2 and 2.3 allow one to work with congruence relations modulo n
much as one would with ordinary equalities: one can add to, subtract from, or multiply both sides of a congruence modulo n by the same integer; also, if bis congruent to a modulo n, one may substitute b for a in any simple arithmetic expression (involving addition, subtraction, and multiplication) appearing in a con-gruence modulon.
Now suppose ais an arbitrary, fixed integer, and consider the set of integersz
that satisfy the congruence z ≡ a (modn). Since zsatisfies this congruence if and only if z = a +ny for somey ∈ Z, we may apply Theorems 1.4 and 1.5 (withaas given, andb:=n) to deduce that every interval ofnconsecutive integers contains exactly one suchz. This simple fact is of such fundamental importance that it deserves to be stated as a theorem:
Theorem 2.4. Let a,n∈Zwith n >0. Then there exists a unique integer zsuch
that z ≡ a (modn) and 0 ≤ z < n, namely,z := amodn. More generally, for everyx∈R, there exists a unique integer z∈[x,x+n)such thatz≡a(modn). Example 2.1. Let us find the set of solutionszto the congruence
Suppose that zis a solution to (2.1). Subtracting 4 from both sides of (2.1), we obtain
3z≡2 (mod 7). (2.2)
Next, we would like to divide both sides of this congruence by 3, to getzby itself on the left-hand side. We cannot do this directly, but since 5·3≡ 1 (mod 7), we can achieve the same effect by multiplying both sides of (2.2) by 5. If we do this, and then replace 5·3 by 1, and 5·2 by 3, we obtain
z≡3 (mod 7).
Thus, ifzis a solution to (2.1), we must havez≡3 (mod 7); conversely, one can verify that ifz≡3 (mod 7), then (2.1) holds. We conclude that the integerszthat are solutions to (2.1) are precisely those integers that are congruent to 3 modulo 7, which we can list as follows:
. . . ,−18,−11,−4, 3, 10, 17, 24, . . . 2
In the next section, we shall give a systematic treatment of the problem of solving linear congruences, such as the one appearing in the previous example.
EXERCISE 2.3. Leta,b,n∈Zwithn >0. Show thata≡b(modn) if and only if (amodn)=(bmodn).
EXERCISE 2.4. Let a,b,n ∈ Z with n > 0 and a ≡ b (mod n). Also, let
c0,c1, . . . ,ck ∈Z. Show that
c0+c1a+· · ·+ckak ≡c0+c1b+· · ·+ckbk (modn).
EXERCISE 2.5. Leta,b,n,n0 ∈ Zwith n > 0, n0 > 0, andn0 | n. Show that if
a≡b(modn), thena≡b(modn0).
EXERCISE 2.6. Leta,b,n,n0 ∈Zwithn > 0, n0 > 0, and gcd(n,n0) = 1. Show that ifa≡b(modn) anda≡b(modn0), thena≡b(modnn0).
EXERCISE 2.7. Let a,b,n ∈ Z with n > 0 and a ≡ b (mod n). Show that gcd(a,n)=gcd(b,n).
EXERCISE 2.8. Leta be a positive integer whose base-10 representation isa =
(ak−1· · ·a1a0)10. Letb be the sum of the decimal digits of a; that is, letb := a0+a1+· · ·+ak−1. Show thata≡b(mod 9). From this, justify the usual “rules of