SELECTIONS FROM THE ARITHMETIC GEOMETRY OF SHIMURA CURVES I: MODULAR CURVES

(1)

SHIMURA CURVES I: MODULAR CURVES

PETE L. CLARK

These are notes accompanying a sequence of three lectures given in the UGA num-ber theory seminar in August and Septemnum-ber of 2006. I originally planned to give a single lecture on some Diophantine problems involving Atkin-Lehner twists of modular curves, but have been persuaded by Dino Lorenzini to expand it into three lectures and include Shimura curves as well.

This first lecture is on (classical) modular curves. It is a rather “straight up” expository account of the subject, suitable for people who have heard about mod-ular curves before but seen little about them. After reviewing some truly classical1 results about modular curves over the complex numbers, we will discuss rational and integral canonical models, focusing in on the case of X0(N) for squarefreeN. We hope to end this first lecture by revealing a (presumably) surprising connection between the geometry of modular curves and the arithmetic of quaternion algebras.

1. Uniformization of Riemann surfaces

I will for the most part be making a bunch of interrelated statements of fact. Most of them are nontrivial and the reader should not be alarmed by their inability to supply a proof. Sometimes when some statement follows especially easily from an-other statement, I rapidly indicate a proof.

Convention: All algebraic varieties we meet will be connected and nonsingular. In this section, all varieties will be defined overC. The first fact that we will be taking for granted is that every (nonsingular!) algebaraic variety V/C gives rise to

a complex manifold, whose set of points is V(C). Moreover, the connectedness of

V in the algebraic sense – or, if you like, the connectedness ofV(C) endowed with theZariski topology - is equivalent to the connectedness of the associated complex manifold – or, if you like, the connectedness of V(C) endowed with the analytic topology (which is strictly finer than the Zariski topology unlessV(C) is finite).2 Now suppose C is a complex algebraic curve, so thatC(C) can be endowed with the structure of a Riemann surface (i.e., a one-dimensional complex manifold). On the other hand, not every Riemann surface arises this way. The precise condition for a Riemann surface to arise in this way – to be algebraic – is that it be the complement of a finite set in a compact Riemann surface. In this case the com-pactification is well-determined up to isomorphism – in particular, its genus g(M) 1_{Classical does not imply trivial, of course, nor even devoid of contemporary research interest.} 2_{Note that this is a special property of the complex numbers}_{C. It does not even hold, e.g., for}

varieties overR. In fact the (finite) number of real-analytic components of a (Zariski-)connected varietyV/R is an interesting invariant.

(2)

is well-determined – as is the numberp(M) of points required to compactify. (The corresponding algebraic curve isprojective ifp(M) = 0 and affine ifp(M)>0.) To say a bit more: given any connected Riemann surface M, the uniformization theorem asserts the following: (i) the universal covering ˜M can be endowed with the structure of a Riemann surface in a unique way so that the covering map

π : ˜M → M is a local C-analytic isomorphism; and (ii) there are, up to equiva-lence, precisely three simply connected Riemann surfaces: the Riemann sphereP1_; the Euclidean planeA1_{; and the upper half-plane}_H_.3

P1 _{covers only itself.}4 _{Given an unramified covering} _A1 _→ _M_{, we can view the} deck transformation group Λ as a lattice inA1 _{and hence}_M ₌_A1_/_{Λ. Aside from} the trivial case of Λ = 0, we have the case in which Λ ∼= Z and the quotient is isomorphic toC\0, the multiplicative group; otherwise Λ∼=Z2_and_M _{is an elliptic} curve.

In contrast toP1_and_A1_{, the upper halfplane}_H_{is not itself algebraic, which makes} it a much richer source of Riemann surfaces: indeed, by the process of elimination, every Riemann surface which is notP1_,_A1_, _G

m or an elliptic curve is uniformized

byH: in particular every compact Riemann surface of genus at least 2. 2. Fuchsian groups

In real life, most group actions have fixed points; in particular, it is important to expand our notation of “uniformized Riemann surface” to include (finite) ram-ification. This goes as follows: first, the group of biholomorphic automorphisms ofHisP SL2(R) =SL2(R)/±1 acting by usual linear fractional transformations:

z7→ az+b cz+d.

(Remarks: More generally we have GL2(C) acting on P1_{(C) =} _C_{∪ ∞}_, _GL_2(R) acting onH±₌_P1_(C)_\_P1_{(R), and}_GL_2(R)+_{, the matrices of positive determinant,} acting onH. However, scalar matrices give rise to the trivial linear transformation, andGL2(R)+_/_{scalars =}_{P SL}_2(R).)

By definition, a Fuchsian group Γ is a discrete subgroup ofP SL2(R), so every Fuchsian group acts onH, and we may consider the map

πΓ:H 7→Γ\H=Y(Γ).

It turns out thatY(Γ) can be given the structure of a Riemann surface, uniquely specified by the condition that πΓ be a quotient map – that is, for any Riemann surface M, a functionf :Y(Γ)→M is holomorphic iff its pullbackf◦πΓ toHis holomorphic.5 _{Note well that this map is in general ramified (over a discrete but} 3_{The Riemann mapping theorem asserts that any simply connected proper domain in}_C_is

biholomorphic toH, so can be used in place ofH. Occasionally it is also useful to work with the open unit disk.

4_{Exercise: Give as many different proofs as you can, using (for instance) Euler characteristic,}

the Riemann-Hurwitz theorem, the Lefschetz Fixed point theorem, Luroth’s Theorem. . . 5_{This “categorical” perspective on quotients of manifolds with extra structure seems to be}

better known among European mathematicians, since it is used systematically in the Bourbaki books.

(3)

at the moment possibly infinite set of points).

Remark: In some sense this is a “lucky fact,” turning on the especially simple local structure of ramified finite-to-one maps of Riemann surfaces (up to local bi-holomorphism, there are only the maps z 7→ zn_{). If I am not mistaken, given}

a locally compact group G acting properly discontinuously on a locally compact spaceX, the only general assertion that can be made about the quotient G\X is that it is a Hausdorff space. In particular, the quotient of even a finite groupGof biholomorphic automorphisms of a higher-dimensional complex manifold can have singularities (“orbifold points”).

What is the condition on Γ such thatY(Γ) be an algebraic Riemann surface? It is most easily phrased as follows: there exists onHa very nice hyperbolic6_metric

ds2₌ dxdy

y2 ;

in particular it is P SL2(R)-invariant. (After remarking that conformal automor-phisms of planar regions preserve angles andds2_{, being a rescaling of the Euclidean} metric, measures angles in the same way, what remains to be checked is that linear fractional transformations preserve hyperboliclength, which, if you like, is the ex-planation for why the factor ofy2_{has been inserted in the denominator.) We have} the notion of afundamental domain Dfor Γ – this is the closureD of a connected open subset of Hsuch that the Γ-translates ofD tileH (i.e.,H=S_γ_∈_ΓγD, and distinct translates intersect only along∂D) and of its volumev with respect to the hyperbolic metric.7

Theorem 1. For a Fuchsian groupΓ, the quotient surfaceY(Γ) = Γ\His algebraic iffΓ has finite hyperbolic volume.

The special case in which Y(Γ) is compact – i.e., Γ has a compact fundamental domain – will be especially important in what follows. In older terminology, these are said to be Fuchsian groupsof the first kind.

Example: Take Γ(1) =P SL2(Z). Then, as is well-known, a fundamental domain for Γ can be obtained by taking |<(z)| ≤ 1

2, |z| ≥1, and the identifications along the boundary are such that Γ(1)\His, as a topological space, homeomorphic toA1_. Moreover, the volume is π

3 <∞, so it is isomorphic to A1 as a Riemann surface (and not to the other, nonalgebraic Riemann surface to which it is homeomorphic, namely H). Note that the uniformization map π : H → Y(1) is ramified over two points ofY(1) (it is also, in a sense in which we are trying to avoid discussing in detail, infinitely ramified over the missing point at∞), namely the distinguished points e2πi/4 _and _e2πi/6 _{of the fundamental region. These are, respectively, the} unique fixed points in H of the elementsS = . . . and T =. . .. S and T are, up to conjugacy, the only nontrivial elements of Γ(1) of finite order. It follows that if Γ ⊂ Γ(1) is any subgroup containing no conjugacy class of S or T – and in 6_{We will ignore the temptation – quite strong in August 2006 – to digress on the general topic}

of geometric structures on manifolds.

7_{Since there are many different fundamental domains for a given Γ, it formally speaking must}

(4)

particular if Γ is normal and contains neitherS norT – then the map

πΓ :H →Y(Γ) is the universal covering map.

For anyN ≥1, we define the subgroup Γ(N)⊂P SL2(Z) as the projectivization of the kernel of the (surjective) homomorphism SL2(Z)→ SL2(Z/NZ), Y(N) =

Y(Γ(N)), andX(N) the compactification of Y(N). For N ≥2, Γ(N) has no ele-ments of finite order.

Example: X(2) is the projective line and Y(2) is the projective line with three points removed. It follows that Γ(2) is the fundamental group of the complex pro-jective line with three points removed, i.e., a free group on 2 generators.

Now a remarkable theorem of Belyi states that every (compact, connected, non-singular) algebraic curveC defined over some number field – i.e., overQ– admits a map to P1 _{branched over 3 points. Removing the branch points, we get an} un-ramified cover fromCminus its branch points toY(2). This means that there is a finite index subgroup Γ of Γ(2) such that the compactification of Y(Γ) is isomor-phic toC! In other words, if we allow a modular curve to be the compactification of some finite-index subgroup of the modular group Γ(1), then everyalgebraic al-gebraic curve is a modular curve (and conversely, because the ramification is over only three points, one can see that every modular curve can be defined over some number field).

Although on the face of it this result says that all algebraic curves over Q have a “hidden modular structure,” in practice it is not clear how to exploit this “extra structure” and it seems that we should read this result in the other direction: i.e., to get interesting results we should restrict the class of finite index subgroups of Γ(1). This brings us to the definition of acongruence subgroupΓ⊂Γ(1), which is a subgroup containing Γ(N) for someN.

Exercise 1: Use the above reasoning to explain why there must be many non-congruence finite-index subgroups of Γ(1). (One can in fact show that the probabil-ity that an indexdsubgroup of Γ(1) is a congruence subgroup goes to 0 asd→ ∞.) Exercise 2: Show that there are infinitely many finite-index subgroups Γ ⊂ Γ(1) such thatY(Γ) has genus 0.

Remark: Henceforth we will only discuss noncongruence subgroups as a source of counterexamples, and by a modular curve we shall mean a curve of the form

Y(Γ) for a congruence subgroup Γ⊂2(Z).

Particular congruence subgroups of interest are Γ0(N) and Γ1(N); the correspond-ing curves are denotedY0(N) andY1(N) and their compactified versions asX0(N) andX1(N). We thus have, for anyN, finite covering maps

(5)

as well as, for anyM | N, natural mapsX•(N)→X•(M).

Cusps: Since Y0(1) has genus zero and needs precisely one point in order to be compact, there exists a coordinate function J on X0(1) – i.e., an isomorphism

X0(1) →∼ P1 _{which carries the unique point of} _X₀₍₁₎_\_Y_{0(1) to the point at} _∞_, which carriese2πi/4_{to 1728 and}_e2πi/6_{to 0 (since the automorphism group of}_P1_is triply transitive, we can send any three points anywhere we like). Indeed, the clas-sical theory tells us that the functionτ7→j(Eτ) descends to give such a coordinate function which has especially nice arithmetic properties: in particular, the field of moduli of τ is Q(j(Eτ). We are avoiding mentioning this (important!) classical

construction in the interest of saving time.

Having fixed such a coordinate, we speak of the point at ∞ of X0(1) and call it thecuspofX0(1). Moreover, for any Γ⊂Γ(1), thecuspsonX(Γ) are precisely the preimages of the single cusp∞onX0(1). In other words, the cusps are precisely the points we must add in order to compactifyY(Γ). The theory of Fuchsian groups describes the cusps explicitly in terms fixed points of certain “parabolic” elements of Γ on the boundary RP1∼₌_S1 _of _H_{, another important and useful construction} we will omit for lack of time.

We will be most interested in the family of curves X0(N) for squarefree N. In this case it can be shown that the number of cusps is precisely equal to the number of positive divisors ofN, i.e., 2r _if_N ₌Qr

i=1pi. (This would follow quite easily if only we described cusps in the usual way. Later we will deduce it as a consequence of the structure of the modular automorphism group ofX0(N).)

It is not especially hard to compute the genus ofX0(N). The degree of the covering

X0(N)→X1(N) isψ(N) =Q_i(pi+ 1), so by Riemann-Hurwitz it suffices to figure out the ramification overJ = 0, 1728 and ∞. This can be done in an elementary way, although it is probably more enlightening to do the calculation after one has the moduli interpretation in mind. Anyway, here is the well-known answer: put

e2(N) = Y

p|N

(1 +

µ −4

p

¶

),

e3(N) = Y

p|N

(1 +

µ −3

p

¶

).

Then

g(X0(N)) = 1 +ψ(N) 12 −

e2(N)

4 −

e3(N) 3 −2

r−1_. In other words the genus is roughly N

12, and is indeed asymptotic to it when we bound the number of prime divisors ofN. In particular it tends to∞withN. The (squarefree) values ofN for which X0(N) has genus 0 (i.e., is itself isomorphic to P1_{) are:}

N= 1,2,3,5,6,7,10,13 and the values for which it has genus 1 are

(6)

Remark: For anyN, there are known formulas for the genera of X0(N),X1(N),

X(N) and other similarly related curves. In the case ofX0(N) for non-squarefree

N, the formulas become a bit more complicated (and especially so whenN is divis-ible by 4 or 9). As we shall see later, this is indicative of the fact that theintegral geometry of X0(N) is completely understood when N is squarefree, whereas the case of generalN is a topic of contemporary research interest.

Exercise 3:

a) Suppose p is prime. Show that the P SL2(Fp)-Galois cover X(p) → X(1) is

ramified overJ = 1728 with ramification index 2, overJ = 0 with r. index 3, and overJ =∞with indexp.

b) Using the Riemann-Hurwitz formula, show that

g(X(p)) = 1 + (p

2₋₁₎₍_p₋₆₎

24 .

In particular,X(p) has genus zero ⇐⇒ p≤5.

c) (Not so easy) Show8 _{that the genus 3 curve} _X_{(7) is nothing else than Klein’s} quartic curve, defined e.g. by the equation

X3Y +Y3Z+Z3X = 0.

3. Studying the modular curves over C

Complex geometers tend to study properties of “the generic curve” of genusgrather than specific curves of genus g. Although as arithmetic geometers we care most about the additional arithmetic structure on modular curves to be discussed in the next section, it is worth pointing out that modular curves arealreadyof deep inter-est as compact Riemann surfaces. Here are some sample results and conjectures: Definition: The gonality of a complex (projective, nonsingular, connected) al-gebraic curve is the least degree of a morphism toP1_{. It is easy to see, using the} canonical divisor, that (ifg(C)>1), the gonality is at most 2g(C)−2.9

Theorem 2. (Abramovich) LetΓ⊂P SL2(Z)be a congruence subgroup. Then the gonality ofX(Γ)is at least 7

800[P SL2(Z) : Γ].

Remark: Note that by Exercise 2 (and the fact that there are only finitely many subgroups of Γ(1) of any given finite index) this result is false without the “con-gruence” hypothesis.

It is remarkable that the proof, although rather short, uses deep results from dif-ferential geometry (due to Li and Yau) and spectral theory (Selberg’s celebrated boundλ1≥₁₆3 on the smallest positive eigenvalue of the hyperbolic Laplacian). In the case of X0(N), this gives an linear lower bound on the gonality in terms

8_{For a solution, see Noam Elkies’s beautiful survey article on the arithmetic geometry of the}

Klein quartic.

9_{Actually this holds for the}_k_{-gonality for curves over any field} _k_{; in the special case}_k₌_C,

(7)

of the genus. If one is content with ag12-type lower bound, then there is an

argu-ment due to Ogg which stays within the realm of arithmetic geometry.10

Since the generic curve of genus g will have gonality approximately equal to g, this shows that modular curvesare “typical” in this geometric sense.

However, there is another sense in which they are not typical. Given a (discrete) family Ci of compact Riemann surfaces, one can ask which elliptic curves E are covered by this family, i.e., for which there exists for someia finite mapCi →E. Note that this is equivalent to E occurring up to isogeny as a direct factor of the JacobianJ(Ci). In particular, for any fixedCi, only countably many isomorphism classes of elliptic curves are covered byCi, so assuming our family is itself countable, in all only countably many elliptic curves could be covered by the family (whereas there are uncountably many elliptic curves in all, as we shall shortly see). We might as well assume that eachCihas genus at least 2; then, it seems reasonable to expect thatgenerically they do not cover any elliptic curves at all. (As far as I know this is purely geometric problem – concerning how certain loci of principally polarized abelian varieties with extra endomorphisms intersect the Torelli, or Jacobian, locus – is open.)

On the other hand the elliptic modularity theorem (Wiles-Breuil-Conrad-Diamond-Taylor) asserts that at least every elliptic curve which can be defined overQis cov-ered by someX0(N). Are there other elliptic curves covered byX0(N) (for suitable

N)? Yes indeed: there is a conjectural characterization due to Ribet, which would follow from Serre’s conjecture (i.e., it looks to be pretty close to a theorem). Finally, here is another interesting question with a “classical” geometric flavor: Question 1. Where are the Weierstrass points onX0(N)?

In the 1960’s, Oliver Atkin asked whether, for any squarefreeN, a cusp can be a Weierstrass point ofX0(N).11 _{Atkin showed that this does not happen when}_X₀₍_N₎ has genus 0 and Ogg extended this to the case of X0(pN) where (p, N) = 1 and

X0(N) has genus zero (i.e., infinitely many cases), as well as the case ofX0(11p). (To make things even more intriguing, Atkin showed that whenN is sufficiently far from being squarefree, there are cuspidal Weierstrass points onX0(N).) There are closely related recent results of Ahlgren-Ono and El-Guindy about modpreductions of Weierstrass points. But if they are not cusps, then (as we shall see presently) the Weierstrass points correspond to specific elliptic curves. Which ones? What are their fields of moduli? There are, as far as I know, only two papers addressing these questions, by Silverman and Burnol. Notwithstanding the interesting mathematics that these two papers contain, the subject remains essentially wide open.

10_{Note mostly to myself: I have a paper in which I use this result for}_X1₍_N_{) as well as}_X0₍_N₎

– or, in fact for the Shimura curve analogues thereof – and cited Abramovich’s result rather than Ogg’s for this reason. A while ago, Matt Baker indicated to me that Ogg’s arguments can be made to work forX1(N) (and possibly all congruence subgroups?) as well; at some point I should try to write this up.

11_{Computations of W.A. Stein confirm that for squarefree}_N_{at most 2000 or so cusps are never}

Weierstrass points, so it seems reasonable to upgrade “Atkin’s Question” to “Atkin’s Conjecture.” A proof seems to be totally out of current reach.

(8)

4. Connections with elliptic curves

In factY(1) =P SL2(Z)\Hnaturally parameterizes isomorphism classes of elliptic curves and congruence coverings parameterize isomorphism classes of elliptic curves together with some extra torsion structure.

Let us at least explain the first statement. Fix τ ∈ H. Then we can define an elliptic curve Eτ which is the quotient of C by the lattice Λτ = Z1⊕Zτ. We

don’t get all lattices in this way, but that’s okay, because two lattices Λ1, Λ2 will uniformize isomorphic elliptic curves iff they are homothetic, i.e., if there exists

α∈Csuch that αΛ1= Λ2. Given any basisτ1, tau2for a lattice Λ in C– i.e.,τ1 andτ2 are two elements ofC which areR-linearly independent – we can uniquely rescale to makeτ1= 1 (i.e., divide byτ1!) and thenτ :=τ2/τ1∈C\R. τ is then not necessarily inH; the extra (necessary and sufficient) condition here is that the original basisτ1, τ2 bepositively oriented, a notion which homothety does not dis-turb. Of course any lattice has a positively oriented basis: if the given basis is not positively oriented, interchange the basis elements.

This shows that the map τ 7→Eτ is surjective onto all elliptic curves. More pre-cisely, it gives an isomorphism between Hand the set of all homothety classes of positively oriented lattice bases. The right hand side has “too much structure” – we would rather just have each homothety class appearing once. But we have natural

SL2(Z) actions on both sides: on the left by linear fractional transformations and on the right sinceSL2(Z) acts simply transitively on the set of positively oriented bases for a given lattice. Working this out a bit, one deduces:

Proposition 3. The assignmentτ7→Λτ descends to an isomorphism fromY(1) = P SL2(Z)\H to the set of all isomorphism classes of complex elliptic curves. Thus Y(1) is, at least in some naive sense, a moduli space of elliptic curves.12 _If we mod out by some subgroup of Γ(1), we are then obtaining some structure which is intermediate between the choice of a positively oriented basis for Λτ and just

the homothety class of Λτ (i.e., the elliptic curve itself). When Γ is one of the

congruence subgroups, this extra structure can be nicely interpreted. We will just give the answers:

I.Y(N) = Γ(N)\H parameterizes elliptic curvesE together with an isomorphism

E[N]∼= Λτ[N] = 1/NZ1⊕1/NZτ

Z1⊕Zτ .

Such an isomorphism is called afull level N structureonE.

II. Y1(N) = Γ1(N)\H parameterizes isomorphisms (E, P) 7→ (Λτ,_N1 ·1), or,

in-formally, elliptic curves together with a distinguished point of exact orderN. III. Y0(N) = Γ0(N)\H parameterizes isomorphisms (E, CN)7→ (Λτ,h1

N ·1i), or,

informally, elliptic curves together with a distinguished orderN cyclic subgroup. 12_{It is not within the scope of these lectures to carefully define a (coarse) moduli space. Indeed,}

in a department rife with algebraic geometers, there are plenty of people better qualified than I to explain these ideas.

(9)

One can consider intermediate and composite level structures.

In fact, there is nothing formally to stop us from making the following general-ization: given any subgroup Γ⊂SL2(Z),Y(Γ) parameterizes Γ-orbits of isomor-phisms from a positively oriented basis of a uniformizing lattice forE to Λτ. Thus,

as described earlier, from Belyi’s theorem it follows that every algebraic curve over Qis technically a moduli space of elliptic curves.13

5. Canonical models

We would like to study the modular curves arithmetically; in particular, at least for

X0(N) andX1(N) we would like to know them as algebraic curves overQ(rather than overC, or, by “the obvious direction of Belyi’s theorem,” overQ).

There are many ways of defining canonical Q-models on X0(N) and X1(N) (all compatible with each other). The most classical methods exploit one or both of the following properties of the modular curves Y•(N): (i) they are non-compact,

i.e., there are always cusps; (ii) they all cover the affine line, namelyY(1). Let us briefly recall some of these methods:

1) (q-expansion principle): For a Fuchsian group Γ, define a Γ-modular function

f :H →Cto be a function which is invariant under Γ – so they descend toY(Γ) – and are meromorphic at the cusps – so they extend to functionsX(Γ)→P1_{. Thus} we get a complex-analytic description of the field of functionsC(Y(Γ)) (which, re-call, is enough to determine the compactified curve up to isomorphism). If Γ is a congruence subgroup ofP SL2(Z) then it contains z7→ z+N for someN, and this periodicicity leads to a Fourier series expansion relative to each of the cusps. Consider the subfieldQΓ of modular functions whose Fourier series expansions at each cusp have Q-coefficients, i.e., are elements of Q((q)). Then, if Γ = Γ0(N) or Γ1(N), one can show thatQΓ, which is evidently a subfield ofC(X(Γ)), is a regular extension ofQof transcendence degree 1, so defines aQ-rational model.

Advantages: This generalizes nicely to Hilbert and Siegel modular forms, and is an important part of the theory ofp-adic modular forms.

Disadvantages: If Γ has no cusps, the rationality condition is vacuous and thus

QΓ=C(X(Γ)): no good.

2) (explicit functions approach): Recall that we chose a special function J on

X(1) =C(t) by normalizing its value at the three ramification points of the uni-formization mapH → Y(1). Thus we can certainly put a rational model onY(1) corresponding to the function fieldQ(J).

Begin digression of David Foster Wallacian proportions We could have taken any nonconstant function on Y(1). Why was the J-function well-chosen? Because for τ ∈ H, J(τ) = J(Eτ) = j(Eτ) generates the field of moduli of the 13_{Again, when Γ is a non-congruence subgroup it seems unlikely that this is of any use, but I}

(10)

elliptic curveEτ. Note that this is true forJ iff it is true for any functionJ which we normalize to take values inP1_{(Q) at the three ramification points. Why did we} make the specific choicesJ(epii/4_{) = 1728,}_J₍_e2πi/6_{) = 0? (Or, if you like, why not} 3J or 17J

J+5?) Because then theJ function not only has good rationality properties, it has good integrality properties: (e.g. if j(E)∈Q, thenE has potentially good reduction atpiffj(E)∈Zp), so is thus the correct choice for anintegral canonical

model ofX(1).

The importance of finding the “integrally correct” normalization of the coordinate function on a genus zero modular curve is a topic that the late 19th century number theorists (notably Weber, Fricke, Klein) were well aware of (probably better aware of than most present-day number theorists), a favorable consequence of the more down-to-earth approach to mathematical objects that was the spirit of the day. As far as I know, the classical theorists did indeed find such a distinguished genera-tor of all genus zero modular function fields: such a thing is called aHauptmodul. In some sense, the modern “explanation” for what a Hauptmodul is – namely, a choice of integral canonical model well before the notions of mod p reduction of curves were considered.14 _{– is anachronistic and limited. Rather, if you actually} workexplicitly with modular curves, then certain nice functions naturally present themselves to you, and working with the “naturally nice” functions will turn out to have advantages beyond any particular axiomatic framework. For instance, the Moonshine phenomenon began when John Thompson noticed a simple arithmetic relationship between the Fourier coefficients of theJ-function and the dimensions of the irreducible representations of the Monster (i.e., the largest sporadic simple group). One of the truly puzzling things about Moonshine is that it is a remarkable, but finite, coincidence: there are, of course, only finitely many irreducible represen-tations of the monster which we are comparing to the first finitely many coefficients of J: how do you tell if it is all just some ridiculous concidence? What you can do is look for similar structure among Fourier coefficients of other Hauptmoduls, as did Conway, Norton and McKay, and they found them!15 _{In some sense, then,} although as far as rational canonical models are concerned, 17J

J+5 is as good as J, the Monster prefers J, and similarly for finitely many other genus zero modular function fields. It is all a bit spooky, and (as I have heard Conway himself remark, in paraphrase) although Borcherds’ work is spectacular mathematics, it leaves most of the spookiness of the situation intact.

End D.F.W.-style digression.

Anyway, if J(τ) is the j-invariant of Eτ, then J(N τ) is the j-invariant of Enτ, which is canonically cyclically N-isogenous to E. (Check for yourself.) Thinking about this a bit, it is not too hard to see that the function JN :τ 7→J(N τ) is a Γ0(N)-modular function, and that C(J, JN) =C(Y0(N)), hence it makes sense to define QΓ0(N) as Q(J, JN) and not so hard to show that this gives a Q-model of

14_{I am no historian, but I would be shocked if Weber, Fricke and Klein were, in any reasonable}

sense, interested in characteristicp.

15_{It was left to Richard Borcherds to actually find the mathematics linking these objects, using}

(11)

C(Y0(N)). This is a sort of maximally explicit approach: there is an actual poly-nomial ΦN[X, Y]∈Z[X, Y] out there which gives the algebraic relation betweenJ

andJN, and

ΦN[X, Y] = 0

is an explicit equation forX0(N).

Advantages: This is a maximally elementary approach, understandable if you know only the definitions of elliptic curve, j-invariant and modular function. We have not really used (i) (the existence of cusps): replacing J by any element of Q(J) would not change the function field generated by the modular polynomial.

Disadvantages: We did, of course use (ii), the existence of a mapping down to a genus zero curve. Moreover actually computing the modular polynomial is in-credibly time-consuming (I don’t have precise references here, but unless you are a professional computational number theorist showing off your clever new approach, even something like N = 101 is hopeless), and the model given by ΦN[X, Y] = 0

will have horrible singularities.16

(3) Galois Theory approach: take E to be an elliptic curve over Q(J) with j -invariant J, i.e., a so-called generic elliptic curve. Let KN/Q(J) be the field ex-tension cut out by trivializing the N-torsion on E. One can show that KN/Q(J) is Galois with groupGL2(Z/NZ). Now let us work with congruence subgroups

KN ⊂G˜ ⊂GL2(Z)

where KN is the kernel of the reduction map GL2(Z) → GL2(Z/NZ). Such subgroup ˜G clearly correspond to subgroups G ⊂ GL2(Z/NZ). We can define Q( ˜G) =K_N±G.

Advantages: this approach shows us why we have been staying away from the Q-canonical model of X(N): namely, the full extension KN/Q(J) contains the constant extension Q(ζN). (In general, the theory of the Weil pairing implies that to trivialize theN-torsion on an elliptic curve in characteristic 0, one necessarily also trivializes the N-torsion in the multiplicative group, i.e., all N-torsion fields contain a primitive Nth root of unity, say ζN.) Thus we get a canonical model for X(N) over Q(ζN). (In general, we say that a modular curve Γ has level N if

N is the least positive integer M for which Γ⊂Γ(N). Then the argument shows that any modular curve of levelN is defined overQ(ζN).) Looking more carefully, one can see that the constant extension of Q contained in KG

N(Γ) is cut out by

the image of the determinant map det :GL2(Z/NZ)→(Z/NZ)× _{restricted to}_G_,

and this map is surjective for ˜G= ˜G0(N) (defined in the same way as Γ0(N), i.e., matrices which are upper triangular moduloN) and Γ1(N) (defined as matrices in

·

a b c d

¸

witha∼= 1 (mod N), c∼= 0 (mod N)).

Overall this is a quite enlightening approach, and I recommend reading Rohrlich’s article, which takes this tack to develop much of the basic theory of modular curves. 16_{On the other hand, there is enough literature on computing modular polynomials that there}

(12)

Disadvantages: We are still assuming (ii), that we have a P1 _{at the bottom of} our tower.

(4) Moduli space approach: We have been working complex-analytically with el-liptic curves overCbut there is a perfectly well-defined moduli problem of elliptic curves overQ, whose coarse moduli space gives rise to the canonical modelY(1)Q,

which is indeed just the affine line overQ. Moreover, the moduli problem of isomor-phism classes of elliptic curves overQ-schemes together with a distinguished point of order N (i.e., an isomorphism (E, P) → (E0_{, P}0_{) is an isomorphism of elliptic}

curves (over some base Q-scheme) which carries the point P 7→ P0_{) gives rise to}

the curves Y1(N) and X1(N)/Q. (Having exact order N is a condition that, by

definition if you like, we check on geometric fibers.) Finally, the moduli problem of an elliptic curve together with a cyclic, orderN subgroup scheme (i.e., a finite flat subgroup scheme each of whose geometric fibers yields a cyclic orderN subgroup) gives rise toY0(N) andX0(N) overQ.

Advantages: This will work for Shimura curves over Q, once we define the anal-ogous moduli problem. Moreover it works nicely for Hilbert moduli space, Siegel moduli space (and in general for PEL-type Shimura varieties), and is in general such an important technique in arithmetic and algebraic geometry that one needs to know about it no matter what. Perhaps most importantly, it points the way to a similar approach tointegral canonical models.

Advantage/Disadvantage?: The uniqueness of the model is solved immediately; it is then tempting to slough over the existence problem as being relatively ab-stract nonsense. The latter cannot really be the case: consider, for instance that the moduli problem makes perfect sense even for non-congruence subgroups, so at some point in the proof of the existence of a coarse moduli scheme overQ, we must be making nontrivial use of the congruence subgroup property. (It is not especially mysterious how this goes – the strategy will be (i) to show that X(N) forN ≥3 is a fine moduli space and then (ii) use descent theory to solve the corresponding relative moduli problem – but I don’t want to enter into the details here.) But in fact this may be an advantage: if, as a student of modular curves, you are looking for corners to cut in order to learn the material, then wondering about the details of the existence of an object guaranteed to be unique by general principles and which has been studied by many people over the years, then perhaps the proof that such canonically defined objects do indeed exist is a relatively safe corner to cut! Disadvantage: The method still does not work for the most general quaternionic Shimura curves, which are in fact not moduli spaces, at least not over their ground field. The truly satisfactory theory of rational canonical models of Shimura vari-eties – as laid down by Shimura himself and clarified and extended by Deligne – uses a yet more complicated approach, essentially using the explicit reciprocity law of Shimura-Taniyama for abelian varieties with complex multiplication to predict what the field of definition of each of a certain Zariski-dense countable subset of “special points” should be and then showing that there exists a unique model over an appropriate number field giving the correct field of moduli at each of the special

(13)

points.

How important is this special point construction, both technically in the defin-ition of Shimura varieties (including, to be sure, certain Shimura curves that are rather closely related to the ones we shall discuss) and philosophically in under-standing what’s going on in things like the Andr´e-Oort Conjecture, Heegner point constructions on elliptic curves, work of Mazur, Wiles, Rubin, Bertolini, Darmon, Vatsal, Cornut, Dasgupta.... . .? It is of the highest level of importance. But it is technically demanding enough to be ever so far beyond the scope of these notes.

6. Some enormous theorems

We are now well entitled to considerX0(N) andX1(N) as curves defined overQ. It would be silly to spend so many pages describing the rational canonical models of these curves without reviewing at least some of the spectacular results concerning these models.

First, a (relatively elementary) key observation:

Fact 1. For anyN, there is at least oneQ-rational cusp onX1(N), and hence also onX0(N).

We have not said enough about the cusps in order to get into the proof. But morally, a theme in algebraic geometry is that most naturally occurring moduli spaces are not compact, and in order to compactify them we have to enlarge our moduli problem by considering degenerate, or (hopefully mildly) singular objects. This is possible in our case: we can consider the cuspidal points ofX1(N) as para-meterizing “N´eronN-gons” and from this moduli interpretation it is easy to find that there is a distinguished cusp (still called∞) which isQ-rational.17

Now for the pyrotechnics:

Theorem 4. (Mazur) For any positive integerN, the following are equivalent: a)X1(N)∼=P1_.

a0₎_X₁₍_N₎_{has genus zero.}

b)N ≤10orN = 12. c)Y1(N)(Q) is nonempty.

Remark: The equivalence of a) and a0_{) follows from the existence of} _Q-rational

cusps; the equivalence of these conditions with b) follows from known genus formu-las forX1(N).18 _{That these conditions imply c) is obvious, so the meat (and there} is about a hundred pages worth of meat) is in c) =⇒ b).

Remark: Mazur proved this theorem in 1974. It had long been part of the folklore (for a while it was called Ogg’s conjecture, but it was later pointed out that it went back at least as far as Beppo-Levi).

17_{More elementary proofs are certainly possible.}

18_{Although we did not these write down explicitly, they are not too hard to derive; easier still}

(14)

Theorem 4 leads rather easily to a characterization of all possible torsion sub-groups of rational elliptic curves: namely, cyclic of orderN for the above permitted values ofN orZ/2Z×Z/2νZfor 1≤ν ≤4.

Remark: One can seek to generalize this result by studying points onX1(N) over number fields of higher degree. Work of Kamienny, Abramovich, Merel and Parent culminated in a proof of the uniform boundedness conjecture: there exists a func-tionF(d) such that if K/Qis any number field of degreed, then the order of the torsion subgroup of any elliptic curveE/K is at mostF(d).

Exercise XX: Show that the uniform boundedness of the size of the entire tor-sion subgroup in [K:Q] is equivalent to the finiteness of the set of allN such that

Y1(N)(K)6=∅for all number fields Kof any given degree.

The bounds of Merel and Parent are exponential in d, which is presumably far from the truth, so the problem of finding the optimal bound F(d) remains wide open.19

Yet more impressive is the following:

Theorem 5. (Mazur) Suppose thatN >163is prime. ThenY0(N)(Q) =∅. More precisely, Y0(N)(Q)6=∅iff any one of the following holds:

a)Y0(N)has genus0.

b)Q(√−N) has class number1.

c)N = 17(genus one) or N= 37(genus two).

Remark: To discuss the “easy” half of Mazur’s theorem: if a) holds, thenX0(N) has genus 0 and a rational cusp so isP1_{and hence has infinitely many rational points,} only finitely many of which are cusps. Part b) follows easily from the theory of complex multiplication. The case ofN = 17 is not too surprising, sinceX0(17) is an elliptic curve, and current beliefs are that the “chance” that an elliptic curve has infinitely many rational points is 1

2, whereas rational points on curves of higher genus are thought to be significantly more rare. (On the other hand, the Mordell-Weil group of the elliptic curveX0(17)/Q is not infinite; it isZ/4Z, which means

there are precisely two noncuspidal rational points.) The case ofN= 37 is a famous anomaly which had earlier been studied in a paper of Mazur and Swinnerton-Dyer. Theorem 6. (Elliptic Modularity Theorem) For every elliptic curve E/Q, there exists, for some N, a finite morphismX0(N)→E.

Remark: This is a somewhat more precise form of the result alluded to in§X. One can be even more precise: there exists a finite morhpism X0(N) → E iff N is a multiple of the conductorN(E) ofE, an integer which is (almost) characterized by the following: for any primep≥5,N(E) is exactly divisible byp0 _if_E _{has good} reduction at p, by pif E has semistable (aka multiplicative) bad reduction at p, and byp2 _if _E _{has nonsemistable (aka additive) bad reduction at} _p_{. (Things are} more complicated at 2 and 3.)

Following a complaint/suggestion of Dino Lorenzini, let us discuss a bit of the 19_{Some remarks on this are made in [?] and [?].}

(15)

history of this all-important theorem. To every elliptic curveE/Qone can associate itsL-seriesL(E, s), a certain meromorphic function. In the first half of the twenti-eth century Hecke was able to identify theL-series of an elliptic curve withcomplex multiplication as theL-series obtained from a character of the maximal abelian ex-tension of the CM fieldK (a “Grossencharacter”). This showed in particular that

L(E, s), a priori defined for s > 1 by an Euler product, had much nicer analytic properties: in particular, it extends to an entire function on the complex plane and satisfies a certain functional equation.

In the 1950’s Eichler and Shimura made the following construction: given a weight two modular (new)form f on Γ0(N) with Q-rational Fourier coefficients, they de-fined an elliptic curve E(f)/Q in such a way that L(E(f), s) = L(f, s), where

the latter is the natural Dirichlet series P∞_n₌₁an

ns associated to the modular form P_∞

n=1anqn. This elliptic curve is constructeda priorias a quotient of the Jacobian ofX0(N), so it admits a finite morphism fromX0(N). (This construction general-izes Hecke’s construction in a way that we will not attempt to discuss.) Conversely, it is not so hard to see that any elliptic curveE which is the target of a morphism

X0(N)→Eis obtained via the Eichler-Shimura construction.

It is therefore natural to ask for the image of the Eichler-Shimura construction, i.e., which rational elliptic curves arise in this way? Apparently the bold suggestion thatall rational elliptic curves should arise in this way is due to Yutaka Taniyama. Taniyama’s original thoughts in this direction were vaguer (and the first precision of them turned out to be incorrect), and the precise form of the conjecture was, I believe, developed in collaboration with Shimura. In its early days the conjecture met with great skepticism, which seems a little odd from our modern eyes. (Given

E/Q, to disprove the strong form of the conjecture, namely that for anyE/Q there exists a weight 2 newformf on Γ0(N(E)) such thatL(f, s) =L(E, s), would be a matter of finite computation. Similarly, given anyE/Q, one can computeN(E) and

show that there is at most one newform f of level N(E) whose coefficients match theL-series coefficients ofE. One cannot prove the modularity this way, but one can match theap’s for as many primespas one likes, so where are the grounds for disbelief? Of course, such computations were far from routine back then.)

One of the controversies in this subject is whether and/or to what extent Weil’s name should be associated with this conjecture. I was not around at the time so it’s not for me to try to sort out the history of this, but let me instead point out a very important theorem of Weil’s. Remember that one of the reasons that the modularity conjecture is so exciting is that it implies that theL-functionL(E, s) has desirable analytic properties. What Weil showed is that if you instead postu-late the L(E, s), together with all (or sufficiently many) of its twists by Dirichlet characters, has these properties – namely analytic continuation and a precise func-tional equation – then one candeduce the modularity ofE. This theorem (“Weil’s converse theorem”) is important enough in the subject (and in particular, gives strong philosophical evidence for the conjecture, whether Weil thought so or not!) that I find it appropriate to speak of theTaniyama-Shimura-Weil Conjecture.

(16)

Of course the T-S-W conjecture was proved in the semistable case (squarefree con-ductor) by Andrew Wiles with the assistance of Richard Taylor in 1995, and the full theorem was proven by Christophe Breuil, Brian Conrad, Fred Diamond and Taylor in 1999. Wiles’ work was undoubtedlyun des plus grand tours de force in mathematical history, but like any modern mathematician his work relies on that of many others, in particular that of Mazur and Langlands-Tunnell.

7. Transition to Shimura curves

Why is the modularity theorem so important for the study of thearithmetic(rather than merely than analytic number theory) of rational elliptic curves? Given any elliptic curveE/Q, there is a map

X0(N)→E

defined over Q, so that we can do something remarkable: study a fixed elliptic curve using a family of all elliptic curves. At first it might seem that Theorem XX limits the usefulness of this: ifN > 163, there are no-noncuspidal rational points onX0(N), and moreover one can show that cusps always map to torsion points on

E.

However, using the fact that E is an algebraic group, we can employ the fol-lowing trick: let P ∈ X0(N)(L) be a point defined over any number field L. Then ϕ(P) ∈ E(L). However, if σi : L ,→ Q are all the embeddings, then

P

iσi(P)∈E(Q). In other words, even irrational points onX0(N) produce

ratio-nal points onE!

This is however a bewildering array of riches: if we take a breath and remem-ber that E(Q) is finitely generated, then it must be that for most L and points

P ∈ E(L), we are getting the same points on E, or multiples thereof. Where to look for “special points”? I wish I could remember the reference, but there is an old paper of Birch in which this question occurs non-rhetorically. He mentions the family of points on curves which algebraic geometers regard as special, namely Weierstrass points. Indeed, one can certainly ask the following

Question 2. Letϕ:X0(N(E))→E be an (optimal) modular parameterization of a rational elliptic curve. LetW ⊂E(Q)be the subgroup generated by the images of the traces of Weierstrass points on X0(N(E)). What can be said aboutW? This is a perfectly interesting question. It is however one that we seem not to be ready to answer, because as mentioned earlier we don’t really know “where the Weierstrass points are” onX0(N), nor anything about their arithmetic properties. Better would be a supply of points whose arithmetic we understand. Now comes what is (to my mind) the single most beautiful idea in modern elliptic curve the-ory: in so many ways elliptic curves with complex multiplication are much easier to understand. Let us then, via modular parameterizations, studyall rational elliptic curves using these easiest elliptic curves.20

20_{To make a Langlands-style slogan out of it, let us study}_GL2_{-type arithmetic objects using} GL1=Gm-type arithmetic objects.

(17)

It turns out that the particular class of CM points on X0(N) we want to study is the following: we want pairsE→E0 _{of cyclically}_N_{-isogenous elliptic curves so}

that End(E) and End(E0_{) have CM by the}_same _{order in an imaginary quadratic}

fieldK. Without getting too bogged down into the particulars of CM theory, sup-pose we want both to have CM by the maximal order: then E0 ₌_E/_I_{, where} _I

is an ideal of oK of norm N. The condition then is that every prime dividingN

shouldsplit inK, the so-calledHeegner condition. For any suchK, we can consider the images of Heegner points onE(K). The details are of course beyond the scope of this survey, but suffice it to say that, thanks to beautiful work of Gross-Zagier and Kolyvagin, we get a highly interesting and useful supply of rational points on

E(K) in this way.

But what aboutE(Q)? Well, given any quadratic field, we get modulo 2-torsion, a decomposition ofE(K) into the direct sum ofE(Q) andEχK_{(Q), so the arithmetic}

overK and over Qis closely related. The original perspective on this was thatK

andE(K) were auxiliary data to studyE(Q). The main theorem that comes out of all this is that if the analytic rank ofE/Q is at most 1, then the rank is equal to the analytic rank, and to prove this one can always choose a suitable Ksatisfying the Heegner hypothesis.

On the other hand we would also like to understand the arithmetic of E not just over Q but over all number fields and in particular over all imaginary quadratic fields. What if we are interested in an imaginary quadratic fieldK such that some primes of bad reduction do not split inK? The modern answer is that, at least when

N(E) is squarefree, we consider not just the parameterizationX0(N(E))→E, but for all factorizationsN(E) =N·D, the Shimura curve parameterization

X0D(N)→E.

This is the “mainstream” motivation for studying Shimura curves.

However, just as one can be interested in modular curves in their own right and not just as a supply of rational points on elliptic curves, one can be interested in intrinsic the Diophantine geometry of Shimura curves. In fact, as we shall see, there is one respect in which the Shimura curves aremore interesting than the classical modular curves: their lack of cusps leads to to the possibility that they may fail to have points rational over some given number fieldK and/or over all completions

Kv ofK.

8. Introduction to Shimura curves

Let us go back to Fuchsian groups. In some naive sense, the subgroups of the mod-ular groupP SL2(Z) are the easiest Fuchsian groups (of the first kind) to construct. There are evidently many more, because every compact Riemann surface of genus at least 2 is uniformized by a Fuchsian groupof hyperbolic type(i.e., with a compact fundamental domain and without elements of finite order). Nevertheless it is not so easy to get our hands on them, e.g. to write down generators. For instance, if you give me a hyperelliptic curvey2₌_P₍_x_{) for}_P _∈_Q[_x_{] a polynomial of degree at} least 5, then the corresponding algebraic curve is uniformized by some unique (up to

(18)

P SL2(R)-conjugacy) Fuchsian group of hyperbolic type. Can we write down gen-erators? I don’t know how to do it, and especially there is no easy correspondence between the arithmetic properties of the curve and of the uniformizing Fuchsian group.

8.1. An interesting family of Fuchsian groups. There is one class of examples of Fuchsian groups which one can write down explicitly, which generalizesP SL2(Z): namely, choose any elementsa, b, cin Z+_{∪ ∞}_{such that} 1

a +1b+1c <1 (here our

convention is that _∞1 = 0, of course). Then, through the magic of hyperbolic geom-etry, there is up to isometry a unique hyperbolic triangle with angles π

a, πb, πc. Here

if any ofa,b,care zero then we are getting an “ideal” triangle, namely with angle 0 and with vertex on the boundary of the hyperbolic plane (this is an instance where the unit disc model of the hyperbolic plane is easier to visualize). For instance, if we take (a, b, c) = (2,3,∞) then the corresponding hyperbolic triangle bounds the region inHwhich is the right half of the standard fundamental region forP SL2(Z). Consider the group ˜∆ of isometries ofHgenerated by reflectionsτ1, τ2, τ3through these three sides. Reflections are orientation-reversing, so these are not elements of

P SL2(R). However, there is an index 2 subgroup of orientation-preserving isome-tries which is a Fuchsian group: it is generated byx=τ3◦τ2, y=τ1◦τ3, z=τ2◦τ1 and these are respectively, rotations through the three vertices of the triangle of angles 2π/a, 2π/b,2π/c. We get a very explicit and interesting Fuchsian group ∆(a, b, c) with presentation

∆(a, b, c) =hx, y, z|xa=yb=zc=xyz= 1i.

In particular ∆(2,3,∞) =P SL2(Z) (up to conjugacy) and ∆(a, b, c) is cocompact iffa, b, care all finite.

It is not hard to write down explicit generators for ∆(a, b, c) whose entries lie in

R=Ra,b,c=Z[2 cos(π/a),2 cos(π/b),2 cos(π/c)]. One can then define congruence subgroups with respect to ideals ofR: in particular for every idealN one can define Γ0(N), Γ1(N), Γ(N).

The corresponding family of algebraic curves is, I believe, a very interesting one, but it is not the family that we are meant to be talking about: this is a general-ization of modular curves which is in general different from the Shimura curves: there are precisely 85 triples (a, b, c) such that the curves uniformized by ∆(a, b, c) are Shimura curves. In particular there is, as far as I know, no good theory of Heegner points on a general such curve, so for elliptic curve applications this is the wrong generalization. On the other hand, they are much easier to compute with than arbitrary Shimura curves; in fact, in the burgeoning theory of Shimura curve computations, the vast majority of work concerns those 85 “Shimura” triples. It is an interesting question how much of the arithmetic theory of modular curves and Shimura curves can be generalized to this family of curves. (In fact I have been working on this on and off for several years.)

8.2. Shimura curves (over Q). To get the real thing, then, we need to take a somewhat more abstract perspective. Here is one way to motivate the construction: consider the class of Fuchsian groups whose matrix entries are as close as possible to being integral, without actually being in Z. What could this mean? Well, the

(19)

characteristic polynomial of any element ofSL2(Z) clearly has integral coefficients, so let us (just for the sake of exposition) define aZ-group to be a Fuchsian group Γ⊂SL2(Z) of the first kind such that for allγinΓ, the characteristic polynomial of Γ hasZ-entries. Since the determinant is always 1, this means precisely that we want integral trace. Now the rational canonical form tells us that any particular element γ of a Z-group can up to to conjugacy be expressed as a matrix with Z-entries. However, we may not be able to simultaneously conjugate all elements of Γ intoZ. Indeed, here is what we can say about aZ-group:

Proposition 7. Let Γ be a Z-group. Put Q[Γ] to be the Q-subalgebra of M2(R) generated by the entries of Γ and similarly put R[Γ]to be the R-subalgebra. Then R[Γ] =M2(R).

Since R[Γ] =Q[Γ]⊗QR, this tells us precisely that Q[Γ] is an indefinite rational

quaternion algebra. That is, it is a four-dimensionalQ-algebra, which is simple (no two-sided ideals), whose center is preciselyQ.

Example: Γ = SL2(Z) and Q[Γ] = M2(Q). This is called the split quaternion algebra. Any other rational quaternion algebra is a division algebra.

It is of course, not yet clear that anyZ-groups other thanSL2(Z) and its subgroups of finite index exist. But notice that we can express the construction of a Fuchsian groupSL2(Z) fromM2(Q) in a way which generalizes to arbitrary indefinite ratio-nal quaternion algebras: namely, to say that a ratioratio-nal quaternion algebraB/Q is indefinite is to say that there exists an isomorphismB⊗QR=M2(R) (equality here

means we pick one such isomorphism, and since all such areSL2(R)-conjugate, it doesn’t matter which one we pick), and in particular an injection

B ,→M2(R).

Now select a maximalZ-orderOofB – that is a sub-Z-algebra ofB such that the canonical mapO ×Q→Bis an isomorphism, and which is not properly contained in any other such order – and take Γ(1) to be the units of O which, under the embedding intoGL2(R), have determinant 1 (as in the case ofSL2(Z), they would have to have determinant, or “reduced norm”±1), modulo±1. Now the basic fact is:

Proposition 8. a)Γ(1)⊂P SL2(R) is a Fuchsian group, i.e., is discrete. b) Up toP SL2(R)-conjugacy, Γ(1) is independent of the choices made. c) Except in the case where B=M2(Q),Γ(1) is cocompact.

For any quaternion algebraB/Q, we define its discriminant to be the product of the primespsuch thatB⊗Qpis a division algebra. Then it is a basic (but nontrivial;

it is closely related to classfield theory) fact that there is a unique quaternion alge-bra of any given squarefree discriminant which isindefinite if the number of finite ramified primes is even anddefinite (i.e., B⊗Ris the unique (Hamilton) division quaternion algebra overR). Let us rename Γ(1) ΓD_{(1) where}_D_{is the discriminant}

of the quaternion algebraB. So, for any squarefreeDwhich is divisible by an even number of primes, we get a well-determined complex algebraic curve Y(ΓD_{(1)); if} D= 1 this is justY(1) as before, but forD >1 it is cocompact.

(20)

positive integerN, NO is a 2-sidedO-ideal, so we may define ΓD₍_N_{) to be those}

elements of ΓD_{(1) which are congruent to 1 modulo}_N_O_{. Although this definition}

makes sense for arbitrary N, it is both more transparent and more useful in the case when N is prime to the discriminantD, which we will assume from now on. We then find that ΓD₍₁₎_/_ΓD₍_N₎ ∼₌ _{P SL}_2(Z_/N_{Z) as in the classical case, which}

means that we can also define ΓD

0(N) – i.e., matrices which modulo N become upper triangular – and ΓD

1(N) – i.e., matrices which moduloN become unipotent. Therefore we may define

YD

• (N) = ΓD•(N)\H

and XD

• (N) to be the compactification of Y•D(N) (which, in every case but the

classicalD= 1 case, is the same asYD

• (N)). These are the Shimura curves.

Here is a genus formula for XD

0 (N), for squarefree N, generalizing the case of

D= 1:

g(X0D(N)) = 1 + 1

12ϕ(D)ψ(N)−

e1(D, N)

4 −

e3(D, N)

4 −

e∞(D, N)

2 ,

whereϕis the Eulerφfunction, as beforeψ(N) =Q_p_|_N(p+ 1),e∞is the number

of cusps onXD

0 (N) – namely, the number of positive divisors ofN ifD= 1 and 0 ifD >1 – and finally, form= 1 orm= 3,

em(D, N) = Y

p|D

1− µ

−d(m)

p

¶ Y

q |N

1 +

µ −d(m)

q

¶

.

It is again easy to see that the genus tends to infinity with max(N, D); and in fact, Abramovich’s result holds verbatim here and gives a linear lower bound on the gonality in terms of the genus. An important difference is that there is usually not a projective line on the bottom. Indeed,

g(XD

0 (N)) = 0 ⇐⇒ (D, N) = (6,1), (6,7), (10,1), (22,1);

g(X0D(N)) = 1 ⇐⇒

(D, N) = (6,13), (10,7), (14,1), (15,1), (21,1), (33,1), (34,1), (46,1).

9. Moduli interpretation, canonical models

We would now like to provide a moduli interpretation for the Shimura curves

XD

• (N). Perhaps surprisingly, we will show that these curves are moduli of

cer-tainabelian surfaces; in fact our interpretation will hold even forD = 1, so that we will get a new moduli interpretation of the classical modular curves! (On the other hand, we will be able to see almost immediately how the new interpretation is related to the old interpetation.) Time constraints will cause us to restrict to the case of no level structure – XD_{. To understand the moduli interpretation of the}

level structure forN prime toD, one needs to exploit theO-module structure on

A[N] and use (to put a fancy label on a relatively simple piece of linear algebra) “Morita equivalence.” These ideas figure prominently in my PhD thesis, which I can give you if you really are interested in the details.

Okay, here goes: recall we have chosen an isomorphismB⊗RwithM2(R) (unique up to conjugacy). Earlier we exploited the fact that this yields an embedding of a maximal order (also unique up to conjugacy) intoM2(R). Carrying this one step

(21)

further, we also get an embedding of Ointo M2(C). In particular, for any τ ∈ H, one can consider the column vectorvτ = [τ1]t_in_C2_{, and then also the set}_O_vτ_{. A} (not completely trivial) computation reveals that this is a full lattice inC2_{, so we} get a two-dimensional complex torus

Aτ:=C2_/_O_vτ.

Now recall that unlike elliptic curves, the generic complex torus of dimension at least 2 is not an abelian variety. The necessary and sufficient condition is the ex-istence of a Riemann form. On the other hand, an important general principle is that one can build a Riemann form out of a sufficiently large ring of endomorphisms (“real multiplication” is enough; here we have more than that). To write down the Riemann form is not so difficult by requires an extra bit of data: indeed, once we define a Riemann form we will get a corresponding positive (“Rosati”) involution on the endomorphism algebra, so it turns out that the right thing to do is to specify this involution in advance, which requires a little bit of quaternion arithmetic. (See

§0.3.2 of my thesis for a discussion.)

In any case, things are chosen so that Aτ has O as an endomorphism ring, and moreover positive norm units ofOare just going to change the basis of the lattice. Therefore we get that

YD_{(1) = Γ}D₍₁₎_\H

parameterizes certain abelian surfaces withquaternionic multiplication, i.e., an in-jection

ι:O,→End(A).

We remark that the choice of ι – called a “QM structure” on A – determines a canonical principal polarization but gives (unless D = 1) a finite amount of ad-ditional information: there will in general be more than one QM structure on a pp abelian surface (assuming it has at least one QM structure). Thus a point on

YD_{(1) includes a}_choice _{of QM structure.}

What happens when D = 1? Then B = M2(Q) so we are talking about abelian surfaces which are isogenous to the square of an elliptic curve then. This is in fact what we do: given any elliptic curve,E,E2_{is an abelian surface whose} endo-morphism ring contains M2(Z). The lattice in this case is just the square of the corresponding two-dimensional lattice.

In summary (and with some technical details omitted): YD_{(1) parameterizes}

iso-morphism classes of QM abelian surfaces.

As above, this leads to a canonicalQ-model, since it makes sense to consider QM-abelian schemes over S (where S is a Q-scheme). This gives the uniqueness of theQ-rational model, and the existence follows as for elliptic curves (for instance

XD

1 (N) is a fine moduli scheme iffX1(N) is).

Note that because we, in general, neither cusps to give Fourier expansions nor an obvious map to P1_{, none of the more elementary descriptions of the canonical} rational model will work here.