Lecture Notes 24
In fact, as a result of subsequent work [4], we now have the following stronger result.
In this lecture we will explain what it means for an elliptic curve over Q to be modular
(we will also define the term semistable).
This requires us to delve briefly into the theory of modular forms. Our goal in doing so
is simply to understand the definitions and the terminology; we will omit all but the most
straightforward proofs.
for all $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$.
Example 24.4. The j-function j(τ ) is a weak modular form of weight 0 for SL2 (Z), and
j(N τ ) is a weak modular form of weight 0 for Γ0 (N ). For an example of a weak modular
form of positive weight, recall the Eisenstein series
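$$G_k(\tau) := G_k([1,\tau]) := \sum_{\substack{m,n\in\mathbb{Z}\\ (m,n)\neq(0,0)}} \frac{1}{(m+n\tau)^k},$$
which, for k ≥ 3, is a weak modular form of weight k for SL2(Z). To see this, recall that
SL2(Z) is generated by the matrices $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, and note that
$$G_k(S\tau) = G_k(-1/\tau) = \sum_{\substack{m,n\in\mathbb{Z}\\ (m,n)\neq(0,0)}} \frac{1}{(m - n/\tau)^k} = \sum_{\substack{m,n\in\mathbb{Z}\\ (m,n)\neq(0,0)}} \frac{\tau^k}{(m\tau - n)^k} = \tau^k G_k(\tau),$$
$$G_k(T\tau) = G_k(\tau+1) = G_k(\tau) = 1^k G_k(\tau).$$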
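The transformation under S can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `eisenstein` and the truncation bound `M` are illustrative choices. It sums over the square |m|, |n| ≤ M, which is mapped to itself by the substitution (m, n) → (−n, m) used above, so the truncated sums satisfy the identity up to rounding error.

```python
def eisenstein(k, tau, M=40):
    """Truncated G_k(tau): sum of (m + n*tau)^(-k) over |m|, |n| <= M, (m, n) != (0, 0)."""
    total = 0j
    for m in range(-M, M + 1):
        for n in range(-M, M + 1):
            if m == 0 and n == 0:
                continue
            total += (m + n * tau) ** (-k)
    return total


tau = complex(0.3, 1.2)  # an arbitrary point in the upper half plane
k = 4
lhs = eisenstein(k, -1 / tau)          # G_k(S tau)
rhs = tau ** k * eisenstein(k, tau)    # tau^k G_k(tau)
print(abs(lhs - rhs))                  # difference should be at rounding-error scale
```

For odd k the terms for (m, n) and (−m, −n) cancel, so the same function returns a value that is numerically zero, consistent with the remark on odd weights below.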
If Γ contains −I, then any weak modular form f of weight k for Γ must satisfy $f(\tau) = (-1)^k f(\tau)$,
since −I acts trivially and cτ + d = −1; this implies that when −I ∈ Γ the only weak modular
form of odd weight is the zero function. We are specifically interested in the congruence
subgroup Γ0(N), which contains −I, so we will restrict our attention to modular forms of
even weight, but we should note that for other congruence subgroups such as Γ1(N) that
do not contain −I (for N > 2) there are interesting modular forms of odd weight.
As we saw with modular functions (see Lecture 19), if Γ is a congruence subgroup of
level N, meaning that it contains Γ(N), then Γ contains the matrix $T^N = \begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}$, and every
that contains only integer powers of q, regardless of the level N . This includes the congruence
subgroups Γ0 (N ) and Γ1 (N ) of interest to us. The coefficients an in the q-series for f are
also referred to as the Fourier coefficients of f .
The only modular forms of weight 0 are constant functions. This is the main motivation
for introducing the notion of weight: it allows us to generalize modular functions in an
interesting way, by strengthening their analytic properties (holomorphic on H∗, not just
meromorphic) at the expense of weakening their congruence properties (modular forms of
positive weight are not Γ-invariant due to the factor $(c\tau+d)^k$).
The j-function is not a modular form, since it has a pole at ∞, but the Eisenstein
functions Gk (τ ) are nonzero modular forms of weight k for SL2 (Z) for all even k ≥ 4. For
Γ = SL2 (Z) there is only one cusp to check and it suffices to note that
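$$\lim_{\operatorname{im}\tau\to\infty} G_k(\tau) = \lim_{\operatorname{im}\tau\to\infty} \sum_{\substack{m,n\in\mathbb{Z}\\ (m,n)\neq(0,0)}} \frac{1}{(m+n\tau)^k} = 2\sum_{n=1}^{\infty} \frac{1}{n^k} = 2\zeta(k) < \infty,$$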
(recall that the series converges absolutely, which justifies rearranging its terms).
Definition 24.6. A modular form is a cusp form if it vanishes at all the cusps. Equivalently,
its q-expansion at every cusp has constant coefficient $a_0 = 0$.
Example 24.7. For even k ≥ 4 the Eisenstein series Gk(τ) is not a cusp form, but the
discriminant function
$$\Delta(\tau) = g_2(\tau)^3 - 27 g_3(\tau)^2,$$
with g2(τ) = 60G4(τ) and g3(τ) = 140G6(τ), is a cusp form of weight 12 for SL2(Z); to see
that it vanishes at ∞, note that $j(\tau) = g_2(\tau)^3/\Delta(\tau)$ has a pole at ∞ and g2(τ) does not, so
∆(τ) must vanish (see the proof of Theorem 15.11).
We are specifically interested in the vector space S2 (Γ0 (N )) of dimension g(Γ0 (N )).
Remark 24.9. Those who know a bit of algebraic geometry may suspect that there is a
relationship between the space of cusp forms S2 (Γ0 (N )) and the space of regular differentials
for the modular curve X0 (N ), since their dimensions coincide; this is indeed the case.
Here we are working in the free abelian group Div L generated by the set L of all lattices;
we extend Tn linearly to an endomorphism of Div L (this means $T_n \sum L := \sum T_n L$).
Another family of endomorphisms of Div L are the homothety operators Rλ defined by
Rλ L := λL, (2)
1
One can define Hecke operators more generally on Mk(Γ1(N)), which contains Mk(Γ0(N)), but the
definition is more involved and not needed here.
Remark 24.10. Recall that if E/C is the elliptic curve isomorphic to the torus C/L, the
index-n sublattices of L correspond to n-isogenous elliptic curves. The fact that the Hecke
operators average over sublattices is related to the fact that the relationship between modular
forms and elliptic curves occurs at the level of isogeny classes.
Proof. (i) is clear, as is (ii) if we note that for m ⊥ n there is a bijection between index-mn
sublattices L'' of L and pairs (L', L'') with [L : L'] = n and [L' : L''] = m. For (iii), the first
term on the RHS counts pairs (L', L'') with [L : L'] = p and $[L' : L''] = p^r$, and the second
term corrects for overcounting; see [13, Prop. VII.10] for details.
Proof. By recursively applying (iii) we can reduce any $T_{p^r}$ to a polynomial in $T_p$ and $R_p$,
and any two such polynomials commute (since $T_p$ and $R_p$ commute, by (i)). Moreover,
(i) and (ii) imply that for distinct primes p and q, polynomials in $T_p, R_p$ commute with
polynomials in $T_q, R_q$. Using (ii) and (iii) we can reduce any $T_n$ to a product of polynomials
in $T_{p_i}, R_{p_i}$ for distinct primes $p_i$, and the corollary follows.
we assume ω1 and ω2 are ordered so that ω2 /ω1 is in the upper half plane. Conversely, any
function F : L → C on lattices induces a function τ ↦ F([1, τ]) on the upper half plane.
Viewing our modular form f as a function L → C, we can transform this function by any
see [13, Lem. VII.5.2], for example. If we rescale by $d^{-1}$ to put them in the form [1, ω], we
have ω = (aτ + b)/d. For f ∈ Mk(Γ0(1)) we thus define Tn f as
$$T_n f(\tau) := n^{k-1} \sum_{[[1,\tau]:L]=n} f(L) = n^{k-1} \sum_{ad=n,\ 0\le b<d} d^{-k} f\!\left(\frac{a\tau+b}{d}\right),$$
The corollary implies that we may restrict our attention to the Hecke operators Tp for
p prime. Let us compute the q-series expansion of Tp f, where $f(\tau) = \sum_{n=1}^{\infty} a_n q^n$ is a cusp
Remark 24.16. All the results in this section hold for f ∈ Sk (Γ0 (N )) if we restrict to Hecke
operators Tn with n ⊥ N , which is all that we require, and the key result a1 (Tn f ) = an (f )
holds in general. For p|N the definition of Tp (and Tn for p|n) needs to change and the
formulas in Corollary 24.13 and Theorem 24.14 must be modified. The definition of the
Hecke operators is more complicated (in particular, it depends on the level N ), but some of
the formulas are actually simpler (for example, for p|N we have $T_{p^r} = T_p^r$).
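The key result a1(Tn f) = an(f) can be illustrated in level 1 with the discriminant function. The sketch below rests on two facts that are not stated in this excerpt and are assumed from the standard literature: the product expansion of ∆ (normalized here so that a1 = 1), and the coefficient formula $a_m(T_n f) = \sum_{d \mid \gcd(m,n)} d^{k-1} a_{mn/d^2}(f)$. The function names are illustrative.

```python
from math import gcd


def delta_coefficients(N):
    """Coefficients a[0..N] of q * prod_{n>=1} (1 - q^n)^24, i.e. Delta normalized so a[1] = 1."""
    poly = [0] * N          # poly[i] = coefficient of q^i in prod (1 - q^n)^24, for i < N
    poly[0] = 1
    for n in range(1, N):
        for _ in range(24):  # multiply by (1 - q^n) twenty-four times, truncating at q^(N-1)
            for i in range(N - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly        # multiply by q


def hecke(a, n, k):
    """Coefficients of T_n f from those of f: b_m = sum over d | gcd(m, n) of d^(k-1) * a_{mn/d^2}."""
    M = (len(a) - 1) // n    # largest m for which a[m*n] is available
    b = []
    for m in range(M + 1):
        g = gcd(m, n)
        b.append(sum(d ** (k - 1) * a[m * n // (d * d)]
                     for d in range(1, g + 1) if g % d == 0))
    return b


a = delta_coefficients(60)
t2 = hecke(a, 2, 12)
print(a[1:7])    # first few coefficients of Delta
print(t2[1:7])   # first few coefficients of T_2 Delta
```

Since S12(Γ0(1)) is one-dimensional, ∆ is an eigenform for every Tn; comparing `t2` with `a` exhibits the eigenvalue as the coefficient a2.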
Our goal in this section is to construct a basis of eigenforms for Sk (Γ0 (1)), and prove
that it is unique. In order to do so, we need to introduce the Petersson inner product, which
defines a Hermitian form on the C-vector spaces Sk(Γ) (for any congruence subgroup Γ).
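Recall that for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, we have $\operatorname{im}\gamma\tau = \operatorname{im}\tau/|c\tau+d|^2$, thus for any f, g ∈ Sk(Γ)
we have
$$f(\gamma\tau)\overline{g(\gamma\tau)}(\operatorname{im}\gamma\tau)^k = (c\tau+d)^k f(\tau)\,(c\bar\tau+d)^k\,\overline{g(\tau)} \left(\frac{\operatorname{im}\tau}{|c\tau+d|^2}\right)^{k} = f(\tau)\overline{g(\tau)}(\operatorname{im}\tau)^k.$$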
The function $f(\tau)\overline{g(\tau)}(\operatorname{im}\tau)^k$ is thus Γ-invariant. If we parameterize the upper half-plane
H with real parameters x = re τ and y = im τ, so τ = x + iy, it is straightforward to check
that the measure
$$\mu(U) = \iint_U \frac{dx\,dy}{y^2}$$
is SL2 (Z)-invariant (hence Γ-invariant), that is, µ(γU ) = µ(U ) for all measurable sets U ⊆
H. This motivates the following definition.
One can show that the Hecke operators for Sk (Γ0 (1)) are self-adjoint with respect to
the Petersson inner product, that is, they satisfy $\langle f, T_n g\rangle = \langle T_n f, g\rangle$. The Tn are thus
Hermitian (normal) operators, and we know from Corollary 24.13 that they all commute
with each other. This makes it possible to apply the following form of the Spectral Theorem.
Lemma 24.19. Let V be a finite-dimensional C-vector space equipped with a positive definite
Hermitian form, and let α1, α2, . . . be a sequence of commuting Hermitian operators. Then
$V = \bigoplus_i V_i$, where each $V_i$ is an eigenspace of every $\alpha_n$.
as a direct sum of eigenspaces for α1, writing $V = \bigoplus_i V(\lambda_i)$, where the λi are the distinct
eigenvalues of α1 . Because α1 and α2 commute, α2 must fix each subspace V (λi ), since
for each v ∈ V (λi ) we have α1 α2 v = α2 α1 v = α2 λi v = λi α2 v, and therefore α2 v is an
eigenvector for α1 with eigenvalue λi , so α2 v ∈ V (λi ). Thus we can decompose each V (λi )
as a direct sum of eigenspaces for α2 , and may continue in this fashion for all the αn .
Theorem 24.20. The space of cusp forms Sk(Γ0(1)) is a direct sum of one-dimensional
eigenspaces for the Hecke operators Tn and has a unique basis of eigenforms $f(\tau) = \sum a_n q^n$,
where each an is the eigenvalue of Tn on the one-dimensional subspace spanned by f.
The analog of Theorem 24.20 fails for Sk (Γ0 (N )) for two reasons, both of which are
readily addressed. First, as in Remark 24.16, we need to restrict our attention to the Hecke
operators Tn with n ⊥ N (when n and N have a common factor Tn is not necessarily a
Hermitian operator with respect to the Petersson inner product). We can then proceed as
above to decompose Sk (Γ0 (N )) into eigenspaces for the Hecke operators Tn with n ⊥ N . We
then encounter the second issue, which is that these eigenspaces need not be one-dimensional.
In order to obtain a decomposition into one-dimensional eigenspaces we must restrict our
attention to a particular subspace of Sk (Γ0 (N )).
Note that for any M|N the space Sk(Γ0(M)) is a subspace of Sk(Γ0(N)) (since Γ0(M)-invariance
implies Γ0(N)-invariance for M|N). We say that a cusp form f ∈ Sk(Γ0(N)) is
old if it also lies in the subspace Sk(Γ0(M)) for some M properly dividing N. The oldforms
in Sk(Γ0(N)) generate a subspace $S_k^{old}(\Gamma_0(N))$, and we define $S_k^{new}(\Gamma_0(N))$ as the orthogonal
complement of $S_k^{old}(\Gamma_0(N))$ in Sk(Γ0(N)) (with respect to the Petersson inner product), so
that
$$S_k(\Gamma_0(N)) = S_k^{old}(\Gamma_0(N)) \oplus S_k^{new}(\Gamma_0(N)),$$
and we call the eigenforms in $S_k^{new}(\Gamma_0(N))$ newforms (normalized so a1 = 1). One can show
that the Hecke operators Tn with n ⊥ N preserve both $S_k^{old}(\Gamma_0(N))$ and $S_k^{new}(\Gamma_0(N))$. If we
then decompose $S_k^{new}(\Gamma_0(N))$ into eigenspaces with respect to these operators, the resulting
eigenspaces are all one-dimensional. Moreover, each is actually generated by an eigenform (a
simultaneous eigenvector for all the Tn, not just those with n ⊥ N that we used to obtain
the decomposition); this is a famous result of Atkin and Lehner [3, Thm. 5]. Note that
$S_k^{new}(\Gamma_0(1)) = S_k(\Gamma_0(1))$, and we thus have the following generalization of Theorem 24.20.
Theorem 24.21. The space $S_k^{new}(\Gamma_0(N))$ is a direct sum of one-dimensional eigenspaces for
the Hecke operators Tn and has a unique basis of newforms $f(\tau) = \sum a_n q^n$, where each an
is the eigenvalue of Tn on the one-dimensional subspace spanned by f.
where the an are complex numbers and s is a complex variable. Provided the an satisfy
a polynomial growth bound of the form $|a_n| = O(n^\sigma)$ (as n → ∞), the series L(s)
converges locally uniformly in the right half plane Re(s) > 1 + σ and defines a holomorphic
which converges locally uniformly to a holomorphic function on re(s) > 1. It has three
properties worth noting:
• analytic continuation: ζ(s) extends to a meromorphic function on C (with a simple
pole at s = 1 and no other poles);
• functional equation: the completed zeta function $\hat\zeta(s) = \pi^{-s/2}\Gamma(s/2)\zeta(s)$ satisfies
$\hat\zeta(s) = \hat\zeta(1-s)$;
• Euler product: we can write ζ(s) as a product over primes (for re(s) > 1) via
$$\zeta(s) = \prod_p (1-p^{-s})^{-1} = \prod_p (1 + p^{-s} + p^{-2s} + \cdots) = \sum_{n=1}^{\infty} n^{-s}.$$
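The Euler product can be checked numerically at s = 2, where ζ(2) = π²/6. This is a rough sketch with illustrative function names and arbitrarily chosen truncation bounds; both the partial sum and the partial product only approximate ζ(2).

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def zeta_partial_sum(s, terms):
    """Sum of n^(-s) for 1 <= n <= terms."""
    return sum(n ** (-s) for n in range(1, terms + 1))


def zeta_euler_product(s, prime_bound):
    """Product of (1 - p^(-s))^(-1) over primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product /= 1 - p ** (-s)
    return product


print(zeta_partial_sum(2, 100000), zeta_euler_product(2, 10000), math.pi ** 2 / 6)
```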
Definition 24.24. The L-function (or L-series) of a cusp form $f(\tau) = \sum_{n=1}^{\infty} a_n q^n$ of
weight k is the complex function defined by the Dirichlet series
$$L(f,s) := \sum_{n=1}^{\infty} a_n n^{-s},$$
where χ(p) is 0 if E has bad reduction at p, and 1 otherwise. For primes p where E has
good reduction (all but finitely many), $a_p := p + 1 - \#E_p(\mathbb{F}_p)$ is the trace of Frobenius,
where $E_p$ denotes the reduction of E modulo p. Equivalently, $L_p(T)$ is the numerator of the
zeta function
$$Z(E_p;T) = \exp\left(\sum_{n=1}^{\infty} \#E_p(\mathbb{F}_{p^n})\frac{T^n}{n}\right) = \frac{1 - a_pT + pT^2}{(1-T)(1-pT)},$$
that appeared in the special case of the Weil conjectures that you proved in Problem Set 7.
For primes p where E has bad reduction, the polynomial Lp (T ) is defined by
according to the type of bad reduction that E has at p, as explained in the next section.
This means that ap ∈ {0, ±1} at bad primes.
The L-function L(E, s) converges to a holomorphic function on re(s) > 3/2.
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6,$$
of non-singular points of $E_p(\mathbb{F}_p)$ is closed under the group operation. Thus $E_p^{ns}(\mathbb{F}_p)$ is a
finite abelian group. We now define
$$a_p := p - \#E_p^{ns}(\mathbb{F}_p).$$
This is analogous to the good reduction case in which $a_p = p + 1 - \#E_p(\mathbb{F}_p)$; we have
removed the (necessarily rational) singular point, so we reduce $a_p$ by one.
There are two cases to consider, depending on whether f(x) has a double or triple root
at 0; these two cases give rise to three possibilities for the group $E_p^{ns}(\mathbb{F}_p)$.
$$a_p = \begin{cases} 0 & \text{additive reduction,} \\ +1 & \text{split multiplicative reduction,} \\ -1 & \text{non-split multiplicative reduction.} \end{cases}$$
It can happen that the reduction type of E changes when we consider E as an elliptic
curve over a finite extension K/Q (in which case we are then talking about reduction modulo
primes 𝔭 of K lying above p). It turns out that this can only happen when E has additive
reduction at p, which leads to the following definition.
Definition 24.30. An elliptic curve E/Q is semi-stable if it does not have additive reduction
at any prime.
As we shall see in the next lecture, for the purposes of proving Fermat’s Last Theorem,
we can restrict our attention to semi-stable elliptic curves.
We now observe that the integer coefficients an in the Dirichlet series for L(E, s) satisfy
the recurrence relations listed in (3) for an eigenform of weight k = 2. We have a1 = 1,
$a_{mn} = a_m a_n$ for m ⊥ n, and $a_{p^{r+1}} = a_p a_{p^r} - p\,a_{p^{r-1}}$ for all primes p of good reduction, as
you proved on Problem Set 7. For the primes of bad reduction we have ap ∈ {0, ±1} and it
is easy to check that $a_{p^r} = a_p^r$, which also applies to the coefficients of an eigenform in $S_k^{new}(\Gamma_0(N))$
when p|N (see Remark 24.16).
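These relations determine every an from the ap, which can be computed by counting points. The sketch below is illustrative and makes several assumptions not taken from these notes: the example curve y² + y = x³ − x², whose discriminant is −11 so that 11 is its only bad prime, brute-force point counting, and the function names. It also uses the observation that ap equals p minus the number of affine solutions modulo p at both good and bad primes, since at a bad prime the point at infinity and the single rational singular point cancel in the count.

```python
def count_affine_points(coeffs, p):
    """Number of (x, y) in F_p^2 on y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    a1, a2, a3, a4, a6 = coeffs
    count = 0
    for x in range(p):
        rhs = (x ** 3 + a2 * x * x + a4 * x + a6) % p
        for y in range(p):
            if (y * y + a1 * x * y + a3 * y - rhs) % p == 0:
                count += 1
    return count


def a_prime(coeffs, p):
    """a_p = p + 1 - #E_p(F_p) (good p) or p - #E_p^ns(F_p) (bad p); both equal p - affine count."""
    return p - count_affine_points(coeffs, p)


def dirichlet_coefficients(coeffs, bad_primes, N):
    """Coefficients a[1..N] of L(E, s), built from the a_p via the recurrences in the text."""
    a = [0, 1] + [0] * (N - 1)
    for n in range(2, N + 1):
        p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor of n
        r, m = 0, n
        while m % p == 0:                                   # write n = p^r * m with p not dividing m
            m //= p
            r += 1
        if m > 1:
            a[n] = a[p ** r] * a[m]                         # a_{mn} = a_m a_n for coprime m, n
        elif r == 1:
            a[n] = a_prime(coeffs, p)
        elif p in bad_primes:
            a[n] = a[p] ** r                                # a_{p^r} = a_p^r at bad primes
        else:
            a[n] = a[p] * a[n // p] - p * a[n // (p * p)]   # good-reduction recurrence
    return a


curve = (0, -1, 1, 0, 0)  # (a1, a2, a3, a4, a6) for y^2 + y = x^3 - x^2 (assumed example)
print(dirichlet_coefficients(curve, {11}, 13)[1:])
```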
$$f_E(\tau) = \sum_{n=1}^{\infty} a_n q^n \qquad (q := e^{2\pi i\tau})$$
and it is essential that χ(p) is the same in both cases. For newforms $f \in S_k^{new}(\Gamma_0(N))$ we
have χ(p) = 0 for primes p|N, while for elliptic curves E/Q we have χ(p) = 0 for primes
p|∆min(E). No elliptic curve over Q has good reduction at every prime, so we cannot use
eigenforms of level 1; we need to consider newforms of some level N > 1.
This suggests we take N to be the product of the prime divisors of ∆min (E), but note
that any N with the same set of prime divisors would have the same property, so this doesn’t
uniquely determine N . For semi-stable elliptic curves, it turns out that taking the product
of the prime divisors of ∆min (E) is the correct choice, and this is all we need for the proof
of Fermat’s Last Theorem.
Definition 24.31. Let E/Q be a semi-stable elliptic curve with minimal discriminant
∆min (E). The conductor NE of E is the product of the prime divisors of ∆min (E).
In general, the conductor NE of an elliptic curve E/Q is always divisible by the product
of the primes p|∆min(E), and NE is squarefree if and only if E is semi-stable. For primes
p where E has multiplicative reduction (split or non-split) we have $p \mid N_E$ but $p^2 \nmid N_E$, and when E
has additive reduction at p then $p^2 \mid N_E$, and if p > 3 then $p^3 \nmid N_E$. The primes 2 and 3
require special treatment (as usual): the maximal power of 2 dividing NE may be as large
as $2^8$, and the maximal power of 3 dividing NE may be as large as $3^5$; see [15, IV.10] for the
details, which are slightly technical.
We can now say precisely what it means for an elliptic curve over Q to be modular.
If E/Q is modular, the modular form fE is necessarily a newform in $S_2^{new}(\Gamma_0(N_E))$ with
an integral q-expansion; this follows from the Eichler-Shimura Theorem (see Theorem 24.37).
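Modularity can be observed experimentally for a curve of conductor 11. The sketch below relies on a fact from the standard literature that is not stated in these notes: the curve y² + y = x³ − x² corresponds to the weight-2 newform of level 11 given by the product $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$. The function names and the prime bound are illustrative. It compares the prime-indexed coefficients of this product with the values ap obtained by counting points modulo p.

```python
def eta_product_coefficients(N):
    """Coefficients c[0..N] of q * prod_{n>=1} (1 - q^n)^2 * (1 - q^(11n))^2."""
    poly = [0] * N           # coefficients of the infinite product, truncated at q^(N-1)
    poly[0] = 1
    for n in range(1, N):
        for step in (n, n, 11 * n, 11 * n):   # multiply by (1 - q^n)^2 (1 - q^(11n))^2
            for i in range(N - 1, step - 1, -1):
                poly[i] -= poly[i - step]
    return [0] + poly        # multiply by q


def trace_of_frobenius(p):
    """a_p for y^2 + y = x^3 - x^2, computed as p minus the number of affine solutions mod p."""
    count = sum(1 for x in range(p) for y in range(p)
                if (y * y + y - x ** 3 + x * x) % p == 0)
    return p - count


primes = [p for p in range(2, 60) if all(p % d for d in range(2, int(p ** 0.5) + 1))]
c = eta_product_coefficients(60)
for p in primes:
    print(p, c[p], trace_of_frobenius(p))   # the two columns should agree
```

A finite comparison of this kind is only evidence, not a proof; it is the Modularity Theorem that guarantees agreement at every prime.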
Proof. This is proved in [4], which extends the results in [19, 20] to all elliptic curves E/Q.
Prior to its proof, the conjecture that every elliptic curve E/Q is modular was variously
known as the Shimura-Taniyama-Weil conjecture, the Taniyama-Shimura-Weil conjecture,
the Taniyama-Shimura conjecture, the Shimura-Taniyama conjecture, the Taniyama-Weil
conjecture, or the Modularity Conjecture, depending on the author. Thankfully, everyone
is now happy to call it the Modularity Theorem!
Corollary 24.34. Let E be an elliptic curve over Q. Then L(E, s) has an analytic continuation
to a holomorphic function on C, and the normalized L-function
$$\tilde{L}_E(s) := N_E^{s/2}(2\pi)^{-s}\Gamma(s)L(E,s)$$
satisfies the functional equation
$$\tilde{L}_E(s) = w_E\,\tilde{L}_E(2-s),$$
where wE = ±1.
The sign wE in the functional equation is called the root number of E. If wE = −1 then
the functional equation implies that L̃E (s), and therefore L(E, s), has a zero at s = 1; in
fact it is easy to show that wE = 1 if and only if L(E, s) has a zero of even order at s = 1.
The conjecture of Birch and Swinnerton-Dyer (BSD) relates the order of vanishing of
L(E, s) at s = 1 to the rank of E(Q). Recall that
$$E(\mathbb{Q}) \simeq E(\mathbb{Q})_{tor} \times \mathbb{Z}^r,$$
where E(Q)tor denotes the torsion subgroup of E(Q) and r is the rank of E.
Conjecture 24.35 (Weak BSD). Let E/Q be an elliptic curve of rank r. Then L(E, s) has
a zero of order r at s = 1.
The strong version of the BSD conjecture makes a more precise statement that expresses the
leading coefficient of the Taylor expansion of L(E, s) at s = 1 in terms of various invariants
of E. A proof of even the weak form of the BSD conjecture is enough to claim the Millennium
Prize offered by the Clay Mathematics Institute. There is also the Parity Conjecture, which
simply relates the root number wE in the functional equation for L(E, s) to the parity of r
as implied by the BSD conjecture.
Conjecture 24.36 (Parity Conjecture). Let E/Q be an elliptic curve of rank r. Then the
root number is given by $w_E = (-1)^r$.
References
[1] M. K. Agrawal, John H. Coates, David C. Hunt, Alfred J. van der Poorten, Elliptic
curves of conductor 11, Math. Comp. 35 (1980), 991–1002.
[2] Amod Agashe, Kenneth Ribet, and William A. Stein, The Manin constant, Pure and
Applied Mathematics Quarterly 2 (2006), 617–636.
7
The original results of Eichler and Shimura [17] proved ap (E) = ap (f ) only for primes of good reduction
and did not address the correspondence between the level and the conductor. The correspondence between
the level and conductor was conjectured by Weil but not rigorously proved until 1986 by Carayol [5, §0.8].
8
But there is an optimal representative for each isogeny class; see John Cremona’s appendix to [2].
9
This requires enumerating all solutions to certain Diophantine equations; see [1] and [11] for examples.
[4] Christophe Breuil, Brian Conrad, Fred Diamond, and Richard Taylor, On the modularity
of elliptic curves over Q: wild 3-adic exercises, Journal of the AMS 14 (2001), 843–939.
[5] Henri Carayol, Sur les représentations l-adiques associées aux formes modulaires de
Hilbert, Ann. Sci. École Norm. Sup. (4) 19 (1986), 409–468.
[6] Fred Diamond and Jerry Shurman, A first course in modular forms, Springer, 2005.
[7] Gerd Faltings, Finiteness theorems for abelian varieties over number fields, Inventiones
73 (1983), 349–366.
[9] Michael Laska, An algorithm for finding a minimal Weierstrass equation for an elliptic
curve, Mathematics of Computation 38 (1982), 257–260.
[11] Andrew P. Ogg, Abelian curves of small conductor, J. Reine Angew. Math. 224 (1967),
204–215.
[12] Corentin Perent-Gentil, Associating abelian varieties to weight-2 modular forms: the
Eichler-Shimura construction, Master’s thesis, EPF Lausanne, 2014.
[14] Joseph H. Silverman, The arithmetic of elliptic curves, second edition, Springer, 2009.
[15] Joseph H. Silverman, Advanced topics in the arithmetic of elliptic curves, Springer,
1994.
[16] Lawrence C. Washington, Elliptic curves: Number theory and cryptography, second
edition, Chapman and Hall/CRC, 2008.
[18] Goro Shimura, Introduction to the arithmetic theory of automorphic functions, Publi-
cations of the Mathematical Society of Japan 11, 1971.
[19] Richard Taylor and Andrew Wiles, Ring-theoretic properties of certain Hecke algebras,
Annals of Mathematics 141 (1995), 553–572.
[20] Andrew Wiles, Modular elliptic curves and Fermat’s last theorem, Annals of Mathematics
141 (1995), 443–551.