Annotation: A discussion of the fundamental ideas behind Selberg's "Elementary proof of the prime-number theorem"
Steve Balady
August 11, 2006
1 Introduction
The study of prime numbers (non-unit numbers whose only proper divisors are units) is both ancient and important: the prime numbers form the core of number theory, necessary for many of the most fundamental results of algebra and cryptography; and it was Euclid who proved their first basic property, their infinitude. But the question of any sort of pattern to these numbers went unsolved for two millennia after Euclid. Shortly before 1800, Gauss and Legendre both stated (though they could not prove) versions of what is today called the prime number theorem, concerning how the primes are distributed among the integers. Gauss claimed that, for large $x$, $\int_2^x \frac{du}{\log u}$ is a good approximation to $\pi(x)$, where $\pi(x)$ is the number of primes less than or equal to $x$ and "good approximation" is taken to mean that the percentage error of this estimate goes to 0 as $x \to \infty$; Legendre suggested $\frac{x}{\log x + B}$ for some numerical constant $B$.
Today, we state the prime number theorem as
$$\lim_{x \to \infty} \frac{\pi(x)}{x / \log x} = 1. \tag{1.1}$$
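As a numerical illustration of (1.1) and of Gauss's integral, the following Python sketch counts primes with a sieve and compares $\pi(x)$ with $x/\log x$ and with $\int_2^x du/\log u$. It is not part of the original paper: the function names (`prime_sieve`, `gauss_integral`, `compare`) and the midpoint-rule integration are choices made here purely for illustration.

```python
import math

def prime_sieve(limit):
    """Return the list of primes <= limit (sieve of Eratosthenes)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out the multiples of i, starting at i*i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(2, limit + 1) if is_prime[n]]

def gauss_integral(x, steps=200_000):
    """Midpoint-rule approximation of the integral of du/log(u) over [2, x]."""
    h = (x - 2) / steps
    return h * sum(1.0 / math.log(2 + (k + 0.5) * h) for k in range(steps))

def compare(x):
    """Return (pi(x), pi(x) / (x / log x), pi(x) / Gauss's integral)."""
    pi_x = len(prime_sieve(x))
    return pi_x, pi_x * math.log(x) / x, pi_x / gauss_integral(x)

if __name__ == "__main__":
    for x in (10**4, 10**5, 10**6):
        print(x, compare(x))
```

At these sizes both ratios move toward 1 as $x$ grows, with Gauss's integral tracking $\pi(x)$ more closely than $x/\log x$ does.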
In the 1850s, the first major progress on a proof of the prime number theorem was made when Chebyshev proved that the limit above must be equal to 1 if it exists at all, and also proved that $\frac{\pi(x)}{x / \log x}$ always lies between two positive constants. These results involved only methods of real variables; but in 1896, using the techniques of complex analysis, Hadamard and de la Vallée Poussin independently proved the prime number theorem. At the time, the proof seemed so thoroughly dependent on complex variables that it was doubted that any techniques of real analysis would be powerful enough to establish the result. But in 1948, Selberg and Erdős (working independently) gave proofs using only real variables, the so-called "elementary" proofs. As it happens, Erdős published first, but his paper was dependent on a fundamental idea due to Selberg; this complication led to a bitter rivalry between Selberg and Erdős for years.
Who deserves the credit is not the point of this paper: instead, my goal is to organize the basic ideas behind Selberg's formula and to indicate how it can be used to prove the prime number theorem. I have called this paper an annotation because its key results follow those of Selberg [2]. My sources for the rather unfortunate history of this proof are Andrews [1] and Shapiro [3]; the basic ideas in this paper are due to Landau, Chebyshev, Möbius, and Selberg; the proofs that I give are usually syntheses of proofs in [1]-[3]. I take no credit for any of these ideas (a good thing, especially since they are all relatively well-known); instead, it is my hope that this organization of available information will help to make Selberg's proof [2] more readily comprehensible to the diligent mathematical tourist. Frankly, the main thing I attempt to show is that Selberg's elementary proof is in fact both fairly elementary and well-motivated.
2 Asymptotics
Before we begin to study the properties of $\pi(x)$, we introduce some notation first proposed by Landau which will make our discussion significantly easier to follow.
1. If $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1$, we call $f$ asymptotic to $g$ and write $f(x) \sim g(x)$.
2. If $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0$, we write $f(x) = o(g(x))$ (read "$f$ is little-oh of $g$").
3. If $\exists k \in \mathbb{R}$ s.t. $\limsup_{x \to \infty} \left| \frac{f(x)}{g(x)} \right| < k$, we write $f(x) = O(g(x))$ (read "$f$ is big-oh of $g$").
If we write $O(g(x))$ alone, we take it to stand for some function which is $O(g(x))$; for example, $O(1)$ stands for a function which is uniformly bounded, and $\sum_{n \le x} O(1) = O(x)$ if the various $O(1)$ functions themselves are uniformly bounded.
In addition, we have
Lemma 2.1.
$$f(x) \sim g(x) \iff f(x) = g(x) + o(g(x)).$$
Proof. If $f(x) = g(x) + o(g(x))$, then $\frac{f(x)}{g(x)} = 1 + \frac{o(g(x))}{g(x)}$; letting $x \to \infty$ gives $f(x) \sim g(x)$. Conversely, $f(x) \sim g(x) \implies \frac{f(x)}{g(x)} - 1 = o(1) \implies f(x) - g(x) = o(g(x))$.
We can now restate the prime number theorem as
$$\pi(x) \sim \frac{x}{\log x}, \tag{2.1}$$
and indeed this is the form in which it is most commonly expressed.
3 Equivalent formulations of the prime number theorem
It turns out that we have very little control over $\pi(x)$ itself. To remedy this, we introduce first the von Mangoldt function $\Lambda(x)$ and then study the asymptotics of the Chebyshev functions $\vartheta(x)$ and $\psi(x)$, which are related to $\pi(x)$ but prove more susceptible to direct approximation because of the property $\log x + \log y = \log xy$. We note here that for the rest of the paper (though we consider this to be a good idea in general), $p$ will always be taken to denote a prime.
$$\Lambda(x) = \begin{cases} 0 & \text{if } x = 1 \\ \log p & \text{if } \exists \alpha \in \mathbb{N} \text{ s.t. } x = p^{\alpha} \\ 0 & \text{otherwise} \end{cases} \tag{3.1}$$
$$\vartheta(x) = \sum_{p \le x} \log p \tag{3.2}$$
$$\psi(x) = \sum_{n \le x} \Lambda(n) \tag{3.3}$$
Since $\log(x)$ is an increasing function, we have the trivial estimates
$$\vartheta(x) = O(x \log x) \quad \text{and} \tag{3.4}$$
$$\psi(x) = O(x \log x). \tag{3.5}$$
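The definitions (3.1)-(3.3) can be evaluated directly for small arguments. The sketch below is a deliberately naive Python transcription (trial division, no sieving); the helper names `smallest_prime_factor`, `von_mangoldt`, `theta` and `psi` are introduced here for illustration and do not come from the paper.

```python
import math

def smallest_prime_factor(n):
    """Smallest prime dividing n (n >= 2), found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n  # n itself is prime

def von_mangoldt(n):
    """Lambda(n) as in (3.1): log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = smallest_prime_factor(n)
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def theta(x):
    """Chebyshev's theta(x) as in (3.2): sum of log p over primes p <= x."""
    return sum(math.log(n) for n in range(2, int(x) + 1)
               if smallest_prime_factor(n) == n)

def psi(x):
    """Chebyshev's psi(x) as in (3.3): sum of Lambda(n) over n <= x."""
    return sum(von_mangoldt(n) for n in range(2, int(x) + 1))

if __name__ == "__main__":
    for x in (10, 100, 1000):
        print(x, theta(x), psi(x), x * math.log(x))
```

One way to read the output: $\psi(x)$ is the logarithm of the least common multiple of $1, \dots, \lfloor x \rfloor$, and both functions stay well below the trivial bound $x \log x$.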
Our first goal is to verify that $\vartheta(x)$ and $\psi(x)$ have any bearing on the prime number theorem.
Theorem 3.1. The following expressions are equivalent:
$$\pi(x) \sim \frac{x}{\log x} \tag{3.6}$$
$$\vartheta(x) \sim x \tag{3.7}$$
$$\psi(x) \sim x \tag{3.8}$$
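Proof. First we show that (3.7) $\iff$ (3.8).
$$\psi(x) = \sum_{p^{\alpha} \le x} \log p = \sum_{\alpha} \sum_{p \le x^{1/\alpha}} \log p. \tag{3.9}$$
Since $2^{\alpha} \le x$, or $\alpha \le \frac{\log x}{\log 2}$, (3.9) implies
$$\psi(x) = \sum_{\alpha = 1}^{\log x / \log 2} \sum_{p \le x^{1/\alpha}} \log p = \sum_{\alpha = 1}^{\log x / \log 2} \vartheta(x^{1/\alpha}) = \vartheta(x) + \vartheta(x^{1/2}) + \vartheta(x^{1/3}) + \dots$$
Since $\alpha \le \frac{\log x}{\log 2}$, we can trivially bound this sum, apply our weak approximation (3.4), and conclude that
$$\psi(x) = \vartheta(x) + O(\vartheta(\sqrt{x}) \log x) = \vartheta(x) + O(\sqrt{x} \log^2 x),$$
$$\frac{\psi(x)}{x} = \frac{\vartheta(x)}{x} + O\left( \frac{\log^2 x}{\sqrt{x}} \right). \tag{3.10}$$
As $x \to \infty$, $\frac{\log^2 x}{\sqrt{x}} \to 0$, so letting $x \to \infty$ in (3.10),
$$\lim_{x \to \infty} \frac{\psi(x)}{x} = \lim_{x \to \infty} \frac{\vartheta(x)}{x}, \tag{3.11}$$
from which it follows immediately that $\psi(x) \sim x$ iff $\vartheta(x) \sim x$.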
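Now we prove the other half of the theorem, that (3.6) $\iff$ (3.7).
$$\vartheta(x) = \sum_{p \le x} \log p \le \sum_{p \le x} \log x = \log x \sum_{p \le x} 1 = \pi(x) \log x,$$
$$\frac{\vartheta(x)}{x} \le \frac{\pi(x)}{x / \log x}. \tag{3.12}$$
Fix $\delta \in (0, 1)$. Then
$$\vartheta(x) - \vartheta(x^{\delta}) = \sum_{p \le x} \log p - \sum_{p \le x^{\delta}} \log p = \sum_{x^{\delta} < p \le x} \log p \ge \delta \log x \sum_{x^{\delta} < p \le x} 1,$$
so that
$$\vartheta(x) - \vartheta(x^{\delta}) \ge \delta \log x \left( \pi(x) - \pi(x^{\delta}) \right). \tag{3.13}$$
Using our weak approximation (3.4) on (3.13), and noting that $\pi(x^{\delta}) = O(x^{\delta})$ for the same reason,
$$\vartheta(x) + O(x^{\delta} \log x) \ge \delta \pi(x) \log x + O(x^{\delta} \log x)$$
$$\vartheta(x) \ge \delta \pi(x) \log x + O(x^{\delta} \log x)$$
$$\frac{\vartheta(x)}{x} \ge \delta \frac{\pi(x) \log x}{x} + O\left( \frac{\log x}{x^{1 - \delta}} \right). \tag{3.14}$$
Now assume that $\vartheta(x) \sim x$. Then by (3.12),
$$\liminf_{x \to \infty} \frac{\pi(x) \log x}{x} \ge 1. \tag{3.15}$$
Likewise, by (3.14),
$$1 \ge \delta \limsup_{x \to \infty} \frac{\pi(x) \log x}{x}; \tag{3.16}$$
this holds for all $\delta \in (0, 1)$. Letting $\delta \to 1$ from below and applying (3.15),
$$1 \ge \limsup_{x \to \infty} \frac{\pi(x) \log x}{x} \ge \liminf_{x \to \infty} \frac{\pi(x) \log x}{x} \ge 1, \tag{3.17}$$
so that $\lim_{x \to \infty} \frac{\pi(x) \log x}{x}$ exists and is equal to 1.
We proceed similarly if we assume that $\pi(x) \sim \frac{x}{\log x}$: (3.12) gives
$$\limsup_{x \to \infty} \frac{\vartheta(x)}{x} \le 1,$$
while (3.14) gives
$$\liminf_{x \to \infty} \frac{\vartheta(x)}{x} \ge \delta.$$
Taking $\delta \to 1$ and combining these last two expressions, we conclude that the limit exists and is 1.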
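The three quantities compared in Theorem 3.1 can be tabulated side by side. The following Python sketch is an illustration only (the names `primes_up_to` and `chebyshev_ratios` are hypothetical); it returns $\pi(x)\log x / x$, $\vartheta(x)/x$, $\psi(x)/x$, and the gap $\psi(x) - \vartheta(x)$ measured against the bound $\sqrt{x}\log^2 x$ used in (3.10).

```python
import math

def primes_up_to(limit):
    """Return the list of primes <= limit (sieve of Eratosthenes)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(2, limit + 1) if is_prime[n]]

def chebyshev_ratios(x):
    """Return (pi(x) log x / x, theta(x)/x, psi(x)/x, scaled psi - theta)."""
    primes = primes_up_to(x)
    theta = sum(math.log(p) for p in primes)
    psi = 0.0
    for p in primes:
        power = p
        while power <= x:  # every prime power p^k <= x contributes log p
            psi += math.log(p)
            power *= p
    gap = (psi - theta) / (math.sqrt(x) * math.log(x) ** 2)
    return len(primes) * math.log(x) / x, theta / x, psi / x, gap

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5, 10**6):
        print(x, chebyshev_ratios(x))
```

A finite table proves nothing about limits, but it does make the inequality (3.12) and the smallness of $\psi(x) - \vartheta(x)$ visible.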
It follows from this theorem that it is enough to prove any of these three forms to establish the prime number theorem, and thus that Selberg is justified in stating (1.1) and (1.2) in [2]. In what follows we will mainly be concerned with $\psi(x)$, and so will be interested in the properties of $\Lambda(x)$. (We note that [2] proves that $\vartheta(x) \sim x$; however, we feel that the properties of $\psi(x)$ are more intuitive, and the theorems we prove in the following are almost identical to those in [2].) First, however, we take a detour to prove a series of formulas which will aid us in analyzing $\psi(x)$.
4 The Möbius Inversion Formula
Continuing our journey through the Greek alphabet, we first introduce the Möbius function $\mu(n)$ and then prove a basic result about it.
$$\mu(n) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } \exists p \text{ s.t. } p^2 \mid n \\ (-1)^r & \text{if } n = p_1 p_2 \dots p_r \text{ for } p_i \text{ distinct primes} \end{cases} \tag{4.1}$$
Lemma 4.1.
$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$
Proof. If $n = 1$, this is trivial; we proceed by induction on the number of distinct prime factors of $n$. If $n$ has precisely one prime factor (i.e., $n = p^a$),
$$\sum_{d \mid n} \mu(d) = \mu(1) + \mu(p) + \mu(p^2) + \dots + \mu(p^a) = 1 - 1 + 0 + \dots + 0 = 0.$$
Now assume that the lemma is true for any number $n'$ with $k$ distinct prime factors; then any $n$ with $k + 1$ distinct prime factors is of the form $n' p^a$ for $(p, n') = 1$. But then
$$\sum_{d \mid n} \mu(d) = \sum_{d \mid n'} \mu(d) + \sum_{d \mid n'} \mu(pd) + \sum_{d \mid n'} \mu(p^2 d) + \dots + \sum_{d \mid n'} \mu(p^a d)$$
$$= \sum_{d \mid n'} \mu(d) - \sum_{d \mid n'} \mu(d) + 0 + \dots + 0 = 0.$$
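Lemma 4.1 is easy to confirm by brute force for small $n$. The Python sketch below computes $\mu(n)$ from definition (4.1) by trial division and sums it over the divisors of $n$; the function names `mobius`, `divisors` and `mobius_divisor_sum` are illustrative choices, not notation from the paper.

```python
def mobius(n):
    """Mobius function mu(n) as in (4.1), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_divisor_sum(n):
    """Left-hand side of Lemma 4.1: the sum of mu(d) over d | n."""
    return sum(mobius(d) for d in divisors(n))

if __name__ == "__main__":
    print([mobius_divisor_sum(n) for n in range(1, 13)])
```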
The Möbius function $\mu(n)$ will be essential in analyzing the Chebyshev functions in light of the
Theorem 4.2. (Möbius Inversion Theorem) For any $f(n)$, $g(n)$, the following expressions are equivalent:
$$f(n) = \sum_{d \mid n} g(d) \tag{4.2}$$
$$g(n) = \sum_{d \mid n} \mu(d) f\left( \frac{n}{d} \right) \tag{4.3}$$
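Proof. If $f(n) = \sum_{d \mid n} g(d)$, then we have that
$$\sum_{d \mid n} \mu(d) f\left( \frac{n}{d} \right) = \sum_{dd' = n} \mu(d) f(d') = \sum_{dd' = n} \mu(d) \sum_{e \mid d'} g(e) = \sum_{deh = n} \mu(d) g(e) = \sum_{eh' = n} g(e) \sum_{d \mid h'} \mu(d) = g(n),$$
where the last step follows by applying Lemma 4.1 to $\sum_{d \mid h'} \mu(d)$.
Conversely, assuming that $g(n) = \sum_{d \mid n} \mu(d) f\left( \frac{n}{d} \right)$,
$$\sum_{d \mid n} g(d) = \sum_{d \mid n} \sum_{d' \mid d} \mu(d') f\left( \frac{d}{d'} \right) = \sum_{d'eh = n} \mu(d') f(e) = \sum_{eh' = n} f(e) \sum_{d' \mid h'} \mu(d') = f(n),$$
with the last step following, as before, by applying Lemma 4.1 to $\sum_{d' \mid h'} \mu(d')$.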
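To see Theorem 4.2 at work on concrete data, the following Python sketch builds $f$ from an arbitrary $g$ via (4.2) and recovers $g$ via (4.3). The test function $g(n) = n^2 + 1$ and the helper names (`divisor_sum`, `mobius_invert`) are assumptions made here for illustration only.

```python
def mobius(n):
    """Mobius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def divisor_sum(g, n):
    """f(n) = sum of g(d) over d | n, as in (4.2)."""
    return sum(g(d) for d in divisors(n))

def mobius_invert(f, n):
    """g(n) = sum of mu(d) f(n/d) over d | n, as in (4.3)."""
    return sum(mobius(d) * f(n // d) for d in divisors(n))

if __name__ == "__main__":
    g = lambda n: n * n + 1               # arbitrary test function
    f = lambda n: divisor_sum(g, n)       # f built from g via (4.2)
    print([(g(n), mobius_invert(f, n)) for n in range(1, 11)])
```

Working with integer-valued $g$ keeps the comparison exact, so the two entries of each printed pair can be compared with `==`.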
Several results that follow depend on well-known asymptotic formulas for certain functions of $\log x$; to prove these here would interrupt the flow of ideas. We state these formulas here without proof: excellent derivations are given in [3].
For some fixed constants $C > 0$ and $C'$,
$$\sum_{n \le x} \frac{1}{n} = \log x + C + O\left( \frac{1}{x} \right), \tag{4.4}$$
$$\sum_{n \le x} \frac{\log n}{n} = \frac{1}{2} \log^2 x + C' + O\left( \frac{\log x}{x} \right), \tag{4.5}$$
$$\sum_{n \le x} \frac{\log \frac{x}{n}}{n} = \frac{1}{2} \log^2 x + C \log x - C' + O\left( \frac{\log x}{x} \right), \tag{4.6}$$
$$\sum_{n \le x} \log \frac{x}{n} = x + O(\log x). \tag{4.7}$$
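The formulas (4.4), (4.5) and (4.7) can be sanity-checked numerically by subtracting the main terms and looking at what is left. The Python sketch below does this for integer $x$; the function name `residuals` is hypothetical. As a known fact not stated in the paper, the constant $C$ in (4.4) is Euler's constant, approximately 0.5772.

```python
import math

def residuals(x):
    """For a positive integer x, return the three differences
       sum 1/n        - log x             (should approach C, by (4.4)),
       sum log(n)/n   - (1/2) log^2 x     (should approach C', by (4.5)),
       sum log(x/n)   - x                 (should be O(log x), by (4.7))."""
    harmonic = sum(1.0 / n for n in range(1, x + 1))
    log_over_n = sum(math.log(n) / n for n in range(1, x + 1))
    log_ratio = sum(math.log(x / n) for n in range(1, x + 1))
    return (harmonic - math.log(x),
            log_over_n - 0.5 * math.log(x) ** 2,
            log_ratio - x)

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, residuals(x), math.log(x))
```

The first two residuals settle quickly, which is what the small error terms $O(1/x)$ and $O(\log x / x)$ predict, while the third grows only like a multiple of $\log x$.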
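We now apply the Möbius inversion formula to some specific sums which will be useful in proving our core result (6.1). Here we use its analogue for functions of a real variable $x \ge 1$: if $f(x) = \sum_{n \le x} g\left( \frac{x}{n} \right)$, then $g(x) = \sum_{n \le x} \mu(n) f\left( \frac{x}{n} \right)$, which follows from Lemma 4.1 in the same way as Theorem 4.2.
1. Take $g(x) \equiv 1$. Then, as in (4.2),
$$f(x) = \sum_{n \le x} 1 = [x] = x + O(1);$$
inverting as in (4.3) and using that $|\mu(n)| \le 1$, we obtain
$$1 = \sum_{n \le x} \mu(n) \left( \frac{x}{n} + O(1) \right) = x \sum_{n \le x} \frac{\mu(n)}{n} + \sum_{n \le x} O(1) = \left( x \sum_{n \le x} \frac{\mu(n)}{n} \right) + O(x),$$
$$x \sum_{n \le x} \frac{\mu(n)}{n} = O(x) \implies \sum_{n \le x} \frac{\mu(n)}{n} = O(1). \tag{4.8}$$
2. Take $g(x) \equiv x$.
$$f(x) = \sum_{n \le x} \frac{x}{n} = x \sum_{n \le x} \frac{1}{n} = x \log x + Cx + O(1) \quad \text{(by (4.4))}.$$
Applying the inversion formula,
$$x = \sum_{n \le x} \mu(n) \left( \frac{x}{n} \log \frac{x}{n} + C \frac{x}{n} + O(1) \right)$$
$$= x \sum_{n \le x} \frac{\mu(n)}{n} \log \frac{x}{n} + Cx \sum_{n \le x} \frac{\mu(n)}{n} + \sum_{n \le x} O(1)$$
$$= x \sum_{n \le x} \frac{\mu(n)}{n} \log \frac{x}{n} + O(x) + O(x) \quad \text{(by (4.8))},$$
so that
$$x \sum_{n \le x} \frac{\mu(n)}{n} \log \frac{x}{n} = O(x),$$
$$\sum_{n \le x} \frac{\mu(n)}{n} \log \frac{x}{n} = O(1). \tag{4.9}$$
3. Finally, take $g(x) \equiv x \log x$.
$$f(x) = \sum_{n \le x} \frac{x}{n} \log \frac{x}{n} = x \sum_{n \le x} \frac{\log \frac{x}{n}}{n} = \frac{x}{2} \log^2 x + Cx \log x - C'x + O(\log x) \quad \text{(by (4.6))}.$$
By the Möbius inversion formula (collecting terms in one step this time),
$$x \log x = \frac{x}{2} \sum_{n \le x} \frac{\mu(n)}{n} \log^2 \frac{x}{n} + Cx \sum_{n \le x} \frac{\mu(n)}{n} \log \frac{x}{n} - C'x \sum_{n \le x} \frac{\mu(n)}{n} + \sum_{n \le x} O\left( \log \frac{x}{n} \right).$$
Finally, applying (4.9) to the second sum, (4.8) to the third sum, and (4.7) to the fourth sum,
$$x \log x = \frac{x}{2} \sum_{n \le x} \frac{\mu(n)}{n} \log^2 \frac{x}{n} + O(x) + O(x) + O(x),$$
$$\sum_{n \le x} \frac{\mu(n)}{n} \log^2 \frac{x}{n} = 2 \log x + O(1). \tag{4.10}$$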
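The three bounds (4.8)-(4.10) can be observed numerically. The sketch below, in Python, sieves the Möbius function up to an integer $x$ and returns the two sums in (4.8) and (4.9) together with the difference between the sum in (4.10) and $2 \log x$; all three should stay bounded as $x$ grows. The names `mobius_sieve` and `mobius_sums` are illustrative.

```python
import math

def mobius_sieve(limit):
    """Return a list mu with mu[n] = Mobius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

def mobius_sums(x):
    """Return (S1, S2, S3 - 2 log x) for the sums in (4.8), (4.9), (4.10)."""
    mu = mobius_sieve(x)
    s1 = s2 = s3 = 0.0
    for n in range(1, x + 1):
        if mu[n]:
            weight = mu[n] / n
            log_ratio = math.log(x / n)
            s1 += weight
            s2 += weight * log_ratio
            s3 += weight * log_ratio ** 2
    return s1, s2, s3 - 2 * math.log(x)

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, mobius_sums(x))
```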
5 Properties of $\Lambda(x)$
We now return to our main goal: improving our asymptotic formula for $\pi(x)$ (equivalently, $\vartheta(x)$ or $\psi(x)$). As we said, we will focus on $\psi(x)$: to do this we need the following two lemmas.
Lemma 5.1.
$$\log n = \sum_{d \mid n} \Lambda(d).$$
Proof. Letting $n = \prod_i p_i^{a_i}$, we have that
$$\log n = \sum_i a_i \log p_i.$$
On the other hand, there are precisely $a_i$ divisors $d$ of $n$ for which $\Lambda(d) = \log p_i$, so that
$$\sum_{d \mid n} \Lambda(d) = \sum_i a_i \log p_i = \log n.$$
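Lemma 5.1 can be checked directly for small $n$. The following Python sketch sums $\Lambda(d)$ over the divisors of $n$ and compares the result with $\log n$ up to floating-point error; the function names `von_mangoldt` and `lemma_5_1_error` are chosen here for illustration.

```python
import math

def von_mangoldt(n):
    """Lambda(n): log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n and n % p != 0:
        p += 1
    if p * p > n:
        return math.log(n)  # n is prime
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def lemma_5_1_error(n):
    """|sum_{d | n} Lambda(d) - log n|, which should vanish up to rounding."""
    total = sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
    return abs(total - math.log(n))

if __name__ == "__main__":
    print(max(lemma_5_1_error(n) for n in range(1, 501)))
```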
The motivation for the next lemma, and indeed the rest of this paper, is the following. The prime numbers themselves are very difficult to work with, which is precisely why the prime number theorem eluded proof for a century after it was first hypothesized. Therefore, we will have a much better chance of proving statements about $\pi(x)$ which do not explicitly require knowledge of the distribution of primes; Selberg's main insight was to construct an asymptotic formula which related a sum over the primes to a function involving only $\log(x)$, which he was then able to extend into a full proof. We proceed in effectively the same manner to find a similar asymptotic formula relating $\psi(x)$ (that is, a sum over $\Lambda(n)$) to a function of $\log(x)$.
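Lemma 5.2.
$$\sum_{d \mid n} \mu(d) \log^2 \frac{n}{d} = \Lambda(n) \log n + \sum_{d \mid n} \Lambda(d) \Lambda\left( \frac{n}{d} \right).$$
Proof.
$$\log^2 n = (\log n)(\log n) = \log n \sum_{d \mid n} \Lambda(d) \quad \text{(by Lemma 5.1)}$$
$$= \sum_{d \mid n} \Lambda(d) \log \frac{dn}{d} = \sum_{d \mid n} \Lambda(d) \log \frac{n}{d} + \sum_{d \mid n} \Lambda(d) \log d$$
$$= \sum_{d \mid n} \Lambda(d) \sum_{d' \mid \frac{n}{d}} \Lambda(d') + \sum_{d \mid n} \Lambda(d) \log d \quad \text{(by Lemma 5.1)}$$
$$= \sum_{dd' \mid n} \Lambda(d) \Lambda(d') + \sum_{d \mid n} \Lambda(d) \log d.$$
Letting $c = dd'$ and replacing $d$ with $c$ in the second sum,
$$\log^2 n = \sum_{c \mid n} \left( \sum_{d \mid c} \Lambda(d) \Lambda\left( \frac{c}{d} \right) + \Lambda(c) \log c \right).$$
Now, applying the Möbius inversion formula (4.3) to this sum with respect to $c$, we conclude that
$$\sum_{c \mid n} \mu(c) \log^2 \frac{n}{c} = \left( \sum_{d \mid n} \Lambda(d) \Lambda\left( \frac{n}{d} \right) \right) + \Lambda(n) \log n.$$
Changing $c$ back to $d$ in the first sum, we obtain Lemma 5.2.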
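As a check on Lemma 5.2, both sides of the identity can be evaluated by brute force for small $n$. The Python sketch below does so with naive trial-division versions of $\mu$ and $\Lambda$; the name `lemma_5_2_sides` and the helpers are illustrative and not taken from the paper.

```python
import math

def prime_factors(n):
    """Return the prime factorization of n as a list of (prime, exponent)."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def mobius(n):
    """mu(n): 0 if a square divides n, else (-1)^(number of prime factors)."""
    factors = prime_factors(n)
    if any(e > 1 for _, e in factors):
        return 0
    return -1 if len(factors) % 2 else 1

def von_mangoldt(n):
    """Lambda(n): log p if n is a power of the prime p, else 0."""
    factors = prime_factors(n)
    return math.log(factors[0][0]) if len(factors) == 1 else 0.0

def lemma_5_2_sides(n):
    """Return (left-hand side, right-hand side) of Lemma 5.2 for this n."""
    divs = [d for d in range(1, n + 1) if n % d == 0]
    left = sum(mobius(d) * math.log(n // d) ** 2 for d in divs)
    right = von_mangoldt(n) * math.log(n) + sum(
        von_mangoldt(d) * von_mangoldt(n // d) for d in divs)
    return left, right

if __name__ == "__main__":
    for n in (6, 8, 12, 30):
        print(n, lemma_5_2_sides(n))
```

For $n = 6$, for instance, both sides reduce to $2 \log 2 \log 3$, since $\Lambda(6) = 0$ and only the divisors 2 and 3 contribute to the convolution.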
6 The symmetry formula
Lemma 5.2 gives us fundamental knowledge about $\Lambda(n)$ for individual $n$: it follows that if we sum this information over all $n \le x$, we should obtain information about sums of $\Lambda(n)$. Selberg [2] calls attention to this as the "basic new thing" in his proof; [3] calls it the symmetry formula.
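Theorem 6.1. (The symmetry formula)
$$\sum_{n \le x} \Lambda(n) \log n + \sum_{mn \le x} \Lambda(m) \Lambda(n) = 2x \log x + O(x).$$
Proof. Since Lemma 5.2 is true for all $n$, we may sum it over $n \le x$ to obtain
$$\sum_{n \le x} \sum_{d \mid n} \mu(d) \log^2 \frac{n}{d} = \sum_{n \le x} \Lambda(n) \log n + \sum_{n \le x} \sum_{d \mid n} \Lambda(d) \Lambda\left( \frac{n}{d} \right). \tag{6.1}$$
Let $n = mn'$; then the third term in (6.1) becomes
$$\sum_{mn' \le x} \Lambda(m) \Lambda(n');$$
replacing $n'$ with $n$ gives us the second term in Theorem 6.1, so we need only show
$$\sum_{n \le x} \sum_{d \mid n} \mu(d) \log^2 \frac{n}{d} = 2x \log x + O(x).$$
Letting $n = dd'$,
$$\sum_{n \le x} \sum_{d \mid n} \mu(d) \log^2 \frac{n}{d} = \sum_{dd' \le x} \mu(d) \log^2 d' = \sum_{d \le x} \mu(d) \sum_{d' \le \frac{x}{d}} \log^2 d'$$
$$= x \sum_{d \le x} \frac{\mu(d)}{d} \log^2 \frac{x}{d} - 2x \sum_{d \le x} \frac{\mu(d)}{d} \log \frac{x}{d} + 2x \sum_{d \le x} \frac{\mu(d)}{d} + O\left( \sum_{d \le x} \log^2 \frac{x}{d} \right), \tag{6.2}$$
with the last step following from the estimate $\sum_{m \le y} \log^2 m = y \log^2 y - 2y \log y + 2y + O(\log^2 y)$, which is obtained in the same way as (4.5). Now, by (4.10), the first term is $2x \log x + O(x)$. By (4.9), the second term is $O(x)$. By (4.8), the third term is $O(x)$. And by an estimate analogous to (4.7), the last term is $O(x)$. Then, finally, (6.2) reduces to $2x \log x + O(x)$, which is what we needed.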
It can become easy to get bogged down in the specific mechanics of this proof, which is why the bulk of the heavy asymptotic lifting was done in Section 4. The main idea here should be clear: we have now expressed a sum over $\Lambda(n)$ asymptotically in terms of $\log(x)$, which is a huge step forward. If we take for granted the formula in [3] that $\sum_{n \le x} \Lambda(n) \log n = \psi(x) \log x + O(x)$, then we can express Theorem 6.1 in the following way, which corresponds almost exactly to (1.3) in [2]:
$$\psi(x) \log x + \sum_{n \le x} \Lambda(n) \psi\left( \frac{x}{n} \right) = 2x \log x + O(x).$$
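The symmetry formula lends itself to a direct numerical check. The Python sketch below tabulates $\Lambda(n)$ and $\psi(n)$ up to an integer $x$, evaluates the left-hand side of Theorem 6.1 in the form $\sum \Lambda(n) \log n + \sum \Lambda(n) \psi(x/n)$, and returns the remainder after subtracting $2x \log x$, divided by $x$. The names `lambda_table` and `symmetry_remainder` are hypothetical; if the theorem holds, the returned value should stay bounded as $x$ grows.

```python
import math

def lambda_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = 1
            power = p
            while power <= limit:      # Lambda(p^k) = log p
                lam[power] = math.log(p)
                power *= p
    return lam

def symmetry_remainder(x):
    """(LHS of Theorem 6.1 - 2 x log x) / x for a positive integer x."""
    lam = lambda_table(x)
    psi = [0.0] * (x + 1)              # psi[n] = sum of Lambda(k) for k <= n
    for n in range(1, x + 1):
        psi[n] = psi[n - 1] + lam[n]
    first = sum(lam[n] * math.log(n) for n in range(2, x + 1))
    # sum over mn <= x of Lambda(m) Lambda(n) = sum over n of Lambda(n) psi(x/n)
    second = sum(lam[n] * psi[x // n] for n in range(2, x + 1))
    return (first + second - 2 * x * math.log(x)) / x

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, symmetry_remainder(x))
```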
7 Toward the prime number theorem
With the symmetry formula established, a sketch of the remainder of Selberg's proof follows. First we define the error term $R(x)$ by the equation $\psi(x) = x + R(x)$; then the theorem is proved if we show that $R(x) = o(x)$, since this will imply by Lemma 2.1 that $\psi(x) \sim x$. We establish, relying heavily on the symmetry formula, the inequality
$$|R(x)| \log^2 x \le 2 \sum_{n \le x} \log n \left| R\left( \frac{x}{n} \right) \right| + O(x \log x) \tag{7.1}$$
which has the virtue of not relating directly to the primes and is consequently easier to deal with. From there, we use $R(x)$ to find intervals farther than any arbitrary distance from 0 on which the prime number theorem is approximately true: that is, there exists a constant $K$ such that for any $\delta < 1$ and $x > 4$, there exists $y \in [x, e^{K/\delta} x]$ for which
$$\left| \frac{R(y)}{y} \right| < \delta.$$
From there, we apply the fundamental inequality (7.1) iteratively over these patches, each time giving us an estimate $\sigma_n$ such that $\left| \frac{R(y)}{y} \right| < \sigma_n$, where $\sigma_i = \sigma_{i-1} - a \sigma_{i-1}^3$ for some fixed $a > 0$, so that $\sigma_i \to \sigma$ for some $\sigma$. Taking the limit as $i \to \infty$, we find that $\sigma = \sigma - a \sigma^3$, and thus that $\sigma = 0$, establishing the theorem.
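The final step rests on the elementary fact that a recursion of the form $\sigma_i = \sigma_{i-1} - a\sigma_{i-1}^3$ drives $\sigma_i$ to 0. The Python sketch below simply iterates such a recursion; the starting value, the constant $a = 0.5$, and the name `iterate_bound` are arbitrary assumptions made here for illustration and are not the constants that arise in Selberg's argument.

```python
def iterate_bound(sigma0, a, steps):
    """Iterate sigma -> sigma - a * sigma**3 and return the whole sequence.

    Assumes 0 < a * sigma0**2 < 1, so the sequence stays positive and
    strictly decreasing."""
    sigma = sigma0
    history = [sigma]
    for _ in range(steps):
        sigma = sigma - a * sigma ** 3
        history.append(sigma)
    return history

if __name__ == "__main__":
    seq = iterate_bound(1.0, 0.5, 100_000)
    for i in (0, 10, 100, 1_000, 10_000, 100_000):
        print(i, seq[i])
```

The decay is slow (roughly like $1/\sqrt{2ai}$), which is consistent with the fact that this argument by itself yields $R(x) = o(x)$ without a strong explicit error term.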
References
[1] Andrews, George E. Number Theory. New York: Dover, 1994.
[2] Selberg, A. An elementary proof of the prime-number theorem. The Annals of Mathematics, 2nd Ser., Vol. 50, No. 2 (Apr., 1949), pp. 305-313.
[3] Shapiro, Harold N. Lectures on the theory of numbers. New York: New York University Institute for Mathematics and Mechanics, 1948.