Main Sequence Page 10
A Start on Prime Numbers

I have mentioned that the order of subjects in the main sequence is whimsical and so here without further ado is the second page devoted to mathematics.  (The first was about dice and probability.)

The subject I'm interested in is arguably the grail of math these days, and one reason I'm particularly interested in it is simply because it's not really clear to me why it's considered so important. I know enough about it to realize that it's weird. It has to do with prime numbers.

The goal on this page is pretty modest, just as the ground some distance away from a mountain can start to slope upwards very gradually. I just intend to run through the proof due to Euler that the sum of the reciprocals of the prime numbers diverges. And maybe do a little computer-based backup investigating. Divergence of course means that the sum in question is unbounded, which is one of these "For every a there is a b" type deals. I'll get to this more in a bit.

I was in High School in the early 1980s when I discovered Scientific American; at that time the recent back issues from the 1970s included Martin Gardner's monthly "Mathematical Games" column. They also included lots of ads for scotch and calculators and build-it-yourself personal computers from Heathkit. A lot of Mathematical Games concentrated on properties of numbers. Back then I was afflicted with a certain dreadful parochial closed-mindedness so this fascination with numbers for their own sake seemed... well, just dumb. Pointless. Trivial. I mean, who cares whether 24 is equal to the sum of it's divisors? So what if 1235 might be equal to 1^1 + 2^2 + 3^3 + 5^5.

Ok, well one of our teachers asked that question, actually: Are there any numbers that fit the mold of "self powerful"? That is, Self-Powerful Numbers (SPNs) would be equal to the sum of each digit raised to the (itself) power. 1 is the first SPN; are there others?

One evening I found myself trying to find SPNs. It was a long calculation to make but there were all these cool little shortcuts that kept popping up to make it simpler. Suddenly I had a little hook into Martin Gardner's fascination with numbers. I decided digging into 'trivial' questions about numbers could be pretty fun; I made a little progress towards being more open-minded.

Also around this time I got my hands on a computer with some graphics so I could use the math I was learning to draw nifty pictures.
hypocycloid

I found I could write computer programs to dig into questions about math as well. For example: A partition of a number breaks it into the sum of some other numbers. This is number theory so we're concerned with integers. A consecutive partition (CP) is a partition in which the summands are consecutive integers. Notice that 1 + 2 + 3 + 4 + 5 is a consecutive partition of 15, for example. So is 7 + 8. Is there a pattern for how many different CPs there are for a given number? How long must they be?

Well this is solvable with a little patience and some algebra and the answer has something to do with a number's prime factorization. So primes make an appearance in the CP problem. Prime numbers prime numbers prime numbers... the ground starts to get a little steeper.

It is often asserted (including here) that prime numbers have an element of randomness. Take a run through the positive integers without any prior expectation of which ones will be prime. As you reach each one, ask it: Are you prime? You get no, yes, yes, no, yes, no, yes, no, no, no, yes, no, yes, no, no, no, yes, no, yes, no, no, no, yes, no, no, no, no, no, yes, no, yes, no, no, no, no, no, yes, no, no, no, yes, no, and so on. A couple things will happen as you go on. First, of course fewer and fewer numbers will be prime as you get higher into the integers. Second, there is no easy pattern for guessing whether a number will be prime or not. You can "answer the question for each prime" by calculating whether or not it is prime. But this is not the same thing as having a rule or set of rules to predict if it is prime.

You can say "I can predict with perfect accuracy whether N is prime if N is even!" This is quite true, but it's also only the starting point of another method of calculation. Suppose you throw out all the even numbers and just look at the odds. Next you'll want to throw out multiples of 3 and 5. Fair enough but this still leaves a lot of numbers to check, in fact an infinite number. No matter which multiples you sieve out you'll still have an infinite number of potential primes to screen. So this techniques is not predictive.

Suppose it is accurate to say that there is no a priori way to determine if a number is prime, that primes appear kind of 'at random'. In this case we seem to be back in the realm of probability and as students coming late to class we can ask what sort of progress has been made in describing the distribution of primes.

Well our predecessors have been hard at work on this general problem. The statistical distribution of primes is described quite well by a set of simple rules (but there is still no precise formula for determining the nth prime). Aesthetically this is the sort of thing we tend to like; a little bit of disorder and mess superimposed over a simple reliably familiar background. And the deeper we dig the harder it appears to get the messy stuff to clean up into a simple obvious pattern. Good stuff!

This remark on aesthetics bears some further elaboration. I think a lot about things being beautiful, and here I've made an assertion that primes are beautiful for being simultaneously simple and complicated. Let me describe a simple idea to compare with this 'weird beauty of primes'. The Fibonacci series is, like the primes, an infinite succession of numbers that get bigger and bigger and further and further apart on average.

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...

Is there a way to predict the nth Fibonacci number? Yes, actually. Unlike the primes, the Fibonacci numbers are so regularly spaced (by a multiplicative factor) that they can be predicted with excellent accuracy. This makes them less interesting than primes, even though they are quite interesting anyway.

I should mention that this factor can be seen in the ratio of successive Fibonacci numbers which is proportional to the golden ratio.  Strangely enough one does not need to begin a Fibonacci series with { 1, 1 } to see this pattern. Start instead with {19, 70000002} and pretty quickly the same ratio of successive terms emerges. Pretty weird all by itself.

Anyway the point is that the Fibonacci series is interesting but (being rather predictable) is not as interesting as the distribution of primes from a certain aesthetic perspective. I'll therefore return to prime numbers.

Let's suppose that the mysterious randomness of prime numbers is what makes them so interesting. It'd be nice to have a picture of what this looks like. So, okay, here is what it looks like for the integers n from 1 to 50. The vertical axis is "number of primes less than n". This is generally abbreviated π(n).

primes to 50
This jumpy curve is not a straight line; it is bending downwards as it goes out. However let's approximate it as a straight line for a second anyway. Since there are 15 primes by the time we reach 50 this gives a slope of about 15/50 or 0.3. Now let's do the same exercise letting n run up to 1000.

primes to 1000 
Again we see the curve bending over and indeed the slope is much more gradual, now only 170/1000 or 0.17. Apparently the higher we make n the more gradual the slope of the π(n) curve. So one last shot, this time letting n run to 10,000,000.

primes 1e7
Here the slope is .065 so the trend continues as expected. Here is our data:

n = 50           slope 0.30
n = 1,000        slope 0.17
n = 10,000,000   slope 0.065

This sort of suggests there might be a pattern or function giving slope(n) which is going to be similar to getting π(n). At least in a general way there might be some hope here.

Below is a detail or blow-up from the previous plot.
primes 1e7 detail 1
There is just enough granularity to show the randomness and jumpiness of π(n) at n just shy of 10 million. There are stretches of up to 50 numbers in a row where the curve is flat; no new primes over that stretch. Finally here is one more exerpt with still more granularity.

primes 1e7 detail 2
The horizontal extent of this plot is 100 consecutive integers, where n is close to 10,000,000. Over this stretch only 7 new primes are found. It's sort of like the atmosphere getting thinner the higher you go.

The series of plots above show that the overall trend is smooth, the fine details are jumpy, and the frequency of occurrence of prime numbers decreases with increasing n. Presumably by choosing n large enough, the slope of the curve π(n) can be made arbitrarily small. In fact characterizing π(n) is the whole game here, and the first obvious question is: Is there an upper bound to π(n), or equivalently, is the number of primes finite?

As you might guess, the first statistical fact about the primes is that there is no largest prime number. This is proven by Euclid who says "Look, assume there are finitely many primes; then we can start from this list and construct a new prime number not on the list! Contradiction, so there must be more than a finite number of primes." He proceeds to do the construction in a very straightforward way and the proof incidentally relies upon the fundamental theorem of arithmetic which states that any number has a unique prime factorization.

So π(n) the number of primes less than n is finite for any particular n but increases without bound as n gets larger. So far so good, but how about going further into the realm of predicting which numbers are prime? If we could find some sort of mathematical construction, no matter how bizarre, that managed to predict the occurrence of prime numbers then... well maybe they would become less mysterious. Anyway such a discovery would surely be quite interesting. And here again our predecessors have visited the topic and one of them has made a guess, or conjecture, or hypothesis about just such a prime number predictor. I don't understand it, but proving it true or false is the most famous problem in mathematics at this time. Understanding the conjecture is what I'm working towards; this is the mountain in the distance.

The guess / conjecture / hypothesis concerns a function defined on most of the complex plane; it is called the zeta-function. To arrive at the zeta function we go backwards in the history of mathematics to Euler who made the first connection between primes and zeta. Which will be the point of this page, to describe this connection. Before I get down to that, however, I should complete the reference to the mountain in the distance. So now shoot forward in time again to the mid 19th century.

This zeta function was originally defined for positive real numbers greater than one. But mathematicians being interested in expanding simple ideas to broader scope went on to expand the zeta function, defining it as a complex-valued function on the complex plane. Now if this expanded zeta function bore some deeper connection with the occurrence of prime numbers it would be a weird thing, like tunneling through hyperspace. In fact it is to my way of thinking like opening an alley door in San Francisco that leads into a building, walking through this building and out the front door, and finding oneself on a street in Paris. This zeta-function -- prime number connection is a wormhole through mathematical space and the guess that it really exists is called the 'Riemann hypothesis'.

I do not want to think "Oh the Riemann hypothesis, whatever it is...something about the zeta function and prime numbers...is pretty cool." Because that's just taking the mathematicians' word for it. I want to understand the nature of the open question (is the hypothesis true?) in terms of understanding primes, understanding the Euler result, understanding the generalization of zeta, and understanding the non-trivial zeros of zeta as... are you ready... something like the Fourier Transform of π(n). WHAT???!!!???!?!?!??!  You have gotta be kidding. Seriously I think I need to understand this in detail. That's the mountain.

But the mountain is still far off. All I've done so far is dredge up some terminology and suggest that there is a wormhole through mathematics connecting the ordered random distribution of prime numbers with some esoteric complex function. So let's get down to Step 1, the Euler proof that SUM(1/pn) diverges, and the "easy" definition of the zeta function. The source material I am following is found here:

The Distribution of Prime Numbers by A.E.Ingham, published originally by Cambridge University Press in 1932.

I'm going to write out the proof in slightly more detail than that provided in the sources so I can follow it without scratch paper. It's quite pretty.

Building a First Tool: Concerning log(1-x).
First I need to build some tools, and the first of these will be an expression for log(1-x).

log u can be defined for u real and u > 0 as:

log u = INTEGRAL(1, u) { dt / t }.

For those of us who learn calculus in the other order, differential first, we learn that (d/dx)(ln x) = 1/x so this integral definition for log looks just fine. I mean why not define it this way? This is called the natural or Naperian logarithm, by the way, and is often written ln(u) to distinguish it from log10. But I'll write log(u).

By the way I'm following Dr. Apostol's remarks in volume one of Calculus.

To interrup myself: Where is this headed? As I say I need a small box of tools in order to fly through the proof that S(x) and P(x) diverge. Let's finish off this first one before actually defining S and P though.

Suppose we define a new variable x in terms of u as follows: u = 1 - x. Or if you like x = 1 - u. So x < 1 since u > 0. Now

log 1-x = INTEGRAL(1, 1-x) { dt / t }.

Next a second change of variables, this time inside the integral as suggested by the first change. Let s = 1 - t. Then t = 1 - s and dt = -ds. If t = 1 then s = 0 and if t = 1-x then s = x. Now we have

log (1-x) = INTEGRAL(0, x) { -ds / (1-s) }.

But let's move the negative sign to the other side:

-log(1-x) = INTEGRAL(0, x) { ds / (1-s) }.

This is "halfway" to finishing up the first necessary tool. The second part is to look at the integrand (1/(1-s)) in terms of a series expansion so that it can be easily integrated. Of course the series will have infinitely many terms so we can't integrate them as a closed-form solution (which would make log(1-x) trivial!) but the point is to get the idea of the pattern.

Start with the identity 1 = 1 and add zero to the right-hand side...

1 = 1 - sn + sn (1 - s) / (1 - s).

Note that 1 - sn can be factored as (1 - s)(1 + s + s2 + s3 + ... + sn-1). Divide everything by (1-s) and we get:

1/(1-s) = 1 + s + s2 + s3 + ... + sn-1 + sn/ (1 - s)

Now we integrate both sides from 0 to x, hence the left-hand side is equal to -log(1-x) from up above. Integrating the right side is easy for all the terms except the last. There are then two things remaining: Show that this is valid in the range of interest, and show that the last term will give a positive contribution to the total integral.

-log(1-x) = x + x2/2 + x3/3 + ... + xn/n + INTEGRAL(0, x) {sn ds / (1 - s) }
 
Left off here.

Sorry, but what was the goal again?
There is a sum of some number of terms and we want to know if it is convergent or divergent in the limit as the number of terms becomes unbounded. In short: What do you get when you add up the reciprocals of all the prime numbers? There is also a related question concerning a product of (other) terms. Both sum and product are defined in finite terms based on a positive real number x; they are thus called respectively S(x) and P(x). Here are their definitions:

S(x) = Σ j = 1 to n   pj-1   where the (pj < x) are all the prime numbers less than x.

So for example if x = 17.4 then S(x) = 1/2 + 1/3 + 1/5 + 1/7 + 1/11 + 1/13 + 1/17. So far so good; the question is: As x -> +inf does S(x) approach a limiting value?

Next:

P(x) = Π i = 1 to m  1/(1-1/pi) where I need to explain the difference between n and m.