Think of a whole number. If it’s even, halve it. If it’s odd, multiply it by three and add one. Keep doing the same. You will find you have a series of numbers which go up and down in value, and if you plotted them on a graph they’d often seem to bounce up and down like hailstones in a hailstorm. For this reason, they’re called “hailstone numbers”.

Here’s an example. 5, 16, 8, 4, 2, 1, 4, 2, 1 . . .

Here’s another: 42, 21, 64, 32, 16, 8, 4, 2, 1 . . .

At first glance you might expect the situation to be as follows; the larger the number, the more steps it takes to reach the final cycle of 4, 2, 1 . . . It seems quite simple, but it isn’t. The number 27, for example, takes a hundred and eleven steps to get to the cycle and goes as high as 9 232. There seems to be no way to predict exactly how long the sequence will last before it gets to the cycle, although it is unsurprisingly true that larger numbers, particularly odd ones, tend to take longer. For instance, 27 is an odd number and the number 2596148429267413814265248164610048 also takes a hundred and eleven steps to reach the loop, and is considerably larger than 27. In case you’re interested, 27’s sequence is:

27, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182,

91, 274, 137, 412, 206, 103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395,

1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132, 566,

283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238, 1619, 4858,

2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616,

2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92,

46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, …

All of this is easy to understand. Any numerate person with a late primary school level of maths could make these calculations and test the sequences. We the not particularly mathematically-skilled public have no problem getting this and doing the work on it to a certain extent. It also seems to be pure mathematics so far as I can tell: there seem to be no applications for it at all, except perhaps for that frondy thing at the top of this post which looks nice. Much of my own ability to do maths is blocked by my failure to get to grips with calculus, as I suspect it is for many other people. I don’t consider the obstacle to be entirely insurmountable but I’ve never actually succeeded in vaulting over it. But there’s another aspect to this set of sequences which seems to stymie everyone, no matter how adept at mathematics they might be. This is known as Collatz’s Conjecture.

In maths, a conjecture is a statement which seems to be true on the basis of preliminary evidence but has never been proven or disproven. A famous example is Goldbach’s Conjecture: every even whole number from four upwards is the sum of two prime numbers. This has been demonstrated for every such number up to at least 1000000000000000000, but there’s still no known proof or disproof.

Collatz’s Conjecture is that every whole number treated in this way will eventually collapse into the sequence “4, 2, 1, . . .” rather than either going on forever or being part of a different sequence. It’s clearly true for any power of two, but they get further and further apart the higher you go, and in their case there’s a simple relationship between their size and the number of steps before it happens. You might therefore think that prime numbers which are one less than a power of two, known as Mersenne Primes, would have some kind of relationship but they don’t seem to have. 127, for example, takes forty-six steps to reach the cycle.

If the distributions of the steps for each number are plotted on a graph with a logarithmic scale to the Y-axis, before they collapse into the cycle the movements are pretty close to looking random, although there’s an algorithm to them so they aren’t. It occurs to me that the paths might be similar to the movements of atoms and molecules in an atmosphere thin enough at the surface of a moon or planet to be a collisionless gas, but maybe not. It might also be that it would work as a pseudorandom number generator with that logarithmic step and the omission of the final falls to the cycle. It’s also remarkably easy to write a program to do this. For instance, in BASIC:

10 LET A=<value>

20 IF A DIV 2 * 2=A THEN LET A=A/2: GOTO 20

30 LET A=3*A+1

40 GOTO 20

It can also be implemented easily in assembler using shifting and adding: the above code is actually unnecessarily complex, and this may not be coincidence. John Conway of “Life” fame managed to generalise this sequence to produce a Turing Machine, i.e. a general computer which can, given time, do anything any digital computer can, and this opens it up to comparison with the Halting Problem. The Halting Problem is whether an arbitrary computer program, given an input, will finish running or continue forever. Alan Turing proved that there is no way to show that this will happen. If these sequences are shown to be sufficiently similar to computer programs, the Collatz Conjecture would therefore be shown to be unprovable. Conway came up with an esolang (esoteric programming language) called FRACTRAN which was Turing-complete in 1987, based on this sequence.

The largest number tested for this is 2^{68}, which is 295147905179352825856. Riho Terras was able to prove that almost no number reaches a point below its original value, and limits to the values were arrived at in 1979 and 1994 which showed that the function can rise as slowly as possible.

Although every number tested does gravitate to 4,2,1…, that isn’t enough to prove that it always happens. For instance, it could be that high enough numbers wouldn’t do this, and since there are infinity whole numbers, simply testing every step is impossible. There are two not mutually-exclusive possibilities. One is that there is a number out there somewhere which will never start to fall towards the cycle. The other is that there is a set of numbers which is part of a completely different loop. Both of these could be true, or only one. If there is such a cycle, it’s been proven that it must have at least 17 087 915 members, meaning that it can’t be practically proven by one person doing pen-and-paper calculations. It’s also true that if the calculation is changed to 3x-1, two other cycles appear:

5, 14, 7, 20, 10, 5,…

and:

17, 50, 25, 74, 37, 110, 55, 164, 82, 41, 122, 61,

192, 91, 272, 136, 68, 34, 17,…

There might also be more of them.

There is something of an argument which suggests the Collatz Conjecture is true. It goes like this: one more than thrice an odd number must be even, so it will divide by two. There is then a 50-50 chance that the result will also be even, and therefore also divisible by two. The longer the sequence is, the more likely this is to happen. This is a statistical argument, but some numbers in the sequence are omitted completely in these calculations and others crop up a lot more than they “should”, so the neat bell-curve that might be expected might not be forthcoming.

The numbers do, however, obey Benford’s Law. Benford’s Law was first noticed when books of logarithms in university libraries started to get dirtier towards the front of the books than the back. This is because numbers which begin with smaller digits are much more common than those which begin with larger ones. This applies, for example, to the lengths of rivers, electricity bills, stock and house prices. More than thirty percent of such numbers begin with a one, slightly more than a sixth begin with a two, an eighth begin with a three and so on, until only 4.6% begin with a nine. Benford’s Law does approximately apply to Collatz sequences, and it gets closer the more numbers are included. It’s also true of numbers in any base other than binary. It works best when considering data which span several orders of magnitude and is used to detect fraud in elections and accounting. This presumably means there are computer programs out there on the Dark Web or something which use Benford’s Law to befuddle forensic accountancy. Collatz sequences might find a use there.

Although the Collatz Conjecture seems useless as far as direct applications are concerned, it does have educational value. It shows, for example, how simple formulæ can lead to highly complicated systems, and that there are attractors as mentioned in Happy Catastrophe.

If the same thing is done with negative integers, there are three loops. These are:

-1, -2, -1. . .

-5, -14, -7, -20, -10, -5 . . .

-17, -50, -25, -74, -37, -110, -55, -164, -82, -41, -122, -61, -182, -91, -272, -136, -68, -34, -17 . . .

The image at the top of the post is produced using Collatz sequences. One way of thinking of them is as a tree. All known positive integers tested eventually settle down to the sequence 4, 2, 1 . . . , so this can be seen as something like a trunk or a final confluence of tributaries to a river. There are also other confluences further up the tree. If the transition to an odd number is drawn on this directed graph at a 20° angle in one direction and that to an even one as 8° in the other, you end up with what’s shown above. These angles can be adjusted and you end up with various shapes which look like living organisms such as corals, bryozoa or shrubs.

The oddity about this problem is that it jumps so rapidly from a simple issue which can be understood by anyone who knows basic arithmetic into a problem which has never been solved by the most skilled and advanced mathematicians who have applied themselves to it. This raises the question of why we are able to understand almost anything. Even though there are a huge number of mathematical problems which can and have been solved, it isn’t clear how we are able to do that. It seems that it could easily have turned out that we wouldn’t be able to do maths at all because it all turned out to be too hard, and where would that leave us? Why are we able to solve any mathematical problems? Also, most mathematicians consider this problem to be significant but also one which is likely to absorb all someone’s time without them coming up with a useful result, so it’s also the ultimate time-waster, or at least seems to be. Hence this entire post may or may not have been a waste of time. What do you think?