Backing the losers
I’ve a tendency to back losers. For instance, Prefab Sprout and The The were my favourite bands when I was in my late teens, and both were markedly unsuccessful. In keeping with this trend, I used to have a Jupiter Ace computer, something I’ve alluded to a few times on this blog. Jupiter Cantab, the company which designed and made it, had, I think, a total of five employees, seemed to work out of a garage and a suburban house in Cambridgeshire, and had a turnover of around five thousand pounds. They went bust maybe a year or two after releasing the Ace in October 1982 CE and had to sell off all their old stock, a happy thing for me as it meant I could acquire a new computer. Its hardware was very basic even for late ’82, but its firmware decidedly was not. Unlike practically every other home computer of the time, the Ace used not BASIC but FORTH as its native programming language. Perversely, I considered writing a BASIC interpreter for it for a while, but it seemed a bit silly so I didn’t.
FORTH, unlike BASIC as it was at the time, was considered a “proper” programming language. It has two distinctive features. One is that it uses a data structure known as the stack, which is a list of numbers in consecutive locations in memory, presented to the user as having a top and a bottom. Words in the FORTH language usually take data off the top of the stack, operate on them and may leave one or more results on it. This makes the syntax like Latin, Turkish or Sanskrit rather than English, with the verb coming at the end: instead of writing “2+2”, you write “2 2 +”, which leaves 4 on the stack. The other feature is that rather than writing monolithic programs, the user defines words, so for example to print out the character set one writes:
: CHARSET ( This is the defined word and can be whatever the user wants except for control characters ) 256 32 DO I EMIT LOOP ;
If one then types in CHARSET and presses return (or ENTER, in the Ace’s case), it will print out every character the Ace can display except for the graphics characters, whose codes are below 32.
Typing VLIST lists every word in the Ace’s vocabulary; I think there are a total of about 140. All of them fit in 8K, and they show that FORTH is a marvellously compact language compared to BASIC, or in fact most other programming languages. For comparison, the ZX81’s BASIC has around forty-one words. FORTH on the Ace, and in general, was so fast that the cheapest computer faster than it, the IBM PC, cost more than a hundred times as much. For instance, in order to produce sound it was possible, as well as using the word BEEP, to define a word that counted from 0 to 32767 between vibrations and still produce a respectable note by moving the speaker in and out. By contrast, the ZX81 would take nearly ten minutes to count that high, and had no proper sound anyway. This is a somewhat unfair comparison, but it illustrates the gulf in speed between this high-level language and the BASICs of the day.
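As a rough sketch of the shape of such a word (CLICK here is a hypothetical stand-in for whatever hardware-dependent word actually nudges the speaker, which I won’t invent):

: DELAY 32767 0 DO LOOP ; ( count from 0 to 32767 purely to pass time )
: NOTE 100 0 DO CLICK DELAY LOOP ; ( a hundred speaker nudges with a delay between each )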
Whittling Down The Vocab
As I’ve said, FORTH consists of words defined in terms of other words, and therefore some people object to calling code written in it “programs”, preferring “words”. The fact that this definitional process is core to the language immediately made me wonder what would constitute a minimal FORTH system. There are quite a few words in the vocabulary listed above which are easy to dispense with. For instance, the word SPACES prints whatever number of spaces is indicated by the number on top of the stack, so 32 SPACES prints 32 spaces. However, this word could’ve been defined by the user, thus:
: SPACES 0 DO SPACE LOOP ;
The DO-LOOP control structure takes a limit and a starting value from the top of the stack and executes the code between DO and LOOP once for each count from the start up to one below the limit, so the definition above prints exactly the requested number of spaces. It can be taken a lot further than that, though. SPACE and CR are two words with a similar structure: each prints out a single character. SPACE unsurprisingly prints a space; CR does a carriage return. Both characters are part of the standard ASCII set, and the word for printing the ASCII character indicated by the number on top of the stack is EMIT, so they can be defined thus:
: SPACE 32 EMIT ;
: CR 13 EMIT ;
Hence three words are already shown to be unnecessary to the most minimal FORTH system, and the question arises: what is the smallest set of words such a system actually needs?
The Ace had already added quite a lot of words which are not part of standard FORTH-79, examples being all the floating-point words, PLOT, BEEP, CLS, VIS and INVIS, and omitted others which are easily defined. Some are trivial to define, such as 2+, 1- and 1+. Others are a bit less obvious: PICK can be used to replace DUP and OVER, while SWAP and ROT are both special cases of ROLL and so can be defined in those terms. Even . , that full stop, which prints a number, can be replaced by the number-formatting words <#, # and #>, as sketched below. You can continue to whittle it down until you have a very small number of words along with the software which accepts input and definitions, and you’re done. In fact, if you know the hardware well enough you can make it even smaller because, with the Jupiter Ace for example, you know where the display is stored in RAM and how the stacks work (there are actually two, because practically all computers implicitly use a stack for subroutines), and when it comes down to it it’s even possible to define words which accept machine code, the numbers computers actually use, which represent simple instructions like adding two numbers together or storing one somewhere.
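Here is what those replacements might look like, assuming FORTH-79’s one-based PICK and ROLL (as on the Ace) and its #S word, which converts all the remaining digits of a number; U. is my own name for an unsigned stand-in for the full stop:

: DUP 1 PICK ; ( copy the top stack item )
: OVER 2 PICK ; ( copy the second stack item to the top )
: SWAP 2 ROLL ; ( bring the second stack item to the top )
: ROT 3 ROLL ; ( bring the third stack item to the top )
: U. 0 <# #S #> TYPE SPACE ; ( print the top of the stack as an unsigned number )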
This is about as far as I got until, recently, I managed to join together two ideas that I hadn’t previously connected.
Logic To Numbers
As you probably know, my degrees are in philosophy, and my first degree is in the analytical tradition, which is the dominant one in the English-speaking world. It’s very common for philosophy degrees to be rubbished by the general public, and even within philosophy, continental and analytical philosophers are often hostile to each other. What may not be appreciated is that much of philosophy closely resembles mathematics and, by extension, computer science. When the department at my first university was closed down, some of it merged with computing. It also turns out, a little surprisingly, that one of my tutors, Nicholas Measor, was a significant influence on the theory of computing, having helped develop the modal mu-calculus, which is concerned with the completeness of systems and with temporal logic. He wrote a paper called “Duality and the Completeness of the Modal mu-Calculus” in the ’90s. This has kind of caused things to fall into place for me.
The Dedekind-Peano axioms for the set of natural numbers are central to the theoretical basis of arithmetic. They go as follows:
- 0 is a natural number.
- For every natural number x, x=x.
- For all natural numbers x and y, if x=y then y=x.
- For all natural numbers x, y, z, if x=y and y=z then x=z.
- For all a and b, if b is a natural number and a=b, then a is also a natural number.
- Let S(n) be “the successor of n”. Then for every natural number n, S(n) is a natural number.
- For all natural numbers m and n, if S(m)=S(n) then m=n.
- For every natural number n, S(n)=0 is false.
- If K is a set such that 0 is in K, and for every natural number n, n being in K implies that S(n) is in K, then K contains every natural number.
You can then go on to define addition, subtraction, multiplication and inequalities; addition and multiplication, for instance, fall out of a simple recursive scheme, sketched below. Division is harder to define because this is about integers, and dividing one integer by another may lead to fractions, decimals and so forth. I’ve known about all this since I was an undergraduate but hadn’t given it much thought. It is, incidentally, possible to take this further and define negative, real and presumably imaginary, complex and hypercomplex numbers this way, but knowing in principle that it’s possible is enough really.
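Using nothing but 0 and the successor function S, the definitions run as follows, each operation being specified by what it does to 0 and to a successor:

a + 0 = a
a + S(b) = S(a + b)
a × 0 = 0
a × S(b) = (a × b) + a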
Dyscalculic Programming
If you have a language which can express all of these axioms, you have a system which can do most arithmetic. And this is where I had my epiphany, just last week: you could have a programming language which didn’t initially use numbers at all, because numbers could be defined in terms of these axioms instead. It would be difficult to apply this to FORTH, because it uses sixteen-bit signed binary integers as its only data type, but I don’t think it’s impossible, and it would mean there could be a whole earlier and more primitive programming language which doesn’t initially even use numbers. This is still difficult and peculiar, because as far as I know all binary digital computers use sequences of zeros and ones, making this rather theoretical. It’s particularly hard to see how to marry it with FORTH.
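To show it can be done at all, here is a minimal sketch of the idea in Lean, a modern language chosen purely for illustration (the name N and everything else here is my own, not anything from the Ace or FORTH), building the naturals and their addition without a single numeric literal anywhere:

inductive N where      -- the natural numbers, built from nothing
  | zero : N           -- axiom: zero is a natural number
  | succ : N → N       -- axiom: the successor of a natural number is one too

def add : N → N → N    -- addition by recursion on the second argument
  | a, N.zero   => a
  | a, N.succ b => N.succ (add a b)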
Proof Assistants
Well, it turns out that such programming languages do exist, and that they occupy a kind of nebulous territory between what are apparently called “proof assistants” and programming languages. Some can be used as both, others just as the former. A proof assistant is a language somewhat similar to a programming language, but one which helps the user and the computer arrive at proofs together. I have actually used one of these without realising that that was what I was doing, back in the ’80s, when the aforementioned Nicholas Measor wrote an application for the VAX called “Citrus”, after the philosopher E. J. Lemmon (who incidentally died 366 days before I was born), whose purpose was to help the user prove sequents in symbolic logic. My approach was to prove them myself, then just go along to the VAX in the computer basement and type in what I’d proven, although it was admittedly helpful on more than one occasion. While using it, I mused that it was somewhat like a programming language, except that it was declarative rather than imperative, and wondered how one might go about writing something like that. I also considered the concept of expressive adequacy, also known as functional completeness, in this setting, once again in connection with FORTH, realising that if the Sheffer stroke (NAND) were included as a word in FORTH, a whole host of definitions could easily provide any bitwise function, as sketched below. It was also borne in upon me that all the logics I’d come across so far were entirely extensional, and that it might even be a distinctive feature of logic and mathematics per se that they are completely extensional in form. However, I understand that there are such things as intensional logics, and I suppose modality might be seen that way, although I always conceive of it in terms of possible-world semantics and multivalent truth values.
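For instance, assuming a hypothetical word NAND which combines the top two stack items bitwise with the Sheffer stroke, the familiar bitwise words can all be recovered from it:

: NOT DUP NAND ; ( a NAND a gives not-a )
: AND NAND NOT ; ( the negation of a-NAND-b gives a-and-b )
: OR NOT SWAP NOT NAND ; ( not-a NAND not-b gives a-or-b )

XOR and the rest follow in the same way, so the one primitive really does suffice for expressive adequacy.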
It goes further than that, though. I remember noticing that ALGOL 60 lacks input/output facilities in its standard, which made me wonder how the heck it was supposed to be used. However, it turns out that if you are sufficiently strict with yourself you can absolutely minimise I/O and do everything inside the program itself except for some teensy little bit of interaction with the user, and this instinct, if you follow it, is akin to functional programming, a much later idea which enables you to separate the gubbins from how it looks to the user. And there are purely functional languages out there, so at this point I should probably try to express what I mean.
From Metaphysics To Haskell
Functional programming does something rather familiar. Once you consider the possibility that programming can dispense with numbers as basic features, the emphasis shifts to operators and functions, which become “first-class citizens”. This, weirdly but then again not so weirdly, is exactly what happens in category theory. Haskell is the absolutely paradigmatic functional language, and it’s been said that when you think you’re programming in it, it feels like you’re actually just doing maths. This approach lacks what you’d think would be a crucial feature of operating a computer, input and output, just as standard ALGOL 60 can’t actually print or read; such things are known as “side-effects” in functional programming. If a function does anything other than take its values, perform an operation on them and return the result, that’s a side-effect. Avoiding side-effects makes it easier to formally verify a program, which links it back to the mu-calculus.
I’ve now mentioned Haskell, which brings me a bit closer to the title of this post, and now I’m going to have to talk about monads. Monads are actually something like three different things, and it’s now occurred to me that if you put an I at the start rather than an M you get “ionad”, which gives me pause, but this is all quite arcane enough.

Firstly, Leibniz’s metaphysics prominently features the concept of monads. In 1714 he brought out a ninety-paragraph book called ‘The Monadology’ setting out his beliefs. The idea wasn’t originally his, but he developed it further than others did. Leibniz’s monads are indivisible units within reality which have no smaller parts and are entirely self-contained, though not physical like atoms. Anything that changes within a monad has to arise within itself: “it has to really want to change”. Since monads don’t interact, there’s an arrangement called “pre-established harmony” whereby things in each monad are destined to coincide appropriately. I think this is all very silly, and it arises from Descartes and the problem of the interaction of soul and body, but it’s still a useful concept and got adopted into maths, specifically into category theory. There, it’s notoriously and slightly humorously defined thus: “in concise terms, a monad is a monoid in the category of endofunctors of some fixed category”, and this at least brings us to the functor. A functor is a mapping between categories; hence two different fields of maths might turn out to have identical relationships between their elements. It’s a little like intersectionality in sociopolitical terms, in that, for example, racism and sexism are different in detail but are both forms of marginalisation, the difference being that intersectionality is, well, intersectional, meaning that different kinds of oppression do interact, so it isn’t quite the same as either a monad or a functor.

Finally, in Haskell a monad is... er. Okay, at this point I don’t really know what a monad is in Haskell, but the general idea behind Haskell was originally that it was safe and also useless, because you could never get anything into or out of a program written in it. This isn’t entirely true, because it does do work in a thermodynamic sense: if you take a computer which is switched on but doing nothing and run a Haskell program on it which does something, it gets at least slightly warmer. That is, it does stuff to the data already inside it which you can never find out about; it’s entirely self-contained and does its own thing. So that’s all very nice for it, but rather frustrating, and just now I don’t know how to proceed with this exactly, except that I can recognise that the kind of discipline one places oneself under by not knowing how one is going to get anything onto the screen, out of the speakers or off the keyboard, trackball, mouse or joystick has the potential to make one’s programming extremely pure, if that’s the word: operating in an extremely abstract manner.
Home Computing In The 1960s
I do, however, know how to proceed with what I was thinking about earlier. There is some tiny vocabulary of FORTH, perhaps combined with a way of defining numbers along the lines set out above, which would be simple enough to run on a very simple computer, and this is where things get theoretical, because according to Alan Turing any computer, no matter how simple, can do anything any other computer can do, given enough time and resources. This is the principle of the universal Turing machine. Moreover, the Turing machine is equivalent in computational power to a formal language known as the lambda calculus.
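And the lambda calculus makes the same point about numbers as the axioms above: the Church numerals, sketched here, build the naturals out of nothing but functions:

0 = λf.λx. x
1 = λf.λx. f x
2 = λf.λx. f (f x)
S = λn.λf.λx. f (n f x)

Each numeral n applies a function n times, and S, the successor, just applies it once more.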
Underneath the user interface of the Jupiter Ace operates the Z80A microprocessor. This has 694 opcodes, which is of course quite a bit more than the roughly 140 words of the Jupiter Ace’s primitive vocabulary. Other processors have fewer instructions, but all are “Turing-complete”, meaning that given enough time and memory they can solve any problem any other computer can. In theory a ZX81 could run ChatGPT, just very, very, very slowly and with a very big RAM pack. So the question arises of how far down you can strip a processor’s instruction set before it stops working, and this is the nub of where I’m going, because it turns out you can do it with a single instruction, and there are even choices as to which instruction is best.
The easiest one to conceive of is “subtract and branch if negative”. This is a machine in which each instruction consists of two operands in memory. One of them is a number, which the machine subtracts from the number it already has in mind, the accumulator. If the result is negative, it jumps to the address given by the other operand; otherwise it just moves on to the next pair of operands and repeats the operation. It would also save space on the chip if the working values were stored in memory rather than on the chip itself, so I propose that the program counter and accumulator, i.e. where the data are stored, live in main memory too.
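Here is a rough sketch of one step of such a machine, as an emulator written in something closer to modern ANS FORTH than the Ace’s dialect; MEM, ACC, PC, M@ and STEP are all my own hypothetical names:

CREATE MEM 128 CELLS ALLOT ( the machine’s memory, one cell per address )
VARIABLE ACC ( the accumulator )
VARIABLE PC ( the program counter )
: M@ CELLS MEM + @ ; ( fetch the cell at a given address )
: STEP ( execute one subtract-and-branch-if-negative step )
  ACC @ PC @ M@ - DUP ACC ! ( subtract the operand at PC from the accumulator )
  0< IF PC @ 1+ M@ PC ! ( negative result: jump to the address held at PC+1 )
  ELSE PC @ 2 + PC ! THEN ; ( otherwise step past the two operands )

In the design proper, ACC and PC would themselves be cells of MEM, as proposed above, but keeping them as variables makes the sketch easier to read.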
Moreover, I’ve been talking about a single instruction, but in fact that instruction can be implied: it doesn’t need to exist explicitly in the object code in memory. Since there is only one instruction, the processor can simply assume it and do what it demands to every pair of operands, so in a way this is a zero-instruction-set CPU.
What this very simple computer does is run a program that emulates a somewhat more complex machine which runs the stripped-down FORTH natively. This is where it gets interesting, because the very earliest microprocessors, the 4004 and 4040, needed more transistors than this machine would, and it would’ve been entirely feasible to put it on a single chip of silicon in 1970. This is a microcomputer like those found in the early ’80s, for which the technology and concepts existed before the Beatles split up.
This is of course a bit of a mind game, though not an entirely useless one. What I’ve basically discovered is that I already have a lot of what I need to know to do this task, but it’s on a level which is hard to apply to the problem in hand. But it is there. This is the way in.