*Note: this project is currently on hold, since some of my current work has focused more on a different kind
of instruction set that might not need this kind of analysis as much.*

The design of evolvable instruction sets is a central problem in artificial life and evolutionary computation. It is currently a "black art" for which there is very little theory or empirical data to guide us. Current efforts such as those in Tierra, Avida, Ouroboros, and similar projects are guided largely by past experience and ad hoc intuition. The technical term for this kind of approach is "winging it."

We cannot, for example, justify why one evolvable instruction set design is "good" and another is not. I have designed three myself and have no idea whether they are good relative to the set of all possible instruction sets.

At ALifeX I brought this up as a possible topic direction for the field. I called this project "exploring Turing space," where "Turing space" refers to the space of all possible Turing-complete instruction sets. Turing space is infinitely large, so the goal should be to gain some insight into the structure of this infinite space. What we'd end up learning would probably resemble the sort of general classification laws that have been learned about cellular automata by people like Wolfram and Langton. Think something analogous to Wolfram's classes or Langton's "edge of chaos."

I'd like to be able to look at an instruction set and say some qualitative/quantitative things about it the way I can with a cellular automaton. At the very least, this would give us some direction and rigor for our instruction set designs.

I have long wanted to do something to help expand our knowledge in this area, but haven't had the foggiest idea of how to do it. At ALifeX, I spoke with Jason Kelly, Larry Yeager, Charles Ofria, Virgil Griffith, and Chris Adami on the subject. I listed their names because they're the ones I credit with helping me gain the following not-so-foggy idea:

In digital logic there are two elementary types of logic gates out of which pretty much anything can be built: NAND and NOR. Of the two, NAND is the more commonly used.

Instruction sets are really just pre-packaged combinations of logic gate operations that together form a Turing-complete set.

Put these two ideas together, and it hit me: design a VM that is Turing-complete and is made up only of NAND gates. Then, do genetic programming with it and evolve some stuff. Finally, characterize the solutions and look for patterns.

That night I went up to my hotel room at the conference and hacked up the NAND VM. I called it Elemental since it's, well, elementary. It's probably pretty close to the simplest virtual machine you could possibly have. It's only about 40 lines of C code.

To prove my NAND VM was Turing-complete, I used blind random search to generate NAND programs to do the following things:

- Conditional looping
- Unconditional looping (infinite loops)
- AND
- OR
- XOR
- NOT

A language with conditional loops, those logic ops, and unbounded memory is Turing-complete, so I think that pretty much does it. (Interesting side note: the randomly generated NAND programs for the logic ops vary hugely in size, from a few NAND instructions to hundreds. There are, of course, infinitely many possible programs for any task provided there is no bound on length.)

So now I have a tool.

I'm currently writing a very basic genetic programming framework around the core Elemental NAND VM. When that's done, I'm going to do some canonical GP problems like Fibonacci numbers, simple math ops, curve fitting, etc. After doing many reps of these problems, I'll have a large population of NAND programs to begin characterizing. The next step will be to do some modeling, analysis, and bioinformatics-type stuff on those NAND programs and look for patterns like...

- Are there common motifs?
- Do NAND code snippets for common logic ops like AND, OR, XOR, etc. come up? How often?
- What level of similarity is there between runs? Is evolution convergent? What does a clustering plot look like?

... and so on.

In the end I hope to have some empirical data on what kinds of patterns emerge when you do evolution in the most elementary VM space possible. These patterns, if they emerge, will be roughly equivalent to automatically discovered instruction sets. So, hopefully this will tell us something about what Turing space looks like and what, if any, preferences Darwinian processes display in Turing space.