Emulation

It is now time to discuss how the actual emulation is carried out in our machine. This functionality is implemented in the emulate function, in the file called emulate.c.

This function is given two input parameters. The first is the start address in the memory of our fictitious machine, from which execution of instructions should begin. In a real machine, the start address for a program is usually specified by the operating system, since it is the OS that loads the program into memory before it is executed. Our machine does not have an operating system, so we improvise this detail.

The second parameter is a pointer to a machine structure, which contains all the register and other state information of our fictitious CPU.

Firstly, a couple of temporary variables are declared, which are used for storing intermediate results of computations. Doubtlessly a real CPU has all sorts of internal data stores for holding intermediate computations.

Next we set the instruction pointer (IP) of our machine to the start address in memory of the program which is going to be executed (supplied, as already mentioned, by the caller of the function). Note that IP has nothing to do with an internet IP address. Same abbreviation, different meaning.

The last thing which the function does before starting the execution proper is to set the state of the machine to running. This is done by setting the running field of our machine structure to the boolean value true. Note that we can just use the word true here, instead of using an integer value such as 1 or 0, since we included the standard header stdbool.h at the beginning of our program, which defines the words true and false.

You may be asking, "why do we need to actually keep track of whether the machine is running or not? Why can't we just run until the last instruction and then just stop?" Well, the answer is, we could. But it would require slightly messy code. As it is, we can use a nice while loop to successively load instructions from memory and execute them, and keep doing so until a HLT instruction is encountered. When such an instruction pops up, its implementation will simply be to set MC->running to false, at which point the while loop stops looping and no further instructions from memory are executed. It is a nice solution to an otherwise messy problem, though almost certainly not the highest performance solution.

Now you will observe that in order to tell which instruction should be executed, a giant switch construction is used. This construct is a very convenient one, which replaces a large number of if...else statements.

For example, if I had a variable, called data, say, and I wanted a program to do different things depending on what is in the variable, I could write the following code:

if (data == 0)
{
    //instructions here
} else if (data == 1)
{
    //instructions here
} else if (data == 2)
{
    //instructions here
}
    //etc, ad infinitum

However, an easier and neater way of doing it is as follows:

switch (data)
{
    case 0: //instructions here
        break;
    case 1: //instructions here
        break;
    case 2: //instructions here
        break;
    //etc
}

In our case, rather than use the contents of a variable as a basis for the decision, we use the contents of the next memory location, which is conveniently addressed for us by the instruction pointer of our machine. We want to read the data at that memory location, and based on what instruction is there, execute different statements as a result. Thus we just dive into our mem array, at the location pointed to by the instruction pointer, and run a giant case construction to execute different things based upon the result.

Note that we refer to the value found at the next instruction location in memory, not by the integer code of that instruction, but by its mnemonic, courtesy of the enum which we defined in the emulate.h file.
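To make the pieces described so far a little more concrete, here is a minimal sketch of how such a loop might look. It is not the actual contents of emulate.c. The names emulate, MC->running, HLT and the mem array come from the text above, but everything else is an assumption made for illustration: the layout of the machine structure, the ip field, the MEM_SIZE constant, the NOP instruction and the numeric opcode values are all hypothetical, and the real mem array may well live outside the structure.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEM_SIZE 256                /* assumed memory size, for illustration only */

/* Hypothetical opcodes; the real list is the enum in emulate.h. */
enum opcode { NOP = 0, HLT = 1 };

/* Hypothetical machine structure; only the running field is named in the text. */
typedef struct {
    uint16_t ip;                    /* instruction pointer */
    bool     running;               /* true while the machine is executing */
    uint8_t  mem[MEM_SIZE];         /* memory of the fictitious machine */
} machine;

void emulate(uint16_t start, machine *MC)
{
    MC->ip = start;                 /* begin at the address supplied by the caller */
    MC->running = true;

    /* Keep fetching and executing until HLT clears the flag.
       The bounds check is an added safety net, not part of the description. */
    while (MC->running && MC->ip < MEM_SIZE) {
        switch (MC->mem[MC->ip]) {  /* fetch the opcode at the instruction pointer */
        case NOP:
            MC->ip++;               /* do nothing, step to the next instruction */
            break;
        case HLT:
            MC->running = false;    /* the while condition fails on its next test */
            break;
        default:
            MC->running = false;    /* unknown opcode: stop instead of spinning */
            break;
        }
    }
    MC->running = false;            /* also covers running off the end of memory */
}
```

Loading a NOP followed by a HLT into mem and calling emulate with the address of the NOP would leave the instruction pointer resting on the HLT, with running set back to false. Note that the cases are written as NOP and HLT rather than 0 and 1, which is exactly the convenience the enum buys us.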

The break statement listed after each instruction implementation is necessary in order to break back out of the switch construct. Typically, the C compiler stores all of the statements which comprise the cases of our switch statement one after the other in memory, and precedes them with a lookup table (or an equivalent series of comparisons) which tells the processor where to jump in memory to execute the requisite case. If it weren't for the break statements at the end of each case, the processor would just execute the requisite case, having been sent there by the lookup table, and then proceed to execute all the cases that followed it in memory. Sometimes that can be a useful feature of C, but in our case, we don't want that.
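The effect of leaving out break is easy to see in a small experiment. The following sketch has nothing to do with our emulator as such; the two function names are made up for the purpose. Each function records, as one bit per case, which case bodies actually ran for a given value of data.

```c
/* Records which case bodies run when the break statements are left out.
   Bit 0 is set by case 0, bit 1 by case 1, bit 2 by case 2. */
unsigned trace_without_break(int data)
{
    unsigned ran = 0;
    switch (data) {
    case 0:
        ran |= 1u << 0;
        /* fall through */
    case 1:
        ran |= 1u << 1;
        /* fall through */
    case 2:
        ran |= 1u << 2;
    }
    return ran;
}

/* The same switch with a break after every case: only one body runs. */
unsigned trace_with_break(int data)
{
    unsigned ran = 0;
    switch (data) {
    case 0:
        ran |= 1u << 0;
        break;
    case 1:
        ran |= 1u << 1;
        break;
    case 2:
        ran |= 1u << 2;
        break;
    }
    return ran;
}
```

Calling trace_without_break(0) returns 7, meaning all three bodies ran, and trace_without_break(1) returns 6, meaning cases 1 and 2 ran. With the breaks in place, trace_with_break(0) returns 1 and trace_with_break(1) returns 2, so only the matching body ran.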

Another useful thing that can be used with a switch statement is a default: case. This will be executed if none of the other cases matches. One does not use the word case, but just writes default:, followed by the instructions which should be executed when no specified case has matched.
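As a small illustration, here is the earlier data example again with a default: added. The function name classify and its return values are invented for this sketch.

```c
/* Returns the matched value for data equal to 0 or 1,
   and -1 for anything that no case label matches. */
int classify(int data)
{
    int result;
    switch (data) {
    case 0:
        result = 0;
        break;
    case 1:
        result = 1;
        break;
    default:            /* reached only when no case above matches */
        result = -1;
        break;
    }
    return result;
}
```

In an emulator, default: is a natural place to deal with a value in memory that does not correspond to any known instruction.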