Imaginary registers


Like most modern compilers, LLVM generally assumes that the target machine has a large number of registers that are more or less interchangeable. This assumption is relaxed for the x86 targets, at some cost in code complexity, but most targets still expect a range of registers that the compiler can allocate freely. The 6502 and its derivatives, however, have only three registers, A, X and Y, each with very different capabilities. How can LLVM's expectations and the 6502's hardware be reconciled?

The 6502 CPU architecture also features a special range of memory called the zero page, which is typically the first 256 bytes of memory. This area is faster to access and modify than the rest of memory, and it supports some addressing modes that are not available for general memory accesses. Because of this extra performance and capability, it is idiomatic for 6502 code to use part of the zero page as temporary variable space. This makes it possible to reconcile LLVM's needs with the 6502's features by defining an area of memory as imaginary registers: registers that do not exist in silicon, but are provided by a set of zero-page memory locations reserved by the linker.

(The term imaginary registers is used because virtual registers have another pre-defined meaning in LLVM compiler development.)

Each imaginary register is represented at code generation time by a symbol such as __rc17 or __rs5. At link time, the linker resolves this symbol to an actual memory address.

Presently, the compiler requires 16 imaginary pointers, each consisting of two contiguous bytes. These are called rs0 through rs15. Each pointer is divided into two one-byte subregisters; for example, the low byte of rs0 is rc0 and the high byte is rc1. Continuing in this fashion, rsN is made up of rc(2N) as its low byte and rc(2N+1) as its high byte, which defines rc0 through rc31.
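The numbering scheme above can be written down as a small piece of arithmetic. The following C sketch is illustrative only: the helper names rs_low_rc and rs_high_rc and the constant NUM_RS are hypothetical and are not part of LLVM-MOS. They simply restate the rule that each rs register pairs two consecutive rc registers, low byte first.

```c
#include <assert.h>

/* Number of imaginary 16-bit pointer registers described above (rs0..rs15).
   The name NUM_RS is made up for this sketch. */
enum { NUM_RS = 16 };

/* Index of the rc register that holds the low byte of rs<rs>.
   Hypothetical helper, not an LLVM-MOS function. */
static int rs_low_rc(int rs) {
    assert(rs >= 0 && rs < NUM_RS);
    return 2 * rs;
}

/* Index of the rc register that holds the high byte of rs<rs>.
   Hypothetical helper, not an LLVM-MOS function. */
static int rs_high_rc(int rs) {
    assert(rs >= 0 && rs < NUM_RS);
    return 2 * rs + 1;
}
```

For instance, rs_low_rc(5) and rs_high_rc(5) give 10 and 11, so rs5 would be made up of rc10 and rc11.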

Note that LLVM-MOS does not assume that the rs* imaginary registers are consecutive in memory! Many targets have usable zero-page locations that are not contiguous. LLVM-MOS does, however, assume that each rs* imaginary register occupies two neighboring bytes. If a particular target requires it, the imaginary registers can therefore be split across several separate ranges, as long as each range contains an even number of bytes.
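To illustrate the constraint, here is a minimal C sketch of how rc0 through rc31 could be placed into a set of non-contiguous free zero-page ranges. It is not how LLVM-MOS or its linker scripts assign addresses; the struct zp_range type and the place_imaginary_regs function are hypothetical names invented for this example. The sketch assumes each range lies entirely within the zero page, and it uses only an even number of bytes from each range so that no rs pair is split between two ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Total number of one-byte imaginary registers (rc0..rc31). */
enum { NUM_RC = 32 };

/* A free run of zero-page bytes on some target.
   Hypothetical type for this sketch; start + length must not exceed 256. */
struct zp_range {
    uint8_t start;   /* first free zero-page address in the run */
    uint8_t length;  /* number of free bytes in the run */
};

/* Fills addr[i] with a zero-page address for rc<i>, walking the ranges in
   order. Returns how many rc registers were placed (NUM_RC on success). */
static int place_imaginary_regs(const struct zp_range *ranges, size_t count,
                                uint8_t addr[NUM_RC]) {
    assert(ranges != NULL || count == 0);
    int next = 0;
    for (size_t r = 0; r < count && next < NUM_RC; ++r) {
        /* Round an odd length down to even, so that every range holds
           whole rs pairs and rc(2N), rc(2N+1) always stay adjacent. */
        int usable = ranges[r].length & ~1;
        for (int i = 0; i < usable && next < NUM_RC; ++i)
            addr[next++] = (uint8_t)(ranges[r].start + i);
    }
    return next;
}
```

Given a 5-byte free range at 0x02 and a 60-byte free range at 0x10, this sketch would put rc0 through rc3 at 0x02 through 0x05, leave 0x06 unused, and continue with rc4 at 0x10.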