Why yet another 6502 C compiler?
Performance
As an LLVM backend, we benefit from the expansive high-level optimizations available. These include radical code transformations of switch statements, loops, and table lookups. Nothing here is beyond what a human could do, of course; humans wrote all of these optimizations. But there's potential far beyond what a human would have patience for. As an example, switch statement cases may be shifted and bitwise operations applied to them to make the different case integers denser. This increases the number that can fit into a jump table, which decreases the amount of branching needed to execute the switch. A human could do that for a switch statement, but it's unlikely they'd go through the effort for any but the most performance-critical ones. LLVM will tirelessly consider it for every single switch in the program.
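To make the switch example concrete, here is a hand-written sketch of that kind of densification. It is an illustration only, not the code LLVM actually emits: the function names and case values are hypothetical, and the transformation is done by hand so both forms can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Sparse case values: 0x10, 0x20, 0x30, 0x40. A jump table indexed by
   the raw value would need 0x41 entries, almost all of them unused. */
uint8_t handle_sparse(uint8_t op) {
    switch (op) {
    case 0x10: return 1;
    case 0x20: return 2;
    case 0x30: return 3;
    case 0x40: return 4;
    default:   return 0;
    }
}

/* Hand-densified equivalent. Every case has a clear low nibble, so once
   that is checked, (op >> 4) maps the cases onto 1, 2, 3, 4, which fit
   in a tiny table. */
uint8_t handle_dense(uint8_t op) {
    if (op & 0x0F)
        return 0;              /* cannot match any case */
    switch (op >> 4) {
    case 1:  return 1;
    case 2:  return 2;
    case 3:  return 3;
    case 4:  return 4;
    default: return 0;
    }
}

/* Exhaustively confirm the two forms agree for every 8-bit input. */
void check_dense_matches_sparse(void) {
    for (int v = 0; v < 256; ++v)
        assert(handle_sparse((uint8_t)v) == handle_dense((uint8_t)v));
}
```

Calling `check_dense_matches_sparse()` walks all 256 inputs, which is the kind of equivalence a compiler has to guarantee before it applies such a rewrite.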
At a lower level, good use of the zero page is essential to producing good 6502 code. To that end, we model the zero page as an "imaginary register" bank. The number and placement of these registers are completely customizable by the end user to fit a variety of target system memory models. Using registers for this purpose allows us full access to LLVM's register allocator, which can often allocate program temporary values in such a way that they never need to leave the zero page, A, X, and Y. This vastly reduces the need for a soft (emulated) stack, which is a sticking point for earlier 6502 compilers.
Even when a stack of some kind is required, the optimizer performs whole-program analysis to identify functions that cannot simultaneously have more than one invocation active. These functions can have their "stack frames" allocated in absolute memory, again avoiding use of the soft stack. We reserve the actual soft stack only for cases where it cannot be statically proven that a function doesn't intrinsically require it (due to function pointers or other complex control flow).
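The distinction can be sketched in plain C. This is a hand-written illustration under stated assumptions, not the compiler's output: the function names are hypothetical, and `static` locals stand in for a frame placed at fixed absolute addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional form: 'sum' and 'i' live in a per-call stack frame. */
uint16_t checksum_auto(const uint8_t *data, uint8_t len) {
    uint16_t sum = 0;
    for (uint8_t i = 0; i < len; ++i)
        sum += data[i];
    return sum;
}

/* Equivalent with the "frame" at fixed addresses. This is valid only
   because at most one invocation can be active at a time: the function
   never calls itself, directly or indirectly. */
uint16_t checksum_static(const uint8_t *data, uint8_t len) {
    static uint16_t sum;
    static uint8_t i;
    sum = 0;
    for (i = 0; i < len; ++i)
        sum += data[i];
    return sum;
}

/* Counter-example: 'local' is live across the recursive call, so each
   active invocation needs its own copy. A function like this still
   needs a real (soft) stack. */
uint16_t depth_sum(uint8_t n) {
    uint8_t local = n;
    if (n == 0)
        return 0;
    return (uint16_t)(depth_sum((uint8_t)(n - 1)) + local);
}

/* The two checksum forms must be indistinguishable to the caller. */
void check_frames_agree(void) {
    const uint8_t d[5] = {10, 20, 30, 40, 200};
    assert(checksum_auto(d, 5) == checksum_static(d, 5));
}
```

Making `local` static in `depth_sum` would let inner calls overwrite the outer call's value, which is why the analysis has to prove non-reentrancy first.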
As for the code itself, we perform a remarkably effective loop optimization that detects 16-bit index operations that can be converted to a 16-bit base plus an 8-bit offset. The latter is a directly supported addressing mode on the 6502, and 8-bit index manipulation can be done in a single instruction. This allows us to convert idiomatic 16-bit "int c" loops into something much more suitable for the 6502. Eventually, we hope that optimizations of this kind will transform standard, naive C code into tightly optimized 6502 code.
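As a rough picture of what this loop rewrite buys, the sketch below shows a naive 16-bit loop next to a hand-written version that keeps a 16-bit base pointer and steps an 8-bit index within each 256-byte page. The names are hypothetical, and this only illustrates the shape of the result, not the actual pass.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Idiomatic form: a 16-bit index is incremented and compared on every
   iteration, which costs several instructions on an 8-bit CPU. */
void clear_naive(uint8_t *buf, uint16_t n) {
    for (uint16_t i = 0; i < n; ++i)
        buf[i] = 0;
}

/* Hand-written equivalent: the 16-bit base changes once per page and
   the inner loops use only an 8-bit index, which maps onto indexed
   addressing with a single-instruction increment. */
void clear_paged(uint8_t *buf, uint16_t n) {
    uint8_t *page = buf;
    uint8_t full_pages = (uint8_t)(n >> 8);
    uint8_t tail = (uint8_t)(n & 0xFF);

    while (full_pages--) {
        uint8_t y = 0;
        do {
            page[y] = 0;
        } while (++y != 0);    /* y wraps to 0 after 256 stores */
        page += 256;
    }
    for (uint8_t y = 0; y < tail; ++y)
        page[y] = 0;
}

/* Both forms must touch exactly the same bytes. */
void check_clear(void) {
    static uint8_t a[600], b[600];
    memset(a, 0xAA, sizeof a);
    memset(b, 0xAA, sizeof b);
    clear_naive(a, 515);
    clear_paged(b, 515);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

The point of the optimization is that the user keeps writing the first form while the generated code behaves like the second.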
Features
Because this is a subproject of LLVM, it inherits all the features of LLVM. Namely, this project provides full ELF support for 6502 objects, libraries, and executables. This opens up previously impossible functionality, such as viewing 6502 program properties in ELF tools that don't know anything about the 6502 specifically.
llvm-mos is not limited to C. It also provides a fully functional assembler and disassembler that reads and writes assembly source files in a GNU assembler-compatible format. This in turn opens up a world of macro programming functionality, for those who prefer to work at the metal level.
A proof of concept exists, demonstrating that llvm-mos can support Rust as a source language. This suggests that llvm-mos can support other source languages as well, such as C++.
Lastly, the llvm-mos project is entirely open source, and is developed with LLVM coding standards in mind. Want to experiment with a new codegen pass, or add a new target? Jump right in, clone the codebase, and start playing.
Findings
Several common assumptions about the MOS 6502 processor, and C compilers targeting it, are now refuted by our work.
First, the assumption that a modern compiler framework, such as LLVM, cannot be targeted towards an old 8-bit CPU such as the 6502. LLVM's new GlobalISel architecture can very well be targeted to the 6502, and it can indeed produce superior code, if permitted to do so.
Second, the assumption that because the 6502 is "stackless," and has few registers, it is not a good host for C.
Regarding stacks, while it's true that the standard C runtime model is quite hostile to 6502 performance, the C standard provides broad latitude for alternative models that behave in all respects "as if" they were the C model. In the broad space of possible alternatives, we've found a collection of techniques that broadly preserve C standard compatibility while emitting very high quality code. To put it another way, we go to great lengths to emit code that operates "as if there were stacks", without using stacks at all. LLVM's sophistication facilitates this; the analyses required are quite intricate, but most of them are slight modifications to data structures already available in LLVM.
Regarding registers, the original 6502 designers were well aware of the 6502's register limitations, and so provided a bunch of zero-page addressing modes to compensate. We present these to LLVM as registers, which takes our backend from "alien nightmare" to "ugly duckling", not unlike x86 or AVR. Normal register allocation techniques apply, since 6502 instructions treat different zero page locations identically. While A, X, and Y are a bit unusual, they're not any worse than x86, and LLVM's register allocator is fully capable of handling them, even if the relationship is a bit strained.
Third, that "simpler is better" for producing a performant compiler for 8-bit targets. llvm-mos's architecture and design choices are not at all simple. I haven't counted, but I think llvm-mos is doing about 100 passes through the code, about 8 of which are specific to the MOS 6502.
Fourth, that the 6502 is implicitly some sort of "special" architecture, and it therefore requires special compilers, linkers, binary file formats, etc. We treat the 6502, ultimately, as just another target within the LLVM framework, and as such it benefits from all the industry-standard ELF-compatible file formats.
Fifth, that because the 6502 is a small target, it requires a smaller compiler and smaller tools. This assumption never really made sense anyway. In fact the opposite is true: if you want to do advanced codegen for the 6502, you need a really intelligent (and large) compiler and toolchain framework, not a small one. The state of the art of optimization has advanced by leaps and bounds in the past three decades, and the poor old 6502 has received none of those benefits, until the current work.
Sixth, that peephole optimization produces the best codegen for the 6502. In fact, llvm-mos gets the most benefit out of 6502-specific optimizations relatively early in the LLVM machine function pass pipeline, and the code it produces (in small tests) is quite efficient, even without any 6502-specific peephole optimizer at all. "The more clothes you put on during the day, the more you have to take off at night." One high-level instruction can become a big block of 6502, so a single high-level optimization that removes it can prevent a thousand cases from being needed to handle it later.
Seventh, that because the 6502 is small, it requires some sort of specialized language (in the David A. Wheeler sense) in order to generate performant code. While a specialized language makes the job of the compiler author considerably easier, we like the challenge. Organizing code for the 6502 is actually rather difficult; making "a version of C" that is easy to compile just shifts the burden of this work back onto the user. LLVM will likely be able to handle compiling other languages to the 6502 at some point in the future. Rust support has already been proven, and there are no problems in principle with lowering many more languages to the 6502.