Porting

We've designed the LLVM-MOS SDK to be as easy as possible to port to new platforms (but no easier). This is a tutorial-style guide on how to do so.

Imaginary Target
For the purposes of this guide, we'll need a target to port to. Rather than use a real target (which may have an official port by the next time this guide is updated), we'll invent a new one.

We'll make the target as simple as possible. (Real targets are complicated, but they're all complicated in different ways.) Let's say the target has 64KiB of RAM available, with no banking. We'll also imagine that the target has an emulator that's capable of loading programs in some file format. Let's say the file format is very simple: a 64KiB image to load into RAM, followed by two bytes indicating the start address, little-endian.
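To make the format concrete, here is a minimal C sketch of how a tool might assemble such a file in memory and read the start address back. The names `IMAGE_SIZE`, `FILE_SIZE`, `build_image_file`, and `read_start_address` are hypothetical, invented for illustration; none of this is part of the SDK.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical constants for the imaginary target's file format. */
enum { IMAGE_SIZE = 0x10000, FILE_SIZE = IMAGE_SIZE + 2 };

/* Copy the 64KiB RAM image into out, then append the start address,
   low byte first (little-endian). out must hold FILE_SIZE bytes. */
void build_image_file(const uint8_t *ram, uint16_t start, uint8_t *out) {
  memcpy(out, ram, IMAGE_SIZE);
  out[IMAGE_SIZE] = (uint8_t)(start & 0xff);
  out[IMAGE_SIZE + 1] = (uint8_t)(start >> 8);
}

/* Recover the start address from the two trailing bytes of a file. */
uint16_t read_start_address(const uint8_t *file) {
  return (uint16_t)(file[IMAGE_SIZE] | (file[IMAGE_SIZE + 1] << 8));
}
```

With an example start address of 0x1234, the last two bytes of the file are 0x34 followed by 0x12.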

The Simplest Program
First, make sure the latest SDK release is extracted somewhere, and make a directory to work in. You can do most of this tutorial without the SDK sources; you only need the SDK sources if you're looking to contribute your port to the SDK. (But please do!)

Next, create the simplest possible C program:
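One candidate, offered as a sketch rather than as the exact listing this guide was written against, is a `main` that does nothing and returns. The file name `main.c` is an assumption.

```c
/* main.c (hypothetical file name): does nothing and returns success. */
int main(void) {
  return 0;
}
```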

Parent Target
The SDK's targets are hierarchical: a target can have an incomplete target as a parent. The parent is called incomplete because the child fills in missing pieces of it. An incomplete target can also have a different incomplete target as a parent, forming a tree. The complete targets form the leaves of this tree; only these can produce binaries.

For porting a real target, take a look at the SDK; there may already be an incomplete target for the family of devices or boards you're porting to. Completing an incomplete target is much, much easier than building one from scratch.

Along those lines, the tree of targets is rooted at the  target. This provides functionality that is essential to running C on a 6502; it can be reasonably shared by all targets.

Since we're porting to a fake target, we should select  as our parent target. This means to compile our code, we should invoke  as.

Compiling: First Attempt
Let's compile:

This fails because the linker has no earthly idea how to lay out code for our platform. For this, we have to provide a linker script.

Linker Script
The linker scripts are based on GNU ld linker scripts (reference); that language is extended by LLD (reference), and further extended by LLVM-MOS (reference).

There's a lot of functionality packed behind these little scripts; it can take time to learn the language thoroughly. However, you don't need very much to get started.

Here's a minimal linker script for our platform:

The   section describes the layout of the RAM available for the linker to put linked sections in. This typically excludes the zero page and stack; these are usually handled by other mechanisms. This  section states that there's a memory region named   suitable to be assigned both read-only and writable sections. It starts at 0x200, and it ends at the end of RAM.
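The extent of such a region is simple arithmetic, sketched here in C. The macro and function names are hypothetical; the values come from the description above (a region from 0x200 up to the end of a 64KiB address space).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values taken from the region described above. */
#define RAM_ORIGIN 0x0200u
#define ADDRESS_SPACE_END 0x10000u /* one past the last address, 0xffff */
#define RAM_LENGTH (ADDRESS_SPACE_END - RAM_ORIGIN)

/* True if addr lies inside the region the linker may place sections in. */
bool in_ram_region(uint32_t addr) {
  return addr >= RAM_ORIGIN && addr < RAM_ORIGIN + RAM_LENGTH;
}
```

Under these assumptions the region is 0xfe00 bytes long. The zero page (0x0000 to 0x00ff) and the 6502 hardware stack page (0x0100 to 0x01ff) fall outside it.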

The  directive states which sections from input files the linker should place in which output sections, as well as symbols relating to section placement. The linker will automatically place all sections in the  region, which is what we want.

The next bit of linker script assigns symbols  through   to addresses   through. This defines the "imaginary registers" in the zero page that are reserved for compiler use (and that form the C calling convention). is a helper script that automatically assigns each unset register to the register before it + 1. Thus, you only need to set the first register to zero, and the script takes care of the rest. Note that you can only specify the locations of even registers; the odd registers are fixed, since they must immediately follow the preceding register for the pair to work as a pointer.
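The rule that the helper script applies can be modeled in a few lines of C. This is only a sketch of the rule as described above, not the script itself; the register count of 32, the `UNSET` marker, and the function name are assumptions made for illustration.

```c
enum { NUM_REGS = 32, UNSET = -1 }; /* assumed count; hypothetical marker */

/* Fill in register addresses following the rule described above.
   addrs[i] is the zero-page address of register i, or UNSET.
   Register 0 must be set by the caller.
   Returns 0 on success, -1 if register 0 is unset. */
int assign_imag_regs(int addrs[NUM_REGS]) {
  if (addrs[0] == UNSET)
    return -1;
  for (int i = 1; i < NUM_REGS; i++) {
    /* Odd registers always follow their even partner, so that the pair
       can be used as a 16-bit pointer. Even registers keep an explicit
       address if one was given. */
    if (i % 2 == 1 || addrs[i] == UNSET)
      addrs[i] = addrs[i - 1] + 1;
  }
  return 0;
}
```

Setting only register 0 to address 0 yields addresses 0 through 31 under these assumptions. An explicit address on an even register moves that register and its odd partner together.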

Compiling: Second Attempt
That compiled, but it's an ELF file, which isn't at all what the target takes. LLVM-MOS uses ELF for its object files, libraries, and executables (by default). ELF provides rich information about the contents; this is what allows the full suite of LLVM tools to work. For example, you can use  to dump all the symbols in the generated file:

Most of these symbols were generated by the   call, but you can also see our   function was placed at 0x204, and that the imaginary registers were set up appropriately.

You can get a disassembly too:

However, none of this helps make an output file that our emulator can actually load.

Output Format
To make our object file, we return to our linker script:

The new line, , is an LLVM-MOS extension to the linker script language that describes what our actual output file should look like. We want the full contents of RAM to be output, padded with zeros. Afterwards, we want a little-endian short containing the starting point of the program, set up by the  target.

Additionally, the  section was renamed to  ; this is more appropriate, since it doesn't actually span the full 64KiB address range, which is what we want to write to the file. Instead, we create an overlapping  section that starts at 0 and ends at 0xffff. This is the section we write in the.

Compiling: Third Attempt
This time two files were produced:, and. is the ELF file produced previously, while  contains the contents as described by the   script.

Looking at the contents of, we have 64KiB of data, with the interesting stuff beginning at 0x200, as expected. Afterwards, we have, which is the little-endian word 0x200, which is indeed the value of.

Optional Libraries
Take another look at the disassembly:

Several things are amiss already. There's a  routine that contains ; that's good. But  doesn't look hooked up to anything. And once it's finished,  just returns back to , which runs right into  , which returns into never-never land.

This is all because the  target's libraries don't by default include any functionality that isn't absolutely necessary for the C runtime. In this case, that's, which contains pre-main functionality (e.g.,  ) and C++ constructors, and  , which contains post-main functionality (e.g.,   and C++ destructors).

Other functionality is contained in optional libraries that must be explicitly included by the targets. This is either because the nature of the target and its OS may make the functionality unnecessary, or because there is more than one best way to accomplish it, again depending on the target.

The canonical reference for these optional libraries is the set of CMakeLists.txt files in the common target.

One of the things that a target needs to decide is how a program is exited. This happens whenever  is called, or implicitly when   returns. When either occurs,  must be called first. The  target provides optional libraries corresponding to the usual possibilities:

Our target doesn't have an underlying OS, so  seems reasonable. Specifying this library on the command line produces a more sensible behavior:

Now, after  returns back to ,   falls through to  , which jumps to. in turn calls  then enters an infinite loop, as desired. can then be called by any C function to run  and loop, not just by returning from.

There are some other optional libraries that are necessary to get full C language functionality on our target. Notably, we need to set up the stack. The  optional library can do this; we just need to provide a symbol, , that is the top of stack at program start. Since there's nothing on the stack, for this target the top of stack should actually be 0x10000; since that's not a real 16-bit address, the symbol should counterintuitively be 0. You could also use 0xffff and waste a byte; no one would judge you.
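The wraparound is ordinary 16-bit arithmetic, as this C sketch shows. It assumes a stack that grows downward and whose pointer is decremented before data is stored; the function names are hypothetical.

```c
#include <stdint.h>

/* One past the last usable address, truncated to a 16-bit pointer. */
uint16_t stack_top_after(uint16_t last_addr) {
  return (uint16_t)(last_addr + 1u);
}

/* Reserve size bytes on a downward-growing stack; returns the new
   stack pointer, wrapping modulo 0x10000. */
uint16_t stack_reserve(uint16_t sp, uint16_t size) {
  return (uint16_t)(sp - size);
}
```

Starting from 0, reserving two bytes gives 0xfffe, so the bytes land at 0xfffe and 0xffff, the very top of RAM. Starting from 0xffff instead gives 0xfffd, which leaves the byte at 0xffff unused.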

We can set this in our linker script:

Finally, we can compile again and link against   too:

Nothing changed! Is there a problem? Nope!

Since the C program doesn't actually use a stack, the linker doesn't include any code to set it up. That's the purpose of  and  ; these sections collect snippets of code to set things up, depending on what's actually used in the final binary. So, with this change, the target is complete for freestanding C/C++, and it even has an  routine.

Extending the SDK
After porting LLVM-MOS to a platform, please consider submitting your port for inclusion in the LLVM-MOS SDK. We'd like the SDK to be a nice out-of-the-box way to write code for any 6502 platform, and contributions along those lines are greatly appreciated.

To do so, continue onward to the Extending SDK guide.