https://llvm-mos.org/api.php?action=feedcontributions&user=71.198.117.145&feedformat=atomllvm-mos - User contributions [en]2024-03-29T08:22:47ZUser contributionsMediaWiki 1.39.6https://llvm-mos.org/index.php?title=NES_targets&diff=368NES targets2022-10-10T18:53:51Z<p>71.198.117.145: </p>
<hr />
<div>The NES is a particularly challenging compiler target, due to the extremely large number of configurations supported by the various "mapper" ASICs on its cartridges.<br />
<br />
The generic <code>nes</code> target contains only functionality that is generically applicable across mapper chips. Individual mappers each have their own derived target:<br />
{| class="wikitable"<br />
|+Supported Mappers<br />
!Number<br />
!Name<br />
!Target<br />
|-<br />
|000<br />
|NROM<br />
|<code>nes-nrom</code><br />
|-<br />
|001<br />
|MMC1<br />
|<code>nes-mmc1</code><br />
|-<br />
|003<br />
|CNROM<br />
|<code>nes-cnrom</code><br />
|}<br />
<br />
== Sections ==<br />
<br />
=== PRG-ROM ===<br />
If PRG-ROM banking is disabled, then data can be placed in PRG-ROM using the regular <code>.rodata</code> C sections. Otherwise, data can be placed into a specific PRG-ROM bank using section <code>.prg_rom_<bankno></code> or any section that begins with <code>.prg_rom_<bankno>.</code>. The load addresses of PRG-ROM banks begin at <code>0x01000000</code>. This allows the bottom 24-bits to represent a logical PRG-ROM address space, while the high byte being <code>0x01</code> indicates that the address is PRG-ROM.<br />
<br />
=== CHR-ROM ===<br />
If CHR-ROM banking is disabled, then data can be placed in CHR-ROM using the <code>.chr-rom</code> section or any section that begins with <code>.chr-rom.</code>. Otherwise, data can be placed into a specific CHR-ROM bank using section <code>.chr_rom_<bankno></code> or any section that begins with <code>.chr_rom_<bankno>.</code>. The logical addresses of CHR-ROM banks begin at <code>0x02000000</code>. This allows the bottom 24-bits to represent a logical CHR-ROM address space, while the highest byte being <code>0x02</code> indicates that the address is CHR-ROM.<br />
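<br />
The address tagging used for PRG-ROM and CHR-ROM can be illustrated with a small host-side C sketch. The function and constant names below are hypothetical and are not part of the SDK; only the tag values <code>0x01</code> and <code>0x02</code> and the 24-bit split come from the text above.<br />

```c
#include <stdint.h>

/* Tag values taken from the text: 0x01 marks PRG-ROM, 0x02 marks CHR-ROM.
   The names are made up for this sketch. */
enum { TAG_PRG_ROM = 0x01, TAG_CHR_ROM = 0x02 };

/* The high byte of a 32-bit load address says which kind of ROM it is. */
uint8_t load_address_tag(uint32_t load_addr) {
    return (uint8_t)(load_addr >> 24);
}

/* The bottom 24 bits are the logical address within that ROM. */
uint32_t logical_address(uint32_t load_addr) {
    return load_addr & 0x00FFFFFFu;
}
```

For example, under this reading <code>0x01004000</code> is logical address <code>0x004000</code> in PRG-ROM.<br />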
<br />
=== PRG-RAM ===<br />
If PRG-RAM banking is disabled, then data can be placed in PRG-RAM using the <code>.prg-ram</code> section or any section that begins with <code>.prg-ram.</code>. Otherwise, variables can be placed into a specific PRG-RAM bank using section <code>.prg_ram_<bankno></code> or any section that begins with <code>.prg_ram_<bankno>.</code>. The PRG-RAM is not initialized at program start, so any initializers given in C or data provided in assembly are ignored.<br />
<br />
== Symbol Configuration ==<br />
As far as is possible, mappers are configured by assigning symbol values. These control the contents of the [https://www.nesdev.org/wiki/NES_2.0 iNES 2.0 header] in the resulting binary, as well as the addresses assigned to program sections. Some of the following values have logical sizes less than the full 32-bits available for symbol values. In such cases, only the lowest order bits are considered. For example, <code>__prg_nvram_size_raw</code> corresponds to the high nibble of header field 10. This means that the lowest nibble of <code>__prg_nvram_size_raw</code> is mapped to the high nibble of header field 10.<br />
{| class="wikitable"<br />
|+Configuration Symbols<br />
!Symbol<br />
!Description<br />
|-<br />
|<code>__prg_rom_size</code><br />
|KiB of PRG-ROM<br />
|-<br />
|<code>__chr_rom_size</code><br />
|KiB of CHR-ROM<br />
|-<br />
|<code>__prg_ram_size</code><br />
|KiB of PRG-RAM<br />
|-<br />
|<code>__prg_nvram_size</code><br />
|KiB of PRG-NVRAM<br />
|-<br />
|<code>__chr_ram_size</code><br />
|KiB of CHR-RAM<br />
|-<br />
|<code>__chr_nvram_size</code><br />
|KiB of CHR-NVRAM<br />
|-<br />
|<code>__mirroring</code><br />
|Bit 0 of field 6<br />
|-<br />
|<code>__battery</code><br />
|Bit 1 of field 6<br />
|-<br />
|<code>__trainer</code><br />
|Bit 2 of field 6<br />
|-<br />
|<code>__four_screen</code> <br />
|Bit 3 of field 6<br />
|-<br />
|<code>__mapper</code> <br />
|Mapper number.<br />
|-<br />
|<code>__console_type</code><br />
|Low two bits of field 7<br />
|-<br />
|<code>__timing</code><br />
|Low two bits of field 12<br />
|-<br />
|<code>__ppu_type</code><br />
|Low nibble of byte 13 when Vs. System<br />
|-<br />
|<code>__hw_type</code><br />
|High nibble of byte 13 when Vs. System<br />
|-<br />
|<code>__extended_console_type</code><br />
|Low nibble of byte 13 when Extended Console Type<br />
|-<br />
|<code>__misc_roms</code><br />
|Low 2 bits of byte 14<br />
|-<br />
|<code>__default_expansion_device</code><br />
|Low 6 bits of byte 15<br />
|-<br />
|<code>__prg_rom_size_raw</code><br />
|The most significant nibble is the lower nibble of field 9, and the least significant byte is field 4. Overrides <code>__prg_rom_size</code>. <br />
|-<br />
|<code>__chr_rom_size_raw</code><br />
|The most significant nibble is the upper nibble of field 9, and the least significant byte is field 5. Overrides <code>__chr_rom_size</code>. <br />
|-<br />
|<code>__prg_ram_size_raw</code><br />
|Raw value of low 4 bits of header field 10. Overrides <code>__prg_ram_size</code>. <br />
|-<br />
|<code>__prg_nvram_size_raw</code><br />
|Raw value of high 4 bits of header field 10. Overrides <code>__prg_nvram_size</code>. <br />
|-<br />
|<code>__chr_ram_size_raw</code><br />
|Raw value of low 4 bits of header field 11. Overrides <code>__chr_ram_size</code>. <br />
|-<br />
|<code>__chr_nvram_size_raw</code><br />
|Raw value of high 4 bits of header field 11. Overrides <code>__chr_nvram_size</code>. <br />
|}<br />
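<br />
As a rough illustration of how the <code>_raw</code> nibble symbols map onto a header byte, the following C sketch packs two symbol values into one byte the way the table describes for header fields 10 and 11. The helper name is hypothetical; the actual packing is done by the SDK's linker scripts, not by C code.<br />

```c
#include <stdint.h>

/* Only the lowest nibble of each 32-bit symbol value is considered.
   low_raw lands in bits 0-3 and high_raw in bits 4-7 of the header byte.
   For field 10, low_raw would be __prg_ram_size_raw and high_raw would be
   __prg_nvram_size_raw. */
uint8_t pack_nibbles(uint32_t low_raw, uint32_t high_raw) {
    return (uint8_t)(((high_raw & 0x0Fu) << 4) | (low_raw & 0x0Fu));
}
```

Bits above the lowest nibble of either symbol value are simply dropped in this sketch, matching the "only the lowest order bits are considered" rule above.<br />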
<br />
== INCLUDE Configuration ==<br />
In some cases, the linker script semantics are insufficiently powerful to configure the linker alone. In this case, linker scripts can be composed by <code>INCLUDE</code>-ing script files to build up a custom linker script (rather than using the default). The following script files are common to all targets.<br />
{| class="wikitable"<br />
|+INCLUDE Files<br />
!Name<br />
!Description<br />
|-<br />
|<code>common.ld</code><br />
|Functionality common to all linker scripts for a given target. Must be included. Included in default linker script.<br />
|-<br />
|<code>c-in-ram.ld</code><br />
|Place the writable C sections into NES RAM. Exactly one <code>c-in-</code> script must be included. Included in default linker script.<br />
|}<br />
<br />
== [https://www.nesdev.org/wiki/NROM NROM], CNROM ==<br />
<br />
=== INCLUDE Configuration ===<br />
{| class="wikitable"<br />
|+INCLUDE Files<br />
!Name<br />
!Description<br />
|-<br />
|<code>c-in-prg-ram.ld</code><br />
|Place the writable C sections into PRG-RAM.<br />
|}<br />
<br />
== MMC1 ==<br />
Different PRG-ROM sizes of the MMC1 require different numbers of reset stubs to be available, so the PRG-ROM size must be set by INCLUDE, not by symbol. The below INCLUDE files automatically set the <code>__prg_rom_size</code> header field to the corresponding value. The 512KiB PRG-ROM mode and PRG-ROM bank modes other than 3 are not yet supported, since the C runtime model currently used by the compiler is only really practical if a fixed bank is present where e.g. libcalls can be placed.<br />
<br />
=== INCLUDE Configuration ===<br />
{| class="wikitable"<br />
|+INCLUDE Files<br />
!Name<br />
!Description<br />
|-<br />
|<code>c-in-prg-ram-0.ld</code><br />
|Place the writable C sections into PRG-RAM bank 0.<br />
|-<br />
|<code>prg-ram-16.ld</code><br />
|16 KiB of PRG-ROM without banking.<br />
|-<br />
|<code>prg-ram-32.ld</code><br />
|32 KiB of PRG-ROM without banking.<br />
|-<br />
|<code>prg-ram-32-banked.ld</code><br />
|32 KiB of PRG-ROM with banking. Initializes to PRG-ROM bank mode 3, bank 0.<br />
|-<br />
|<code>prg-ram-64.ld</code><br />
|64 KiB of PRG-ROM with banking. Initializes to PRG-ROM bank mode 3.<br />
|-<br />
|<code>prg-ram-128.ld</code><br />
|128 KiB of PRG-ROM with banking. Initializes to PRG-ROM bank mode 3.<br />
|-<br />
|<code>prg-ram-256.ld</code><br />
|256 KiB of PRG-ROM with banking. Initializes to PRG-ROM bank mode 3. Included in default linker script.<br />
|}</div>71.198.117.145https://llvm-mos.org/index.php?title=Frequently_asked_questions&diff=333Frequently asked questions2022-06-28T04:07:22Z<p>71.198.117.145: Add FAQ entry for missing sections.</p>
<hr />
<div>== Why is the compiler removing my infinite loops? ==<br />
Here's a really good article on this: [https://stefansf.de/post/non-termination-considered-harmful/ Non-Termination Considered Harmful in C and C++]<br />
<br />
The short version is that in C++, infinite loops are undefined behavior, and the compiler is free to assume that they cannot happen. This allows the compiler to emit faster code, so it will typically do this whenever it can.<br />
<br />
In C, the situation is a bit more complicated; it can assume that any nontrivial loops terminate, so long as they don't contain certain things.<br />
<br />
Generally, if you want an infinite loop, you have to put something inside it that the C and C++ standards don't allow the compiler to elide. The easiest way to do this is typically <code>asm volatile ("");</code>. This tells the compiler that an inline assembly block with uncertain side effects should be inserted. The compiler doesn't do deep introspection into ASM fragments, so it can't prove that it's safe to elide the loop, and the loop stays.<br />
<br />
This requires the inline-assembly extension; to do this in standard C/C++, just insert a volatile load or store into the loop. This will similarly force the compiler to keep the loop.<br />
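<br />
A minimal sketch of the standard-C approach, using volatile accesses to keep loops alive. The function names are hypothetical and chosen only for illustration.<br />

```c
/* Spin until *flag becomes nonzero. Each iteration performs a volatile
   load, which the compiler is not allowed to remove, so the loop is kept
   even though its body is empty. */
void wait_for_flag(const volatile unsigned char *flag) {
    while (*flag == 0) {
    }
}

/* A deliberately infinite loop. The volatile store inside it counts as a
   side effect, so the compiler cannot assume the loop terminates and
   delete it. */
void halt(void) {
    static volatile unsigned char sink;
    for (;;) {
        sink = 0;
    }
}
```

On a real target, the flag passed to <code>wait_for_flag</code> would typically be set by an interrupt handler or by hardware.<br />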
<br />
The reason for all this is that it allows compilers to assume that code underneath loops will eventually be reached, which allows a variety of optimizations to occur. Modern C/C++ compilers like Clang operate by tearing down the code as written and rewriting it, bit by bit, into something more efficient. The only real constraints on this process are the letter of the C standard; undefined behavior exists to permit generating better code; it just creates unfortunate foot-guns like this one.<br />
<br />
== Why is the compiler removing my memory accesses? ==<br />
This is very similar to the above; if a memory access absolutely has to happen at a particular point in the program, it has to be done through a volatile object. Otherwise, the compiler is free to reason its heart out about what purpose that memory access serves, and if one isn't visible to the compiler, the access may be removed. This is usually very desirable; spurious accesses crop up all the time. For example, if you're incrementing a 16-bit counter variable, and some particular path through the program only ever looks at the low byte, the compiler might skip the high part of the increment. But, you wouldn't want this to happen if this variable was referenced by an OS routine! So, volatile is the blessed way to tell the compiler: "this needs to happen now!".<br />
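<br />
As a small illustration, here is a C sketch of two back-to-back writes to the same location. The function name and the idea of a hardware register that latches each write are assumptions made for the example.<br />

```c
#include <stdint.h>

/* Write two values in a row to the same location. Because reg points to a
   volatile object, both stores must be emitted, in this order. Without
   volatile, the compiler could drop the first store as dead. */
void write_twice(volatile uint8_t *reg, uint8_t first, uint8_t second) {
    *reg = first;
    *reg = second;
}
```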
<br />
== Why isn't the linker placing my sections? ==<br />
There are two main reasons this can occur; the issue could be due to either or both.<br />
<br />
First, because the section wasn't marked "allocatable", which unfortunately isn't the default on linkers compatible with GNU ld for sections without a standard name (e.g., <code>.data</code>). Allocatable just means that the section should take up space in the final binary, as opposed to things that are merely there for linker or CLI utility use, like symbol tables. The syntax to add this flag is: <code>.section ''name'',"a"</code><br />
<br />
Second, because no symbols defined in the section were referenced, the linker may garbage-collect away the section. The linker has a sense of which sections must intrinsically be present (the logic is a bit complex, but it usually does a good job). Any sections that can't be reached from those known roots are removed, to save space. This is one of the features that allows uncalled functions to be stripped out of the binary at link time. To solve this, you can add the "retain" flag to the section declaration: <code>.section ''name'',"R"</code>. Another way to do this is to add <code>KEEP</code> to the linker script when assigning the section: <code>''output_section { KEEP(input_section) }''</code><br />
<br />
See the [https://sourceware.org/binutils/docs/as/Section.html GNU assembler manual] for the section directive and [https://sourceware.org/binutils/docs/ld/Input-Section-Keep.html#Input-Section-Keep GNU ld manual] for KEEP.</div>71.198.117.145https://llvm-mos.org/index.php?title=Modifiers&diff=325Modifiers2022-04-29T21:47:36Z<p>71.198.117.145: </p>
<hr />
<div>When writing MOS assembly code, it's often necessary to refer to a specific byte or short within a larger address, when the exact address is not known until code relaxation or linking. For example, code may need to refer to the low byte of a 16-bit function address:<br />
LDA ''[the lowest byte in the 16-bit address of QSORT]''<br />
However, the address of <code>QSORT</code> may not be exactly known, until your program has had relaxation and/or linking applied to it. Therefore, the lowest byte of <code>QSORT</code> won't be known until then as well.<br />
<br />
In LLVM land, that operator is referred to as a '''modifier'''. The llvm-mos project supports some MOS-specific modifiers, to refer to a specific part of a larger address. If <code>QSORT</code> refers to a 16-bit address:<br />
LDA #mos16lo(QSORT)<br />
then the lowest byte in the <code>QSORT</code> address is loaded into the accumulator.<br />
<br />
The following modifiers are available for your use:<br />
<br />
{| class="wikitable sortable"<br />
|+ MOS Modifiers<br />
|-<br />
! Modifier !! Description <br />
|-<br />
| <code>mos8()</code> || Forces a symbol to be considered an 8-bit (zero page) address.<br />
|-<br />
| <code>mos16lo()</code> or <code><</code> || The lowest byte in the given 16-bit address.<br />
|-<br />
| <code>mos16hi()</code> or <code>></code> || The highest byte in the given 16-bit address.<br />
|-<br />
| <code>mos24bank()</code> || The 8-bit bank portion of the given 24-bit address.<br />
|-<br />
| <code>mos24segment()</code> || The segment 16-bit portion of the given 24-bit address.<br />
|-<br />
| <code>mos24segmentlo()</code> || The lowest byte in the segment of the given 24-bit address.<br />
|-<br />
| <code>mos24segmenthi()</code> || The highest byte in the segment of the given 24-bit address.<br />
|}<br />
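<br />
The arithmetic behind the modifiers in the table can be sketched in C. This is only an illustration: the helper names are hypothetical, and it assumes that the bank is bits 16 to 23 of a 24-bit address and the segment is the low 16 bits.<br />

```c
#include <stdint.h>

/* mos16lo / mos16hi: the low and high bytes of a 16-bit address. */
uint8_t lo16(uint16_t addr) { return (uint8_t)(addr & 0xFFu); }
uint8_t hi16(uint16_t addr) { return (uint8_t)(addr >> 8); }

/* mos24bank: assumed here to be bits 16-23 of a 24-bit address. */
uint8_t bank24(uint32_t addr) { return (uint8_t)((addr >> 16) & 0xFFu); }

/* mos24segment: assumed here to be the low 16 bits of a 24-bit address.
   mos24segmentlo and mos24segmenthi are then lo16 and hi16 of it. */
uint16_t segment24(uint32_t addr) { return (uint16_t)(addr & 0xFFFFu); }
```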
<br />
The assembly parser tries to be smart about what kinds of modifiers can be used for what purposes. For example, <code>mos24segment()</code> returns a 16-bit address, so you can't do <code>LDA #mos24segment(QSORT)</code> because the immediate LDA instruction only accepts 8-bit values.<br />
<br />
Much existing MOS code depends on the less-than operator <code><</code> and the greater-than operator <code>></code> as shorthand for the mos16lo() and mos16hi() functions, respectively. This functionality has been added to MOSAsmParser.cpp, so that you can use either representation in your own programs.<br />
<br />
All MOS processors are little endian, so the lowest byte appears at the zeroth position in a 16-bit address.<br />
<br />
The above syntax for modifiers only works inside assembly language instructions. If you need to use modifiers in assembly directives, the syntax is instead: <code>.byte address@mos16hi</code> (or similar).<br />
[[Category:Assembly]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Porting&diff=316Porting2022-04-23T02:17:26Z<p>71.198.117.145: Start working on porting guide; not yet finished, but don't want to lose changes.</p>
<hr />
<div>We've designed the LLVM-MOS SDK to be as easy as possible to port to new platforms (but no easier). This is a tutorial-style guide on how to do so.<br />
<br />
== Imaginary Target ==<br />
For the purposes of this guide, we'll need a target to port to. Rather than use a real target (which may have an official port by the next time this guide is updated), we'll invent a new one.<br />
<br />
We'll make the target as simple as possible. (Real targets are complicated, but they're all complicated in different ways). Let's say the target has 64KiB of RAM available, with no banking. We'll also imagine that the target has an emulator that's capable of loading programs in some file format. Let's say the file format is very simple: a 64KiB image to load into RAM, followed by two bytes indicating the start address, little-endian.<br />
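<br />
To make the imaginary file format concrete, here is a host-side C sketch that reads and writes the two-byte trailer. The names are hypothetical, and the sketch assumes the whole file has already been loaded into a buffer of <code>FILE_SIZE</code> bytes.<br />

```c
#include <stdint.h>

/* Layout of the imaginary format: a 64 KiB RAM image followed by a
   two-byte little-endian start address. */
enum { IMAGE_SIZE = 0x10000, FILE_SIZE = IMAGE_SIZE + 2 };

/* Read the start address from the trailer (low byte first). */
uint16_t start_address(const uint8_t *file) {
    return (uint16_t)(file[IMAGE_SIZE] | (file[IMAGE_SIZE + 1] << 8));
}

/* Store the start address into the trailer (low byte first). */
void set_start_address(uint8_t *file, uint16_t start) {
    file[IMAGE_SIZE] = (uint8_t)(start & 0xFFu);
    file[IMAGE_SIZE + 1] = (uint8_t)(start >> 8);
}
```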
<br />
== The Simplest Program ==<br />
First, make sure the latest SDK release is extracted somewhere, and make a directory to work in. You can do most of this tutorial without the SDK sources; you only need the SDK sources if you're looking to contribute your port to the SDK. (But please do!)<br />
<br />
Next, create the simplest possible C program: <code>main.c</code><syntaxhighlight lang="c"><br />
int main(void) { return 0; }<br />
</syntaxhighlight><br />
<br />
== Parent Target ==<br />
The SDK's targets are hierarchical: a target can have an ''incomplete target'' as a parent. The parent is called ''incomplete'' because the child fills in missing pieces of it. An incomplete target can also have a different incomplete target as a parent, forming a tree. The ''complete'' targets form the leaves of this tree; only these can produce binaries.<br />
<br />
For porting a real target, take a look at the SDK; there may already be an incomplete target for the family of devices or boards you're porting to. Completing an incomplete target is much much easier than building one from scratch.<br />
<br />
Along those lines, the tree of targets is rooted at the <code>common</code> target. This provides functionality that is essential to running C on a 6502; it can be reasonably shared by all targets.<br />
<br />
Since we're porting to a fake target, we should select <code>common</code> as our parent target. This means to compile our code, we should invoke <code>clang</code> as <code>mos-common-clang</code>.<br />
<br />
== Compiling: First Attempt ==<br />
Let's compile:<syntaxhighlight lang="shell-session"><br />
$ mos-common-clang -o main -Os main.c<br />
ld.lld: error: cannot find linker script link.ld<br />
</syntaxhighlight>This fails because the linker has no earthly idea how to lay out code for our platform. For this, we have to provide a linker script, <code>link.ld</code>.<br />
<br />
== Linker Script ==<br />
The linker scripts are based on GNU ld linker scripts ([https://sourceware.org/binutils/docs/ld/ reference]), which are extended by LLD ([https://lld.llvm.org/ELF/linker_script.html reference]), which are further extended by LLVM-MOS ([[Linker Script|reference]]).<br />
<br />
There's a lot of functionality packed behind these little scripts; it can take time to learn the language thoroughly. However, you don't need very much to get started.<br />
<br />
Here's a minimal linker script for our platform: <code>link.ld</code><syntaxhighlight lang="text"><br />
MEMORY { ram (rw) : ORIGIN = 0x200, LENGTH = 0xfe00 }<br />
SECTIONS { INCLUDE c.ld }<br />
<br />
__rc0 = 0x00;<br />
INCLUDE imag-regs.ld<br />
ASSERT(__rc0 == 0x00, "Inconsistent zero page map.")<br />
ASSERT(__rc31 == 0x1f, "Inconsistent zero page map.")<br />
</syntaxhighlight>The <code>MEMORY</code> section describes the layout of the RAM available for the linker to put linked sections in. This typically excludes the zero page and stack; these are usually handled by other mechanisms. This <code>MEMORY</code> section states that there's a memory region named <code>ram</code> suitable to be assigned both read-only and writable sections. It starts at 0x200, and it ends at the end of RAM.<br />
<br />
The <code>SECTIONS</code> directive states which sections from input files the linker should place in which output sections, as well as symbols relating to section placement. The linker will automatically place all sections in the <code>ram</code> region, which is what we want.<br />
<br />
The next bit of linker script assigns symbols <code>__rc0</code> through <code>__rc31</code> to addresses <code>0</code> through <code>31</code>. This defines the "imaginary registers" in the zero page that are reserved for compiler use (and that form the C calling convention). <code>INCLUDE imag-regs.ld</code> is a helper script that automatically assigns each unset register to the register before it + 1. Thus, you only need to set the first register to zero, and the script takes care of the rest. Note that you can only specify the locations of even registers; the odd registers are fixed, since they must immediately follow the preceding register for the pair to work as a pointer.<br />
<br />
== Compiling: Second Attempt ==<br />
Let's try again!<syntaxhighlight lang="shell-session"><br />
$ mos-common-clang -o main -Os main.c<br />
ld.lld: error: cannot find linker script link.ld<br />
</syntaxhighlight></div>71.198.117.145https://llvm-mos.org/index.php?title=Welcome&diff=313Welcome2022-04-15T02:15:43Z<p>71.198.117.145: Add NES mention</p>
<hr />
<div>[[Category:Main]]<br />
[[File:Hello-vic20.png|thumb|Hello world of LLVM assembler targeting Commodore VIC-20]]<br />
[[File:Hello-apple2.png|thumb|Hello world of LLVM assembler targeting Apple IIe]]<br />
[[File:Hello-c64.png|thumb|Hello world of LLVM assembler targeting Commodore 64]]<br />
[[File:Rust-hello-atari-800.png|thumb|Hello world in Rust, with factorial calculation, for Atari 800, proof of concept by mrk]]<br />
<br />
== Welcome to the llvm-mos project! ==<br />
<br />
llvm-mos is a fork of [https://llvm.org/ LLVM] supporting the [[wikipedia:MOS_Technology|MOS Technology]] 65xx series of microprocessors and their clones.<br />
<br />
To get started playing with the tools, check out [https://github.com/llvm-mos/llvm-mos-sdk#getting-started Getting started] with our SDK.<br />
<br />
There have been many failed attempts to create a 6502 backend for LLVM. Ours is the first to successfully compile working programs. The llvm-mos Clang is broadly compatible with freestanding C99 and C++ (with some notable exceptions) and the relevant portions of the LLVM end-to-end test suite pass on a simulated 6502 in a variety of configurations. The project also includes a feature-complete assembler and ELF linker support for generic 6502 targets.<br />
<br />
This project permits modern C programs, written in a modern style, to target common microcomputers of the 1980s, including but not limited to the [[wikipedia:Commodore_64|Commodore 64]], the [[wikipedia:Apple_IIe|Apple IIe]], [[wikipedia:Atari_8-bit_family|Atari 8-bit family]], and the NES.<br />
<br />
Our work is based on LLVM's novel [https://llvm.org/docs/GlobalISel/index.html GlobalISel] architecture, and our compiler is ''aggressive'' about pursuing optimization opportunities for the 65xx series. While there's still much work to be done, we've already overcome the major theoretical hurdles necessary to emit high quality 6502 code. <br />
<br />
Core development occurs on a [https://github.com/llvm-mos project on Github]. Acceptance tests and packaging occur via [https://docs.github.com/en/actions/learn-github-actions/introduction-to-github-actions Github actions] as well. <br />
<br />
Ongoing, public development discussions occur on Slack. If you're an experienced programmer, with a detailed understanding of the LLVM architecture, then [https://join.slack.com/t/llvm-mos/shared_invite/zt-t88fyh4i-EGCLe~MSlHdz3~h~yYHgFA please join our Slack group now] and help out.<br />
<br />
==== Notice ====<br />
The llvm-mos project is not officially affiliated with or endorsed by the LLVM Foundation or LLVM project. Our project is a fork of LLVM that provides a new backend and Clang target; our project is based on LLVM, not a part of LLVM. Our use of LLVM or other related trademarks does not imply affiliation or endorsement.<br />
<br />
=== Category tree ===<br />
<br />
<CategoryTree mode="pages" depth="3" hideroot="on">Main</CategoryTree><br />
<br />
=== Categories ===<br />
<br />
{{Special:AllPages|namespace=14}}<br />
<br />
=== Pages ===<br />
<br />
{{Special:AllPages}}</div>71.198.117.145https://llvm-mos.org/index.php?title=Linker_Script&diff=312Linker Script2022-03-21T01:26:06Z<p>71.198.117.145: Clarify section vs region.</p>
<hr />
<div>The LLVM-MOS linker script format is an extension of the LLD linker script format, which is itself an extension of the GNU ld linker script format.<br />
<br />
See [https://sourceware.org/binutils/docs/ld/Scripts.html the ld manual] for a general reference to ld-style linker scripts.<br />
<br />
See LLD's [https://lld.llvm.org/ELF/linker_script.html linker script implementations notes and policy] for LLD's extensions and interpretations of ld's behavior.<br />
<br />
This page describes LLVM-MOS's extensions of LLD's behavior.<br />
<br />
===Custom Output Formats===<br />
LLVM-MOS provides an extension to the <code>OUTPUT_FORMAT</code> syntax to allow specifying the precise bytes produced in the output file generated by the linker. This subsumes and extends the functionality provided by ld/LLD's <code>OUTPUT_FORMAT(binary)</code>.<br />
<br />
The format of the extension is:<br />
OUTPUT_FORMAT<br />
{<br />
''output-format-command''<br />
''output-format-command''<br />
...<br />
}<br />
The commands form a script that outputs bytes to the output file, from top to bottom. When this section is present, the usual output file contains these bytes, but an additional file is created with the <code>.elf</code> file extension containing the ELF file.<br />
<br />
Each ''output-format-command'' can be:<br />
<br />
* A byte command: <code>BYTE(''expr'')</code>, <code>SHORT(''expr'')</code>, <code>LONG(''expr'')</code>, <code>QUAD(''expr'')</code> <br />
* A memory region command: <code>FULL(''mem-region'')</code>, <code>TRIM(''mem-region'')</code><br />
<br />
Byte commands have the same syntax and semantics as those occurring in output section descriptions; they output the values of the given expressions as little-endian bytes of the corresponding size.<br />
<br />
Memory region commands output the loaded contents of the named memory region, as described in <code>MEMORY { }</code>. The contents of the region are the contents of all output sections with LMA (load address) anywhere within the region. The relative locations of each byte are determined by their LMAs. Note that an output section doesn't need to be explicitly assigned to the memory region with <code>></code> or <code>AT></code> to be included; it just needs to have at least one byte with an LMA that overlaps with the region. Any unreferenced portions of the memory region are filled with zeros. <br />
<br />
<code>FULL</code> causes the full memory region to be emitted, from its origin through its full length. <code>TRIM</code> causes any trailing unreferenced bytes to be trimmed from the region before it's emitted. The last output byte corresponds to the highest LMA in any output section that overlaps with the region.<br />
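<br />
The difference between <code>FULL</code> and <code>TRIM</code> can be modeled with a short host-side C sketch. This is not the linker's implementation; the <code>struct span</code> type and function name are hypothetical, and the sketch only computes how many bytes <code>TRIM</code> would emit for a region, where <code>FULL</code> would always emit the region's whole length.<br />

```c
#include <stddef.h>
#include <stdint.h>

/* Load range of one output section: LMA of its first byte, and its size. */
struct span {
    uint32_t lma;
    uint32_t size;
};

/* Number of bytes TRIM(region) would emit under this model: from the
   region's origin up to and including the highest LMA of any output
   section that overlaps the region. Sections outside the region are
   ignored, and sections running past the end are clipped to it. */
uint32_t trim_length(uint32_t origin, uint32_t length,
                     const struct span *secs, size_t count) {
    uint32_t region_end = origin + length;
    uint32_t highest_end = origin;
    for (size_t i = 0; i < count; i++) {
        uint32_t sec_end = secs[i].lma + secs[i].size;
        if (secs[i].size == 0 || sec_end <= origin || secs[i].lma >= region_end)
            continue; /* no byte of this section lies inside the region */
        if (sec_end > region_end)
            sec_end = region_end;
        if (sec_end > highest_end)
            highest_end = sec_end;
    }
    return highest_end - origin;
}
```

For a region with origin <code>0x200</code> and length <code>0xfe00</code> whose last loaded byte is at <code>0x40f</code>, this model gives a <code>TRIM</code> length of <code>0x210</code> bytes, while <code>FULL</code> would emit all <code>0xfe00</code>.<br />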
<br />
The purpose of this extension is to allow defining custom file formats without adding code to the linker or <code>llvm-objcopy</code>. The byte commands can be used to construct header bytes based on the final locations of various symbols, and the load addresses can be chosen in such a fashion to provide a unique namespace for each byte that should be included in a binary image to be used by a 6502 target platform. See the SDK for examples of how this can be used.<br />
[[Category:Linking]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=309C calling convention2022-03-08T06:11:09Z<p>71.198.117.145: Update calling convention to make small aggregates passed by value.</p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* A, X, Y, C, N, V, Z and RS1 to RS12 (RC2 to RC28) are caller-saved. A function may freely overwrite any of these, and the function's callers have to just deal with it.<br />
* PC, S, D, I, RS0 (RC0 and RC1), and RS9 to RS15 (RC18 to RC31) are callee-saved. A function can use them freely, but before it returns it has to put them back exactly the way it found them, and the function's callers can rely on this behavior.<br />
* The bytes composing numeric arguments are passed individually in A, then X, then RC2 to RC15.<br />
* Pointers are assigned to imaginary register pairs, functioning as pointer registers (i.e., RS1=(RC2, RC3) to RS7=(RC14, RC15)).<br />
* If no registers remain available, values are passed through the soft stack.<br />
* Aggregate types (structs, arrays, etc.) 4 bytes or smaller are split into their individual value types, and each is passed individually. Such types are also returned by value.<br />
* Aggregate types larger than 4 bytes are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. Such types are returned by a pointer passed as an implicit first argument. The resulting function then returns void.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
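<br />
The 4-byte threshold for aggregates in the list above can be illustrated with a small C sketch. The struct types and the helper are hypothetical; the helper only restates the size rule and says nothing about which registers are actually used.<br />

```c
#include <stddef.h>

/* A 2-byte aggregate: small enough to be split into its members and
   passed (and returned) by value under the rule above. */
struct point {
    signed char x, y;
};

/* A 6-byte aggregate: larger than 4 bytes, so passed by pointer. */
struct big {
    char bytes[6];
};

/* Restates the size rule from the list: 4 bytes or smaller is split into
   individual values, anything larger goes by pointer. */
int passed_by_value(size_t aggregate_size) {
    return aggregate_size <= 4;
}
```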
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
Our calling convention is roughly based on RISC-V, suggested after a discussion with one of their working group members.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Welcome&diff=304Welcome2022-02-25T18:27:38Z<p>71.198.117.145: Update the welcome page in preparation for initial release.</p>
<hr />
<div>[[Category:Main]]<br />
[[File:Hello-vic20.png|thumb|Hello world of LLVM assembler targeting Commodore VIC-20]]<br />
[[File:Hello-apple2.png|thumb|Hello world of LLVM assembler targeting Apple IIe]]<br />
[[File:Hello-c64.png|thumb|Hello world of LLVM assembler targeting Commodore 64]]<br />
[[File:Rust-hello-atari-800.png|thumb|Hello world in Rust, with factorial calculation, for Atari 800, proof of concept by mrk]]<br />
<br />
== Welcome to the llvm-mos project! ==<br />
<br />
llvm-mos is a fork of [https://llvm.org/ LLVM] supporting the [[wikipedia:MOS_Technology|MOS Technology]] 65xx series of microprocessors and their clones.<br />
<br />
To get started playing with the tools, check out [https://github.com/llvm-mos/llvm-mos-sdk#getting-started Getting started] with our SDK.<br />
<br />
There have been many failed attempts to create a 6502 backend for LLVM. Ours is the first to successfully compile working programs. The llvm-mos Clang is broadly compatible with freestanding C99 and C++ (with some notable exceptions) and the relevant portions of the LLVM end-to-end test suite pass on a simulated 6502 in a variety of configurations. The project also includes a feature-complete assembler and ELF linker support for generic 6502 targets.<br />
<br />
This project permits modern C programs, written in a modern style, to target common microcomputers of the 1980s, including but not limited to the [[wikipedia:Commodore_64|Commodore 64]], the [[wikipedia:Apple_IIe|Apple IIe]], and the [[wikipedia:Atari_8-bit_family|Atari 8-bit family]].<br />
<br />
Our work is based on LLVM's novel [https://llvm.org/docs/GlobalISel/index.html GlobalISel] architecture, and our compiler is ''aggressive'' about pursuing optimization opportunities for the 65xx series. While there's still much work to be done, we've already overcome the major theoretical hurdles necessary to emit high quality 6502 code. <br />
<br />
Core development occurs on a [https://github.com/llvm-mos project on Github]. Acceptance tests and packaging occur via [https://docs.github.com/en/actions/learn-github-actions/introduction-to-github-actions Github actions] as well. <br />
<br />
Ongoing, public development discussions occur on Slack. If you're an experienced programmer, with a detailed understanding of the LLVM architecture, then [https://join.slack.com/t/llvm-mos/shared_invite/zt-t88fyh4i-EGCLe~MSlHdz3~h~yYHgFA please join our Slack group now] and help out.<br />
<br />
==== Notice ====<br />
The llvm-mos project is not officially affiliated with or endorsed by the LLVM Foundation or LLVM project. Our project is a fork of LLVM that provides a new backend and Clang target; our project is based on LLVM, not a part of LLVM. Our use of LLVM or other related trademarks does not imply affiliation or endorsement.<br />
<br />
=== Category tree ===<br />
<br />
<CategoryTree mode="pages" depth="3" hideroot="on">Main</CategoryTree><br />
<br />
=== Categories ===<br />
<br />
{{Special:AllPages|namespace=14}}<br />
<br />
=== Pages ===<br />
<br />
{{Special:AllPages}}</div>71.198.117.145https://llvm-mos.org/index.php?title=C_volatile&diff=302C volatile2022-02-18T02:15:33Z<p>71.198.117.145: Add category.</p>
<hr />
<div>The compiler will freely issue indexed addressing modes wherever appropriate. Due to a processor bug, these addressing modes can cause NMOS 6502 to issue a spurious read to a memory location one page lower than the one intended. These reads are not considered to constitute an access to a volatile object, even if they overlap with one in memory.<br />
<br />
To avoid accidentally triggering I/O, pointer arithmetic should be avoided for addresses one page above read-sensitive I/O locations. To be safe, such routines can be written in assembly.<br />
<br />
The read/modify/write (RMW) instructions (e.g., INC) also issue a spurious read or write to the location, depending on the 6502 version. If the underlying object is volatile, these accesses would be considered to be volatile accesses; accordingly, the compiler avoids generating RMW instructions for such addresses. Instead, the compiler will manually issue the load, manipulate the value, and store it back.<br />
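<br />
The RMW rule above can be illustrated with a minimal sketch in C. The helper name <code>bump_counter</code> is hypothetical and not part of any SDK; the point is only that an increment of a volatile object is one volatile read followed by one volatile write, which is why the compiler is expected to emit a separate load, add, and store rather than a single INC.<br />

```c
#include <stdint.h>

/* Hypothetical helper (not an SDK function): increment a byte that
 * may live in a read- or write-sensitive I/O register. */
static inline void bump_counter(volatile uint8_t *reg) {
    uint8_t value = *reg;          /* exactly one volatile read  */
    value = (uint8_t)(value + 1);  /* arithmetic on a plain temporary */
    *reg = value;                  /* exactly one volatile write */
}
```

Writing <code>(*reg)++</code> on a volatile object should behave the same way; the explicit form just makes the two accesses visible in the source.<br />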
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_volatile&diff=301C volatile2022-02-18T02:14:55Z<p>71.198.117.145: Add C volatile page.</p>
<hr />
<div>The compiler will freely issue indexed addressing modes wherever appropriate. Due to a processor bug, these addressing modes can cause NMOS 6502 to issue a spurious read to a memory location one page lower than the one intended. These reads are not considered to constitute an access to a volatile object, even if they overlap with one in memory.<br />
<br />
To avoid accidentally triggering I/O, pointer arithmetic should be avoided for addresses one page above read-sensitive I/O locations. To be safe, such routines can be written in assembly.<br />
<br />
The read/modify/write (RMW) instructions (e.g., INC) also issue a spurious read or write to the location, depending on the 6502 version. If the underlying object is volatile, these accesses would be considered to be volatile accesses; accordingly, the compiler avoids generating RMW instructions for such addresses. Instead, the compiler will manually issue the load, manipulate the value, and store it back.</div>71.198.117.145https://llvm-mos.org/index.php?title=Getting_started&diff=300Getting started2022-01-27T04:38:40Z<p>71.198.117.145: Soften the warning; we're getting closer.</p>
<hr />
<div>NOTICE: There's still much optimization work to be done before our first official release. Please don't publicly review, compare, or benchmark against other compilers at this time.<br />
<br />
To keep this project a clean fork of LLVM, no target-specific source code or libraries are part of this project. These are contained in the related llvm-mos-sdk. The default mos target will only use compiler built-in include and library paths (e.g., stdint.h), so the compiler can technically be used without the SDK; however, this means that you will have to provide your own libc and your own run-time initialization. If you don't understand what this means, then you should use llvm-mos in conjunction with the llvm-mos-sdk.<br />
<br />
For more information about this project, please see llvm-mos.org.<br />
<br />
For information about the current status of this project, please see [[Current status|Current status.]]<br />
<br />
To learn why this project exists, please see [[Rationale]].<br />
<br />
= Getting started =<br />
<br />
== Download the LLVM-MOS tools ==<br />
If you want to play with the current state of the LLVM-MOS toolchain, you may not have to build LLVM-MOS from source code yourself. Instead, just download the most recent binaries for your platform:<br />
<br />
* [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-darwin-main MacOS]<br />
* [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-linux-main Linux]<br />
* [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-windows-main Windows]<br />
<br />
These binaries are built from the main branch of the LLVM-MOS project, using Github's actions functionality.<br />
<br />
== Or, build the LLVM-MOS tools ==<br />
However, if you're allergic to precompiled binaries, or your platform is not listed above, then you'll need to compile LLVM-MOS for your own platform.<br />
<br />
Generally, compiling LLVM-MOS follows the same convention as compiling LLVM. First, please review the hardware and software requirements for building LLVM.<br />
<br />
Once you meet those requirements, you may use the following formula within your build environment:<br />
<br />
=== Clone the LLVM-MOS repository ===<br />
On Linux and MacOS:<br />
<code>git clone <nowiki>https://github.com/llvm-mos/llvm-mos.git</nowiki></code><br />
On Windows:<br />
<code>git clone --config core.autocrlf=false <nowiki>https://github.com/llvm-mos/llvm-mos.git</nowiki></code><br />
If you fail to use the --config flag as above, then verification tests will fail on Windows.<br />
<br />
=== Configure the LLVM-MOS project ===<br />
<code>cd llvm-mos<br />
cmake -C clang/cmake/caches/MOS.cmake [-G <generator>] -S llvm -B build [...]</code><br />
This configuration command seeds the CMake cache with values from MOS.cmake. Feel free to review and adjust these values for your environment.<br />
<br />
Additional options can be added to the cmake command, which override the values provided in MOS.cmake. A handful are listed below. For a complete list of options, see Building LLVM with CMake.<br />
<br />
* <code>-G <generator></code> --- Lets you choose the CMake generator for your build environment. CMake will try to automatically detect your build tools and use them; however, it's recommended to install Ninja and pass Ninja as the parameter to the -G command.<br />
* <code>-DLLVM_ENABLE_PROJECTS=...</code> --- semicolon-separated list of the LLVM sub-projects you'd like to additionally build. Can include any of: clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or debuginfo-tests.<br />
* <code>-DCMAKE_INSTALL_PREFIX=directory</code> --- Specify for ''directory'' the full path name of where you want the LLVM tools and libraries to be installed (default <code>/usr/local</code>).<br />
* <code>-DCMAKE_BUILD_TYPE=type</code> --- Valid options for ''type'' are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is MinSizeRel, if you are using the MOS.cmake cache file.<br />
* <code>-DLLVM_ENABLE_ASSERTIONS=On</code> --- Compile with assertion checks enabled (default is Yes for Debug builds, No for all other build types).<br />
<br />
=== Build the LLVM-MOS project ===<br />
<code>cmake --build build [-- [options] <target>]</code><br />
The default target will build all of LLVM. The <code>check-all</code> target will run the regression tests. The <code>distribution</code> target will build a collection of all the LLVM-MOS tools, suitable for redistribution.<br />
<br />
CMake will generate targets for each tool and library, and most LLVM sub-projects generate their own <code>check-<project></code> target.<br />
<br />
Running a serial build will be slow. To improve speed, try running a parallel build. That's done by default in Ninja; for <code>make</code>, use the option <code>-j NNN</code>, where <code>NNN</code> is the number of parallel jobs, e.g. the number of CPUs you have.<br />
<br />
= Help us out =<br />
We need your help! Please review the issue tracker, please review the current state of the code, and jump in and help us with pull requests for bug fixes.<br />
<br />
All LLVM-MOS code is expected to ''strictly'' observe the LLVM coding standards. That means your code must have been run through clang-format with the --style set to LLVM, and clang-tidy with the LLVM coding conventions with the llvm-*, modernize-*, and cppcore-* checks enabled. If your code does not observe these standards, there's a good chance we'll reject it, unless you have a ''good reason'' for not observing these rules.<br />
<br />
If you add new functionality or an optimization pass to LLVM-MOS, we're not going to accept it unless you have modified the associated test suite to exercise your new functionality. Drive-by feature pulls will probably not be accepted, unless their new functionality is too trivial to be tested. GlobalISel gives you no excuses ''not'' to write a full test suite for your codegen pass or your new functionality.<br />
<br />
You can submit well-written, carefully researched issue requests via the issue tracker. Please note, we don't have the bandwidth yet to handle "why dosent my pogrem compil" type requests.<br />
<br />
Additionally, the current state of our documentation at [[Welcome|https://llvm-mos.org]] can always use improvements and clarifications.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Welcome&diff=299Welcome2022-01-27T04:35:37Z<p>71.198.117.145: Update wording on the welcome page.</p>
<hr />
<div>[[Category:Main]]<br />
[[File:Hello-vic20.png|thumb|Hello world of LLVM assembler targeting Commodore VIC-20]]<br />
[[File:Hello-apple2.png|thumb|Hello world of LLVM assembler targeting Apple IIe]]<br />
[[File:Hello-c64.png|thumb|Hello world of LLVM assembler targeting Commodore 64]]<br />
[[File:Rust-hello-atari-800.png|thumb|Hello world in Rust, with factorial calculation, for Atari 800, proof of concept by mrk]]<br />
<br />
== Welcome to the llvm-mos project! ==<br />
<br />
The llvm-mos project is intended to create a first-class [https://github.com/llvm-mos backend] in [https://llvm.org/ LLVM] for the [[wikipedia:MOS_Technology|MOS Technology]] 65xx series of microprocessors and their clones.<br />
<br />
To get started playing with the tools, check out [[Getting started]].<br />
<br />
There have been many failed attempts to create a 6502 backend for LLVM. Ours is the first to successfully compile working programs. The llvm-mos Clang is broadly compatible with freestanding C99, and the relevant portions of the LLVM end-to-end test suite pass on a simulated 6502 in a variety of configurations. The project also includes a feature-complete assembler and ELF linker support for generic 6502 targets.<br />
<br />
This project permits modern C programs, written in a modern style, to target common microcomputers of the 1980s, including but not limited to the [[wikipedia:Commodore_64|Commodore 64]], the [[wikipedia:Apple_IIe|Apple IIe]], and the [[wikipedia:Atari_8-bit_family|Atari 8-bit family]].<br />
<br />
Our work is based on LLVM's novel [https://llvm.org/docs/GlobalISel/index.html GlobalISel] architecture, and our compiler is ''aggressive'' about pursuing optimization opportunities for the 65xx series. While there's still much work to be done, we've already overcome the major theoretical hurdles necessary to emit high quality 6502 code. <br />
<br />
The development team has established a [https://github.com/llvm-mos project on Github]. Acceptance tests and packaging occur via [https://docs.github.com/en/actions/learn-github-actions/introduction-to-github-actions Github actions] as well. <br />
<br />
We provide current builds of the main branch of the llvm-mos tool chain for [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-windows-main Windows], [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-darwin-main MacOS], and [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-linux-main Ubuntu Linux]. <br />
<br />
Ongoing, public development discussions occur on Slack. If you're an experienced programmer, with a detailed understanding of the LLVM architecture, then [https://join.slack.com/t/llvm-mos/shared_invite/zt-t88fyh4i-EGCLe~MSlHdz3~h~yYHgFA please join our Slack group now] and help out.<br />
<br />
==== Notice ====<br />
The llvm-mos project is not officially affiliated with or endorsed by the LLVM Foundation or LLVM project. Our project is a fork of LLVM that provides a new backend and Clang target; our project is based on LLVM, not a part of LLVM. Our use of LLVM or other related trademarks does not imply affiliation or endorsement.<br />
<br />
=== Category tree ===<br />
<br />
<CategoryTree mode="pages" depth="3" hideroot="on">Main</CategoryTree><br />
<br />
=== Categories ===<br />
<br />
{{Special:AllPages|namespace=14}}<br />
<br />
=== Pages ===<br />
<br />
{{Special:AllPages}}</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=295C calling convention2021-11-16T07:40:46Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* A, X, Y, C, N, V, Z and RS1 to RS12 (RC2 to RC28) are caller-saved. A function may freely overwrite any of these, and the function's callers have to just deal with it.<br />
* PC, S, D, I, RS0 (RC0 and RC1), and RS9 to RS15 (RC18 to RC31) are callee-saved. A function can use them freely, but before it returns it has to put them back exactly the way it found them, and the function's callers can rely on this behavior.<br />
* The bytes composing numeric arguments are passed individually in A, then X, then RC2 to RC15.<br />
* Pointers are assigned to imaginary register pairs, functioning as pointer registers (i.e., RS1=(RC2, RC3) to RS7=(RC14, RC15)).<br />
* If no registers remain available, values are passed through the soft stack.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
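<br />
The aggregate and variable-argument rules above can be sketched in plain C. All names here (<code>struct point</code>, <code>make_point</code>, <code>make_point_lowered</code>, <code>sum_ints</code>) are made up for illustration, and the "lowered" form is only an approximation of what Clang does internally, not its actual output.<br />

```c
#include <stdarg.h>
#include <stdint.h>

struct point { int16_t x, y; };

/* What the programmer writes: an aggregate returned by value. */
struct point make_point(int16_t x, int16_t y) {
    struct point p = { x, y };
    return p;
}

/* Roughly the shape the convention gives it: the result pointer is an
 * implicit first argument and the function itself returns void. */
void make_point_lowered(struct point *result, int16_t x, int16_t y) {
    result->x = x;
    result->y = y;
}

/* Variadic function: 'count' is a named argument and is passed as
 * usual; everything inside the ellipsis goes through the stack.
 * Callers must see this prototype before calling it. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;
    va_start(args, count);
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

Calling <code>sum_ints</code> without a visible prototype would make the compiler use the regular convention for the call, which does not match what the callee expects.<br />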
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
Our calling convention is roughly based on RISC-V, suggested after a discussion with one of their working group members.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Imaginary_registers&diff=294Imaginary registers2021-11-16T07:37:06Z<p>71.198.117.145: </p>
<hr />
<div>The MOS 65xx series has a 256-byte range of low memory referred to as zero page. Memory stored in this region is significantly faster to access and modify.<br />
<br />
Like all modern compilers, LLVM assumes generally that the target machine has a large number of target registers that are more or less interchangeable. This restriction is loosened, at some cost to code complexity, for the X86 targets, but generally most targets assume that you have a range of compiler-controllable registers.<br />
<br />
Because "virtual registers" has another predefined meaning in LLVM land, we use the term '''imaginary register''' to refer to a byte in zero page memory. It's represented at codegen time by a symbol like __rc17 or __rs5, which is then translated to an actual memory address by the linker at link time.<br />
<br />
Presently, the compiler requires 16 imaginary pointers (two contiguous bytes each), and the compiler doesn't presently have the means to make use of more. In the future, the compiler should make use of otherwise unused sections of the zero page for temporaries, local, and global variables.<br />
<br />
LLVM-MOS does not assume that imaginary registers need to be consecutive! Many targets have non-consecutive usable zero page memory locations.<br />
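<br />
As a rough model of the layout rule, the sketch below checks a table of zero page addresses for the 32 byte registers. The addresses are invented for illustration and do not describe any real target; in practice the linker assigns them. The only constraint modeled is the one stated above: each imaginary pointer occupies two contiguous bytes, while different pointers may be scattered.<br />

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_RC 32  /* 16 imaginary pointers, two bytes each */

/* Hypothetical zero page placement: pairs are scattered, but the two
 * bytes of each pair (rc[2n], rc[2n+1]) sit next to each other. */
static const uint8_t rc_address[NUM_RC] = {
    0x02, 0x03,  0x04, 0x05,  0x10, 0x11,  0x12, 0x13,
    0x22, 0x23,  0x24, 0x25,  0x40, 0x41,  0x42, 0x43,
    0x80, 0x81,  0x82, 0x83,  0x90, 0x91,  0x92, 0x93,
    0xa0, 0xa1,  0xa2, 0xa3,  0xf0, 0xf1,  0xf2, 0xf3,
};

/* Returns true if every pointer register rs[n] = (rc[2n], rc[2n+1])
 * is made of two adjacent zero page bytes. */
bool layout_is_valid(const uint8_t addr[NUM_RC]) {
    for (int rs = 0; rs < NUM_RC / 2; ++rs) {
        /* Integer promotion keeps 0xff + 1 from wrapping to 0x00. */
        if (addr[2 * rs + 1] != addr[2 * rs] + 1)
            return false;
    }
    return true;
}
```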
[[Category:Code generation]]<br />
[[Category:Assembly]]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Findings&diff=288Findings2021-10-17T04:58:39Z<p>71.198.117.145: Update findings page.</p>
<hr />
<div>Efficiently compiling C to the 6502 is a notoriously difficult problem. LLVM is an amazing platform for building compilers, and we've found that it presents some exciting new solutions, as well as new challenges.<br />
<br />
There are two main obstacles when turning C into efficient machine code: stacks and registers.<br />
<br />
== Stacks ==<br />
Past a certain point, processors were designed with C and other stack-bearing high-level languages in mind, while the 6502 decidedly predates this. C's runtime model explicitly states that the automatic variables of a function invocation must be kept separately on a per-invocation basis. The only efficient way to do this is with a stack. But the fastest available implementation of a sufficiently-large stack on the 6502 is quite slow, and much slower than the usual zeitgeist of assembly-language programming.<br />
<br />
While it's true that the standard C runtime model is quite hostile to 6502 performance, the C standard provides broad latitude for alternative models that behave in all points "as if" they were the C model. In the broad space of possible alternatives, we've found a collection of techniques that broadly preserve C standard compatibility while emitting very high quality code. To put it another way, we go to great lengths to emit code that operates ''as if there were stacks'', without using stacks at all. LLVM's sophistication facilitates this. The analyses required are quite intricate, given that the compiler needs to correctly deal with recursive code as well as interrupt-driven reentrant code, but most of them are slight modifications to data structures already available in LLVM.<br />
<br />
== Registers ==<br />
The 6502 has only three relatively general-purpose registers. One is an accumulator, while the other two are index registers. Most instructions bake in which register they operate on, which is quite different than most modern CPUs, which take register numbers in an operand field in the instruction encoding. Since registers are few, it's difficult to keep values in them for any length of time. Additionally, different instructions are constrained to use different registers. Few registers and tight register constraints are both poison to a traditional Chaitin-style register allocator, which causes a proliferation of code to spill values to and from the stack. This in turn requires additional registers, producing a horrible soup of spill-reload-spill-reload.<br />
<br />
The original 6502 designers were well aware of the 6502's register limitations, so they provided zero-page addressing modes to compensate. The zero page locations are often treated much like processor registers; we take this view at face value (although we're not the first 6502 compiler to do so). We present ranges of zero page memory to LLVM in the form of ''imaginary registers''.<br />
<br />
Instead of keeping instructions like LDA, LDX, and LDY separate, we merge them together into a logical instruction set. This would have one instruction: LD, which takes either A, X, or Y as argument. The distinction is subtle, but this makes the instruction set much more regular, which makes the register allocation problem easier to solve.<br />
<br />
Together, these approaches take our backend from "alien nightmare" to "ugly duckling", not unlike x86 or AVR. Normal register allocation techniques apply, since the logical instruction set often treats different zero page locations and processor registers identically. While the instruction set remains a bit unusual, it's not ''that'' much worse than x86, and LLVM's register allocator is fully capable of handling it, even if the relationship is a bit strained.<br />
<br />
== Challenges ==<br />
LLVM's complexity is both a blessing and a curse. While the sheer scope of the platform has given us the basis to solve the above problems, working with it is a daunting and complicated task. Much of the really important stuff is barely documented, and many of the compiler's passes are on the level of someone's PhD thesis.<br />
<br />
Luckily, all that complexity gives us the support necessary to shoehorn the 6502 into it. It already supports AVR, weird DSP chips, GPUs, and WebAssembly (which doesn't even HAVE registers; it's a stack machine!). It even supports IBM SystemZ! All of the hooks necessary to emit good code for those platforms also help us get good code out of the 6502. We stand on the shoulders of weird, misshapen giants.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=287C calling convention2021-10-15T02:29:22Z<p>71.198.117.145: Correct calling convention.</p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The caller-saved registers are flags, A, X, Y, and RS1 to RS14 (RC1 to RC29). A function may freely overwrite any of these, and the function's callers have to just deal with it. All other registers are callee-saved. A function can use them freely, but before it returns it has to put them back exactly the way it found them, and the function's callers can rely on this behavior.<br />
* The bytes composing numeric arguments are passed individually in A, then X, then RC2 to RC15. The other caller saved registers are not used for arguments, to make writing compiler thunks easier and to allow efficiently setting up call arguments.<br />
* Pointers are assigned to imaginary register pairs, functioning as pointer registers (i.e., RS1=(RC2, RC3) to RS7=(RC14, RC15)).<br />
* If no registers remain available, values are passed through the soft stack.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
Our calling convention is roughly based on RISC-V, suggested after a discussion with one of their working group members.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=286C calling convention2021-10-15T02:27:03Z<p>71.198.117.145: Update calling convention.</p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The caller-saved registers are flags, A, X, Y, and RS1 to RS14 (RC1 to RC29). A function may freely overwrite any of these, and the function's callers have to just deal with it. All other registers are callee-saved. A function can use them freely, but before it returns it has to put them back exactly the way it found them, and the function's callers can rely on this behavior.<br />
* The bytes composing numeric arguments are passed individually in A, then X, then RC1 to RC29. The other caller saved registers are not used for arguments, to make writing compiler thunks easier and to allow efficiently setting up call arguments.<br />
* Pointers are assigned to imaginary register pairs, functioning as pointer registers (i.e., RS1=(RC2, RC3) to RS7=(RC14, RC15)).<br />
* If no registers remain available, values are passed through the soft stack.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
Our calling convention is roughly based on RISC-V, suggested after a discussion with one of their working group members.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_compiler&diff=265C compiler2021-08-25T16:08:18Z<p>71.198.117.145: </p>
<hr />
<div>A backend has been added to Clang to support the MOS instruction set. This backend may be targeted by adding the flag <code>--target=mos</code> to clang.<br />
<br />
The Clang is broadly compatible with the freestanding portion of the C99 standard, with one big caveat:<br />
* No float and no double. We'll eventually ship a working IEEE 754 soft float library with the compiler for completeness' sake, but we expect low demand for this, and it'll distract from the rest of the project.<br />
<br />
Some of the GCC and Clang language extensions work, some don't. Similarly some of the C++ language features work, some don't. We haven't exhaustively catalogued this; there's a lot, and some are extremely obscure and nearly useless.<br />
<br />
Still, some compiler extensions are de-facto standard, so we'll call out a few caveats for those.<br />
<br />
Known Compiler Extension Caveats:<br />
<br />
* The GCC/Clang alignment directives work only for global and static variables, not automatic (local) variables. Aligning the stack pointer is tricky, since it requires use of a frame pointer; but the rigors of 6502 indexing cause our frame pointer to be useless for this purpose. To get this to work, we may need ''two'' frame pointers. Expected call for this feature is low, so if you need it, let us know, and we'll reprioritize.<br />
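<br />
A short sketch of the supported cases follows: one aligned global and one aligned static local. The names <code>page_table</code> and <code>scratch_buffer</code> are hypothetical; the attribute is the standard GCC/Clang <code>aligned</code> attribute. Applying the same attribute to an automatic (local) variable is the case described above as not working.<br />

```c
#include <stdint.h>

/* Global (file-scope) object aligned to a 256-byte page boundary. */
__attribute__((aligned(256))) static uint8_t page_table[256];

/* Reports whether the table really starts on a page boundary. */
int page_table_is_page_aligned(void) {
    return ((uintptr_t)page_table & 0xFF) == 0;
}

/* A static local is also fine, since it has static storage duration. */
uint8_t *scratch_buffer(void) {
    static uint8_t buffer[64] __attribute__((aligned(64)));
    return buffer;
}
```

Page alignment of this kind is commonly wanted on the 6502 so that indexing into a table never crosses a page boundary.<br />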
<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_interrupts&diff=264C interrupts2021-08-23T04:12:10Z<p>71.198.117.145: Update interrupt sequences.</p>
<hr />
<div>== Normal C Interrupt Handling ==<br />
The techniques llvm-mos uses for interrupt handling are somewhat unusual. To understand why, it's useful to start with the normal way C compilers deal with interrupts.<br />
<br />
Generally, C calling conventions divide registers into two classes: caller-saved and callee-saved. Functions are free to overwrite caller-saved registers, and callers of those functions need to be aware of and correctly handle this possibility. Functions must preserve the values of callee-saved registers, and their callers are allowed to count on this.<br />
<br />
Interrupt handlers necessarily break with this convention; a function can't know when and how it's going to be interrupted, so there's no way to "deal with" the interrupt handler overwriting caller-saved registers. So interrupt handlers need to treat all registers as if they were callee-saved.<br />
<br />
Usually, a target has at most around 32 registers, more-or-less evenly split between caller-saved and callee-saved. So an interrupt handler can reasonably save all the caller-saved registers (it would implicitly also save the callee-saved registers if it uses them, just by virtue of being a C function). Note that if it calls any other function, it needs to save ''all'' of the caller-saved registers, since it can't know which the callee will overwrite.<br />
<br />
== Challenges ==<br />
The nature of the 6502 presents some challenges to using this model.<br />
<br />
=== Static Stack Allocation ===<br />
The indexed addressing modes on the 6502 are quite slow, which gives incentive to avoid the normal C stack implementation. For llvm-mos, we opted to perform a call graph analysis and allocate the stack frames of non-recursive functions statically. This is safe, since via a conservative analysis we can prove that certain functions cannot have more than one invocation active at a time.<br />
<br />
However, interrupts can be active at the same time as any other function, potentially including themselves. The static stack analysis will need to be made aware of interrupts somehow, which means that programmers will need to annotate which functions have this "can appear out of nowhere" property.<br />
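<br />
The distinction can be illustrated with a minimal C sketch. Both functions are hypothetical examples, and whether a given local actually ends up in memory at all is up to the compiler:<syntaxhighlight lang="c"><br />
#include <stdint.h>

/* Non-recursive: at most one invocation can be active at a time (provided
   no interrupt handler can reach it), so its frame can live at fixed
   addresses. */
uint16_t sum_bytes(const uint8_t *data, uint8_t len) {
    uint16_t total = 0;
    for (uint8_t i = 0; i < len; i++)
        total += data[i];
    return total;
}

/* Recursive: several invocations are live at once, so each one needs its
   own frame on the dynamic soft stack. */
uint16_t factorial(uint8_t n) {
    if (n <= 1)
        return 1;
    return (uint16_t)(n * factorial((uint8_t)(n - 1)));
}
</syntaxhighlight><br />
If <code>sum_bytes</code> were also reachable from an interrupt handler, the "one invocation at a time" argument would no longer hold, which is the problem the annotations are meant to address.<br />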
<br />
== Interrupt Annotations ==<br />
To solve the above problems, we introduce three new function attributes: "interrupt", "interrupt_norecurse", and "no_isr".<br />
<br />
=== "interrupt" Attribute ===<br />
This attribute isn't actually MOS-specific, but we do ascribe to it some additional semantics. Any function bearing this attribute will treat all registers (except flags) as callee-saved, will begin with a CLD (the state of the decimal flag is undefined upon interrupt), and will return with RTI instead of RTS. Additionally, any such function will be marked as possibly recursive for the purpose of static stack allocation. This will in turn force any function that might be called by such a function to use the dynamic soft stack.<br />
<br />
=== "interrupt_norecurse" Attribute ===<br />
This attribute behaves identically to "interrupt", with one exception. When performing the static stack analysis, functions marked with this attribute will not be automatically considered recursive. Instead, any function that might be called both by an interrupt_norecurse function and by main, or by two ''different'' interrupt_norecurse functions, will be considered possibly recursive. Another way of looking at it is that interrupt_norecurse functions correspond to different sources of interrupts, and the model is one where interrupts from a given source are disabled until its handler finishes processing. Judicious use of interrupt_norecurse allows interrupt handlers to benefit from static stack allocation, leaving the "interrupt" attribute for interrupt handlers that can interrupt even themselves.<br />
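<br />
As a minimal sketch of how these two attributes might be applied: the handler names, the counters, and the <code>ISR_NESTED</code>/<code>ISR_MASKED</code> macros are hypothetical, and the <code>__mos__</code> check is an assumption about the compiler's predefined macros, used only so that the file also builds with a host compiler.<syntaxhighlight lang="c"><br />
#include <stdint.h>

/* Assumption: __mos__ is predefined when targeting the 6502. On any other
   compiler the attributes are dropped so the file still builds. */
#ifdef __mos__
#define ISR_NESTED __attribute__((interrupt))
#define ISR_MASKED __attribute__((interrupt_norecurse))
#else
#define ISR_NESTED
#define ISR_MASKED
#endif

static volatile uint8_t tick_count;
static volatile uint8_t nmi_seen;

/* Timer source: assumed to stay masked until this handler returns, so the
   handler can never interrupt itself. */
ISR_MASKED void on_timer(void) {
    tick_count++;
}

/* A source that may fire again while its own handler is still running. */
ISR_NESTED void on_nmi(void) {
    nmi_seen = 1;
}
</syntaxhighlight><br />
Under the rules above, <code>on_timer</code> would not automatically be treated as recursive, while anything called from <code>on_nmi</code> would be forced onto the soft stack.<br />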
<br />
=== "no_isr" Attribute ===<br />
This attribute can be added to an interrupt or interrupt_norecurse function to cause it to return with RTS and not perform any of the additional saving that interrupt handlers usually do. All effects on static stack analysis and calling conventions (see below) still occur. The intent is to allow interrupt handlers to be implemented in assembly and call into C with the normal calling convention. The interrupt attributes would still be necessary in that case for program correctness.<br />
<br />
==== Manual Interrupt Sequence ====<br />
Some operating systems may impose unusual requirements on interrupt handlers. They may, for example, push certain registers before calling the handler, but expect the handler to pop them before returning. no_isr routines allow implementing this kind of custom interrupt prologue and epilogue. However, doing this safely requires saving and restoring anything that might be in-use by the compiler.<br />
<br />
Below is a sample routine that includes every push and pop the compiler needs. If it can be proven that the entire, transitively called interrupt handler cannot use certain locations, then the corresponding pushes and pops can be elided. This is typically only possible if the interrupt handler is written entirely in assembly.<br />
<br />
The callee-saved registers must be preserved by ''any'' function callable by C; interrupt handlers are no different. Accordingly, callee-saved registers aren't included in the code below. In this example, the JSR to "body" is expected to preserve them.<syntaxhighlight lang="6502tasm"><br />
cld<br />
pha<br />
txa<br />
pha<br />
tya<br />
pha<br />
lda mos8(__rc2)<br />
pha<br />
lda mos8(__rc3)<br />
pha<br />
lda mos8(__rc4)<br />
pha<br />
lda mos8(__rc5)<br />
pha<br />
lda __save_a<br />
pha<br />
lda __save_y<br />
pha<br />
<br />
jsr body ; call the handler body, which preserves the callee-saved registers
<br />
pla<br />
sta __save_y<br />
pla<br />
sta __save_a<br />
pla<br />
sta mos8(__rc5)<br />
pla<br />
sta mos8(__rc4)<br />
pla<br />
sta mos8(__rc3)<br />
pla<br />
sta mos8(__rc2)<br />
pla<br />
tay<br />
pla<br />
tax<br />
pla<br />
rti<br />
<br />
</syntaxhighlight><br />
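<br />
On the C side, the routine reached by the JSR might look like the following minimal sketch. The <code>NO_ISR_HANDLER</code> macro and the <code>frames</code> counter are hypothetical, the name <code>body</code> is taken from the sample above, and the <code>__mos__</code> check is an assumption that only serves to keep the file buildable with a host compiler.<syntaxhighlight lang="c"><br />
/* Assumption: __mos__ is predefined when targeting the 6502. */
#ifdef __mos__
#define NO_ISR_HANDLER __attribute__((interrupt_norecurse, no_isr))
#else
#define NO_ISR_HANDLER
#endif

static volatile unsigned char frames;

/* Returns with RTS and performs no extra register saving; the assembly
   wrapper is responsible for the pushes, the pops, and the final RTI. */
NO_ISR_HANDLER void body(void) {
    frames++;
}
</syntaxhighlight><br />
The interrupt attribute is kept alongside no_isr so that the static stack analysis still knows the function can run asynchronously.<br />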
<br />
=== Undefined Behavior ===<br />
It shall be undefined behavior for any mechanism external to a C module to asynchronously call a C function that does not bear either the "interrupt" or "interrupt_norecurse" attributes. It shall also be undefined behavior for any mechanism external to a C module to asynchronously call an "interrupt_norecurse" function while another invocation of that same function is still active.<br />
<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_interrupts&diff=263C interrupts2021-08-17T04:41:29Z<p>71.198.117.145: /* Manual Interrupt Sequence */</p>
<hr />
<div>== Normal C Interrupt Handling ==<br />
The techniques llvm-mos uses for interrupt handling are somewhat unusual. To understand why, it's useful to start with the normal way C compilers deal with interrupts.<br />
<br />
Generally, C calling conventions divide registers into two classes: caller-saved and callee-saved. Functions are free to overwrite caller-saved registers, and callers of those functions need to be aware of and correctly handle this possibility. Functions must preserve the values of callee-saved registers, and their callers are allowed to count on this.<br />
<br />
Interrupt handlers necessarily break with this convention; a function can't know when and how it's going to be interrupted, so there's no way to "deal with" the interrupt handler overwriting caller-saved registers. So interrupt handlers need to treat all registers as if they were callee-saved.<br />
<br />
Usually, a target has at most around 32 registers, more-or-less evenly split between caller-saved and callee-saved. So an interrupt handler can reasonably save all the caller-saved registers (it would implicitly also save the callee-saved registers if it uses them, just by virtue of being a C function). Note that if it calls any other function, it needs to save ''all'' of the caller-saved registers, since it can't know which the callee will overwrite.<br />
<br />
== Challenges ==<br />
The nature of the 6502 presents some challenges to using this model.<br />
<br />
=== Static Stack Allocation ===<br />
The indexed addressing modes on the 6502 are quite slow, which gives incentive to avoid the normal C stack implementation. For llvm-mos, we opted to perform a call graph analysis and allocate the stack frames of non-recursive functions statically. This is safe, since via a conservative analysis we can prove that certain functions cannot have more than one invocation active at a time.<br />
<br />
However, interrupts can be active at the same time as any other function, potentially including themselves. The static stack analysis will need to be made aware of interrupts somehow, which means that programmers will need to annotate which functions have this "can appear out of nowhere" property.<br />
<br />
== Interrupt Annotations ==<br />
To solve the above problems, we introduce three new function attributes: "interrupt", "interrupt_norecurse", and "no_isr".<br />
<br />
=== "interrupt" Attribute ===<br />
This attribute isn't actually MOS-specific, but we do ascribe to it some additional semantics. Any function bearing this attribute will treat all registers (except flags) as callee-saved and will return with RTI instead of RTS. Additionally, any such function will be marked as possibly recursive for the purpose of static stack allocation. This will in turn force any function that might be called by such a function to use the dynamic soft stack.<br />
<br />
=== "interrupt_norecurse" Attribute ===<br />
This attribute behaves identically to "interrupt", with one exception. When performing the static stack analysis, functions marked with this attribute will not be automatically considered recursive. Instead, any function that might be called both by an interrupt_norecurse function and by main, or by two ''different'' interrupt_norecurse functions, will be considered possibly recursive. Another way of looking at it is that interrupt_norecurse functions correspond to different sources of interrupts, and the model is one where interrupts from a given source are disabled until its handler finishes processing. Judicious use of interrupt_norecurse allows interrupt handlers to benefit from static stack allocation, leaving the "interrupt" attribute for interrupt handlers that can interrupt even themselves.<br />
<br />
=== "no_isr" Attribute ===<br />
This attribute can be added to an interrupt or interrupt_norecurse function to cause it to return with RTS and not perform any of the additional saving that interrupt handlers usually do. All effects on static stack analysis and calling conventions (see below) still occur. The intent is to allow interrupt handlers to be implemented in assembly and call into C with the normal calling convention. The interrupt attributes would still be necessary in that case for program correctness.<br />
<br />
==== Manual Interrupt Sequence ====<br />
Some operating systems may impose unusual requirements on interrupt handlers. They may, for example, push certain registers before calling the handler, but expect the handler to pop them before returning. no_isr routines allow implementing this kind of custom interrupt prologue and epilogue. However, doing this safely requires saving and restoring anything that might be in-use by the compiler.<br />
<br />
Below is a sample routine that includes every push and pop the compiler needs. If it can be proven that the entire, transitively called interrupt handler cannot use certain locations, then the corresponding pushes and pops can be elided. This is typically only possible if the interrupt handler is written entirely in assembly.<br />
<br />
The callee-saved registers must be preserved by ''any'' function callable by C; interrupt handlers are no different. Accordingly, callee-saved registers aren't included in the code below. In this example, the JSR to "body" is expected to preserve them.<syntaxhighlight lang="6502tasm"><br />
pha<br />
txa<br />
pha<br />
tya<br />
pha<br />
lda mos8(__rc2)<br />
pha<br />
lda mos8(__rc3)<br />
pha<br />
lda mos8(__rc4)<br />
pha<br />
lda mos8(__rc5)<br />
pha<br />
lda __save_a<br />
pha<br />
lda __save_x<br />
pha<br />
lda __save_y<br />
pha<br />
lda __call_indir<br />
pha<br />
lda __call_indir+1<br />
pha<br />
<br />
jsr body ; call the handler body, which preserves the callee-saved registers
<br />
pla<br />
sta __call_indir+1<br />
pla<br />
sta __call_indir<br />
pla<br />
sta __save_y<br />
pla<br />
sta __save_x<br />
pla<br />
sta __save_a<br />
pla<br />
sta mos8(__rc5)<br />
pla<br />
sta mos8(__rc4)<br />
pla<br />
sta mos8(__rc3)<br />
pla<br />
sta mos8(__rc2)<br />
pla<br />
tay<br />
pla<br />
tax<br />
pla<br />
rti<br />
<br />
</syntaxhighlight><br />
<br />
=== Undefined Behavior ===<br />
It shall be undefined behavior for any mechanism external to a C module to asynchronously call a C function that does not bear either the "interrupt" or "interrupt_norecurse" attributes. It shall also be undefined behavior for any mechanism external to a C module to asynchronously call an "interrupt_norecurse" function while another invocation of that same function is still active.<br />
<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_interrupts&diff=262C interrupts2021-08-17T04:36:34Z<p>71.198.117.145: /* "no_isr" Attribute */</p>
<hr />
<div>== Normal C Interrupt Handling ==<br />
The techniques llvm-mos uses for interrupt handling are somewhat unusual. To understand why, it's useful to start with the normal way C compilers deal with interrupts.<br />
<br />
Generally, C calling conventions divide registers into two classes: caller-saved and callee-saved. Functions are free to overwrite caller-saved registers, and callers of those functions need to be aware of and correctly handle this possibility. Functions must preserve the values of callee-saved registers, and their callers are allowed to count on this.<br />
<br />
Interrupt handlers necessarily break with this convention; a function can't know when and how it's going to be interrupted, so there's no way to "deal with" the interrupt handler overwriting caller-saved registers. So interrupt handlers need to treat all registers as if they were callee-saved.<br />
<br />
Usually, a target has at most around 32 registers, more-or-less evenly split between caller-saved and callee-saved. So an interrupt handler can reasonably save all the caller-saved registers (it would implicitly also save the callee-saved registers if it uses them, just by virtue of being a C function). Note that if it calls any other function, it needs to save ''all'' of the caller-saved registers, since it can't know which the callee will overwrite.<br />
<br />
== Challenges ==<br />
The nature of the 6502 presents some challenges to using this model.<br />
<br />
=== Static Stack Allocation ===<br />
The indexed addressing modes on the 6502 are quite slow, which gives incentive to avoid the normal C stack implementation. For llvm-mos, we opted to perform a call graph analysis and allocate the stack frames of non-recursive functions statically. This is safe, since via a conservative analysis we can prove that certain functions cannot have more than one invocation active at a time.<br />
<br />
However, interrupts can be active at the same time as any other function, potentially including themselves. The static stack analysis will need to be made aware of interrupts somehow, which means that programmers will need to annotate which functions have this "can appear out of nowhere" property.<br />
<br />
== Interrupt Annotations ==<br />
To solve the above problems, we introduce three new function attributes: "interrupt", "interrupt_norecurse", and "no_isr".<br />
<br />
=== "interrupt" Attribute ===<br />
This attribute isn't actually MOS-specific, but we do ascribe to it some additional semantics. Any function bearing this attribute will treat all registers (except flags) as callee-saved and will return with RTI instead of RTS. Additionally, any such function will be marked as possibly recursive for the purpose of static stack allocation. This will in turn force any function that might be called by such a function to use the dynamic soft stack.<br />
<br />
=== "interrupt_norecurse" Attribute ===<br />
This attribute behaves identically to "interrupt", with one exception. When performing the static stack analysis, functions marked with this attribute will not be automatically considered recursive. Instead, any function that might be called both by an interrupt_norecurse function and by main, or by two ''different'' interrupt_norecurse functions, will be considered possibly recursive. Another way of looking at it is that interrupt_norecurse functions correspond to different sources of interrupts, and the model is one where interrupts from a given source are disabled until its handler finishes processing. Judicious use of interrupt_norecurse allows interrupt handlers to benefit from static stack allocation, leaving the "interrupt" attribute for interrupt handlers that can interrupt even themselves.<br />
<br />
=== "no_isr" Attribute ===<br />
This attribute can be added to an interrupt or interrupt_norecurse function to cause it to return with RTS and not perform any of the additional saving that interrupt handlers usually do. All effects on static stack analysis and calling conventions (see below) still occur. The intent is to allow interrupt handlers to be implemented in assembly and call into C with the normal calling convention. The interrupt attributes would still be necessary in that case for program correctness.<br />
<br />
==== Manual Interrupt Sequence ====<br />
Some operating systems may impose unusual requirements on interrupt handlers. They may, for example, push certain registers before calling the handler, but expect the handler to pop them before returning. no_isr routines allow implementing this kind of custom interrupt prologue and epilogue. However, doing this safely requires saving and restoring anything that might be in-use by the compiler.<br />
<br />
The callee-saved registers must be preserved by ''any'' function callable by C; interrupt handlers are no different. Accordingly, callee-saved registers aren't included.<br />
<br />
Below is a sample routine that includes every push and pop the compiler needs. If it can be proven that the entire, transitively called interrupt handler cannot use certain locations, then the corresponding pushes and pops can be elided. This is typically only possible if the interrupt handler is written entirely in assembly.<syntaxhighlight lang="6502tasm"><br />
pha<br />
txa<br />
pha<br />
tya<br />
pha<br />
lda mos8(__rc2)<br />
pha<br />
lda mos8(__rc3)<br />
pha<br />
lda mos8(__rc4)<br />
pha<br />
lda mos8(__rc5)<br />
pha<br />
lda __save_a<br />
pha<br />
lda __save_x<br />
pha<br />
lda __save_y<br />
pha<br />
lda __call_indir<br />
pha<br />
lda __call_indir+1<br />
pha<br />
<br />
jsr body ; call the handler body, which preserves the callee-saved registers
<br />
pla<br />
sta __call_indir+1<br />
pla<br />
sta __call_indir<br />
pla<br />
sta __save_y<br />
pla<br />
sta __save_x<br />
pla<br />
sta __save_a<br />
pla<br />
sta mos8(__rc5)<br />
pla<br />
sta mos8(__rc4)<br />
pla<br />
sta mos8(__rc3)<br />
pla<br />
sta mos8(__rc2)<br />
pla<br />
tay<br />
pla<br />
tax<br />
pla<br />
rti<br />
<br />
</syntaxhighlight><br />
<br />
=== Undefined Behavior ===<br />
It shall be undefined behavior for any mechanism external to a C module to asynchronously call a C function that does not bear either the "interrupt" or "interrupt_norecurse" attributes. It shall also be undefined behavior for any mechanism external to a C module to asynchronously call an "interrupt_norecurse" function while another invocation of that same function is still active.<br />
<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=260C calling convention2021-08-15T05:03:26Z<p>71.198.117.145: Adopt the interrupt friendly calling convention across the board.</p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The caller-saved registers are flags, A, X, Y, RC2, RC3, RC4, and RC5. A function may freely overwrite any of these, and the function's callers have to just deal with it. All other registers are callee-saved. A function can use them freely, but before it returns it has to put them back exactly the way it found them, and the function's callers can rely on this behavior.<br />
* The bytes composing numeric arguments are passed individually in caller-saved registers, except flags and Y. (It's not worth trying to get arguments into and out of the flags, and Y isn't used to make it easier to set up outgoing arguments for function calls.)<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers remain available, any remaining bytes are passed through the soft stack.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
* Values may be returned on the soft stack if too few registers are available. Callers must reserve sufficient space for this, as they do for arguments. The space reserved for arguments may overlap freely with the space used for return values; thus only enough space for the larger of the two need be allocated.<br />
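<br />
The aggregate return rule can be pictured with a small C sketch. The <code>Point</code> type and both functions are hypothetical, and the second function only approximates, in source form, the transformation that Clang is described as performing:<syntaxhighlight lang="c"><br />
struct Point {
    int x;
    int y;
};

/* What the programmer writes: an aggregate returned by value. */
struct Point make_point(int x, int y) {
    struct Point p = {x, y};
    return p;
}

/* Roughly what the convention describes: the result travels through a
   pointer passed as an implicit first argument, and the function itself
   returns void. */
void make_point_lowered(struct Point *result, int x, int y) {
    result->x = x;
    result->y = y;
}
</syntaxhighlight><br />
Both forms produce the same <code>Point</code>; only the second makes the hidden pointer visible.<br />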
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_interrupts&diff=259C interrupts2021-08-15T04:57:29Z<p>71.198.117.145: Remove calling convention shenanigans.</p>
<hr />
<div>== Normal C Interrupt Handling ==<br />
The techniques llvm-mos uses for interrupt handling are somewhat unusual. To understand why, it's useful to start with the normal way C compilers deal with interrupts.<br />
<br />
Generally, C calling conventions divide registers into two classes: caller-saved and callee-saved. Functions are free to overwrite caller-saved registers, and callers of those functions need to be aware of and correctly handle this possibility. Functions must preserve the values of callee-saved registers, and their callers are allowed to count on this.<br />
<br />
Interrupt handlers necessarily break with this convention; a function can't know when and how it's going to be interrupted, so there's no way to "deal with" the interrupt handler overwriting caller-saved registers. So interrupt handlers need to treat all registers as if they were callee-saved.<br />
<br />
Usually, a target has at most around 32 registers, more-or-less evenly split between caller-saved and callee-saved. So an interrupt handler can reasonably save all the caller-saved registers (it would implicitly also save the callee-saved registers if it uses them, just by virtue of being a C function). Note that if it calls any other function, it needs to save ''all'' of the caller-saved registers, since it can't know which the callee will overwrite.<br />
<br />
== Challenges ==<br />
The nature of the 6502 presents some challenges to using this model.<br />
<br />
=== Static Stack Allocation ===<br />
The indexed addressing modes on the 6502 are quite slow, which gives incentive to avoid the normal C stack implementation. For llvm-mos, we opted to perform a call graph analysis and allocate the stack frames of non-recursive functions statically. This is safe, since via a conservative analysis we can prove that certain functions cannot have more than one invocation active at a time.<br />
<br />
However, interrupts can be active at the same time as any other function, potentially including themselves. The static stack analysis will need to be made aware of interrupts somehow, which means that programmers will need to annotate which functions have this "can appear out of nowhere" property.<br />
<br />
== Interrupt Annotations ==<br />
To solve the above problems, we introduce three new function attributes "interrupt", "interrupt_norecurse", and "no_isr".<br />
<br />
=== "interrupt" Attribute ===<br />
This attribute isn't actually MOS-specific, but we do ascribe to it some additional semantics. Any function bearing this attribute will treat all registers (except flags) as callee-saved and will return with RTI instead of RTS. Additionally, any such function will be marked as possibly recursive for the purpose of static stack allocation. This will in turn force any function that might be called by such a function to use the dynamic soft stack.<br />
<br />
=== "interrupt_norecurse" Attribute ===<br />
This attribute behaves identically to "interrupt", with one exception. When performing the static stack analysis, functions marked with this attribute will not be automatically considered recursive. Instead, any function that might be called both by an interrupt_norecurse function and by main, or by two ''different'' interrupt_norecurse functions, will be considered possibly recursive. Another way of looking at it is that interrupt_norecurse functions correspond to different sources of interrupts, and the model is one where interrupts from a given source are disabled until its handler finishes processing. Judicious use of interrupt_norecurse allows interrupt handlers to benefit from static stack allocation, leaving the "interrupt" attribute for interrupt handlers that can interrupt even themselves.<br />
<br />
=== "no_isr" Attribute ===<br />
This attribute can be added to an interrupt or interrupt_norecurse function to cause it to return with RTS and not perform any of the additional saving that interrupt handlers usually do. All effects on static stack analysis and calling conventions (see below) still occur. The intent is to allow interrupt handlers to be implemented in assembly and call into C with the normal calling convention. The interrupt attributes would still be necessary in that case for program correctness.<br />
<br />
=== Undefined Behavior ===<br />
It shall be undefined behavior for any mechanism external to a C module to asynchronously call a C function that does not bear either the "interrupt" or "interrupt_norecurse" attributes. It shall also be undefined behavior for any mechanism external to a C module to asynchronously call an "interrupt_norecurse" function while another invocation of that same function is still active.<br />
<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Welcome&diff=249Welcome2021-07-16T20:51:42Z<p>71.198.117.145: Add trademark disclaimer.</p>
<hr />
<div>[[Category:Main]]<br />
[[File:Hello-vic20.png|thumb|Hello world of LLVM assembler targeting Commodore VIC-20]]<br />
[[File:Hello-apple2.png|thumb|Hello world of LLVM assembler targeting Apple IIe]]<br />
[[File:Hello-c64.png|thumb|Hello world of LLVM assembler targeting Commodore 64]]<br />
[[File:Rust-hello-atari-800.png|thumb|Hello world in Rust, with factorial calculation, for Atari 800, proof of concept by mrk]]<br />
<br />
== Welcome to the llvm-mos project! ==<br />
<br />
The llvm-mos project is intended to create a first-class [https://llvm.org/docs/WritingAnLLVMBackend.html backend] in [https://llvm.org/ LLVM] for the [[wikipedia:MOS_Technology|MOS Technology]] 65xx series of microprocessors and their clones.<br />
<br />
To get started playing with the tools, check out [[Getting started]].<br />
<br />
There have been many failed attempts to create a 6502 backend for LLVM. Ours is the first to successfully compile working programs. The llvm-mos Clang is nearly compatible with freestanding C99, and the relevant portions of the LLVM end-to-end test suite pass on a simulated 6502 in a variety of configurations. The project also includes a feature-complete assembler and ELF linker support for generic 6502 targets.<br />
<br />
This project will permit modern C programs, written in a modern style, to target common microcomputers of the 1980s, including but not limited to the [[wikipedia:Commodore_64|Commodore 64]], the [[wikipedia:Apple_IIe|Apple IIe]], and the [[wikipedia:Atari_8-bit_family|Atari 8-bit family]].<br />
<br />
Our work is based on LLVM's novel [https://llvm.org/docs/GlobalISel/index.html GlobalISel] architecture, and as such, our compiler will be ''aggressive'' about pursuing optimization opportunities for the 65xx series. While our focus is currently feature-completeness, not optimization, we've already overcome all existing theoretical hurdles necessary to emit high quality 6502 code. <br />
<br />
The development team has established a [https://github.com/llvm-mos project on Github]. Acceptance tests and packaging occur via [https://docs.github.com/en/actions/learn-github-actions/introduction-to-github-actions Github actions] as well. <br />
<br />
We provide current builds of the main branch of the llvm-mos tool chain for [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-windows-main Windows], [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-darwin-main MacOS], and [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-linux-main Ubuntu Linux]. <br />
<br />
Ongoing, public development discussions occur on Slack. If you're an experienced programmer, with a detailed understanding of the LLVM architecture, then [https://join.slack.com/t/llvm-mos/shared_invite/zt-rtaxxsdu-~3tSQaQCQjLmc27OVX5vsA please join our Slack group now] and help out.<br />
<br />
==== Notice ====<br />
The llvm-mos project is not officially affiliated with or endorsed by the LLVM Foundation or LLVM project. Our project is a fork of LLVM that provides a new backend and Clang target; our project is based on LLVM, not a part of LLVM. Our use of LLVM or other related trademarks does not imply affiliation or endorsement.<br />
<br />
=== Category tree ===<br />
<br />
<CategoryTree mode="pages" depth="3" hideroot="on">Main</CategoryTree><br />
<br />
=== Categories ===<br />
<br />
{{Special:AllPages|namespace=14}}<br />
<br />
=== Pages ===<br />
<br />
{{Special:AllPages}}</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=246C calling convention2021-07-06T20:28:31Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255). The Y register is not used for arguments, but Y is still caller-saved; it's reserved to help the compiler shuffle values into the appropriate locations around calls.<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers remain available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
* Values may be returned on the soft stack if insufficiently many registers are available. Callers must reserve sufficient space for this as they do for arguments. The space reserved for arguments may overlap freely with the space used for return values; thus only enough space for the larger of the two need be allocated.<br />
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=245C calling convention2021-07-06T06:25:52Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255). The Y register is not used for arguments, but Y is still caller-saved; it's reserved to help the compiler shuffle values into the appropriate locations around calls.<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers remain available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* The very last imaginary pointer register is skipped, although it is still caller-saved.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
* Values may be returned on the soft stack if insufficiently many registers are available. Callers must reserve sufficient space for this as they do for arguments. The space reserved for arguments may overlap freely with the space used for return values; thus only enough space for the larger of the two need be allocated.<br />
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=244C calling convention2021-07-05T20:21:48Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255). The Y register is not used for arguments, but Y is still caller-saved; it's reserved to help the compiler shuffle values into the appropriate locations around calls.<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers remain available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
* Values may be returned on the soft stack if insufficiently many registers are available. Callers must reserve sufficient space for this as they do for arguments. The space reserved for arguments may overlap freely with the space used for return values; thus only enough space for the larger of the two need be allocated.<br />
<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Rationale&diff=243Rationale2021-07-01T23:45:08Z<p>71.198.117.145: /* Findings */</p>
<hr />
<div>== Why yet another 6502 C compiler? ==<br />
After all, there's [https://github.com/cc65/cc65 cc65], and [https://gitlab.com/camelot/kickc KickC], and [http://www.compilers.de/vbcc.html vbcc], and plenty of other efforts out there to target the 6502. And the processor is ancient anyway. So why bother porting LLVM to it?<br />
<br />
== Performance ==<br />
As an LLVM backend, we benefit from the expansive high-level optimizations available. These include radical code transformations of switch statements, loops, and table lookups. Nothing here is beyond what a human could do; humans wrote all of these optimizations, after all. But there's potential far beyond what a human would have patience for. As an example, switch statement cases may be shifted and bitwise operations applied to them to make the different case integers denser. This increases the number that can fit into a jump table, which decreases the amount of branching needed to execute the switch. A human could do that for a switch statement, but it's unlikely they'd go through the effort for any but the most performance-critical ones. LLVM will tirelessly consider it for every single switch in the program.<br />
<br />
At a lower level, good use of the zero page is essential to producing good 6502 code. To that end, we model the zero page as an "imaginary register" bank. The number and placement of these registers are completely customizable by the end user to fit a variety of target system memory models. Using registers for this purpose allows us full access to LLVM's register allocator, which can often allocate program temporary values in such a way that they never need to leave the zero page, A, X, and Y. This vastly reduces the need for a soft (emulated) stack, which is a sticking point for earlier 6502 compilers.<br />
<br />
Even when a stack of some kind is required, the optimizer performs whole-program analysis to identify functions that cannot simultaneously have more than one invocation active. These functions can have their "stack frames" allocated in absolute memory, again avoiding use of the soft stack. We reserve the actual soft stack only for cases where it cannot be statically proven that a function doesn't intrinsically require it (due to function pointers or other complex control flow).<br />
<br />
As for the code itself, we perform a remarkably effective loop optimization that detects 16-bit index operations that can be converted to a 16-bit index plus an 8-bit offset. The latter is a directly-supported addressing mode on the 6502, and 8-bit index manipulation can be done in a single instruction. This allows us to convert idiomatic 16-bit "int c" loops into something much more suitable for the 6502. Eventually, we hope that optimizations of this kind will transform standard, naive C code into tightly optimized 6502 code. <br />
<br />
== Features ==<br />
Because this is a subproject of LLVM, it inherits all the features of LLVM. Namely, this project provides full ELF support for 6502 objects, libraries, and executables. This opens up previously impossible functionality, such as viewing 6502 program properties in ELF tools that don't know anything about the 6502 specifically.<br />
<br />
llvm-mos is not limited to C. It also provides a fully functional assembler and disassembler that reads and writes assembly source files in a GNU assembler compatible format. This in turn opens up a world of macro programming functionality, for those who prefer to work at the metal level.<br />
<br />
A proof of concept exists, demonstrating that [http://forum.6502.org/viewtopic.php?f=2&t=6450&p=84195#p84048 llvm-mos can support Rust as a source language]. This suggests that LLVM-MOS can support other source languages as well, such as C++.<br />
<br />
Lastly, the llvm-mos project is entirely open source, and is developed in keeping with LLVM coding standards. Want to experiment with a new codegen pass, or add a new target? Jump right in, clone the codebase, and start playing.<br />
<br />
== Findings ==<br />
Several common assumptions about the MOS 6502 processor, and C compilers targeting it, are now refuted by our work.<br />
<br />
First, the assumption that a modern compiler framework, such as LLVM, cannot be targeted towards an old 8-bit CPU such as the 6502. LLVM's new GlobalISel architecture can very well be targeted to the 6502, and it can indeed produce superior code, if permitted to do so.<br />
<br />
Second, the assumption that because the 6502 is "stackless," and has few registers, it is not a good host for C.<br />
<br />
Regarding stacks, while it's true that the standard C runtime model is quite hostile to 6502 performance, the C standard provides broad latitude for alternative models that behave in all points "as if" they were the C model. In the broad space of possible alternatives, we've found a collection of techniques that broadly preserve C standard compatibility while emitting very high quality code. To put it another way, we go to great lengths to emit code that operates "as if there were stacks", without using stacks at all. LLVM's sophistication facilitates this; the analyses required are quite intricate, but most of them are slight modifications to data structures already available in LLVM.<br />
<br />
Regarding registers, the original 6502 designers were well aware of the 6502's register limitations, and so provided a bunch of zero-page addressing modes to compensate. We present these to LLVM as registers, which takes our backend from "alien nightmare" to "ugly duckling", not unlike x86 or AVR. Normal register allocation techniques apply, since 6502 instructions treat different zero page locations identically. While A, X, and Y are a bit unusual, they're not any worse than x86, and LLVM's register allocator is fully capable of handling them, even if the relationship is a bit strained.<br />
<br />
Third, that "simpler is better" for producing a performant compiler for 8-bit targets. llvm-mos's architecture and design choices are not at all simple. I haven't counted, but I think llvm-mos is doing about 100 passes through the code, about 8 of which are specific to the MOS 6502.<br />
<br />
Fourth, that the 6502 is implicitly some sort of "special" architecture, and it therefore requires special compilers, linkers, binary file formats, etc. We treat the 6502, ultimately, as just another target within the LLVM framework, and as such it benefits from all the industry-standard ELF-compatible file formats.<br />
<br />
Fifth, that because the 6502 is a small target, it requires a smaller compiler and smaller tools. This assumption never really made sense anyway. In fact the opposite is true: if you want to do advanced codegen for the 6502, you need a really intelligent (and large) compiler and toolchain framework, not a small one. The state of the art of optimization has advanced leaps and bounds in the past three decades, and the poor old 6502 has received none of those benefits, until the current work.<br />
<br />
Sixth, that peephole optimization produces the best codegen for the 6502. In fact, llvm-mos gets the most benefit out of 6502-specific optimizations relatively early in the LLVM machine function pass pipeline, and the code it produces (in small tests) is quite efficient, even without any 6502-specific peephole optimizer at all. "The more clothes you put on during the day, the more you have to take off at night." One high-level instruction can become a big block of 6502, so a single high-level optimization that removes it can prevent a thousand cases from being needed to handle it later.<br />
<br />
Seventh, that because the 6502 is small, it requires some sort of specialized language (in the [https://dwheeler.com/6502/ David A. Wheeler] sense) in order to generate performant code. While this makes the job of the compiler author considerably easier, we like the challenge. Organizing code for the 6502 is actually rather difficult; making "a version of C" that is easy to compile just shifts the burden of this work back onto the user. LLVM will likely be able to handle compiling other languages to the 6502, at some point in the future. Rust support has already been proven, but there are no problems in principle with lowering many more languages to the 6502.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Rationale&diff=242Rationale2021-07-01T23:44:26Z<p>71.198.117.145: /* Findings */</p>
<hr />
<div>== Why yet another 6502 C compiler? ==<br />
After all, there's [https://github.com/cc65/cc65 cc65], and [https://gitlab.com/camelot/kickc KickC], and [http://www.compilers.de/vbcc.html vbcc], and plenty of other efforts out there to target the 6502. And the processor is ancient anyway. So why bother porting LLVM to it?<br />
<br />
== Performance ==<br />
As an LLVM backend, we benefit from the expansive high-level optimizations available. These include radical code transformations of switch statements, loops, and table lookups. Nothing here is beyond what a human could do; humans wrote all of these optimizations, after all. But there's potential far beyond what a human would have patience for. As an example, switch statement cases may be shifted and bitwise operations applied to them to make the different case integers denser. This increases the number that can fit into a jump table, which decreases the amount of branching needed to execute the switch. A human could do that for a switch statement, but it's unlikely they'd go through the effort for any but the most performance-critical ones. LLVM will tirelessly consider it for every single switch in the program.<br />
<br />
At a lower level, good use of the zero page is essential to producing good 6502 code. To that end, we model the zero page as an "imaginary register" bank. The number and placement of these registers are completely customizable by the end user to fit a variety of target system memory models. Using registers for this purpose allows us full access to LLVM's register allocator, which can often allocate program temporary values in such a way that they never need to leave the zero page, A, X, and Y. This vastly reduces the need for a soft (emulated) stack, which is a sticking point for earlier 6502 compilers.<br />
<br />
Even when a stack of some kind is required, the optimizer performs whole-program analysis to identify functions that cannot simultaneously have more than one invocation active. These functions can have their "stack frames" allocated in absolute memory, again avoiding use of the soft stack. We reserve the actual soft stack only for cases where it cannot be statically proven that a function doesn't intrinsically require it (due to function pointers or other complex control flow).<br />
<br />
As for the code itself, we perform a remarkably effective loop optimization that detects 16-bit index operations that can be converted to a 16-bit index plus an 8-bit offset. The latter is a directly-supported addressing mode on the 6502, and 8-bit index manipulation can be done in a single instruction. This allows us to convert idiomatic 16-bit "int c" loops into something much more suitable for the 6502. Eventually, we hope that optimizations of this kind will transform standard, naive C code into tightly optimized 6502 code. <br />
<br />
== Features ==<br />
Because this is a subproject of LLVM, it inherits all the features of LLVM. Namely, this project provides full ELF support for 6502 objects, libraries, and executables. This opens up previously impossible functionality, such as viewing 6502 program properties in ELF tools that don't know anything about the 6502 specifically.<br />
<br />
llvm-mos is not limited to C. It also provides a fully functional assembler and disassembler that reads and writes assembly source files in a GNU assembler compatible format. This in turn opens up a world of macro programming functionality, for those who prefer to work at the metal level.<br />
<br />
A proof of concept exists, demonstrating that [http://forum.6502.org/viewtopic.php?f=2&t=6450&p=84195#p84048 llvm-mos can support Rust as a source language]. This suggests that LLVM-MOS can support other source languages as well, such as C++.<br />
<br />
Lastly, the llvm-mos project is entirely open source, and is developed in keeping with LLVM coding standards. Want to experiment with a new codegen pass, or add a new target? Jump right in, clone the codebase, and start playing.<br />
<br />
== Findings ==<br />
Several common assumptions about the MOS 6502 processor, and C compilers targeting it, are now refuted by our work.<br />
<br />
First, the assumption that a modern compiler framework, such as LLVM, cannot be targeted towards an old 8-bit CPU such as the 6502. LLVM's new GlobalISel architecture can very well be targeted to the 6502, and it can indeed produce superior code, if permitted to do so.<br />
<br />
Second, the assumption that because the 6502 is "stackless," and has few registers, it is not a good host for C.<br />
<br />
Regarding stacks, while it's true that the standard C runtime model is quite hostile to 6502 performance, the C standard provides broad latitude for alternative models that behave in all points "as if" they were the C model. In the broad space of possible alternatives, we've found a collection of techniques that broadly preserve C standard compatibility while emitting very high quality code. To put it another way, we go to great lengths to emit code that operates "as if there were stacks", without using stacks at all. LLVM's sophistication facilitates this; the analyses required are quite intricate, but most of them are slight modifications to data structures already available in LLVM.<br />
<br />
Regarding registers, the original 6502 designers were well aware of the 6502's register limitations, and so provided a bunch of zero-page addressing modes to compensate. We present these to LLVM as registers, which takes our backend from "alien nightmare" to "ugly duckling", not unlike x86 or AVR. Normal register allocation techniques apply, since 6502 instructions treat different zero page locations identically. While A, X, and Y are a bit unusual, they're not any worse than x86, and LLVM's register allocator is fully capable of handling them, even if the relationship is a bit strained.<br />
<br />
Third, that "simpler is better" for producing a performant compiler for 8-bit targets. llvm-mos's architecture and design choices are not at all simple. I haven't counted, but I think llvm-mos is doing about 100 passes through the code, about 8 of which are specific to the MOS 6502.<br />
<br />
Fourth, that the 6502 is implicitly some sort of "special" architecture, and it therefore requires special compilers, linkers, binary file formats, etc. We treat the 6502, ultimately, as just another target within the LLVM framework, and as such it benefits from all the industry-standard ELF-compatible file formats.<br />
<br />
Fifth, that because the 6502 is a small target, it requires a smaller compiler and smaller tools. This assumption never really made sense anyway. In fact the opposite is true: if you want to do advanced codegen for the 6502, you need a really intelligent (and large) compiler and toolchain framework, not a small one. The state of the art of optimization has advanced leaps and bounds in the past three decades, and the poor old 6502 has received none of those benefits, until the current work.<br />
<br />
Sixth, that peephole optimization produces the best codegen for the 6502. In fact, llvm-mos gets the most benefit out of 6502-specific optimizations relatively early in the LLVM machine function pass pipeline, and the code it produces (in small tests) is quite efficient, even without any 6502-specific peephole optimizer at all. "The more clothes you put on during the day, the more you have to take off at night." One high-level instruction can become a big block of 6502, so a single high-level optimization that removes it can prevent a thousand cases from being needed to handle it later.<br />
<br />
Seventh, that because the 6502 is small, it requires some sort of specialized language (in the [https://dwheeler.com/6502/ David A. Wheeler] sense) in order to generate performant code. While this makes the job of the compiler author considerably easier, we like the challenge. Organizing code for the 6502 is actually rather difficult; making "a version of C" that is easy to compile just shifts the burden of this work back onto the user. LLVM will likely be able to handle compiling other languages to the 6502, at some point in the future. Rust support has already been proven, but there are no problems in principle with lowering many more languages to the 6502.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Rationale&diff=241Rationale2021-07-01T23:43:13Z<p>71.198.117.145: /* Performance */</p>
<hr />
<div>== Why yet another 6502 C compiler? ==<br />
After all, there's [https://github.com/cc65/cc65 cc65], and [https://gitlab.com/camelot/kickc KickC], and [http://www.compilers.de/vbcc.html vbcc], and plenty of other efforts out there to target the 6502. And the processor is ancient anyway. So why bother porting LLVM to it?<br />
<br />
== Performance ==<br />
As an LLVM backend, we benefit from the expansive high-level optimizations available. These include radical code transformations of switch statements, loops, and table lookups. Nothing here is beyond what a human could do; humans wrote all of these optimizations, after all. But there's potential far beyond what a human would have patience for. As an example, switch statement cases may be shifted and bitwise operations applied to them to make the different case integers denser. This increases the number that can fit into a jump table, which decreases the amount of branching needed to execute the switch. A human could do that for a switch statement, but it's unlikely they'd go through the effort for any but the most performance-critical ones. LLVM will tirelessly consider it for every single switch in the program.<br />
<br />
At a lower level, good use of the zero page is essential to producing good 6502 code. To that end, we model the zero page as an "imaginary register" bank. The number and placement of these registers are completely customizable by the end user to fit a variety of target system memory models. Using registers for this purpose allows us full access to LLVM's register allocator, which can often allocate program temporary values in such a way that they never need to leave the zero page, A, X, and Y. This vastly reduces the need for a soft (emulated) stack, which is a sticking point for earlier 6502 compilers.<br />
<br />
Even when a stack of some kind is required, the optimizer performs whole-program analysis to identify functions that cannot simultaneously have more than one invocation active. These functions can have their "stack frames" allocated in absolute memory, again avoiding use of the soft stack. We reserve the actual soft stack only for cases where it cannot be statically proven that a function doesn't intrinsically require it (due to function pointers or other complex control flow).<br />
<br />
As for the code itself, we perform a remarkably effective loop optimization that detects 16-bit index operations that can be converted to a 16-bit index plus an 8-bit offset. The latter is a directly-supported addressing mode on the 6502, and 8-bit index manipulation can be done in a single instruction. This allows us to convert idiomatic 16-bit "int c" loops into something much more suitable for the 6502. Eventually, we hope that optimizations of this kind will transform standard, naive C code into tightly optimized 6502 code. <br />
<br />
== Features ==<br />
Because this is a subproject of LLVM, it inherits all the features of LLVM. Namely, this project provides full ELF support for 6502 objects, libraries, and executables. This opens up previously impossible functionality, such as viewing 6502 program properties in ELF tools that don't know anything about the 6502 specifically.<br />
<br />
llvm-mos is not limited to C. It also provides a fully functional assembler and disassembler that reads and writes assembly source files in a GNU assembler compatible format. This in turn opens up a world of macro programming functionality, for those who prefer to work at the metal level.<br />
<br />
A proof of concept exists, demonstrating that [http://forum.6502.org/viewtopic.php?f=2&t=6450&p=84195#p84048 llvm-mos can support Rust as a source language]. This suggests that LLVM-MOS can support other source languages as well, such as C++.<br />
<br />
Lastly, the llvm-mos project is entirely open source, and is developed in keeping with LLVM coding standards. Want to experiment with a new codegen pass, or add a new target? Jump right in, clone the codebase, and start playing.<br />
<br />
== Findings ==<br />
Several common assumptions about the MOS 6502 processor, and C compilers targeting it, are now refuted by our work.<br />
<br />
First, the assumption that a modern compiler framework, such as LLVM, cannot be targeted towards an old 8-bit CPU such as the 6502. LLVM's new GlobalISel architecture can very well be targeted to the 6502, and it can indeed produce superior code, if permitted to do so.<br />
<br />
Second, the assumption that because the 6502 is "stackless," and has few registers, it is not a good host for C.<br />
<br />
Regarding stacks, while it's true that the standard C runtime model is quite hostile to 6502 performance, the C standard provides broad latitude for alternative models that behave in all points "as if" they were the C model. In the broad space of possible alternatives, we've found a collection of techniques that broadly preserve C standard compatibility while emitting very high quality code. To put it another way, we go to great lengths to emit code that operates "as if there were stacks", without using stacks at all. LLVM's sophistication facilitates this; the analyses required are quite intricate, but most of them are slight modifications to data structures already available in LLVM.<br />
<br />
Regarding registers, the original 6502 designers were well aware of the 6502's register limitations, and so provided a bunch of zero-page addressing modes to compensate. We present these to LLVM as registers, which makes our backend look roughly like a slightly odd backend, not unlike x86 or AVR. Normal register allocation techniques apply, since 6502 instructions treat different zero page locations identically. While A, X, and Y are a bit unusual, they're not any worse than x86, and LLVM's register allocator is fully capable of handling them, even if the relationship is a bit strained.<br />
<br />
Third, that "simpler is better" for producing a performant compiler for 8-bit targets. llvm-mos's architecture and design choices are not at all simple. I haven't counted, but I think llvm-mos is doing about 100 passes through the code, about 8 of which are specific to the MOS 6502.<br />
<br />
Fourth, that the 6502 is implicitly some sort of "special" architecture, and it therefore requires special compilers, linkers, binary file formats, etc. We treat the 6502, ultimately, as just another target within the LLVM framework, and as such it benefits from all the industry-standard ELF-compatible file formats.<br />
<br />
Fifth, that because the 6502 is a small target, it requires a smaller compiler and smaller tools. This assumption never really made sense anyway. In fact the opposite is true: if you want to do advanced codegen for the 6502, you need a really intelligent (and large) compiler and toolchain framework, not a small one. The state of the art of optimization has advanced leaps and bounds in the past three decades, and the poor old 6502 has received none of those benefits, until the current work.<br />
<br />
Sixth, that peephole optimization produces the best codegen for the 6502. In fact, llvm-mos gets the most benefit out of 6502-specific optimizations relatively early in the LLVM machine function pass pipeline, and the code it produces (in small tests) is quite efficient, even without any 6502-specific peephole optimizer at all. "The more clothes you put on during the day, the more you have to take off at night." One high-level instruction can become a big block of 6502, so a single high-level optimization that removes it can prevent a thousand cases from being needed to handle it later.<br />
<br />
Seventh, that because the 6502 is small, it requires some sort of specialized language (in the [https://dwheeler.com/6502/ David A. Wheeler] sense) in order to generate performant code. While this makes the job of the compiler author considerably easier, we like the challenge. Organizing code for the 6502 is actually rather difficult; making "a version of C" that is easy to compile just shifts the burden of this work back onto the user. LLVM will likely be able to handle compiling other languages to the 6502, at some point in the future. Rust support has already been proven, but there are no problems in principle with lowering many more languages to the 6502.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_compiler&diff=240C compiler2021-07-01T23:19:16Z<p>71.198.117.145: </p>
<hr />
<div>A backend has been added to Clang to support the MOS instruction set. This backend may be targeted by adding the flag <code>--target=mos</code> to clang.<br />
<br />
This Clang is nearly compatible with the freestanding portion of the C99 standard, with a couple of caveats:<br />
* Although LLVM's (SingleSource) end-to-end test suite passes, we haven't finished auditing the compiler for C99 compatibility. It's likely already very compatible, but we haven't done the "full point inspection" yet.<br />
* No float and no double. We'll eventually ship a working IEEE 754 soft float library with the compiler for completeness' sake, but we expect low demand for this, and it'll distract from the rest of the project.<br />
* The (default) included printf will not be compiled with floating point support, even when we ship soft float libraries. We'll find some way to link in a different version if users elect to link the soft float routines.<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Welcome&diff=239Welcome2021-07-01T23:16:20Z<p>71.198.117.145: /* Welcome to the llvm-mos project! */</p>
<hr />
<div>[[Category:Main]]<br />
[[File:Hello-vic20.png|thumb|Hello world of LLVM assembler targeting Commodore VIC-20]]<br />
[[File:Hello-apple2.png|thumb|Hello world of LLVM assembler targeting Apple IIe]]<br />
[[File:Hello-c64.png|thumb|Hello world of LLVM assembler targeting Commodore 64]]<br />
[[File:Rust-hello-atari-800.png|thumb|Hello world in Rust, with factorial calculation, for Atari 800, proof of concept by mrk]]<br />
<br />
== Welcome to the llvm-mos project! ==<br />
<br />
The llvm-mos project is intended to create a first-class [https://llvm.org/docs/WritingAnLLVMBackend.html backend] in [https://llvm.org/ LLVM] for the [[wikipedia:MOS_Technology|MOS Technology]] 65xx series of microprocessors and their clones.<br />
<br />
To get started playing with the tools, check out [[Getting started]].<br />
<br />
There have been many failed attempts to create a 6502 backend for LLVM. Ours is the first to successfully compile working programs. The llvm-mos Clang is nearly compatible with freestanding C99, and the relevant portions of the LLVM end-to-end test suite pass on a simulated 6502 in a variety of configurations. The project also includes a feature-complete assembler and ELF linker support for generic 6502 targets.<br />
<br />
This project will permit modern C programs, written in a modern style, to target common microcomputers of the 1980s, including but not limited to the [[wikipedia:Commodore_64|Commodore 64]], the [[wikipedia:Apple_IIe|Apple IIe]], and the [[wikipedia:Atari_8-bit_family|Atari 8-bit family]].<br />
<br />
Our work is based on LLVM's novel [https://llvm.org/docs/GlobalISel/index.html GlobalISel] architecture, and as such, our compiler will be ''aggressive'' about pursuing optimization opportunities for the 65xx series. While our focus is currently feature-completeness, not optimization, we've already overcome all existing theoretical hurdles necessary to emit high quality 6502 code. <br />
<br />
The development team has established a [https://github.com/llvm-mos project on Github]. Acceptance tests and packaging occur via [https://docs.github.com/en/actions/learn-github-actions/introduction-to-github-actions Github actions] as well. <br />
<br />
We provide current builds of the main branch of the llvm-mos tool chain for [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-windows-main Windows], [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-darwin-main MacOS], and [https://github.com/llvm-mos/llvm-mos/releases/tag/llvm-mos-linux-main Ubuntu Linux]. <br />
<br />
Ongoing, public development discussions occur on Slack. If you're an experienced programmer, with a detailed understanding of the LLVM architecture, then [https://join.slack.com/t/llvm-mos/shared_invite/zt-rtaxxsdu-~3tSQaQCQjLmc27OVX5vsA please join our Slack group now] and help out.<br />
<br />
=== Category tree ===<br />
<br />
<CategoryTree mode="pages" depth="3" hideroot="on">Main</CategoryTree><br />
<br />
=== Categories ===<br />
<br />
{{Special:AllPages|namespace=14}}<br />
<br />
=== Pages ===<br />
<br />
{{Special:AllPages}}</div>71.198.117.145https://llvm-mos.org/index.php?title=Current_status&diff=238Current status2021-07-01T05:45:19Z<p>71.198.117.145: /* C compiler */</p>
<hr />
<div>== Overview ==<br />
'''Warning:''' As of this writing, the LLVM-MOS compiler and toolchain is '''not feature complete.''' You should not expect to drop it into your project and generate running programs immediately.<br />
== Assembler ==<br />
The assembler, llvm-mc, understands and assembles all NMOS 6502 opcodes. The assembler correctly understands symbols, and it's possible to use them as branch targets, do pointer math on them, and the like. Fixups work as expected at link time.<br />
<br />
The assembler correctly deals with 6502 relative branches. BEQ, BCC, etc., all correctly calculate PC relative offsets in the unusual 6502 convention, in the range of [-126,+129]. Since llvm-mc is GNU assembler compatible, you can use all GNU assembler features while writing 65xx code, including macros, ifdefs, and similar.<br />
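The range quoted above follows from how the 6502 encodes branches: the one-byte signed operand is added to the address of the instruction ''after'' the two-byte branch, so a reach of -128..+127 from there becomes -126..+129 when measured from the branch opcode itself. A minimal sketch of that arithmetic follows; <code>encode_branch</code> is a hypothetical helper written for illustration and is not taken from llvm-mc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not llvm-mc code.
 * Computes the operand byte of a 6502 relative branch located at
 * branch_addr that should jump to target. Returns false when the
 * target is out of reach. */
static bool encode_branch(uint16_t branch_addr, uint16_t target, int8_t *operand)
{
    assert(operand != NULL);

    /* Distance measured from the branch opcode itself. */
    int32_t from_branch = (int32_t)target - (int32_t)branch_addr;
    if (from_branch < -126 || from_branch > 129)
        return false;

    /* The CPU adds the operand to the address of the next instruction,
     * which is two bytes past the branch opcode. */
    *operand = (int8_t)(from_branch - 2);
    return true;
}
```

Under these assumptions, a branch at <code>0x1000</code> that targets itself encodes an operand of -2, and the farthest forward target it can reach is <code>0x1081</code>.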
<br />
The assembler is capable of intelligently figuring out whether symbols should refer to zero page or 16-bit locations, at the time of compilation. If, at compile time, you place a symbol in a section named <code>.zeropage</code>, <code>.directpage</code>, or <code>.zp</code>, then that symbol will be assumed to be located in zero page; otherwise, it will be assumed to refer to a 16-bit address. Additionally, if a symbol is placed in a section with a <code>z</code> section flag enabled, then that symbol is assumed to be located in zero page, with addressing calculated accordingly.<br />
<br />
The assembler and linker both understand that $ is a legal prefix for hexadecimal constants. Much existing 6502 assembly code depends on this older convention. See the DollarIsHexPrefix constant in MCAsmInfo.h. The lexer now queries the current MCAsmInfo structure to see whether the target wants the dollar sign to be a hex prefix. So, everything that depends on the lexer (which is almost everything in LLVM) can now recognize 6502 format hexadecimal constants, if the corresponding MCAsmInfo asks for it. The modern 0x prefix works fine as well.<br />
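To make the two accepted spellings concrete, here is a small standalone sketch that accepts both <code>$D020</code> and <code>0xD020</code>. It only illustrates the convention described above; <code>parse_hex_constant</code> is a hypothetical function, not LLVM's lexer, and its limit of eight hex digits is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; this is not LLVM's lexer.
 * Accepts "$1F" (classic 6502 style) or "0x1F" (modern style) and
 * stores the value. Assumes at most 8 hex digits (32 bits). */
static bool parse_hex_constant(const char *text, uint32_t *value)
{
    assert(text != NULL && value != NULL);

    const char *digits;
    if (text[0] == '$')
        digits = text + 1;
    else if (text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
        digits = text + 2;
    else
        return false; /* no recognized hex prefix */

    uint32_t acc = 0;
    size_t n = 0;
    for (; digits[n] != '\0'; n++) {
        char c = digits[n];
        uint32_t d;
        if (c >= '0' && c <= '9')
            d = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f')
            d = (uint32_t)(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F')
            d = (uint32_t)(c - 'A') + 10;
        else
            return false; /* not a hex digit */
        if (n >= 8)
            return false; /* would not fit in 32 bits */
        acc = (acc << 4) | d;
    }
    if (n == 0)
        return false; /* prefix with no digits */

    *value = acc;
    return true;
}
```

In the real toolchain the decision to honor <code>$</code> is made per target through MCAsmInfo, as described above; the sketch simply accepts both prefixes unconditionally.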
<br />
The assembler understands the names of the 6502 registers, including a, x, y, p, sp, and pc. It understands references to these names to be references to those registers. If your code depends on these names as variable or section names, you can force the assembler to use the prefix of llvm_mos_ on those registers, e.g. llvm_mos_a, llvm_mos_x, etc. To require this prefix on references to those registers, enable the <code>mos-long-register-names</code> compilation feature. For example, with llvm-mc, use the flag <code>-mattr=+mos-long-register-names</code>. Printed assembly output uses this naming scheme to avoid conflicts with existing code.<br />
<br />
To target MOS family processors, you will need to use a triple of "mos" (try: <code>-triple mos</code>) as a parameter to any tool.<br />
<br />
== ELF ==<br />
Both the assembler and the linker support the ELF format, for both object files and executables. The ELF format has been extended with a machine type of 6502 (naturally) to permit storing 65xx code in ELF files.<br />
<br />
Because the 6502 assembler and linker both work with ELF files, you can use any of your favorite tools to inspect or understand ELF files generated by the LLVM tools. The llvm-readobj, llvm-objdump, llvm-objcopy, llvm-strip, and likely the other command line tools as well, work as expected. This also means that generic tools that work on ELF files, such as this online ELF viewer, can read and dump basic information about MOS executables.<br />
<br />
== Linker ==<br />
The ELF file format for objects and executables has been extended to support 65xx compatible processors. Hello-world type programs have been proven to compile, and work as expected, on emulated Commodore 64, VIC-20, and Apple II machines.<br />
<br />
The linker, lld, can be called with a "-flavor gnu" parameter in order to permit linking of ELF executables.<br />
<br />
If you've written an appropriate linker file for your 6502 target, you can season the following overly verbose formula to compile assembly code for your particular target, where %s is the name of your assembly source, %S is the directory of your include files, and %t is the base name of your project:<br />
<code>llvm-mc -triple mos --filetype=obj -o=%t.obj %s <br />
llvm-objdump --all-headers --print-imm-hex -D %t.obj <br />
llvm-readelf --all %t.obj<br />
lld -flavor gnu %t.obj -o %t.elf -L %S your-target.ld<br />
llvm-readelf --all %t.elf <br />
llvm-objdump --all-headers --print-imm-hex -D %t.elf<br />
llvm-objcopy --output-target binary --strip-unneeded %t.elf %t.bin</code><br />
The llvm-objdump and llvm-readelf programs are not necessary; they're just there to help you debug your own pipeline.<br />
<br />
As of this writing, some example linker files and BASIC stubs exist at llvm/test/MC/MOS/Inputs.<br />
<br />
== C compiler ==<br />
A C99-complete backend to clang is being developed based on LLVM's GlobalISel architecture. As of this writing it can pass the LLVM end-to-end test suite in a variety of optimization modes, although very little specific attention was paid to the quality of the generated output. It's still pretty OK, since we benefit greatly from LLVM's high-level optimizations, but there are still lots of low-level optimization opportunities available.<br />
<br />
== Deployment ==<br />
If any branch builds and passes the <code>check-all</code> suite of LLVM tests, it's immediately deployed so you can get fresh bits for your favorite platform. Always grab the main build first. We smoke test on Windows, MacOS, and Ubuntu, but llvm-mos should build on anything that LLVM can build on. You can download fresh binary bits from the [https://github.com/llvm-mos/llvm-mos/releases Github releases page].<br />
<br />
== Implementation ==<br />
A backend for MOS architectures has been added to llvm/lib/Target/MC/MOS. Using the triple 'mos' will cause llvm-mc to use the new MOS backend. By default, this backend targets the 6502, which should work on all CPUs and NMOS implementations that claim 6502 compatibility.<br />
<br />
TableGen has been taught all native 6502 instructions and formats. llvm-mc can assemble all 6502 instructions.<br />
<br />
For some examples of what the backend can do as of this writing, see the llvm/test/MC/MOS directory for some functional assembler tests. Building the check-llvm-mc-mos target runs just these tests for MOS.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Current_status&diff=237Current status2021-07-01T05:44:54Z<p>71.198.117.145: /* C compiler */</p>
<hr />
<div>== Overview ==<br />
'''Warning:''' As of this writing, the LLVM-MOS compiler and toolchain is '''not feature complete.''' You should not expect to drop it into your project and generate running programs immediately.<br />
== Assembler ==<br />
The assembler, llvm-mc, understands and assembles all NMOS 6502 opcodes. The assembler correctly understands symbols, and it's possible to use them as branch targets, do pointer math on them, and the like. Fixups work as expected at link time.<br />
<br />
The assembler correctly deals with 6502 relative branches. BEQ, BCC, etc., all correctly calculate PC relative offsets in the unusual 6502 convention, in the range of [-126,+129]. Since llvm-mc is GNU assembler compatible, you can use all GNU assembler features while writing 65xx code, including macros, ifdefs, and similar.<br />
<br />
The assembler is capable of intelligently figuring out whether symbols should refer to zero page or 16-bit locations, at the time of compilation. If, at compile time, you place a symbol in a section named <code>.zeropage</code>, <code>.directpage</code>, or <code>.zp</code>, then that symbol will be assumed to be located in zero page; otherwise, it will be assumed to refer to a 16-bit address. Additionally, if a symbol is placed in a section with a <code>z</code> section flag enabled, then that symbol is assumed to be located in zero page, with addressing calculated accordingly.<br />
<br />
The assembler and linker both understand that $ is a legal prefix for hexadecimal constants. Much existing 6502 assembly code depends on this older convention. See the DollarIsHexPrefix constant in MCAsmInfo.h. The lexer now queries the current MCAsmInfo structure to see whether the target wants the dollar sign to be a hex prefix. So, everything that depends on the lexer (which is almost everything in LLVM) can now recognize 6502 format hexadecimal constants, if the corresponding MCAsmInfo asks for it. The modern 0x prefix works fine as well.<br />
<br />
The assembler understands the names of the 6502 registers, including a, x, y, p, sp, and pc. It understands references to these names to be references to those registers. If your code depends on these names as variable or section names, you can force the assembler to use the prefix of llvm_mos_ on those registers, e.g. llvm_mos_a, llvm_mos_x, etc. To require this prefix on references to those registers, enable the <code>mos-long-register-names</code> compilation feature. For example, with llvm-mc, use the flag <code>-mattr=+mos-long-register-names</code>. Printed assembly output uses this naming scheme to avoid conflicts with existing code.<br />
<br />
To target MOS family processors, you will need to use a triple of "mos" (try: <code>-triple mos</code>) as a parameter to any tool.<br />
<br />
== ELF ==<br />
Both the assembler and the linker support the ELF format, for both object files and executables. The ELF format has been extended with a machine type of 6502 (naturally) to permit storing 65xx code in ELF files.<br />
<br />
Because the 6502 assembler and linker both work with ELF files, you can use any of your favorite tools to inspect or understand ELF files generated by the LLVM tools. The llvm-readobj, llvm-objdump, llvm-objcopy, llvm-strip, and likely the other command line tools as well, work as expected. This also means that generic tools that work on ELF files, such as this online ELF viewer, can read and dump basic information about MOS executables.<br />
<br />
== Linker ==<br />
The ELF file format for objects and executables has been extended to support 65xx compatible processors. Hello-world type programs have been proven to compile, and work as expected, on emulated Commodore 64, VIC-20, and Apple II machines.<br />
<br />
The linker, lld, can be called with a "-flavor gnu" parameter in order to permit linking of ELF executables.<br />
<br />
If you've written an appropriate linker file for your 6502 target, you can season the following overly verbose formula to compile assembly code for your particular target, where %s is the name of your assembly source, %S is the directory of your include files, and %t is the base name of your project:<br />
<code>llvm-mc -triple mos --filetype=obj -o=%t.obj %s <br />
llvm-objdump --all-headers --print-imm-hex -D %t.obj <br />
llvm-readelf --all %t.obj<br />
lld -flavor gnu %t.obj -o %t.elf -L %S your-target.ld<br />
llvm-readelf --all %t.elf <br />
llvm-objdump --all-headers --print-imm-hex -D %t.elf<br />
llvm-objcopy --output-target binary --strip-unneeded %t.elf %t.bin</code><br />
The llvm-objdump and llvm-readelf programs are not necessary; they're just there to help you debug your own pipeline.<br />
<br />
As of this writing, some example linker files and BASIC stubs exist at llvm/test/MC/MOS/Inputs.<br />
<br />
== C compiler ==<br />
A C99-complete backend to clang is being developed based on LLVM's GlobalISel architecture. As of this writing it can pass the LLVM end-to-end test suite in a variety of optimization modes, although very little specific attention was paid to the quality of the generated output. It's still fairly good, since we benefit greatly from LLVM's high-level optimizations, but there are still lots of low-level optimization opportunities available.<br />
<br />
== Deployment ==<br />
If any branch builds and passes the <code>check-all</code> suite of LLVM tests, it's immediately deployed so you can get fresh bits for your favorite platform. Always grab the main build first. We smoke test on Windows, MacOS, and Ubuntu, but llvm-mos should build on anything that LLVM can build on. You can download fresh binary bits from the [https://github.com/llvm-mos/llvm-mos/releases Github releases page].<br />
<br />
== Implementation ==<br />
A backend for MOS architectures has been added to llvm/lib/Target/MC/MOS. Using the triple 'mos' will cause llvm-mc to use the new MOS backend. By default, this backend targets the 6502, which should work on all CPUs and NMOS implementations that claim 6502 compatibility.<br />
<br />
TableGen has been taught all native 6502 instructions and formats. llvm-mc can assemble all 6502 instructions.<br />
<br />
For some examples of what the backend can do as of this writing, see the llvm/test/MC/MOS directory for some functional assembler tests. Building the check-llvm-mc-mos target runs just these tests for MOS.<br />
[[Category:Main]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=225C calling convention2021-06-09T16:45:17Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255). Y is not used for arguments (but is still caller-save); it's reserved to help the compiler shuffle values into the appropriate locations around calls.<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers remain available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
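
The byte-assignment order in the list above can be sketched as a small lookup. This is only a model of the rules as written on this page, not the backend's actual implementation: it covers numeric argument bytes only (no pointer pairs, aggregates, or variable arguments), it assumes every imaginary register RC0 to RC255 is available, and the names <code>ArgLoc</code> and <code>arg_byte_location</code> are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the rules listed above; names are illustrative
 * and do not come from the llvm-mos sources. */
typedef enum { LOC_A, LOC_X, LOC_RC, LOC_SOFT_STACK } LocKind;

typedef struct {
    LocKind kind;
    int rc; /* imaginary register number, valid only for LOC_RC */
} ArgLoc;

/* RS2 = (RC4, RC5) and RS4 = (RC8, RC9) are callee-saved and skipped. */
static bool is_callee_saved_rc(int rc)
{
    return rc == 4 || rc == 5 || rc == 8 || rc == 9;
}

/* Location of the n-th numeric argument byte (0-based): A, then X,
 * then each non-skipped imaginary register in increasing order, then
 * the soft stack. Y is never used for arguments. */
static ArgLoc arg_byte_location(int n)
{
    assert(n >= 0);
    ArgLoc loc = { LOC_SOFT_STACK, -1 };

    if (n == 0) { loc.kind = LOC_A; return loc; }
    if (n == 1) { loc.kind = LOC_X; return loc; }

    int remaining = n - 2;
    for (int rc = 0; rc <= 255; rc++) {
        if (is_callee_saved_rc(rc))
            continue;
        if (remaining == 0) {
            loc.kind = LOC_RC;
            loc.rc = rc;
            return loc;
        }
        remaining--;
    }
    return loc; /* registers exhausted: soft stack */
}
```

In this model the first two bytes land in A and X, the next four in RC0 to RC3, and the seventh in RC6, because RC4 and RC5 are skipped. On a real target far fewer zero page locations are typically available, so the soft stack is reached much earlier than the model suggests.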
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=224C calling convention2021-06-06T23:37:47Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then Y, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255).<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers remain available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=223C calling convention2021-06-06T23:37:21Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then Y, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255).<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above.<br />
* If no registers are available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=222C calling convention2021-06-06T23:36:59Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then Y, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255).<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the low and high bytes are split and passed as above, low byte first.<br />
* If no registers are available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. The gap between the callee-saved registers balances between caller- and callee-saved registers if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. By their convention, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=C_calling_convention&diff=221C calling convention2021-06-06T23:36:30Z<p>71.198.117.145: </p>
<hr />
<div>The current calling convention is somewhat simplistic; it will be tuned for performance and size before the initial release of the compiler.<br />
<br />
* The bytes composing numeric arguments are passed individually in registers. The order used is A, then X, then Y, then each available imaginary (zero page) register, increasing (i.e., RC0 to RC255).<br />
* Pointers are preferentially assigned to imaginary register pairs, functioning as pointer registers (i.e., RS0=(RC0, RC1) to RS127=(RC254, RC255)). If none are available, the bytes are split and passed as above.<br />
* If no registers are available, any remaining bytes are passed through the soft stack.<br />
* The callee-saved imaginary registers, RS2 (i.e., RC4 and RC5) and RS4 (i.e., RC8 and RC9) are skipped.<br />
* Aggregate types (structs, arrays, etc.) are passed by pointer. The pointer is managed entirely by the caller, and may or may not be on the soft stack. The callee is free to write to the memory; the caller must consider the memory overwritten by the call. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* Aggregate types are returned by a pointer passed as an implicit first argument. The resulting function returns void. This is handled directly by Clang; LLVM itself should never see aggregates.<br />
* RS2 and RS4 (and subregisters) are callee-saved. All other ZP locations, registers, and flags are caller-saved. Leaving a gap between the two callee-saved pairs keeps a balance of caller- and callee-saved registers even if very little of the zero page is available.<br />
* Variable arguments (those within the ellipses of the argument list) are passed through the stack. Named arguments before the variable arguments are passed as usual: first in registers, then stack. Note that the variable argument and regular calling convention differ; thus, variable argument functions must only be called if prototyped. The C standard requires this, but many platforms do not; their variable argument and regular calling conventions are identical. A notable exception is Apple ARM64.<br />
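The register order for numeric argument bytes described in the list above can be modelled with a short C sketch. This is only an illustration, not code from the compiler: the function names are made up, pointer arguments (which prefer RS pairs) are ignored, and all 256 imaginary registers are assumed to be available, which a real target would typically not provide.<br />

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 1 if RCn is part of the callee-saved pairs RS2 or RS4. */
static int is_callee_saved_rc(int n)
{
    return n == 4 || n == 5 || n == 8 || n == 9;
}

/* Hypothetical helper: writes the location of the index-th numeric argument
   byte (0-based) into buf. Order: A, X, Y, then RC0..RC255 skipping the
   callee-saved registers; anything beyond that goes to the soft stack.
   Assumes every imaginary register is available. */
void numeric_arg_byte_location(int index, char *buf, size_t len)
{
    static const char *const hw[] = { "A", "X", "Y" };
    int remaining;
    int rc;

    if (index < 3) {
        snprintf(buf, len, "%s", hw[index]);
        return;
    }
    remaining = index - 3;
    for (rc = 0; rc < 256; rc++) {
        if (is_callee_saved_rc(rc))
            continue;
        if (remaining == 0) {
            snprintf(buf, len, "RC%d", rc);
            return;
        }
        remaining--;
    }
    snprintf(buf, len, "stack");
}
```

Under these assumptions, the four bytes of a 32-bit argument passed first would land in A, X, Y, and RC0, and later bytes would step over RC4, RC5, RC8, and RC9.<br />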
For insight into the design of performant calling conventions, see the following work by Davidson and Whalley. In their terminology, this platform uses the "smarter hybrid" method, since LLVM performs both shrink wrapping and caller save-restore placement optimizations, while using both callee-saved and caller-saved registers when appropriate.<br />
<br />
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.4669&rep=rep1&type=pdf Methods for Saving and Restoring Register Values across Function Calls: Software--Practice and Experience Vol 21(2), 149-165 (February 1991)]<br />
[[Category:C]]</div>71.198.117.145https://llvm-mos.org/index.php?title=Code_generation_overview&diff=178Code generation overview2021-06-02T19:43:00Z<p>71.198.117.145: </p>
<hr />
<div>Like all LLVM backends, the bulk of the implementation exists in its own directory within the LLVM hierarchy, in llvm/lib/Target/MOS. llvm-mos uses LLVM's new GlobalISel architecture for instruction selection.<br />
<br />
The instruction set of the 6502 isn't much more irregular than that of most modern CPUs, but the way that irregularity manifests itself is unusual. Modern CPUs generally have multiple operand slots for each opcode, so that addresses, indices, offsets, source registers, and destination registers can be directly specified. For the 6502, many of these values are indirectly encoded in the opcode itself. For example, the 6502 can load from an absolute memory location into any register (A, X, or Y), but there are three opcodes to do so: LDA ADDR, LDX ADDR, and LDY ADDR. A more modern CPU would typically use one opcode: "LD R, ADDR".<br />
<br />
To make the 6502 instruction set look more like what LLVM expects, we recast it "as if" it were a more modern instruction set. Thus, we report that the 6502 does have one "LD R, ADDR" instruction, where R is A, X, or Y. After code is generated in terms of these "logical instructions", we lower them down to the real 6502 instructions for final assembly output, machine code generation, and linking.<br />
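The lowering step described above can be sketched in C for the absolute-addressing load. This is a hand-written illustration rather than the backend's actual code: the enum and function names are hypothetical, and only the three opcode values (LDA, LDX, and LDY in absolute mode) come from the 6502 instruction set itself.<br />

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register names for this sketch; not llvm-mos identifiers. */
enum reg { REG_A, REG_X, REG_Y };

/* Encode the logical "LD R, ADDR" as the real absolute-mode instruction.
   Writes opcode, address low byte, address high byte into out and returns 3,
   or returns 0 if the register is not one of A, X, Y. */
size_t lower_ld_abs(enum reg r, uint16_t addr, uint8_t out[3])
{
    uint8_t opcode;

    switch (r) {
    case REG_A: opcode = 0xAD; break; /* LDA absolute */
    case REG_X: opcode = 0xAE; break; /* LDX absolute */
    case REG_Y: opcode = 0xAC; break; /* LDY absolute */
    default:    return 0;
    }
    out[0] = opcode;
    out[1] = (uint8_t)(addr & 0xFF); /* 6502 addresses are little-endian */
    out[2] = (uint8_t)(addr >> 8);
    return 3;
}
```

The point of the sketch is that the register operand selects the opcode rather than occupying an operand slot, which is why the logical instruction can be kept uniform until this final step.<br />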
<br />
Because we model the 6502 instruction set in such a way as to be amenable to LLVM's algorithms, we benefit greatly from its machine-independent optimization flows, from instruction selection, to register allocation, to basic block layout. There are some 6502-specific difficulties, but LLVM provides relatively good means for targets like ours to handle them, through pseudoinstructions and subroutine implementations that abstract away the complexity so that LLVM doesn't need to know about it.<br />
<br />
The original architects of the NMOS 6502 compensated for the 6502's small number of registers by providing 256 bytes of memory called the zero page, which can be accessed relatively quickly and cheaply by the processor. The llvm-mos C compiler uses a user-selectable range of zero-page memory, and performs nearly all of its operations there directly. We refer to these selected ranges of zero page as imaginary registers, to distinguish them from LLVM's virtual registers. Code generation allocates and chooses imaginary registers for all operations that do not access 16-bit memory. Eventually, references to the imaginary registers are emitted as abstract symbols like "__rc12". The linker script will later map these to available locations in the zero page, depending on the target's specific memory map. Accordingly, the compiler's use of the zero page is highly customizable, and it can make use of the highly discontiguous zero page fragments typical on real 6502 hardware.<br />
<br />
Because we reserve a chunk of zero page memory for imaginary registers, and because LLVM has a great deal of specialized knowledge about pointer offsets and the like, llvm-mos can intelligently use a lot of the 6502's specialized addressing modes. For example, when a memory address that is an offset from a specific 16-bit pointer must be calculated, and that offset is 255 bytes or less, then llvm-mos can use LDA/STA (zp),y instructions to access that memory directly in a single instruction. This in turn means that LLVM's GetElementPtr (GEP) instruction can in many cases be reduced to a series of individual 6502 instructions. This is a big win when operating on pointed-to structs.<br />
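As a rough illustration of the struct case, the C fragment below reads a field at a small constant offset from a pointer. The struct and function names are invented for this sketch, and the instruction sequence mentioned in the comment is only the shape one might expect, not verified compiler output.<br />

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for illustration; every field offset fits in 8 bits. */
struct sprite {
    uint8_t x;
    uint8_t y;
    uint8_t tile;
    uint8_t attr;
};

/* p->tile lives at p + offsetof(struct sprite, tile), i.e. p + 2.
   With p held in an imaginary pointer register, a load of this shape is a
   candidate for something like "LDY #2" followed by "LDA (zp),y"
   (expected shape only, not verified output). */
uint8_t sprite_tile(const struct sprite *p)
{
    return p->tile;
}
```

Because the offset is a compile-time constant below 256, no 16-bit address arithmetic is needed at run time in a case like this.<br />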
<br />
We can even try to produce 8-bit offsets where they wouldn't otherwise exist. For example, wherever possible, we rewrite 16-bit pointer loop indices to a sum of a 16-bit base and an 8-bit offset. Later on, the sum will be folded away into a LDA/STA (zp),y instruction. The advantage is that the loop increment is now just INY, not a full 16-bit increment. This optimization is possible because this pattern can be detected early on in the codegen pipeline, as LLVM allows us to do. See [https://github.com/llvm-mos/llvm-mos/blob/main/llvm/lib/Target/MOS/MOSIndexIV.cpp MOSIndexIV.cpp] for more information on this particular optimization.</div>71.198.117.145https://llvm-mos.org/index.php?title=Code_generation_overview&diff=177Code generation overview2021-06-02T19:42:20Z<p>71.198.117.145: </p>
<hr />
<div>Like all LLVM backends, the bulk of the implementation exists in its own directory within the LLVM hierarchy, in llvm/lib/Target/MOS. llvm-mos uses LLVM's new GlobalISel architecture for instruction selection.<br />
<br />
The instruction set of the 6502 isn't much more irregular than most modern CPUs, but the way that that regularity manifests itself is highly irregular. Modern CPUs generally have multiple operand slots for each opcode, so that addresses, indices, offsets, source registers, and destination registers can be directly specified. For the 6502, many of these values are indirectly encoded in the opcode itself. For example, the 6502 can load an absolute memory location to any register (A, X, or Y), but there are three opcodes to do so: LDA ADDR, LDX ADDR, and LDY ADDR. A more modern CPU would typically use one opcode: "LD R, ADDR".<br />
<br />
To make the 6502 instruction set look more like what LLVM expects, we recast it "as if" it were a more modern instruction set. Thus, we report that the 6502 does have one "LD R, ADDR" instruction, where R is A, X, or Y. After code is generated in terms of these "logical instructions", we lower them down to the real 6502 instructions for final assembly output, machine code generation, and linking.<br />
<br />
Because we model the 6502 instruction set in such a way as to be amenable to LLVM's algorithms, we benefit greatly from its machine independent optimization flows, from instruction selection, to register allocation, to basic block layout. There are some 6502-specific difficulties, but LLVM does provide relatively good means for targets (like ours) to sort them out ourselves, by providing pseudoinstructions and subroutine implementations that abstract away the complexity so LLVM doesn't need to know about it.<br />
<br />
The original architects of the NMOS 6502 compensated for the 6502's small number of registers by providing 256 bytes of memory called Zero page, which could be accessed relatively quickly and cheaply by the processor. The llvm-mos C compiler utilizes a user-selectable range of zero-page memory, and performs nearly all of its operations there directly. We refer to this treatment of selectable ranges of zero page as imaginary registers, to distinguish them from LLVM's virtual registers. Code generation allocates and chooses imaginary registers for all operations that do not access 16-bit memory. Eventually, references to the imaginary registers are emitted as abstract symbols like "__rc12". The linker script will later map these to available locations in the zero page, depending on the target's specific memory map. Accordingly, the compiler's use of the zero page is highly customizable, and it can make use of highly discontiguous zero page fragments typical on real 6502 hardware.<br />
<br />
Because we reserve a chunk of zero page memory for imaginary registers, and because LLVM has a great deal of specialized knowledge about pointer offsets and the like, llvm-mos can intelligently use a lot of the 6502's specialized addressing modes. For example, when a memory address that is an offset from a specific 16-bit pointer must be calculated, and that offset is 255 bytes or less, then llvm-mos can use LDA/STA (zp),y instructions to access that memory directly in a single instruction. This in turn means that LLVM's GetElementPtr (GEP) instruction can in many cases be reduced to a series of individual 6502 instructions. This is a big win when operating on pointed-to structs.<br />
<br />
We can even try to produce 8-bit offsets where they wouldn't otherwise exist. For example, wherever possible, we rewrite 16-bit pointer loop indices to a sum of a 16-bit constant and 8-bit offset. Later on, the sum will be folded away into a LDA/STA (zp),y instruction. The advantage is that the loop increment is now just INY, not a full 16-bit increment. This optimization is possible because this pattern can be detected early on in the codegen pipelines, as LLVM allows us to do. See [https://github.com/llvm-mos/llvm-mos/blob/main/llvm/lib/Target/MOS/MOSIndexIV.cpp MOSIndexIV.cpp] for more information on this particular optimization.</div>71.198.117.145https://llvm-mos.org/index.php?title=Code_generation_overview&diff=176Code generation overview2021-06-02T19:41:31Z<p>71.198.117.145: </p>
<hr />
<div>Like all LLVM backends, the bulk of the implementation exists in its own directory within the LLVM hierarchy, in llvm/lib/Target/MOS. llvm-mos uses LLVM's new GlobalISel architecture for instruction selection.<br />
<br />
The instruction set of the 6502 isn't much more irregular than most modern CPUs, but the way that that regularity manifests itself is highly irregular. Modern CPUs generally have multiple operand slots for each opcode, so that addresses, indices, offsets, source registers, and destination registers can be directly specified. For the 6502, many of these values are indirectly encoded in the opcode itself. For example, the 6502 can load an absolute memory location to any register (A, X, or Y), but there are three opcodes to do so: LDA ADDR, LDX ADDR, and LDY ADDR. A more modern CPU would typically use one opcode: "LD R, ADDR".<br />
<br />
To make the 6502 instruction set look more like what LLVM expects, we recast it "as if" it were a more modern instruction set. Thus, we report that the 6502 does have one "LD R, ADDR" instruction, where R is A, X, or Y. After code is generated in terms of these "logical instructions", we lower them down to the real 6502 instructions for final assembly output, machine code generation, and linking.<br />
<br />
Because we model the 6502 instruction set in such a way as to be amenable to LLVM's algorithms, we benefit greatly from its machine independent optimization flows, from instruction selection, to register allocation, to basic block layout. There are some 6502-specific difficulties, but LLVM does provide relatively good means for targets (like ours) to sort them out ourselves, by providing pseudoinstructions and subroutine implementations that abstract away the complexity so LLVM doesn't need to know about it.<br />
<br />
The original architects of the NMOS 6502 compensated for the 6502's small number of registers by providing 256 bytes of memory called Zero page, which could be accessed relatively quickly and cheaply by the processor. The llvm-mos C compiler utilizes a user-selectable range of zero-page memory, and performs nearly all of its operations there directly. We refer to this treatment of selectable ranges of zero page as imaginary registers, to distinguish them from LLVM's virtual registers. Code generation allocates and chooses imaginary registers for all operations that do not access 16-bit memory. Eventually, references to the imaginary registers are emitted as abstract symbols like "__rc12". The linker script will later map these to available locations in the zero page, depending on the target's specific memory map. Accordingly, the compiler's use of the zero page is highly customizable, and it can make use of highly discontiguous zero page fragments typical on real 6502 hardware.<br />
<br />
Because we reserve a chunk of zero page memory for imaginary registers, and because LLVM has a great deal of specialized knowledge about pointer offsets and the like, llvm-mos can intelligently use a lot of the 6502's specialized addressing modes. For example, when a memory address that is an offset from a specific 16-bit pointer must be calculated, and that offset is 255 bytes or less, then llvm-mos can use LDA/STA (zp),y instructions to access that memory directly in a single instruction. This in turn means that LLVM's GetElementPtr (GEP) instruction can in many cases be reduced to a series of individual 6502 instructions. This is a big win when operating on pointed-to structs.<br />
<br />
We can even try to produce 8-bit offsets where they wouldn't otherwise exist. For example, wherever possible, we rewrite 16-bit pointer loop indices to a sum of a 16-bit constant and 8-bit offset. Later on, the sum will be folded away into a LDA/STA (zp),y instruction. This optimization is possible because this pattern can be detected early on in the codegen pipelines, as LLVM allows us to do. See [https://github.com/llvm-mos/llvm-mos/blob/main/llvm/lib/Target/MOS/MOSIndexIV.cpp MOSIndexIV.cpp] for more information on this particular optimization.</div>71.198.117.145https://llvm-mos.org/index.php?title=Code_generation_overview&diff=175Code generation overview2021-06-02T19:40:15Z<p>71.198.117.145: </p>
<hr />
<div>Like all LLVM backends, the bulk of the implementation exists in its own directory within the LLVM hierarchy, in llvm/lib/Target/MOS. llvm-mos uses LLVM's new GlobalISel architecture for instruction selection.<br />
<br />
The instruction set of the 6502 isn't much more irregular than most modern CPUs, but the way that that regularity manifests itself is highly irregular. Modern CPUs generally have multiple operand slots for each opcode, so that addresses, indices, offsets, source registers, and destination registers can be directly specified. For the 6502, many of these values are indirectly encoded in the opcode itself. For example, the 6502 can load an absolute memory location to any register (A, X, or Y), but there are three opcodes to do so: LDA ADDR, LDX ADDR, and LDY ADDR. A more modern CPU would typically use one opcode: "LD R, ADDR".<br />
<br />
To make the 6502 instruction set look more like what LLVM expects, we recast it "as if" it were a more modern instruction set. Thus, we report that the 6502 does have one "LD R, ADDR" instruction, where R is A, X, or Y. After code is generated in terms of these "logical instructions", we lower them down to the real 6502 instructions for final assembly output, machine code generation, and linking.<br />
<br />
Because we model the 6502 instruction set in such a way as to be amenable to LLVM's algorithms, we benefit greatly from its machine independent optimization flows, from instruction selection, to register allocation, to basic block layout. There are some 6502-specific difficulties, but LLVM does provide relatively good means for targets (like ours) to sort them out ourselves, by providing pseudoinstructions and subroutine implementations that abstract away the complexity so LLVM doesn't need to know about it.<br />
<br />
The original architects of the NMOS 6502 compensated for the 6502's small number of registers by providing 256 bytes of memory called Zero page, which could be accessed relatively quickly and cheaply by the processor. The llvm-mos C compiler utilizes a user-selectable range of zero-page memory, and performs nearly all of its operations there directly. We refer to this treatment of selectable ranges of zero page as imaginary registers, to distinguish them from LLVM's virtual registers. Code generation allocates and chooses imaginary registers for all operations that do not access 16-bit memory. Eventually, references to the imaginary registers are emitted as abstract symbols like "__rc12". The linker script will later map these to available locations in the zero page, depending on the target's specific memory map. Accordingly, the compiler's use of the zero page is highly customizable, and it can make use of highly discontiguous zero page fragments typical on real 6502 hardware.<br />
<br />
Because we reserve a chunk of zero page memory for imaginary registers, and because LLVM has a great deal of specialized knowledge about pointer offsets and the like, llvm-mos can intelligently use a lot of the 6502's specialized addressing modes. For example, when a memory address that is an offset from a specific 16-bit pointer must be calculated, and that offset is 255 bytes or less, then llvm-mos can use LDA/STA (zp),y instructions to access that memory directly in a single instruction. This in turn means that LLVM's GetElementPtr (GEP) instruction can in many cases be reduced to a series of individual 6502 instructions. This is a big win when operating on pointed-to structs.<br />
<br />
We can even try to produce 8-bit offsets where they wouldn't otherwise exist. For example, wherever possible, we rewrite 16-bit pointer loop indices to a sum of a 16-bit constant and 8-bit offset. This optimization is possible because this pattern can be detected early on in the codegen pipelines, as LLVM allows us to do. See [https://github.com/llvm-mos/llvm-mos/blob/main/llvm/lib/Target/MOS/MOSIndexIV.cpp MOSIndexIV.cpp] for more information on this particular optimization.</div>71.198.117.145https://llvm-mos.org/index.php?title=C_compiler&diff=174C compiler2021-06-02T19:36:42Z<p>71.198.117.145: </p>
<hr />
<div>A backend has been added to clang to support the MOS instruction set. This backend may be targeted by adding the flag <code>--target=mos</code> to clang.<br />
<br />
The compiler is a freestanding implementation of the C99 standard, with a couple of caveats. The biggest is that it isn't finished yet!<br />
Code generation is broken, both in terms of what compiles and in terms of whether or not the results actually work.<br />
Development is ongoing though, and test cases are moving from failed to passing at a considerable clip!<br />
<br />
Even for the initial release, there'll still be some caveats:<br />
* While this is a freestanding implementation, we'll also provide a few support libraries; mostly just those needed to get LLVM's end-to-end test suite passing. This includes printf, sprintf, and alloca.<br />
* No float and no double. We'll eventually ship a working IEEE 754 soft float library with the compiler for completeness' sake, but we expect low demand for this, and it'll distract from the rest of the project.<br />
* The (default) included printf will not be compiled with floating point support, even when we ship soft float libraries. We'll find some way to link in a different version if users elect to link the soft float routines.</div>71.198.117.145