Emulator overview

As part of the llvm-mos project, general-purpose emulator functionality is being added to LLVM. This permits quick testing and experimentation with compiler changes, as well as test suite verification, without requiring actual hardware to be present.

The emulator design is intended to permit multiple future iterations of emulation without throwing away any existing functionality. Although a classic read-parse-execute style emulator would be sufficient for small tests, we want to be able to transpile the emulated code to the local machine's native instruction set for maximum performance. We'd also like to leave the door open to debugging with lldb or other tools, including reverse execution. And although all known LLVM targets have register sets and contiguous memory, we don't necessarily want to require any particular emulation to have these features.

About 80% of the functionality required to build an emulator is already present within LLVM for most targets. LLVM already includes a disassembler class, MCDisassembler, which is implemented for most targets. It can reconstruct a series of MCInst objects from a raw instruction stream. What's still needed is a way to take the "current" MCInst and emulate its behavior.
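The decode-then-emulate split described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the MCEmulator API: `DecodedInst` stands in for an MCInst, `decode` stands in for the disassembler, and `MachineState`, `execute`, and `run` are hypothetical names. The four opcodes are real 6502 encodings, but processor flags are not modeled and BRK is treated as a simple halt.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for llvm::MCInst: an opcode plus one immediate operand.
enum class Opcode : std::uint8_t { LDA_Imm, TAX, INX, BRK };

struct DecodedInst {
  Opcode Op;
  std::uint8_t Imm;  // only meaningful for LDA_Imm
  std::size_t Size;  // bytes consumed from the instruction stream
};

// Hypothetical machine state; a real emulation need not look like this.
struct MachineState {
  std::uint8_t A = 0;
  std::uint8_t X = 0;
  std::size_t PC = 0;
  bool Halted = false;
};

// Plays the role of the disassembler: raw bytes in, one decoded instruction out.
std::optional<DecodedInst> decode(const std::vector<std::uint8_t> &Mem,
                                  std::size_t PC) {
  if (PC >= Mem.size())
    return std::nullopt;
  switch (Mem[PC]) {
  case 0xA9: // LDA #imm
    if (PC + 1 >= Mem.size())
      return std::nullopt;
    return DecodedInst{Opcode::LDA_Imm, Mem[PC + 1], 2};
  case 0xAA: // TAX
    return DecodedInst{Opcode::TAX, 0, 1};
  case 0xE8: // INX
    return DecodedInst{Opcode::INX, 0, 1};
  case 0x00: // BRK, simplified here to "stop"
    return DecodedInst{Opcode::BRK, 0, 1};
  default:   // unknown encoding
    return std::nullopt;
  }
}

// The missing piece: apply the behavior of the "current" instruction.
void execute(const DecodedInst &Inst, MachineState &S) {
  switch (Inst.Op) {
  case Opcode::LDA_Imm: S.A = Inst.Imm; break;
  case Opcode::TAX:     S.X = S.A; break;
  case Opcode::INX:     S.X = static_cast<std::uint8_t>(S.X + 1); break;
  case Opcode::BRK:     S.Halted = true; break;
  }
  S.PC += Inst.Size;
}

// Decode and execute until the program halts or decoding fails.
MachineState run(const std::vector<std::uint8_t> &Mem) {
  MachineState S;
  while (!S.Halted) {
    std::optional<DecodedInst> Inst = decode(Mem, S.PC);
    if (!Inst)
      break;
    execute(*Inst, S);
  }
  return S;
}
```

Calling `run({0xA9, 0x41, 0xAA, 0xE8, 0x00})` loads 0x41 into A, copies it to X, increments X, and halts. In LLVM the decoding half already exists for most targets, so the work falls almost entirely on the `execute` side.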

It's also natural to describe the behavior of each instruction in the tablegen file that defines it. Doing so allows similar instructions to be grouped and to share code for related activities.

Since LLVM's tablegen already has complete knowledge of all the instructions present for the target, it makes sense to describe the emulated behavior of these instructions to tablegen, and then let tablegen collate all these behaviors into a .inc file that can be included within a larger emulator.
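One way a collated file of behaviors might be consumed is the macro-list pattern that LLVM's existing .inc files commonly use. The sketch below is an assumption about the general shape, not the actual generated output: the `EMU_INSTRUCTIONS` macro stands in for the contents of a generated .inc file so that the example is self-contained, and every name in it is hypothetical.

```cpp
#include <cstdint>

// Hypothetical machine state for the illustration.
struct State {
  std::uint8_t A = 0;
  std::uint8_t X = 0;
};

// Stand-in for a generated file. In a real build this list would live in a
// tablegen-produced .inc file and be pulled in with #include; the macro name
// and its entries are assumptions made for this sketch.
#define EMU_INSTRUCTIONS(HANDLE)                                               \
  HANDLE(TAX, S.X = S.A)                                                       \
  HANDLE(TXA, S.A = S.X)                                                       \
  HANDLE(INX, S.X = static_cast<std::uint8_t>(S.X + 1))

// First expansion: one enumerator per instruction.
enum class Opcode {
#define AS_ENUM(NAME, BODY) NAME,
  EMU_INSTRUCTIONS(AS_ENUM)
#undef AS_ENUM
};

// Second expansion: one switch case per instruction, carrying its behavior.
void emulate(Opcode Op, State &S) {
  switch (Op) {
#define AS_CASE(NAME, BODY)                                                    \
  case Opcode::NAME:                                                           \
    BODY;                                                                      \
    break;
    EMU_INSTRUCTIONS(AS_CASE)
#undef AS_CASE
  }
}
```

The same list is expanded twice, once for the opcode enumeration and once for the dispatch switch, so adding an instruction in one place updates both. A larger emulator could expand the list in other ways as well, for example to build a table of handlers instead of a switch.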

In keeping with the LLVM philosophy, MCEmulator is a set of building blocks for building emulators. It includes a command-line tool, llvm-emu, which demonstrates one way to combine those blocks into an emulator.