The heart of any backend is instruction selection. LLVM implements several approaches; in this chapter, we will implement instruction selection via the selection directed acyclic graph (DAG) and with global instruction selection.

In this chapter, you will learn about the following topics:

  • Defining the rules of the calling convention: This section shows you how to describe the rules of a calling convention in the target description
  • Instruction selection via the selection DAG: This section teaches you how to implement instruction selection with a graph data structure
  • Adding register and instruction information: This section explains how to access information in the target description, and what additional information you need to provide
  • Putting an empty frame lowering in place: This section introduces you to the stack layout and the prologue of a function
  • Emitting machine instructions: This section tells you how machine instructions are finally written into an object file or as assembly text
  • Creating the target machine and the sub-target: This section shows you how a backend is configured
  • Global instruction selection: This section demonstrates a different approach to instruction selection
  • How to further evolve the backend: This section gives you some guidance about possible next steps

By the end of this chapter, you will know how to create an LLVM backend that can translate simple instructions. You will also acquire the knowledge to develop instruction selection via the selection DAG and with global instruction selection, and you will become familiar with all the important support classes you have to implement to get instruction selection working.

Defining the rules of the calling convention

Implementing the rules of the calling convention is an important part of lowering the LLVM intermediate representation (IR) to machine code. The basic rules can be defined in the target description. Let’s have a look.

Most calling conventions follow a basic pattern: they define a subset of registers for parameter passing. As long as this subset is not exhausted, the next parameter is passed in the next free register. If there is no free register, then the value is passed on the stack. This can be realized by looping over the parameters and deciding how to pass each parameter to the called function while keeping track of the used registers. In LLVM, this loop is implemented inside the framework, and the state is held in a class called CCState. The rules that drive the loop are defined in the target description.
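To make the loop and its state more concrete, here is a minimal, standalone sketch. It does not use LLVM at all: AssignState and assignParams are hypothetical names that only imitate the kind of bookkeeping CCState does, and the sketch assumes that every parameter is a 32-bit value occupying a 4-byte stack slot.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the bookkeeping that LLVM keeps in CCState:
// which parameter registers are taken and how much stack is in use.
struct AssignState {
  std::vector<std::string> Regs; // registers reserved for parameter passing
  unsigned NextReg = 0;          // index of the next free register
  unsigned StackOffset = 0;      // bytes of stack assigned so far
};

// Decide where each of NumParams 32-bit parameters is passed.
// Assumption: every parameter needs one register or one 4-byte stack slot.
std::vector<std::string> assignParams(AssignState &S, unsigned NumParams) {
  std::vector<std::string> Locs;
  for (unsigned I = 0; I < NumParams; ++I) {
    if (S.NextReg < S.Regs.size()) {
      // A register is still free: use it and mark it as taken.
      Locs.push_back(S.Regs[S.NextReg++]);
    } else {
      // No free register is left: fall back to the stack.
      Locs.push_back("stack+" + std::to_string(S.StackOffset));
      S.StackOffset += 4;
    }
  }
  return Locs;
}
```

With three parameter registers and five parameters, the first three parameters end up in registers and the last two in consecutive stack slots. The point of the sketch is only that the decision for each parameter depends on the state left behind by the previous ones.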

The rules are given as a sequence of conditions, each paired with an action. If a condition holds, then its action is executed. Depending on the outcome of that action, either a place for the parameter is found, or the next condition is evaluated. For example, 32-bit integers are passed in a register. The condition is the type check, and the action is the assignment of a register to this parameter. In the target description, this is written as follows:
CCIfType<[i32],
         CCAssignToReg<[R2, R3, R4, R5, R6, R7, R8, R9]>>,

Of course, if the called function has more than eight such parameters, then the register list will be exhausted, and the action will fail. The remaining parameters are passed on the stack, and we can specify this as the next action:
CCAssignToStack<4, 4>

The first parameter is the size of a stack slot in bytes, while the second is the alignment. Since it is a catch-all rule, no condition is used.
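The way these two rules interact can be modeled in a few lines of plain C++. The following is only a sketch of the semantics, not LLVM's implementation: ifType, assignToReg, assignToStack, analyze, and exampleRules are hypothetical helpers named after the target description constructs they imitate, and locations are represented as simple strings.

```cpp
#include <functional>
#include <string>
#include <vector>

// All names here are hypothetical. They mimic the behaviour of CCIfType,
// CCAssignToReg, and CCAssignToStack, but they are not LLVM's API.
enum class ValType { I32, I64 }; // I64 only gives the type check something to reject

struct State {
  unsigned NextReg = 0;     // index of the next free parameter register
  unsigned StackOffset = 0; // bytes of stack already assigned
};

// A rule returns true if it found a place for the parameter, and false
// if the next rule in the sequence has to be tried.
using Rule = std::function<bool(ValType, State &, std::string &)>;

// Condition: run the wrapped action only if the type matches.
Rule ifType(ValType Wanted, Rule Action) {
  return [=](ValType T, State &S, std::string &Loc) {
    return T == Wanted && Action(T, S, Loc);
  };
}

// Action: take the next free register; fails once the list is exhausted.
Rule assignToReg(std::vector<std::string> Regs) {
  return [=](ValType, State &S, std::string &Loc) {
    if (S.NextReg >= Regs.size())
      return false;
    Loc = Regs[S.NextReg++];
    return true;
  };
}

// Action: take the next stack slot of Size bytes, aligned to Align bytes.
Rule assignToStack(unsigned Size, unsigned Align) {
  return [=](ValType, State &S, std::string &Loc) {
    unsigned Offset = (S.StackOffset + Align - 1) / Align * Align;
    Loc = "stack+" + std::to_string(Offset);
    S.StackOffset = Offset + Size;
    return true;
  };
}

// Evaluate the rules in order for every parameter.
std::vector<std::string> analyze(const std::vector<Rule> &Rules,
                                 const std::vector<ValType> &Params) {
  State S;
  std::vector<std::string> Locs;
  for (ValType T : Params) {
    std::string Loc = "unassigned";
    for (const Rule &R : Rules)
      if (R(T, S, Loc))
        break; // a place was found, so the remaining rules are skipped
    Locs.push_back(Loc);
  }
  return Locs;
}

// The two rules from the text, in the same order.
std::vector<Rule> exampleRules() {
  return {ifType(ValType::I32, assignToReg({"R2", "R3", "R4", "R5", "R6",
                                            "R7", "R8", "R9"})),
          assignToStack(4, 4)};
}
```

Calling analyze(exampleRules(), ...) with ten I32 parameters places the first eight in R2 to R9. For the ninth and tenth, the register action fails, so the catch-all stack rule assigns the offsets 0 and 4.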
