Creating the M88k assembler parser class – The Target Description-3
By Peggy Johnston / September 28, 2022
- The parseRegister() method tries to parse a register. First, it checks for a percent sign %. If this is followed by an identifier which matches a register name, then we successfully parsed a register, and return the register number in the RegNo parameter. However, if we cannot identify a register, then we may need to undo the lexing if the RestoreOnFailure parameter is true:
bool M88kAsmParser::parseRegister(
    MCRegister &RegNo, SMLoc &StartLoc, SMLoc &EndLoc,
    bool RestoreOnFailure) {
  StartLoc = Parser.getTok().getLoc();
  // A register always starts with the '%' prefix.
  if (Parser.getTok().isNot(AsmToken::Percent))
    return true;
  const AsmToken &PercentTok = Parser.getTok();
  Parser.Lex();
  if (Parser.getTok().isNot(AsmToken::Identifier) ||
      (RegNo = MatchRegisterName(
           Parser.getTok().getIdentifier())) == 0) {
    // Put the '%' token back if the caller asked for it.
    if (RestoreOnFailure)
      Parser.getLexer().UnLex(PercentTok);
    return Error(StartLoc, "invalid register");
  }
  Parser.Lex();
  EndLoc = Parser.getTok().getLoc();
  return false;
}
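The restore-on-failure behaviour can be tried out without an LLVM build. The following is a minimal sketch in plain C++: `ToyLexer`, `Token`, `TokKind`, and the free functions `matchRegisterName()` and `parseRegister()` are hypothetical stand-ins invented for illustration, not the LLVM classes, and the register set `r0` to `r31` is an assumption. As in the listing above, a return value of `true` means failure.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical, simplified token model; not LLVM's AsmToken.
enum class TokKind { Percent, Identifier, Eof };

struct Token {
  TokKind Kind;
  std::string Text;
};

// Toy lexer over a pre-tokenized buffer. The buffer always ends with an
// Eof token, so peek() is valid at any time.
class ToyLexer {
  std::deque<Token> Toks;

public:
  explicit ToyLexer(std::deque<Token> T) : Toks(std::move(T)) {
    if (Toks.empty() || Toks.back().Kind != TokKind::Eof)
      Toks.push_back({TokKind::Eof, ""});
  }
  const Token &peek() const { return Toks.front(); }
  // Consumes the current token; the final Eof token is never consumed.
  void lex() {
    if (Toks.size() > 1)
      Toks.pop_front();
  }
  // Pushes a token back so that it becomes the current token again.
  void unLex(Token T) { Toks.push_front(std::move(T)); }
};

// Maps "r0".."r31" to 0..31 and returns -1 for anything else.
int matchRegisterName(const std::string &Name) {
  if (Name.size() < 2 || Name.size() > 3 || Name[0] != 'r')
    return -1;
  if (Name.size() == 3 && Name[1] == '0')
    return -1; // reject spellings such as "r01"
  int Value = 0;
  for (std::size_t I = 1; I < Name.size(); ++I) {
    if (Name[I] < '0' || Name[I] > '9')
      return -1;
    Value = Value * 10 + (Name[I] - '0');
  }
  return Value <= 31 ? Value : -1;
}

// Returns true on failure. On failure after the '%' was consumed, the
// '%' token is pushed back only if RestoreOnFailure is set.
bool parseRegister(ToyLexer &Lexer, int &RegNo, bool RestoreOnFailure) {
  if (Lexer.peek().Kind != TokKind::Percent)
    return true; // nothing was consumed
  Token PercentTok = Lexer.peek(); // keep a copy for a possible unLex()
  Lexer.lex();
  if (Lexer.peek().Kind != TokKind::Identifier ||
      (RegNo = matchRegisterName(Lexer.peek().Text)) < 0) {
    if (RestoreOnFailure)
      Lexer.unLex(PercentTok);
    return true;
  }
  Lexer.lex();
  return false;
}
```

With `RestoreOnFailure` set, a failed attempt on `%foo` leaves the lexer positioned at the `%` again, so a caller can try another operand kind; without it, the `%` stays consumed.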
- The overridden parseRegister() and tryParseRegister() methods are just wrappers around the previously defined method. The latter also translates the Boolean return value into a member of the OperandMatchResultTy enumeration:
bool M88kAsmParser::parseRegister(MCRegister &RegNo,
                                  SMLoc &StartLoc,
                                  SMLoc &EndLoc) {
  return parseRegister(RegNo, StartLoc, EndLoc,
                       /*RestoreOnFailure=*/false);
}
OperandMatchResultTy M88kAsmParser::tryParseRegister(
    MCRegister &RegNo, SMLoc &StartLoc, SMLoc &EndLoc) {
  bool Result =
      parseRegister(RegNo, StartLoc, EndLoc,
                    /*RestoreOnFailure=*/true);
  bool PendingErrors = getParser().hasPendingError();
  getParser().clearPendingErrors();
  if (PendingErrors)
    return MatchOperand_ParseFail;
  if (Result)
    return MatchOperand_NoMatch;
  return MatchOperand_Success;
}
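The translation from a Boolean result plus a pending-error flag into three outcomes is easy to get backwards, so here is a small stand-alone sketch of the same decision order. `MatchResult`, `ToyParserState`, and `toMatchResult()` are hypothetical names used only for this illustration; they model the logic shown above and are not part of LLVM.

```cpp
#include <cassert>

// Hypothetical stand-in for the three outcomes of the enumeration.
enum class MatchResult { Success, NoMatch, ParseFail };

// Hypothetical stand-in for the parser state that is queried and reset.
struct ToyParserState {
  bool PendingError = false;
  bool hasPendingError() const { return PendingError; }
  void clearPendingErrors() { PendingError = false; }
};

// ParseFailed follows the "true means failure" convention.
// A pending error takes priority over the Boolean result, and the
// pending-error state is always cleared before returning.
MatchResult toMatchResult(bool ParseFailed, ToyParserState &State) {
  bool Pending = State.hasPendingError();
  State.clearPendingErrors();
  if (Pending)
    return MatchResult::ParseFail;
  if (ParseFailed)
    return MatchResult::NoMatch;
  return MatchResult::Success;
}
```

In this model, a failure with a recorded error counts as a hard parse failure, a failure without one means the operand was simply not a register, and only a clean `false` result is a success.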
- Finally, the MatchAndEmitInstruction() method drives the parsing. Most of the method is dedicated to emitting error messages. To identify the instruction, the MatchInstructionImpl() generated method is called:
bool M88kAsmParser::MatchAndEmitInstruction(
    SMLoc IdLoc, unsigned &Opcode,
    OperandVector &Operands, MCStreamer &Out,
    uint64_t &ErrorInfo, bool MatchingInlineAsm) {
  MCInst Inst;
  SMLoc ErrorLoc;
  switch (MatchInstructionImpl(
      Operands, Inst, ErrorInfo, MatchingInlineAsm)) {
  case Match_Success:
    Out.emitInstruction(Inst, SubtargetInfo);
    Opcode = Inst.getOpcode();
    return false;
  case Match_MissingFeature:
    return Error(IdLoc, "Instruction use requires "
                        "option to be enabled");
  case Match_MnemonicFail:
    return Error(IdLoc,
                 "Unrecognized instruction mnemonic");
  case Match_InvalidOperand: {
    ErrorLoc = IdLoc;
    // ErrorInfo is 64 bits wide, so compare it against a
    // 64-bit all-ones value rather than ~0U.
    if (ErrorInfo != ~0ULL) {
      if (ErrorInfo >= Operands.size())
        return Error(
            IdLoc, "Too few operands for instruction");
      ErrorLoc = ((M88kOperand &)*Operands[ErrorInfo])
                     .getStartLoc();
      if (ErrorLoc == SMLoc())
        ErrorLoc = IdLoc;
    }
    return Error(ErrorLoc,
                 "Invalid operand for instruction");
  }
  default:
    break;
  }
  llvm_unreachable("Unknown match type detected!");
}
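The operand-index handling in the Match_InvalidOperand case can be isolated into a small stand-alone sketch. It assumes that the matcher reports "no operand index available" as a 64-bit value with all bits set; `Diag`, `NoOperand`, `NarrowSentinel`, and `classify()` are hypothetical names for this illustration. The sketch also shows why the width of the sentinel literal matters when it is compared against a `uint64_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical outcomes of the invalid-operand branch.
enum class Diag {
  InvalidOperandAtMnemonic, // no operand index: report at the mnemonic
  TooFewOperands,           // index points past the parsed operands
  InvalidOperandAtOperand   // index identifies the offending operand
};

// Assumed sentinel for "no operand index": all 64 bits set.
constexpr std::uint64_t NoOperand = ~0ULL;

// ~0U has the width of unsigned int. Widening it to 64 bits
// zero-extends, so with a 32-bit unsigned int it does not equal
// NoOperand.
constexpr std::uint64_t NarrowSentinel = ~0U;

Diag classify(std::uint64_t ErrorInfo, std::size_t NumOperands) {
  if (ErrorInfo == NoOperand)
    return Diag::InvalidOperandAtMnemonic;
  if (ErrorInfo >= NumOperands)
    return Diag::TooFewOperands;
  return Diag::InvalidOperandAtOperand;
}
```

If the comparison used the narrow sentinel instead, an all-ones `ErrorInfo` would fall through to the range check and be reported as too few operands, which is a different diagnostic from the intended one.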
- And like some other classes, the assembler parser has its own initialization function, which registers the parser class with the target:
extern "C" LLVM_EXTERNAL_VISIBILITY void
LLVMInitializeM88kAsmParser() {
  RegisterMCAsmParser<M88kAsmParser> X(
      getTheM88kTarget());
}
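To see why a single local object is enough to register the parser, it can help to look at a simplified model of the pattern. The sketch below is not LLVM's implementation: `ToyTarget`, `ToyAsmParser`, `RegisterToyAsmParser`, and the other names are hypothetical, and the real registry passes more arguments to the constructor function. The idea it illustrates is that the constructor of the registration helper stores a function pointer in the target, and that function later creates the concrete parser.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical parser interface.
struct ToyAsmParser {
  virtual ~ToyAsmParser() = default;
  virtual std::string name() const = 0;
};

// Hypothetical target entry holding a constructor function pointer.
struct ToyTarget {
  using ParserCtor = std::unique_ptr<ToyAsmParser> (*)();
  ParserCtor AsmParserCtor = nullptr;

  // Returns a null pointer if no parser was registered.
  std::unique_ptr<ToyAsmParser> createAsmParser() const {
    if (!AsmParserCtor)
      return nullptr;
    return AsmParserCtor();
  }
};

// Registration helper: constructing an instance records how to
// allocate ParserT in the given target.
template <typename ParserT> class RegisterToyAsmParser {
  static std::unique_ptr<ToyAsmParser> allocate() {
    return std::make_unique<ParserT>();
  }

public:
  explicit RegisterToyAsmParser(ToyTarget &T) {
    T.AsmParserCtor = &allocate;
  }
};

struct ToyM88kAsmParser : ToyAsmParser {
  std::string name() const override { return "m88k"; }
};

ToyTarget &getTheToyM88kTarget() {
  static ToyTarget T;
  return T;
}

// Mirrors the shape of the initialization function: the helper object
// is a temporary local, but its side effect on the target persists.
void initializeToyM88kAsmParser() {
  RegisterToyAsmParser<ToyM88kAsmParser> X(getTheToyM88kTarget());
}
```

In this model, the template argument is what ties the target to a concrete parser class, which is why it cannot be left out of the registration.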
This finishes the implementation of the assembler parser. After building LLVM, we can use the llvm-mc machine code playground tool to assemble an instruction:
$ echo 'and %r1,%r2,%r3' | \
  bin/llvm-mc --triple m88k-openbsd --show-encoding
.text
and %r1, %r2, %r3 | encoding: [0xf4,0x22,0x40,0x03]
Note the use of the vertical bar | as the comment character. This is the value we configured in the M88kMCAsmInfo class.
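As a plausibility check, the four encoding bytes can be decoded by hand. The sketch below assumes the M88k triadic register format, with a 6-bit major opcode in bits 31-26, the destination register in bits 25-21, the first source register in bits 20-16, and the second source register in bits 4-0, and it assumes the bytes are printed in big-endian order. The names `assembleWord()`, `decodeTriadic()`, and `Triadic` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Register fields of a triadic instruction (assumed layout, see above).
struct Triadic {
  unsigned MajorOpcode; // bits 31-26
  unsigned Rd;          // bits 25-21
  unsigned Rs1;         // bits 20-16
  unsigned Rs2;         // bits 4-0
};

// Combines four bytes, most significant byte first, into a 32-bit word.
std::uint32_t assembleWord(const std::uint8_t Bytes[4]) {
  return (std::uint32_t(Bytes[0]) << 24) |
         (std::uint32_t(Bytes[1]) << 16) |
         (std::uint32_t(Bytes[2]) << 8) |
         std::uint32_t(Bytes[3]);
}

// Extracts the fields with shifts and 5-bit or 6-bit masks.
Triadic decodeTriadic(std::uint32_t Word) {
  Triadic T;
  T.MajorOpcode = (Word >> 26) & 0x3f;
  T.Rd = (Word >> 21) & 0x1f;
  T.Rs1 = (Word >> 16) & 0x1f;
  T.Rs2 = Word & 0x1f;
  return T;
}
```

For the bytes 0xf4, 0x22, 0x40, 0x03 the word is 0xf4224003, and the three register fields decode to 1, 2, and 3, which agrees with the operands %r1, %r2, and %r3 of the assembled instruction.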
DEBUGGING THE ASSEMBLER MATCHER
To debug the assembler matcher, specify the --debug-only=asm-matcher command-line option. Note that this option is only available if LLVM was built with assertions enabled. The output helps you understand why a parsed instruction fails to match the instructions defined in the target description.
In the next section, we will add a disassembler feature to the llvm-mc tool.