CMPUT 229 - Computer Organization and Architecture I

RISC-V-to-ARM Binary Translation - ALU

CMPUT 229 - Lab 5, RISC-V to ARM - ALU

Introduction

This lab introduces the ARM assembly language and asks you to write a program that translates a subset of RISC-V assembly into ARM assembly. It is the first of two ARM translation labs and focuses on the translation of ALU instructions. The next lab, Lab ARM - Control, will translate control-flow instructions.

Information

The Advanced RISC Machines (ARM) Instruction Set

The ARM instruction set differs from the RISC-V instruction set in quite a few ways. It aims to cause fewer control-flow changes and pipeline flushes during the execution of the program. ARM has only 16 visible registers at any time, numbered R0 through R15. Of these, R13 is the Stack Pointer, equivalent to RISC-V's sp, R14 is the Link Register, equivalent to RISC-V's return address (ra) register, and R15 is the Program Counter. Like RISC-V, ARM operates with 4-byte words.

The binary code for every ARM instruction begins with a 4-bit condition code that determines whether or not that instruction should execute. The binary format for a data processing instruction --- which supports ADD, SUB, OR and other ALU instructions with an immediate value --- is as shown below:

Data-Processing Immediate Format

The data-processing immediate format explained above is used by the ARM instructions AND, OR, SUB and ADD with a single register operand and an immediate value.
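As a rough illustration of how the fields of this format fit together, the Python sketch below packs them into a 32-bit word. It assumes the standard ARM layout: condition in bits 31-28, the immediate bit in bit 25, the opcode in bits 24-21, the status bit S in bit 20, Rn in bits 19-16, Rd in bits 15-12, a 4-bit rotate in bits 11-8 and an 8-bit immediate in bits 7-0. The helper name encode_dp_imm, the always-execute condition 1110 and S = 0 are assumptions made for this example and are not taken from the lab specification. Python is used only for illustration; the lab solution itself must be written in RISC-V assembly.

```python
COND_ALWAYS = 0b1110  # assumption: translated instructions execute unconditionally

# Opcodes from the lab's table of operations to translate.
OPCODES = {"AND": 0b0000, "SUB": 0b0010, "ADD": 0b0100, "OR": 0b1100}


def encode_dp_imm(op, rd, rn, imm8, rotate=0, set_flags=0):
    """Pack a data-processing immediate instruction (hypothetical helper)."""
    if not (0 <= imm8 <= 0xFF and 0 <= rotate <= 0xF):
        raise ValueError("imm8 must fit in 8 bits and rotate in 4 bits")
    return (
        (COND_ALWAYS << 28)
        | (1 << 25)              # immediate bit set: operand 2 is an immediate
        | (OPCODES[op] << 21)
        | (set_flags << 20)
        | (rn << 16)
        | (rd << 12)
        | (rotate << 8)
        | imm8
    )


# ADD R0, R1, #5
print(hex(encode_dp_imm("ADD", rd=0, rn=1, imm8=5)))
```

Under these assumptions, ADD R0, R1, #5 packs to 0xe2810005.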

An addi instruction with a negative immediate must be translated into the SUB instruction.
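The rule above can be sketched in Python as follows. The 12-bit immediate of addi is a two's-complement value, so the sketch sign-extends it first and then chooses between ADD and SUB, passing along the magnitude. The helper names sign_extend_12 and addi_to_arm_op are hypothetical, and how the resulting magnitude is then encoded into the ARM immediate field is outside this sketch.

```python
def sign_extend_12(imm12):
    """Interpret a 12-bit field as a two's-complement signed integer."""
    return imm12 - 0x1000 if imm12 & 0x800 else imm12


def addi_to_arm_op(imm12):
    """Return (ARM mnemonic, non-negative immediate) for a RISC-V addi.

    Hypothetical helper: a negative immediate becomes a SUB of its magnitude.
    """
    value = sign_extend_12(imm12)
    if value < 0:
        return ("SUB", -value)
    return ("ADD", value)


print(addi_to_arm_op(0xFFB))  # 0xFFB is the 12-bit encoding of -5
print(addi_to_arm_op(5))
```

Here addi with immediate -5 maps to a SUB with immediate 5, while addi with immediate 5 stays an ADD.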

Data-Processing Register Format

This instruction format is for ARM data-processing instructions that use two registers as operands. The immediate bit (bit 25) is set to 0, and bits 11-0 hold the register for the second operand and a Shift field.

Bits 31-12 have the same fields as data-processing immediate instructions.

Specifying Rotate Right with a shift amount of 0 in the Shift field shifts Rm right by 1 bit and places the Carry flag from the status register into bit 31. Flags are explained in Lab ARM - Control.

The data-processing register format is used by ARM instructions AND, OR, SUB and ADD that use two register operands as well as for all LSL, LSR and ASR instructions.

All (both I-type and R-type) RISC-V bit-shift operations should be translated into data-processing register-format instructions. The information in the Shift section describes how ARM immediate-value and register-value shifts should be encoded.

Immediate shifts in the Shift field should not be encoded with the function computeRotation specified below. Immediate shifts should be written as is because both RISC-V and ARM shift-immediate values are 5 bit fields treated as unsigned numbers.

For all bit-shift operations (i.e. LSL, LSR and ASR instructions), bits 19 to 16 must be 0. RISC-V's R-type instructions do not have an immediate value, and therefore the Shift field should be 0 for all instructions that do not perform a shift. When the shift amount comes from a register, that register is specified in bits 11-8.
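The two shift encodings can be sketched in Python as below. The sketch assumes the standard ARM layout of bits 11-0: for an immediate shift, the 5-bit amount sits in bits 11-7, the shift type in bits 6-5 and bit 4 is 0; for a register shift, the register holding the amount sits in bits 11-8, bit 7 is 0, the shift type is in bits 6-5 and bit 4 is 1; Rm is always in bits 3-0. The helper names, the always-execute condition 1110 and S = 0 are assumptions for illustration, so check them against the lab's Shift section before relying on them.

```python
COND_ALWAYS = 0b1110   # assumption: translated instructions execute unconditionally
SHIFT_OPCODE = 0b1101  # opcode shared by LSL, LSR and ASR in the lab's table
SHIFT_TYPES = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10}


def encode_shift_imm(kind, rd, rm, amount):
    """Shift Rm by a 5-bit unsigned immediate (hypothetical helper)."""
    if not 0 <= amount <= 31:
        raise ValueError("shift amount must fit in 5 bits")
    shift_field = (amount << 7) | (SHIFT_TYPES[kind] << 5)  # bit 4 stays 0
    # Bits 19-16 are left as 0, as required for all bit-shift operations.
    return (COND_ALWAYS << 28) | (SHIFT_OPCODE << 21) | (rd << 12) | shift_field | rm


def encode_shift_reg(kind, rd, rm, rs):
    """Shift Rm by the amount held in register Rs (hypothetical helper)."""
    shift_field = (rs << 8) | (SHIFT_TYPES[kind] << 5) | (1 << 4)
    return (COND_ALWAYS << 28) | (SHIFT_OPCODE << 21) | (rd << 12) | shift_field | rm


print(hex(encode_shift_imm("LSL", rd=0, rm=1, amount=3)))
print(hex(encode_shift_reg("LSL", rd=0, rm=1, rs=2)))
```

With these assumptions, shifting R1 left by the immediate 3 into R0 gives 0xe1a00181, and shifting R1 left by the value in R2 gives 0xe1a00211.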

The Operations to Translate

RISC-V instructions must be translated into the following ARM instructions in this lab:

Instruction  Opcode  Description
AND          0000    Bitwise AND of two register values or of one register value and an immediate value.
OR           1100    Bitwise OR of two register values or of one register value and an immediate value.
ADD          0100    Addition of two register values or of one register value and an immediate value.
SUB          0010    Subtraction of one register value from another (specifically Rn - Rm).
LSL          1101    Shifts register value in Rm left by specified number of bits. Has shift type 00.
LSR          1101    Shifts register value in Rm right by specified number of bits without sign extension. Has shift type 01.
ASR          1101    Shifts register value in Rm right by specified number of bits with sign extension. Has shift type 10.

Assignment

Your assignment is to implement a binary translator from RISC-V to ARM for a subset of RISC-V assembly instructions. The subset implemented in this lab is not Turing complete because it consists only of arithmetic and logic operators. Once you complete the next lab, which translates conditional operators, you will have a Turing-complete set of operators that could, theoretically, compute anything.

RISC-V Instructions to Translate

The following are all of the RISC-V instructions that you will need to handle in your binary translator. Additional constraints are put on some instructions to ensure simple translation to ARM instructions. In the encoding, s specifies a source register, t a target register, d a destination register and i an immediate value.

Instruction        Encoding                                 Type
ANDI  d, s, imm    iiii iiii iiii ssss s111 dddd d001 0011  I
AND   d, s, t      0000 000t tttt ssss s111 dddd d011 0011  R
ORI   d, s, imm    iiii iiii iiii ssss s110 dddd d001 0011  I
OR    d, s, t      0000 000t tttt ssss s110 dddd d011 0011  R
ADDI  d, s, imm    iiii iiii iiii ssss s000 dddd d001 0011  I
ADD   d, s, t      0000 000t tttt ssss s000 dddd d011 0011  R
SUB   d, s, t      0100 000t tttt ssss s000 dddd d011 0011  R
SRAI  d, s, imm    0100 000i iiii ssss s101 dddd d001 0011  I
SRLI  d, s, imm    0000 000i iiii ssss s101 dddd d001 0011  I
SLLI  d, s, imm    0000 000i iiii ssss s001 dddd d001 0011  I
SRA   d, s, t      0100 000t tttt ssss s101 dddd d011 0011  R
SRL   d, s, t      0000 000t tttt ssss s101 dddd d011 0011  R
SLL   d, s, t      0000 000t tttt ssss s001 dddd d011 0011  R
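The encodings in the table can be taken apart with shifts and masks. The Python sketch below extracts every field that the table uses; the function name decode_riscv and the dictionary keys are chosen for this example only. The I-type immediate and the R-type funct7/target fields overlap in bits 31-20, so the sketch returns both views and leaves it to the caller to pick the one that matches the opcode.

```python
def decode_riscv(word):
    """Split a 32-bit RISC-V instruction word into its fields (hypothetical helper)."""
    return {
        "opcode": word & 0x7F,          # bits 6-0: 0010011 for I-type, 0110011 for R-type
        "rd": (word >> 7) & 0x1F,       # d: destination register
        "funct3": (word >> 12) & 0x7,   # selects AND/OR/ADD/shift within a type
        "rs1": (word >> 15) & 0x1F,     # s: source register
        "rs2": (word >> 20) & 0x1F,     # t: target register, or the 5-bit shift amount
        "funct7": (word >> 25) & 0x7F,  # 0100000 marks SUB, SRA and SRAI
        "imm12": (word >> 20) & 0xFFF,  # i: 12-bit immediate, not yet sign-extended
    }


# addi t0, t1, 5 (x5 = x6 + 5)
fields = decode_riscv(0x00530293)
print(fields["rd"], fields["rs1"], fields["imm12"])
```

For the word 0x00530293 this prints 5 6 5, matching addi t0, t1, 5.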

RISC-V destination registers should be encoded as Rd in ARM. For all instructions except shifts, the translated source and target registers should be encoded as Rn and Rm respectively.

For shift instructions, the source register should be encoded as Rm and the target register should be encoded within the Shift field as described above.

Guarantees and Cautions

Register Translation

The ARM architecture exposes only 16 registers at a time to its instructions. A fully featured binary translator would need to recompute register allocation, which is beyond the scope of this assignment. Therefore, the translator will assume that only the RISC-V registers below appear in a valid RISC-V program.

RISC-V Register  ARM Register
t0 (x5)          R0
t1 (x6)          R1
t2 (x7)          R2
s0 (x8)          R3
s1 (x9)          R4
s2 (x18)         R5
s3 (x19)         R6
s4 (x20)         R7
s5 (x21)         R8
s6 (x22)         R9
a0 (x10)         R10
a1 (x11)         R11
a2 (x12)         R12
sp (x2)          R13
ra (x1)          R14
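The table above amounts to a lookup from RISC-V register number to ARM register number. A minimal Python sketch of that lookup follows; the names RISCV_TO_ARM and translate_register are hypothetical, and raising an error for an unmapped register is a choice made for this example, since the lab guarantees that only the listed registers appear in a valid program.

```python
# RISC-V register number (x-index) -> ARM register number, copied from the table.
RISCV_TO_ARM = {
    5: 0, 6: 1, 7: 2,                   # t0-t2 -> R0-R2
    8: 3, 9: 4,                         # s0-s1 -> R3-R4
    18: 5, 19: 6, 20: 7, 21: 8, 22: 9,  # s2-s6 -> R5-R9
    10: 10, 11: 11, 12: 12,             # a0-a2 -> R10-R12
    2: 13,                              # sp    -> R13
    1: 14,                              # ra    -> R14
}


def translate_register(riscv_reg):
    """Return the ARM register number for a RISC-V register number."""
    if riscv_reg not in RISCV_TO_ARM:
        raise ValueError(f"x{riscv_reg} has no ARM equivalent in this lab")
    return RISCV_TO_ARM[riscv_reg]


print(translate_register(18))  # s2
```

Note that the mapping is not a simple offset: s2 (x18) maps to R5, directly after s1 (x9) at R4.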

Specification

You are required to implement the following functions:

Testing

Write short RISC-V programs using the subset of instructions provided, and convert them into binary files using the following command:

rars "YOUR_RISCV_FILE" a dump .text Binary "YOUR_DESIRED_BINARY_FILE"

The provided common.s file loads a RISC-V binary from a file and generates an out.bin file after calling the functions specified above. This common.s file should be included in your arm_alu.s file. The program, starting in arm_alu.s, takes the name of the file containing the test to load as an argument. Thus, it can be run using rars arm_alu.s pa RISCV_BINARY_FILE. The submitted solution must not contain the attached common.s. It also must not contain a main function.

This assignment provides the program ARMDisassembler.s that prints ARM instructions in a textual representation.

The disassembler indicates when the status bit is set by adding an S after the instruction type (e.g. ADD S R0, R1, R2), and indicates when a non-shift data-processing register instruction has a shift by appending LL/LR/AR/RR alongside the shift amount at the very end of the instruction. Make sure to take all of this into account when analyzing the output.

The disassembler is designed to print instructions that follow the specifications. It prints question marks if no valid interpretation is possible.

You are responsible for creating test cases to ensure compliance with the assignment specification.

The disassembler can be run using:
rars ARMDisassembler.s pa out.bin

To view the bytecode contents of the generated out.bin files in a terminal, use the following command:
hexdump out.bin

Not all immediate values can be represented in a single data-processing immediate instruction. Some valid values can be found here.
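To see why some values cannot be represented, recall that in the standard ARM encoding a data-processing immediate is an 8-bit value rotated right by twice a 4-bit rotate amount. The Python sketch below searches for such a pair. It is only an illustration of the constraint: the helper names are hypothetical, and it is not necessarily how the lab's computeRotation function is specified.

```python
def rotate_right_32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def find_rotated_imm(value):
    """Return (rotate, imm8) with imm8 rotated right by 2*rotate == value, or None.

    Hypothetical helper: tries the 16 possible rotate amounts in increasing order.
    """
    value &= 0xFFFFFFFF
    for rotate in range(16):
        # Rotating left by 2*rotate undoes the rotate-right applied by the hardware.
        imm8 = rotate_right_32(value, 32 - 2 * rotate)
        if imm8 <= 0xFF:
            return rotate, imm8
    return None


print(find_rotated_imm(255))  # fits directly in 8 bits
print(find_rotated_imm(256))  # needs a rotation
print(find_rotated_imm(257))  # set bits span 9 bits, so no encoding exists
```

In this sketch 255 is encodable without rotation, 256 is encodable as the value 1 with a rotate field of 12, and 257 has no encoding because its two set bits are nine positions apart.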

Here are some test cases you can use to test your program:

RISC-V Program       RISC-V Binary          ARM Binary             ARM Text Representation
ITypes.s             ITypes.bin             ITypes.out             ITypes.txt
RTypes.s             RTypes.bin             RTypes.out             RTypes.txt
randomCalculation.s  randomCalculation.bin  randomCalculation.out  randomCalculation.txt

Check My Lab

Link to CheckMyLab

This lab is supported in CheckMyLab. To get started, navigate to the ARM-ALU lab in CheckMyLab found in the dashboard. From there, students can upload test cases in the My test cases table. Test cases are RISC-V binary files, generated as described in the Testing section. Additionally, students can upload their arm_alu.s file in the My solutions table, which will then be tested against all other valid test cases.

Resources

More information about ARM instruction set encoding can be found here.

A PDF version of the GIF embedded in this page can be found here.

Slides can be found here as a PDF and here as a PPTX.

Marking Guide

Assignments too short to be adequately judged for code quality will be given a zero. Register translation is vital for all instructions. Therefore, it is difficult for a binary translator that does not do correct register translation to pass ANY of the grading test cases. Please ensure proper register translation according to the table above.

A copy of the marksheet to be used can be found here. For the instruction translations, an incomplete set of translated instructions can still earn part marks.

Submission

There is a single file to be submitted for this lab. The file name should be arm_alu.s and it should contain only the code for the functions specified above. Make sure to not include a main function in your solution. Do not remove .include "common.s" from the top of your solution. To submit, keep the arm_alu.s file in the Code directory of your submission repo, where the latest commit from the master branch will be marked. Your solution also MUST include the CMPUT 229 Student Submission License at the top of the file containing your solution and you must include your name in the appropriate place in the license text.