This lab introduces the ARM assembly language and asks you to write a program that translates a subset of RISC-V assembly into ARM assembly. This lab is the first of two ARM translation labs. This lab focuses on the translation of ALU instructions. The next lab, Lab ARM - Control, will translate control-flow instructions.
The ARM instruction set differs from the RISC-V instruction set in several ways. It aims to cause fewer control-flow changes and pipeline flushes during the execution of the program. ARM has only 16 visible registers at any time, numbered R0 through R15. Of these, R13 is the Stack Pointer, equivalent to RISC-V's sp, R14 is the Link Register, equivalent to RISC-V's return address (ra) register, and R15 is the Program Counter. ARM also operates with 4-byte words.
The binary code for every ARM instruction begins with a 4-bit condition code that determines whether or not that instruction should execute. The binary format for a data-processing instruction, which supports ADD, SUB, OR and other ALU instructions with an immediate value, is as shown below:
Conditions (bits 31-28) allow the instruction to be executed conditionally. The purpose is to avoid branches, which can introduce delays when the outcome of the branch is predicted incorrectly. This conditional execution of instructions is called predicated execution because the condition can be seen as a predicate that controls the execution of the instruction. For the purposes of this lab, the conditions field for every instruction should be 1110, which causes every instruction to always be executed. The next lab will elaborate on the conditions field and the various related flags.
Bits 27-26 are 00 for all data-processing instructions. Bits 27-25 are part of another opcode, with 00x representing data-processing and miscellaneous instructions, where x specifies how the operands are supplied to the instruction.
Immediate (bit 25) specifies whether the operands are two registers or a register and an immediate value. A 1 specifies an immediate value, while a 0 specifies an additional register.
Opcode (bits 24-21) specifies the type of operation to perform on the two operands. Relevant opcodes are detailed later on.
Status (bit 20) determines whether the outcome of this instruction alters the Status register, which is used by control-flow instructions. Control-flow instructions will be translated in the next lab. The Status bit should be 0 for all instructions in this lab.
Rn (bits 19-16) contains the first operand for the ARM instruction.
Rd (bits 15-12) is the register that contains the output of the instruction.
Rotate (bits 11-8): The instruction uses a 32-bit immediate value, but only an 8-bit immediate field and a 4-bit rotate field are specified. To obtain the immediate, the value in the 4-bit rotate field is first shifted left by a single bit (that is, doubled), and the result is the number of bits by which the 8-bit immediate is rotated right, with wraparound, within 32 bits. Though this means that only some 32-bit immediates are possible, it allows for large powers of 2 to be more easily represented. The following GIF illustrates the rotation. In the example, the 8-bit immediate, 0x77, is shifted to the right with wraparound by the number of bits obtained from the previous calculation. That is, 0x77 is rotated right by 28 bits, which corresponds to a rotate field of 1110 (14). Rotating right by 28 bits within 32 bits is equivalent to rotating left by 4 bits, so the resulting 32-bit value, 0x00000770, will be used as the immediate in this instruction.
Immediate (bits 7-0) is an unsigned 8-bit value that is rotated by the rotate field over 32 bits with wraparound, then used as the 32-bit second operand in the operation specified by the instruction.
The data-processing immediate format explained above is used by the ARM instructions AND, OR, SUB and ADD with a single register operand and an immediate value.
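The field layout above can be checked with a small reference model. The following Python sketch is not part of the lab's provided code; the function names are hypothetical, and it only assembles the fields exactly as described (condition 1110, immediate bit 1, status bit 0).

```python
MASK32 = 0xFFFFFFFF


def rotate_right32(value, amount):
    """Rotate a 32-bit value right by `amount` bits with wraparound."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def expand_immediate(rotate, imm8):
    """Return the 32-bit operand described by the rotate and immediate fields."""
    # The 4-bit rotate field is doubled to obtain the rotation in bits.
    return rotate_right32(imm8 & 0xFF, (rotate & 0xF) * 2)


def encode_dp_immediate(opcode, rn, rd, rotate, imm8):
    """Assemble a data-processing immediate instruction (hypothetical helper)."""
    word = 0b1110 << 28            # conditions: always execute
    word |= 0b00 << 26             # bits 27-26 are 00 for data processing
    word |= 1 << 25                # immediate bit: second operand is an immediate
    word |= (opcode & 0xF) << 21   # operation
    word |= 0 << 20                # status bit is 0 in this lab
    word |= (rn & 0xF) << 16       # first operand register
    word |= (rd & 0xF) << 12       # destination register
    word |= (rotate & 0xF) << 8    # rotate field
    word |= imm8 & 0xFF            # 8-bit immediate
    return word
```

For example, `expand_immediate(0xE, 0x77)` reproduces the 0x00000770 value from the rotation example, and `hex(encode_dp_immediate(0b0100, 1, 0, 0, 1))` gives `0xe2810001` for ADD R0, R1, #1.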
An addi instruction with a negative immediate must be translated into a SUB instruction that subtracts the absolute value of that immediate.
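As a rough illustration of this rule, the sketch below sign-extends the 12-bit I-type immediate and picks the ARM operation. The helper names are hypothetical, and the sketch assumes the input word is already known to be an ADDI.

```python
def sign_extend_12(imm12):
    """Interpret a 12-bit field as a 2's-complement signed value."""
    imm12 &= 0xFFF
    return imm12 - 0x1000 if imm12 & 0x800 else imm12


def select_addi_operation(riscv_word):
    """Return (ARM mnemonic, non-negative operand) for a RISC-V ADDI word."""
    imm = sign_extend_12(riscv_word >> 20)   # immediate lives in bits 31-20
    if imm < 0:
        return "SUB", -imm                   # subtract the magnitude instead
    return "ADD", imm
```

With `addi t0, t1, -4` (0xFFC30293), `select_addi_operation` returns `("SUB", 4)`; with `addi t0, t1, 4` (0x00430293) it returns `("ADD", 4)`.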
This instruction format is for ARM data-processing instructions that use two registers as operands. The immediate bit (bit 25) is set to 0, and bits 11-0 hold the register for the second operand and a Shift field. Bits 31-12 have the same fields as data-processing immediate instructions.
Rm (bits 3-0) is the second operand for the instruction, after its content is shifted (see below).
Shift (bits 11-4) specifies the shift applied to Rm prior to the operation. When bit 4 is set to 0, the shift amount is in bits 11-7. When bit 4 is set to 1, the shift amount is in the lowest byte of the content of the register Rs, which is specified in bits 11-8. Bits 6-5 specify the shift type as shown in the Shift Types table above. Rotate right performs a rightward shift in which bits that "fall off" the right end of the register re-enter on the opposite side.
Specifying Rotate Right 0 in the shift field shifts Rm rightwards by 1 bit and places the Carry flag from the status register into bit 31. Flags are explained in Lab ARM - Control.
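The two layouts of bits 11-0 can be sketched as follows. This is a hypothetical Python model, not lab code. The type codes for LSL, LSR and ASR follow the instruction table in this document; the code 11 for rotate right is the standard ARM value and is included here as an assumption, since the Shift Types table is not reproduced in this text.

```python
# Shift type codes for bits 6-5. ROR = 11 is assumed from the ARM encoding.
SHIFT_TYPES = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}


def shift_by_immediate(shift_type, amount, rm):
    """Bits 11-0 when the shift amount is a 5-bit constant (bit 4 = 0)."""
    return (
        ((amount & 0x1F) << 7)            # shift amount in bits 11-7
        | (SHIFT_TYPES[shift_type] << 5)  # shift type in bits 6-5
        | (0 << 4)                        # bit 4 = 0: amount is an immediate
        | (rm & 0xF)                      # second operand register
    )


def shift_by_register(shift_type, rs, rm):
    """Bits 11-0 when the shift amount comes from register Rs (bit 4 = 1)."""
    return (
        ((rs & 0xF) << 8)                 # Rs in bits 11-8 (bit 7 stays 0)
        | (SHIFT_TYPES[shift_type] << 5)  # shift type in bits 6-5
        | (1 << 4)                        # bit 4 = 1: amount is in a register
        | (rm & 0xF)                      # second operand register
    )
```

An instruction that performs no shift simply uses `shift_by_immediate("LSL", 0, rm)`, which leaves bits 11-4 at zero.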
The data-processing register format is used by ARM instructions AND, OR, SUB and ADD that use two register operands as well as for all LSL, LSR and ASR instructions.
All (both I-type and R-type) RISC-V bit-shift operations should be translated into data-processing register-format instructions. The information in the Shift section describes how ARM immediate-value and register-value shifts should be encoded.
Immediate shifts in the Shift field should not be encoded with the function computeRotation specified below. Immediate shifts should be written as is because both RISC-V and ARM shift-immediate values are 5-bit fields treated as unsigned numbers.
For all bit-shift operations (i.e. LSL, LSR and ASR instructions), bits 19 to 16 must be 0. RISC-V's R-type instructions do not have an immediate value, and therefore the shift field should be 0 for all instructions that do not perform a shift. When the shift amount comes from a register, that register is specified in bits 11-8.
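Putting the shift rules together, a complete LSL, LSR or ASR word could be modeled as below. This is a sketch under the rules stated above (condition 1110, immediate bit 0, opcode 1101, bits 19-16 zero); `encode_shift` and its parameters are hypothetical names, and the register numbers are ARM register numbers.

```python
SHIFT_TYPE = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10}
SHIFT_OPCODE = 0b1101  # shared by LSL, LSR and ASR


def encode_shift(kind, rd, rm, amount=None, rs=None):
    """Encode an ARM shift; pass `amount` (immediate) or `rs` (register), not both."""
    if (amount is None) == (rs is None):
        raise ValueError("give exactly one of amount or rs")
    # Condition 1110, immediate bit 0, status bit 0, bits 19-16 left at 0.
    word = (0b1110 << 28) | (SHIFT_OPCODE << 21) | ((rd & 0xF) << 12)
    word |= SHIFT_TYPE[kind] << 5
    if rs is None:
        word |= (amount & 0x1F) << 7           # 5-bit unsigned amount, bit 4 = 0
    else:
        word |= ((rs & 0xF) << 8) | (1 << 4)   # Rs in bits 11-8, bit 4 = 1
    return word | (rm & 0xF)
```

For instance, `slli t0, t1, 3` corresponds to `encode_shift("LSL", 0, 1, amount=3)`, which is 0xE1A00181, and `sll t0, t1, t2` corresponds to `encode_shift("LSL", 0, 1, rs=2)`, which is 0xE1A00211.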
RISC-V instructions must be translated into the following ARM instructions in this lab:
Instruction | Opcode | Description |
AND | 0000 | bitwise AND of two register values or of one register value and an immediate value |
OR | 1100 | bitwise OR of two register values or of one register value and an immediate value |
ADD | 0100 | addition of two register values or of one register value and immediate value |
SUB | 0010 | subtraction of one register value from another (specifically Rn - Rm), or of an immediate value from a register value |
LSL | 1101 | shifts register value in Rm left by specified number of bits. Has shift type 00 |
LSR | 1101 | shifts register value in Rm right by specified number of bits without sign extension. Has shift type 01 |
ASR | 1101 | shifts register value in Rm right by specified number of bits with sign extension. Has shift type 10 |
Your assignment is to implement a binary translator from RISC-V to ARM for a subset of RISC-V assembly instructions. The subset implemented in this lab is not Turing complete because it consists only of arithmetic and logic operators. Once you complete the next lab, which translates conditional operators, you will have a Turing-complete set of operators that could, in theory, compute anything.
The following are all of the RISC-V instructions that you will need to handle in your binary translator. Additional constraints are put on some instructions to ensure simple translation to ARM instructions. In the encoding, s specifies a source register, t a target register, d a destination register and i an immediate value.
Instruction | Encoding | Type |
ANDI d, s, imm | iiii iiii iiii ssss s111 dddd d001 0011 | I |
AND d, s, t | 0000 000t tttt ssss s111 dddd d011 0011 | R |
ORI d, s, imm | iiii iiii iiii ssss s110 dddd d001 0011 | I |
OR d, s, t | 0000 000t tttt ssss s110 dddd d011 0011 | R |
ADDI d, s, imm | iiii iiii iiii ssss s000 dddd d001 0011 | I |
ADD d, s, t | 0000 000t tttt ssss s000 dddd d011 0011 | R |
SUB d, s, t | 0100 000t tttt ssss s000 dddd d011 0011 | R |
SRAI d, s, imm | 0100 000i iiii ssss s101 dddd d001 0011 | I |
SRLI d, s, imm | 0000 000i iiii ssss s101 dddd d001 0011 | I |
SLLI d, s, imm | 0000 000i iiii ssss s001 dddd d001 0011 | I |
SRA d, s, t | 0100 000t tttt ssss s101 dddd d011 0011 | R |
SRL d, s, t | 0000 000t tttt ssss s101 dddd d011 0011 | R |
SLL d, s, t | 0000 000t tttt ssss s001 dddd d011 0011 | R |
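The encodings in the table can be decoded with plain shifts and masks. The following Python sketch shows one way to extract the fields and name the instruction; the function and table names are hypothetical, and the lookup covers only the instructions listed in this lab.

```python
I_TYPE = 0b0010011
R_TYPE = 0b0110011


def decode_fields(word):
    """Split a 32-bit RISC-V word into the fields used by the table above."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,      # also the shift amount of I-type shifts
        "funct7": (word >> 25) & 0x7F,
        "imm12": (word >> 20) & 0xFFF,   # raw, not yet sign-extended
    }


# (opcode, funct3) -> (name when bit 30 is 0, name when bit 30 is 1).
# For ANDI, ORI and ADDI, bit 30 belongs to the immediate, so both names match.
NAMES = {
    (I_TYPE, 0b111): ("ANDI", "ANDI"),
    (I_TYPE, 0b110): ("ORI", "ORI"),
    (I_TYPE, 0b000): ("ADDI", "ADDI"),
    (I_TYPE, 0b001): ("SLLI", None),
    (I_TYPE, 0b101): ("SRLI", "SRAI"),
    (R_TYPE, 0b111): ("AND", None),
    (R_TYPE, 0b110): ("OR", None),
    (R_TYPE, 0b000): ("ADD", "SUB"),
    (R_TYPE, 0b001): ("SLL", None),
    (R_TYPE, 0b101): ("SRL", "SRA"),
}


def mnemonic(word):
    """Return the instruction name, or None if the word is outside the subset."""
    fields = decode_fields(word)
    pair = NAMES.get((fields["opcode"], fields["funct3"]))
    if pair is None:
        return None
    return pair[(word >> 30) & 1]
```

For example, `mnemonic(0x007302B3)` identifies `add t0, t1, t2` as `"ADD"`, and `mnemonic(0x40335293)` identifies `srai t0, t1, 3` as `"SRAI"`.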
RISC-V destination registers should be encoded as Rd in ARM. For all instructions but shifts, translated source and target registers should be encoded as Rn and Rm respectively. For shift instructions, the source register should be encoded as Rm and the target register should be encoded within the Shift field as described above.
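For the non-shift R-type instructions, the register placement described above can be sketched as follows. This is an illustrative Python model with hypothetical names; the arguments are assumed to be ARM register numbers that have already been translated, and the opcodes come from the ARM instruction table in this document.

```python
ARM_OPCODE = {"AND": 0b0000, "OR": 0b1100, "ADD": 0b0100, "SUB": 0b0010}


def encode_dp_register(name, d, s, t):
    """Encode `name Rd, Rn, Rm` with no shift: d -> Rd, s -> Rn, t -> Rm."""
    word = 0b1110 << 28                 # conditions: always execute
    word |= ARM_OPCODE[name] << 21      # immediate bit (25) and status bit (20) stay 0
    word |= (s & 0xF) << 16             # source register goes in Rn
    word |= (d & 0xF) << 12             # destination register goes in Rd
    word |= t & 0xF                     # target register goes in Rm, shift field is 0
    return word
```

With t0, t1 and t2 mapped to R0, R1 and R2, `add t0, t1, t2` becomes `encode_dp_register("ADD", 0, 1, 2)`, which is 0xE0810002.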
All immediate values passed to the computeRotation function have a valid ARM translation and fit into the 12 immediate bits of RISC-V I-type instructions.
In ADDI instructions the immediate value may be negative, represented in 2's-complement notation. If the immediate value of a RISC-V ADDI instruction is negative, that instruction must be translated to an ARM SUB instruction.
SRAI has a non-zero function7 (i.e. bit 30 is 1), which needs to be accounted for when extracting the immediate.
The disassembler indicates when the status bit is set by adding an S after the instruction type (e.g. ADD S R0, R1, R2). It also indicates when a non-shift data-processing register instruction has a shift by appending LL/LR/AR/RR alongside the shift amount at the very end of the instruction.
... common.s file.
If you use ecalls to help debug, make sure to remove them before submitting your solution because leaving them in may result in lost marks.
The ARM architecture exposes only 16 registers at a time to its instructions. A fully featured binary translator would need to recompute register allocation, which is beyond the scope of this assignment. Therefore, the translator will assume that only the RISC-V registers below appear in a valid RISC-V program.
RISC-V Register | ARM Register |
t0 (x5) | R0 |
t1 (x6) | R1 |
t2 (x7) | R2 |
s0 (x8) | R3 |
s1 (x9) | R4 |
s2 (x18) | R5 |
s3 (x19) | R6 |
s4 (x20) | R7 |
s5 (x21) | R8 |
s6 (x22) | R9 |
a0 (x10) | R10 |
a1 (x11) | R11 |
a2 (x12) | R12 |
sp (x2) | R13 |
ra (x1) | R14 |
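The table above amounts to a fixed lookup from RISC-V register numbers to ARM register numbers. A minimal Python model of that lookup is shown below; the names are hypothetical, and raising an error for unmapped registers is a choice made for this sketch only, since the lab assumes such registers never appear.

```python
# RISC-V register number (x-number) -> ARM register number, from the table above.
RISCV_TO_ARM = {
    5: 0, 6: 1, 7: 2,                        # t0-t2 -> R0-R2
    8: 3, 9: 4,                              # s0-s1 -> R3-R4
    18: 5, 19: 6, 20: 7, 21: 8, 22: 9,       # s2-s6 -> R5-R9
    10: 10, 11: 11, 12: 12,                  # a0-a2 -> R10-R12
    2: 13,                                   # sp    -> R13
    1: 14,                                   # ra    -> R14
}


def translate_register(riscv_reg):
    """Return the ARM register number for a RISC-V register number."""
    if riscv_reg not in RISCV_TO_ARM:
        raise ValueError(f"x{riscv_reg} has no ARM mapping in this lab")
    return RISCV_TO_ARM[riscv_reg]
```

Note that the mapping is not a simple offset: s0 and s1 (x8, x9) sit between the t and a registers in RISC-V numbering, while s2 through s6 start at x18.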
You are required to implement the following functions:
RISCVtoARM_ALU
Translates the RISC-V code found at the address in a0 into ARM code and stores that ARM code into the memory address found in a1.
Arguments:
a0: pointer to memory containing a RISC-V function. The end of the RISC-V instructions is marked by the sentinel word 0xFFFFFFFF.
a1: a pointer to pre-allocated memory where you will have to write ARM instructions.
Return value:
a0: number of bytes that the instructions generated by RISCVtoARM_ALU occupy.
translateALU
Arguments:
a0: untranslated RISC-V instruction.
Return value:
a0: translated ARM instruction.
translateRegister
Translates the RISC-V register number in a0 into the number of a corresponding ARM register.
Arguments:
a0: RISC-V register to translate.
Return value:
a0: translated ARM register.
computeRotation
Arguments:
a0: RISC-V immediate in the bottom 20 bits.
Return value:
a0: rotate in bits 11 to 8 and immediate in bits 7 to 0, with all other bits 0.
Write short RISC-V programs using the subset of instructions provided, and convert them into binary files using the following command:
rars "YOUR_RISCV_FILE" a dump .text Binary "YOUR_DESIRED_BINARY_FILE"
The provided common.s file loads a RISC-V binary from a file and generates an out.bin file after calling the functions specified above. This common.s file should be included in your arm_alu.s file. The program, starting in arm_alu.s, takes the name of the file containing the test to load as an argument. Thus, it can be run using rars arm_alu.s pa RISCV_BINARY_FILE. The submitted solution must not contain the code of common.s itself. It also must not contain a main function.
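When preparing test cases, it can help to model the contract of RISCVtoARM_ALU: translate word by word until the sentinel 0xFFFFFFFF and report the number of bytes written. The Python sketch below is only such a model. `translate_program` and `translate_one` are hypothetical names, the per-instruction translator is passed in by the caller, and the buffer is assumed to hold little-endian 32-bit words that already include the sentinel.

```python
import struct

SENTINEL = 0xFFFFFFFF


def translate_program(riscv_bytes, translate_one):
    """Translate words until the sentinel; return (ARM bytes, byte count)."""
    out = bytearray()
    # Assumes len(riscv_bytes) is a multiple of 4 and words are little-endian.
    for (word,) in struct.iter_unpack("<I", riscv_bytes):
        if word == SENTINEL:
            break                                   # sentinel is not translated
        out += struct.pack("<I", translate_one(word) & 0xFFFFFFFF)
    return bytes(out), len(out)
```

Because every instruction in this lab translates to exactly one 4-byte ARM instruction, the returned byte count is four times the number of RISC-V instructions before the sentinel.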
This assignment provides the program ARMDisassembler.s that prints ARM instructions in a textual representation.
The disassembler indicates when the status bit is set by adding an S after the instruction type (e.g. ADD S R0, R1, R2), and indicates when a non-shift data-processing register instruction has a shift by appending LL/LR/AR/RR alongside the shift amount at the very end of the instruction. Make sure to take all of this into account when analyzing the output.
The disassembler is designed to print instructions that follow the specifications. It prints question marks if no valid interpretation is possible.
You are responsible for creating test cases to ensure compliance with the assignment specification.
The disassembler can be run using:
rars ARMDisassembler.s pa out.bin
To view the bytecode contents of the generated out.bin files in a terminal, use the following command:
hexdump out.bin
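As an alternative to reading raw hexdump output, the generated words can be printed one instruction per line. The sketch below is a hypothetical helper, not part of the lab tooling; it assumes the file stores 32-bit words in little-endian byte order, which should be verified against your own out.bin.

```python
import struct


def words_from_bytes(data, byteorder="little"):
    """Split a byte string into 32-bit words, ignoring any trailing partial word."""
    fmt = "<I" if byteorder == "little" else ">I"
    usable = len(data) - len(data) % 4
    return [word for (word,) in struct.iter_unpack(fmt, data[:usable])]


def format_words(words):
    """Render each word as eight uppercase hexadecimal digits."""
    return [f"{word:08X}" for word in words]


# Possible usage, assuming out.bin exists in the current directory:
#     with open("out.bin", "rb") as handle:
#         print("\n".join(format_words(words_from_bytes(handle.read()))))
```

Each printed line can then be compared directly against a hand-assembled value such as E0810002 for ADD R0, R1, R2.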
Not all immediate values can be represented in a single data-processing immediate instruction. Some valid values can be found here.
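One way to decide whether a value is representable, and to find its rotate and immediate fields, is to try all 16 rotations. The Python sketch below illustrates that search. It is not the required computeRotation implementation: the names are hypothetical, and when several encodings exist it returns the one with the smallest rotate value, which may or may not be the choice the lab's expected output uses.

```python
MASK32 = 0xFFFFFFFF


def rotate_left32(value, amount):
    """Rotate a 32-bit value left by `amount` bits with wraparound."""
    amount %= 32
    value &= MASK32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def compute_rotation(value):
    """Return (rotate << 8) | imm8 for `value`, or None if it is not representable."""
    for rotate in range(16):
        # Undo the rotate-right by 2 * rotate: rotating left recovers imm8.
        imm8 = rotate_left32(value, 2 * rotate)
        if imm8 <= 0xFF:
            return (rotate << 8) | imm8
    return None
```

For the value from the rotation example, `compute_rotation(0x770)` returns 0xE77 (rotate 1110, immediate 0x77), while a value such as 0x101 returns `None` because its set bits cannot fit in an 8-bit window under any even rotation.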
Here are some test cases you can use to test your program:
RISC-V Program | RISC-V Binary | ARM Binary | ARM Text Representation |
ITypes.s | ITypes.bin | ITypes.out | ITypes.txt |
RTypes.s | RTypes.bin | RTypes.out | RTypes.txt |
randomCalculation.s | randomCalculation.bin | randomCalculation.out | randomCalculation.txt |
This lab is supported in CheckMyLab. To get started, navigate to the ARM-ALU lab in CheckMyLab found in the dashboard. From there, students can upload test cases in the My test cases table. Test cases are RISC-V binary files, generated as described in the Testing section. Additionally, students can upload their arm_alu.s file in the My solutions table, which will then be tested against all other valid test cases.
More information about ARM instruction set encoding can be found here.
A PDF version of the GIF embedded in this page can be found here.
Slides can be found here as a PDF and here as a PPTX.
Assignments too short to be adequately judged for code quality will be given a zero. Register translation is vital for all instructions. Therefore, it is difficult for a binary translator that does not do correct register translation to pass ANY of the grading test cases. Please ensure proper register translation according to the table above.
A copy of the marksheet to be used can be found here. For the instruction translations, an incomplete set of translated instructions can still earn part marks.
There is a single file to be submitted for this lab. The file name should be arm_alu.s and it should contain only the code for the functions specified above. Make sure to not include a main function in your solution. Do not remove .include "common.s" from the top of your solution. To submit, keep the arm_alu.s file in the Code directory of your submission repo, where the latest commit from the master branch will be marked. Your solution also MUST include the CMPUT 229 Student Submission License at the top of the file containing your solution and you must include your name in the appropriate place in the license text.