CMPUT 229 - Computer Organization and Architecture I
Lab 2: Caesar Cipher
Introduction
One of the earliest documented ciphers was used by Julius Caesar around 58 B.C.E. It was a substitution cipher that shifted each letter in the message by a fixed amount, the key. To anyone who did not know the key, the message would appear meaningless.
In computing, ciphers play a crucial role in ensuring secure communication and data protection. When sensitive data is transmitted over a network, it must be encrypted so that the data appears unintelligible to those who do not have the key. In this lab, you will be implementing a Caesar Cipher in RISC-V.
Caesar Ciphers
A Caesar Cipher is a substitution cipher in which letters are shifted by a specified key. Each letter maps to a number (A=0, B=1, ..., Z=25). To encrypt a letter, add the key to the value of that letter. If the sum exceeds 25, it wraps around back to 0. In the example below, the string "xyz" is encrypted with key = 2.
Original String | x | y | z |
---|---|---|---|
Encrypted String | z | a | b |
Typically, Caesar Ciphers are applied only to letters and spaces. One way to add complexity to the cipher is to shift uppercase letters and lowercase letters by different keys.
For each character there are two cases:
- Letters - add the uppercase key to an uppercase letter, or the lowercase key to a lowercase letter. If the result exceeds 25, wrap around back to the start of the alphabet.
- Spaces - leave the character as is.
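As a worked instance of the wrap-around rule, the encryption of a single letter can be written as a formula. The symbols x (the letter's value, 0 to 25) and k (the key that applies to the letter's case) are notation introduced only for this illustration.

```latex
\[
  E(x) = (x + k) \bmod 26, \qquad x \in \{0, \dots, 25\}
\]
% Example from the table above: 'y' has value 24 and the key is 2.
\[
  E(24) = (24 + 2) \bmod 26 = 0 \quad \text{(the letter a)}
\]
```

Because the sum is reduced modulo 26, a key larger than 25 behaves like its remainder: a key of 28 shifts letters exactly as a key of 2 does.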
Assignment
Write a RISC-V assembly function named `caesarEncrypt` with the following specifications:

    caesarEncrypt:
        This function encrypts a string using a Caesar Cipher with both uppercase and lowercase keys.
        Arguments:
            a0: pointer to a string to encrypt
            a1: uppercase key, represented by a positive integer value
            a2: lowercase key, represented by a positive integer value
        Returns:
            a0: pointer to newly allocated memory that contains the encrypted string

Once `caesarEncrypt` encrypts the string, it must return the pointer to the encrypted string in `a0`.
Writing a Solution
Notice the directive `.include "common.s"` in the solution template file provided. This directive causes RARS to include the code from the file `common.s`, which runs before any code in the solution.
The code in `common.s` reads the file named in the program arguments, initializes the arguments for your function, and then calls your `caesarEncrypt` function. Read the code in the `common.s` file to understand how the whole program works. A solution must begin with the label `caesarEncrypt:`.
In RISC-V assembly, every function must end with a return instruction. The following are examples of valid return instructions: `ret`, `jr ra`, and `jalr zero, 0(ra)`.
You must include one of these instructions at the end of your function so that it returns to the main program in `common.s`, which displays the result.
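The call-and-return mechanism can be tried in isolation with a minimal standalone RARS sketch. It is not part of the lab template and does not use `common.s`; the function name `addOne` and the value 41 are hypothetical choices for this illustration.

```asm
# Standalone sketch: call a function and return from it.
# Assumes RARS starts execution at the first instruction in .text.
.text
    li   a0, 41          # argument for addOne (hypothetical value)
    jal  ra, addOne      # call: ra receives the address of the next instruction
    li   a7, 1           # RARS system call 1: print the integer in a0
    ecall
    li   a7, 10          # RARS system call 10: exit the program
    ecall

addOne:                  # hypothetical example function
    addi a0, a0, 1       # the return value is left in a0
    ret                  # same effect as "jr ra" or "jalr zero, 0(ra)"
```

Without the `ret`, execution would simply continue into whatever follows the function in memory instead of going back to the caller.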
You don't need to follow register conventions in your solution for this lab, but it is a good idea to be familiar with them for future labs.
Return values must be stored in the argument registers.
For this lab, you only need to use the register `a0` for the return value.
Strings and Characters
This section provides more details on how to work with strings and characters.

Each character is stored in memory as a number that represents the character. The American Standard Code for Information Interchange (ASCII) is the most common character encoding format. In ASCII, characters are stored as 1-byte numbers. See the ASCII table for how characters map to numbers. Click here for more information about ASCII characters.
A string is an array of characters. The null terminator character (\0) is represented by 0 in ASCII. This terminator character must appear as the last character of any string. While traversing a string, you will know that the string has ended when the value of its current character is 0.
Below are examples of a string in memory. Notice that `t0` is a pointer to the first character in the string. The string ends when the null terminator character is reached. The second image displays the ASCII values of the characters in hexadecimal.
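The traversal described above can be sketched as a standalone RARS program that counts the characters in a string. The label names and the example string "Hi there" are hypothetical, and the sketch assumes RARS starts execution at the first instruction in `.text`.

```asm
.data
msg:    .asciz "Hi there"    # hypothetical string; .asciz appends the 0 terminator

.text
    la   t0, msg         # t0 points to the first character
    li   t1, 0           # t1 counts the characters seen so far
countLoop:
    lbu  t2, 0(t0)       # load one byte: the current character
    beqz t2, countDone   # a value of 0 is the null terminator
    addi t1, t1, 1       # one more character counted
    addi t0, t0, 1       # advance the pointer by one byte
    j    countLoop
countDone:
    mv   a0, t1          # the count does not include the terminator
    li   a7, 1           # RARS system call 1: print the integer in a0 (8 here)
    ecall
    li   a7, 10          # RARS system call 10: exit
    ecall
```

The pointer advances by 1, not 4, because each character occupies a single byte.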


In the explanation of the Caesar Cipher above the letter 'A' was mapped to the number 0. However, in ASCII 'A'=65. Your program will process ASCII codes.
Caesar Cipher in RISC-V
This section provides more details on instructions in RISC-V and features in RARS that may be helpful in completing this lab.
When working with strings, it is important to remember how characters are stored. Your function's first argument is a pointer to a string. To work with the characters in the string, you will need to load each character individually. This is where the load byte instruction comes in handy: after `lb t0, 0(t1)`, `t0` holds the sign-extended value of the byte at the memory address in `t1`. Since a byte is 8 bits but a register holds 32 bits, the instruction determines how the remaining 24 bits are filled. The `lb` instruction sign-extends the 8-bit value to a 32-bit value. The load byte unsigned instruction, `lbu`, fills the remaining bits with zeroes.
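The difference between the two loads only shows up when the top bit of the byte is set. A minimal standalone sketch, assuming a hypothetical test byte stored at the label `val`:

```asm
.data
val:    .byte -128       # hypothetical test byte; its bit pattern is 0x80

.text
    la   t1, val
    lb   t0, 0(t1)       # sign-extended: t0 = 0xFFFFFF80, which is -128
    lbu  t2, 0(t1)       # zero-extended: t2 = 0x00000080, which is 128

    mv   a0, t0
    li   a7, 1           # RARS system call 1: print integer
    ecall
    li   a0, 10          # ASCII code of the newline character
    li   a7, 11          # RARS system call 11: print character
    ecall
    mv   a0, t2
    li   a7, 1
    ecall
    li   a7, 10          # RARS system call 10: exit
    ecall
```

ASCII values range from 0 to 127, so their top bit is always 0 and both instructions load the same value for the characters in this lab.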
In RISC-V, the instruction `rem rd, rs1, rs2` sets rd to the remainder of rs1/rs2. This is equivalent to the modulo (%) operation.
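One way to combine `rem` with the ASCII offset is sketched below for a single uppercase letter. This is an illustration under assumed values (the letter 'Y' and a key of 2), not a complete solution; it does not handle lowercase letters, spaces, or strings.

```asm
.text
    li   t0, 89          # hypothetical input: ASCII code of 'Y'
    li   t1, 2           # hypothetical uppercase key
    li   t2, 26          # number of letters in the alphabet
    addi t0, t0, -65     # map 'A'..'Z' (65..90) to 0..25: t0 = 24
    add  t0, t0, t1      # add the key: t0 = 26
    rem  t0, t0, t2      # wrap around: t0 = 26 % 26 = 0
    addi t0, t0, 65      # map back to ASCII: t0 = 65, the code of 'A'

    mv   a0, t0
    li   a7, 11          # RARS system call 11: print the character in a0
    ecall
    li   a7, 10          # RARS system call 10: exit
    ecall
```

Subtracting 65 before the `rem` matters: taking the remainder of the raw ASCII code would wrap relative to 0 rather than relative to 'A'.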
Memory Allocation in RISC-V
A solution to this lab must allocate a new section of memory to contain the encrypted string. It cannot overwrite the original string. There are two types of memory allocation:
Static Allocation
Static memory allocation requires that the sizes of variables are known before execution. Static memory allocation is done using directives in RARS. Some examples of directives in RARS are the `.include` directive, which includes the `common.s` file, and the `.text` directive, which tells RARS that the subsequent lines contain RISC-V instructions. The `.data` directive indicates that the subsequent lines contain directives that reserve space for data, which is the way to allocate memory statically in a RISC-V program. For instance, the following sequence of directives statically allocates memory for a 64-byte space and for a 32-bit word:

    .data
    buffer:  .space 64
    counter: .word 1

The `.data` directive only needs to appear once for each data section in the code. An assembly program can obtain the addresses of these allocated spaces in memory using the load address pseudo-instruction. For instance:

    .text
    la t0, buffer
    la t1, counter

Static memory allocation will be necessary for future labs.
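Once an address is in a register, the statically allocated data is accessed with loads and stores whose size matches the data. The following standalone sketch reuses the `buffer` and `counter` labels from above; the operations performed on them are hypothetical and serve only as an illustration.

```asm
.data
buffer:  .space 64       # 64 bytes, initially zero
counter: .word 1         # one 32-bit word holding the value 1

.text
    la   t1, counter
    lw   t2, 0(t1)       # word-sized load: t2 = 1
    addi t2, t2, 1
    sw   t2, 0(t1)       # word-sized store: counter is now 2

    la   t0, buffer
    sb   t2, 0(t0)       # byte-sized store: first byte of buffer is now 2

    lw   a0, 0(t1)       # reload counter to show the stored value
    li   a7, 1           # RARS system call 1: print the integer in a0 (2 here)
    ecall
    li   a7, 10          # RARS system call 10: exit
    ecall
```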
Dynamic Allocation
Dynamic memory allocation does not require that the size of a variable is known in advance. This allows for flexibility when handling inputs of unknown sizes, such as a string. For this lab, you must dynamically allocate memory since the size of input strings is not specified.
In RISC-V, dynamic memory allocation is accomplished through a system call, made with the `ecall` instruction. The `ecall` instruction takes no operands; it reads its parameters from registers. The register `a7` contains the code indicating what the system call should do. To allocate memory, set `a7` to 9. The register `a0` specifies the amount of contiguous memory to allocate, in bytes. After this `ecall`, the address of the newly allocated memory will be in `a0`.
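A minimal standalone sketch of the allocation call follows. The size of 16 bytes and the two characters written into the block are hypothetical values chosen for this illustration.

```asm
.text
    li   a0, 16          # number of bytes to allocate (hypothetical size)
    li   a7, 9           # RARS system call 9: allocate heap memory
    ecall                # a0 now holds the address of the new block
    mv   t0, a0          # keep the pointer: a0 is reused by later system calls

    li   t1, 72          # ASCII code of 'H'
    sb   t1, 0(t0)
    li   t1, 105         # ASCII code of 'i'
    sb   t1, 1(t0)
    sb   zero, 2(t0)     # null terminator ends the string

    mv   a0, t0
    li   a7, 4           # RARS system call 4: print the string at address a0
    ecall
    li   a7, 10          # RARS system call 10: exit
    ecall
```

Copying the returned address out of `a0` right away is a defensive habit, since both the argument and the result of a system call pass through that register.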
A solution to this lab should first count the number of characters in the input string, then dynamically allocate that number of bytes plus 1 (to account for the null terminator), and finally write the encrypted string to the dynamically allocated space.
Testing your Lab
You are provided with sample inputs and outputs to test the solution. To perform these tests, enter the complete file path of the appropriate test input in the Program Argument box in RARS. The file path cannot contain any spaces. For instance, a path like CMPUT 229/Lab 2/test3.txt is invalid. Do not put quotation marks around the filename. Once you have correctly entered the path to the test case, you can run the program. The `common.s` file prints either the encrypted string returned by your function or an appropriate error message.
The tests provided to you are not extensive; you must create your own corner-case tests to ensure that your code is correct.
Check My Lab
This lab is supported in CheckMyLab. To get started, navigate to the CaesarCipher lab in the CheckMyLab dashboard. From there, students can upload their test cases in the My test cases table (see below). Students can also upload their `caesarencrypt.s` file in the My solutions table, which will then be tested against all other valid test cases.
Test Case Format
Test cases are plain text files ending in .txt that must be in the following format:
[text to encrypt] [uppercase key] [lowercase key]
Assumptions and Notes
- Every string contains only ASCII characters, whose values range from 0 to 127.
- All test cases only contain alphabetical characters and spaces. This means that the only possible ASCII values in the string provided are: ('\0' = 0, ' ' = 32, 'A'-'Z' = 65-90, 'a'-'z' = 97-122)
- Each character has a length of 1 byte. Each address has a length of 1 word, or 4 bytes. Each integer also has a length of 1 word. Make sure you are using the correct instructions for the data type/size.
- Key values are represented by positive integers. They will not overflow (we will not give keys that are too big to fit in a register).
- Ensure that none of the labels in the solution start with `main`.
- The encrypted string returned by your function must be null terminated.
Resources
Marking Guide
Assignments too short to be adequately judged for code quality will be given a zero for that portion of the evaluation.
- 20% for correctly shifting lowercase letters.
- 20% for correctly shifting uppercase letters.
- 20% for correctly ignoring spaces.
- 20% for additional test cases.
- 20% for code cleanliness, readability, and comments.
Submission
There is a single file to be submitted for this lab. `caesarencrypt.s` should contain the code for your solution.
- Do not add a `main` label to this file.
- Do not modify the line `.include "common.s"`.
- Keep the file `caesarencrypt.s` in the `Code` folder of the git repository.
- Push your repository to GitHub before the deadline. Just committing will not upload your code. Check online to ensure your solution is submitted.