Part 2: Instruction Representation and Logic

The previous article focused on operands: how they are stored and represented, and how operations like addition are structured.

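Now we can look more closely at the instructions themselves. In hardware they are simply high/low signals, which we can read as numbers. Each instruction in the base RISC-V ISA is 32 bits long, even on a 64-bit machine like RV64I, where it is the registers that are 64 bits wide. Within those 32 bits, fields specify the operation and which registers the data comes from and goes to.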

There are 32 registers named x0 to x31, but for readability, assemblers and simulators like RARS use aliases like a0, t0, and s1 based on their purpose (e.g., function arguments, temporaries). These are just nicknames for the same physical registers. Additionally, RARS supports pseudoinstructions like li and mv that aren’t real hardware instructions but are automatically translated into valid ones by the assembler before execution. So what you write (aliases and pseudoinstructions) gets converted into raw x registers and real instructions during assembly. That shouldn’t concern us, though, at the very low level we will be working at.

To solidify our knowledge, let’s look at an example: add x9, x20, x21. The decimal representation is:

0 | 21 | 20 | 0 | 9 | 51

Each number is a field. The first 0, the second 0, and the 51 together specify the operation (addition). 21 and 20 are the source registers, and 9 is the destination register.

It can also be represented in binary, which is in fact how the hardware holds it. In binary, it is easier to see how alike the encodings of similar operations are. Although different instructions might ask for different fields, each instruction is 32 bits long. This follows our design principle of “simplicity favors regularity.”

The binary version of an instruction is called machine language, and a full set of such instructions that make up a program is called machine code—similar to how a single line like a = b + c; is a C statement, and a whole file of such lines is considered C code.

Instructions are also often written in hex, a number system that converts easily to and from binary (one hex digit per 4 bits) and is easier to read.

Each field is given a name to easily distinguish it from others. They are:

  • opcode – the operation of the instruction

  • rd – the destination register (receives the result)

  • funct3 – an additional opcode field that selects the exact operation within an opcode (e.g., which arithmetic operation, or the access width of a load or store)

  • rs1 – the first source register

  • rs2 – the second source register

  • funct7 – an additional opcode field for extended operations

This is the basic format. R-type instructions have these fields.
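To make the field layout concrete, here is a minimal Python sketch that packs the six R-type fields into a 32-bit word. The function name encode_r_type is my own, not something from RARS or the book; the field widths (7, 5, 5, 3, 5, 7 bits) are the standard R-type layout.

```python
def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack the six R-type fields into one 32-bit instruction word."""
    # Field widths, most significant first: 7, 5, 5, 3, 5, 7 bits.
    if not (0 <= funct7 < 2**7 and 0 <= opcode < 2**7 and 0 <= funct3 < 2**3):
        raise ValueError("funct7/funct3/opcode out of range")
    if not all(0 <= reg < 32 for reg in (rs2, rs1, rd)):
        raise ValueError("register numbers must be 0..31")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


# add x9, x20, x21  ->  fields 0 | 21 | 20 | 0 | 9 | 51
word = encode_r_type(funct7=0, rs2=21, rs1=20, funct3=0, rd=9, opcode=51)
print(f"{word:032b}")   # 00000001010110100000010010110011
print(f"0x{word:08X}")  # 0x015A04B3
```

The binary line is what the hardware sees; the hex line is the same word, four bits per digit.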

Why Multiple Formats Exist

What are R-type instructions? Well, let’s first talk about why we need different types or formats of instructions. If we use the same format for a load instruction as for an add, then we could only access memory within a 31-byte range from our base register. Why is that bad? Can’t you just change the base register?

The problem is that it becomes harder to branch to or access a location farther than 31 bytes. Our address space can be as large as 2³² bytes, or even 2⁶⁴ bytes in a 64-bit system, so limiting ourselves to a 5-bit offset (which is where that “31” comes from: 2⁵ − 1) isn’t scalable.

What if we use two registers? We could add their values to form the address and load the data at that address into rd. That gives us a large addressing range, up to 2³² or 2⁶⁴, depending on the architecture.

This can be left as an open question or even a mini-research topic.

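Okay, back from our detour about load. Let’s take an example of adding a constant to a value stored in a register. If constants could only live in memory, this would take two instructions, a load to fetch the constant and then the add:

ld x9, 64(x22)      # load the constant from memory into x9
add x10, x9, x21    # x10 = constant + x21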

But what if we had an instruction formatted like this?

addi x10, x21, 9    # x10 = x21 + 9, with the constant inside the instruction

Much simpler, right? This is one of the reasons we have operations with a constant (an immediate) as an operand. In fact, instructions with a constant operand make up almost half of the arithmetic instructions executed in typical programs.
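As a rough illustration of where the constant goes, the sketch below packs an I-type instruction in Python. The helper name encode_i_type is hypothetical; the layout (12-bit immediate, rs1, funct3, rd, opcode) and the OP-IMM opcode 0010011 are the standard ones for addi.

```python
def encode_i_type(imm, rs1, funct3, rd, opcode):
    """Pack an I-type instruction: imm[11:0] | rs1 | funct3 | rd | opcode."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    imm12 = imm & 0xFFF  # two's-complement bit pattern of the immediate
    return (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


OP_IMM = 0b0010011  # opcode shared by addi, andi, ori, xori, ...

# addi x10, x21, 9
addi_word = encode_i_type(imm=9, rs1=21, funct3=0, rd=10, opcode=OP_IMM)
print(f"0x{addi_word:08X}")  # 0x009A8513
```

Compared with the 5-bit register field, the 12-bit immediate covers -2048 to 2047, which is also why loads can reach much farther than 31 bytes from their base register.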

For load, we have one destination and one source register: the source gives the address of the data, and we write to the destination register. But what if we want to move a value from a register to memory? That’s a store instruction.

For a store instruction, we still need two registers—but both are source registers. One holds the value to store, and the other gives the address. We’re not writing to any register!

Why is there no sdi (store with immediate)? Oh—it’s because we already use the immediate field for addressing :P

You can see the formats here: Instruction Formats. You’ll notice that RISC-V tries its best to keep the fields consistent—this is why the immediate field in the S-type format is split into 7 and 5 bits (this part confused me at first too).
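The 7 + 5 split is easier to believe once you pack one by hand. This is a small Python sketch under my own naming (encode_s_type and decode_s_imm are not from any library); it encodes sd x9, 64(x22) and shows that rs1 and rs2 sit at the same bit positions as in the R-type format, which is the whole point of splitting the immediate.

```python
STORE = 0b0100011  # opcode for store instructions


def encode_s_type(imm, rs2, rs1, funct3, opcode=STORE):
    """Pack an S-type instruction; the 12-bit immediate is split 7 + 5."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    imm12 = imm & 0xFFF
    imm_hi = imm12 >> 5    # imm[11:5] goes where funct7 sits in R-type
    imm_lo = imm12 & 0x1F  # imm[4:0] goes where rd sits in R-type
    return ((imm_hi << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (imm_lo << 7) | opcode)


def decode_s_imm(word):
    """Reassemble the split immediate and sign-extend it from 12 bits."""
    imm12 = ((word >> 25) << 5) | ((word >> 7) & 0x1F)
    return imm12 - 4096 if imm12 & 0x800 else imm12


# sd x9, 64(x22): store the doubleword in x9 at address x22 + 64 (funct3 = 011)
sd_word = encode_s_type(imm=64, rs2=9, rs1=22, funct3=0b011)
print(f"0x{sd_word:08X}")        # 0x049B3023
print((sd_word >> 15) & 0x1F)    # 22, rs1 in the same place as in R-type
print((sd_word >> 20) & 0x1F)    # 9, rs2 in the same place as in R-type
print(decode_s_imm(sd_word))     # 64
```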

But if the format varies, how does the CPU even know how to interpret it? That’s where the opcode field comes in. The control unit first looks at the opcode and then decides how to interpret the remaining bits.
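A toy version of that first decoding step might look like the Python sketch below. The table is deliberately partial and the names FORMAT_BY_OPCODE and instruction_format are mine; the opcode values are the standard base-ISA ones. (The book calls the branch and jump formats SB and UJ; the specification calls them B and J.)

```python
# Low 7 bits of the instruction -> format used to interpret the other 25 bits.
FORMAT_BY_OPCODE = {
    0b0110011: "R",  # register-register arithmetic (add, sub, and, sll, ...)
    0b0010011: "I",  # register-immediate arithmetic (addi, andi, slli, ...)
    0b0000011: "I",  # loads
    0b1100111: "I",  # jalr
    0b0100011: "S",  # stores
    0b1100011: "B",  # conditional branches (beq, bne, blt, ...)
    0b0110111: "U",  # lui
    0b0010111: "U",  # auipc
    0b1101111: "J",  # jal
}


def instruction_format(word):
    """Return the format letter for a 32-bit instruction word."""
    opcode = word & 0x7F  # the opcode is always the lowest 7 bits
    return FORMAT_BY_OPCODE.get(opcode, "unknown")


print(instruction_format(0x015A04B3))  # R  (add x9, x20, x21)
print(instruction_format(0x009A8513))  # I  (addi x10, x21, 9)
print(instruction_format(0x049B3023))  # S  (sd x9, 64(x22))
```

Because the opcode is in the same place in every format, this lookup can happen before anything else about the instruction is known.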

One additional note from the book: some assemblers automatically generate the correct opcode and instruction format when one operand is a constant, so just writing add can be enough. However, we’ll continue using addi to better distinguish and understand the hardware.

You can also use addi in place of subi. In fact, there is no subi in RISC-V. Just pass a negative constant: the immediate is treated as a signed number, so adding it subtracts, thanks to two’s complement representation.
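Here is a small Python sketch of why that works. The helpers to_imm12 and sign_extend are hypothetical names of mine; they model how a 12-bit immediate field stores -9 and how the hardware widens it back before adding.

```python
MASK64 = (1 << 64) - 1  # a 64-bit register holds values modulo 2**64


def to_imm12(value):
    """Bit pattern stored in a 12-bit immediate field for a signed value."""
    if not -2048 <= value <= 2047:
        raise ValueError("does not fit in 12 signed bits")
    return value & 0xFFF


def sign_extend(bits, width):
    """Interpret the low `width` bits as a two's-complement number."""
    sign_bit = 1 << (width - 1)
    return (bits ^ sign_bit) - sign_bit


field = to_imm12(-9)
print(hex(field))              # 0xff7
print(sign_extend(field, 12))  # -9

# addi x10, x21, -9 with x21 = 30 behaves like the missing "subi x10, x21, 9"
x21 = 30
x10 = (x21 + sign_extend(field, 12)) & MASK64
print(x10)                     # 21
```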

We can also see that not all instruction formats use the same fields. Some instructions use 3, 2, 1, or even 0 registers. But the design is finely balanced: other fields are used to fill out the instruction bits. This is precisely why the number of registers must be chosen carefully. Adding more registers means increasing the bit width of all register fields, which changes the entire instruction format. While there are options like 16 or 32 registers, 32 is widely regarded as a good trade-off between speed, power, and encoding space. (Remember the design principle: smaller is faster.)

We’ve been discussing formats, and it’s important to remember that instructions are just bits. This idea is central to the stored-program concept. As long as a program follows the ISA, the CPU will execute it. After all, the CPU is just a circuit: it processes high and low voltages in a predictable way. Because of this, the same binary program runs on any CPU that implements the same ISA, which is called binary compatibility.

Logical Operations

The first computers operated on full words, but it soon became apparent that we also need to operate on fields of bits within a word, and even on individual bits. The task of packing and unpacking bits was assigned to logical operations.

Shifts

Immediate shifts (slli, srli, srai) use the I-type format. The registers are 64 bits wide (in RV64I), and it wouldn’t make sense to shift a value by more than 63 positions, so only the lower 6 bits of the I-type immediate field hold the shift amount. The remaining 6 bits form a field called funct6, which together with funct3 tells the immediate shifts slli, srli, and srai apart.

The instructions sll, srl, and sra are the R-type counterparts: they take the shift amount from a register instead. One application of shifting is multiplication: shifting left by 1 bit multiplies by 2, by 2 bits multiplies by 4, and in general by i bits multiplies by 2ⁱ. This is extremely useful for converting between bits, bytes, words, and doublewords. Why do we need conversions? One reason is for branching, which we’ll discuss soon.
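To see the difference between the three shifts, here is a minimal Python model of 64-bit registers. Python integers are unbounded, so the sketch masks results to 64 bits by hand; the function names mirror the instructions but are otherwise my own.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x):
    """Read a 64-bit pattern as a two's-complement signed number."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def sll(value, shamt):
    """Shift left logical; only the low 6 bits of the amount are used."""
    return (value << (shamt & 63)) & MASK64


def srl(value, shamt):
    """Shift right logical: zeros enter from the left."""
    return (value & MASK64) >> (shamt & 63)


def sra(value, shamt):
    """Shift right arithmetic: copies of the sign bit enter from the left."""
    return (to_signed64(value) >> (shamt & 63)) & MASK64


index = 5
print(sll(index, 3))                 # 40: doubleword index 5 -> byte offset 40

minus_16 = -16 & MASK64              # 0xFFFFFFFFFFFFFFF0
print(hex(srl(minus_16, 2)))         # 0x3ffffffffffffffc
print(to_signed64(sra(minus_16, 2)))  # -4
```

The first print is the conversion mentioned above: multiplying an index by 8 (2³) turns a doubleword index into a byte offset.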

AND

AND isolates fields. When you AND a value with a pattern that has 1s only in the positions you care about, every other bit is forced to 0 and only the bits at those positions survive. This is why the pattern used with AND is traditionally called a mask. It is very useful for extracting fields from registers, instructions, and more.

OR and NOT

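Well, we actually use XOR and OR. RISC-V has no separate NOT instruction because NOT can be implemented by XORing with a value whose bits are all 1s (for example, xori with the immediate -1). XORing with 0 would leave the value unchanged.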

These logical instructions also have immediate counterparts: andi, ori, xori. Bit or field insertion/extraction can be done using combinations of andi, ori, slli, and srli.
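As a sketch of those combinations, the Python below pulls fields out of the encoding of add x9, x20, x21 (the word 0x015A04B3, fields 0 | 21 | 20 | 0 | 9 | 51), swaps in a different rd, and builds NOT out of XOR. The variable and function names are mine, chosen for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

word = 0x015A04B3  # add x9, x20, x21

# Extraction: shift the field down to bit 0, then AND with a 5-bit mask.
rs1 = (word >> 15) & 0b11111
rd = (word >> 7) & 0b11111
print(rs1, rd)  # 20 9

# Insertion: AND with an inverted mask to clear the field, then OR in the new value.
rd_mask = 0b11111 << 7
cleared = word & ~rd_mask & MASK32
patched = cleared | (10 << 7)  # now encodes add x10, x20, x21
print(f"0x{patched:08X}")      # 0x015A0533


def not64(x):
    """NOT built from XOR: flip every bit by XORing with all 1s."""
    return (x ^ MASK64) & MASK64


print(hex(not64(0)))  # 0xffffffffffffffff
print(0b1010 ^ 0)     # 10: XOR with 0 changes nothing
```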

Making Decisions

“What distinguishes a computer from a simple calculator is its ability to make decisions.”
— Love it! Using it!

Computers make decisions. Programming languages represent this with statements like if or goto. In assembly, this corresponds to branching and jumping.

Branching—specifically conditional branching—means the CPU decides whether to jump to a label based on the result of a test. This test is usually a comparison between two registers to check if they are equal or not. These instructions are called beq and bne.

    li x5, 0          # initialize counter = 0
    li x6, 5          # set loop limit = 5
loop:
    addi x5, x5, 1    # counter++
    bne x5, x6, loop  # if counter != 5, repeat

You can see that we use labels to denote the jump target. The actual machine doesn’t use labels; it uses offsets relative to the branch instruction. The assembler computes them for us, relieving the programmer from tedious calculations. This is similar to how the assembler helps with loads and stores.
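Roughly, the assembler’s bookkeeping for the loop above can be sketched in Python like this. It assumes every instruction is 4 bytes and that each li here expands to a single addi (reasonable for small constants, but an assumption about the assembler); the names program, labels, and branch_offset are mine.

```python
# The loop from above, with pseudoinstructions already expanded.
program = [
    (None,   "addi x5, x0, 0"),    # li x5, 0
    (None,   "addi x6, x0, 5"),    # li x6, 5
    ("loop", "addi x5, x5, 1"),
    (None,   "bne x5, x6, loop"),
]

INSTRUCTION_BYTES = 4

# Pass 1: record the byte address of every label.
labels = {
    label: INSTRUCTION_BYTES * position
    for position, (label, _) in enumerate(program)
    if label is not None
}


def branch_offset(branch_position, target_label):
    """Pass 2: offset = target address - address of the branch itself."""
    return labels[target_label] - INSTRUCTION_BYTES * branch_position


print(labels)                    # {'loop': 8}
print(branch_offset(3, "loop"))  # -4: jump back one instruction
```

The negative offset is what actually gets encoded in the bne instruction; the name loop disappears entirely.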

Loops use the same branch instructions, beq and bne, with labels. Branches are so central to how compilers view code that they come with their own buzzword: basic blocks. A basic block is a sequence of instructions that has no branches except possibly at the end, and no branch targets or labels except possibly at the top.

Equality is the most common condition for branching, but there are other variants too: blt (less than) and bge (greater than or equal), plus their unsigned versions bltu and bgeu. These are particularly useful for looping, such as checking whether an index is still in bounds (at least 0 and less than the array length). This is referred to as array bounds checking, which we’ll soon encounter.

There are two alternative styles for performing comparisons.

  1. One used in MIPS performs the test using a separate instruction like slt, which stores the result (1 or 0) in a register. Then, you use beq or bne to compare that register against x0 (zero). If the condition is true, the register holds 1, so bne reg, x0 branches; otherwise it does not. This simplifies the datapath but requires one extra instruction compared to using blt or bge directly.

  2. Another alternative, used in ARM, relies on flags (condition codes) set by previous instructions. Arithmetic and logical operations can update status flags like Zero (Z), Negative (N), Carry (C), and Overflow (V). Branch instructions then test these flags; for example, BEQ (branch if equal) checks the Z flag. But this creates dependencies on the flags that cause trouble for pipelining.

I’d also like to mention that in PIC18 (from a microcomputer systems course I took), we had both bge/blt-style comparisons and status flags. This gave us multiple ways to implement branching logic depending on the context.

Bounds Check

A neat trick for checking if 0 ≤ x < y is to treat the signed integers as unsigned. If x is negative, its leading bit is 1, so when interpreted as unsigned it appears very large, which automatically flags it as out of bounds. And if x ≥ y, the unsigned comparison fails as usual. So a single unsigned comparison (for example, bgeu x, y to branch when out of bounds) replaces the two signed checks.
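The trick can be checked with a few lines of Python that reinterpret values as 64-bit unsigned numbers. This is only a model of what an unsigned branch compares, with helper names of my own, and it assumes the length y is non-negative.

```python
MASK64 = (1 << 64) - 1


def as_unsigned64(x):
    """The same 64 bits, read as an unsigned number."""
    return x & MASK64


def in_bounds(index, length):
    """True when 0 <= index < length, using one unsigned comparison."""
    return as_unsigned64(index) < as_unsigned64(length)


print(as_unsigned64(-1))  # 18446744073709551615, far above any real length
print(in_bounds(0, 10))   # True
print(in_bounds(9, 10))   # True
print(in_bounds(10, 10))  # False
print(in_bounds(-1, 10))  # False
```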

Case/Switch Statement

Case or switch statements let us choose between multiple options based on a single value. The straightforward way to implement this is with a chain of if-then-else statements. However, a more efficient method uses a branch table—a list of jump addresses. To jump indirectly to an address from this table, RISC-V uses jalr, which performs an unconditional jump to the address stored in a register.
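A high-level analogue of a branch table, sketched in Python: a list of functions stands in for the table of code addresses, and calling through the list stands in for the indirect jump that jalr performs. All names here are hypothetical.

```python
def case_zero():
    return "zero"


def case_one():
    return "one"


def case_two():
    return "two"


def default_case():
    return "default"


# The "branch table": entry k says where to go when the value is k.
BRANCH_TABLE = [case_zero, case_one, case_two]


def switch(k):
    # Bounds check first, otherwise we would index outside the table.
    if not 0 <= k < len(BRANCH_TABLE):
        return default_case()
    target = BRANCH_TABLE[k]  # load the target from the table
    return target()           # indirect jump, the role jalr plays


print(switch(1))   # one
print(switch(7))   # default
```

Unlike the if-then-else chain, the cost here does not grow with the number of cases: one bounds check, one table load, one jump.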