Part 1: Introduction to Computer Instructions

Introduction

One thing I noticed while reading the book is how every design choice we see today—like “Why binary?”, “Why 2’s complement?”, or “Why only 32 registers?”—exists because earlier architects tried all the alternatives. What we use today is not arbitrary; it’s what proved most efficient. These design paths are rarely discussed in class, so seeing them laid out was refreshing.

The book takes a top-down approach: it starts with a language that looks like low-level code and peels back the layers until you’re working directly with machine language.

So, what are instructions? Think of them as the words a computer understands. When we want to tell a computer to do something, we give it instructions. And the full collection of these words is called an instruction set—the vocabulary of the machine.

The instruction set used in the book is RISC-V, chosen for its simplicity. Personally, I also appreciate that it’s open-source. Once you learn RISC-V, it becomes much easier to understand others like MIPS or x86. (You can check out part 6 in this folder to explore these.)

Despite differences in syntax, these ISAs are based on the same foundational ideas. Why? Because all computers are built on the same core principles—and all architects strive for the same goal:

The universal goal of hardware design: increase performance, reduce cost and energy.

This principle was echoed as early as the 1950s by Von Neumann, Burks, and Goldstine, who believed that while many abstract models exist for computation, practicality drives implementation choices.

“The simplicity of the equipment is as valuable now as it was in the 1950s.”


Operations

Out of all the operations computers perform, arithmetic is one of the most fundamental. It’s no coincidence that computers were once simply called “calculators.”

Every computer must be able to perform arithmetic. In most instruction sets—including RISC-V—arithmetic instructions follow a strict format:

operation var1, var2, var3

For example:

add a, b, c

This means: add the values in b and c, and store the result in a.

This uniform structure—an operation followed by three operands—is not arbitrary. It reflects the RISC design principle: keep the hardware simple, even if it places more burden on the compiler or programmer.
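
In actual RISC-V code the operands are registers rather than named variables. A minimal sketch (the register numbers are picked arbitrarily):

  add x5, x6, x7    # x5 = x6 + x7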

Checkpoint 1

  flowchart TB
    subgraph IC["Instruction Categories"]
        A1[Arithmetic]
    end

    subgraph DP["Design Principles"]
        D1[Simplicity favors regularity]
    end

Operands

In high-level languages, you’re used to writing expressions like x = a + b + c + d; and letting the compiler figure out where to put things. But in assembly, things are more limited. Why?

Because operands must live in registers—small, fast storage locations inside the CPU. Registers are hardware primitives, but also visible to programmers. They’re literally flip-flops wired together, but in our view, they act like variables.

So, when we say add a, b, c, all three—a, b, and c—must be registers. You can’t just add a value from memory or a constant (unless there’s a special instruction for that).
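
To make this concrete, here is how an expression like x = a + b + c + d might look once every value lives in a register (a sketch; the particular register assignments are made up):

  # assume a, b, c, d are already in x5, x6, x7, x28, and x should end up in x19
  add x19, x5, x6     # x19 = a + b
  add x19, x19, x7    # x19 = a + b + c
  add x19, x19, x28   # x19 = a + b + c + d

Every operand of every add is a register; the compiler’s job is to decide which values get to occupy them.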


Register Sizes

Most registers today are 64 bits wide. This size is called a doubleword. A word, by contrast, is 32 bits. You’ll see instructions like addw (word-size addition) vs add (full doubleword addition).
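
A quick sketch of the difference on 64-bit RISC-V (registers chosen arbitrarily):

  add  x5, x6, x7    # full 64-bit (doubleword) addition
  addw x5, x6, x7    # 32-bit (word) addition; the 32-bit result is sign-extended to 64 bits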

But here’s a constraint: you only get 32 registers.

Why just 32?

Well… remember what we said in the introduction? Every hardware choice is tried and tested.

It turns out, this isn’t just a question of “how many boxes can we have to store things in.” It’s a hardware question. Because registers aren’t just abstract variables—they are physical circuits sitting next to other parts of the CPU like the ALU. And the more registers you add, the bigger that circuit gets.

Here’s the problem: signals don’t teleport.

They have to travel through metal wires etched into the silicon, and the longer the distance, the slower it gets. More registers = more distance = longer wires = longer delay. On top of that, the decoder also has to handle more bits (5 bits to index 32 registers, 6 for 64, and so on). All of this adds up.

So when you ask the CPU to fetch rs1 and rs2 from the register file, then do something with them in the ALU, the signals have to go through this whole route. If the register file is too large, that route takes too long. And that delay becomes part of the critical path—the slowest step in the entire CPU.

And what does that mean?

It means you have to slow down the clock to give everything time to catch up.

That’s not good. Because every clock cycle is precious.

So 32 registers turns out to be a sweet spot. It’s small enough that the wires stay short and the clock stays fast. And it’s big enough that compilers can juggle a decent number of variables without spilling everything into memory.

Architects tried more. They tested 64, 128, even 256. But 32 came out as the best compromise for performance, area, and energy.

This is why RISC-V, MIPS, ARM, SPARC—all of them settled on 32.

Once again: not the maximum… but the optimal.

Memory Operands

There are complex data structures in programming languages, but we only have 32 registers. In reality, these data structures are stored in memory. Since arithmetic instructions are only performed on registers, we need to transfer data between memory and registers. To accomplish this, we use data transfer instructions.

To access data in memory, we require the memory address. After all, memory is simply a large array, and the address serves as the index to that array.

A data transfer instruction that copies data from memory to a register is called a load. In RISC-V, the format is:
ld destination_register, offset(base_register)

The sum of the base register and the constant (offset) forms the memory address. The base register provides a reference point, while the offset gives the distance from that point.
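
For example, loading one element of a doubleword array might look like this (a sketch; the registers and the index are made up):

  # assume x22 holds the base address of array A (an array of doublewords)
  ld x9, 64(x22)    # x9 = A[8]; offset = 8 elements * 8 bytes = 64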

Memory is typically byte-addressed, and virtually all modern computers use this scheme. Since a doubleword is 64 bits (8 bytes), the addresses of sequential doublewords differ by 8. In memory addressing, there’s a concept called endianness, which refers to which byte is treated as the starting address of a multi-byte word:

  • Big-endian: the leftmost (most significant) byte has the lowest address.

  • Little-endian: the rightmost (least significant) byte has the lowest address.

RISC-V is little-endian.
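
A small sketch of what that means for a doubleword in memory (the value and the address are made up):

  # the doubleword 0x0102030405060708 stored at address 1000
  # address:        1000 1001 1002 1003 1004 1005 1006 1007
  # little-endian:  08   07   06   05   04   03   02   01    (what RISC-V does)
  # big-endian:     01   02   03   04   05   06   07   08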

The opposite operation—copying data from registers to memory—is called a store. It has a similar format to load, but moves data in the other direction.
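
A sketch mirroring the earlier load (same made-up registers):

  sd x9, 64(x22)    # copy the doubleword in x9 back to memory at A[8]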

In many architectures, words (32-bit) must be stored at addresses divisible by 4, and doublewords (64-bit) at addresses divisible by 8. This requirement is known as an alignment restriction.

Constant or Immediate Operands

Constants are incredibly useful as operands.
Let’s look at a simple example: how do we perform unconditional branching?

This is a sneak peek into later topics, but in RISC-V, all branches are conditional. So how do you branch unconditionally?
You compare two values that are always equal, such as register x0 (which is hardwired to zero) with itself. Since the condition is always true, the branch always happens.

Another neat trick with constants: you can negate a number using just zero. For example, if you subtract a number from 0 (0 - x), you get -x.
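
Both tricks are a single instruction in RISC-V, thanks to register x0 always holding zero (a sketch; the label name is made up):

  beq x0, x0, target    # 0 == 0 is always true, so this always branches to target
  sub x5, x0, x6        # x5 = 0 - x6, i.e. x5 = -x6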


Why constants matter

Aside from clever tricks, constants are useful because they save instructions. Operations between a register and a constant are so common that RISC-V provides immediate versions of its instructions, with their own instruction format, just for them.

Let’s break it down with an example:
We want to add 5 to register x2.

Using only R-type instructions (which operate register-to-register), you would need:

  1. Load the constant 5 into a register (say x3)

  2. Perform add x2, x2, x3

With ADDI (add immediate), you can write just:
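
addi x2, x2, 5

One instruction instead of two, and no scratch register is tied up just to hold the constant.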

Checkpoint 2

  flowchart TB
    subgraph IC["Instruction Categories"]
        A1[Arithmetic]
        A2[Data Transfer]
    end

    subgraph DP["Design Principles"]
        D1[Simplicity favors regularity]
        D2[Smaller is faster]
    end

Signed and Unsigned Numbers

Here, we talk about number systems. I’m sure most of you are already familiar with them—almost all of my classes start here too.

One interesting thing we learned is that the first commercial computers supported decimal arithmetic. But internally, computers still used on/off signals, which naturally map to binary. So even if users thought in decimal, everything in memory was still binary.
Eventually, we just reserved decimal for input/output operations, which are relatively rare.


Signed Numbers and the Problem with Sign Bits

Next comes the question of signed numbers. A natural idea is to use the first bit as a sign bit—0 for positive, 1 for negative. This is called sign-and-magnitude representation.

But this approach had serious problems:

  • You had to know the sign ahead of time to perform addition correctly.

  • You’d need to invert negative numbers manually before adding. Imagine adding 5 + (-3) and ending up with -8 instead of 2 if the signs were handled wrong.

  • You’d end up with two versions of zero: positive zero and negative zero. Confusing, right?

These issues were painful for both hardware designers and programmers.

That’s why we moved to something much better: two’s complement.
I won’t explain it fully here, but it’s an elegant solution that simplifies the hardware and removes the ambiguity of sign bits.

Yes, it introduces one edge case: the number -2^63 doesn’t have a positive counterpart. But that’s a problem only for the careless programmer—not for hardware.


Overflow, Signed vs Unsigned, and Sign Extension

In both representations (sign-and-magnitude and two’s complement), overflow is something we have to watch out for.

Another distinction we make is between signed and unsigned integers. This matters in both load operations and arithmetic.

For loading, we need to extend the sign bit into the remaining bits of the register.
This process is called sign extension.
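
In RISC-V this shows up as separate signed and unsigned load instructions; a sketch (the address register is made up):

  lb  x5, 0(x10)    # load a byte and sign-extend it to fill the 64-bit register
  lbu x6, 0(x10)    # load a byte and zero-extend it instead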

For branching and memory addressing, we work with unsigned values.
It makes no sense to talk about a “negative memory address”. That’s why C and many other languages include unsigned integer types.


Bit Widths and Data Conversion

Another issue arises with data width.
You’ll often work with both 32-bit and 64-bit values. Since different data types have different lengths, we sometimes need to convert between them—especially for arithmetic.
Once again, sign extension comes to the rescue here.
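
For example, sign-extending a 32-bit value that sits in a 64-bit register can be written with the sext.w pseudoinstruction (a sketch; registers are arbitrary):

  sext.w x5, x6    # sign-extend the low 32 bits of x6 into x5 (expands to addiw x5, x6, 0)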

As for one’s complement, it was another idea people explored. But it required an extra step for subtraction, so we stuck with two’s complement as the better option.


This concludes Part 1 of Chapter 2.