Instruction Formats in RISC-V

The RISC-V base ISA defines six 32-bit instruction formats, each suited to a particular class of operations. In this article, we'll review the R, I, S, B, U, and J formats and explore how they integrate into the datapath.


Why Six Formats?

A single 32‑bit instruction must encode different fields—registers, immediates, or offsets—depending on the operation. Rather than forcing every instruction into one shape, RISC‑V provides six formats that precisely match the information required:

  1. R-Type: Register-to-register arithmetic and logic (e.g., ADD rd, rs1, rs2).

  2. I-Type: Immediate arithmetic, loads, and JALR, all using a 12-bit signed immediate (e.g., ADDI rd, rs1, imm).

  3. S-Type: Store operations with a split 12‑bit offset (e.g., SW rs2, imm(rs1)).

  4. B-Type: Conditional branches with a 13-bit signed offset whose bit 0 is always 0 and is not stored (e.g., BEQ rs1, rs2, offset).

  5. U-Type: Upper immediate instructions (e.g., LUI rd, imm or AUIPC rd, imm) that set the top 20 bits.

  6. J-Type: Unconditional jumps with link using a 21-bit signed offset whose bit 0 is always 0 and is not stored (e.g., JAL rd, offset).
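
As a rough illustration of how a decoder might pick one of these six formats, the sketch below maps the 7-bit opcode (instruction bits [6:0]) to a format letter. The table covers only the RV32I opcode groups mentioned in this article, and the names OPCODE_TO_FORMAT and instruction_format are hypothetical helpers, not part of any RISC-V tool.

```python
# Base-ISA (RV32I) major opcodes, taken from instruction bits [6:0].
# Only the opcode groups discussed in this article are listed.
OPCODE_TO_FORMAT = {
    0b0110011: "R",  # OP      (ADD, SUB, ...)
    0b0010011: "I",  # OP-IMM  (ADDI, ANDI, ...)
    0b0000011: "I",  # LOAD    (LB, LH, LW)
    0b1100111: "I",  # JALR
    0b1110011: "I",  # SYSTEM
    0b0100011: "S",  # STORE   (SB, SH, SW)
    0b1100011: "B",  # BRANCH  (BEQ, BNE, ...)
    0b0110111: "U",  # LUI
    0b0010111: "U",  # AUIPC
    0b1101111: "J",  # JAL
}


def instruction_format(instr: int) -> str:
    """Return the format letter for a 32-bit instruction word, or '?' if unlisted."""
    opcode = instr & 0x7F
    return OPCODE_TO_FORMAT.get(opcode, "?")


if __name__ == "__main__":
    # 0x002081B3 encodes add x3, x1, x2; 0x008000EF encodes jal x1, 8.
    print(instruction_format(0x002081B3), instruction_format(0x008000EF))
```

A lookup table keeps the sketch short; real control units typically express the same mapping as combinational logic on the opcode bits.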


1. R-Type: Register‑Register Operations

 31   25 24   20 19   15 14   12 11    7 6     0
┌───────┬───────┬───────┬───────┬───────┬───────┐
│funct7 │  rs2  │  rs1  │funct3 │  rd   │opcode │
└───────┴───────┴───────┴───────┴───────┴───────┘
  • Operation: Reads two source registers (rs1, rs2), applies the ALU operation selected by funct7 and funct3, and writes the result to rd.

  • Use cases: ADD, SUB, SLL, SLT, XOR, OR, AND.
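
The field layout above can be sketched as a handful of shifts and masks. This is a minimal illustration, assuming the instruction arrives as a 32-bit integer; decode_r_type is a hypothetical helper name.

```python
def decode_r_type(instr: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "funct7": (instr >> 25) & 0x7F,  # bits [31:25]
        "rs2": (instr >> 20) & 0x1F,     # bits [24:20]
        "rs1": (instr >> 15) & 0x1F,     # bits [19:15]
        "funct3": (instr >> 12) & 0x7,   # bits [14:12]
        "rd": (instr >> 7) & 0x1F,       # bits [11:7]
        "opcode": instr & 0x7F,          # bits [6:0]
    }


if __name__ == "__main__":
    # 0x002081B3 encodes add x3, x1, x2.
    print(decode_r_type(0x002081B3))
```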


2. I-Type: Immediate Operations

 31         20 19   15 14   12 11    7 6     0
┌─────────────┬───────┬───────┬───────┬───────┐
│  imm[11:0]  │  rs1  │funct3 │  rd   │opcode │
└─────────────┴───────┴───────┴───────┴───────┘
  • Operation: Extracts a 12‑bit signed immediate from bits [31:20], sign‑extends it via IMMGEN, and uses it as the second ALU operand.

  • Use cases: ADDI, ANDI, ORI, loads (LB, LH, LW), JALR, system instructions.
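
A minimal sketch of the I-type extraction and sign extension described above, assuming a 32-bit instruction word held in a Python integer. The helper names sign_extend and imm_i are illustrative only.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def imm_i(instr: int) -> int:
    """Extract the 12-bit I-type immediate from bits [31:20] and sign-extend it."""
    return sign_extend((instr >> 20) & 0xFFF, 12)


if __name__ == "__main__":
    # 0xFFF00093 encodes addi x1, x0, -1; 0x00500093 encodes addi x1, x0, 5.
    print(imm_i(0xFFF00093), imm_i(0x00500093))
```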


3. S-Type: Store Operations

 31     25 24   20 19   15 14   12 11      7 6     0
┌─────────┬───────┬───────┬───────┬─────────┬───────┐
│imm[11:5]│  rs2  │  rs1  │funct3 │imm[4:0] │opcode │
└─────────┴───────┴───────┴───────┴─────────┴───────┘
  • Operation: Combines imm[11:5] and imm[4:0] into a 12‑bit signed offset, sign‑extends it, then computes address = rs1 + offset to store rs2.

  • Use cases: SB, SH, SW (store byte, halfword, word).
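
The split offset can be reassembled as sketched below. This assumes a 32-bit address space and uses hypothetical helper names (imm_s, store_address); it illustrates the bit manipulation and is not a model of any particular core.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def imm_s(instr: int) -> int:
    """Join imm[11:5] (bits [31:25]) and imm[4:0] (bits [11:7]), then sign-extend."""
    upper = (instr >> 25) & 0x7F
    lower = (instr >> 7) & 0x1F
    return sign_extend((upper << 5) | lower, 12)


def store_address(rs1_value: int, instr: int) -> int:
    """Compute rs1 + offset, wrapped to 32 bits (an assumption of this sketch)."""
    return (rs1_value + imm_s(instr)) & 0xFFFFFFFF


if __name__ == "__main__":
    # 0x0020A423 encodes sw x2, 8(x1); 0xFE20AE23 encodes sw x2, -4(x1).
    print(imm_s(0x0020A423), imm_s(0xFE20AE23))
    print(hex(store_address(0x1000, 0xFE20AE23)))
```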


4. B-Type: Conditional Branches

   31    30     25 24   20 19   15 14   12 11     8    7    6     0
┌───────┬─────────┬───────┬───────┬───────┬────────┬───────┬───────┐
│imm[12]│imm[10:5]│  rs2  │  rs1  │funct3 │imm[4:1]│imm[11]│opcode │
└───────┴─────────┴───────┴───────┴───────┴────────┴───────┴───────┘
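  • Operation: Assembles the offset from the scattered bits imm[12], imm[11], imm[10:5], and imm[4:1], then appends an implicit 0 as bit 0 to form a 13-bit signed offset (always a multiple of 2). The offset is sign-extended and added to the PC if the condition (selected by funct3, comparing rs1 and rs2) is true.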

  • Use cases: BEQ, BNE, BLT, BGE, etc.
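
The branch offset can be pieced together as in the sketch below. Each encoded bit is placed directly at its final position, so bit 0 of the result is always 0 and no separate shift is needed. The helper names imm_b and branch_target, and the 32-bit PC width, are assumptions for illustration.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def imm_b(instr: int) -> int:
    """Reassemble the 13-bit B-type offset; bit 0 is implicitly 0."""
    imm = (
        (((instr >> 31) & 0x1) << 12)    # imm[12]   from bit 31
        | (((instr >> 7) & 0x1) << 11)   # imm[11]   from bit 7
        | (((instr >> 25) & 0x3F) << 5)  # imm[10:5] from bits [30:25]
        | (((instr >> 8) & 0xF) << 1)    # imm[4:1]  from bits [11:8]
    )
    return sign_extend(imm, 13)


def branch_target(pc: int, instr: int, taken: bool) -> int:
    """Next PC: PC + offset if the branch is taken, otherwise PC + 4."""
    step = imm_b(instr) if taken else 4
    return (pc + step) & 0xFFFFFFFF


if __name__ == "__main__":
    # 0x00000463 encodes beq x0, x0, 8; 0xFE208EE3 encodes beq x1, x2, -4.
    print(imm_b(0x00000463), imm_b(0xFE208EE3))
```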


5. U-Type: Upper Immediate Instructions

 31         12 11    7 6     0
┌─────────────┬───────┬───────┐
│ imm[31:12]  │  rd   │opcode │
└─────────────┴───────┴───────┘
  • Operation: Places the 20-bit immediate in bits [31:12] of a 32-bit value and zeroes the lower 12 bits. LUI writes this value directly to rd; AUIPC adds it to the PC and writes the sum to rd.

  • Use cases: LUI, AUIPC.
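
A minimal sketch of the two U-type results, assuming 32-bit registers (RV32). The helper names imm_u, lui_result, and auipc_result are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def imm_u(instr: int) -> int:
    """Keep bits [31:12] in place and zero the lower 12 bits."""
    return instr & 0xFFFFF000


def lui_result(instr: int) -> int:
    """Value written to rd by LUI."""
    return imm_u(instr)


def auipc_result(pc: int, instr: int) -> int:
    """Value written to rd by AUIPC: PC plus the upper immediate, wrapped to 32 bits."""
    return (pc + imm_u(instr)) & MASK32


if __name__ == "__main__":
    # 0x123452B7 encodes lui x5, 0x12345; 0x12345297 encodes auipc x5, 0x12345.
    print(hex(lui_result(0x123452B7)), hex(auipc_result(0x1000, 0x12345297)))
```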


6. J-Type: Jump and Link

   31    30     21   20    19      12 11    7 6     0
┌───────┬─────────┬───────┬──────────┬───────┬───────┐
│imm[20]│imm[10:1]│imm[11]│imm[19:12]│  rd   │opcode │
└───────┴─────────┴───────┴──────────┴───────┴───────┘
  • Operation: Assembles imm[20], imm[19:12], imm[11], and imm[10:1], then appends an implicit 0 as bit 0 to form a 21-bit signed offset. The offset is sign-extended, PC+4 is written to rd, and the PC is set to PC + offset.

  • Use cases: JAL.
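
The jump offset can be reassembled the same way as the branch offset, with each encoded bit placed at its final position so that bit 0 stays 0. The sketch assumes a 32-bit PC, and the names imm_j and jal are illustrative.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def imm_j(instr: int) -> int:
    """Reassemble the 21-bit J-type offset; bit 0 is implicitly 0."""
    imm = (
        (((instr >> 31) & 0x1) << 20)     # imm[20]    from bit 31
        | (((instr >> 12) & 0xFF) << 12)  # imm[19:12] from bits [19:12]
        | (((instr >> 20) & 0x1) << 11)   # imm[11]    from bit 20
        | (((instr >> 21) & 0x3FF) << 1)  # imm[10:1]  from bits [30:21]
    )
    return sign_extend(imm, 21)


def jal(pc: int, instr: int) -> tuple:
    """Return (link value written to rd, new PC) for a JAL at address pc."""
    link = (pc + 4) & 0xFFFFFFFF
    target = (pc + imm_j(instr)) & 0xFFFFFFFF
    return link, target


if __name__ == "__main__":
    # 0x008000EF encodes jal x1, 8; 0xFFDFF06F encodes jal x0, -4.
    print(imm_j(0x008000EF), imm_j(0xFFDFF06F))
```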

Why Are the Immediate Bits All Over the Place?

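The J-type immediate follows the same pattern as the other formats: the sign bit stays in bit 31, imm[10:1] sits in bits [30:21] (so imm[10:5] lines up with bits [30:25], as in the I, S, and B formats), imm[11] goes in bit 20, and imm[19:12] stays in place at bits [19:12]. Because most immediate bits come from the same instruction bits regardless of format, the hardware can share much of the wiring and multiplexing logic between branches and jumps.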

If jumps instead used a brand-new contiguous field, with imm[20:1] packed into bits [31:12], the immediate generator would need an extra, wide "if J-type, use bits [31:12]" path on top of the existing B-type paths for bits [31], [30:25], [11:8], and [7].
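
To make the sharing concrete, the sketch below tabulates which instruction bit feeds each immediate bit in the I, S, B, and J formats, following the layouts shown earlier. The function names are hypothetical; the point is only that several immediate bits have a single source across all four formats, so they need no multiplexer at all.

```python
def immediate_bit_sources() -> dict:
    """Map immediate bit index -> instruction bit index, per format."""
    i_type = {k: 20 + k for k in range(0, 12)}        # imm[11:0] <- [31:20]
    s_type = {k: 7 + k for k in range(0, 5)}          # imm[4:0]  <- [11:7]
    s_type.update({k: 20 + k for k in range(5, 12)})  # imm[11:5] <- [31:25]
    b_type = {k: 7 + k for k in range(1, 5)}          # imm[4:1]  <- [11:8]
    b_type.update({k: 20 + k for k in range(5, 11)})  # imm[10:5] <- [30:25]
    b_type.update({11: 7, 12: 31})
    j_type = {k: 20 + k for k in range(1, 11)}        # imm[10:1] <- [30:21]
    j_type.update({11: 20, 20: 31})
    j_type.update({k: k for k in range(12, 20)})      # imm[19:12] <- [19:12]
    return {"I": i_type, "S": s_type, "B": b_type, "J": j_type}


def sources_for(imm_bit: int) -> dict:
    """For one immediate bit, list the instruction bit each format takes it from."""
    table = immediate_bit_sources()
    return {fmt: bits[imm_bit] for fmt, bits in table.items() if imm_bit in bits}


def sign_bit_sources() -> dict:
    """Instruction bit that supplies the top (sign) bit of each format's immediate."""
    return {fmt: bits[max(bits)] for fmt, bits in immediate_bit_sources().items()}


if __name__ == "__main__":
    for bit in (11, 10, 5, 4, 1):
        print(f"imm[{bit}] <- {sources_for(bit)}")
    print("sign bit <-", sign_bit_sources())
```

In this table, imm[10:5] always comes from bits [30:25], the low bits come from one of two places, and only imm[11] needs three candidate sources.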


Cross-Linking Concepts

  • If you recall from our IMMGEN deep-dive, that module is the “stencil press” that carves out these immediates and sign-extends them before sending them off to the ALU or PC logic.

  • And remember how the control unit picks an imm_sel? That’s its way of choosing the right factory line for each stencil.
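
As a rough behavioural model of the two points above, the sketch below treats IMMGEN as one function that takes the instruction word and an imm_sel value. Encoding imm_sel as a format letter is an assumption made here for readability; a real control unit would likely use a small binary code, and the function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def field(instr: int, hi: int, lo: int) -> int:
    """Return instruction bits [hi:lo] as an unsigned integer."""
    return (instr >> lo) & ((1 << (hi - lo + 1)) - 1)


def imm_gen(instr: int, imm_sel: str) -> int:
    """Produce the sign-extended immediate for the format named by imm_sel."""
    if imm_sel == "I":
        return sign_extend(field(instr, 31, 20), 12)
    if imm_sel == "S":
        return sign_extend((field(instr, 31, 25) << 5) | field(instr, 11, 7), 12)
    if imm_sel == "B":
        raw = ((field(instr, 31, 31) << 12) | (field(instr, 7, 7) << 11)
               | (field(instr, 30, 25) << 5) | (field(instr, 11, 8) << 1))
        return sign_extend(raw, 13)
    if imm_sel == "U":
        return sign_extend(field(instr, 31, 12) << 12, 32)
    if imm_sel == "J":
        raw = ((field(instr, 31, 31) << 20) | (field(instr, 19, 12) << 12)
               | (field(instr, 20, 20) << 11) | (field(instr, 30, 21) << 1))
        return sign_extend(raw, 21)
    raise ValueError(f"no immediate for imm_sel={imm_sel!r}")


if __name__ == "__main__":
    print(imm_gen(0xFFF00093, "I"))  # addi x1, x0, -1
    print(imm_gen(0xFE208EE3, "B"))  # beq x1, x2, -4
```

R-type instructions carry no immediate, so this sketch raises an error for any other imm_sel value rather than guessing.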


These six formats keep the datapath clean and the control logic straightforward. Each format precisely matches the fields an instruction needs—nothing more, nothing less.

Summary:

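  • rs1, rs2, and rd are always 5 bits wide because each one names a register, and the base integer ISA has 32 registers (2^5 = 32). They also sit at the same bit positions in every format that uses them. The field width is fixed by the encoding and is unrelated to register width: RV64 widens each register to 64 bits but still has 32 of them, and RV32E, which has only 16 registers, keeps the 5-bit fields.
  • funct7 appears only in the R-type format. Together with funct3, it tells the ALU control which operation to perform, for example distinguishing ADD from SUB, which share an opcode and a funct3 value. (The I-type shift instructions SLLI, SRLI, and SRAI reuse the upper immediate bits in a similar way.)
  • funct3 appears in the R, I, S, and B formats. The control logic uses it to select the specific operation within an opcode group, such as the branch condition or the load/store width.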
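
The role of funct7 and funct3 can be sketched as a small lookup, shown here for the R-type operations listed earlier. The table and the name alu_control are illustrative assumptions, not a description of a specific ALU control unit.

```python
# (funct7, funct3) -> operation, for RV32I R-type instructions named in this article.
ALU_OPS = {
    (0b0000000, 0b000): "ADD",
    (0b0100000, 0b000): "SUB",  # same funct3 as ADD; only funct7 differs
    (0b0000000, 0b001): "SLL",
    (0b0000000, 0b010): "SLT",
    (0b0000000, 0b100): "XOR",
    (0b0000000, 0b110): "OR",
    (0b0000000, 0b111): "AND",
}


def alu_control(instr: int) -> str:
    """Pick the ALU operation for an R-type instruction word."""
    funct7 = (instr >> 25) & 0x7F
    funct3 = (instr >> 12) & 0x7
    return ALU_OPS.get((funct7, funct3), "UNKNOWN")


if __name__ == "__main__":
    # 0x002081B3 encodes add x3, x1, x2; 0x402081B3 encodes sub x3, x1, x2.
    print(alu_control(0x002081B3), alu_control(0x402081B3))
```
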

The six formats at a glance:

R-type
 31   25 24   20 19   15 14   12 11    7 6     0
┌───────┬───────┬───────┬───────┬───────┬───────┐
│funct7 │  rs2  │  rs1  │funct3 │  rd   │opcode │
└───────┴───────┴───────┴───────┴───────┴───────┘

I-type
 31         20 19   15 14   12 11    7 6     0
┌─────────────┬───────┬───────┬───────┬───────┐
│  imm[11:0]  │  rs1  │funct3 │  rd   │opcode │
└─────────────┴───────┴───────┴───────┴───────┘

S-type
 31     25 24   20 19   15 14   12 11      7 6     0
┌─────────┬───────┬───────┬───────┬─────────┬───────┐
│imm[11:5]│  rs2  │  rs1  │funct3 │imm[4:0] │opcode │
└─────────┴───────┴───────┴───────┴─────────┴───────┘

B-type
   31    30     25 24   20 19   15 14   12 11     8    7    6     0
┌───────┬─────────┬───────┬───────┬───────┬────────┬───────┬───────┐
│imm[12]│imm[10:5]│  rs2  │  rs1  │funct3 │imm[4:1]│imm[11]│opcode │
└───────┴─────────┴───────┴───────┴───────┴────────┴───────┴───────┘

U-type
 31         12 11    7 6     0
┌─────────────┬───────┬───────┐
│ imm[31:12]  │  rd   │opcode │
└─────────────┴───────┴───────┘

J-type
   31    30     21   20    19      12 11    7 6     0
┌───────┬─────────┬───────┬──────────┬───────┬───────┐
│imm[20]│imm[10:1]│imm[11]│imm[19:12]│  rd   │opcode │
└───────┴─────────┴───────┴──────────┴───────┴───────┘

References:

https://www.cs.sfu.ca/~ashriram/Courses/CS295/assets/notebooks/RISCV/RISCV_CARD.pdf