Learning Logs

This is where I will update my daily progress and activities.

Week 10 Learning Log

Lab: CALAS, City University of Hong Kong
Week of: July 21–27, 2025

Weekly Goals

1. Project Understanding
Goal: Solidify knowledge of RISC-V assembly programming fundamentals.
Actions Taken:
- Studied RISC-V procedures, the jump-and-link instructions used for calls and returns (jal, jalr), stack management, frame pointers, and loop implementation.
- Analyzed decision-making patterns in branching (conditional jumps, loops) and memory allocation via stack frames.
Reflection: Clarified how high-level constructs (e.g., functions, loops) map to assembly instructions. Gained a clearer understanding of stack dynamics for nested procedures and variable storage.

July 27, 2025

Week 5

Week 5 Learning Log (June 16–20, 2025)

1. Objectives
Library Documentation
- Finalize docs 01–07 for RVSvKit
Repository & CI Scaffolding
- Set up the RVSvKit GitHub repo
- Add scaffolding scripts for build/test automation
Dependency Installation
- Verilator, Icarus, Dockerized Vivado
SystemVerilog Fundamentals
- Ports
- Nets vs. variables
- Tasks vs. functions

2. Daily Activities
📅 Monday, June 16
Docs 01 & 02

June 20, 2025

Week 4

Week 4 Learning Log (June 9–13, 2025)

1. Objectives
Literature Review & Blogging
- Complete Chapter 1 article on Patterson & Hennessy
- Read Chapters 1, 2, and 4 from Harris & Harris (2022)
Website Development
- Implement table styling and internal linking on Hugo site
- Redesign homepage and publish backlog articles
Professional Skill Building
- Attend TA training session
- Learn TCL scripting for tool automation

June 13, 2025

Week 3

Week 3 Learning Log (June 2–6, 2025)

1. Objectives
ALU Implementation & Testing
- Design a simple 4-bit ALU using Vivado IP Integrator arithmetic blocks (alu_bd.v)
- Verify functionality via testbench and hardware simulation
FPGA Design Flow
- Meet with Sanka to review end-to-end FPGA design flow (synthesis, implementation, place-and-route)
Reading & Blogging
- Read Patterson & Hennessy, Chapter 1 (“Computer Abstractions and Technology”)
- Write blog posts summarizing Chapter 1, divided into four thematic sections:
  - Foundations & The Eight Great Ideas
  - Inside the Machine – Abstraction Layers & Technologies
  - Performance, Power & the “Sea Change”
  - Real-World Examples & Wrap-Up
Lab Session
- Attend the Edge AIoT and Microelectronics TA session to receive training and lab components and to complete the assigned lab

2. Daily Activities
📅 Monday, June 2
4-bit ALU via IP Integrator
- Opened a new Vivado project.
- Launched IP Integrator and instantiated the Xilinx “Arithmetic & Logic” IP (configured for 4-bit width).
- Wired up inputs: operand ports A[3:0] and B[3:0], plus op[2:0] to select ADD, SUB, MULTIPLY, or DIVIDE.
- Added constant generator blocks to drive the op select input and tested all combinations.
- Generated the Block Design wrapper and created the alu_bd.v top-level module.
Testbench & Simulation
- Wrote alu_tb.v to sweep A/B values and check results for each op value.
- Ran behavioral simulation in XSim and verified:
  - ADD: correct sum and carry-out
  - SUB: correct two’s-complement difference and borrow flag
  - MULTIPLY: correct product
  - DIVIDE: correct quotient and remainder

📅 Tuesday, June 3
Reading: P&H Chapter 1, Section 1 (“Foundations & The Eight Great Ideas”)
- Covered the motivation for studying computer architecture and the shift from uniprocessor to multicore.
- Identified the eight great ideas as the book lists them:
  - Design for Moore’s Law
  - Use abstraction to simplify design
  - Make the common case fast
  - Performance via parallelism
  - Performance via pipelining
  - Performance via prediction
  - Hierarchy of memories
  - Dependability via redundancy
- Took detailed notes on how each idea recurs in modern CPU/SoC design.
Blog Drafting: Section 1
- Drafted “Foundations & The Eight Great Ideas”:
  - Motivation for performance (Moore’s Law slowing)
  - Core principles that transcend specific technologies
  - Examples: how pipelining and parallelism appear in multicore CPUs

📅 Wednesday, June 4
Meeting with Sanka – FPGA Design Flow Review
- Discussed the steps from RTL → synthesis → implementation → place-and-route → timing closure → bitstream generation.
Edge AIoT & Microelectronics TA Session
- Attended the TA training lab:
  - Received training on Fundamentals of Microelectronics and Digital Systems Design
  - Received lab components (breadboard, transistors, resistors, wiring)
  - Completed the initial lab
Reading: P&H Chapter 1, Section 2 (“Inside the Machine – Abstraction Layers & Technologies”)
- Explored the layers from high-level code down to transistors: ISA, microarchitecture, logic, circuits, devices.
- Reviewed core technologies: static CMOS, SRAM, DRAM, interconnect fabrics.
Blog Drafting: Section 2
- Drafted “Inside the Machine – Abstraction Layers & Technologies”.

📅 Thursday, June 5
Reading: P&H Chapter 1, Section 3 (“Performance, Power & the ‘Sea Change’”)
- Learned the performance metrics: CPI (cycles per instruction), instruction count, clock rate.
- Reviewed Amdahl’s Law and its implications for parallelism.
- Reviewed power constraints: the Power Wall, Dark Silicon, and energy efficiency trends.
Blog Drafting: Section 3
- Drafted “Performance, Power & the ‘Sea Change’”:
  - How to calculate CPU execution time as instruction count × CPI × clock period
  - Why single-threaded frequency scaling plateaued, necessitating multicore
  - Introduction to power-performance trade-offs and dynamic voltage/frequency scaling (DVFS)

📅 Friday, June 6
Reading: P&H Chapter 1, Section 4 (“Real-World Examples & Wrap-Up”)
- Examined the Intel Core i7 benchmark analysis: how the eight ideas appear in a commercial CPU.
- Reviewed common fallacies (e.g., “faster clock always wins”) and pitfalls (e.g., ignoring memory latency).
- Identified the five classic components of a computer:
  - Input
  - Output
  - Memory
  - Datapath
  - Control
Blog Drafting: Section 4
- Drafted “Real-World Examples & Wrap-Up”:
  - Applied principles (pipelining, caching) to Intel Core i7 data
  - Summarized the five components as a roadmap for the rest of the book
Documentation & Website Updates
- Uploaded alu_tb.v, testbench, and waveform captures to Hugo site.
- Wrote Week 3 blog posts for all four sections: Foundations, Abstractions, Performance & Power, Examples & Wrap-Up.

3. Key Learnings
4-bit ALU via IP Integrator
- Leveraged Xilinx arithmetic IP to build ADD, SUB, AND, OR, XOR operations in one block.
- Verified that IP Integrator correctly generated ports and constraints; saw how the block maps to LUTs/FFs.
FPGA Design Flow
- Understood the complete Vivado flow from RTL to bitstream, including the critical constraint and timing steps.
P&H Chapter 1 Highlights
- Foundations & Eight Great Ideas: core principles (abstraction, pipelining, parallelism, memory hierarchy, etc.) form the basis of all architectures.
- Abstraction Layers: high-level software ultimately relies on transistor-level implementations, so optimizations need to be mapped across layers.
- Performance & Power: metrics (CPI, instruction count, clock rate), Amdahl’s Law, and power-limited scaling leading to multicore designs.
- Real-World Examples: Intel Core i7 data shows lessons in pipelining, caching, and parallel thread execution; the five classic components framework guides subsequent chapters.
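The performance relation and Amdahl’s Law from the Thursday notes can be written out as a short worked example. The symbols are assumptions chosen here for illustration: IC is the instruction count, T_clk the clock period, f_clk the clock rate, p the fraction of execution time that benefits from an enhancement, and s the speedup of that fraction. The numbers are hypothetical, not measurements from the log.

```latex
\begin{align*}
\text{CPU time} &= \text{IC} \times \text{CPI} \times T_{\text{clk}}
                 = \frac{\text{IC} \times \text{CPI}}{f_{\text{clk}}} \\[4pt]
\text{Example:}\quad \text{CPU time}
                &= \frac{10^{9} \times 2}{2 \times 10^{9}\,\text{Hz}} = 1\,\text{s} \\[4pt]
\text{Speedup}_{\text{Amdahl}} &= \frac{1}{(1 - p) + p/s} \\[4pt]
\text{Example:}\quad \text{Speedup}
                &= \frac{1}{(1 - 0.8) + 0.8/4} = \frac{1}{0.4} = 2.5
\end{align*}
```

In the second example, speeding up 80% of the work by 4× gives only a 2.5× overall gain, because the untouched 20% dominates the remaining time.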

June 6, 2025

Week 2

Week 2 Learning Log (May 26–30, 2025)

1. Objectives
HDL Implementation & Simulation
- Implement and simulate the following combinational primitives:
  - 4-bit Subtractor (subtractor4.v)
  - 2-to-1 Multiplexer (mux2to1.v)
  - 4-to-1 Multiplexer (mux4to1.v)
Toolflow Practice
- Create testbenches to verify each module in XSim
- Synthesize all Week 2 designs in Vivado and analyze LUT/CLB usage
Documentation & Blogging
- Draft blog posts on subtractor design (two’s-complement) and multiplexer architectures
Reading (books acquired Friday, May 30)
- Start reading the Harris & Harris and Patterson & Hennessy chapters on combinational components

2. Daily Activities
📅 Monday, May 26
4-bit Subtractor (subtractor4.v)
- Designed a two’s-complement subtractor by inverting the B inputs and feeding a carry-in of 1 to a 4-bit adder.
- Wrote subtractor4.v:

    module subtractor4 (
      input  [3:0] A,
      input  [3:0] B,
      output [3:0] D,
      output       BorrowOut
    );
      wire [3:0] B_inv;
      wire       carry_in = 1'b1;
      wire       carry_out;

      assign B_inv = ~B;  // bitwise invert B

      // reuse prop_adder for A + (~B) + 1
      prop_adder adder_inst (
        .A   (A),
        .B   (B_inv),
        .CI  (carry_in),
        .SUM (D),
        .CO  (carry_out)
      );

      // carry_out is 1 when A >= B, so a borrow is its complement
      assign BorrowOut = ~carry_out;
    endmodule

- Created testbench subtractor4_tb.v applying A,B pairs: (4'b0101 - 4'b0011 = 2), (4'b0010 - 4'b0100 = -2), (4'b1000 - 4'b1000 = 0), etc.
- Simulated in XSim; confirmed correct 4-bit difference and borrow flag.

📅 Tuesday, May 27
2-to-1 Multiplexer (mux2to1.v)
- Wrote mux2to1.v to select between two 8-bit inputs for practice:

    module mux2to1 #(
      parameter WIDTH = 8
    )(
      input  [WIDTH-1:0] D0,
      input  [WIDTH-1:0] D1,
      input              SEL,
      output [WIDTH-1:0] Y
    );
      assign Y = SEL ? D1 : D0;
    endmodule

- Created mux2to1_tb.v to test all SEL / data combinations (e.g., D0=8'hAA, D1=8'h55).
- Ran XSim behavioral simulation; verified correct output switching.

📅 Wednesday, May 28
Meeting with Sanka & Verilog Syntax Review
- Met with Sanka to go over Verilog syntax nuances: module definitions, always blocks, non-blocking vs. blocking assignments, and best practices for naming conventions.
4-to-1 Multiplexer (mux4to1.v)
- Extended the multiplexer logic to four inputs:

    module mux4to1 #(
      parameter WIDTH = 8
    )(
      input      [WIDTH-1:0] D0,
      input      [WIDTH-1:0] D1,
      input      [WIDTH-1:0] D2,
      input      [WIDTH-1:0] D3,
      input      [1:0]       SEL,
      output reg [WIDTH-1:0] Y   // reg: assigned inside an always block
    );
      always @(*) begin
        case (SEL)
          2'b00: Y = D0;
          2'b01: Y = D1;
          2'b10: Y = D2;
          2'b11: Y = D3;
        endcase
      end
    endmodule

- Wrote mux4to1_tb.v to exercise SEL = 00, 01, 10, 11 with distinct patterns on D0–D3.
- Simulated to confirm correct selection and no glitches.

📅 Thursday, May 29
Synthesis & Resource Analysis
- Added subtractor4.v, mux2to1.v, and mux4to1.v to a Vivado project.
- Ran synthesis for each module:
  - Subtractor4 → used ~5 LUTs (one for each of the 4 inverted bits and one for adder instrumentation).
  - Mux2to1 (8-bit) → 8 LUTs (one per bit).
  - Mux4to1 (8-bit) → ~16 LUTs (2:1 trees or equivalent).
- Reviewed Utilization Reports and CLB Mapping to understand LUT distribution and routing overhead.
Blog Writing
- Drafted a post, “Implementing a 4-bit Two’s-Complement Subtractor”, covering:
  - Two’s-complement basics (invert + add 1).
  - Verilog implementation leveraging the existing prop_adder.
  - Simulation results and borrow-out interpretation.
- Drafted a post, “Multiplexer Architectures in FPGA”, covering:
  - 2:1 vs. 4:1 multiplexer logic.
  - LUT-based implementation and resource considerations.
  - Simulation snapshots illustrating glitch-free switching.

📅 Friday, May 30
Book Access & Reading
- Received Digital Design & Computer Architecture (Harris & Harris) and Computer Organization & Design (Patterson & Hennessy).
- Read Harris & Harris, Ch 2 (Sect 2.4 “Adders and Subtractors”) to reinforce subtractor theory and comparator design.
- Read Harris & Harris, Ch 3 (Sect 3.2 “Multiplexers and Demultiplexers”) for mux implementation details.
- Read Patterson & Hennessy, Ch 3 (Sect 3.3 “Subtracters and Extensions”) and Ch 2 (Sect 2.2 “R-Type ALU Operations”) for context on how subtractors map to ALU control signals.
- Made notes on best practices for coding subtractors/multiplexers in Verilog and FPGA-friendly optimizations.

3. Key Learnings
Two’s-Complement Subtraction
- Implemented as A + (~B) + 1; the borrow-out is the complement of the final carry-out (carry-out = 1 means no borrow).
- Reusing a ripple-carry adder greatly simplifies subtractor design.
Multiplexer Implementation
- 2-to-1 Mux: single LUT per bit when width = 1; for WIDTH > 1, replicate per bit.
- 4-to-1 Mux: often built as two cascaded 2:1 stages → higher LUT count; careful case coding avoids glitches.
Verilog Syntax Refinement
- Distinction between blocking (=) and non-blocking (<=) assignments: blocking for combinational always blocks, non-blocking for clocked sequential logic.
- Best practices: use clear module port lists, consistent indentation, and meaningful signal names.
- Importance of always @(*) for purely combinational case statements, with the assigned output declared as reg.
Resource Utilization
- Subtractor4 consumed ~5 LUTs + routing.
- Mux2to1 (8-bit) used 8 LUTs; Mux4to1 (8-bit) used ~16 LUTs.
- Vivado’s utilization reports help anticipate resource requirements for larger datapaths.
Reading Insights
- Harris & Harris Ch 2–3 emphasize building subtractors via inverter + adder and show LUT-based mux implementations.
- Patterson & Hennessy clarify how ALU control signals select between operations (ADD vs. SUB, etc.), reinforcing Week 3 FSM control concepts.
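To make the borrow convention concrete: in A + (~B) + 1 the adder’s carry-out is 1 exactly when A >= B, so this sketch defines BorrowOut as the inverted carry-out. The self-checking testbench below runs the three vectors mentioned in the log and then sweeps all 256 input pairs. It is a minimal sketch, not the original subtractor4_tb.v: the one-line behavioral prop_adder is a stand-in assumption so the example runs on its own, and the check task and error counter are illustrative names.

```verilog
// Behavioral stand-in for prop_adder (assumption: same ports as the log's adder).
module prop_adder (input [3:0] A, B, input CI, output [3:0] SUM, output CO);
  assign {CO, SUM} = A + B + CI;
endmodule

// Subtractor with BorrowOut defined as the complement of the adder carry-out.
module subtractor4 (
  input  [3:0] A,
  input  [3:0] B,
  output [3:0] D,
  output       BorrowOut
);
  wire carry_out;
  prop_adder adder_inst (.A(A), .B(~B), .CI(1'b1), .SUM(D), .CO(carry_out));
  assign BorrowOut = ~carry_out;
endmodule

module subtractor4_tb;
  reg  [3:0] A, B;
  wire [3:0] D;
  wire       BorrowOut;
  integer    i, errors;

  subtractor4 dut (.A(A), .B(B), .D(D), .BorrowOut(BorrowOut));

  // Apply one input pair and compare against the expected difference and borrow.
  task check(input [3:0] a, input [3:0] b, input [3:0] exp_d, input exp_borrow);
    begin
      A = a;
      B = b;
      #1;  // let the combinational logic settle
      if (D !== exp_d || BorrowOut !== exp_borrow) begin
        errors = errors + 1;
        $display("FAIL: %0d - %0d gave D=%b borrow=%b", a, b, D, BorrowOut);
      end
    end
  endtask

  initial begin
    errors = 0;
    // Vectors from the log: 5 - 3 = 2, 2 - 4 = -2 (4'b1110 with borrow), 8 - 8 = 0.
    check(4'b0101, 4'b0011, 4'b0010, 1'b0);
    check(4'b0010, 4'b0100, 4'b1110, 1'b1);
    check(4'b1000, 4'b1000, 4'b0000, 1'b0);
    // Exhaustive sweep: difference modulo 16, borrow when A < B (unsigned).
    for (i = 0; i < 256; i = i + 1)
      check(i[7:4], i[3:0], i[7:4] - i[3:0], i[7:4] < i[3:0]);
    if (errors == 0) $display("PASS: subtractor4 matches the reference");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

Any Verilog-2001 simulator should accept this as a single file with subtractor4_tb as the top module.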

May 30, 2025

Week 1

Week 1 Learning Log (May 19–23, 2025)

1. Objectives
FPGA Architecture & Tools
- Understand FPGA internal architecture: CLBs (LUTs, muxes, flip-flops), on-chip SRAM/Block RAM
- Install and configure Vivado/Vitis 2022.2 on Windows
HDL Implementation & Simulation
- Implement and simulate basic arithmetic primitives:
  - Half Adder (half_adder.v)
  - Full Adder (full_adder.v)
  - 4-bit Ripple Carry Adder (prop_adder.v)
Reading & Documentation
- Read and review the Combinational Logic lectures from EE2000
- Draft concise write-ups on SRAM cell operation, bistable flip-flops, and LUT fundamentals

2. Daily Activities
📅 Monday, May 19
FPGA Architecture Overview
- Studied CLB internals: each CLB contains LUTs backed by SRAM bits, local multiplexers, and flip-flops.
- Reviewed Block RAM (SRAM-based) structure: how arrays of 6-T SRAM cells store data and provide synchronous read/write.
- Explored the programmable interconnect fabric: how CLBs interconnect via switch matrices.
Planner & Timeline
- Created a high-level internship timeline, aligning Weeks 1–12 with incremental HDL targets and chapter readings.

📅 Tuesday, May 20
Vivado/Vitis Installation
- Downloaded and ran the AMD Unified Installer 2022.2 (includes Vivado and Vitis).
- Configured the WebPACK license; confirmed license activation within Vivado.
- Set up environment variables and verified vivado -version and vitis -version on Windows.

📅 Wednesday, May 21
Half Adder Implementation
- Wrote half_adder.v:

    module half_adder (input a, b, output sum, carry);
      assign sum   = a ^ b;
      assign carry = a & b;
    endmodule

- Created testbench Half_Adder_tb.v; applied all four (a,b) combinations.
- Ran behavioral simulation in XSim; verified that the truth table matches the expected sum/carry outputs.
Reading
- Read the fundamentals of gates, combinational logic, and half-adders/full-adders.
- Read implementation details for sum = a ⊕ b, carry = a ∧ b.

📅 Thursday, May 22
Full Adder & 4-bit Ripple Carry Adder
- Imported half_adder.v into a new Vivado RTL project.
- Developed full_adder.v by cascading two half-adders plus an OR gate:

    module full_adder (input a, b, cin, output sum, cout);
      wire s1, c1, c2;
      half_adder ha0 (.a(a),  .b(b),   .sum(s1),  .carry(c1));
      half_adder ha1 (.a(s1), .b(cin), .sum(sum), .carry(c2));
      assign cout = c1 | c2;
    endmodule

- Created Full_Adder_tb.v; applied all eight (a,b,cin) vectors in XSim to verify sum/cout.
- Extended to a 4-bit ripple carry adder (prop_adder.v) by chaining four full_adder instances in Vivado’s IP Integrator.
- Simulated prop_adder.v to confirm correct 4-bit addition and carry-propagation behavior.
Reading
- Read about full-adder and ripple-carry adder architectures.
- Read about cascading full-adders for multi-bit addition.

📅 Friday, May 23
Synthesis & Resource Analysis
- Ran synthesis in Vivado for half_adder.v, full_adder.v, and prop_adder.v.
- Examined the Utilization Report:
  - Half Adder → 1 LUT, 0 FFs
  - Full Adder → 2 LUTs + 1 LUT for OR, 3 FFs
  - 4-bit CPA (carry-propagate adder, prop_adder.v) → 4 × (full adder) LUT usage + routing overhead
- Viewed CLB Mapping: traced how LUT outputs feed into adjacent CLBs to propagate carry.
Weekly Documentation
- Wrote three short blog posts:
  - SRAM Basics: detailed 6-T cell operation and how LUT/SRAM bits store truth tables.
  - Bistable Flip-Flops: explained edge-triggered D-FF operation, asynchronous reset, and how FPGA fabric implements them.
  - LUT Internals: described how LUTs map their address bits into SRAM contents to realize arbitrary Boolean functions.

3. Key Learnings
Configurable Logic Block (CLB)
- A CLB contains several small LUTs, flip-flops, and local multiplexers.
- The LUT is implemented using SRAM cells to encode up to a 4- or 6-input Boolean function.
Look-Up Tables (LUTs)
- Each LUT uses SRAM bits to store a truth table; any combination of inputs addresses that table.
- Post-synthesis, I observed per-LUT utilization metrics.
Block RAM (SRAM)
- Larger on-chip memory blocks consist of arrays of SRAM cells.
- They can be inferred from Verilog memory arrays, with the (* ram_style = "block" *) attribute steering synthesis toward Block RAM, or instantiated via IP; useful for data storage in larger designs.
Vivado Flow
- Typical flow:
  - Project setup
  - HDL source & testbench creation
  - Simulation (XSim behavioral)
  - Synthesis → Implementation (place & route)
  - Bitstream generation → Programming
- Setting up the correct .xdc constraint file is crucial before implementation.
Arithmetic Modules
- Half Adder: implemented as sum = a ^ b, carry = a & b.
- Full Adder: two half-adders cascaded + cout = c1 | c2.
- 4-bit CPA: cascades four full adders; carry-out of stage i feeds carry-in of stage i + 1.
- Resource scaling: the 4-bit CPA uses ~4× the LUT resources of a single full adder plus interconnect.

4. Additional Activities
Obsidian → Hugo Integration
- Initialize an Obsidian vault for note-taking and blog drafts.
- Configure a Hugo site locally (install Hugo, choose a theme).
- Link Obsidian’s Markdown folder to Hugo’s content/ directory so notes automatically become Hugo blog posts.
Deployment on Vercel
- Set up a GitHub repository containing the Hugo site.
- Connect the repo to Vercel for automatic deployments on commits to main.
- Verify that pushing a new Markdown file (via Obsidian sync) triggers a Hugo rebuild and Vercel deployment.
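The log describes prop_adder.v as four chained full_adder instances but does not show its source. A minimal sketch of one way to write it in plain Verilog follows, assuming the port names A, B, CI, SUM, and CO that the Week 2 subtractor uses when it instantiates the adder. The internal carry vector c, the generate loop, and the prop_adder_tb testbench are illustrative assumptions, not the original files (the log says the original was assembled in IP Integrator).

```verilog
// Half adder and full adder as given in the log.
module half_adder (input a, b, output sum, carry);
  assign sum   = a ^ b;
  assign carry = a & b;
endmodule

module full_adder (input a, b, cin, output sum, cout);
  wire s1, c1, c2;
  half_adder ha0 (.a(a),  .b(b),   .sum(s1),  .carry(c1));
  half_adder ha1 (.a(s1), .b(cin), .sum(sum), .carry(c2));
  assign cout = c1 | c2;
endmodule

// Assumed structure: carry-out of stage i feeds carry-in of stage i + 1.
module prop_adder (
  input  [3:0] A,
  input  [3:0] B,
  input        CI,
  output [3:0] SUM,
  output       CO
);
  wire [4:0] c;        // c[i] is the carry into bit i
  assign c[0] = CI;
  assign CO   = c[4];

  genvar i;
  generate
    for (i = 0; i < 4; i = i + 1) begin : stage
      full_adder fa (.a(A[i]), .b(B[i]), .cin(c[i]), .sum(SUM[i]), .cout(c[i+1]));
    end
  endgenerate
endmodule

// Self-checking testbench: compares every input combination with the + operator.
module prop_adder_tb;
  reg  [3:0] A, B;
  reg        CI;
  wire [3:0] SUM;
  wire       CO;
  reg  [4:0] expected;
  integer    i, errors;

  prop_adder dut (.A(A), .B(B), .CI(CI), .SUM(SUM), .CO(CO));

  initial begin
    errors = 0;
    for (i = 0; i < 512; i = i + 1) begin
      {CI, A, B} = i[8:0];
      expected   = A + B + CI;  // 5-bit reference result
      #1;                       // let the combinational logic settle
      if ({CO, SUM} !== expected) begin
        errors = errors + 1;
        $display("FAIL: %0d + %0d + %0d gave %0d", A, B, CI, {CO, SUM});
      end
    end
    if (errors == 0) $display("PASS: all 512 input combinations match");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

Using a generate loop rather than four hand-written instances keeps the carry chain explicit and would let the width be parameterized later, though a fixed width of 4 matches what the log describes.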

May 23, 2025

Week 6

Week 6 Learning Log (June 23–27, 2025)

1. Objectives
CI Migration & Simulation Infrastructure
- Replace Icarus with Verilator for RTL simulation and CI integration.
- Validate simulation with basic functional modules and update the Makefile accordingly.
Automation & Scripting
- Develop scripts to automate GitHub issue creation, milestone syncing, and project board updates based on weekly plans.
TA Responsibilities
- Participate in training sessions in preparation for upcoming academic support duties.

2. Daily Activities
🗓 Monday, June 23: Verilator Integration
- Rewrote simulation entry points (sim_main.cpp, sim_main_minimal.cpp) to support Verilator for both sequential and combinational modules.