Buzzwords

Buzzwords are terms mentioned during the lectures that are particularly important to understand thoroughly. This page tracks the buzzwords for each lecture and can be used as a reference for finding gaps in your understanding of the course material.

Lecture 1 (25.02 Thu.)

  • Computer Architecture
  • The transformation hierarchy
    • Levels of transformation
  • Design goals
  • Google tensor processing unit (TPU)
  • Tesla self-driving computer
  • Redundant cores for better safety
  • Energy efficiency and performance
  • Intel Optane persistent memory (3D-XPoint)
  • Cerebras's Wafer Scale Engine
  • Graphics processing unit (GPU)
  • UPMEM processing-in-DRAM
  • Samsung Function-in-Memory DRAM
  • Processing-in-memory
  • AI/ML chips
  • Reliability and security
  • RowHammer
  • DRAM row
  • Meltdown and Spectre
  • Genome sequencing
    • Covid-19 nanopore sequencing
  • Computing paradigms
  • Accelerators (algorithm-hardware codesign)
  • Memory and storage systems
  • Bahnhof Stadelhofen
  • Santiago Calatrava
  • Architecture
  • Tradeoffs
  • Evaluation criteria
  • Principled design
  • Design constraints

Lecture 2a (26.02 Fri.)

  • Metrics
  • Tradeoffs
  • Principled design
  • Debug
  • Critical Thinking
  • Machine Learning
  • Combinational Logic
  • Sequential Logic

Lecture 2b (26.02 Fri.)

  • Transformation hierarchy
  • Hamming distance (see the sketch after this list)
  • Error correcting codes (ECC)
  • Levels of transformation
  • Abstraction levels
  • Accelerator
  • Meltdown & Spectre
  • RAMBleed
  • DeepHammer
  • Microarchitecture
  • Security attacks
  • Hardware security vulnerabilities
  • Speculative execution
  • Instruction Set Architecture (ISA)
  • Cache
  • Timing side channel
  • RowHammer
  • RDMA
  • Deep Neural Networks
  • Refresh rate
  • Disturbance errors
  • DRAM module
  • DRAM cell
  • Bit flips
  • Page table entry (PTE)
  • PARA: Probabilistic Adjacent Row Activation
  • Byzantine failures
  • Maslow’s Hierarchy
  • Reliability
  • DDR4
  • Technology node
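
A minimal Python sketch of the Hamming distance mentioned above, i.e., the number of bit positions in which two equal-length codewords differ. The example codewords are arbitrary assumptions, not values from the lecture.

  def hamming_distance(a: int, b: int) -> int:
      """Number of bit positions in which two codewords differ."""
      return bin(a ^ b).count("1")

  # A code with minimum Hamming distance d can detect d-1 bit flips
  # and correct (d-1)//2 of them.
  assert hamming_distance(0b1011, 0b1001) == 1
  assert hamming_distance(0b0000, 0b0111) == 3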

Lecture 3a (4.03 Thu.)

  • Mysteries in Computer Architecture
  • DRAM Refresh
  • Retention Time Profile of DRAM
  • Manufacturing Process Variation
  • RAIDR: Eliminating Unnecessary DRAM Refreshes
  • Bloom Filters (see the sketch after this list)
  • VRT: Variable Retention Time
  • Memory Performance Attacks
  • Many Cores on Chip
  • Unexpected Slowdowns in Multi-Core
  • Disparity in Slowdowns
  • Memory Controller
  • DRAM Bank Operation
  • DRAM Row / Column
  • Row Buffer
  • FR-FCFS: First Ready First Come First Served
  • Row-Hit-first
  • Oldest-first
  • Denial-of-Service Attacks
  • Row Buffer Locality
  • Memory-Intensive Applications
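
A small Python sketch of a Bloom filter, the compact, approximate set structure used by RAIDR to remember which DRAM rows need frequent refresh. The hash construction and table size below are illustrative assumptions.

  import hashlib

  class BloomFilter:
      """Compact, approximate set: no false negatives, some false positives."""
      def __init__(self, m_bits: int = 1024, k_hashes: int = 3):
          self.m = m_bits
          self.k = k_hashes
          self.bits = [0] * m_bits

      def _positions(self, item: str):
          for i in range(self.k):
              h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
              yield int(h, 16) % self.m

      def insert(self, item: str):
          for p in self._positions(item):
              self.bits[p] = 1

      def might_contain(self, item: str) -> bool:
          return all(self.bits[p] for p in self._positions(item))

  weak_rows = BloomFilter()
  weak_rows.insert("row-0x1A2B")
  assert weak_rows.might_contain("row-0x1A2B")     # definitely inserted
  print(weak_rows.might_contain("row-0xFFFF"))     # False, or rarely a false positive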

Lecture 3b (4.03 Thu.)

  • Lab Sessions
  • Grading Policy
  • Deadlines for Lab Exercises and Lab Reports
  • The Transformation Hierarchy
  • Hardware Prototyping
  • Debugging a Hardware Implementation
  • Hardware Description Languages (HDL)
  • Hardware Design Flow
  • Computer-Aided Design (CAD)
  • Project Brainwave
  • Amazon EC2 F1
  • FPGA-based DNA Sequencing
  • FPGA-based DRAM Characterization
  • SoftMC
  • FPGA-based Flash Memory Characterization
  • Basys 3 FPGA Board
  • High Level Summary of Labs
  • Seven Segment Display
  • Finite State Machines
  • ALU: Arithmetic and Logic Unit
  • Testing and Simulation
  • Assembly Language
  • FPGA: Field Programmable Gate Array
  • FPGA Building Blocks
  • Look-Up Tables (LUT)
  • Switches
  • Multiplexers
  • Xilinx Zynq Ultrascale+
  • FPGA Design Flow
  • Xilinx Vivado
  • Verilog code
  • Logic Synthesis
  • Placement and Routing

Lecture 4 (05.03 Fri.)

  • Combinational logic circuits
  • Transistor
  • Moore’s Law
  • HW/SW interface
  • Boolean algebra
  • Boolean equations
  • Logic gates
  • Microprocessors
  • FPGA
  • ASIC
  • MOS transistor
  • Transistor gate/source/drain
  • Power supply, ground
  • n-type/p-type MOS transistor
  • nMOS
  • pMOS
  • Complementary MOS (CMOS)
  • Boolean inverter
  • CMOS NOT gate
  • Pull-up, pull-down
  • CMOS NAND gate
  • Truth table
  • CMOS AND gate
  • CMOS NOR gate
  • Common logic gates
  • Functional Specification
  • Timing Specification
  • Boolean Algebra Axioms
  • DeMorgan's Law (see the sketch after this list)
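
A quick Python check of DeMorgan's Law from the list above, enumerating the full truth table. This is only an illustrative verification, not lecture code.

  from itertools import product

  # DeMorgan: NOT(A AND B) == (NOT A) OR (NOT B), and dually for OR/AND.
  for a, b in product([0, 1], repeat=2):
      assert (not (a and b)) == ((not a) or (not b))
      assert (not (a or b)) == ((not a) and (not b))
      print(a, b, int(not (a and b)), int((not a) or (not b)))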

Lecture 5 (11.03 Thu.)

  • Sum of Products Form (SOP)
  • Product of Sums (POS)
  • Decoder
  • Multiplexer
  • Selector
  • Full Adder (see the sketch after this list)
  • Programmable Logic Array (PLA)
  • Comparator
  • Arithmetic Logic Unit (ALU)
  • Tri-State Buffer
  • Karnaugh Maps
  • Binary Coded Decimal (BCD)
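
A Python sketch of the full adder listed above, using the usual sum and carry-out equations, chained into a small ripple-carry adder. The 4-bit width is an assumption for illustration.

  def full_adder(a: int, b: int, cin: int):
      """1-bit full adder: sum = a XOR b XOR cin, cout = majority(a, b, cin)."""
      s = a ^ b ^ cin
      cout = (a & b) | (a & cin) | (b & cin)
      return s, cout

  def ripple_carry_add(x: int, y: int, n_bits: int = 4):
      """Chain n_bits full adders into a ripple-carry adder."""
      carry, result = 0, 0
      for i in range(n_bits):
          s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
          result |= s << i
      return result, carry

  assert ripple_carry_add(0b0101, 0b0011) == (0b1000, 0)   # 5 + 3 = 8
  assert ripple_carry_add(0b1111, 0b0001) == (0b0000, 1)   # overflow sets carry-out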

Lecture 6 (12.03 Fri.)

  • Combinational circuit
  • Sequential circuit
  • Boolean algebra
  • Latch
  • Falling edge
  • Rising edge
  • Flip-flop
  • Register
  • Finite State Machine (FSM)
  • Lookup Table (LUT)
  • State Transition Table
  • Moore Machine (see the sketch after this list)
  • Mealy Machine
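
A Python simulation of a Moore machine from the list above, where the output depends only on the current state. The example FSM (it raises its output after seeing two consecutive 1s) is an assumed toy, not taken from the lecture.

  # Moore FSM that outputs 1 after seeing two consecutive 1s on the input.
  # States: 'S0' (no 1 seen), 'S1' (one 1 seen), 'S2' (two or more 1s seen).
  next_state = {
      ('S0', 0): 'S0', ('S0', 1): 'S1',
      ('S1', 0): 'S0', ('S1', 1): 'S2',
      ('S2', 0): 'S0', ('S2', 1): 'S2',
  }
  output = {'S0': 0, 'S1': 0, 'S2': 1}   # Moore: output is a function of state only

  state = 'S0'
  for bit in [1, 1, 0, 1, 1, 1]:
      state = next_state[(state, bit)]
      print(f"input={bit} state={state} output={output[state]}")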

Lecture 7 (18.03 Thu.)

  • Hardware Description Language (HDL)
  • Verilog
  • VHDL
  • Hierarchical design
  • Modules
  • Top-down / Bottom-up design methodologies
  • Top-level module, sub-module, leaf cell
  • Bus
  • Manipulating Bits
    • Bit slicing
    • Concatenation
    • Duplication
  • Behavioral HDL
  • Structural (gate-level) description
  • Behavioral / functional description
  • Bitwise operators
  • Reduction operators
  • Conditional assignments
  • Precedence of operators
  • Tri-state buffer
  • Synthesis
  • Simulation
  • Gate-level implementation
  • Parametrized modules
  • Sequential Logic in Verilog
  • Always block
  • Sensitivity list
  • Posedge
  • D Flip-Flop
  • Blocking assignment
  • Non-blocking assignment
  • Asynchronous/Synchronous reset
  • Blocking/Non-blocking assignment
  • Case statement
  • Implementing FSM

Lecture 8 (19.03 Fri.)

  • Finite state machine (FSM)
  • Clock
  • Next state logic
  • Output logic
  • Verilog implementation of an FSM
  • Timing
  • Area
  • Speed / Throughput
  • Power / Energy
  • Design time
  • Circuit timing
  • Combinational circuit timing
  • Combinational circuit delay
  • Contamination delay
  • Propagation delay
  • Longest / Shortest path
  • Critical path
  • Glitch
  • Fixing glitches with K-map
  • Sequential circuit timing (see the sketch after this list)
  • D flip-flop
  • Setup / Hold / Aperture time
  • Metastability
  • Non-deterministic convergence
  • Contamination delay clock-to-q
  • Propagation delay clock-to-q
  • Correct sequential operation
  • Hold time constraint
  • Timing analysis
  • Clock skew
  • Safe timing
  • Circuit verification
  • High level design
  • Circuit level
  • Functional equivalence
  • Functional tests
  • Timing constraints
  • Functional verification
  • Testbench
  • Device under test (DUT)
  • Simple / Self-checking / Automatic testbench
  • Waveform diagrams
  • Clock generation
  • Golden model
  • Timing verification
  • Timing report / summary
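
A back-of-the-envelope Python sketch of the sequential circuit timing constraints listed above. The setup constraint bounds the minimum clock period via the longest combinational path, and the hold constraint must be satisfied by the shortest path; all delay values below are made-up assumptions.

  # Setup constraint:  T_clk >= t_pcq + t_pd(longest combinational path) + t_setup
  # Hold constraint:   t_ccq + t_cd(shortest combinational path) >= t_hold
  t_pcq, t_ccq = 0.30, 0.10     # clk-to-q propagation / contamination delay (ns), assumed
  t_setup, t_hold = 0.20, 0.15  # flip-flop setup / hold time (ns), assumed
  t_pd, t_cd = 2.50, 0.25       # longest / shortest combinational delay (ns), assumed

  t_clk_min = t_pcq + t_pd + t_setup
  print(f"min clock period = {t_clk_min:.2f} ns "
        f"-> max frequency = {1000 / t_clk_min:.0f} MHz")
  print("hold constraint met:", t_ccq + t_cd >= t_hold)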

Lecture 9 (25.03 Thu.)

  • von Neumann model
  • LC-3
  • MIPS
  • Assembly
  • Programming
  • Single-cycle
  • Microarchitecture
  • Multi-cycle
  • Processing
  • Memory
  • HW/SW Interface
  • Instruction Set Architecture
  • Inputs/Outputs
  • Control Unit
  • Address space
  • Byte-Addressable Memory
  • Big-Endian
  • Little-Endian (see the sketch after this list)
  • Register File
  • Fetch
  • Decode
  • Execute
  • opcode
  • ALU
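
A short Python illustration of big-endian vs. little-endian byte ordering in a byte-addressable memory, using the standard struct module. The 32-bit value is an arbitrary assumption.

  import struct

  value = 0x12345678  # a 32-bit word stored in byte-addressable memory
  big    = struct.pack(">I", value)   # big-endian: most significant byte at the lowest address
  little = struct.pack("<I", value)   # little-endian: least significant byte at the lowest address
  print(big.hex())     # 12345678
  print(little.hex())  # 78563412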

Lecture 10a (26.03 Fri.)

  • von Neumann model
  • LC-3
  • MIPS
  • Assembly
  • The Instruction Set Architecture (ISA)
  • Opcodes
  • Data Types
  • Registers
  • Immediate
  • Literal
  • Program Counter (PC)
  • Offset
  • Operate Instructions
  • Data Path
  • Data Movement Instructions
  • Addressing Modes
  • Opcode
  • PC-Relative Addressing Mode
  • Indirect Addressing Mode
  • Base+Offset Addressing Mode
  • Immediate Addressing Mode
  • Control Flow Instructions
  • Condition Codes
  • Conditional Branches
  • Complex Instructions
  • Simple Instructions
  • x86, VAX, SIMD ISAs, VLIW ISAs, PowerPC, RISC ISAs

Lecture 10b (26.03 Fri.)

  • von Neumann model
  • LC-3
  • MIPS
  • Assembly Programming
  • The Instruction Set Architecture (ISA)
  • Sequential Construct
  • Conditional Construct
  • Iterative Construct
  • TRAP Instruction
  • Debugging
  • Conditional Statements
  • Loops
  • If Statement
  • While, For Loops
  • Arrays in MIPS
  • Function Calls
  • Stack

Lecture 11 (1.04 Thu.)

  • Microarchitecture
  • Von Neumann Machine
  • Instruction Set Architecture (ISA)
  • Stored program computer
  • Sequential instruction processing
  • Unified memory
  • Instruction pointer
  • Data flow model
  • Data flow dependence
  • Data flow node
  • Control-flow execution order
  • Data-flow execution order
  • Pipeline
  • Instruction and data caches
  • General purpose registers
  • Virtual ISA
  • Single-cycle microarchitecture
  • Multi-cycle microarchitecture
  • Critical path
  • Control unit
  • Instruction Fetch
  • Instruction Decode
  • Functional units
  • Datapath
  • Control logic
  • Cycles Per Instruction (CPI)
  • Register file
  • Arithmetic Logic Unit (ALU)
  • Store writeback
  • Arithmetic and Logical instructions
  • Instruction types (R-type, I-type, J-type)
  • Multiplexer (MUX)
  • Source/destination register
  • Immediate value
  • Jump instruction
  • Conditional Branch

Lecture 12 (15.04 Thu.)

  • ALU: Arithmetic-Logic Unit
  • Single-cycle MIPS Datapath
  • Control signals
  • Datapath configuration
  • R-type, I-type, LW, SW, Branch, and Jump datapath configurations
  • Control logic
  • Hardwired control (combinational)
  • Sequential/Microprogrammed control
  • Performance analysis
  • CPI: Cycles per Instruction
  • Critical path
  • Slowest instruction
  • Execution time of an instruction / of a program (see the sketch after this list)
  • Single cycle microarchitecture complexity
  • Fetch, decode, evaluate address, fetch operands, execute, store result
  • Magic memory
  • Instruction memory and data memory
  • REP MOVS and INDEX instructions
  • Microarchitecture design principles
  • Bread and butter (common case) design and Amdahl's law
  • Balanced Design
  • Key system design principles: keep it simple, keep it low cost
  • Multi-cycle critical path
  • Multi-cycle microarchitecture
  • Multi-cycle performance
  • Overhead of register setup/hold times
  • Main controller FSM
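
A small Python sketch of the performance analysis listed above: execution time = instruction count x CPI x clock period. The instruction count, CPI values, and clock periods for the single-cycle and multi-cycle designs are illustrative assumptions, not lecture data.

  def execution_time(instr_count, cpi, clock_period_ns):
      """Execution time = #instructions x cycles per instruction x clock period."""
      return instr_count * cpi * clock_period_ns

  n = 100_000_000                                                  # assumed dynamic instruction count
  single_cycle = execution_time(n, cpi=1.0, clock_period_ns=5.0)   # clock limited by the slowest instruction
  multi_cycle  = execution_time(n, cpi=4.2, clock_period_ns=1.0)   # shorter clock, more cycles per instruction
  print(f"single-cycle: {single_cycle / 1e6:.0f} ms, multi-cycle: {multi_cycle / 1e6:.0f} ms")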

Lecture 13 (16.04 Fri.)

  • Pipelining
  • Control & data dependence handling
  • State maintenance and recovery
  • Multi-cycle design
  • Concurrency
  • Instruction throughput
  • Assembly line processing
  • Pipeline stages
  • Uniformly partitionable suboperations
  • The instruction processing cycle
  • Instruction fetch (IF)
  • Instruction decode and Register operand fetch (ID/RF)
  • Execute/Evaluate memory address (EX/AG)
  • Memory operand fetch (MEM)
  • Store/writeback result (WB)
  • Pipeline registers
  • Control points
  • Control signals
  • Pipeline stalls
  • Resource contention
  • Dependences (data/control)
  • Long-latency (multi-cycle) operations
  • Data dependences
  • Flow dependence
  • Output dependence
  • Anti dependence
  • Data dependence handling
  • Interlocking
  • Scoreboarding
  • Combinational dependence check logic
  • Data forwarding/bypassing
  • RAW dependence handling
  • Stalling hardware

Lecture 14 (22.04 Thu.)

  • Data dependences
  • Stalling
  • Stalling hardware
  • Hazard unit
  • Control dependences
  • Branch misprediction penalty
  • Instructions flushing
  • Early branch resolution
  • Data forwarding
  • Branch prediction
  • Pipelined performance
  • SPECINT2017 benchmark
  • Average CPI
  • Software-based interlocking
  • Hardware-based interlocking
  • Pipeline bubbles
  • Software-based instruction scheduling
  • Hardware-based instruction scheduling
  • Static / dynamic scheduling
  • Fine-grained multithreading
  • HEP
  • FGMT

Lecture 15a (23.04 Fri.)

  • Data dependences
  • Stalling
  • Stalling hardware
  • Hazard unit
  • Control dependences
  • Branch misprediction penalty
  • Instructions flushing
  • Early branch resolution
  • Data forwarding
  • Branch prediction
  • Pipelined performance
  • SPECINT2006 benchmark
  • Average CPI
  • Software-based interlocking
  • Hardware-based interlocking
  • Pipeline bubbles
  • Software-based instruction scheduling
  • Hardware-based instruction scheduling
  • Static / dynamic scheduling
  • Variable-length operation latency
  • Profiling
  • Multi-cycle execution
  • Exceptions
  • Interrupts
  • Precise exceptions / interrupts
  • Instruction retiring
  • Exception handling
  • Precise exceptions in pipelining
  • Reorder buffer (ROB)
  • ROB entry
  • Content Addressable Memory (CAM)
  • Register renaming
  • Architectural register ID
  • Physical register ID
  • Output dependences
  • Anti dependences
  • In-order pipeline with ROB

Lecture 15b (23.04 Fri.)

  • In-order pipeline
  • Stalling
  • Latency
  • Dispatch Stalls
  • In-order dispatch
  • Out-of-order dispatch
  • Reservation stations
  • Out-of-order (OoO)
  • Out-of-order execution
  • Functional unit (FU)
  • Tomasulo's Algorithm

Lecture 16 (29.04 Thu.)

  • OoO: Out of Order Execution
  • Tomasulo's algorithm
  • Register alias table (RAT)
  • Physical register file (PRF)
  • Tag/value broadcast
  • Reservation station
  • Instruction scheduling/dispatching
  • Instruction window
  • Dataflow graph
  • Precise exceptions
  • Frontend register file
  • Architecture register file
  • Reorder buffer (ROB)
  • Register renaming
  • Latency tolerance
  • Instruction window size
  • Memory disambiguation / unknown address problem
  • Store - Load dependency
  • LQ/SQ: Load Queue / Store Queue
  • Data forwarding between stores and loads

Lecture 17 (30.04 Fri.)

  • Single-cycle Microarchitectures
  • Pipelining
  • Dataflow
  • Out-of-order
  • Superscalar Execution
  • In-Order Superscalar
  • Control Dependence
  • Branch Prediction
  • Program Counter
  • Global Branch History
  • Compile Time (Static)
  • Run Time (Dynamic)

Lecture 18 (06.05 Thu.)

  • Branch Predictor
  • Direction Prediction
  • Branch Target Buffer (BTB)
  • Compile time branch prediction
  • Run time branch prediction
  • Always taken / not taken
  • Backward taken, forward not taken
  • Profile based
  • Program analysis based
  • Last time prediction
  • Two-bit counter based prediction (see the sketch after this list)
  • Global Branch Correlation
  • Two-level global branch prediction
  • Two level prediction
  • Hybrid branch prediction
  • Perceptron based branch prediction
  • TAGE
  • Tag and BTB index
  • Branch history table
  • Hysteresis
  • Bimodal prediction
  • Two-level adaptive training branch prediction
  • Pattern history table
  • Global history register
  • Global predictor accuracy
  • Alpha 21264 Tournament Predictor
  • Local and global prediction
  • Loop branch detector and predictor
  • Perceptron based branch predictor
  • Hybrid history length based predictor
  • Prediction function
  • Training function
  • Branch confidence estimation
  • Handling control dependencies
  • Delayed branching
  • Fancy delayed branching
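
A Python sketch of the two-bit saturating counter (bimodal) prediction listed above. The table size, initial counter value, and the branch trace are assumptions for illustration.

  class TwoBitPredictor:
      """One 2-bit saturating counter per table entry, indexed by PC bits.
      Counter values 0,1 -> predict not-taken; 2,3 -> predict taken.
      Hysteresis: a single mispredict does not immediately flip the prediction."""
      def __init__(self, entries: int = 1024):
          self.counters = [1] * entries   # initialize to weakly not-taken (assumed)

      def _index(self, pc: int) -> int:
          return pc % len(self.counters)

      def predict(self, pc: int) -> bool:
          return self.counters[self._index(pc)] >= 2

      def update(self, pc: int, taken: bool):
          i = self._index(pc)
          if taken:
              self.counters[i] = min(3, self.counters[i] + 1)
          else:
              self.counters[i] = max(0, self.counters[i] - 1)

  bp = TwoBitPredictor()
  correct = 0
  trace = [(0x400, True)] * 9 + [(0x400, False)]   # loop branch: taken 9 times, then falls through
  for pc, taken in trace:
      correct += (bp.predict(pc) == taken)
      bp.update(pc, taken)
  print(f"accuracy on the loop branch: {correct}/{len(trace)}")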

Lecture 19a (07.05 Fri.)

  • Very Long Instruction Word (VLIW)
  • Superscalar
  • Lock-Step Execution
  • RISC
  • Commercial VLIW Machines
  • VLIW Tradeoffs
  • Superblock
  • ISA translation

Lecture 19b (07.05 Fri.)

  • Systolic Arrays
  • Processing Element (PE)
  • Regular array of PEs
  • Convolutional Neural Network
  • Matrix Multiplication
  • LeNet-5, AlexNet, GoogLeNet, ResNet
  • Two-Dimensional Systolic Arrays
  • Programmability in Systolic Arrays
  • Staged execution
  • Pipeline-Parallel (Pipelined) Programs
  • Google's TPU

Lecture 19c (07.05 Fri.)

  • Decoupled Access-Execute (DAE)
  • Instruction stream
  • ISA-visible queue
  • Loop unrolling

Lecture 20 (14.05 Fri.)

  • Array Processor
  • Vector Processor
  • Lane
  • Vector Registers
  • Vector Mask
  • Flynn
  • Amdahl's law (see the sketch after this list)
  • SIMD
  • SISD
  • MIMD
  • MISD
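
A one-formula Python sketch of Amdahl's law from the list above: overall speedup is limited by the fraction of work that is not sped up. The 90% parallel fraction below is an assumed example.

  def amdahl_speedup(parallel_fraction: float, s: float) -> float:
      """Overall speedup when a fraction of the work is sped up by factor s."""
      return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / s)

  # Even with very many lanes (large s), 10% serial work caps the speedup near 10x.
  for s in [4, 16, 64, 1_000_000]:
      print(f"s={s:>7}: speedup = {amdahl_speedup(0.9, s):.2f}")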

Lecture 21 (20.05 Thu.)

  • SIMD Processing
  • Regular (Data) Parallelism
  • Automatic Code Vectorization
  • Fine-Grained Multi-threading (FGMT)
  • Graphics Processing Units (GPU)
  • Programming Model
  • Hardware Execution Model
  • Single Program Multiple Data (SPMD)
  • Single Instruction Multiple Thread (SIMT)
  • Warp
  • Branch divergence
  • Warp-based SIMD
  • Dynamic Warp Formation/Merging
  • Stream processor
  • Tensor cores

Lecture 22 (21.05 Fri.)

  • Memory bottleneck
  • Data movement
  • Virtual memory
  • Physical memory
  • Memory array
  • Address decoder
  • Access transistor
  • Wordline
  • Bitline
  • Channel
  • DIMM
  • Rank
  • Bank
  • Subarray
  • Mat
  • Interleaving (banking)
  • Memory controller
  • DRAM Row
  • DRAM Column
  • Cache block
  • Phase Change Memory (PCM)
  • Intel Optane Persistent Memory
  • Non-volatile main memory
  • 3D-XPoint Technology

Lecture 23 (27.05 Thu.)

  • DRAM vs SRAM
  • Phase Change Memory (PCM)
  • DRAM vs PCM
  • Memory hierarchy
  • Memory locality
  • Temporal locality
  • Spatial locality
  • Caching basics
  • Caching in a pipelined design
  • Hierarchical latency analysis
  • Access latency and miss penalty
  • Hit-rate, miss-rate
  • Direct-mapped cache (see the sketch after this list)
  • Set associativity
  • Full associativity
  • Eviction/replacement policy
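
A Python sketch of a direct-mapped cache lookup from the list above, splitting the address into tag, index, and byte-in-block offset. The block size and number of sets are assumed for illustration.

  BLOCK_SIZE = 64          # bytes per cache block (assumed)
  NUM_SETS   = 256         # direct-mapped: one block per set (assumed)

  OFFSET_BITS = BLOCK_SIZE.bit_length() - 1    # 6
  INDEX_BITS  = NUM_SETS.bit_length() - 1      # 8

  tags = [None] * NUM_SETS    # tag store (None stands in for an unset valid bit)

  def access(address: int) -> bool:
      """Return True on a hit, False on a miss (and fill the block on a miss)."""
      index = (address >> OFFSET_BITS) & (NUM_SETS - 1)
      tag   = address >> (OFFSET_BITS + INDEX_BITS)
      if tags[index] == tag:
          return True
      tags[index] = tag        # allocate/replace the block in this set
      return False

  print(access(0x1234))   # cold (compulsory) miss
  print(access(0x1238))   # same block -> hit (spatial locality)
  print(access(0x41234))  # same index, different tag -> conflict miss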

Lecture 24 (28.05 Fri.)

  • Memory Hierarchy and Caches
  • Register File
  • Swap
  • Register Spilling
  • Demand Paging
  • Direct-Mapped
  • Set Associativity
  • Tag Store
  • Data Store
  • Byte in Block
  • Cache Conflicts
  • Full Associativity
  • Promotion
  • Insertion
  • Eviction/Replacement
  • Eviction/Replacement Policy
  • Random Replacement
  • FIFO Replacement
  • Least Recently Used (LRU) Replacement (see the sketch after this list)
  • Not LRU
  • Least Frequently Used
  • Hybrid Replacement Policies
  • Optimal Replacement Policy
  • Approximations of LRU
  • True LRU / Perfect LRU
  • Hierarchical LRU
  • Victim-NextVictim Replacement
  • Set Thrashing
  • Belady's OPT
  • Tag Store Entry
  • Valid bit
  • Tag
  • Replacement Policy Bits
  • Dirty bit
  • Write Back vs. Write Through Caches
  • Allocate on Write Miss
  • No-Allocate on Write Miss
  • Subblocked (Sectored) Caches
  • Instruction vs. Data Caches
  • Separate or Unified Caches
  • Multi-Level Caching
  • Serial vs. Parallel Access to Cache Levels
  • Cache Performance
  • Cache Parameters vs. Miss/Hit Rate
  • Working Set
  • Spatiotemporal Locality
  • Compulsory miss
  • Capacity miss
  • Conflict miss
  • Prefetching
  • Victim cache
  • Software Hints
  • Critical Word First
  • Non-Blocking Caches
  • Loop Interchange
  • Tiling
  • Data Reuse
  • Memory Level Parallelism (MLP)
  • Parallel Miss
  • Isolated Miss
  • Shared vs. Private Caches
  • Resource Sharing
  • Cache Coherence
  • Interconnection Network
  • Cache Snooping
  • Prefetching
  • Software Prefetching
  • Hardware Prefetching
  • Prefetch Instructions
  • Execution-Based Prefetchers
  • Runahead Execution
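
A Python sketch of true (perfect) LRU replacement for a single cache set, as listed above, showing insertion, promotion on a hit, and eviction of the least recently used block. The associativity and access sequence are assumptions.

  from collections import OrderedDict

  class LRUSet:
      """One fully tracked LRU set: most recently used block at the end."""
      def __init__(self, ways: int = 4):
          self.ways = ways
          self.blocks = OrderedDict()    # block tag -> None (tag store only, no data)

      def access(self, tag: int) -> bool:
          if tag in self.blocks:
              self.blocks.move_to_end(tag)                  # promotion on a hit
              return True
          if len(self.blocks) == self.ways:
              self.blocks.popitem(last=False)               # evict the LRU block
          self.blocks[tag] = None                           # insertion as most recently used
          return False

  s = LRUSet(ways=2)
  for tag in [0xA, 0xB, 0xA, 0xC, 0xB]:   # 0xB is the LRU victim when 0xC is inserted
      print(hex(tag), "hit" if s.access(tag) else "miss")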

Lecture 26a (04.06 Fri.)

  • Virtual memory
  • Physical memory
  • Infinite capacity
  • Relocation
  • Protection and isolation
  • Sharing
  • Linear address
  • Real address
  • Page table
  • OS-managed lookup table
  • Address translation (see the sketch after this list)
  • Physical frame
  • Demand paging
  • Placing, replacement, granularity of management, write policy
  • Page
  • Page size
  • Page offset
  • Page fault
  • Virtual page number (VPN)
  • Physical page number (PPN)
  • Virtual address
  • Physical address
  • Physical page number (physical frame number)
  • Page replacement policy
  • Page dirty bit
  • Page table base register (PTBR)
  • Multi-level (hierarchical) page table
  • Translation Lookaside Buffer (TLB)
  • Memory Management Unit (MMU)
  • Page Table Entry
  • Tag store
  • Page hit, page fault
  • OS trap handler
  • Direct Memory Access (DMA)
  • Interrupt processor
  • Access protection bits
  • Access protection exception
  • Privilege levels
  • DRAM disturbance errors
  • RowHammer
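
A Python sketch of single-level virtual-to-physical address translation from the list above: the virtual page number (VPN) indexes the page table and the page offset passes through unchanged. The page size and page-table contents are assumed for illustration.

  PAGE_SIZE = 4096                           # 4 KiB pages (assumed)
  OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12

  # Single-level page table: virtual page number (VPN) -> physical page number (PPN).
  page_table = {0x00000: 0x1A2, 0x00001: 0x0F7}    # assumed OS-managed mappings

  def translate(virtual_address: int) -> int:
      vpn    = virtual_address >> OFFSET_BITS
      offset = virtual_address & (PAGE_SIZE - 1)
      if vpn not in page_table:
          raise LookupError(f"page fault at VPN {vpn:#x} (OS would bring the page in)")
      return (page_table[vpn] << OFFSET_BITS) | offset

  print(hex(translate(0x00000ABC)))   # -> 0x1a2abc
  print(hex(translate(0x00001123)))   # -> 0xf7123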

Lecture 26b (04.06 Fri.)

  • Architectures for Intelligent Machines
  • Runahead Execution
  • Processing in Memory
  • Genome Sequencing & Analysis