Buzzwords

Buzzwords are terms mentioned during lectures that are particularly important to understand thoroughly. This page tracks the buzzwords for each lecture and can be used as a reference for finding gaps in your understanding of the course material.

Lecture 1 (21.02 Thu.)

  • Computer Architecture
  • Gordon Moore
  • Moore's Law
  • Abstraction layers
  • Meltdown and Spectre
  • RowHammer
  • Vulnerability
  • Multi-core systems
  • Architecture
  • Santiago Calatrava
  • Bahnhof Stadelhofen
  • Tradeoffs
  • Evaluation criteria
  • Principled design
  • Design constraints
  • Fallingwater
  • Frank Lloyd Wright
  • Basic building blocks
  • Hamming distance
  • Levels of transformation
  • Instruction Set Architecture (ISA)
  • Microarchitecture
  • Logic

Lecture 2 (22.02 Fri.)

  • Transformation Hierarchy
  • Microarchitecture
  • ISA
  • Power of abstraction
  • Transmeta
  • Crossing abstraction layers
  • Meltdown
  • Spectre
  • Vulnerabilities
  • Speculative Execution
  • Cache
  • Side channel attack
  • Security
  • DRAM
  • Rowhammer
  • DRAM Refresh
  • Probabilistic Adjacent Row Activation
  • Byzantine Failures

Lecture 3 (28.02 Thu.)

  • Programming interface
  • Hardware-software interface
  • Rapid Prototyping
  • Debugging the hardware
  • Addition
  • Comparison Operation
  • Addition Operation
  • Seven Segment Display
  • Arithmetic and Logic Unit (ALU)
  • Full System Integration
  • FPGA: Field Programmable Gate Array
  • Reconfigurable
  • FPGA Building Blocks
  • Look-Up Tables (LUT)
  • Switches
  • Multiplexers
  • Hardware Description Language (HDL)
  • Xilinx Zynq Ultrascale+
  • Computer-Aided Design (CAD) Tools
  • Xilinx Vivado
  • Verilog code
  • Logic Synthesis
  • Placement and Routing

Lecture 4 (01.03 Fri.)

  • Combinational logic circuits
  • Big Data
  • Machine learning
  • Genome analysis
  • Transistor
  • Moore’s Law, Dennard Scaling
  • Data movement bottleneck
  • Main memory
  • HW/SW interface
  • Boolean algebra, Boolean Equations
  • Logic gates
  • FPGA
  • MOS transistor
  • Transistor gate/source/drain
  • Power supply, ground
  • n-type/p-type MOS transistor
  • Complementary MOS (CMOS)
  • Boolean inverter
  • CMOS NOT gate
  • Pull-up, pull-down
  • CMOS NAND gate (see the example after this list)
  • Truth table
  • CMOS AND gate
  • Logical completeness
  • Floating value (Z)
  • Dynamic/static power consumption
  • Leakage current
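
A small example connecting several of these terms (generic, not taken from the lecture slides): a 2-input CMOS NAND gate uses a pull-up network of two pMOS transistors in parallel and a pull-down network of two nMOS transistors in series, so the output is 0 only when both inputs are 1. NAND is logically complete: any Boolean function can be built from NAND gates alone. Its truth table:

  A B | NAND(A,B)
  0 0 | 1
  0 1 | 1
  1 0 | 1
  1 1 | 0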

Lecture 5 (07.03 Thu.)

  • Boolean algebra
  • Truth table
  • Pull-down network
  • Pull-up network
  • Dynamic power consumption
  • Static power consumption
  • Dynamic Voltage scaling
  • Energy consumption
  • Buffer
  • XOR
  • Moore’s Law
  • Transistor
  • Combinational Logic
  • Inputs, outputs
  • Functional specification
  • Timing specification
  • Sequential logic
  • Boolean equations
  • Axioms
  • Duality principle
  • Idempotent law
  • Involution Law
  • Associative Law
  • DeMorgan’s Law
  • Logic Circuit
  • Minterm
  • Complement
  • Literal
  • Implicant
  • Maxterm
  • Sum of Products Form (SOP) (see the example after this list)
  • Canonical Form
  • Minimal Form
  • Decoders
  • Multiplexers
  • Full adder
  • Programmable Logic Array (PLA)
  • DRAM decoder
  • Instruction Opcode
  • Binary addition
  • Logical Completeness
  • Tri-State Buffer
  • Floating Signal (Z)
  • Karnaugh Maps (K-maps)
  • Uniting Theorem
  • Two-bit comparator
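
A small example tying several of these terms together (not from the lecture slides): for the XOR function of inputs A and B, the truth table output is 1 in exactly two rows, giving the minterms A'B and AB'. The canonical sum-of-products (SOP) form is therefore F = A'B + AB', where each minterm is a product of literals and also an implicant of F.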

Lecture 6 (08.03 Fri.)

  • Logic minimization / simplification
  • Karnaugh Map (K-Map)
  • Bit value X: Don't care
  • BCD: Binary coded decimal
  • Sequential circuit
  • Circuits that can store information
  • Capturing data
  • Cross-coupled inverter
  • Metastable states
  • Storage element
  • Memory
  • SRAM (static random access memory)
  • DRAM (dynamic random access memory)
  • Flash memory, hard disk, tape, and non-volatility
  • Latches and flip flops
  • R-S (Reset-Set) Latch
  • Forbidden state
  • Gated D Latch
  • Register
  • Address
  • Reading from memory
  • Writing to memory
  • Wordline
  • Address Decoder
  • Multiplexer
  • Addressability
  • State
  • State machine diagram
  • Clock
  • Finite State Machine (FSM)
  • Next state logic
  • State register
  • Output logic
  • D Flip Flop
  • Master latch
  • Slave latch
  • Edge-triggered device
  • Rising edge
  • Falling edge
  • 4-bit register
  • Moore machine
  • Mealy machine
  • Transition diagram
  • State encoding

Lecture 7 (14.03 Thu.)

  • Sequential circuit
  • Finite State Machine
  • Flip flop
  • State transition table
  • FSM state encoding
    • Binary encoding
    • One-hot encoding
    • Output encoding
  • Moore and Mealy FSMs
  • State transition diagram
  • LC-3 processor
  • Hardware Description Language (HDL)
  • Synthesis
  • Verilog
  • VHDL
  • Hierarchical design
  • Primitive gates
  • Modules
  • Top-down / Bottom-up design methodologies
  • Top-level module, sub-module, leaf cell
  • Bus
  • Bit slicing
  • Concatenation
  • Duplication
  • Structural (gate-level) description
  • Behavioral / functional description
  • Instantiation
  • Bitwise operators
  • Reduction operators
  • Conditional assignments
  • Precedence of operators
  • Tri-state buffer
  • Gate-level implementation
  • Parametrized modules
  • Always block (see the Verilog sketch after this list)
  • Sensitivity list
  • Posedge
  • Blocking assignment
  • Non-blocking assignment
  • Asynchronous/Synchronous reset
  • Blocking/Non-blocking assignment
  • Glitches
  • Case statement
  • Rising edge
  • Falling edge
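
A minimal Verilog sketch, not taken from the lecture slides, showing how several of these terms fit together: a D flip-flop described behaviorally in an always block whose sensitivity list uses posedge, with an asynchronous active-high reset and non-blocking assignments.

  // D flip-flop with asynchronous reset (illustrative sketch)
  module dff_async_rst (
    input  wire clk,
    input  wire rst,   // asynchronous, active-high reset
    input  wire d,
    output reg  q
  );
    always @(posedge clk or posedge rst) begin
      if (rst)
        q <= 1'b0;     // reset takes effect without waiting for a clock edge
      else
        q <= d;        // non-blocking assignment: q updates on the rising edge
    end
  endmodule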

Lecture 8 (15.03 Fri.)

  • Area
  • Speed / Throughput
  • Power / Energy
  • Design time
  • Circuit timing
  • Combinational circuit timing
  • Combinational circuit delay
  • Contamination delay
  • Propagation delay
  • Longest / Shortest path
  • Critical path
  • Glitch
  • Fixing glitches with K-map
  • Sequential circuit timing
  • D flip-flop
  • Setup / Hold / Aperture time
  • Metastability
  • Non-deterministic convergence
  • Contamination delay clock-to-q
  • Propagation delay clock-to-q
  • Correct sequential operation
  • Hold time constraint (see the constraint sketch after this list)
  • Timing analysis
  • Clock skew
  • Safe timing
  • Circuit verification
  • High level design
  • Circuit level
  • Functional equivalence
  • Functional tests
  • Timing constraints
  • Functional verification
  • Testbench
  • Device under test (DUT)
  • Simple / Self-checking / Automatic testbench
  • Waveform diagrams
  • Clock generation
  • Golden model
  • Timing verification
  • Timing report / summary
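
A rough sketch of the two sequential timing constraints behind these terms, using generic symbols rather than the lecture's exact notation: the setup-time constraint requires the clock period to satisfy T_clk >= t_pcq + t_pd + t_setup, and the hold-time constraint requires t_ccq + t_cd >= t_hold, where t_pcq and t_ccq are the clock-to-q propagation and contamination delays of the launching flip-flop, and t_pd and t_cd are the propagation and contamination delays of the combinational logic between the flip-flops. Clock skew tightens both constraints.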

Lecture 9 (21.03 Thu.)

  • Basic elements of a computer
  • The von Neumann Model
  • Addressability
  • Address Space
  • Word-Addressable Memory
  • MIPS
  • LC-3
  • MIPS memory
  • Unique address
  • Byte-addressable
  • Big endian vs Little endian
  • MAR and MDR
  • Load and store instructions
  • Processing unit
  • Arithmetic and Logic Unit (ALU)
  • Registers
  • Input and Output (IO)
  • Control Unit
  • Programmer Visible State
  • Instructions
  • Program Counter
  • Sequential Execution
  • Memory
  • Instruction Set Architecture (ISA)
  • Instruction Format
  • R-type
  • Operate instructions
  • Load/Store Word
  • I-Type instruction
  • The Instruction Cycle
  • Fetch phase
  • Instruction register
  • Decode phase
  • Fetch operands
  • Jump
  • Unconditional branch or jump
  • Base register
  • The Instruction Set
  • Operand

Lecture 10 (22.03 Fri.)

  • Instruction Set Architecture (ISA)
  • Data types
  • 2’s complement integer
  • Unsigned integer
  • Floating-point number
  • Semantic gap
  • Addressing modes
  • PC-relative addressing
  • Indirect addressing
  • Base+offset addressing
  • Immediate addressing
  • Control Flow Instructions
  • (Un)conditional branches, jumps
  • Condition codes
  • Assembly programming
  • Sequential/conditional/iterative constructs
  • OS service call (syscall)
  • TRAP instruction
  • Debugging
  • Function call conventions

Lecture 11 (28.03 Thu.)

  • Microarchitecture
  • Von Neumann Machine
  • ISA
  • Stored program computer
  • Sequential instruction processing
  • Unified memory
  • Instruction pointer
  • Data flow model
  • Data flow dependence
  • Data flow node
  • Control-flow execution order
  • Data-flow execution order
  • Pipeline
  • Instruction and data caches
  • General purpose registers
  • Virtual ISA
  • Single-cycle microarchitecture
  • Multi-cycle microarchitecture
  • Critical path
  • Control unit
  • Instruction Fetch
  • Instruction Decode
  • Functional units
  • Datapath
  • Control logic
  • CPI
  • Register file
  • ALU (Arithmetic Logic Unit)
  • Store writeback
  • Arithmetic and Logical instructions
  • Instruction types (R-type, I-type, J-type)
  • MUX (Multiplexer)
  • Source/destination register
  • Immediate value
  • Jump instruction
  • Conditional Branch

Lecture 12 (29.03 Fri.)

  • ALU: Arithmetic-Logic Unit
  • Single-cycle MIPS Datapath
  • Control signals
  • Datapath configuration
  • R-type, I-type, LW, SW, Branch, and Jump datapath configurations
  • Control logic
  • Hardwired control (combinational)
  • Sequential/Microprogrammed control
  • Performance analysis
  • CPI: Cycles per Instruction
  • Critical path
  • Slowest instruction
  • Execution time of an instruction / of a program (see the equation after this list)
  • Single cycle microarchitecture complexity
  • Fetch, decode, evaluate address, fetch operands, execute, store result
  • Magic memory
  • Instruction memory and data memory
  • REP MOVS and INDEX instructions
  • Microarchitecture design principles
  • Bread and butter (common case) design and Amdahl's law
  • Balanced Design
  • Key system design principles: keep it simple, keep it low cost
  • Multi-cycle critical path
  • Multi-cycle microarchitecture
  • Multi-cycle performance
  • Overhead of register setup/hold times
  • Main controller FSM
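
As a reminder of how these terms combine (a generic sketch, not the lecture's exact wording): Execution time = (number of instructions) x (average CPI) x (clock cycle time). For example, 10^9 instructions at an average CPI of 1 with a 1 ns clock cycle take about one second. A single-cycle microarchitecture achieves CPI = 1 but its clock period is set by the slowest instruction (the critical path), which is what motivates multi-cycle and pipelined designs.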

Lecture 13 (04.04 Thu.)

  • Pipelining
  • Pipeline stage
  • Latch
  • Control signals
  • Pipeline hazard
  • Control dependence
  • Data dependence
  • Flow dependence
  • Output dependence
  • Anti dependence
  • Resource contention
  • Critical path

Lecture 14 (05.04 Fri.)

  • Data dependences
  • Stalling
  • Stalling hardware
  • Hazard unit
  • Control dependences
  • Branch misprediction penalty
  • Instruction flushing
  • Early branch resolution
  • Data forwarding
  • Branch prediction
  • Pipelined performance
  • SPECINT2006 benchmark
  • Average CPI
  • Software-based interlocking
  • Hardware-based interlocking
  • Pipeline bubbles
  • Software-based instruction scheduling
  • Hardware-based instruction scheduling
  • Static / dynamic scheduling
  • Variable-length operation latency
  • Profiling
  • Multi-cycle execution
  • Exceptions
  • Interrupts
  • Precise exceptions / interrupts
  • Instruction retiring
  • Exception handling
  • Precise exceptions in pipelining
  • Reorder buffer (ROB)
  • ROB entry
  • Content Addressable Memory (CAM)
  • Register Alias Table (RAT)
  • Register renaming
  • Output dependences
  • Anti dependences
  • In-order pipeline

Lecture 15a (11.04 Thu.)

  • Register File
  • Reorder buffer (ROB)
  • ROB entry
  • RAM
  • Content Addressable Memory (CAM)
  • Indirection
  • Register Alias Table (RAT)
  • Register renaming
  • True (flow) dependence (RAW)
  • Anti dependence (WAR)
  • Output dependence (WAW)
  • In-Order Pipeline with ROB
  • Decode (D) stage
  • Execute (E) stage
  • Completion (R) stage
  • Retirement/Commit (W) stage

Lecture 15b (11.04 Thu.)

  • In-order pipeline
  • Stalling
  • Latency
  • Dispatch Stalls
  • In-order dispatch
  • Out-of-order dispatch
  • Reservation stations
  • Out-of-order (OoO)
  • Out-of-order execution
  • Functional unit (FU)
  • Tomasulo's Algorithm

Lecture 16 (12.04 Fri.)

  • OoO: Out of Order Execution
  • Tomasulo's algorithm
  • Register alias table (RAT)
  • Physical register file (PRF)
  • Tag/value broadcast
  • Reservation station
  • Instruction scheduling/dispatching
  • Instruction window
  • Dataflow graph
  • Precise exceptions
  • Frontend register file
  • Architectural register file
  • Reorder buffer (ROB)
  • Register renaming
  • Latency tolerance
  • Instruction window size
  • Memory disambiguation / unknown address problem
  • Store - Load dependency
  • LQ/SQ: Load Queue / Store Queue
  • Data forwarding between stores and loads

Lecture 17a (18.04 Thu.)

  • Superscalar execution
  • Dataflow
  • Out-of-order
  • Irregular parallelism
  • Von Neumann model
  • Precise state
  • Parallelism control
  • Bookkeeping overhead
  • Multiple instructions per cycle
  • N-wide superscalar
  • In-order superscalar
  • Dependency checking

Lecture 17b (18.04 Thu.)

  • Branch prediction
  • Control dependence
  • Control-flow
  • Conditional branches
  • Unconditional branches
  • Call
  • Return
  • Indirect branch
  • Branch delay slot
  • Fine-grain multithreading
  • Predicated execution
  • Multipath execution
  • Branch resolution latency
  • Trace cache
  • Predicate combining
  • Wrong path
  • Branch target
  • Branch direction
  • Branch Target Buffer (BTB)
  • Target address
  • Global branch history
  • Branch direction prediction
  • Profile-based branch prediction
  • Programmer-based branch predictor
  • Pragmas
  • Dynamic branch prediction

Lecture 18 (02.05 Thu.)

  • Direction predictor
  • Branch target buffer
  • Always taken / not taken
  • Backward taken, forward not taken
  • Profile based
  • Program analysis based
  • Last time prediction
  • Two-bit counter based prediction
  • Two level prediction
  • Hybrid branch prediction
  • Perceptron based branch prediction
  • Tag and BTB index
  • Branch history table
  • Hysteresis
  • Bimodal prediction
  • Two-level adaptive training branch prediction
  • Global branch correlation
  • Two-level global branch prediction
  • Pattern history table
  • Global history register
  • Global predictor accuracy
  • Alpha 21264 Tournament Predictor
  • Local and global prediction
  • Loop branch detector and predictor
  • Perceptron based branch predictor
  • Hybrid history length based predictor
  • Prediction function
  • Training function
  • Branch confidence estimation
  • Handling control dependencies
  • Delayed branching

Lecture 19a (03.05 Fri.)

  • Superscalar
  • VLIW (Very Long Instruction Word)
  • Compiler
  • Packed (independent) instructions
  • Packs/Bundles
  • Lock-step execution
  • Machine description
  • Dependency checking
  • RISC
  • Instruction level parallelism
  • Intel IA-64
  • EPIC (Explicitly Parallel Instruction Computing)
  • Recompilation
  • Superblock
  • Static instruction scheduling

Lecture 19b (03.05 Fri.)

  • Systolic arrays
  • Simple, regular design
  • Processing elements
  • Image processing
  • Convolution
  • Machine learning
  • Convolutional layers
  • Convolutional Neural Network (CNN)
  • AlexNet
  • ImageNet
  • GoogLeNet
  • Stream processing
  • Pipeline parallelism
  • Staged execution
  • WARP Computer
  • Tensor Processing Unit (TPU)
  • Decoupled Access/Execute (DAE)
  • Astronautics ZS-1
  • Loop unrolling

Lecture 20 (09.05 Thu.)

  • Throwhammer: RowHammer over the network
  • SIMD processing
  • GPU
  • Regular parallelism
  • Single Instruction Single Data (SISD)
  • Single Instruction Multiple Data (SIMD)
  • Multiple Instruction Single Data (MISD)
  • Systolic array
  • Streaming processor
  • Multiple Instruction Multiple Data (MIMD)
  • Multiprocessor
  • Multithreaded processor
  • Data parallelism
  • Array processor
  • Vector processor
  • Very Long Instruction Word (VLIW)
  • Vector register
  • Vector control register
  • Vector length register (VLEN)
  • Vector stride register (VSTR)
  • Prefetching
  • Vector mask register (VMASK)
  • Vector functional unit
  • CRAY-1
  • Seymour Cray
  • Memory interleaving
  • Memory banking
  • Vector memory system
  • Scalar code
  • Vectorizable loops
  • Vector chaining
  • Multi-ported memory
  • Vector stripmining
  • Gather/Scatter operations
  • Masked vector instructions

Lecture 21 (10.05 Fri.)

  • Memory Banking
  • Vector instruction execution
  • Vector length
  • Vector instruction level parallelism
  • SIMD processing
  • CRAY
  • SIMD
  • Vector processing
  • Automatic Code Vectorization
  • Vectorized Code
  • Scalar Sequential Code
  • Amdahl's Law
  • Regular data level parallelism
  • Vectorizability of code
  • ISAs including SIMD operations
  • Modern ISAs
  • MMX operations
  • Packed arithmetic
  • Fine-grained multithreading
  • Multithreaded pipeline
  • Warps
  • GPUs and SIMD engines
  • Programming using threads
  • Hardware Execution Model
  • Exploiting parallelism
  • SISD
  • MIMD
  • SPMD on SIMT Machine
  • Sequential instruction stream
  • Multiple instruction streams
  • Scalar instructions
  • Fine grained multithreading of warps
  • Warp-Level FGMT
  • Warp Execution
  • SIMT Memory Access
  • Warp Instruction Level Parallelism
  • CPU threads and GPU kernels
  • GPU SIMT Code
  • CUDA code
  • Blocks to Warps
  • Streaming Multiprocessors (SM)
  • Streaming Processors (SP)
  • NVIDIA Fermi architecture
  • Warp-based vs Traditional SIMD
  • Dynamic Warp Formation
  • Two-Level Warp Scheduling
  • Branch divergence
  • Long latency operations
  • Sub-warps
  • Two-Level Round Robin
  • NVIDIA GeForce GTX 285
  • NVIDIA V100

Lecture 22a (16.05 Thu.)

  • Memory
  • Virtual memory
  • Physical memory
  • Abstraction layers
  • Load/store data
  • Flip-flops (latches)
  • Random Access Memory (RAM)
  • Static RAM (SRAM)
  • Dynamic RAM (DRAM)
  • Storage technology (flash memory, hard disk, tape)
  • Memory array
  • Decoder
  • Wordline
  • Memory bank
  • Sense amplifier
  • Charge loss
  • Refresh

Lecture 22b (16.05 Thu.)

  • DRAM vs SRAM
  • Memory hierarchy
  • Mature and immature memory technologies
  • Flash
  • Phase Change Memory
  • Magnetic RAM
  • Resistive RAM
  • Temporal locality
  • Spatial locality
  • Caching basics
  • Caching in a pipelined design
  • Hierarchical latency analysis
  • Access latency and miss penalty
  • Hit-rate, miss-rate
  • Prefetching

Lecture 23a (17.05 Fri.)

  • DRAM
  • Memory hierarchy
  • Caching
  • Temporal locality
  • Spatial locality
  • Cache Line/Block
  • Cache hit/miss
  • Placement
  • Replacement
  • Granularity of management
  • Write policy
  • Tag Store
  • Data store
  • Average Memory Access Time (AMAT) (see the formula after this list)
  • Direct-mapped cache
  • Conflict miss
  • Set associativity
  • Full associativity
  • Degree of associativity
  • Capacity miss
  • Eviction/Replacement policy
  • LRU, MRU, Random replacement policies
  • Set thrashing
  • Write-back
  • Write-through
  • Subblocked (Sectored) Caches
  • Instruction cache
  • Data cache
  • Multi-level caching
  • Compulsory misses
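
A small worked example with generic numbers (not from the lecture slides): AMAT = hit latency + miss rate x miss penalty, so a 1-cycle hit latency, a 10% miss rate, and a 100-cycle miss penalty give AMAT = 1 + 0.1 x 100 = 11 cycles.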

Lecture 24a (23.05 Thu.)

  • Cache structure
  • Tag store
  • Data Store
  • Bookkeeping
  • Cache performance
  • Cache size
  • Block size
  • Associativity
  • Replacement policy
  • Insertion/Placement policy
  • Hit/Miss rate
  • Hit/Miss latency/cost
  • Data access patterns
  • Data layout
  • Column major and Row major data layouts
  • Tiling and blocking
  • Multi-core issues in caching
  • Shared vs private caches
  • Performance isolation
  • QoS (Quality of Service), Fairness, and Starvation
  • Cache fragmentation
  • Dynamic partitioning
  • Shared resource view
  • Cache coherence
  • Consistency problem
  • Software-level coherence: Flush-Local/Global/Cache instructions
  • Scratchpad memory - software managed caches
  • Simple coherence scheme: snooping and broadcasting
  • Maintaining coherence
  • Write propagation
  • Write serialization
  • Hardware cache coherence
  • Snoopy bus
  • Directory based

Lecture 24b (23.05 Thu.)

  • Virtual memory
  • Physical memory
  • Infinite capacity
  • Relocation
  • Protection and isolation
  • Sharing
  • Linear address
  • Real address
  • Page table
  • OS-managed lookup table
  • Address translation
  • Physical frame
  • Demand paging
  • Placement, replacement, granularity of management, write policy
  • Page
  • Page size
  • Page offset (see the worked example after this list)
  • Page fault
  • Virtual page number (VPN)
  • Physical page number (PPN)
  • Virtual address
  • Physical address
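
A small worked example with generic numbers (not from the lecture slides): with a 32-bit virtual address and a 4 KiB page size, the low 12 bits form the page offset and the upper 20 bits form the virtual page number (VPN). Address translation looks up the VPN in the page table to obtain the physical page number (PPN) and keeps the offset unchanged; for instance, virtual address 0x00403ABC has VPN 0x00403 and page offset 0xABC.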

Lecture 25 (24.05 Fri.)

  • Physical page number (physical frame number)
  • Page replacement policy
  • Page dirty bit
  • Page table base register (PTBR)
  • Page fault
  • Multi-level (hierarchical) page table
  • Translation Lookaside Buffer (TLB)
  • Memory Management Unit (MMU)
  • Page Table Entry
  • Tag store
  • Page hit, page fault
  • OS trap handler
  • Direct Memory Access (DMA)
  • Interrupt processor
  • Access protection bits
  • Access protection exception
  • Privilege levels
  • DRAM disturbance errors
  • RowHammer

Discussion Session I (31.05 Fri.)

  • Boolean algebra, Boolean Equations
  • Sum of Products, Truth table
  • Logic gates
  • Pipelining
  • Pipeline stage
  • Pipeline hazard
  • Data dependence
  • Data forwarding
  • Out-of-order execution
  • Tomasulo's Algorithm
  • Reservation stations
  • Local and global prediction
  • Global history register
  • Pattern history table