  * Hybrid branch predictors
  
===== Lecture 19 (04.05 Fri.) =====
  * GPU-based RowHammer
  * Out-of-Order Execution
  * Single-cycle Microarchitectures
  * Intel Pentium M Predictors
  * Binary classifier
  * Global History Register (GHR)
  * Perceptron weights (see sketch below)
  * Bias weight
  * Control Dependences
  * Branch delay slot
  * Predicated execution
  * Multipath execution
  * Static Instruction Scheduling
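
A minimal sketch of the perceptron predictor idea behind the GHR, perceptron weights, and bias weight items above, assuming a 16-entry global history and a 1024-entry table (both invented for illustration; the function names are hypothetical):

<code c>
#include <stdint.h>

#define HIST  16    /* global history length (assumed for the sketch)  */
#define TABLE 1024  /* number of perceptrons, indexed by the branch PC */

static int8_t weights[TABLE][HIST]; /* perceptron weights                      */
static int8_t bias[TABLE];          /* bias weight                             */
static int    ghr[HIST];            /* global history: +1 taken, -1 not taken */

/* Predict taken iff the dot product of the history with the per-branch
 * weights, plus the bias, is non-negative (a simple binary classifier). */
int predict_taken(uint32_t pc)
{
    int idx = pc % TABLE;
    int y = bias[idx];
    for (int i = 0; i < HIST; i++)
        y += weights[idx][i] * ghr[i];
    return y >= 0;
}

/* When the branch resolves, shift its outcome into the global history
 * register; training of the weights is omitted here for brevity. */
void update_history(int taken)
{
    for (int i = HIST - 1; i > 0; i--)
        ghr[i] = ghr[i - 1];
    ghr[0] = taken ? 1 : -1;
}
</code>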
  
===== Lecture 20 (11.05 Fri.) =====
  * Throwhammer: RowHammer over the network
  * SIMD processing
  * GPU
  * Regular parallelism
  * Single Instruction Single Data (SISD)
  * Single Instruction Multiple Data (SIMD)
  * Multiple Instruction Single Data (MISD)
  * Systolic array
  * Streaming processor
  * Multiple Instruction Multiple Data (MIMD)
  * Multiprocessor
  * Multithreaded processor
  * Data parallelism
  * Array processor
  * Vector processor
  * Very Long Instruction Word (VLIW)
  * Vector register
  * Vector control register
  * Vector length register (VLEN)
  * Vector stride register (VSTR)
  * Prefetching
  * Vector mask register (VMASK)
  * Vector functional unit
  * CRAY-1
  * Seymour Cray
  * Memory interleaving
  * Memory banking
  * Vector memory system
  * Scalar code
  * Vectorizable loops
  * Vector chaining
  * Multi-ported memory
  * Vector stripmining (see sketch below)
  * Gather/Scatter operations
  * Masked vector instructions

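The vector stripmining item above can be illustrated with a scalar sketch: a vectorizable loop is processed in chunks of at most the maximum vector length, with the last, shorter chunk handled by an effective vector length. The VLEN value of 64 and the DAXPY-style loop are assumptions made only for this example:

<code c>
#include <stddef.h>

#define VLEN 64  /* assumed maximum hardware vector length */

/* Stripmined form of the vectorizable loop y[i] += a * x[i]:
 * each outer iteration corresponds to one vector instruction operating
 * on at most VLEN elements; the final chunk uses a shorter effective
 * vector length, as a vector length register would. */
void daxpy_stripmined(size_t n, double a, const double *x, double *y)
{
    for (size_t start = 0; start < n; start += VLEN) {
        size_t vl = (n - start < VLEN) ? (n - start) : VLEN;
        for (size_t i = 0; i < vl; i++)
            y[start + i] += a * x[start + i];
    }
}
</code>
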
===== Lecture 21 (17.05 Thu.) =====
  * SIMD processing
  * GPU
  * Flynn’s taxonomy
  * Systolic arrays
  * Micron's Automata Processor
  * VLIW
  * Array processor
  * Vector processor
  * Row/Column major
  * Sparse vector
  * Gather/Scatter operations (see sketch below)
  * Address indirection
  * Data parallelism
  * Vector register
  * Vector instruction
  * Vector functional units
  * Memory banks
  * Vectorizable loop
  * Vector Instruction Level Parallelism
  * Automatic code vectorization
  * SIMD ISA extensions
  * Intel Pentium MMX
  * Multimedia registers
  * Programming model
  * Sequential
  * Single-Instruction Multiple Data (SIMD)
  * Multi-threaded
  * Single-Program Multiple Data (SPMD)
  * Execution model
  * Single-Instruction Multiple Thread (SIMT)
  * Warp (wavefront)
  * Warp-level FGMT
  * Shader core
  * Scalar pipeline
  * Latency hiding
  * Interleave warp execution
  * Warp instruction level parallelism
  * Warp-based SIMD vs. Traditional SIMD
  * Control flow path
  * Branch divergence
  * SIMD utilization
  * Dynamic warp formation

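A minimal sketch of the gather/scatter and address indirection items: gathering packs elements of a sparse vector into a dense one through an index array, and scattering writes them back. Function and parameter names are illustrative only:

<code c>
#include <stddef.h>

/* Gather: result[i] = base[index[i]] -- the access pattern a gather
 * operation performs when packing a sparse vector into a dense one. */
void gather(size_t n, const double *base, const size_t *index, double *result)
{
    for (size_t i = 0; i < n; i++)
        result[i] = base[index[i]];
}

/* Scatter: base[index[i]] = values[i] -- the inverse pattern used to
 * write the dense results back to their sparse locations. */
void scatter(size_t n, double *base, const size_t *index, const double *values)
{
    for (size_t i = 0; i < n; i++)
        base[index[i]] = values[i];
}
</code>
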
===== Lecture 22 (18.05 Fri.) =====
  * GPGPU programming
  * NVIDIA Volta
  * Inherent parallelism
  * Data parallelism
  * GPU main bottlenecks
  * CPU-GPU data transfers
  * DRAM memory
  * Task offloading
  * Serial code (host)
  * Parallel code (device)
  * Bulk synchronization
  * Transparent scalability
  * Memory hierarchy
  * CUDA programming language
  * OpenCL
  * Indexing and memory access
  * Streaming multiprocessor (SM)
  * Streaming processor (SP)
  * Memory coalescing
  * Shared memory tiling
  * Bank conflict
  * Padding
  * GPU computing
  * GPU kernel
  * Massively parallel sections
  * Shared memory
  * Data transfers
  * Kernel launch (see sketch below)
  * Latency hiding
  * Occupancy
  * Data reuse
  * SIMD utilization
  * Atomic operations
  * Histogram calculation
  * CUDA streams
  * Asynchronous transfers
  * Overlap of communication and computation

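The host/device split, CPU-GPU data transfers, indexing, and kernel launch items can be tied together with a small CUDA sketch; the vecAdd kernel and the 256-thread block size are choices made for the example, not anything prescribed by the lecture:

<code c>
#include <cuda_runtime.h>

/* Parallel code (device): each thread computes one output element. */
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x; /* global thread index */
    if (i < n)
        c[i] = a[i] + b[i];
}

/* Serial code (host): task offloading with explicit CPU-GPU transfers. */
void offload_add(const float *ha, const float *hb, float *hc, int n)
{
    float *da, *db, *dc;
    size_t bytes = (size_t)n * sizeof(float);
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice); /* CPU -> GPU */
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(da, db, dc, n);        /* kernel launch */
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost); /* GPU -> CPU */
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
}
</code>
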
===== Lecture 23a (24.05 Thu.) =====
  * Systolic Arrays
  * High concurrency
  * Balanced computation and I/O memory bandwidth
  * Simple, regular design
  * Processing Elements
  * Decoupled Access Execute (DAE)
  * Image processing
  * Convolution (see sketch below)
  * Convolutional layers
  * Convolutional Neural Network
  * AlexNet
  * ImageNet
  * GoogLeNet
  * Stream processing
  * Pipeline parallelism
  * Staged execution
  * WARP Computer
  * Tensor Processing Unit
  * Astronautics ZS-1
  * Loop unrolling

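A scalar sketch of the convolution item: the multiply-accumulate pattern below is what a convolutional layer computes per output element and what a systolic array pipelines across its processing elements. The "valid"-region output size and the function name are assumptions of this example:

<code c>
#include <stddef.h>

/* 1D convolution, "valid" region only: out[i] = sum_j w[j] * in[i + j].
 * Each output element is one multiply-accumulate chain, the basic unit
 * of work mapped onto a systolic array's processing elements. */
void conv1d(size_t n, size_t k, const float *in, const float *w, float *out)
{
    for (size_t i = 0; i + k <= n; i++) {
        float acc = 0.0f;
        for (size_t j = 0; j < k; j++)
            acc += w[j] * in[i + j];
        out[i] = acc;
    }
}
</code>
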
===== Lecture 23b (24.05 Thu.) =====
  * Memory
  * Virtual memory
  * Physical memory
  * Load/store data
  * Random Access Memory (RAM)
  * Static RAM (SRAM)
  * Dynamic RAM (DRAM)
  * Memory array
  * Decoder
  * Wordline
  * Memory bank
  * Sense amplifier

===== Lecture 24 (25.05 Fri.) =====
  * Destructive reads
  * Refresh
  * Capacitor and logic manufacturing technologies
  * DRAM vs SRAM
  * Mature and immature memory technologies
      * Flash
      * Phase Change Memory
      * Magnetic RAM
      * Resistive RAM
  * Memory hierarchy
  * Temporal locality
  * Spatial locality
  * Caching basics
  * Caching in a pipelined design
  * Hierarchical latency analysis
  * Access latency and miss penalty
  * Hit-rate, miss-rate
  * Prefetching
  * Cache line, cache block
  * Placement
  * Replacement
  * Granularity of management
  * Write policy
  * Separation of instruction and data
  * Tag store and data store
  * Cache bookkeeping
  * Tag - index - byte in block (see sketch below)
  * Direct mapped cache
  * Conflict misses
  * Set associativity
  * Ways in cache
  * Fully associative cache
  * Degree of associativity
  * Insertion, promotion, and eviction (replacement)
  * Replacement policies
      * Random
      * FIFO
      * Least recently used
      * Not most recently used
      * Least frequently used
  * Implementing LRU
  * Set thrashing

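A small sketch of the tag / index / byte-in-block breakdown for a direct mapped cache; the 64-byte block and 1024-set geometry are assumed only for illustration:

<code c>
#include <stdint.h>

#define BLOCK_BITS 6u   /* 64-byte cache block (assumed) */
#define INDEX_BITS 10u  /* 1024 sets (assumed)           */

/* byte in block: lowest BLOCK_BITS of the address */
static inline uint32_t byte_in_block(uint32_t addr)
{
    return addr & ((1u << BLOCK_BITS) - 1);
}

/* index: selects the set in a direct mapped (1-way) cache */
static inline uint32_t cache_index(uint32_t addr)
{
    return (addr >> BLOCK_BITS) & ((1u << INDEX_BITS) - 1);
}

/* tag: the remaining high bits, stored in the tag store entry;
 * two addresses with the same index but different tags contend
 * for one block and cause conflict misses. */
static inline uint32_t cache_tag(uint32_t addr)
{
    return addr >> (BLOCK_BITS + INDEX_BITS);
}
</code>
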
===== Lecture 25a (31.05 Thu.) =====
  * Cache Tag
  * Tag Store Entry
  * Valid Bit
  * Dirty Bit
  * Replacement Policy Bit
  * Write-Back Cache
  * Write-Through Cache
  * Cache Coherence
  * Cache Consistency
  * Write Combining
  * (No-)Allocate on Write Miss
  * First-Level Cache
  * Second-Level Cache
  * Last-Level Cache
  * Sub-blocked (Sectored) Caches
  * Instruction Cache
  * Data Cache
  * Unified Instruction and Data Cache
  * Cache Management Policy
  * Cache Hit/Miss Rate
  * Cache Block Size
  * Critical-Word First
  * Working Set
  * Set Associativity
  * Compulsory Cache Miss
  * Capacity Cache Miss
  * Conflict Cache Miss
  * Loop Interchange (see sketch below)
  * Loop Fusion
  * Array Merging
  * Shared vs. Private Caches
  * Cache Contention
  * Performance Isolation
  * Quality of Service
  * Starvation
  * Dynamic Cache Partitioning

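The loop interchange item can be shown with a before/after pair; the array size is arbitrary, and the point is only the change in access stride, not any specific code from the lecture:

<code c>
#define N 1024

/* Before: the inner loop walks down a column, so consecutive accesses
 * are N floats apart in row-major storage and exhibit poor spatial
 * locality. */
float sum_column_order(float a[N][N])
{
    float s = 0.0f;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

/* After loop interchange: the inner loop walks along a row, so accesses
 * are unit stride and hit in the cache far more often. */
float sum_row_order(float a[N][N])
{
    float s = 0.0f;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}
</code>
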
===== Lecture 25b (31.05 Thu.) =====
  * Virtual Memory
  * Physical Memory
  * Virtual Memory Address
  * Physical Memory Address
  * Code/Data Relocation
  * Memory Isolation
  * Memory Protection
  * Code/Data Sharing
  * Address Indirection
  * Virtual Address Translation (see sketch below)
  * x86 Linear Address
  * Virtual Memory Page
  * Physical Memory Frame
  * Page Size
  * Page Table
  * Demand Paging
  * Page Replacement
  * Page Granularity
  * Virtual Page Number
  * Physical Frame Number
  * Page Fault
  * Translation Lookaside Buffer (TLB)
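
A toy sketch of virtual address translation through a single-level page table, assuming 4 KiB pages and a 32-bit virtual address (real x86 translation is multi-level and cached in the TLB); all names here are illustrative:

<code c>
#include <stdint.h>

#define PAGE_BITS 12u  /* 4 KiB page size (assumed) */

/* Translate a virtual address: split it into virtual page number and
 * page offset, look up the physical frame number in the page table,
 * and recombine. The page fault check is omitted for brevity. */
uint32_t translate(const uint32_t *page_table, uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_BITS;               /* virtual page number   */
    uint32_t offset = vaddr & ((1u << PAGE_BITS) - 1);  /* byte within the page  */
    uint32_t pfn    = page_table[vpn];                  /* physical frame number */
    return (pfn << PAGE_BITS) | offset;                 /* physical address      */
}
</code>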