buzzword

Differences

This shows you the differences between two versions of the page.

buzzword [2018/11/21 14:11] – [Lecture 16 (15.11 Thu.)] kimje
buzzword [2019/09/20 11:09] (current) juang
Line 63:
   * Error Correcting Codes (ECC)
  
-===== Lecture 2 (20.09 Wed.) =====
+===== Lecture 2 (20.09 Thu.) =====
   * Rowhammer
   * Memory reliability
Line 717:
   * request buffer
  
 +===== Lecture 18a (22.11 Thu.) =====
  
 +   * Fundamental interference control techniques
 +   * Core/Source throttling
 +   * Smart resources
 +   * Dynamic unfairness estimation
 +   * Throttling cores' memory access rates
 +   * FST: Fairness via Source Throttling
 +   * Runtime unfairness evaluation
 +   * Dynamic request throttling
 +   * Request injection rate
 +   * Application/Thread scheduling
 +   * Many-core on-chip communication
 +   * Shared cache bank
 +   * Spatial task scheduling
 +   * Clustering, balancing, isolation, and radial mapping
 +   * Network power
 +   * Microarchitecture unawareness
 +   * Operating-system-level metrics and microarchitecture-level metrics
 +   * Architecture-aware distributed resource management (DRM)
 +   * Interference-aware thread scheduling
 +   * Memory quality of service (QoS) approaches and techniques
 +   * Smart vs dumb components
 +   * Cache interference management
 +   * Interconnect interference management
 +   * DRAM designs to reduce interference
 +   * SoftMC
 +   * PIM accelerators
 +   * Decoupled direct memory access (DDMA)
 +
 +===== Lecture 18b (22.11 Thu.) =====
 +
 +   * Multi-core issues in caching
 +   * Cache coherence
 +   * Flush-local and flush-global
 +   * Snoopy cache coherence
 +   * Free for all sharing
 +   * Controlled cache sharing
 +   * Hardware-based cache partitioning
 +   * Marginal utility of a cache way
 +   * Dynamic set sampling
 +   * UCP
 +   * Optimal partitioning: Greedy and look-ahead algorithms
 +   * Dynamic fair caching
 +   * Software-based shared cache partitioning
 +   * Page coloring
 +   * Static cache partitioning
 +   * Dynamic cache partitioning via page re-coloring
 +
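As a rough illustration of the "marginal utility of a cache way" and greedy partitioning buzzwords above, here is a minimal sketch. It assumes each application's hit curve (hits as a function of allocated ways) is already known, for example from sampled monitors; the names `greedy_partition` and `hit_curves` are hypothetical and not taken from the lecture. Greedy allocation of this kind is only guaranteed to be optimal when every curve shows diminishing returns, which is the motivation for a look-ahead variant.

```python
def greedy_partition(hit_curves, total_ways):
    """Assign cache ways one at a time to the application that gains most.

    hit_curves[a][w] is the (assumed known) number of hits application a
    gets with w ways, for w = 0..total_ways.
    """
    alloc = [0] * len(hit_curves)
    for _ in range(total_ways):
        # Marginal utility: extra hits from one additional way.
        gains = [curve[alloc[a] + 1] - curve[alloc[a]]
                 for a, curve in enumerate(hit_curves)]
        winner = max(range(len(gains)), key=gains.__getitem__)
        alloc[winner] += 1
    return alloc


if __name__ == "__main__":
    curves = [
        [0, 10, 18, 24, 28],  # application 0: benefits from many ways
        [0, 5, 6, 6, 6],      # application 1: saturates after one way
    ]
    print(greedy_partition(curves, 4))
```

Unlike a real scheme, this sketch does not reserve a minimum of one way per application.
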
 +===== Lecture 19a (28.11 Wed.) =====
 +   * Controlled Shared Caching
 +   * Cache spilling
 +   * Cooperative caching
 +   * DSR: Dynamic spill-receive
 +   * Set dueling
 +   * Handling shared data in private caches
 +   * Non-uniform cache access
 +   * Multi-core cache efficiency
 +   * Cache compression
 +   * Decompression latency
 +   * Compression ratio
 +   * Zero compression
 +   * Frequent value compression
 +   * Frequent pattern compression
 +   * Base-delta immediate compression
 +   * Toggle-aware compression for GPU systems
 +   * Core-assisted bottleneck acceleration in GPUs
 +   * Cache placement
 +   * Cache insertion policies: MRU, LRU
 +   * LIP: LRU insertion position (Low-priority insertion policy)
 +   * BIP: Bimodal insertion policy
 +   * DIP: Dynamic insertion policy
 +   * Circular reference model
 +   * Cache pollution
 +   * Cache thrashing
 +   * Reuse prediction
 +   * EAF: Evicted-address filter
 +   * TA-DIP: Thread-aware dynamic insertion policy
 +   * Run-time bypassing
 +   * Single-usage block prediction
 +   * SHiP: Signature-based hit predictor
 +   * Miss classification table
 +   * s-curve
 +   * ASM: Application slowdown model
 +   * Cache access rate
 +   * Memory access rate
 +   * Auxiliary tag store
 +
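To make the "base-delta immediate compression" buzzword above more concrete, the following is a simplified sketch. It assumes a single base (the first value in the line) and one fixed delta width, whereas the actual scheme also uses an implicit zero base and tries several base/delta sizes. The function names are hypothetical.

```python
def bdi_compress(values, delta_bytes=1):
    """Try to encode a cache line as one base plus small signed deltas.

    Returns (base, deltas) if every delta fits in delta_bytes,
    otherwise None (the line stays uncompressed).
    """
    base = values[0]
    limit = 1 << (8 * delta_bytes - 1)  # signed range: [-limit, limit)
    deltas = [v - base for v in values]
    if all(-limit <= d < limit for d in deltas):
        return base, deltas
    return None


def bdi_decompress(base, deltas):
    """Rebuild the original values; only additions are needed."""
    return [base + d for d in deltas]


def compressed_size(n_values, value_bytes=8, delta_bytes=1):
    """Bytes used by one full-width base plus n narrow deltas."""
    return value_bytes + n_values * delta_bytes


if __name__ == "__main__":
    line = [0x1000, 0x1008, 0x1010, 0x1018]  # e.g. nearby pointers
    packed = bdi_compress(line)
    print(packed, compressed_size(len(line)), "bytes instead of", 8 * len(line))
```

Decompression needing only additions is what keeps the decompression latency low relative to dictionary-based schemes.
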
 +===== Lecture 19b (28.11 Wed.) =====
 +   * Heterogeneity and asymmetry
 +   * CRAY-1 design
 +   * Scalar machine and vector pipeline machine
 +   * RAIDR
 +   * DRAM + Phase change memory
 +   * Reliable, costly DRAM + Unreliable, cheap DRAM
 +   * Heterogeneous retention time
 +   * Tilera
 +   * Packet switching and circuit switching
 +   * TDN, MDN, IDN, UDN, and STN
 +   * General purpose vs special purpose
 +   * Heterogeneity of CPU and GPUs
 +   * Predictability and robustness
 +
 +===== Lecture 20 (29.11 Thu.) =====
 +  * DRAM
 +  * NVM
 +  * Flash
 +  * Processing in Memory
 +  * Hardware Security
 +  * Heterogeneous Multi-Core Systems
 +  * Bottleneck Acceleration
 +  * Heterogeneity (Asymmetry)
 +  * Symmetric design 
 +  * One-size-fits-all
 +  * Quality of Service (QoS)
 +  * Hybrid Memory Controllers
 +  * Heterogeneous agents (e.g., CPUs, GPUs, and HWAs)
 +  * Heterogeneous memories: Fast vs. Slow DRAM
 +  * Heterogeneous interconnects: Control, Data, Synchronization
 +  * Amdahl’s Law
 +  * Synchronization overhead
 +  * Load imbalance overhead
 +  * Resource sharing overhead
 +  * Sequential portions (Amdahl’s “serial part”)
 +  * Critical sections
 +  * Barriers
 +  * Asymmetric Chip Multiprocessor (ACMP)
 +  * Staged Execution
 +  * Data Marshaling
 +  * Phase Change Memory
 +
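The Amdahl's Law entries above can be illustrated with a short calculation. This is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / N), where p is the parallelizable fraction and N the number of cores; the function name is hypothetical, and the formula ignores synchronization, load imbalance, and resource sharing overheads.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Ideal speedup when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
    # With 10% serial work the speedup can never exceed 1 / 0.1 = 10.
```

The bound set by the serial part is one argument for asymmetric designs that spend area on a large core to accelerate sequential portions.
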
 +===== Lecture 21 (05.12 Wed.) =====
 +  * GPU
 +  * Programming model
 +  * Sequential
 +  * SIMD
 +  * SPMD
 +  * SIMT
 +  * Warp (wavefront)
 +  * Multithreading of warps
 +  * Warp-level FGMT
 +  * Latency-hiding
 +  * Interleave warp execution
 +  * Registers of thread ID
 +  * Warp-based SIMD vs. Traditional SIMD
 +  * GPGPU programming
 +  * Inherent parallelism
 +  * Data parallelism
 +  * GPU main bottlenecks
 +  * CPU-GPU data transfers
 +  * DRAM memory
 +  * Task offloading
 +  * Serial code (host)
 +  * Parallel code (device)
 +  * Bulk synchronization
 +  * Transparent scalability
 +  * Memory hierarchy
 +  * Indexing and memory access
 +  * Streaming multiprocessor (SM)
 +  * Streaming processor (Vector lane)
 +  * Occupancy
 +  * Memory coalescing
 +  * Shared memory tiling
 +  * Bank conflict
 +  * Padding
 +  * SIMD utilization
 +  * Atomic operations
 +  * Histogram calculation
 +  * CUDA streams
 +  * Asynchronous transfers
 +  * Heterogeneous systems
 +  * Unified memory
 +  * System-wide atomic operations
 +  * Collaborative computing
 +  * CPU+GPU collaboration
 +  * Collaborative patterns
 +      * Data partitioning
 +      * Task partitioning
 +          * Coarse-grained
 +          * Fine-grained
 +  * Bézier surfaces
 +  * NVIDIA Pascal
 +  * NVIDIA Volta
 +  * Chai benchmark suite
 +
 +===== Lecture 22 (06.12 Thu.) =====
 +  * Persistent memory
 +  * Crash consistency
 +  * Checkpointing
 +  * Flynn's taxonomy of computers
 +  * Parallelism
 +  * Performance
 +  * Power consumption
 +  * Cost efficiency
 +  * Dependability
 +  * Instruction level parallelism
 +  * Data parallelism
 +  * Task level parallelism
 +  * Multiprocessor
 +  * Loosely coupled
 +  * Tightly coupled
 +  * Shared global memory address space
 +  * Shared memory synchronization
 +  * Cache coherence
 +  * Memory consistency
 +  * Shared resource management
 +  * Interconnects
 +  * Programming issues in tightly coupled multiprocessor
 +  * Sublinear speedup
 +  * Linear speedup
 +  * Superlinear speedup
 +  * Unfair comparison
 +  * Cache/memory effect
 +  * Utilization
 +  * Redundancy
 +  * Efficiency
 +  * Amdahl's law
 +  * Bottlenecks in parallel portion
 +  * Ordering of operations
 +  * Sequential consistency
 +  * Weaker memory consistency
 +  * Memory fence instructions
 +  * Higher performance
 +  * Burden on the programmer
 +  * Coherence scheme
 +  * Valid/invalid
 +  * Write propagation
 +  * Write serialization
 +  * Update vs. Invalidate
 +  * Cache coherence
 +  * Snoopy bus
 +  * Directory
 +  * Directory optimizations
 +  * Directory bypassing
 +  * Snoopy cache
 +  * Shared bus
 +  * VI protocol
 +  * MSI (Modified, Shared, Invalid)
 +  * Exclusive state
 +  * MESI (Modified, Exclusive, Shared, Invalid)
 +  * Illinois Protocol (MESI)
 +  * Broadcast
 +  * Bus request
 +  * Downgrade
 +  * Upgrade
 +  * Snoopy invalidation
 +  * Cache-to-cache transfer
 +  * Writeback
 +  * MOESI (Modified, Owned, Exclusive, Shared, Invalid)
 +  * Directory coherence
 +  * Race conditions
 +  * Totally-ordered interconnect
 +  * Directory-based protocols
 +  * Set inclusion test
 +  * Linked list
 +  * Bloom filters
 +  * Contention resolution
 +  * Ping-ponging
 +  * Synchronization
 +  * Shared-data-structure
 +  * Token Coherence
 +  * Virtual bus
 +
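The MSI entry above can be sketched as a per-block state table for one cache on a snoopy bus. The event names (`PrRd`, `PrWr`, `BusRd`, `BusRdX`, `Flush`) follow common textbook notation and are assumptions here, not necessarily the lecture's notation; the table is a simplified sketch that ignores transient states and race conditions.

```python
# (current state, observed event) -> (next state, action placed on the bus)
MSI = {
    ("I", "PrRd"):   ("S", "BusRd"),    # read miss: fetch a shared copy
    ("I", "PrWr"):   ("M", "BusRdX"),   # write miss: fetch exclusive copy
    ("I", "BusRd"):  ("I", None),
    ("I", "BusRdX"): ("I", None),
    ("S", "PrRd"):   ("S", None),       # read hit
    ("S", "PrWr"):   ("M", "BusRdX"),   # upgrade: invalidate other sharers
    ("S", "BusRd"):  ("S", None),
    ("S", "BusRdX"): ("I", None),       # another cache wants to write
    ("M", "PrRd"):   ("M", None),
    ("M", "PrWr"):   ("M", None),
    ("M", "BusRd"):  ("S", "Flush"),    # downgrade and supply dirty data
    ("M", "BusRdX"): ("I", "Flush"),    # invalidate and supply dirty data
}


def step(state, event):
    """Return (next_state, bus_action) for one block in one cache."""
    return MSI[(state, event)]


if __name__ == "__main__":
    state = "I"
    for event in ("PrRd", "PrWr", "BusRd", "BusRdX"):
        state, action = step(state, event)
        print(event, "->", state, action)
```

MESI adds an Exclusive state so that a read miss with no other sharers can later be written without a second bus transaction.
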
 +===== Lecture 23 (12.12 Wed.) =====
 +  * Interconnection Network, Interconnect
 +  * Topology
 +  * Routing
 +  * Buffering and Flow Control
 +  * Switch/Router
 +  * Channel
 +  * Wire
 +  * Packet
 +  * Path
 +  * Bus
 +  * Mesh, 2D Mesh
 +  * Throttling
 +  * Oversubscription
 +  * Network Interface
 +  * Link
 +  * Node
 +  * Message
 +  * Flit
 +  * Direct/Indirect Network
 +  * Radix
 +  * Regular/Irregular Topology
 +  * Routing Distance
 +  * Diameter
 +  * Bisection Bandwidth
 +  * Congestion
 +  * Blocking/non-blocking Interconnect
 +  * Crossbar
 +  * Ring
 +  * Tree
 +  * Omega
 +  * Hypercube
 +  * Torus
 +  * Butterfly
 +  * Arbitration
 +  * Point-to-Point
 +  * Multistage Network
 +  * Hop
 +  * Circuit Switching
 +  * Packet Switching
 +  * Tree saturation
 +  * Deadlock
 +  * Circular dependency
 +  * Oblivious Routing
 +  * Adaptive Routing
 +  * Packet Format
 +  * Header
 +  * Payload
 +  * Error Code
 +  * Virtual Channel Flow Control
 +
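As one concrete instance of oblivious routing on a 2D mesh from the list above, here is a minimal sketch of dimension-order (XY) routing: a packet first travels along X until the column matches, then along Y. Coordinates as `(x, y)` tuples and the name `xy_route` are assumptions for illustration.

```python
def xy_route(src, dst):
    """Return the list of nodes visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:          # resolve the X offset completely
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:          # then resolve the Y offset
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    route = xy_route((0, 0), (2, 1))
    print(route, "hops:", len(route) - 1)
```

Because no packet ever turns from Y back to X, the circular dependency needed for deadlock cannot form in a mesh; the cost is that the route cannot adapt to congestion.
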
 +===== Lecture 24 (13.12 Thu.) =====
 +  * Load latency curve
 +  * Performance of interconnection networks
 +  * On-chip networks
 +  * Difference between off-chip and on-chip networks
 +  * Network buffers
 +  * Efficient routing
 +  * Advantages of on-chip interconnects
 +  * Pin constraints
 +  * Wiring resources
 +  * Disadvantages of on-chip interconnects
 +  * Energy/power constraint
 +  * Tradeoffs of interconnect design
 +  * Buffers in NoC routers
 +  * Bufferless routing
 +  * Flit-level routing
 +  * Deflection routing
 +  * Buffer and link energy consumption
 +  * Self-throttling
 +  * Livelock freedom problem
 +  * Golden packet for livelock freedom
 +  * Reassembly buffers
 +  * Packet retransmission
 +  * Packet scheduling
buzzword.1542809472.txt.gz · Last modified: 2019/02/12 16:33 (external edit)