buzzword
Differences
This shows you the differences between two versions of the page.
buzzword [2020/11/05 13:31] – [Lecture 13 (05.11 Thu.)] yaglikca | buzzword [2020/12/04 12:02] (current) – [Lecture 20 (03.12 Thu.)] Add Buzzwords for Lecture 21, 4/12 sjoao | ||
---|---|---|---|
Line 581: | Line 581: | ||
===== Lecture 13 (05.11 Thu.) ===== | ===== Lecture 13 (05.11 Thu.) ===== | ||
* Memory Interference | * Memory Interference | ||
+ | * Prioritization | ||
+ | * Data Mapping | ||
+ | * Core/Source Throttling | ||
+ | * Application Thread Scheduling | ||
+ | * Memory Service Guarantees | ||
* Quality of Service | * Quality of Service | ||
* QoS-Aware Memory Systems | * QoS-Aware Memory Systems | ||
Line 587: | Line 592: | ||
* PAR-BS | * PAR-BS | ||
* ATLAS Memory Scheduler | * ATLAS Memory Scheduler | ||
+ | * BLISS (Blacklisting Memory Scheduler) | ||
* Thread Cluster Memory Scheduling | * Thread Cluster Memory Scheduling | ||
* TCM | * TCM | ||
Line 593: | Line 599: | ||
* STFM | * STFM | ||
* FR-FCFS | * FR-FCFS | ||
- | * The Blacklisting Memory Scheduler | ||
- | * BLISS | ||
* Staged Memory Scheduling | * Staged Memory Scheduling | ||
* SMS | * SMS | ||
Line 606: | Line 610: | ||
* Data mapping | * Data mapping | ||
* Memory Channel Partitioning | * Memory Channel Partitioning | ||
- | * Core/source throttling | + | * Parallel Application Memory Scheduling |
* Fairness via Source Throttling | * Fairness via Source Throttling | ||
+ | ===== Lecture 14 (12.11 Thu.) ===== | ||
+ | * Target metric | ||
+ | * Theoretical proof | ||
+ | * Analytical modeling/ | ||
+ | * Abstraction | ||
+ | * Accuracy | ||
+ | * Workload | ||
+ | * RTL simulations | ||
+ | * Design choices | ||
+ | * Cycle-level accuracy | ||
+ | * Design space exploration | ||
+ | * Flexibility | ||
+ | * High-level simulations | ||
+ | * Low-level models | ||
+ | * Ramulator | ||
+ | * Modular | ||
+ | * Extensible | ||
+ | * IPC (instructions per cycle) | ||
+ | * 3D-stacked DRAM | ||
+ | * DDR3 | ||
+ | * GDDR5 | ||
+ | * HBM | ||
+ | * HMC | ||
+ | * Wide I/O | ||
+ | * LPDDR | ||
+ | * Spatial locality | ||
+ | * Bank-level parallelism | ||
+ | |||
+ | ===== Lecture 15 (13.11 Fri.) ===== | ||
+ | * Emerging memory technologies | ||
+ | * Charge memory | ||
+ | * Resistive memory technologies | ||
+ | * Phase Change Memory (PCM) | ||
+ | * STT-MRAM | ||
+ | * Memristor | ||
+ | * RRAM/ReRAM | ||
+ | * Non-volatile | ||
+ | * Multi-Level Cell PCM (MLC-PCM) | ||
+ | * Endurance | ||
+ | * Reliability | ||
+ | * Intel Optane Memory | ||
+ | * 3D-XPoint Technology | ||
+ | * Read Asymmetry | ||
+ | * Magnetic Tunnel Junction (MTJ) device | ||
+ | * Hybrid main memory | ||
+ | * DRAM buffer/DRAM cache | ||
+ | * Data placement | ||
+ | * Row buffer | ||
+ | * Memory-Level Parallelism (MLP) | ||
+ | * Translation Lookaside Buffer (TLB) | ||
+ | * Page Table | ||
+ | * In-memory bulk bitwise operations | ||
+ | * In-memory crossbar array operations | ||
+ | * Analog computation | ||
+ | * Digital to Analog Converter (DAC) | ||
+ | * Analog to Digital Converter (ADC) | ||
+ | * NVM-based PIM system | ||
+ | |||
+ | |||
+ | ===== Lecture 16a (19.11 Thu.) ===== | ||
+ | * Emerging memory technology | ||
+ | * Flash memory | ||
+ | * Memory-centric system design | ||
+ | * Phase change memory | ||
+ | * Charge memory | ||
+ | * Resistive memory | ||
+ | * Multi-level cell | ||
+ | * Spin-Transfer Torque Magnetic RAM (STT-MRAM) | ||
+ | * Memristors | ||
+ | * Resistive RAM (RRAM or ReRAM) | ||
+ | * Intel 3D XPoint | ||
+ | * Capacity-latency trade-off | ||
+ | * Capacity-reliability trade-off | ||
+ | * Endurance | ||
+ | * Magnetic Tunnel Junction (MTJ) | ||
+ | * Hybrid memory | ||
+ | * Write filtering | ||
+ | * Data placement | ||
+ | * Data access pattern | ||
+ | * Row-buffer locality | ||
+ | * Overall system performance impact | ||
+ | * Memory-Level Parallelism (MLP) | ||
+ | * Utility-based hybrid memory management | ||
+ | * Hybrid Memory Systems | ||
+ | * Large (DRAM) Cache | ||
+ | * TIMBER | ||
+ | * Two-Level Memory/ | ||
+ | * Volatile data | ||
+ | * Persistent data | ||
+ | * Single-level store | ||
+ | * Unified Memory and storage | ||
+ | * The Persistent Memory Manager (PMM) | ||
+ | * ThyNVM | ||
+ | |||
+ | ===== Lecture 16b (19.11 Thu.) ===== | ||
+ | * Heterogeneity | ||
+ | * Asymmetry in design | ||
+ | * Amdahl' | ||
+ | * Synchronization overhead | ||
+ | * Load imbalance overhead | ||
+ | * Resource sharing overhead | ||
+ | * IBM Power4 | ||
+ | * IBM Power5 | ||
+ | * Niagara Processor | ||
+ | * Performance vs. parallelism | ||
+ | * Asymmetric Chip Multiprocessor (ACMP) | ||
+ | * MorphCore | ||
+ | ===== Lecture 17 (20.11 Fri.) ===== | ||
+ | *Amdahl' | ||
+ | *Parallelizable fraction of a program | ||
+ | *Serial bottleneck | ||
+ | *Synchronization overhead | ||
+ | *Load imbalance overhead | ||
+ | *Resource sharing overhead | ||
+ | *Critical section | ||
+ | *Asymmetric multi-core (ACMP) | ||
+ | *Symmetric CMP (SCMP) | ||
+ | *Accelerated Critical Sections (ACS) | ||
+ | *Selective Acceleration of Critical Sections (SEL) | ||
+ | *Critical Section Request Buffer (CSRB) | ||
+ | *Cache misses for private data | ||
+ | *Cache misses for shared data | ||
+ | *Equal-area comparison | ||
+ | *Bottleneck Identification and Scheduling (BIS) | ||
+ | *Thread waiting cycles (TWC) | ||
+ | *Bottleneck Table (BT) | ||
+ | *Scheduling Buffers (SB) | ||
+ | *Acceleration Index Tables (AIT) | ||
+ | *The critical path | ||
+ | *Feedback-Directed Pipelining (FDP) | ||
+ | *Comprehensive fine-grained bottleneck acceleration | ||
+ | *Lagging threads | ||
+ | *Multiple applications | ||
+ | *Criticality of code segments | ||
+ | *Utility-Based Acceleration (UBA) | ||
+ | *Global criticality of the segment | ||
+ | *Fraction of execution time spent on segment | ||
+ | *Local speedup of the segment | ||
+ | *Data marshaling | ||
+ | *Staged execution model | ||
+ | *Segment spawning | ||
+ | *Producer-Consumer Pipeline Parallelism | ||
+ | *Locality of inter-segment data | ||
+ | *Generator instruction | ||
+ | *Marshal buffer | ||
+ | *Pipeline parallelism | ||
+ | *Aggressive stream prefetcher | ||
+ | *Energy expended per instruction (EPI) | ||
+ | *Dynamic voltage frequency scaling (DVFS) | ||
+ | |||
+ | ===== Lecture 18 (26.11 Thu.) ===== | ||
+ | |||
+ | * Memory latency | ||
+ | * DRAM Latency | ||
+ | * Latency Reduction | ||
+ | * Latency Tolerance | ||
+ | * Latency Hiding | ||
+ | * Caching | ||
+ | * Prefetching | ||
+ | * Multithreading | ||
+ | * Out-of-order Execution | ||
+ | * Software prefetching | ||
+ | * Hardware prefetching | ||
+ | * Execution-based prefetchers | ||
+ | * Next-Line Prefetchers | ||
+ | * Stride Prefetchers | ||
+ | * Stream Buffers | ||
+ | * Feedback-Directed Prefetching | ||
+ | * Content Directed Prefetching | ||
+ | |||
+ | |||
+ | ===== Lecture 19a (27.11 Fri.) ===== | ||
+ | |||
+ | * Execution-based Prefetcher | ||
+ | * Speculative thread | ||
+ | * Thread-Based Pre-Execution | ||
+ | * Runahead Execution | ||
+ | * Address-Value Delta (AVD) Prediction | ||
+ | * Multi-Core Issues in Prefetching | ||
+ | * Feedback Directed Prefetching | ||
+ | * Bandwidth-Efficient Prefetching | ||
+ | * Coordinated Prefetcher Control | ||
+ | * Prefetching in GPUs | ||
+ | |||
+ | ===== Lecture 19b (27.11 Fri.) ===== | ||
+ | |||
+ | |||
+ | * Multiprocessing | ||
+ | * Memory Consistency | ||
+ | * Cache Coherence | ||
+ | * SISD | ||
+ | * SIMD | ||
+ | * MISD | ||
+ | * MIMD | ||
+ | * Parallelism | ||
+ | * Instruction Level Parallelism | ||
+ | * Data Parallelism | ||
+ | * Task Level Parallelism | ||
+ | * Loosely Coupled Multiprocessors | ||
+ | * Tightly Coupled Multiprocessors | ||
+ | * Hardware-based Multithreading | ||
+ | * Parallel Speedup | ||
+ | * Superlinear Speedup | ||
+ | * Utilization | ||
+ | * Redundancy | ||
+ | * Efficiency | ||
+ | * Amdahl’s Law | ||
+ | * Sequential Bottleneck | ||
+ | * Synchronization | ||
+ | * Load Imbalance | ||
+ | * Resource Contention | ||
+ | * Critical Sections | ||
+ | * Barriers | ||
+ | * Stages of Pipelined Programs | ||
+ | |||
+ | ===== Lecture 20 (03.12 Thu.) ===== | ||
+ | |||
+ | * Memory ordering | ||
+ | * Memory consistency | ||
+ | * Parallel computer architecture | ||
+ | * Multiprocessor operation | ||
+ | * MIMD (multiple instruction, | ||
+ | * Performance-correctness trade-off | ||
+ | * Cache coherence | ||
+ | * Ordering of operations | ||
+ | * Local ordering | ||
+ | * Global ordering | ||
+ | * Memory fence instruction | ||
+ | * Out-of-order execution | ||
+ | * Mutual exclusion | ||
+ | * Protecting shared data | ||
+ | * Critical section | ||
+ | * Sequential consistency | ||
+ | * Weaker memory consistency | ||
+ | * Dataflow processor | ||
+ | |||
+ | |||
+ | ===== Lecture 21 (04.12 Fri.) ===== | ||
+ | |||
+ | * Cache coherence | ||
+ | * Memory consistency | ||
+ | * Shared memory model | ||
+ | * Software coherence | ||
+ | * Coarse-grained (page-level) | ||
+ | * Non-cacheable | ||
+ | * Fine-grained (cache flush) | ||
+ | * Hardware coherence | ||
+ | * Valid/ | ||
+ | * Write propagation | ||
+ | * Write serialization | ||
+ | * Update vs. Invalidate | ||
+ | * Snoopy bus | ||
+ | * Directory | ||
+ | * Exclusive bit | ||
+ | * Directory optimizations (bypassing) | ||
+ | * Snoopy cache | ||
+ | * Shared bus | ||
+ | * VI protocol | ||
+ | * MSI (Modified, Shared, Invalid) | ||
+ | * Exclusive state | ||
+ | * MESI (Modified, Exclusive, Shared, Invalid) | ||
+ | * Illinois Protocol (MESI) | ||
+ | * Broadcast | ||
+ | * Bus request | ||
+ | * Downgrade/ | ||
+ | * Snoopy invalidation | ||
+ | * Cache-to-cache transfer | ||
+ | * Writeback | ||
+ | * MOESI (Modified, Owned, Exclusive, Shared, Invalid) | ||
+ | * Directory coherence | ||
+ | * Race conditions | ||
+ | * Totally-ordered interconnect | ||
+ | * Directory-based protocols | ||
+ | * Set inclusion test | ||
+ | * Linked list | ||
+ | * Bloom filters | ||
+ | * Contention resolution | ||
+ | * Ping-ponging | ||
+ | * Synchronization | ||
+ | * Shared-data-structure | ||
+ | * Token Coherence | ||
+ | * Coherence for NDAs | ||
+ | * Optimistic execution | ||
+ | * Signature | ||
+ | * Commit/ |
buzzword.1604583108.txt.gz · Last modified: 2020/11/05 13:31 by yaglikca