buzzword

Differences

This shows you the differences between two versions of the page.

buzzword [2019/11/14 19:56] – [Lecture 15 (14.11 Thu.)] firtinac
buzzword [2019/12/14 20:25] (current) – external edit 127.0.0.1
Line 253: Line 253:
     * BitWeaving
   * Computing Architectures with Minimal Data Movement
-  * Mindset on reviewing manuscripts and scientific process  
-  * Suggestions on critical paper review 
-  * Mindset issues everywhere 
-    * Bandwidth bottleneck in Zurich Airport 
-    * Wrong methodology in design space exploration: Building bridges across Manhattan and Brooklyn 
   * 3D-Stacked Logic+Memory
   * Logic Layer
Line 598: Line 593:
   * Core/source throttling
   * Fairness via Source Throttling
 +===== Lecture 16a (15.11 Fri.) =====
 +  * Shared resource contention
 +  * Slowdown estimation
 +  * Application/thread scheduling
 +  * Multi-core/many-core systems
 +  * Application/data mapping
 +  * Application prioritization
 +  * On-chip communication
 +  * Communication distance
 +  * Congestion in Network-on-Chip (NoC)
 +  * Spatial task scheduling
 +  * Clustering
 +  * Load balancing
 +  * Isolation
 +  * Radial mapping
 +  * Distributed Resource Management (DRM)
 +  * Operating-system-level metric
 +  * Microarchitecture-level metric
 +  * Architecture-aware DRM
 +  * Machine learning-based mapping/scheduling
 +
 +===== Lecture 16b (15.11 Fri.) =====
 +  * Emerging memory technology
 +  * Flash memory
 +  * Memory-centric system design
 +  * Phase change memory
 +  * Charge memory
 +  * Resistive memory
 +  * Multi-level cell
 +  * Spin-Transfer Torque Magnetic RAM (STT-MRAM)
 +  * Memristors
 +  * Resistive RAM (RRAM or ReRAM)
 +  * Intel 3D XPoint
 +  * Capacity-latency trade-off
 +  * Capacity-reliability trade-off
 +  * Endurance
 +  * Magnetic Tunnel Junction (MTJ)
 +  * Hybrid memory
 +  * Write filtering
 +  * Data placement
 +  * Data access pattern
 +  * Row-buffer locality
 +  * Overall system performance impact
 +  * Memory-Level Parallelism (MLP)
 +  * Utility-based hybrid memory management
 +
 +===== Lecture 17 (21.11 Thu.) =====
 +  * SIMD
 +  * Multiply-accumulate
 +  * Thread block
 +  * Stream processor
 +  * Tensor core
 +  * Neural network training
 +  * Systolic arrays
 +  * Fine-grain multithreading
 +  * Warp
 +  * GPU programming
 +  * General purpose processing on GPU
 +  * GPU kernels
 +  * CUDA
 +  * OpenCL
 +  * SPMD
 +  * Grid, Block, Threads
 +  * Row major layout
 +  * Warp scheduler
 +  * Coalesced memory accesses
 +  * AoS (Array of Structures)
 +  * SoA (Structure of Arrays)
 +  * Tiling
 +  * Bank conflicts
 +  * Padding
 +  * Randomized mapping
 +  * Hash functions
 +  * Divergence
 +  * Vector reduction
 +  * Divergence-free mapping
 +  * Atomic operations
 +  * PTX
 +  * SASS
 +  * Synchronous and asynchronous transfers
 +  * Stream
 +  * Collaborative Computing
 +  * Unified memory space
 +  * Collaborative patterns
 +
 +===== Lecture 18 (22.11 Fri.) =====
 +  * Instruction prefetching
 +  * Data prefetching
 +  * Memory Hierarchy
 +  * Memory Read/Write Latency
 +  * Memory Bandwidth
 +  * Memory Footprint
 +  * Caches as Bandwidth Filters
 +  * Little's Law
 +  * Occupancy
 +  * Latency
 +  * Throughput
 +  * Queueing Resources
 +  * Compulsory Miss
 +  * Demand Miss
 +  * Spatial and Temporal Locality
 +  * Fetch Granule
 +  * Hardware Prefetching
 +  * Software Prefetch Instruction
 +  * Code Reordering
 +  * Speculative Execution
 +  * Loop Unrolling
 +  * Load Hoisting
 +  * Prefetch Degree
 +  * Three Prefetch Metrics
 +    * Accuracy
 +    * Coverage
 +    * Timeliness
 +  * Heuristic-Based Next-N-Line Prefetching
 +  * History-Based Target Line Prefetching
 +  * Heuristic-Based Wrong-Path Prefetching
 +  * Hybrid Prefetching
 +  * Branch Predictor
 +  * Branch Target Buffer (BTB)
 +  * Next-Line Prefetchers
 +  * Stride Prefetchers
 +  * Cache-Block Address Based Stride Prefetching
 +  * Correlation-Based Prefetchers
 +  * Content-Directed Prefetchers
 +  * Precomputation or Execution-Based Prefetchers
 +  * Address Correlation Based Prefetching
 +  * Markov Model and Markov Prefetchers
 +  * Prefetch Confidence
 +  * Hybrid Hardware Prefetchers
 +  * Execution-based Prefetchers
 +  * Speculative Thread
 +  * Feedback-Directed Prefetcher
 +  * Prefetcher Throttling
 +===== Lecture 19 (28.11 Thu.) =====
 +
 +  * Hybrid Memory Systems 
 +  * Large (DRAM) Cache
 +  * TIMBER 
 +  * Two-Level Memory/Storage model
 +  * Volatile data 
 +  * Persistent data 
 +  * Single-level store
 +  * Unified Memory and storage
 +  * The Persistent Memory Manager (PMM)
 +  * ThyNVM 
 +  * Heterogeneity 
 +  * Asymmetry in design
 +  * Amdahl's Law 
 +  * Synchronization overhead 
 +  * Load imbalance overhead
 +  * Resource sharing overhead
 +  * IBM Power4
 +  * IBM Power5
 +  * Niagara Processor
 +  * Performance vs. parallelism
 +  * Asymmetric Chip Multiprocessor (ACMP)
 +  * MorphCore
 +
 +===== Lecture 20 (29.11 Fri.) =====
 +  * Heterogeneity 
 +  * Pointer-chasing
 +  * Critical section
 +  * Asymmetry
 +  * Accelerated
 +  * Data Marshalling
 +  * False Serialization
 +  * Shared Data 
 +  * Private Data
 +  * Amdahl’s Law
 +  * Barriers
 +  * Identification
 +  * Migration
 +  * Feedback Directed Pipelining
 +  * Staged Execution
 +  * Inter-segment Data
 +  * Pipeline Parallelism
 +  * Dynamic heterogeneity
 +
 +===== Lecture 21a (5.12 Thu.) =====
 +  * Persistent memory
 +  * Crash consistency
 +  * Checkpointing
 +  * Flynn's taxonomy of computers
 +  * Parallelism
 +  * Performance
 +  * Power consumption
 +  * Cost efficiency
 +  * Dependability
 +  * Instruction level parallelism
 +  * Data parallelism
 +  * Task level parallelism
 +  * Utilization
 +  * Redundancy
 +  * Efficiency
 +  * Amdahl's law
 +  * Bottlenecks in parallel portion
 +  * Multiprocessor
 +  * Loosely coupled multiprocessors
 +  * Tightly coupled multiprocessors
 +  * Shared global memory address space
 +  * Shared memory synchronization
 +  * Interconnects
 +  * Programming issues in tightly coupled multiprocessor
 +  * Sublinear speedup
 +  * Linear speedup
 +  * Superlinear speedup
 +  * Shared resource management
 +  * Unfair comparison
 +  * Cache/memory effect
 +
 +  
 +
 +===== Lecture 21b (5.12 Thu.) =====
 +
 +  * Memory consistency / memory ordering
 +  * Ordering of operations
 +  * Local ordering
 +  * Global ordering 
 +  * Sequential consistency
 +  * Weaker memory consistency
 +  * Memory fence instructions
 +  * Consequences of Sequential Consistency
 +  * Issues with Sequential Consistency
 +  * Global order requirement 
 +  * Aggressiveness
 +  * Out-of-order execution 
 +  * Higher performance
 +  * Burden on the programmer
 +  * Mutual exclusion
 +  * Protecting shared data
 +  * Ease of debugging
 +  * Correctness
 +  * MIMD processor
 +  * Dataflow processor
 +  
 +
 +===== Lecture 22 (6.12 Fri.) =====
 +
 +  * Cache coherence
 +  * Memory consistency
 +  * Shared memory model
 +  * Software coherence
 +    * Coarse-grained (page-level)
 +    * Non-cacheable
 +    * Fine-grained (cache flush)
 +  * Hardware coherence
 +  * Valid/invalid
 +  * Write propagation
 +  * Write serialization
 +  * Update vs. Invalidate
 +  * Snoopy bus
 +  * Directory
 +    * Exclusive bit
 +  * Directory optimizations (bypassing)
 +  * Snoopy cache
 +  * Shared bus
 +  * VI protocol
 +  * MSI (Modified, Shared, Invalid)
 +  * Exclusive state
 +  * MESI (Modified, Exclusive, Shared, Invalid)
 +  * Illinois Protocol (MESI)
 +  * Broadcast
 +  * Bus request
 +  * Downgrade/upgrade
 +  * Snoopy invalidation
 +  * Cache-to-cache transfer
 +  * Writeback
 +  * MOESI (Modified, Owned, Exclusive, Shared, Invalid)
 +  * Directory coherence
 +  * Race conditions
 +  * Totally-ordered interconnect
 +  * Directory-based protocols
 +  * Set inclusion test
 +  * Linked list
 +  * Bloom filters
 +  * Contention resolution
 +  * Ping-ponging
 +  * Synchronization
 +  * Shared-data-structure
 +  * Token Coherence
 +  * Coherence for NDAs
 +  * Optimistic execution
 +  * Signature
 +  * Commit/re-execute
 +
 +===== Lecture 23 (12.12 Thu.) =====
 +  * Interconnects 
 +  * Cache coherence
 +  * Interconnect networks: 
 +      * Topology
 +      * Routing
 +      * Buffering and flow control
 +          * Oversubscription of routers
 +  * Terminology:
 +      * Network Interface
 +      * Link
 +      * Switch/router
 +      * Channel
 +      * Node
 +      * Message
 +      * Packet
 +      * Flit
 +      * Direct/Indirect network
 +  * Properties of a Network Topology:
 +      * Regular/Irregular
 +      * Routing distance
 +      * Diameter
 +      * Average distance
 +      * Bisection Bandwidth
 +      * Blocking/non-blocking. Rearrangeable non-blocking
 +  * Topologies:
 +      * Bus
 +      * P2P
 +      * Crossbar
 +      * Ring
 +      * Tree
 +      * Omega
 +      * Hypercube
 +      * Mesh
 +      * Torus
 +      * Butterfly
 +  * Cost, latency, contention, energy, bandwidth, overall performance
 +  * Circuit switching network
 +  * Multistage network 
 +  * Fetch-and-add
 +  * Unidirectional Ring
 +  * Bidirectional rings
 +  * Hierarchical rings
 +  * Mesh: asymmetry at the edges
 +  * Torus
 +  * H-tree
 +  * Fat-tree
 +  * Hypercube, Cosmic Cube
 +  * Routing mechanism: Arithmetic, Source-based, Table-based lookup
 +  * Types of routing algorithm: deterministic, oblivious, adaptive
 +  * Deadlock
 +  * Oblivious routing
 +  * Adaptive Routing: minimal adaptive, non-minimal adaptive
 +  * Flow Control
 +  * Handling contention: buffer, drop or misroute
 +  * Flow control methods:
 +      * Circuit switching
 +      * Bufferless
 +      * Store and forward
 +      * Virtual cut through
 +      * Wormhole
 +  * Performance and congestion at high loads
 +  * Store-and-forward
 +  * Cut-through flow control
 +  * Wormhole flow control
 +  * Head of Line Blocking
 +
 +===== Lecture 24 (13.12 Fri.) =====
 +  * Load latency curve
 +  * Performance of interconnection networks
 +  * On-chip networks
 +  * Difference between off-chip and on-chip networks
 +  * Network buffers
 +  * Efficient routing
 +  * Advantages of on-chip interconnects
 +  * Pin constraints
 +  * Wiring resources
 +  * Disadvantages of on-chip interconnects
 +  * Energy/power constraint
 +  * Tradeoffs of interconnect design
 +  * Buffers in NoC routers
 +  * Bufferless routing
 +  * Flit-level routing
 +  * Deflection routing
 +  * Buffer and link energy consumption
 +  * Self-throttling
 +  * Livelock freedom problem
 +  * Golden packet for livelock freedom
 +  * Reassembly buffers
 +  * Packet retransmission
 +  * Packet scheduling
buzzword.1573761372.txt.gz · Last modified: 2019/11/14 19:56 by firtinac