Buzzwords

Buzzwords are terms that are mentioned during lecture which are particularly important to understand thoroughly. This page tracks the buzzwords for each of the lectures and can be used as a reference for finding gaps in your understanding of course material.

Lecture 1 (19.09 Thu.)

  • Computer Architecture
  • Redundancy
  • Bahnhof Stadelhofen
  • Santiago Calatrava
  • Oculus
  • Design constraints
  • Fallingwater
  • Frank Lloyd Wright
  • Sustainability
  • Evaluation criteria for designs
    • Functionality
    • Reliability
    • Space requirement
    • Expandability
  • Principled design
  • Role of the (Computer) Architect
  • Systems programming
  • Digital design
  • Levels of transformation
    • Algorithm
    • System software
    • Instruction Set Architecture (ISA)
    • Microarchitecture
    • Logic
  • Abstraction layers
  • Hamming code
  • Hamming distance
  • User-centric view
  • Productivity
  • Multi-core systems
  • Caches
  • DRAM memory controller
  • DRAM banks
  • Energy efficiency
  • Memory performance hog
  • Slowdown
  • Consolidation
  • QoS guarantees
  • Unfairness
  • Row decoder
  • Column address
  • Row buffer hit/miss
  • Row buffer locality
  • FR-FCFS
  • Stream/Random access patterns
  • Memory scheduling policies
  • Scheduling priority
  • DRAM cell
  • Access transistor
  • DRAM refresh
  • DRAM retention time
  • Variable retention time
  • Retention time profile
  • Manufacturing process variation
  • Bloom filter
  • Data pattern dependence
  • Error Correcting Codes (ECC)

Lecture 2a (20.09 Fri.)

  • DRAM refresh
  • DRAM cell
  • Wordline
  • Bitline
  • Refresh overhead
  • Retention time
  • Manufacturing process variation
  • Data Pattern Dependence (DPD)
  • Variable Retention Time (VRT)
  • DRAM retention failures
  • Bloom filter
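
As a reminder of how a Bloom filter behaves, here is a minimal sketch in Python. The class name, the bit-vector size, and the use of SHA-256 to derive hash positions are illustrative assumptions, not the design discussed in lecture. The property it is meant to show is that membership tests can return false positives but never false negatives, which is what makes the structure usable for compactly tracking a set such as rows that need more frequent refresh.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes and hash choice are assumptions for illustration."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit vector

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


weak_rows = BloomFilter()
for row in (17, 4096, 70001):
    weak_rows.add(row)
print(17 in weak_rows)  # an inserted row is always reported as present
```

Items cannot be removed from this simple variant, because clearing a bit could erase evidence of another item that hashed to the same position.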

Lecture 3 (26.09 Thu.)

  • Fundamentally Secure/Reliable/Safe Architectures
  • Fundamentally Energy-Efficient Architectures
  • Memory-centric (Data-centric) Architectures
  • Fundamentally Low-Latency Architectures
  • Architectures for Genomics, Medicine, Health
  • Genome Sequence Analysis
  • Reference Genome
  • Read Mapping
  • Read Alignment/Verification
  • Edit Distance
  • In-Memory DNA Sequence Analysis
  • Memory Bottleneck
  • Main Memory
  • Storage (SSD/HDD)
  • The Memory Capacity Gap
  • DRAM Capacity, Bandwidth & Latency
  • Flash Memory
  • RowHammer
  • Non-Volatile Memory (NVM) (e.g., PCM, STT-MRAM, ReRAM, 3D XPoint)
  • Emerging Memory Technologies
  • 3D-Stacked DRAM
  • Hybrid Main Memory
  • System-Memory Co-Design
  • Microarchitecture
  • Memory-Centric System Design
  • Memory Interference
  • Memory Controllers

Lecture 4a (27.09 Fri.)

  • Memory problem
  • System-memory co-design
  • Heterogeneous memories
  • Memory scaling
  • Memory-centric system design
  • Waste management
  • Reliability
  • Intelligent memory controllers
  • Computations close to data
  • Emerging memory technologies
  • Resistive memory technologies
  • Non-volatile
  • Phase Change Memory (PCM)
  • 3D XPoint
  • Hybrid Memories
  • Error Tolerance
  • Tolerant data
  • Vulnerable data
  • ECC
  • Heterogeneous-Reliability Memory
  • Memory Interference
  • QoS-aware memory
  • Fairness
  • SLA (Service Level Agreement)
  • Performance loss
  • Resource partitioning/prioritization
  • DRAM controllers
  • Machine learning
  • DRAM scaling

Lecture 4b (27.09 Fri.)

  • Rowhammer
  • Security
  • Safety
  • Bit flip
  • Maslow Hierarchy
  • Charge-based memory
  • Data retention
  • Flash memory
  • Disturbance errors
  • Hammered row
  • Victim row
  • Electrical interference
  • Cell-to-cell coupling
  • Security attack
  • Kernel privileges
  • Page Table Entry (PTE)
  • Electromagnetic coupling
  • Conductive bridges
  • Hot-Carrier injection
  • Aggressor row
  • Refresh rate
  • Data pattern
  • Victim cells
  • Weak cells
  • ECC
  • SECDED
  • Variable retention time
  • Rowhammer solutions
  • PARA (Probabilistic Adjacent Row Activation)
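
The idea behind PARA can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function name `para_activate`, the default probability, and the choice to refresh both neighbors at once are illustrative, and a real memory controller would issue refresh commands rather than return a list.

```python
import random


def para_activate(row, num_rows, p=0.001, rng=random):
    """Return the neighbor rows to refresh when `row` is activated.

    Simplified model (assumption): with probability p, refresh both
    physically adjacent rows; otherwise refresh nothing.
    """
    refreshed = []
    if rng.random() < p:
        for neighbor in (row - 1, row + 1):
            if 0 <= neighbor < num_rows:  # skip neighbors past the array edge
                refreshed.append(neighbor)
    return refreshed


# Hammering one aggressor row many times: the more activations, the more
# likely its victim rows were refreshed at least once along the way.
hits = sum(bool(para_activate(100, 1024, p=0.01)) for _ in range(10_000))
print(hits)  # varies from run to run
```

The scheme keeps no per-row counters: protection comes from the fact that an aggressor row must be activated many times, so the chance that its neighbors are never refreshed shrinks geometrically with the number of activations.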

Lecture 5 (03.10 Thu.)

  • Genome analysis
  • DNA
  • Cell information
  • Genetic content
  • Human genome
  • DNA genotypes
  • RNA
  • Protein / Phenotypes
  • Adenine (A), Thymine (T), Guanine (G), Cytosine (C)
  • Supercoiled
  • Chromosomes
  • HeLa cells (Henrietta Lacks)
  • Reference genome
  • Sequence alignment
  • High-throughput sequencing (HTS)
  • Read mapping
  • Hash based seed-and-extend
  • K-mers
  • Burrows-Wheeler Transform
  • Ferragina-Manzini Index
  • Edit distance
  • Match / Mismatch
  • Deletion / Insertion / Substitution
  • Dynamic programming
  • MrFAST
  • Verification
  • Seed filtering
  • Adjacency filtering
  • Cheap k-mer selection
  • FastHASH
  • Pre-alignment filtering
  • Hamming distance
  • Shifted Hamming distance
  • Needleman-Wunsch
  • Neighborhood map
  • GateKeeper
  • Magnet
  • Slider
  • GRIM-filter
  • Apollo
  • Hercules
  • 3D-stacked memory (HMC)
  • Nanopore genome assembly
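
To make the difference between Hamming distance and edit distance concrete, here is a small Python sketch. It uses the textbook dynamic-programming recurrence for edit distance with unit costs for insertion, deletion, and substitution; the function names are illustrative and are not taken from any of the tools listed above.

```python
def hamming_distance(a, b):
    """Number of mismatching positions; defined only for equal lengths."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions (unit cost)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


read, ref = "ACGTACGT", "CGTACGTA"
print(hamming_distance(read, ref))  # 8
print(edit_distance(read, ref))     # 2
```

A single shift makes every position mismatch under Hamming distance, while the edit distance stays small. Filters that compare shifted copies of the read aim to catch this case cheaply before running the full dynamic-programming verification.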

Lecture 6 (04.10 Fri.)

  • RowHammer
  • Security implications
  • Probabilistic Adjacent Row Activation (PARA)
  • Intelligent Controller
  • NAND Flash
  • Retention Time
  • Data Pattern Dependence (DPD)
  • Variable Retention Time (VRT)
  • Architecting for Security
  • Byzantine Failures
  • Computation in Memory
  • In-Memory Computation
  • Data Movement Bottlenecks
  • Hybrid Memory Cube (HMC)
  • Bulk Data Copy
  • Bulk Copy Initialization
  • RowClone
  • Inter Subarray Copy
  • Inter-Bank Copy
  • Memory as an Accelerator

Lecture 7 (10.10 Thu.)

  • Computation in Memory
  • Processing in Memory
  • Minimally Changing Memory Chips
  • 3D-Stacked Memory
  • RowClone
  • Memory as an Accelerator
  • In-memory bulk bitwise operations
    • Ambit
    • Destructive reads
    • Triple row activation
    • Majority Function
    • Dual contact cell
    • Concurrent addition in space and in time
    • Bit-serial operations
      • Connection machine
    • Bitmap Index
    • BitWeaving
  • Computing Architectures with Minimal Data Movement
  • Mindset on reviewing manuscripts and scientific process
  • Suggestions on critical paper review
  • Mindset issues everywhere
    • Bandwidth bottleneck in Zurich Airport
    • Wrong methodology in design space exploration: Building bridges across Manhattan and Brooklyn
  • 3D-Stacked Logic+Memory
  • Logic Layer
  • Hybrid Memory Cube
  • High-Bandwidth Memory, Wide-IO
  • In-Memory Graph Processing
  • Key Bottlenecks in Graph Processing
  • Tesseract System for Graph Processing
  • Crossbar network
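
The link between triple row activation and the majority function can be illustrated with ordinary bitwise operations. In the sketch below, Python integers stand in for DRAM rows, and the function names and the `width` parameter are assumptions made for the example. It shows only the logic: a bitwise majority of three rows yields AND when the third row is all zeros and OR when it is all ones.

```python
def majority(a, b, c):
    """Bitwise majority: each result bit is 1 if at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def and_via_majority(a, b):
    # Control row of all zeros: majority(a, b, 0) == a & b
    return majority(a, b, 0)


def or_via_majority(a, b, width):
    # Control row of all ones: majority(a, b, 1...1) == a | b
    all_ones = (1 << width) - 1
    return majority(a, b, all_ones)


a, b = 0b1100, 0b1010
print(bin(and_via_majority(a, b)))    # 0b1000
print(bin(or_via_majority(a, b, 4)))  # 0b1110
```

Because activating the rows is destructive to their contents, a real design operates on copies of the operands; the sketch ignores that detail.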

Lecture 8 (11.10 Fri.)

  • Processing-in-memory
  • 3D-stacked memory
  • Processing-in-Memory (PIM)
  • 2.5D Integration
  • Graph Processing
  • Tesseract
  • Accelerating GPU Execution
  • Remote Function Call
  • Data movement bottleneck
  • Google Workloads
  • Chrome Tab Switching
  • Function offloading
  • Transparent Offloading Mechanism (TOM)
  • Pointer Chasing
  • CoNDA
  • In-Memory Pointer-Chasing Accelerator (IMPICA)
  • PIM-Enabled Instructions (PEI)
  • LazyPIM
  • GRIM Filter

Lecture 9a (17.10 Thu.)

  • Target metric
  • Theoretical proof
  • Analytical modeling/estimation
  • Abstraction
  • Accuracy
  • Workload
  • RTL simulations
  • Design choices
  • Cycle-level accuracy
  • Design space exploration
  • Flexibility
  • High-level simulations
  • Low-level models
  • Ramulator
  • Modular
  • Extensible
  • IPC (instructions per cycle)
  • 3D-stacked DRAM
  • DDR3
  • GDDR5
  • HBM
  • HMC
  • Wide I/O
  • LPDDR
  • Spatial locality
  • Bank-level parallelism

Lecture 9b (17.10 Thu.)

  • Data-centric architecture
  • Low latency memory
  • Low energy memory
  • Memory contention
  • QoS problem
  • Caching
  • Prefetching
  • Multithreading
  • Out-of-order execution
  • Runahead execution
  • Instruction Window
  • Speculative execution
  • DRAM Module
  • DRAM Chip
  • DIMM
  • Bank
  • Subarray
  • Sense Amplifier
  • Row buffer
  • I/O logic
  • Cross-coupled inverters
  • SRAM
  • Access transistor
  • Enable signal
  • DRAM cell
  • Bitline
  • Memory channel
  • Activate
  • Precharge
  • Isolation transistor
  • Near/far segments
  • Profile-based page mapping
  • Hardware-managed cache
  • LRU
  • Inter-segment migration
  • Page Fault

Lecture 10 (18.10 Fri.)

  • Long memory latency
  • Tiered-latency DRAM
  • Bulk data movement
  • Inter-subarray copy
  • Isolation transistors
  • Row buffer movement (RBM)
  • Variable latency DRAM (VILLA)
  • Linked precharge (LIP)
  • Copy row substrate (CROW)
  • Multiple row activation
  • Subarray-level parallelism (SALP)
  • Bank conflicts
  • Row decoder
  • Global structures (global decoder, global row buffer, global bitlines)
  • Per-subarray latches
  • Designated latches
  • DRAM timing parameters
  • “Fixed latency mindset”
  • Process variation
  • Sensing, restore, precharge
  • Activation errors
  • Spatial latency variation
  • Spatial distribution of failures
  • DRAM aging
  • Systematic variation in DRAM cells
  • Dynamic profiling
  • Voltage reduction

Lecture 11 (24.10 Thu.)

  • PUF: Physical Unclonable Function
  • Challenge-response protocol
  • Trusted and untrusted devices
  • Device authentication
  • Runtime-accessible PUFs
  • Repeatability
  • Diffuseness
  • Uniform randomness
  • DRAM Latency PUF
  • DRAM Retention PUF
  • TRNG: True Random Number Generator
  • Sense Amplification
  • tRCD: Activation latency
  • D-RaNGe
  • RNG Cell
  • NIST statistical test suite
  • DRAM Command Scheduling RNG
  • Retention-based TRNGs
  • Start-up Values as Random Numbers
  • VOLTRON
  • Voltage reduction
  • DDR3L, LPDDR4
  • Dynamic Power
  • Activation latency
  • Spatial locality of Voltage reduction induced errors
  • Memory intensity
  • Memory stall time
  • Memory DVFS (Dynamic Voltage and Frequency Scaling)
  • EDEN
  • Approximate computing
  • Approximate DRAM
  • Deep neural networks (DNN)
  • DNN training
  • DNN inference
  • DNN Weights
  • Input Feature Maps (IFM)
  • Output Feature Maps (OFM)
  • Layer
  • Convolutional Layer
  • DNN Error Tolerance
  • Bit Error Rate (BER)
  • Retraining
  • DNN Accuracy
  • Accuracy collapse

Lecture 12 (25.10 Fri.)

  • EIN: Error INference
  • ECC: error correction code
  • Unstandardised, invisible ECC
  • Post-correction, pre-correction
  • Recover pre-correction information
  • Deliberately induce bit-flips
  • Error distribution
  • Error characteristics comparison
  • Obfuscation of error distribution
  • Predictable and intrinsic DRAM characteristics
  • Uniform-random spatial distribution
  • Maximum-a-posteriori (MAP)
  • Monte-carlo simulation
  • SoftMC
  • Characterise, analyse and understand DRAM cell
  • Flexible and easy-to-use API
  • Violating latency
  • Reliability
  • Custom timing
  • Simple, minimal, accessible
  • Retention time study
  • Highly-charged cell, low latency
  • Non-volatile memory
  • Flexible, easy-to-use
  • CROW: Copy DRAM Row
  • High latency
  • Refresh overhead
  • Vulnerabilities
  • CROW-cache, CROW-ref
  • Duplication, remapping
  • Row-copy, and two-row activation
  • Weak regular row, strong copy row
  • Eliminate refresh
  • Remap row hammer victim
  • SMASH: Sparse Matrix Acceleration using Software and hardware cooperation
  • Pagerank, Sparse DNN
  • Expensive discovery
  • Compressed Sparse Row
  • High compression ratio
  • Special compression formats
  • Hierarchy of Bitmaps
  • Bitmap Management Unit
  • Bitmap Buffers
  • Cross-Layer Interface

Lecture 13a (31.10 Thu.)

  • Memory controller
  • DRAM latency
  • DRAM throughput
  • Phase Change Memory, Spin-Transfer Torque Magnetic Memory
  • Flash memory
  • SSD controller
  • DRAM types: DDR, LPDDR, GDDR, WideIO, HBM, HMC
  • DRAM request
  • Request buffer
  • DRAM scheduling policy
  • FCFS (first come first served)
  • FR-FCFS (first ready, first come first served)
  • Row buffer management policy
  • DRAM timing constraints
  • Memory contention
  • Self-optimizing DRAM controller
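
The difference between FCFS and FR-FCFS can be sketched as a request-selection function. This is a simplified model under stated assumptions: "first ready" is approximated as "hits the currently open row in its bank", timing constraints are ignored, and the `Request` fields and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    arrival: int  # arrival time; smaller means older
    bank: int
    row: int


def fcfs_pick(queue):
    """FCFS: always serve the oldest request."""
    return min(queue, key=lambda r: r.arrival)


def fr_fcfs_pick(queue, open_rows):
    """FR-FCFS (simplified): row-buffer hits first, then oldest first.

    open_rows maps bank -> currently open row (absent if the bank is closed).
    """
    def priority(r):
        is_miss = open_rows.get(r.bank) != r.row
        return (is_miss, r.arrival)  # False sorts before True, so hits win
    return min(queue, key=priority)


queue = [Request(0, 0, 5), Request(1, 0, 7), Request(2, 1, 3)]
open_rows = {0: 7}
print(fcfs_pick(queue))                # Request(arrival=0, bank=0, row=5)
print(fr_fcfs_pick(queue, open_rows))  # Request(arrival=1, bank=0, row=7)
```

Favoring row hits raises DRAM throughput, but it also lets an application with high row-buffer locality starve others.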

Lecture 13b (31.10 Thu.)

  • Resource sharing
  • Partitioning
  • Performance isolation
  • Quality of service (QoS)
  • Fairness
  • Inter-thread/application interference
  • Unfair slowdown
  • Memory performance attack
  • Request scheduling
  • Bank parallelism interference
  • Request batching

Lecture 14 (8.11 Fri.)

  • SIMD
  • SISD
  • MISD
  • Systolic arrays
  • MIMD
  • Instruction level parallelism (ILP)
  • Array processor
  • Vector processor
  • VLIW: Very long instruction word
  • Vector length register (VLEN)
  • Vector stride register (VSTR)
  • Vector load instruction (VLD)
  • Intra-vector dependencies
  • Regular parallelism
  • Memory bandwidth
  • Vector data register
  • Vector control registers
  • Vector mask register
  • Vector functional units
  • Vector registers
  • VADD
  • Scalar operations
  • Memory data register
  • Memory address register
  • Interleaved memory
  • Memory banking
  • Address generator
  • Monolithic memory
  • Memory access latency
  • Vectorizable loops
  • Vector code performance
  • Vector data forwarding (chaining)
  • Vector chaining
  • Vector stripmining
  • Irregular memory access
  • Gather/Scatter operations
  • Sparse vector
  • Masked operations
  • Predicated execution
  • Row/Column major layouts
  • Bank conflicts
  • Randomized mapping
  • Vector instruction level parallelism
  • Automatic code vectorization
  • Packed arithmetic
  • GPUs
  • Programming model vs execution model
  • SPMD
  • Warp (wavefront)
  • SIMD vs. SIMT
  • Warp-level FGMT
  • Vector lanes
  • Warp scheduler
  • Fine-grained multithreading
  • Warp instruction level parallelism
  • Warp-based SIMD vs. traditional SIMD
  • Multiple instruction streams
  • Conditional control flow instructions
  • Branch divergence
  • Dynamic warp formation
  • Functional unit

Lecture 15 (14.11 Thu.)

  • Memory Interference
  • Quality of Service
  • QoS-Aware Memory Systems
  • Stall-Time Fair Memory Scheduling
  • Parallelism-Aware Batch Scheduling
  • PAR-BS
  • ATLAS Memory Scheduler
  • Thread Cluster Memory Scheduling
  • TCM
  • Throughput vs. Fairness
  • Clustering Threads
  • STFM
  • FR-FCFS
  • The Blacklisting Memory Scheduler
  • BLISS
  • Staged Memory Scheduling
  • SMS
  • DASH
  • Current SoC Architectures
  • Strong Memory Service Guarantees
  • Predictable Performance
  • Handling Memory Interference In Multithreaded Applications
  • Barriers
  • Critical Sections
  • Data mapping
  • Memory Channel Partitioning
  • Core/source throttling
  • Fairness via Source Throttling

Lecture 16a (15.11 Fri.)

  • Shared resource contention
  • Slowdown estimation
  • Application/thread scheduling
  • Multi-core/many-core systems
  • Application/data mapping
  • Application prioritization
  • On-chip communication
  • Communication distance
  • Congestion in Network-on-Chip (NoC)
  • Spatial task scheduling
  • Clustering
  • Load balancing
  • Isolation
  • Radial mapping
  • Distributed Resource Management (DRM)
  • Operating-system-level metric
  • Microarchitecture-level metric
  • Architecture-aware DRM
  • Machine learning-based mapping/scheduling

Lecture 16b (15.11 Fri.)

  • Emerging memory technology
  • Flash memory
  • Memory-centric system design
  • Phase change memory
  • Charge memory
  • Resistive memory
  • Multi-level cell
  • Spin-Transfer Torque Magnetic RAM (STT-MRAM)
  • Memristors
  • Resistive RAM (RRAM or ReRAM)
  • Intel 3D XPoint
  • Capacity-latency trade-off
  • Capacity-reliability trade-off
  • Endurance
  • Magnetic Tunnel Junction (MTJ)
  • Hybrid memory
  • Write filtering
  • Data placement
  • Data access pattern
  • Row-buffer locality
  • Overall system performance impact
  • Memory-Level Parallelism (MLP)
  • Utility-based hybrid memory management

Lecture 17 (21.11 Thu.)

  • SIMD
  • Multiply-accumulate
  • Thread block
  • Stream processor
  • Tensor core
  • Neural network training
  • Systolic arrays
  • Fine-grain multithreading
  • Warp
  • GPU programming
  • General purpose processing on GPU
  • GPU kernels
  • CUDA
  • OpenCL
  • SPMD
  • Grid, Block, Threads
  • Row major layout
  • Warp scheduler
  • Coalesced memory accesses
  • AoS (Array of Structures)
  • SoA (Structure of Arrays)
  • Tiling
  • Bank conflicts
  • Padding
  • Randomized mapping
  • Hash functions
  • Divergence
  • Vector reduction
  • Divergence-free mapping
  • Atomic operations
  • PTX
  • SASS
  • Synchronous and asynchronous transfers
  • Stream
  • Collaborative Computing
  • Unified memory space
  • Collaborative patterns

Lecture 19 (28.11 Thu.)

  • Hybrid Memory Systems
  • Large (DRAM) Cache
  • TIMBER
  • Two-Level Memory/Storage model
  • Volatile data
  • Persistent data
  • Single-level store
  • Unified Memory and storage
  • The Persistent Memory Manager (PMM)
  • ThyNVM
  • Heterogeneity
  • Asymmetry in design
  • Amdahl's Law
  • Synchronization overhead
  • Load imbalance overhead
  • Resource sharing overhead
  • IBM Power4
  • IBM Power5
  • Niagara Processor
  • Performance vs. parallelism
  • Asymmetric Chip Multiprocessor (ACMP)
  • MorphCore

Lecture 20 (29.11 Fri.)

  • Heterogeneity
  • Pointer-chasing
  • Critical section
  • Asymmetry
  • Accelerated
  • Data Marshalling
  • False Serialization
  • Shared Data
  • Private Data
  • Amdahl’s Law
  • Barriers
  • Identification
  • Migration
  • Feedback Directed Pipelining
  • Staged Execution
  • Inter-segment Data
  • Pipeline Parallelism
  • Dynamic heterogeneity

Lecture 21a (5.12 Thu.)

  • Persistent memory
  • Crash consistency
  • Checkpointing
  • Flynn's taxonomy of computers
  • Parallelism
  • Performance
  • Power consumption
  • Cost efficiency
  • Dependability
  • Instruction level parallelism
  • Data parallelism
  • Task level parallelism
  • Utilization
  • Redundancy
  • Efficiency
  • Amdahl's law
  • Bottlenecks in parallel portion
  • Multiprocessor
  • Loosely coupled multiprocessors
  • Tightly coupled multiprocessors
  • Shared global memory address space
  • Shared memory synchronization
  • Interconnects
  • Programming issues in tightly coupled multiprocessor
  • Sublinear speedup
  • Linear speedup
  • Superlinear speedup
  • Shared resource management
  • Unfair comparison
  • Cache/memory effect
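
Amdahl's law, which underlies several of the terms above (sublinear speedup, bottlenecks in the parallel portion), can be written as a one-line function. The sketch below uses the standard form speedup = 1 / ((1 - f) + f / N), where f is the parallelizable fraction and N the number of processors; the function and parameter names are chosen for the example.

```python
def amdahl_speedup(parallel_fraction, num_processors):
    """Speedup over one processor when only parallel_fraction can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_processors)


for n in (2, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With 90% of the work parallelizable, the speedup approaches but never reaches 10 however many processors are added, because the serial 10% eventually dominates. The formula is also optimistic: it assumes the parallel portion scales perfectly, ignoring synchronization, load imbalance, and resource-sharing overheads.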

Lecture 21b (5.12 Thu.)

  • Memory consistency / memory ordering
  • Ordering of operations
  • Local ordering
  • Global ordering
  • Sequential consistency
  • Weaker memory consistency
  • Memory fence instructions
  • Consequences of Sequential Consistency
  • Issues with Sequential Consistency
  • Global order requirement
  • Aggressiveness
  • Out-of-order execution
  • Higher performance
  • Burden on the programmer
  • Mutual exclusion
  • Protecting shared data
  • Ease of debugging
  • Correctness
  • MIMD processor
  • Dataflow processor

Lecture 22 (6.12 Fri.)

  • Cache coherence
  • Memory consistency
  • Shared memory model
  • Software coherence
    • Coarse-grained (page-level)
    • Non-cacheable
    • Fine-grained (cache flush)
  • Hardware coherence
  • Valid/invalid
  • Write propagation
  • Write serialization
  • Update vs. Invalidate
  • Snoopy bus
  • Directory
    • Exclusive bit
  • Directory optimizations (bypassing)
  • Snoopy cache
  • Shared bus
  • VI protocol
  • MSI (Modified, Shared, Invalid)
  • Exclusive state
  • MESI (Modified, Exclusive, Shared, Invalid)
  • Illinois Protocol (MESI)
  • Broadcast
  • Bus request
  • Downgrade/upgrade
  • Snoopy invalidation
  • Cache-to-cache transfer
  • Writeback
  • MOESI (Modified, Owned, Exclusive, Shared, Invalid)
  • Directory coherence
  • Race conditions
  • Totally-ordered interconnect
  • Directory-based protocols
  • Set inclusion test
  • Linked list
  • Bloom filters
  • Contention resolution
  • Ping-ponging
  • Synchronization
  • Shared-data-structure
  • Token Coherence
  • Coherence for NDAs
  • Optimistic execution
  • Signature
  • Commit/re-execute

Lecture 23 (12.12 Thu.)

  • Interconnects
  • Cache coherence
  • Interconnect networks:
    • Topology
    • Routing
    • Buffering and flow control
      • Oversubscription of routers
  • Terminology:
    • Network Interface
    • Link
    • Switch/router
    • Channel
    • Node
    • Message
    • Packet
    • Flit
    • Direct/Indirect network
  • Properties of a Network Topology:
    • Regular/Irregular
    • Routing distance
    • Diameter
    • Average distance
    • Bisection Bandwidth
    • Blocking/non-blocking. Rearrangeable non-blocking
  • Topologies:
    • Bus
    • P2P
    • Crossbar
    • Ring
    • Tree
    • Omega
    • Hypercube
    • Mesh
    • Torus
    • Butterfly
  • Cost, latency, contention, energy, bandwidth, overall performance
  • Circuit switching network
  • Multistage network
  • Fetch-and-add
  • Unidirectional Ring
  • Bidirectional rings
  • Hierarchical rings
  • Mesh: asymmetry at the edges
  • Torus
  • H-tree
  • Fat-tree
  • Hyper-cube. Caltech's “The Cosmic Cube”
  • Routing mechanism: Arithmetic, Source-based, Table-based lookup
  • Types of routing algorithm: deterministic, oblivious, adaptive
  • Deadlock
  • Oblivious routing: Valiant’s algorithm
  • Adaptive Routing: minimal adaptive, non-minimal adaptive
  • Flow Control
  • Handling contention: buffer, drop or misroute
  • Flow control methods:
    • Circuit switching
    • Bufferless
    • Store and forward
    • Virtual cut through
    • Wormhole
  • Performance and congestion at high loads
  • Store-and-forward
  • Cut-through flow control
  • Wormhole flow control
  • Head of Line Blocking
buzzword.1576190060.txt.gz · Last modified: 2019/12/12 22:34 by rahbera