Programming Heterogeneous Computing Systems with GPUs and other Accelerators (227-0085-51L)

Course Description

The increasing difficulty of scaling the performance and efficiency of CPUs every year has created the need to turn computers into heterogeneous systems, i.e., systems composed of multiple types of processors, each better suited to different types of workloads or parts of them. More than a decade ago, Graphics Processing Units (GPUs) became general-purpose parallel processors, making their outstanding processing capabilities available to many workloads beyond graphics. GPUs have been key to the recent rise of Machine Learning and Artificial Intelligence, whose training times were impractically long before the use of GPUs. Field-Programmable Gate Arrays (FPGAs) are another example of a computing device that can deliver impressive benefits in terms of performance and energy efficiency. More specific examples are (1) a plethora of specialized accelerators (e.g., Tensor Processing Units for neural networks), and (2) near-data processing architectures (i.e., placing compute capabilities near or inside memory/storage).

Despite the great advances in the adoption of heterogeneous systems in recent years, there are still many challenges to tackle, for example:

  • Heterogeneous implementations (using GPUs, FPGAs, TPUs) of modern applications from important fields such as bioinformatics, machine learning, graph processing, medical imaging, personalized medicine, robotics, virtual reality, etc.
  • Scheduling techniques for heterogeneous systems with different general-purpose processors and accelerators, e.g., kernel offloading, memory scheduling, etc.
  • Workload characterization and programming tools that enable easier and more efficient use of heterogeneous systems.

If you are enthusiastic about working hands-on with different software, hardware, and architecture projects for heterogeneous systems, this is your P&S. You will have the opportunity to program heterogeneous systems with different types of devices (CPUs, GPUs, FPGAs, TPUs), propose algorithmic changes to important applications to better leverage the compute power of heterogeneous systems, understand different workloads and identify the most suitable device for their execution, design optimized scheduling techniques, etc. In general, the goal will be to reach the highest performance reported for a given important application.

Prerequisites of the course:

  • Digital Design and Computer Architecture (or equivalent course).
  • Familiarity with C/C++ programming and strong coding skills.
  • Interest in future computer architectures and computing paradigms.
  • Interest in discovering why things do or do not work and solving problems.
  • Interest in making systems efficient and usable.

The course is conducted in English.

The course has two main parts:
1. Weekly lectures on GPU and heterogeneous programming.
2. Hands-on project: Each student develops their own project.

Course description page Moodle

Mentors

Role Name E-mail Office
Lead Supervisor Juan Gómez Luna juan.gomez@safari.ethz.ch ETZ H 61.1
Supervisor Mohammed Alser alserm@ethz.ch ETZ H 61.1
Supervisor Behzad Salami bsalami@ethz.ch ETZ H 64
Supervisor Mohammad Sadr ETZ H 61.1
Supervisor Joel Lindegger ETZ H 64

Lecture Video Playlist on YouTube

Fall 2022 Meetings/Schedule

Week  Date  Livestream  Meeting  Learning Materials  Assignments
W1  03.10 Mon.  Livestream  M1: P&S Course Presentation (PDF) (PPT)  Required Materials, Recommended Materials  HW 0 Out
W2  10.10 Mon.  Premiere  M2: SIMD Processing and GPUs (PDF) (PPT)  Hands-on Project Proposals
W3  17.10 Mon.  Premiere  M3: GPU Software Hierarchy (PDF) (PPT)
W4  24.10 Mon.  Premiere  M4: GPU Memory Hierarchy (PDF) (PPT)
W5  31.10 Mon.  Premiere  M5: GPU Performance Considerations (PDF) (PPT)
W6  07.11 Mon.  Premiere  M6: Parallel Patterns: Reduction (PDF) (PPT)
W7  14.11 Mon.  Premiere  M7: Parallel Patterns: Histogram (PDF) (PPT)
W8  21.11 Mon.  Premiere  M8: Parallel Patterns: Convolution (PDF) (PPT)
W9  28.11 Mon.  Premiere  M9: Advanced Tiling for Matrix Multiplication (PDF) (PPT)
W10  05.12 Mon.  Premiere  M10: Parallel Patterns: Prefix Sum (Scan) (PDF) (PPT)
W11  12.12 Mon.  Premiere  M11: Parallel Patterns: Sparse Matrices (PDF) (PPT)
W12  19.12 Mon.  Premiere  M12: Parallel Patterns: Graph Search (PDF) (PPT)
W13  02.01 Mon.  Premiere  M13: Parallel Patterns: Merge (PDF) (PPT)
W14  09.01 Mon.  Premiere  M14: Dynamic Parallelism (PDF) (PPT)
W15  16.01 Mon.  Premiere  M15: Collaborative Computing (PDF) (PPT)
W16  23.01 Mon.  Premiere  M16: GPU Acceleration of Genome Sequence Alignment (PDF) (PPT)
W17  30.01 Mon.  Premiere  M17: Accelerating Agent-based Simulations (PDF) (PPT)

Past Lecture Video Playlists on YouTube

Learning Materials

Meeting 1: Required Materials

  • An introduction to SIMD processors and GPUs:
  • An introduction to GPUs and heterogeneous programming:

Meeting 1: Recommended Materials

More Learning Materials

Assignments

HW0: Student Information (Due: 15.10)

heterogeneous_systems.txt · Last modified: 2023/01/28 10:39 by juang