Intelligent Architectures using Hardware/Software Cooperative Techniques

Course Description

Modern general-purpose processors are agnostic to an application’s high-level semantic information. Hence, they rely on prediction-based techniques to enable computational and memory optimizations such as prefetching, cache management policies, memory data placement, instruction scheduling, and many others. The potential of these optimizations is limited by how little information the underlying hardware can discover on its own, and the prediction machinery they require comes with large area, power, and complexity overheads. In short, purely-hardware optimizations fall short of their performance potential and waste power, complexity, and hardware area because they are unaware of application characteristics. Purely-software optimizations, on the other hand, are fundamentally limited by the underlying hardware.

A promising way to increase the performance of modern applications is to co-design software and hardware. Both industry and academia have recently been making serious attempts to improve performance, energy, and security using hardware/software cooperative schemes, such as application-specific hardware accelerators (e.g., Google’s Tensor Processing Unit) and application-specific extensions in general-purpose processors (e.g., the Media Engine in Apple M1).

In this course, we will explore several topics in hardware/software co-design, such as: (i) new hardware/software interfaces (e.g., virtual memory, instruction set architecture) that enhance performance, energy, and security; (ii) hardware/software co-design schemes that improve the performance of the memory subsystem for important memory-intensive applications (e.g., sparse and irregular workloads); and (iii) hardware/software cooperative machine-learning-based techniques for microarchitectural components such as prefetchers, caches, and branch predictors, which continuously learn from the vast number of memory accesses seen by a processor and adapt to varying workload and system conditions.

If you are enthusiastic about working hands-on to design both software and hardware, this is your P&S. You will have the opportunity to study modern applications, propose software changes that better match the underlying hardware components, design new hardware components that better match the overlying software, and come up with new machine-learning techniques for designing efficient microarchitectural components. You will also learn how to program industry-supported microarchitectural simulators and study the performance of modern workloads after your hardware/software modifications.

Prerequisites of the course:

  • Digital Design and Computer Architecture (or equivalent course).
  • Familiarity with C/C++ programming and strong coding skills.
  • Interest in future computer architectures and computing paradigms.
  • Interest in discovering why things do or do not work and solving problems.
  • Interest in making systems efficient and usable.


  • Hands-on experience with Machine Learning frameworks (depends on the topic you choose).

The course is conducted in English.

Course webpage



Role | Name | E-mail | Office
Lead Supervisor | Konstantinos Kanellopoulos | | ETZ H 61.1
Supervisor | Rahul Bera | | ETZ H 63
Supervisor | Mohammad Sadrosadati | | ETZ H 61.1
Supervisor | Juan Gomez Luna | | ETZ H 63
Supervisor | Haiyu Mao | | ETZ H 61.1

Video Playlist on YouTube

2022 Meetings/Schedule (Tentative)

Week | Date | Livestream | Meeting | Materials | Assignments
W0 | 16.03 | | Intro to HW/SW Co-Design | Required | HW 0 Out
W1 | 23.03 | | Project selection | Required |
W2 | 30.03 | | Virtual Memory (I) | |
W3 | 13.04 | | Virtual Memory (II) | |
W4 | 25.05 | | Tutorial: Designing an Expressive Cross-layer Interface | |
W5 | 25.05 | | Tutorial: Using an Expressive Cross-layer Interface | |

Learning Materials

Meeting 0: Required Materials

Meeting 2 & 3: Required Materials


HW0: Student Information (Due: 17/03/2022)

hw_sw_codesign.txt · Last modified: 2022/05/23 02:58 by kanellok