SAFARI Live Seminar: Jawad Haj-Yahya 16 August 2021

Join us for our next SAFARI Live Seminar with Jawad Haj-Yahya.

Monday, August 16 at 5:30 pm Zurich time (CEST)

Power Management Mechanisms in Modern Microprocessors and Their Security Implications
Jawad Haj-Yahya, Principal researcher at Huawei Research Center in Zurich

Livestream at 5:30 pm Zurich time (CEST) on YouTube:

Billions of new devices (e.g., sensors, wearables, smartphones, tablets, laptops, servers) are being deployed each year with new services and features that are driving a higher demand for high performance microprocessors, which often have high power consumption. Despite the failure of Dennard scaling, the slow-down in Moore’s Law, and the high power-density of modern processors, power management mechanisms have enabled significant advances in modern microprocessor performance and energy efficiency. Yet, current power management architectures also pose serious security implications. This is mainly because functionality rather than security has been the main consideration in the design of power management mechanisms in commodity microprocessors.

In this seminar, we provide a detailed overview of the state-of-the-art in power management mechanisms, power delivery networks (PDNs), and security vulnerabilities of current management mechanisms in modern microprocessors. We first present, analyze and enhance the advanced power management mechanisms of modern microprocessors to improve energy and performance in active and idle power states. Second, we present the design and tradeoffs of modern power delivery networks, evaluate their implications on performance and energy-efficiency, and describe new techniques to mitigate PDN inefficiencies. We will especially introduce the idea and benefits of hybrid power delivery networks. Third, we present some of the security vulnerabilities that exist in current management mechanisms of modern processors and propose mitigation techniques. We conclude that power management, power delivery and resulting security implications are critical and exciting areas to research to make modern systems both more energy-efficient and higher performance.

Jawad Haj-Yahya received his Ph.D. degree in Computer Science from Haifa University, Israel. Jawad was a processor architect for many years at Intel. His awards and honors include the Intel Achievement Award (the highest award at Intel), for his significant contribution to Intel processors. Jawad worked at Nanyang Technological University (NTU), Singapore as a cybersecurity Research Scientist where he led the architecture and design of a secure-processor project based on RISC-V architecture. He then moved to the Institute of Microelectronics (IME) at A*STAR Singapore where he was a Scientist III and worked on hardware security and an AI accelerator. Jawad next worked as a Senior Researcher in the SAFARI Research Group at ETH Zurich, where he led multiple projects on Energy-Efficient Computing and Hardware Security, before moving to his current position as principal researcher at Huawei Research Center in Zurich.

This talk is based on four papers we published respectively at HPCA 2020, ISCA 2020, MICRO 2020 and ISCA 2021. The links to individual papers and slides are below.

Congratulations to Damla Senol Cali on successfully defending her PhD!

Thesis:  Accelerating Genome Sequence Analysis via Efficient Hardware/Algorithm Co-Design


Genome sequence analysis plays a pivotal role in enabling many medical and scientific advancements in personalized medicine, outbreak tracing, the understanding of evolution, and forensics. Modern genome sequencing machines can rapidly generate massive amounts of genomics data at low cost. However, the analysis of genome sequencing data is currently bottlenecked by the computational power and memory bandwidth limitations of existing systems, as many of the steps in genome sequence analysis must process a large amount of data. Our goals in this dissertation are to (1) characterize the real-system behavior of the genome sequence analysis pipeline and its associated tools, (2) expose the bottlenecks and tradeoffs of the pipeline and tools, and (3) co-design fast and efficient algorithms along with scalable and energy-efficient customized hardware accelerators for the key pipeline bottlenecks to enable faster genome sequence analysis.

First, we comprehensively analyze the tools in the genome assembly pipeline for long reads in multiple dimensions (i.e., accuracy, performance, memory usage, and scalability), uncovering bottlenecks and tradeoffs that different combinations of tools and different underlying systems lead to. We show that we need high-performance, memory-efficient, low-power, and scalable designs for genome sequence analysis in order to exploit the advantages that genome sequencing provides. Second, we propose GenASM, an acceleration framework that builds upon bitvector-based approximate string matching (ASM) to accelerate multiple steps of the genome sequence analysis pipeline. We co-design our highly-parallel, scalable and memory-efficient algorithms with low-power and area-efficient hardware accelerators. We evaluate GenASM for three different use cases of ASM in genome sequence analysis and show that GenASM is significantly faster and more power- and area-efficient than state-of-the-art software and hardware tools for each of these use cases. Third, we implement an FPGA-based prototype for GenASM, where state-of-the-art 3D-stacked memory (HBM2) offers high memory bandwidth and FPGA resources offer high parallelism by instantiating multiple copies of the GenASM accelerators. Fourth, we propose GenGraph, the first hardware acceleration framework for sequence-to-graph mapping. Instead of representing the reference genome as a single linear DNA sequence, genome graphs provide a better representation of the diversity among populations by encoding variations across individuals in a graph data structure, avoiding a bias towards any one reference. GenGraph enables the efficient mapping of a sequenced genome to a graph-based reference, providing more comprehensive and accurate genome sequence analysis.

Overall, we demonstrate that genome sequence analysis can be accelerated by co-designing scalable and energy-efficient customized accelerators along with efficient algorithms for the key steps of genome sequence analysis.
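
As an illustration of the bitvector-based approximate string matching that the abstract says GenASM builds upon, here is a minimal software sketch of the classic Bitap (Wu-Manber) recurrence. It is not the GenASM algorithm or its hardware design; the function name `bitap_search` and its parameters are hypothetical, chosen only for this example.

```python
def bitap_search(text, pattern, max_errors):
    """Return the end indices in `text` where `pattern` matches a substring
    with at most `max_errors` edits (substitution, insertion, deletion).

    Illustrative Bitap (Wu-Manber) sketch; names are assumptions.
    """
    m = len(pattern)
    if m == 0:
        return []
    full = (1 << m) - 1          # keep only the m low bits
    accept = 1 << (m - 1)        # bit set when the whole pattern has matched

    # masks[ch] has bit i set when pattern[i] == ch
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # state[d], bit i set: pattern[:i+1] matches a suffix of the text read
    # so far with at most d edits. Before reading any text, the first d
    # pattern characters can be "matched" by skipping them (d edits).
    state = [((1 << d) - 1) & full for d in range(max_errors + 1)]

    ends = []
    for pos, ch in enumerate(text):
        cm = masks.get(ch, 0)
        old_lower = state[0]                      # state[d-1] before this step
        state[0] = ((old_lower << 1) | 1) & cm    # exact-match bitvector
        for d in range(1, max_errors + 1):
            old = state[d]
            new_lower = state[d - 1]              # state[d-1] after this step
            match = ((old << 1) | 1) & cm         # characters agree
            extra_text_char = old_lower           # text has an extra character
            substitution = (old_lower << 1) | 1   # characters differ
            skipped_pattern_char = (new_lower << 1) | 1
            state[d] = (match | extra_text_char | substitution
                        | skipped_pattern_char) & full
            old_lower = old
        if state[max_errors] & accept:
            ends.append(pos)
    return ends
```

Each text character is processed with a handful of shift, AND, and OR operations per allowed error, which is the property that makes this family of algorithms attractive for hardware acceleration.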

Examining Committee

Onur Mutlu, Co-advisor, CMU-ECE, ETH Zurich
Saugata Ghose, Co-advisor, CMU-ECE, University of Illinois Urbana-Champaign
James C. Hoe, CMU-ECE
Can Alkan, Bilkent University








SAFARI Live Seminar: Geraldo F. Oliveira 22 July 2021

We are pleased to have Geraldo F. Oliveira give a 3rd talk in our SAFARI Live Seminars!

Thursday, July 22 at 5:00 pm Zurich time (CEST)

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks
Geraldo F. Oliveira, SAFARI Research Group, D-ITET, ETH Zurich

Livestream at 5:00 pm Zurich time (CEST) on YouTube:


Data movement between the CPU and main memory is a first-order obstacle against improving performance, scalability, and energy efficiency in modern systems. Computer systems employ different techniques to reduce overheads caused by data movement, from traditional processor-centric mechanisms (e.g., deep multi-level cache hierarchies, aggressive hardware prefetchers) to emerging paradigms, such as near-data processing (NDP), where computation is moved closer to or inside memory. However, there is a lack of understanding about (1) the key metrics that identify different sources of data movement bottlenecks and (2) how different data movement bottlenecks can be alleviated by traditional and emerging data movement mitigation mechanisms.

In this work, we make two key contributions. First, we propose the first methodology to characterize data-intensive workloads based on the source of their data movement bottlenecks. This methodology is driven by insights obtained from a large-scale experimental characterization of 345 applications from 37 different benchmark suites and an evaluation of the performance of memory-bound functions from these applications with three data-movement mitigation mechanisms. Second, we release DAMOV, the first open-source benchmark suite for main memory data movement-related studies, based on our systematic characterization methodology. This suite consists of 144 functions representing different sources of data movement bottlenecks and can be used as a baseline benchmark set for future data-movement mitigation research. We show how DAMOV can aid the study of open research problems for NDP architectures via four case studies.

Our work provides new insights about the suitability of different classes of data movement bottlenecks to the different data movement mitigation mechanisms, including analyses on how the different data movement mitigation mechanisms impact performance and energy for memory-bottlenecked applications. All our bottleneck analysis toolchains and DAMOV benchmarks are publicly and freely available. We believe and hope that our work can enable further studies and research on hardware and software solutions for data movement bottlenecks, including near-data processing.

Speaker Bio:
Geraldo F. Oliveira is a Ph.D. student in the SAFARI Research Group at ETH Zurich. He received a B.S. degree in computer science from the Federal University of Viçosa, Viçosa, Brazil, in 2015, and an M.S. degree in computer science from the Federal University of Rio Grande do Sul, Porto Alegre, Brazil, in 2017. Since 2018, he has been working toward a Ph.D. degree with Onur Mutlu at ETH Zürich, Zürich, Switzerland. His current research interests include system support for processing-in-memory and processing-using-memory architectures, data-centric accelerators for emerging applications, approximate computing, and emerging memory systems for consumer devices. He has several publications on these topics.

Join us at ISCA 2021 for our talks

ISCA 2021 Program:

Tuesday, June 15 Session 6B: Memory II 12 pm EDT:

Lois Orosa, Yaohua Wang, Mohammad Sadrosadati, Jeremie S. Kim, Minesh Patel, Ivan Puddu, Haocong Luo, Kaveh Razavi, Juan Gomez-Luna, Hasan Hassan, Nika Mansouri-Ghiasi, Saugata Ghose, and Onur Mutlu, “CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations”, Proceedings of the 48th International Symposium on Computer Architecture (ISCA), Virtual, June 2021.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Talk Video (22 minutes)]

Wednesday, June 16 Session 11B 1:15 pm EDT:

Jawad Haj-Yahya, Jeremie S. Kim, A. Giray Yaglikci, Ivan Puddu, Lois Orosa, Juan Gomez Luna, Mohammed Alser, and Onur Mutlu, “IChannels: Exploiting Current Management Mechanisms to Create Covert Channels in Modern Processors”, Proceedings of the 48th International Symposium on Computer Architecture (ISCA), Virtual, June 2021.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Talk Video (21 minutes)]

Wednesday, June 16 Session 11A 1:15 pm EDT:

Ataberk Olgun, Minesh Patel, A. Giray Yaglikci, Haocong Luo, Jeremie S. Kim, F. Nisa Bostanci, Nandita Vijaykumar, Oguz Ergin, and Onur Mutlu, “QUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM Chips”, Proceedings of the 48th International Symposium on Computer Architecture (ISCA), Virtual, June 2021.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Talk Video (25 minutes)]


SIMDRAM: A Framework for Bit-Serial SIMD Processing using DRAM

Watch our recent talks at ASPLOS 2021!

Nastaran Hajinazar, Geraldo F. Oliveira, Sven Gregorio, Joao Dinis Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gomez-Luna, and Onur Mutlu,
“SIMDRAM: A Framework for Bit-Serial SIMD Processing using DRAM”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]
[Short Talk Slides (pptx) (pdf)]
[Talk Slides (pptx) (pdf)]
[Short Talk Video (5 mins)]
[Full Talk Video (27 mins)]


Join us at ASPLOS 2021 online

We are at ASPLOS 2021 this week and next.  Join us for our talks and learn more about our recent works:

Session 2: Memory Systems, Monday, April 19 4:00 PM Pacific Time:

Irina Calciu, M. Talha Imran, Ivan Puddu, Sanidhya Kashyap, Hasan Al Maruf, Onur Mutlu, and Aasheesh Kolli,
“Rethinking Software Runtimes for Disaggregated Memory”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]
[Source Code (Officially Artifact Evaluated)]

Session 8: Tools & Frameworks, Tuesday, April 20 4:00 PM Pacific Time: 

Nastaran Hajinazar, Geraldo F. Oliveira, Sven Gregorio, Joao Dinis Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gomez-Luna, and Onur Mutlu,
“SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]
[Short Talk Slides (pptx) (pdf)]
[Talk Slides (pptx) (pdf)]
[Short Talk Video (5 mins)]
[Full Talk Video (27 mins)]

Session 17: Solid State Drives, Thursday, April 22 7:00 AM Pacific Time:

Jisung Park, Myungsuk Kim, Myoungjun Chun, Lois Orosa, Jihong Kim, and Onur Mutlu,
“Reducing Solid-State Drive Read Latency by Optimizing Read-Retry”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]
[Short Talk Slides (pptx) (pdf)]
[Full Talk Slides (pptx) (pdf)]
[Short Talk Video (5 mins)]
[Full Talk Video (19 mins)]


ASPLOS Program:

Onur Mutlu and Co-authors Receive the 2021 HPCA Test of Time Award

Congratulations to Onur Mutlu and co-authors on receiving the HPCA Test of Time Award for their 2003 HPCA paper:

Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors
Onur Mutlu, Jared Stark, Chris Wilkerson, Yale N. Patt

The IEEE International Symposium on High-Performance Computer Architecture (HPCA) Test of Time Award recognizes the most influential papers published in prior sessions of HPCA (held 18-22 years ago), and that have had a significant impact in the field.

The paper was Professor Onur Mutlu’s first publication during his PhD at the University of Texas with his PhD advisor Professor Yale Patt and colleagues from Intel, Dr. Jared Stark and Chris Wilkerson.  The significance of the paper was described by the award committee as: “Runahead Execution is a pioneering paper that opened up new avenues in dynamic prefetching. The basic idea of run ahead execution effectively increases the instruction window very significantly, without having to increase physical resource size (e.g. the issue queue). This seminal paper spawned off a new area of ILP-enhancing microarchitecture research. This work has had strong industry impact as evidenced by IBM’s POWER6 – Load Lookahead, NVIDIA Denver, and Sun ROCK’s hardware scouting.” The award was presented last week at HPCA 2021 on March 2, 2021.

Watch Onur’s Retrospective HPCA Test of Time Award Talk Video (14 minutes)

Onur Mutlu, Jared Stark, Chris Wilkerson, and Yale N. Patt,
“Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors”
Proceedings of the 9th International Symposium on High-Performance Computer Architecture (HPCA), pages 129-140, Anaheim, CA, February 2003.
[Talk Slides (pdf)]
[Lecture Slides (pptx) (pdf)]
[Lecture Video (1 hr 54 mins)]
[Retrospective HPCA Test of Time Award Talk Slides (pptx) (pdf)]
[Retrospective HPCA Test of Time Award Talk Video (14 minutes)]
One of the 15 computer architecture papers of 2003 selected as Top Picks by IEEE Micro.
HPCA Test of Time Award (awarded in 2021).

Interview with Mohammed Alser: on his recent papers and his future work

Mohammed Alser is a Senior Researcher and Lecturer with SAFARI. He was previously a PhD student in SAFARI, co-advised with Can Alkan. Mohammed co-teaches two Projects and Seminars courses on Genome Sequencing Analysis and Mobile Genomics along with the Seminar on Computer Architecture. We recently interviewed Mohammed for the January 2021 issue of the SAFARI Newsletter.

You have been busy this past year, and have published quite a few papers. Your recent work, SneakySnake, was recently published in Bioinformatics. This is an important work in improving computations for genome analysis. Can you tell us more about the significance of this work, and what broader impacts you hope for it?

SneakySnake is one of the projects that I enjoyed the most working on. We try in this work to significantly reduce the time spent on finding the similarities and differences between two genomic sequences without sacrificing solution optimality. Finding the similarities and differences between two sequences is a well-known computer science problem, called approximate string matching (ASM), which is solved using computationally expensive algorithms.

SneakySnake quickly finds the sequence pairs that have a large (greater than a user-defined threshold) number of differences and prevents applying computationally expensive algorithms for these sequence pairs, as such sequence pairs are usually not useful for genomic studies. SneakySnake is inspired by the single net routing (SNR) problem in VLSI design that was introduced in 1976. SneakySnake is the first work that proposes to convert the ASM problem into an instance of the SNR problem, which provides several key benefits as we discussed in the paper, and proposes a new efficient algorithm for comparing genomic sequences at scale.

SneakySnake is very beneficial for analyzing both short (e.g., Illumina) and long (e.g., nanopore) sequences as it accelerates the analysis of genomic sequences by up to two orders of magnitude compared to the state-of-the-art algorithms. SneakySnake works efficiently and fast on modern CPU, FPGA, and GPU architectures, which can potentially enable new applications of genome sequencing such as rapid surveillance of disease outbreaks including Ebola and COVID-19, near-patient testing, and bringing precision medicine to remote locations, without the need for large infrastructure.

One of the Bioinformatics journal’s reviewers states that: “SneakySnake is a valuable contribution to bioinformatics and it was innovative to reduce the ASM problem to the SNR problem in VLSI CAD”.
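
To make the idea of a user-defined edit threshold concrete, the sketch below checks whether two sequences differ by at most a given number of edits and stops early once that becomes impossible. This is not the SneakySnake algorithm (which reduces the problem to single net routing); it is a plain dynamic-programming stand-in, and the name `within_threshold` and its parameters are hypothetical.

```python
def within_threshold(seq_a, seq_b, max_edits):
    """Return True if the edit distance between seq_a and seq_b is at most
    max_edits. Illustrative stand-in for a threshold check, not SneakySnake.
    """
    # The length difference is a lower bound on the edit distance.
    if abs(len(seq_a) - len(seq_b)) > max_edits:
        return False

    # prev[j] = edit distance between the processed prefix of seq_a
    # and seq_b[:j]
    prev = list(range(len(seq_b) + 1))
    for i, ca in enumerate(seq_a, start=1):
        cur = [i]
        for j, cb in enumerate(seq_b, start=1):
            cur.append(min(prev[j - 1] + (ca != cb),  # match or substitution
                           prev[j] + 1,               # delete from seq_a
                           cur[j - 1] + 1))           # insert into seq_a
        # Row minima never decrease, so once every cell exceeds the
        # threshold the final distance must exceed it too.
        if min(cur) > max_edits:
            return False
        prev = cur
    return prev[-1] <= max_edits
```

A pair that fails this check would be discarded before any full alignment is attempted. The check itself still costs quadratic time in the worst case, which is the kind of expense a dedicated pre-alignment filter aims to avoid.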

You also recently published Accelerating Genome Analysis, which reviews the improvements made in hardware accelerators for genome analysis. What are your take away messages from this paper, and what do you see as future priorities in hardware improvements for genome analysis?

Most speedup comes from parallelism enabled by novel architectures and algorithms. We need to develop acceleration solutions that exploit new efficient hardware-aware algorithms, hardware/software co-design, and hardware accelerators to achieve a high degree of parallelism.

Accelerating the entire genome analysis pipeline is important. Accelerating only a single step of genome analysis is not an effective acceleration approach as it limits the overall achieved speedup according to Amdahl’s Law.

Genome analysis is currently heavily bottlenecked by data movement. We need to reduce the high amount of data movement that takes place during genome analysis. Moving data (1) between compute units and main memory, (2) between multiple hardware accelerators, and (3) between the sequencing machine and the computer performing the analysis incurs high costs in terms of execution time and energy. These costs are a significant barrier to enabling efficient analysis that can keep up with sequencing technologies.

The need for flexible hardware architectures. We need to develop flexible hardware architectures that do not conservatively limit the range of supported parameter values at design time. Rapid changes in sequencing technologies (e.g., those that result in high sequencing error rates and longer read lengths) can quickly make specialized hardware with restricted parameter values obsolete.

The need for new genomic data formats. We need to adapt existing genomic data formats for hardware accelerators or develop more efficient file formats to maximize the benefits of hardware accelerators and reduce resource utilization.

Looking into the future, building a genome sequencing machine that provides the entire genome as a single string, rather than its short subsequences, might be possible. However, we believe that hardware acceleration of whole-genome analysis will remain necessary. We also believe performing genome analysis inside the sequencing machine itself can significantly improve efficiency by eliminating sequencer-to-computer data movement.
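
The Amdahl's Law point above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name `amdahl_speedup` and the fractions used with it are assumptions, not measurements from the paper.

```python
def amdahl_speedup(accelerated_fraction, step_speedup):
    """Overall speedup when a fraction of the runtime is sped up by
    step_speedup and the rest is unchanged (Amdahl's Law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if step_speedup <= 0:
        raise ValueError("step_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / step_speedup)
```

For example, if a single pipeline step hypothetically accounted for half of the total runtime, `amdahl_speedup(0.5, 100)` gives an overall speedup just under 2x even though that step became 100x faster, which is why accelerating the entire pipeline matters.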

Your work has many topical applications that are highly relevant to society, including COVID modeling. Can you talk a bit about this, and your future research directions?

As the entire world is largely negatively impacted by the recent COVID-19 outbreak, we believe that everyone can help to end this pandemic based on their skills, expertise, and available resources. At SAFARI research group, we are helping with two main directions.

We are working on developing an accurate and configurable prediction model that evaluates the existing mitigation measures that the government applies in a region and provides suggestions on what strength the future mitigation measures should be. We are quantifying the spread of COVID-19 in Switzerland (as a use-case) by calculating the daily reproduction number of COVID-19, which quantifies how many people are infected on average by an infected person. The reproduction number is directly affected by the mitigation measures that the government applies to a region. We are also considering other important factors such as daylight temperature that significantly affect the spread of COVID-19 as we observed during the year 2020.

We are also working on developing new algorithms and hardware accelerators that perform fast and accurate metagenomic profiling for assessing microbial diversity, identifying potential new species, and investigating microbiomes associated with COVID-19 and other diseases. Performing genomic tests at scale during a pandemic highlights the dire need for building efficient specialized hardware that is both scalable and portable to enable genome analysis anywhere and anytime. We hope that the progress we make in this direction will also enable new applications that benefit human life and society.

Mohammed Alser, Taha Shahroodi, Juan Gomez-Luna, Can Alkan, and Onur Mutlu, SneakySnake: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs, Bioinformatics, December 2020.
Paper PDF | Paper link Bioinformatics | Source Code

Mohammed Alser, Zulal Bingol, Damla Senol Cali, Jeremie Kim, Saugata Ghose, Can Alkan, and Onur Mutlu, Accelerating Genome Analysis: A Primer on an Ongoing Journey, IEEE MICRO, September/October 2020.
Paper | Slides (pptx) (pdf)

Interview with Damla Senol Cali: about her work, experience in SAFARI, and her future directions

Damla Senol Cali is a PhD student with SAFARI at CMU co-advised by Onur Mutlu and Saugata Ghose.
We recently interviewed Damla about her work, her experience as a PhD student in SAFARI, and her future directions.  Damla’s interview appears as the first video contribution to the SAFARI Meet our Members section in our January 2021 newsletter.

Watch Damla’s video interview here    |    Read the transcript here 

Damla Senol Cali, Gurpreet S. Kalsi, Zulal Bingol, Can Firtina, Lavanya Subramanian, Jeremie S. Kim, Rachata Ausavarungnirun, Mohammed Alser, Juan Gomez-Luna, Amirali Boroumand, Anant Nori, Allison Scibisz, Sreenivas Subramoney, Can Alkan, Saugata Ghose, and Onur Mutlu, GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis, Proceedings of the 53rd International Symposium on Microarchitecture (MICRO), Virtual, October 2020.

Full paper link | Talk Video (18 mins) | Talk Slides (pptx) (pdf) |
Lecture Video (37 mins) | Lecture Slides (pptx) (pdf) | GenASM Source Code |
More information here

Damla Senol Cali, Jeremie Kim, Saugata Ghose, Can Alkan, and Onur Mutlu, Nanopore Sequencing Technology and Tools for Genome Assembly: Computational Analysis of the Current State, Bottlenecks and Future Directions, Briefings in Bioinformatics (BIB), 2018.

Paper link | Paper PDF | AACBB’19 Talk Video | Slides (ppt) (pdf) |
More information here

Read the latest edition of our SAFARI Newsletter

Dear SAFARI friends,

Happy New Year! In this second edition of the SAFARI newsletter, we share our research, teaching and outreach highlights from 2020, and look ahead to a new and inspiring future in 2021.

We wish you a wonderful 2021, in all aspects of your lives!

Onur Mutlu