Join us at ASPLOS 2021 online

We are at ASPLOS 2021 this week and next.  Join us for our talks and learn more about our recent works:

Session 2: Memory Systems, Monday, April 19 4:00 PM Pacific Time:
Irina Calciu, M. Talha Imran, Ivan Puddu, Sanidhya Kashyap, Hasan Al Maruf, Onur Mutlu, and Aasheesh Kolli,
“Rethinking Software Runtimes for Disaggregated Memory”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]
[Source Code (Officially Artifact Evaluated)]

Session 8: Tools & Frameworks, Tuesday, April 20 4:00 PM Pacific Time: 
Nastaran Hajinazar, Geraldo F. Oliveira, Sven Gregorio, Joao Dinis Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gomez-Luna, and Onur Mutlu,
“SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]

Session 17: Solid State Drives, Thursday, April 22 7:00 AM Pacific Time:
Jisung Park, Myungsuk Kim, Myoungjun Chun, Lois Orosa, Jihong Kim, and Onur Mutlu,
“Reducing Solid-State Drive Read Latency by Optimizing Read-Retry”
Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Virtual, March-April 2021.
[2-page Extended Abstract]

ASPLOS Program:  https://asplos-conference.org/program/

Onur Mutlu and Co-authors Receive the 2021 HPCA Test of Time Award

Congratulations to Onur Mutlu and co-authors on receiving the HPCA Test of Time Award for their 2003 HPCA paper:

Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors
Onur Mutlu, Jared Stark, Chris Wilkerson, Yale N. Patt

The IEEE International Symposium on High-Performance Computer Architecture (HPCA) Test of Time Award recognizes the most influential papers published in prior sessions of HPCA (held 18-22 years ago) that have had a significant impact in the field.

The paper was Professor Onur Mutlu’s first publication during his PhD at the University of Texas, written with his PhD advisor Professor Yale Patt and colleagues from Intel, Dr. Jared Stark and Chris Wilkerson.  The award committee described the significance of the paper as follows: “Runahead Execution is a pioneering paper that opened up new avenues in dynamic prefetching. The basic idea of run ahead execution effectively increases the instruction window very significantly, without having to increase physical resource size (e.g. the issue queue). This seminal paper spawned off a new area of ILP-enhancing microarchitecture research. This work has had strong industry impact as evidenced by IBM’s POWER6 – Load Lookahead, NVIDIA Denver, and Sun ROCK’s hardware scouting.” The award was presented last week at HPCA 2021 on March 2, 2021.

Watch Onur’s Retrospective HPCA Test of Time Award Talk Video (14 minutes)


Onur Mutlu, Jared Stark, Chris Wilkerson, and Yale N. Patt,
“Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors”
Proceedings of the 9th International Symposium on High-Performance Computer Architecture (HPCA), pages 129-140, Anaheim, CA, February 2003.
[Talk Slides (pdf)]
[Lecture Slides (pptx) (pdf)]
[Lecture Video (1 hr 54 mins)]
[Retrospective HPCA Test of Time Award Talk Slides (pptx) (pdf)]
[Retrospective HPCA Test of Time Award Talk Video (14 minutes)]
One of the 15 computer architecture papers of 2003 selected as Top Picks by IEEE Micro.
HPCA Test of Time Award (awarded in 2021).

Interview with Mohammed Alser: on his recent papers and his future work


Mohammed Alser is a Senior Researcher and Lecturer with SAFARI.
He was previously a PhD student in SAFARI, co-advised with Can Alkan. Mohammed co-teaches two Projects and Seminars courses on Genome Sequencing Analysis and Mobile Genomics along with the Seminar on Computer Architecture.  We recently interviewed Mohammed for the January 2021 issue of the SAFARI Newsletter.  


You have been busy this past year, and have published quite a few papers. Your recent work, SneakySnake, was recently published in Bioinformatics. This is an important work in improving computations for genome analysis. Can you tell us more about the significance of this work, and what broader impacts you hope for it?

SneakySnake is one of the projects that I enjoyed the most working on. We try in this work to significantly reduce the time spent on finding the similarities and differences between two genomic sequences without sacrificing solution optimality. Finding the similarities and differences between two sequences is a well-known computer science problem, called approximate string matching (ASM), which is solved using computationally expensive algorithms.

SneakySnake quickly finds the sequence pairs that have a large number of differences (greater than a user-defined threshold) and avoids applying computationally expensive algorithms to these sequence pairs, as such pairs are usually not useful for genomic studies. SneakySnake is inspired by the single net routing (SNR) problem in VLSI design that was introduced in 1976. SneakySnake is the first work that proposes to convert the ASM problem into an instance of the SNR problem, which provides several key benefits as we discuss in the paper, and proposes a new efficient algorithm for comparing genomic sequences at scale.

SneakySnake is very beneficial for analyzing both short (e.g., Illumina) and long (e.g., nanopore) sequences as it accelerates the analysis of genomic sequences by up to two orders of magnitude compared to the state-of-the-art algorithms. SneakySnake works efficiently and fast on modern CPU, FPGA, and GPU architectures, which can potentially enable new applications of genome sequencing such as rapid surveillance of disease outbreaks including Ebola and COVID-19, near-patient testing, and bringing precision medicine to remote locations, without the need for large infrastructure.

One of the Bioinformatics journal’s reviewers states that: “SneakySnake is a valuable contribution to bioinformatics and it was innovative to reduce the ASM problem to the SNR problem in VLSI CAD”.
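The filtering idea described in this answer can be illustrated with a minimal Python sketch. This is not the SneakySnake algorithm: it substitutes a much simpler, hypothetical filter (a lower bound from length and character-count differences) placed in front of a textbook dynamic-programming edit distance. The function names and the example sequences are assumptions made for illustration only. What it shows is the general pattern: a cheap check rejects pairs that certainly exceed the threshold, so the expensive computation runs only on the remaining pairs, and no pair within the threshold is ever rejected.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Textbook dynamic-programming (Levenshtein) edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1]


def cheap_lower_bound(a: str, b: str) -> int:
    """Hypothetical filter: a lower bound on the edit distance.

    Each edit changes the length by at most 1 and the character-count
    vector by at most 2 (in L1 norm), so both quantities bound the
    number of edits from below.
    """
    count_a, count_b = Counter(a), Counter(b)
    l1 = sum(abs(count_a[ch] - count_b[ch]) for ch in set(count_a) | set(count_b))
    return max(abs(len(a) - len(b)), (l1 + 1) // 2)


def within_threshold(a: str, b: str, threshold: int) -> bool:
    """Return True if the pair differs by at most `threshold` edits."""
    if cheap_lower_bound(a, b) > threshold:
        return False  # filtered out: the expensive step is skipped
    return edit_distance(a, b) <= threshold


if __name__ == "__main__":
    print(within_threshold("ACGT", "ACGA", 1))
    print(within_threshold("AAAA", "TTTT", 2))
```

Because the filter only uses a lower bound, it can let through pairs that the full computation later rejects, but it never discards a pair that is actually within the threshold.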


You also recently published Accelerating Genome Analysis, which reviews the improvements made in hardware accelerators for genome analysis. What are your take away messages from this paper, and what do you see as future priorities in hardware improvements for genome analysis?

Most speedup comes from parallelism enabled by novel architectures and algorithms. We need to develop acceleration solutions that exploit new efficient hardware-aware algorithms, hardware/software co-design, and hardware accelerators to achieve a high degree of parallelism.

Accelerating the entire genome analysis pipeline is important. Accelerating only a single step of genome analysis is not an effective acceleration approach as it limits the overall achieved speedup according to Amdahl’s Law.

Genome analysis is currently heavily bottlenecked by data movement. We need to reduce the high amount of data movement that takes place during genome analysis. Moving data (1) between compute units and main memory, (2) between multiple hardware accelerators, and (3) between the sequencing machine and the computer performing the analysis incurs high costs in terms of execution time and energy. These costs are a significant barrier to enabling efficient analysis that can keep up with sequencing technologies.

The need for flexible hardware architectures. We need to develop flexible hardware architectures that do not conservatively limit the range of supported parameter values at design time. Rapid changes in sequencing technologies (e.g., those that result in high sequencing error rates and longer read lengths) can quickly make specialized hardware with restricted parameter values obsolete.

The need for new genomic data formats. We need to adapt existing genomic data formats for hardware accelerators or develop more efficient file formats to maximize the benefits of hardware accelerators and reduce resource utilization.

Looking into the future, building a genome sequencing machine that provides the entire genome as a single string, rather than its short subsequences, might be possible. However, we believe that hardware acceleration of whole-genome analysis will remain necessary. We also believe that performing genome analysis inside the sequencing machine itself can significantly improve efficiency by eliminating sequencer-to-computer data movement.
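The Amdahl's Law point in this answer can be made concrete with a short Python sketch. The function name and the example numbers (a step that takes half of the pipeline's runtime and is accelerated 100x) are hypothetical and are not taken from the paper; the formula itself is the standard statement of Amdahl's Law.

```python
def amdahl_speedup(fraction_accelerated: float, step_speedup: float) -> float:
    """Overall speedup when only part of a pipeline is accelerated.

    fraction_accelerated: share of the original runtime spent in the
        accelerated step (between 0 and 1).
    step_speedup: how many times faster that step becomes.
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be in [0, 1]")
    if step_speedup <= 0:
        raise ValueError("step_speedup must be positive")
    remaining = 1.0 - fraction_accelerated
    return 1.0 / (remaining + fraction_accelerated / step_speedup)


if __name__ == "__main__":
    # Hypothetical pipeline: one step takes 50% of the runtime.
    for s in (10, 100, 1000):
        print(f"step speedup {s}x -> overall {amdahl_speedup(0.5, s):.3f}x")
```

With these assumed numbers the overall speedup stays below 2x no matter how fast the single step becomes, which is why accelerating the entire pipeline matters.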


Your work has many topical applications that are highly relevant to society, including COVID modeling. Can you talk a bit about this, and your future research directions?

As the entire world is largely negatively impacted by the recent COVID-19 outbreak, we believe that everyone can help to end this pandemic based on their skills, expertise, and available resources. At SAFARI research group, we are helping with two main directions.

We are developing an accurate and configurable prediction model that evaluates the existing mitigation measures that the government applies in a region and provides suggestions on what strength the future mitigation measures should be. We are quantifying the spread of COVID-19 in Switzerland (as a use-case) by calculating the daily reproduction number of COVID-19, which quantifies how many people are infected on average by an infected person. The reproduction number is directly affected by the mitigation measures that the government applies to a region. We are also considering other important factors such as daylight temperature that significantly affect the spread of COVID-19 as we observed during the year 2020.

We are also working on developing new algorithms and hardware accelerators that perform fast and accurate metagenomic profiling for assessing microbial diversity, identifying potential new species, and investigating microbiomes associated with COVID-19 and other diseases. Performing genomic tests at scale during a pandemic highlights the dire need for building efficient specialized hardware that is both scalable and portable to enable genome analysis anywhere and anytime. We hope that the progress we make in this direction will also enable new applications that benefit human life and society.

Mohammed Alser, Taha Shahroodi, Juan Gomez-Luna, Can Alkan, and Onur Mutlu, SneakySnake: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs, Bioinformatics, December 2020.
Paper PDF | Paper link Bioinformatics | Source Code

Mohammed Alser, Zulal Bingol, Damla Senol Cali, Jeremie Kim, Saugata Ghose, Can Alkan, and Onur Mutlu, Accelerating Genome Analysis: A Primer on an Ongoing Journey, IEEE MICRO, September/October 2020.
Paper | Slides (pptx) (pdf)

Interview with Damla Senol Cali: about her work, experience in SAFARI, and her future directions

Damla Senol Cali is a PhD student with SAFARI at CMU co-advised by Onur Mutlu and Saugata Ghose.
We recently interviewed Damla about her work, her experience as a PhD student in SAFARI, and her future directions.  Damla’s interview appears as the first video contribution to the SAFARI Meet our Members section in our January 2021 newsletter.

Watch Damla’s video interview here    |    Read the transcript here 

Damla Senol Cali, Gurpreet S. Kalsi, Zulal Bingol, Can Firtina, Lavanya Subramanian, Jeremie S. Kim, Rachata Ausavarungnirun, Mohammed Alser, Juan Gomez-Luna, Amirali Boroumand, Anant Nori, Allison Scibisz, Sreenivas Subramoney, Can Alkan, Saugata Ghose, and Onur Mutlu, GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis, Proceedings of the 53rd International Symposium on Microarchitecture (MICRO), Virtual, October 2020.

Full paper link | Talk Video (18 mins) | Talk Slides (pptx) (pdf) |
Lecture Video (37 mins) | Lecture Slides (pptx) (pdf) | GenASM Source Code |
More information here

Damla Senol Cali, Jeremie Kim, Saugata Ghose, Can Alkan, and Onur Mutlu, Nanopore Sequencing Technology and Tools for Genome Assembly: Computational Analysis of the Current State, Bottlenecks and Future Directions, Briefings in Bioinformatics (BIB), 2018.

Paper link | Paper PDF | AACBB’19 Talk Video | Slides (ppt) (pdf) |
More information here

Read the latest edition of our SAFARI Newsletter

Dear SAFARI friends,

Happy New Year!  We are excited to share our group highlights with you in this second edition of the SAFARI newsletter: https://safari.ethz.ch/safari-newsletter-january-2021/

In this second edition of the SAFARI newsletter, we share our research, teaching and outreach highlights from 2020, and look ahead to a new and inspiring future in 2021.

We wish you a wonderful 2021, in all aspects of your lives!

Onur Mutlu

TRRespass wins the Pwnie Award for Most Innovative Research

TRRespass won the Pwnie Award for “Most Innovative Research” at the annual BlackHat Europe conference this week.  Pwnies are the most prestigious industrial awards in the security community.   Congratulations to the authors: Pietro Frigo, Emanuele Vannacci, Hasan Hassan, Victor van der Veen, Onur Mutlu, Cristiano Giuffrida, Herbert Bos, and Kaveh Razavi on this prestigious prize!

We recently interviewed Hasan Hassan about his contribution to TRRespass.  Here’s what he had to say:

You were a co-author on TRRespass, which recently won a Best Paper Award at IEEE S&P. What is the significance of this paper?

Shortly after the discovery of the RowHammer vulnerability of DRAM, DRAM vendors announced RowHammer-free DRAM devices that implement in-DRAM solutions to protect against RowHammer. However, in TRRespass, we find that such solutions, commonly referred to as Target Row Refresh (TRR), do not effectively protect against RowHammer attacks when many rows are hammered at the same time. We show that the RowHammer vulnerability is not only still intact on the current DDR4 devices, but it has also become worse due to technology node scaling.

How was your experience in collaborating with the Systems and Network Security Group at VU Amsterdam on this work?

I am glad that our combined effort with the Systems and Network Security Group at VU Amsterdam won us the Best Paper Award at IEEE S&P. It has been a great experience for me to collaborate with experts in hardware security. I hope there will be more such collaborations that result in impactful research.

Which tools did you use in this work?

I think SoftMC, our FPGA-based DRAM testing infrastructure, was one of the key enablers of this research. We used SoftMC to interface with DDR4 DRAM chips in a much more flexible way than anyone can do using commodity desktop and mobile systems. Specifically, we used SoftMC to communicate with DRAM chips using low-level DDR4 commands as opposed to using load/store instructions provided by typical instruction set architectures. In a way, SoftMC lets us be the memory controller and provides the flexibility of issuing any DDR4 command at any time, which is not possible with commodity systems.

An earlier version of SoftMC that supports DDR3 devices is open-source and can be accessed here. In 2017, we published a paper that describes the design of SoftMC in detail.

I am also involved in maintaining Ramulator, a cycle-accurate DRAM simulator that we describe in this paper, and Scarab, which is a cycle-accurate simulator for state-of-the-art multicore CPUs.


Pietro Frigo, Emanuele Vannacci, Hasan Hassan, Victor van der Veen, Onur Mutlu, Cristiano Giuffrida, Herbert Bos, and Kaveh Razavi, “TRRespass: Exploiting the Many Sides of Target Row Refresh”, Proceedings of the 41st IEEE Symposium on Security and Privacy (S&P), San Francisco, CA, USA, May 2020.
Slides (pptx) (pdf)
Lecture Slides (pptx) (pdf)
Talk Video (17 minutes)
Lecture Video (59 minutes)
Source Code
Web Article
Project Overview
Best paper award.
Pwnie Award 2020 for Most Innovative Research. Pwnie Awards 2020

Paper: SneakySnake🐍: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs

Our recent paper has been accepted in Bioinformatics!

Mohammed Alser, Taha Shahroodi, Juan Gomez-Luna, Can Alkan, and Onur Mutlu,
“SneakySnake: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs”
Bioinformatics, to appear in 2020.

Source Code:  SneakySnake🐍: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs

Radio Interview: Onur Mutlu discusses the review process

Listen to Onur Mutlu’s interview as he discusses the review process in the “ORF Dimensionen” broadcast on “Peer Review and Open Science”

How well does peer review work?
Interviewer: Mariann Unterluggauer
11 November 2020

The idea of using an assessment process to control the quality and efficiency of scientific work emerged in the middle of the 20th century. Since then, publishers have been using this “peer review” as a basis for making decisions about what should appear in their specialist journals. And anyone who wants to be awarded funding for their research projects must first go through and pass such an assessment process. However, this does not always work properly, as misjudgments and sloppiness have shown in the past. In addition, a positive “peer review” does not automatically mean that there is good or relevant science in a publication. – So what’s the point of peer review? Does the procedure have to be evaluated itself? Above all, the discussion about open science has stimulated the discussion about the review process of science.

Some useful related videos:

  • https://youtu.be/HvswnsfG3oQ?t=1800 (Onur Mutlu, Computer Architecture – Lecture 5c: Secure and Reliable Memory (ETH Zürich, Fall 2020): discussing the review process of RowHammer)
  • https://youtu.be/FYwOyapck3M?t=5421 (Onur Mutlu, Seminar in Computer Architecture – Lecture 2: RowClone – In-Memory Data Copy (ETH Zürich, Fall 2020): discussing the review process of RowClone)
  • https://youtu.be/yEYEzFwAY9g?t=4445 (Onur Mutlu, Seminar in Computer Architecture – Lecture 3: Memory Channel Partitioning (ETH Zürich, Fall 2020): discussing the review process of Memory Channel Partitioning)

We are at MICRO 2020 this week! Join Lois Orosa for his talk on FIGARO, Monday, October 19 6:30 PM CEST

Our new paper: FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching will be presented by Lois Orosa at MICRO 2020 on Monday, October 19 at 6:30 PM CEST.  Join us at MICRO 2020 online!

Authors: Yaohua Wang, Lois Orosa, Xiangjun Peng, Yang Guo, Saugata Ghose, Minesh Patel, Jeremie S. Kim, Juan Gómez Luna, Mohammad Sadrosadati, Nika Mansouri Ghiasi, and Onur Mutlu

Proceedings of the 53rd International Symposium on Microarchitecture (MICRO), Virtual, October 2020.
[Slides (pptx) (pdf)]
[Lightning Talk Slides (pptx) (pdf)]
[Talk Video (16 minutes)]
[Lightning Talk Video (1.5 minutes)]