SAFARI Live Seminar: Damla Senol Cali 07 Nov 2021

Join us for our SAFARI Live Seminar with Damla Senol Cali.

SAFARI Live Seminar: Nastaran Hajinazar 27 Oct 2021

Join us for our SAFARI Live Seminar with Nastaran Hajinazar.

SAFARI Live Seminar: Jawad Haj-Yahya 4 October 2021

Monday, October 4 at 5:30 pm Zurich time (CEST)

Security Implications of Power Management Mechanisms in Modern Processors: Current Studies and Future Trends
Jawad Haj-Yahya, Huawei Research Center Zurich

Livestream at 5:30 pm Zurich time (CEST) on YouTube: Link

Abstract:
Despite the failure of Dennard scaling, the slow-down in Moore’s Law, and the high power density of modern processors, power management mechanisms have enabled significant advances in modern microprocessor performance and energy efficiency. Yet, current power management architectures also pose serious security implications. This is mainly because functionality rather than security has been the main consideration in the design of power management mechanisms in commodity microprocessors.

In this seminar, we provide a detailed overview of state-of-the-art power management mechanisms used in modern microprocessors. Based on this background, we present our recently-revealed set of new vulnerabilities, called IChannels. IChannels is a set of covert channels that exploits the multi-level throttling employed by current management mechanisms in modern processors. These covert channels can be established between two execution contexts 1) on the same hardware thread, 2) across simultaneous multithreading (SMT) threads, and 3) across different physical cores. Finally, we discuss a set of practical mitigation mechanisms to protect a system against known covert channels resulting from current management mechanisms.
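The throttling-based channel described above can be conveyed with a toy simulation. Everything in the sketch below is invented for illustration (latency constants, noise model, function names); the real IChannels attack measures actual instruction latencies on hardware rather than simulated ones.

```python
import random

# Toy simulation of a throttling-based covert channel (illustrative only; the
# real IChannels attack measures actual instruction latencies on hardware).
# The sender encodes a 1 by triggering power-intensive work that throttles the
# core; the receiver decodes bits by timing its own instructions.

FAST_LATENCY = 1.0   # invented latency at full speed
SLOW_LATENCY = 3.0   # invented latency while throttled
NOISE = 0.2          # measurement noise, well below the decision margin

def sender_transmit(bit):
    """Return the latency the receiver would observe for this bit."""
    base = SLOW_LATENCY if bit else FAST_LATENCY
    return base + random.uniform(-NOISE, NOISE)

def receiver_decode(latency):
    return 1 if latency > (FAST_LATENCY + SLOW_LATENCY) / 2 else 0

message = [1, 0, 1, 1, 0, 0, 1, 0]
received = [receiver_decode(sender_transmit(b)) for b in message]
print(received == message)  # True: noise never crosses the decision threshold
```

On real hardware the sender and receiver share no memory at all; the shared power management state alone carries the bits, which is why such channels evade conventional isolation mechanisms.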

We conclude by discussing future follow-up works on vulnerabilities due to power management mechanisms and possible mitigations to explore in these critical and exciting areas.

This talk is based on the following paper: 

Jawad Haj-Yahya, Jeremie S. Kim, A. Giray Yaglikci, Ivan Puddu, Lois Orosa, Juan Gomez Luna, Mohammed Alser, and Onur Mutlu, “IChannels: Exploiting Current Management Mechanisms to Create Covert Channels in Modern Processors”, Proceedings of the 48th International Symposium on Computer Architecture (ISCA), Virtual, June 2021.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Talk Video (21 minutes)]

Speaker Bio:
Jawad Haj-Yahya received his Ph.D. degree in Computer Science from Haifa University, Israel. Jawad was a processor architect for many years at Intel. His awards and honors include the Intel Achievement Award (the highest award at Intel) for his significant contribution to Intel processors.
Jawad worked at Nanyang Technological University (NTU), Singapore as a cybersecurity Research Scientist, where he led the architecture and design of a secure-processor project based on the RISC-V architecture. He then moved to the Institute of Microelectronics (IME) at A*STAR Singapore, where he was a Scientist III and worked on hardware security and an AI accelerator. Jawad next worked as a Senior Researcher in the SAFARI Research Group at ETH Zurich, where he led multiple projects on Energy-Efficient Computing and Hardware Security, before moving to his current position as a principal researcher at Huawei Research Center in Zurich.

SAFARI Live Seminar: Christina Giannoula 27 September 2021

Monday, September 27 at 5:30 pm Zurich time (CEST)

Efficient Synchronization Support for Near-Data-Processing Architectures
Christina Giannoula, National Technical University of Athens

Livestream at 5:30 pm Zurich time (CEST) on YouTube: Link

Abstract:

Recent advances in 3D-stacked memories have renewed interest in Near-Data Processing (NDP). NDP architectures perform computation close to where the application data resides, and constitute a promising way to alleviate data movement costs. These architectures can provide significant performance and energy benefits to parallel applications. Typical NDP architectures support several NDP units, each including multiple simple cores placed close to memory. To fully leverage the benefits of NDP and achieve high performance for parallel workloads, efficient synchronization among the NDP cores of a system is necessary. However, supporting synchronization in many NDP systems is challenging due to three architectural characteristics: (i) most NDP architectures lack shared caches that can enable low-cost communication and synchronization among NDP cores of the system, (ii) hardware cache coherence protocols are typically not supported in NDP systems due to high area and traffic overheads, and (iii) NDP systems are non-uniform, distributed architectures, in which inter-unit communication is more expensive (both in performance and energy) than intra-unit communication.

In this seminar, we comprehensively examine the synchronization problem in NDP systems, and propose SynCron, an end-to-end synchronization solution for NDP systems. SynCron is designed to achieve the goals of high performance, low cost, programming ease, and generality (covering a wide range of synchronization primitives) through four key techniques. First, SynCron adds low-cost hardware support near memory for synchronization acceleration. Second, SynCron includes a specialized cache memory structure to avoid memory accesses for synchronization and minimize latency overheads. Third, it implements a hierarchical message-passing communication protocol to minimize expensive communication across NDP units of the system. Fourth, SynCron integrates a hardware-only overflow management scheme to avoid performance degradation when hardware resources for synchronization tracking are exceeded.
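The intuition behind the hierarchical message-passing protocol can be sketched with a toy message-count model. The function names and the counting model below are invented for illustration and are not SynCron's actual interface; they only show why aggregating per unit shrinks the number of expensive inter-unit messages.

```python
# Toy message-count model of hierarchical synchronization across NDP units
# (invented for illustration; not SynCron's actual protocol or interface).
# Inter-unit messages are the expensive ones a hierarchical scheme minimizes.

def flat_inter_unit_messages(units, cores_per_unit, master_unit=0):
    """Flat scheme: every core messages one global master engine directly."""
    return sum(cores_per_unit for u in range(units) if u != master_unit)

def hierarchical_inter_unit_messages(units, cores_per_unit, master_unit=0):
    """Hierarchical scheme: cores notify their local per-unit engine (cheap,
    intra-unit); only one aggregated message per unit crosses to the master."""
    return sum(1 for u in range(units) if u != master_unit)

print(flat_inter_unit_messages(4, 16))          # 48 expensive messages
print(hierarchical_inter_unit_messages(4, 16))  # 3 expensive messages
```

For a barrier over 4 units of 16 cores each, the flat scheme sends one expensive message per remote core, while the hierarchical scheme sends one per remote unit; the gap grows linearly with cores per unit.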

Our work is the first one to analyze synchronization primitives in NDP systems using a variety of parallel workloads, covering various contention scenarios, and evaluating various NDP configurations. We demonstrate that SynCron achieves significant performance and energy improvements both under high-contention and low-contention scenarios, while it also has low hardware area and power overheads. We conclude that SynCron is an efficient synchronization mechanism for NDP systems, and hope that this work encourages further research on the synchronization problem in heterogeneous systems, including NDP systems.

Bio:

Christina Giannoula is a Ph.D. student in the School of Electrical and Computer Engineering at the National Technical University of Athens (NTUA). She is working in the Computing Systems Laboratory, and is an affiliated Ph.D. researcher in the SAFARI research group at ETH Zürich, which is led by Prof. Onur Mutlu. She received a 5-year Diploma degree (Masters equivalent) in Electrical and Computer Engineering from NTUA in 2016, receiving several distinctions, including the ‘Paris Kanellakis’ NTUA award, and graduating in the top 2% of her class. Since 2017, she has been working toward a Ph.D. degree at NTUA, and in 2019 she was a visiting Ph.D. researcher in the SAFARI research group at ETH Zürich, advised by Prof. Onur Mutlu and mentored by Prof. Nandita Vijaykumar. Her research interests lie in the intersection of computer architecture and high-performance computing. Specifically, her research focuses on the hardware/software co-design of emerging applications, including graph processing, pointer-chasing data structures, machine learning workloads, and sparse linear algebra, with modern computing paradigms, such as large-scale multicore systems and near-data processing architectures. She has several publications and awards for her research on these topics.


Christina Giannoula, Nandita Vijaykumar, Nikela Papadopoulou, Vasileios Karakostas, Ivan Fernandez, Juan Gómez-Luna, Lois Orosa, Nectarios Koziris, Georgios Goumas, and Onur Mutlu, “SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures”, Proceedings of the 27th International Symposium on High-Performance Computer Architecture (HPCA), Virtual, February-March 2021.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Talk Video (21 minutes)]
[Short Talk Video (7 minutes)]

SAFARI Live Seminar: Minesh Patel 21 September 2021

Join us for our SAFARI Live Seminar with Minesh Patel.
Tuesday, September 21 at 5:00 pm Zurich time (CEST)

Enabling Effective Error Mitigation in Memory Chips That Use On-Die Error-Correcting Codes
Minesh Patel, SAFARI Research Group, ETH Zurich

Livestream at 5:00 pm Zurich time (CEST) on YouTube: Link

Abstract: 

Improvements in main memory storage density are primarily driven by process technology shrinkage (i.e., technology scaling), which negatively impacts reliability by exacerbating various circuit-level error mechanisms. To offset growing error rates, both memory manufacturers and consumers develop and incorporate error-mitigation mechanisms that improve manufacturing yield and allow system designers to meet desired reliability targets. Developing effective error mitigation techniques requires understanding the errors’ characteristics (e.g., worst-case behavior, statistical properties). Unfortunately, we observe that proprietary on-die Error-Correcting Codes (ECC) used in modern memory chips introduce new challenges to efficient error mitigation by obfuscating CPU-visible error characteristics in an unpredictable, ECC-dependent manner.

In this seminar, we experimentally study memory errors, examine how on-die ECC obfuscates their statistical characteristics, and develop new testing techniques to overcome the obfuscation through four key steps. First, we experimentally study DRAM data-retention error characteristics to understand the challenges inherent in understanding and mitigating technology-scaling-related errors. Second, we study how on-die ECC affects these characteristics to develop Error Inference (EIN), a statistical inference methodology for inferring details of the on-die ECC mechanism and the pre-correction errors. Third, we examine the on-die ECC mechanism in detail to understand exactly how on-die ECC obfuscates raw bit error patterns. Using this knowledge, we introduce Bit Exact ECC Recovery (BEER), a new testing methodology that exploits uncorrectable error patterns to (1) reverse-engineer the exact on-die ECC implementation used in a given chip and (2) identify the bit-exact locations of pre-correction errors that correspond to a given set of observed post-correction errors. Fourth, we study how on-die ECC impacts error profiling and show that on-die ECC introduces three key challenges that impact profiling practicality and effectiveness. To overcome these challenges, we introduce Hybrid Active-Reactive Profiling (HARP), a new profiling strategy that uses simple modifications to the on-die ECC mechanism to quickly and effectively identify bits at risk of error. Finally, we conclude by discussing the need for transparency in DRAM reliability characteristics in order to enable DRAM consumers to better understand and adapt commodity DRAM chips to their system-specific needs. In general, we hope and believe that these new testing techniques will enable scientists and engineers to make informed decisions towards building smarter systems.

Bio:

Minesh Patel is a Ph.D. candidate at ETH Zurich working with Prof. Onur Mutlu. He received B.S. degrees in ECE and Physics from the University of Texas in 2015. Since then, he has been working toward his Ph.D. degree with a focus on memory systems reliability. His current research interests broadly span computer systems and architecture topics, including support for speculative and/or unreliable systems, performance modeling and analysis, and application characterization and optimization.


This talk is based on four papers we published respectively at ISCA 2017, DSN 2019, MICRO 2020 and MICRO 2021 (to appear). The links to available individual papers and slides are below.

Minesh Patel, Jeremie S. Kim, and Onur Mutlu, “The Reach Profiler (REAPER): Enabling the Mitigation of DRAM Retention Failures via Profiling at Aggressive Conditions”, Proceedings of the 44th International Symposium on Computer Architecture (ISCA), Toronto, Canada, June 2017.
[Slides (pptx) (pdf)]
[Lightning Session Slides (pptx) (pdf)]

Minesh Patel, Jeremie S. Kim, Hasan Hassan, and Onur Mutlu, “Understanding and Modeling On-Die Error Correction in Modern DRAM: An Experimental Study Using Real Devices”, Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Portland, OR, USA, June 2019.
[Slides (pptx) (pdf)]
[Talk Video (26 minutes)]
[Full Talk Lecture (29 minutes)]
[Source Code for EINSim, the Error Inference Simulator]
Best paper award.

Minesh Patel, Jeremie S. Kim, Taha Shahroodi, Hasan Hassan, and Onur Mutlu, “Bit-Exact ECC Recovery (BEER): Determining DRAM On-Die ECC Functions by Exploiting DRAM Data Retention Characteristics”, Proceedings of the 53rd International Symposium on Microarchitecture (MICRO), Virtual, October 2020.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Lightning Talk Slides (pptx) (pdf)]
[Lecture Slides (pptx) (pdf)]
[Talk Video (15 minutes)]
[Short Talk Video (5.5 minutes)]
[Lightning Talk Video (1.5 minutes)]
[Lecture Video (52.5 minutes)]
[BEER Source Code]
Best paper award.

Minesh Patel, Geraldo Francisco de Oliveira Jr., and Onur Mutlu, “HARP: Practically and Effectively Identifying Uncorrectable Errors in Main Memory Chips That Use On-Die ECC”, Proceedings of the 54th International Symposium on Microarchitecture (MICRO), Virtual, October 2021.

SAFARI Live Seminar: Ataberk Olgun 15 September 2021

Join us for our next SAFARI Live Seminar with Ataberk Olgun.

Wednesday, September 15 at 5:00 pm Zurich time (CEST)

QUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM Chips
Ataberk Olgun, TOBB University of Economics and Technology & SAFARI Research Group, ETH Zurich

Livestream at 5:00 pm Zurich time (CEST) on YouTube: Link

Abstract:

True random number generators (TRNGs) sample random physical processes to create large amounts of random numbers for various use cases, including security-critical cryptographic primitives, scientific simulations, machine learning applications, and even recreational entertainment. Unfortunately, not every computing system is equipped with dedicated TRNG hardware, limiting the application space and security guarantees for such systems. To open the application space and enable security guarantees for the overwhelming majority of computing systems that do not necessarily have dedicated TRNG hardware, we develop QUAC-TRNG.

QUAC-TRNG exploits the new observation that a carefully-engineered sequence of DRAM commands activates four consecutive DRAM rows in rapid succession. This QUadruple ACtivation (QUAC) causes the bitline sense amplifiers to non-deterministically converge to random values when we activate four rows that store conflicting data because the net deviation in bitline voltage fails to meet reliable sensing margins.

We experimentally demonstrate that QUAC reliably generates random values across 136 commodity DDR4 DRAM chips from one major DRAM manufacturer. We describe how to develop an effective TRNG (QUAC-TRNG) based on QUAC. We evaluate the quality of our TRNG using NIST STS and find that QUAC-TRNG successfully passes each test. Our experimental evaluations show that QUAC-TRNG generates true random numbers with a throughput of 3.44 Gb/s (per DRAM channel), outperforming the state-of-the-art DRAM-based TRNG by 15.08x and 1.41x for basic and throughput-optimized versions, respectively. We show that QUAC-TRNG utilizes DRAM bandwidth better than the state-of-the-art, achieving up to 2.03x the throughput of a throughput-optimized baseline when scaling bus frequencies to 12 GT/s.
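DRAM-based TRNGs like the one above typically post-process raw sense-amplifier samples before use. The sketch below applies a textbook von Neumann corrector to a simulated, deliberately biased raw bitstream; the 60% bias, the source model, and this particular extractor are invented for illustration and may differ from QUAC-TRNG's actual post-processing.

```python
import random

# Von Neumann corrector applied to a simulated, slightly biased raw bitstream.
# Generic textbook post-processing for illustration; QUAC-TRNG's actual
# post-processing and raw-bit characteristics may differ.

def von_neumann(bits):
    """Map the pair 01 -> 0 and 10 -> 1; discard 00 and 11. For independent
    samples, this yields unbiased output even from a biased source."""
    return [a for a, b in zip(bits[0::2], bits[1::2]) if a != b]

random.seed(0)
raw = [1 if random.random() < 0.6 else 0 for _ in range(100_000)]  # ~60% ones
clean = von_neumann(raw)
raw_bias = abs(sum(raw) / len(raw) - 0.5)
out_bias = abs(sum(clean) / len(clean) - 0.5)
print(raw_bias > 0.05 and out_bias < 0.02)  # True: the bias is removed
```

The trade-off is throughput: the corrector discards roughly half of the pairs here, which is why high raw-bit throughput (such as the 3.44 Gb/s per channel reported above) matters for the final random-number rate.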

Bio:

Ataberk Olgun received his BSc degree in Computer Engineering from TOBB University of Economics and Technology, where he is currently pursuing a master's degree. He joined the SAFARI Research Group as an undergraduate intern in 2019. Since then, he has worked on many projects on DRAM, security, and processing-in-memory.

===================

Ataberk Olgun, Minesh Patel, A. Giray Yaglikci, Haocong Luo, Jeremie S. Kim, F. Nisa Bostanci, Nandita Vijaykumar, Oguz Ergin, and Onur Mutlu, “QUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM Chips”, Proceedings of the 48th International Symposium on Computer Architecture (ISCA), Virtual, June 2021.
[Long Talk Video (25 minutes)]
[Long Talk Slides (pptx) (pdf)]
[Short Talk Video (7 minutes)]
[Short Talk Slides (pptx) (pdf)]
[Conference Talk and Q&A (15 minutes)]

===================

Related talks & lectures: 

===================

D-RaNGe: True Random Number Generation with Commodity DRAM https://www.youtube.com/watch?v=Y3hPv1I5f8Y&list=PL5Q2soXY2Zi-DyoI3HbqcdtUm9YWRR_z-&index=16 

DRAM Latency PUFs (Physical Unclonable Functions) https://www.youtube.com/watch?v=7gqnrTZpjxE&list=PL5Q2soXY2Zi-DyoI3HbqcdtUm9YWRR_z-&index=15 

CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations https://www.youtube.com/watch?v=ofBJnFQA6ic&list=PL5Q2soXY2Zi8_VVChACnON4sfh2bJ5IrD&index=133

Computer Architecture – Lecture 10: Low-Latency Memory (ETH Zürich, Fall 2020) https://www.youtube.com/watch?v=vQd1YgOH1Mw&list=PL5Q2soXY2Zi9xidyIgBxUz7xRPS-wisBN&index=19

Computer Architecture – Lecture 11a: Memory Controllers (ETH Zürich, Fall 2020) https://www.youtube.com/watch?v=TeG773OgiMQ&list=PL5Q2soXY2Zi9xidyIgBxUz7xRPS-wisBN&index=20

Public Lecture, Croucher ASI Workshop August 25

Join us next week at the Croucher Advanced Study Institute workshop on “Frontiers of AI Accelerators: Technologies, Circuits and Applications”, August 24 – 27, 2021, for Onur’s public lecture on “Intelligent Architectures for Intelligent Systems”.

Talk time:  August 25, 11:15 Zurich time (CEST)

Registration & Program: accessasi.hkust.edu.hk

Abstract: https://accessasi.hkust.edu.hk/lecture-4

Computing is bottlenecked by data. Large amounts of data overwhelm storage capability, communication capability, and computation capability of the modern machines we design today. As a result, many key applications’ performance, efficiency and scalability are bottlenecked by data movement. We describe three major shortcomings of modern architectures in terms of 1) dealing with data, 2) taking advantage of the vast amounts of data, and 3) exploiting different semantic properties of application data. We argue that an intelligent architecture should be designed to handle data well. We show that handling data well requires designing system architectures based on three key principles: 1) data-centric, 2) data-driven, 3) data-aware. We give several examples for how to exploit each of these principles to design a much more efficient and high-performance computing system. We will especially discuss recent research that aims to fundamentally reduce memory latency and energy, and practically enable computation close to data, with at least two promising novel directions: 1) performing computation in memory by exploiting the analog operational properties of memory, with low-cost changes, 2) exploiting the logic layer in 3D-stacked memory technology in various ways to accelerate important data-intensive applications. We discuss how to enable adoption of such fundamentally more intelligent architectures, which we believe are key to efficiency, performance, and sustainability. We conclude with some guiding principles for future computing architecture and system designs.

Reference papers:

  1. “Intelligent Architectures for Intelligent Computing Systems”
  2. “A Modern Primer on Processing in Memory”

Reference video:

  1. IEDM 2020 Tutorial: Memory-Centric Computing Systems, Onur Mutlu, 12 December 2020 

SAFARI Live Seminar: Jawad Haj-Yahya 16 August 2021

Join us for our next SAFARI Live Seminar with Jawad Haj-Yahya.

Monday, August 16 at 5:30 pm Zurich time (CEST)

Power Management Mechanisms in Modern Microprocessors and Their Security Implications
Jawad Haj-Yahya, Principal researcher at Huawei Research Center in Zurich

Livestream at 5:30 pm Zurich time (CEST) on YouTube:
https://www.youtube.com/watch?v=uSuRWYa3k2g

Abstract:
Billions of new devices (e.g., sensors, wearables, smartphones, tablets, laptops, servers) are being deployed each year with new services and features that are driving a higher demand for high performance microprocessors, which often have high power consumption. Despite the failure of Dennard scaling, the slow-down in Moore’s Law, and the high power-density of modern processors, power management mechanisms have enabled significant advances in modern microprocessor performance and energy efficiency. Yet, current power management architectures also pose serious security implications. This is mainly because functionality rather than security has been the main consideration in the design of power management mechanisms in commodity microprocessors.

In this seminar, we provide a detailed overview of the state-of-the-art in power management mechanisms, power delivery networks (PDNs), and security vulnerabilities of current management mechanisms in modern microprocessors. We first present, analyze and enhance the advanced power management mechanisms of modern microprocessors to improve energy and performance in active and idle power states. Second, we present the design and tradeoffs of modern power delivery networks, evaluate their implications on performance and energy-efficiency, and describe new techniques to mitigate PDN inefficiencies. We will especially introduce the idea and benefits of hybrid power delivery networks. Third, we present some of the security vulnerabilities that exist in current management mechanisms of modern processors and propose mitigation techniques. We conclude that power management, power delivery and resulting security implications are critical and exciting areas to research to make modern systems both more energy-efficient and higher performance.

Bio:
Jawad Haj-Yahya received his Ph.D. degree in Computer Science from Haifa University, Israel. Jawad was a processor architect for many years at Intel. His awards and honors include the Intel Achievement Award (the highest award at Intel), for his significant contribution to Intel processors. Jawad worked at Nanyang Technological University (NTU), Singapore as a cybersecurity Research Scientist where he led the architecture and design of a secure-processor project based on RISC-V architecture. He then moved to the Institute of Microelectronics (IME) at A*STAR Singapore where he was a Scientist III and worked on hardware security and an AI accelerator. Jawad next worked as a Senior Researcher in the SAFARI Research Group at ETH Zurich, where he led multiple projects on Energy-Efficient Computing and Hardware Security, before moving to his current position as principal researcher at Huawei Research Center in Zurich.


This talk is based on four papers we published respectively at HPCA 2020, ISCA 2020, MICRO 2020 and ISCA 2021. The links to individual papers and slides are below.

SAFARI Live Seminar: Gennady Pekhimenko 5 August 2021

Join us for our next SAFARI Live Seminar with Gennady Pekhimenko.

Thursday, August 5 at 5:00 pm Zurich time (CEST)

Efficient DNN Training at Scale: from Algorithms to Hardware
Gennady Pekhimenko, University of Toronto

Livestream at 5:00 pm Zurich time (CEST) on YouTube:
https://www.youtube.com/watch?v=QDLgeHfJ91w

Talk slides (pdf) (pptx)

Abstract:
The recent popularity of deep neural networks (DNNs) has generated a lot of research interest in performing DNN-related computation efficiently. However, the primary focus of systems research is usually quite narrow and limited to (i) inference, i.e., how to efficiently execute already-trained models, and (ii) image classification networks as the primary benchmark for evaluation. In this talk, we will demonstrate a holistic approach to DNN training acceleration and scalability starting from the algorithm, to software and hardware optimizations, to special development and optimization tools.

In the first part of the talk, I will show our radically new approach on how to efficiently scale backpropagation algorithms used in DNN training (BPPSA, MLSys’20). Then I will demonstrate a new approach on how to train multiple DNN models jointly on the same hardware (HFTA, MLSys’21). I will then demonstrate several approaches to deal with one of the major limiting factors in DNN training: limited GPU/accelerator memory capacity (Echo, ISCA’20 and Gist, ISCA’18). At the end, I will show the performance and visualization tools we built in my group to understand, visualize, and optimize DNN models, and even predict their performance on different hardware.
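The joint-training idea behind HFTA can be conveyed in a few lines of NumPy: stacking K small models' weights into one tensor lets a single batched operation compute all K forward passes at once, improving hardware utilization versus K separate launches. This is a toy illustration of horizontal fusion, not HFTA's actual API; all shapes and names below are invented.

```python
import numpy as np

# Toy illustration of horizontally fused training (the intuition behind HFTA,
# not its actual API): K models' weight matrices are stacked into one 3-D
# tensor so a single batched matmul drives all K forward passes at once.

rng = np.random.default_rng(0)
K, D_IN, D_OUT, BATCH = 4, 8, 3, 16

weights = rng.standard_normal((K, D_IN, D_OUT))   # K models, fused
x = rng.standard_normal((BATCH, D_IN))            # shared input batch

# One fused op computes all K forward passes:
fused_out = np.einsum('bi,kio->kbo', x, weights)

# Equivalent to running each model separately:
separate = np.stack([x @ weights[k] for k in range(K)])
print(np.allclose(fused_out, separate))  # True
```

In a hyperparameter sweep, the K models differ only in training configuration, so fusing them along a leading dimension keeps a GPU/accelerator busy where K tiny separate kernels would each underutilize it.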

Bio:
Gennady Pekhimenko is an Assistant Professor in the CS department (and, by courtesy, the ECE department) at the University of Toronto, where he is leading the EcoSystem (Efficient Computing Systems) group. Gennady is also a Faculty Member at the Vector Institute and a CIFAR AI Chair. Before joining the University of Toronto, he spent a year in 2017 at Microsoft Research in Redmond in the Systems Research group. He received his PhD from the Computer Science Department at Carnegie Mellon University in 2016. Gennady is a recipient of the Amazon Machine Learning Research Award, the Facebook Faculty Research Award, the Connaught New Researcher Award, and the NVIDIA Graduate, Microsoft Research, Qualcomm Innovation, and NSERC CGS-D Fellowships. His research interests are in the areas of systems, computer architecture, compilers, and applied machine learning.

 

Congratulations to Damla Senol Cali on successfully defending her PhD!

Thesis:  Accelerating Genome Sequence Analysis via Efficient Hardware/Algorithm Co-Design

Abstract:

Genome sequence analysis plays a pivotal role in enabling many medical and scientific advancements in personalized medicine, outbreak tracing, the understanding of evolution, and forensics. Modern genome sequencing machines can rapidly generate massive amounts of genomics data at low cost. However, the analysis of genome sequencing data is currently bottlenecked by the computational power and memory bandwidth limitations of existing systems, as many of the steps in genome sequence analysis must process a large amount of data. Our goals in this dissertation are to (1) characterize the real-system behavior of the genome sequence analysis pipeline and its associated tools, (2) expose the bottlenecks and tradeoffs of the pipeline and tools, and (3) co-design fast and efficient algorithms along with scalable and energy-efficient customized hardware accelerators for the key pipeline bottlenecks to enable faster genome sequence analysis.

First, we comprehensively analyze the tools in the genome assembly pipeline for long reads in multiple dimensions (i.e., accuracy, performance, memory usage, and scalability), uncovering bottlenecks and tradeoffs that different combinations of tools and different underlying systems lead to. We show that we need high-performance, memory-efficient, low-power, and scalable designs for genome sequence analysis in order to exploit the advantages that genome sequencing provides. Second, we propose GenASM, an acceleration framework that builds upon bitvector-based approximate string matching (ASM) to accelerate multiple steps of the genome sequence analysis pipeline. We co-design our highly-parallel, scalable and memory-efficient algorithms with low-power and area-efficient hardware accelerators. We evaluate GenASM for three different use cases of ASM in genome sequence analysis and show that GenASM is significantly faster and more power- and area-efficient than state-of-the-art software and hardware tools for each of these use cases. Third, we implement an FPGA-based prototype for GenASM, where state-of-the-art 3D-stacked memory (HBM2) offers high memory bandwidth and FPGA resources offer high parallelism by instantiating multiple copies of the GenASM accelerators. Fourth, we propose GenGraph, the first hardware acceleration framework for sequence-to-graph mapping. Instead of representing the reference genome as a single linear DNA sequence, genome graphs provide a better representation of the diversity among populations by encoding variations across individuals in a graph data structure, avoiding a bias towards any one reference. GenGraph enables the efficient mapping of a sequenced genome to a graph-based reference, providing more comprehensive and accurate genome sequence analysis.
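GenASM builds on the Bitap algorithm for bitvector-based approximate string matching. The Python sketch below shows plain Bitap (Wu–Manber variant) supporting up to k substitutions, to convey the bitvector idea; GenASM's modified algorithm additionally handles insertions and deletions, produces a traceback, and scales to long reads, none of which this toy version does.

```python
def bitap_fuzzy(text, pattern, k):
    """Bitvector (Bitap, shift-or) search for `pattern` in `text` with up to
    k substitutions; returns start indices of matches. A 0 bit marks an
    active state; R[d] tracks prefixes matched with <= d mismatches."""
    m = len(pattern)
    mask = {}
    for i, ch in enumerate(pattern):
        mask[ch] = mask.get(ch, ~0) & ~(1 << i)
    R = [~1] * (k + 1)
    matches = []
    for pos, ch in enumerate(text):
        cm = mask.get(ch, ~0)
        old = R[0]
        R[0] = (R[0] | cm) << 1           # advance on an exact character match
        for d in range(1, k + 1):
            tmp = R[d]
            # stay alive if the character matches OR we spend a substitution
            R[d] = ((R[d] | cm) & old) << 1
            old = tmp
        if R[k] & (1 << m) == 0:
            matches.append(pos - m + 1)
    return matches

print(bitap_fuzzy("zzabczz", "abc", 0))  # [2]  exact match
print(bitap_fuzzy("xabdz", "abc", 1))    # [1]  'c' substituted by 'd'
```

Because each text character updates every state with a handful of word-wide bitwise operations, this formulation maps naturally onto the simple, highly parallel hardware accelerators that GenASM co-designs.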

Overall, we demonstrate that genome sequence analysis can be accelerated by co-designing scalable and energy-efficient customized accelerators along with efficient algorithms for the key steps of genome sequence analysis.

Examining Committee

Onur Mutlu, Co-advisor, CMU-ECE, ETH Zurich
Saugata Ghose, Co-advisor, CMU-ECE, University of Illinois Urbana-Champaign
James C. Hoe, CMU-ECE
Can Alkan, Bilkent University

More on Damla’s publications, talks and research interests can be found on her website.