Meet our Members: Giray Yağlıkçı discusses his PhD work, RowHammer and future plans

Interview with Abdullah Giray Yağlıkçı, to appear in the SAFARI Newsletter July 2024 edition.

Giray recently defended his PhD and sat down to answer a few questions about his PhD work, RowHammer, and his future plans.


In your PhD work, you made some significant contributions to better understanding the RowHammer vulnerability.  Can you tell us a bit about your contributions and their overall significance to understanding (and maybe even coming closer to solving) RowHammer?

Giray: Certainly! We conducted several experiments on many real DRAM chips to understand their vulnerability to read disturbance under different conditions. We found this line of research interesting and extremely important because read disturbance is becoming more and more severe, and we need to understand why and how it happens really well so that we can solve it efficiently and scalably. These experiments and analyses were published in two papers: one in MICRO’21 [1] and one in DSN’22 [2]. Let me quickly mention what I really like about our findings with two examples.

First, we discovered that RowHammer has a really peculiar relationship with temperature: as temperature increases, a DRAM cell first becomes more vulnerable to read disturbance up to some temperature level and then becomes less vulnerable as we keep increasing the temperature. This observation seems quite odd at first glance because most people assume that vulnerability to read disturbance worsens monotonically with increasing temperature. So, we dug into the literature on circuit-level error mechanisms that explain why and how read disturbance might happen. We saw that a particular circuit-level error mechanism, called trap-assisted charge leakage, exhibits a temperature sensitivity similar to that of read disturbance. This finding shows us that this particular error mechanism actually dominates other error mechanisms in many DRAM cells across manufacturers. Now, with this finding, we understand DRAM read disturbance more thoroughly than before.

The second example is about the effect of memory access patterns. Until our study, it was widely accepted that you must open and close DRAM rows as fast as possible to achieve the worst possible access pattern to induce read disturbance bitflips. We found out that you can actually exacerbate read disturbance if you spend some extra time keeping a row open once it is opened. Later on, Haocong Luo, also from our group, more rigorously investigated this finding and discovered yet another read disturbance phenomenon called RowPress [3], which is different from RowHammer. Again, with these findings, today we understand DRAM read disturbance better.

These are just two examples of the impact of our experiments on building a detailed understanding of DRAM read disturbance. We also rigorously analyzed the variation of read disturbance vulnerability across different cells within a module and different voltage levels. I am really proud of our contributions in this direction because they help the whole community understand the read disturbance vulnerability in a more detailed way and thus develop much stronger systems.  

Your PhD work on BlockHammer introduced an improved RowHammer mitigation mechanism for DDR4 chips (the latest technology at the time you published this work). Can you tell us about the significance of this work? Was this work in any way taken on board by industry or used in the development of real mitigation techniques in new chip designs at the time? Your work on BlockHammer was actually a finalist in the Intel Hardware Security Academic Awards in 2022 (congratulations!), so it seems Intel at least is interested in working with researchers to develop more secure hardware solutions, and Intel has (I believe) implemented some of our work into real RowHammer solutions.

Giray: BlockHammer tackles two important limitations that prior works overlooked. I should probably give some quick background on the prior works here. DRAM cells are organized in a two-dimensional array of rows and columns and are internally accessed at row granularity. As you keep accessing a DRAM row, you disturb the data stored in other physically nearby rows, which we call the victim rows. Many prior solutions to read disturbance proposed to refresh potential victim rows before they experience any bitflips. Those papers propose clever ideas for identifying the rows that are repeatedly accessed from within the memory controller placed in the processor chip. Once they identify the address of such a row, they assume that the potential victim rows are at addresses ±n of the row address that is repeatedly accessed. Then, as a very intuitive and straightforward mitigation, these papers propose to perform an operation called refresh targeting these potential victim rows. It is correct that when you refresh a row, you mitigate the effect of read disturbance on the row. However, there are two limitations to this approach.
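
As a rough illustration of the ±n assumption described above, the following Python sketch lists the rows a controller-side mitigation would refresh for a given repeatedly accessed row. The function name, the blast_radius parameter, and the default row count are illustrative assumptions and do not come from any specific prior work.

```python
def assumed_victim_rows(aggressor_row, blast_radius=1, num_rows=65536):
    """Return the row addresses at +/-1..blast_radius of the aggressor row.

    This mirrors the assumption that logically adjacent addresses are
    physically adjacent, which may not hold inside a real DRAM chip.
    """
    victims = []
    for distance in range(1, blast_radius + 1):
        for row in (aggressor_row - distance, aggressor_row + distance):
            if 0 <= row < num_rows:  # skip addresses outside the bank
                victims.append(row)
    return victims


print(assumed_victim_rows(100, blast_radius=2))  # [99, 101, 98, 102]
print(assumed_victim_rows(0))                    # [1]
```

The sketch works purely on addresses visible to the memory controller, which is exactly why it can pick the wrong rows when the chip remaps addresses internally.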

First, you need to correctly identify the potential victim rows. It is often not that straightforward. To improve yield and density, DRAM chips are designed so that the rows with consecutive addresses are often placed in non-consecutive physical locations. The real physical layout of DRAM rows is only visible within the DRAM chip and not shared with the processor as it is considered to be proprietary information. In the end, you actually do not know which rows are potential victim rows even after you identify the repeatedly accessed rows, so you can detect RowHammer but cannot avoid all bitflips. BlockHammer addresses this limitation by blacklisting the repeatedly accessed rows and selectively throttling the memory accesses targeting such rows until the potential victim rows are refreshed by the DRAM’s periodic refresh mechanism. We did not observe such a mechanism being implemented in real systems based on public information, but at least I can say that it had an impact on the JEDEC standards. A later update to the DDR4 protocol proposed a new command called Directed Refresh Management, abbreviated as DRFM. The memory controller issues a DRFM command with the address of the repeatedly accessed row, and the DRAM chip internally identifies and refreshes the potential victim rows.
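
The blacklist-and-throttle idea can be sketched in a few lines of Python. This is a deliberately simplified toy model, not BlockHammer's actual design: it uses an exact per-row dictionary and blocks a blacklisted row outright until the refresh window ends. The class name, the threshold value, and the 64 ms window used in the example are assumptions made for illustration.

```python
class ThrottlingBlacklist:
    """Toy model: count activations per row within one refresh window and
    block further activations to rows that reach a threshold."""

    def __init__(self, blacklist_threshold, refresh_window_ns):
        self.blacklist_threshold = blacklist_threshold  # hypothetical knob
        self.refresh_window_ns = refresh_window_ns
        self.window_start_ns = 0
        self.activation_counts = {}

    def _roll_window(self, now_ns):
        # Periodic refresh has restored all rows, so forget old counts.
        if now_ns - self.window_start_ns >= self.refresh_window_ns:
            self.window_start_ns = now_ns
            self.activation_counts.clear()

    def allow_activation(self, row, now_ns):
        """Return True if the activation may proceed, False if throttled."""
        self._roll_window(now_ns)
        count = self.activation_counts.get(row, 0)
        if count >= self.blacklist_threshold:
            return False  # row stays blacklisted until the window ends
        self.activation_counts[row] = count + 1
        return True


bl = ThrottlingBlacklist(blacklist_threshold=3, refresh_window_ns=64_000_000)
print([bl.allow_activation(row=7, now_ns=t) for t in range(5)])
# [True, True, True, False, False]
print(bl.allow_activation(row=7, now_ns=64_000_000))  # True
```

Note that the model never needs to know which rows are the victims: it only limits how often any single row can be activated before periodic refresh runs.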

The second limitation is the performance and energy impact of malicious RowHammer attacks. Ideally, you do not want a malicious attacker to consume your resources, slowing down your system and increasing its energy consumption. Prior solutions consider this out of scope and try to avoid bitflips at minimal additional performance and energy overheads. In contrast, BlockHammer identifies RowHammer-like access patterns and traces them back to the responsible hardware thread that causes them. To measure the similarity of a thread’s behavior to a RowHammer attack, we propose a new metric called RowHammer likelihood index (RHLI). The larger the RHLI of a thread, the more similarly the thread acts to a RowHammer attack, and the fewer memory requests BlockHammer allows from the thread. By doing so, BlockHammer reduces the memory bandwidth that the attacker would have wasted, and consequently provides concurrently running benign applications with a larger memory bandwidth. There are some follow-up works that build on BlockHammer to avoid potential denial-of-service and performance attacks. Oguzhan Canpolat published a preprint of a new project called BreakHammer [4], where we look into the memory performance and denial-of-service attack models that leverage deployed read disturbance solutions as assets. On a related note, we will give a talk this year at DRAMSec, a workshop co-located with ISCA, which focuses on the same problem for the PRAC mechanism, introduced with the latest update to DDR5 last April.
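
To make the relationship between RHLI and throttling concrete, here is a minimal Python sketch. It is not the formula from the BlockHammer paper: the linear mapping, the normalization of RHLI into the range 0 to 1, and the max_in_flight parameter are all assumptions chosen only to show that a larger RHLI yields a smaller request quota.

```python
def allowed_requests(rhli, max_in_flight=16):
    """Map a thread's RowHammer likelihood index to a request quota.

    Assumes rhli is normalized so that 0 means fully benign behaviour and
    1 or more means the thread looks like a RowHammer attack.
    """
    rhli = min(max(rhli, 0.0), 1.0)  # clamp into [0, 1]
    return int(max_in_flight * (1.0 - rhli))


for rhli in (0.0, 0.25, 0.5, 1.0):
    print(rhli, allowed_requests(rhli))
# 0.0 16
# 0.25 12
# 0.5 8
# 1.0 0
```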

We had an interview with former group member, Jeremie Kim, in a past newsletter (January 2021) and he discussed his paper “Revisiting RowHammer”. He mentioned that DRAM manufacturer claims of RowHammer-free chips have been repeatedly refuted in the literature. Shortly after this paper was published, in 2022, Onur Mutlu received the 2022 Google Security and Privacy Research Award, and commented “We now know that the RowHammer problem is fundamentally much worse than it was ten years ago: newer DRAM chips become more vulnerable as technology node size shrinks.”

Is this still true for the latest DRAM technology, DDR5? Has RowHammer continued to become even worse? Or have there been any improvements in RowHammer mitigation mechanisms adopted by chip makers (over the course of your PhD work and various contributions), bringing the problem closer to being solved?

Giray: Well, the short answer is yes. Three critically reviewed papers from ISCA 2014 [5], 2020 [6], and 2023 [7] experimentally demonstrate a clear pattern of worsening read disturbance vulnerability. Essentially, you needed to perform an operation called hammering almost 100K times within a 64ms time window to get some read disturbance bitflips in the early 2010s, while chips from the early 2020s experience bitflips at around 1K hammers. I’d like to emphasize that this is a reduction of two orders of magnitude in the hammer count needed to induce read disturbance bitflips, and it clearly shows that DRAM cells are much more vulnerable to read disturbance than before. Therefore, what Onur Mutlu says is correct. We can pinpoint those three papers: Flipping Bits (ISCA’14), Revisiting RowHammer (ISCA’20), and RowPress (ISCA’23). As you mentioned DDR5, I also wanted to refer to another upcoming paper from Patrick Jattke, called ZenHammer, to appear in USENIX Security later this year. It reports observed bitflips on DDR5-based real systems, showing that our systems are still not RowHammer-free today.
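
The "two orders of magnitude" figure follows directly from the two hammer counts quoted above. The short Python calculation below uses only those rounded numbers (100K and 1K hammers, 64 ms window); the variable names are illustrative.

```python
import math

hammers_early_2010s = 100_000  # "almost 100K" hammers
hammers_early_2020s = 1_000    # "around 1K" hammers
window_ms = 64                 # time window quoted in the interview

reduction = hammers_early_2010s / hammers_early_2020s
orders_of_magnitude = math.log10(reduction)
print(reduction, orders_of_magnitude)  # 100.0 2.0

# Average hammer rate needed to reach each count within the window.
print(hammers_early_2010s / window_ms)  # 1562.5 hammers per ms
print(hammers_early_2020s / window_ms)  # 15.625 hammers per ms
```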

This being said, there is a significant effort from both academia and industry to avoid such bitflips. Although the industry chose to provide security by obfuscation — claiming that they solved the problem without explaining how exactly — in the beginning, recent advancements in JEDEC standards show good steps towards solving the read disturbance issue. There are two features in recent updates. The first is refresh management, abbreviated as RFM in the latest DDR4 protocol. RFM is a practical knob that provides stronger protection against read disturbance at the cost of higher performance and energy loss. The second one is per-row activation counting, abbreviated as PRAC, which was introduced by the latest DDR5 update in April 2024. PRAC moves both the identification of potential victim rows and performing refresh operations to the DRAM chip. Hence, the processor does not track row accesses. So, the processor does not know when the DRAM chip needs to perform refresh operations, while the processor controls the synchronous communication between the processor and the DRAM chip. To make this approach practical, PRAC introduces a new back-off signal from the DRAM chip to the processor, which informs the processor that the DRAM chip needs some time to refresh potential victim rows. This is a promising direction, as shown before by different academic groups. For example, Hasan Hassan and Ataberk Olgun propose a NACK signal similar to the back-off signal in a 2-year-old preprint [8], which keeps getting rejected with the comment of being too costly to be practical, while the latest DDR5 update actually implements something very similar. Another good example is a short paper from Alec Wolman and Stefan Saroiu called Panopticon, published in the DRAMSec workshop 2021. 
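
The interplay between in-DRAM activation counting and the back-off signal can be pictured with a small Python model. This is only a conceptual sketch and not the behaviour specified by JEDEC: the class name, the threshold value, and the way counters are reset are assumptions made for illustration.

```python
class PracBankModel:
    """Toy DRAM bank: counts activations per row and raises a back-off
    flag when any row's counter reaches a threshold."""

    def __init__(self, backoff_threshold):
        self.backoff_threshold = backoff_threshold  # hypothetical value
        self.counters = {}
        self.backoff = False

    def activate(self, row):
        """Count one activation and return the current back-off state."""
        self.counters[row] = self.counters.get(row, 0) + 1
        if self.counters[row] >= self.backoff_threshold:
            self.backoff = True  # ask the processor for time to refresh
        return self.backoff

    def service_backoff(self):
        """Model the DRAM chip refreshing victims of hot rows internally."""
        hot_rows = [r for r, c in self.counters.items()
                    if c >= self.backoff_threshold]
        for row in hot_rows:
            self.counters[row] = 0  # victims refreshed, counter reset
        self.backoff = False
        return hot_rows


bank = PracBankModel(backoff_threshold=4)
print([bank.activate(12) for _ in range(4)])  # [False, False, False, True]
print(bank.service_backoff())                 # [12]
print(bank.backoff)                           # False
```

In this picture the processor keeps no per-row state at all: it only reacts to the flag, which is what makes the signal from the DRAM chip necessary.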

These solutions are quite promising, and I can say that we are finally taking baby steps towards solving the DRAM read disturbance issue. Of course, there is more to do in this direction. For example, the hazard of memory performance and denial-of-service attacks I mentioned earlier is quite important going forward. Another important direction is understanding how read disturbance vulnerability changes with aging. There is no rigorous study of aging so far, which is scary. No one knows if your mobile phone will be more susceptible to RowHammer attacks after you use it for a few years or months. 

I guess there is still a large gap, though, between the research (your findings and others’ related to RowHammer) and what is actually implemented into the new technologies.

Giray: I can say that academic works have been ahead of the industry on this issue and our problem definitions are heavily affected by the business-driven decisions of the industry. Of course, you cannot isolate the issues in such a prevalent technology from its economic aspects. Every change we propose potentially has monetary ramifications for the manufacturers and end users, which are already hard to assess, and nearly impossible to assess when you do not get any useful feedback from the industry. Based on my experience in RowHammer and processing-in-memory research, the feedback from the industry comes in the form of either a request for more data or a comment suggesting that what we do is not realistic, without any useful insights into the limitations, challenges, or other concerns. I believe that we could have made RowHammer attacks impractical years ago if we had had a close collaboration with DRAM manufacturers.

Do you now have any better insight into how researchers might improve research solutions so that they could be more readily used by industry, or do you see a way to improve this uptake or transfer (possibly by working more closely together and/or through more industry-research focus groups)?

Giray: In academic research, we can come up with all sorts of fantastic ideas. However, at the end of the day, it is not always the best design that is adopted in real products. For example, Yoongu Kim published a paper called SALP in ISCA’12 [9] with the idea of leveraging a sort of parallelism, called subarray-level parallelism, which is already available almost completely in real DRAM chips. The idea is very smart. It provides a tremendous amount of parallelism in DRAM chips at the cost of only a few minor modifications in the peripheral circuitry around the DRAM array. Later on, Kevin Chang’s paper in HPCA’14 [10] demonstrated that this subarray-level parallelism can significantly reduce the overheads of some time-consuming maintenance operations, and finally one of my papers in MICRO’22 [11] and another one from Ismail Yuksel in DSN’24 [12] show that we can leverage subarray-level parallelism even in some off-the-shelf DRAM chips by operating them beyond the specifications. This line of research clearly shows that subarray-level parallelism is very beneficial and low cost, and that some chips already support it intentionally or unintentionally. However, the industry still does not officially support subarray-level parallelism. I do not know why they do not. This example shows that an idea is not necessarily useless just because the industry does not adopt it. So, as researchers, we have to keep thinking freely even if industry does not adopt our ideas. That being said, I want to reiterate the importance of communication, feedback, and collaboration between researchers and the decision-makers in the industry.

What might help or encourage this (more funding, e.g. CHIPS Act or other)? It seems that (from our side) having more information on how DRAM operates internally is important for our work, and having more insight from industry would help to bridge the gap so our research solutions could be more readily implemented into DRAM, but I guess that is quite difficult and is proprietary knowledge. Has this changed over time and improved, or improved as a result of your work? You recently had several talks in the UK at different companies interested in your work, and maybe this gave you more insight as well.

Giray: Based on my interactions, I can say that companies that build systems, such as ARM, Google, Huawei, and Microsoft, are quite interested in our research. These are major consumers of DRAM chips. Unfortunately, I did not see any similar interest from the DRAM manufacturers in my limited interactions. We still need more information from the industry about what limits the adoption of our ideas in real chips.

What would you like to do going forward? Do you have an interest in working in industry or will you look for a position in academia? Will you continue to focus on DRAM and hardware security, or where do you see your future interests?

Giray: I greatly enjoy conducting research, so I am continuing to work on new research projects at the moment, but this is a question that also keeps my mind busy nowadays. There are outstanding security issues in modern computing systems, including but not limited to RowHammer. Today, we can solve these issues only at the cost of significant performance loss and increased energy consumption. Going forward, we need to tackle these security issues together with improving system performance and energy efficiency. To achieve this, we need systems with better isolation across processes in both microarchitecture and memory, and we need to tackle major performance bottlenecks like excessive data movement and time-consuming maintenance operations. My plan is to broaden my research scope accordingly. I see myself in academia eventually, but I want to improve my understanding and research perspective with some experience in industry research. So, I am actively looking for research positions in academia and industry.


[1] Lois Orosa, Abdullah Giray Yaglikci, Haocong Luo, Ataberk Olgun, Jisung Park, Hasan Hassan, Minesh Patel, Jeremie S. Kim, and Onur Mutlu, “A Deeper Look into RowHammer’s Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses”, Proceedings of the 54th International Symposium on Microarchitecture (MICRO), Virtual, October 2021.

[2] Giray Yağlıkçı, Haocong Luo, Geraldo F. de Oliviera, Ataberk Olgun, Minesh Patel, Jisung Park, Hasan Hassan, Jeremie S. Kim, Lois Orosa, and Onur Mutlu, “Understanding RowHammer Under Reduced Wordline Voltage: An Experimental Study Using Real DRAM Devices”, Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Baltimore, MD, USA, June 2022.

[3] Haocong Luo, Ataberk Olgun, Giray Yaglikci, Yahya Can Tugrul, Steve Rhyner, M. Banu Cavlak, Joel Lindegger, Mohammad Sadrosadati, and Onur Mutlu, “RowPress: Amplifying Read Disturbance in Modern DRAM Chips”, Proceedings of the 50th International Symposium on Computer Architecture (ISCA), Orlando, FL, USA, June 2023.
Officially artifact evaluated as available, reusable and reproducible.
Distinguished artifact award at ISCA 2023.

[4] BreakHammer: https://arxiv.org/abs/2404.13477
[5] Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors”, Proceedings of the 41st International Symposium on Computer Architecture (ISCA), Minneapolis, MN, June 2014.
One of the 7 papers of 2012-2017 selected as Top Picks in Hardware and Embedded Security for IEEE TCAD.
Selected to the ISCA-50 25-Year Retrospective Issue covering 1996-2020 in 2023.

[6] Jeremie S. Kim, Minesh Patel, A. Giray Yaglikci, Hasan Hassan, Roknoddin Azizi, Lois Orosa, and Onur Mutlu, “Revisiting RowHammer: An Experimental Analysis of Modern Devices and Mitigation Techniques”, Proceedings of the 47th International Symposium on Computer Architecture (ISCA), Valencia, Spain, June 2020.

[7] Haocong Luo, Ataberk Olgun, Giray Yaglikci, Yahya Can Tugrul, Steve Rhyner, M. Banu Cavlak, Joel Lindegger, Mohammad Sadrosadati, and Onur Mutlu, “RowPress: Amplifying Read Disturbance in Modern DRAM Chips”, Proceedings of the 50th International Symposium on Computer Architecture (ISCA), Orlando, FL, USA, June 2023.
Officially artifact evaluated as available, reusable and reproducible.
Distinguished artifact award at ISCA 2023.

[8] https://arxiv.org/abs/2207.13358

[9] Yoongu Kim, Vivek Seshadri, Donghyuk Lee, Jamie Liu, and Onur Mutlu, “A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM”, Proceedings of the 39th International Symposium on Computer Architecture (ISCA), Portland, OR, June 2012.

[10] Kevin Chang, Donghyuk Lee, Zeshan Chishti, Alaa Alameldeen, Chris Wilkerson, Yoongu Kim, and Onur Mutlu, “Improving DRAM Performance by Parallelizing Refreshes with Accesses”, Proceedings of the 20th International Symposium on High-Performance Computer Architecture (HPCA), Orlando, FL, February 2014.

[11] Giray Yağlıkçı, Ataberk Olgun, Minesh Patel, Haocong Luo, Hasan Hassan, Lois Orosa, Oguz Ergin, and Onur Mutlu, “HiRA: Hidden Row Activation for Reducing Refresh Latency of Off-the-Shelf DRAM Chips”, Proceedings of the 55th International Symposium on Microarchitecture (MICRO), Chicago, IL, USA, October 2022.

[12] Ismail Emir Yuksel, Yahya Can Tugrul, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Ataberk Olgun, Melina Soysal, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, and Onur Mutlu, “Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis”, Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), July 2024.
