Efficient Synchronization Support for Near-Data-Processing Architectures
Christina Giannoula, National Technical University of Athens
Livestream at 5:30 pm Zurich time (CEST) on YouTube: Link
Recent advances in 3D-stacked memories have renewed interest in Near-Data Processing (NDP). NDP architectures perform computation close to where the application data resides, and constitute a promising way to alleviate data movement costs. These architectures can provide significant performance and energy benefits to parallel applications. Typical NDP architectures support several NDP units, each including multiple simple cores placed close to memory. To fully leverage the benefits of NDP and achieve high performance for parallel workloads, efficient synchronization among the NDP cores of a system is necessary. However, supporting synchronization in many NDP systems is challenging due to three architectural characteristics: (i) most NDP architectures lack shared caches that can enable low-cost communication and synchronization among the NDP cores of the system, (ii) hardware cache coherence protocols are typically not supported in NDP systems due to high area and traffic overheads, and (iii) NDP systems are non-uniform, distributed architectures, in which inter-unit communication is more expensive (both in performance and energy) than intra-unit communication.
In this seminar, we comprehensively examine the synchronization problem in NDP systems, and propose SynCron, an end-to-end synchronization solution for NDP systems. SynCron is designed to meet four goals: high performance, low cost, programming ease, and generality across a wide range of synchronization primitives. It does so through four key techniques. First, SynCron adds low-cost hardware support near memory for synchronization acceleration. Second, SynCron includes a specialized cache memory structure to avoid memory accesses for synchronization and minimize latency overheads. Third, it implements a hierarchical message-passing communication protocol to minimize expensive communication across NDP units of the system. Fourth, SynCron integrates a hardware-only overflow management scheme to avoid performance degradation when hardware resources for synchronization tracking are exceeded.
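To make the intuition behind the hierarchical protocol concrete, the following toy Python model counts the messages sent while cores arrive at a barrier. It is only an illustrative sketch, not SynCron's actual protocol or hardware: the function names, the flat single-coordinator baseline, and the example configuration of 4 NDP units with 16 cores each are all assumptions made for this example, and only the arrival phase is modeled.

```python
from dataclasses import dataclass


@dataclass
class MessageCount:
    intra_unit: int = 0  # messages that stay inside one NDP unit
    inter_unit: int = 0  # messages that cross between NDP units


def flat_barrier(num_units: int, cores_per_unit: int, master_unit: int = 0) -> MessageCount:
    """Hypothetical baseline: every core notifies one coordinator in master_unit."""
    count = MessageCount()
    for unit in range(num_units):
        for _ in range(cores_per_unit):
            if unit == master_unit:
                count.intra_unit += 1
            else:
                count.inter_unit += 1
    return count


def hierarchical_barrier(num_units: int, cores_per_unit: int, master_unit: int = 0) -> MessageCount:
    """Hypothetical hierarchy: cores notify a coordinator in their own unit,
    and each remote unit forwards a single aggregated message to the master."""
    count = MessageCount()
    for unit in range(num_units):
        count.intra_unit += cores_per_unit  # core -> local coordinator
        if unit != master_unit:
            count.inter_unit += 1  # local coordinator -> master coordinator
    return count


if __name__ == "__main__":
    print("flat:        ", flat_barrier(4, 16))
    print("hierarchical:", hierarchical_barrier(4, 16))
```

Under these assumptions, the flat scheme sends one cross-unit message per remote core, while the hierarchical scheme sends one per remote unit. The model ignores latency, contention, and the release phase, so it only shows the direction of the effect, not its magnitude.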
Our work is the first to analyze synchronization primitives in NDP systems using a variety of parallel workloads, covering various contention scenarios and evaluating various NDP configurations. We demonstrate that SynCron achieves significant performance and energy improvements under both high-contention and low-contention scenarios, while also incurring low hardware area and power overheads. We conclude that SynCron is an efficient synchronization mechanism for NDP systems, and hope that this work encourages further research on the synchronization problem in heterogeneous systems, including NDP systems.
Christina Giannoula is a Ph.D. student in the School of Electrical and Computer Engineering at the National Technical University of Athens (NTUA). She works in the Computing Systems Laboratory, and is an affiliated Ph.D. researcher in the SAFARI research group at ETH Zürich, which is led by Prof. Onur Mutlu. She received a 5-year Diploma degree (Masters equivalent) in Electrical and Computer Engineering from NTUA in 2016, receiving several distinctions including the ‘Paris Kanellakis’ NTUA award, and graduating in the top 2% of her class. Since 2017, she has been working toward a Ph.D. degree at NTUA, and in 2019 she was a visiting Ph.D. researcher in the SAFARI research group at ETH Zürich, advised by Prof. Onur Mutlu and mentored by Prof. Nandita Vijaykumar. Her research interests lie at the intersection of computer architecture and high-performance computing. Specifically, her research focuses on the hardware/software co-design of emerging applications, including graph processing, pointer-chasing data structures, machine learning workloads, and sparse linear algebra, with modern computing paradigms, such as large-scale multicore systems and near-data processing architectures. She has several publications and awards for her research on these topics.
Christina Giannoula, Nandita Vijaykumar, Nikela Papadopoulou, Vasileios Karakostas, Ivan Fernandez, Juan Gómez-Luna, Lois Orosa, Nectarios Koziris, Georgios Goumas, and Onur Mutlu, “SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures”, Proceedings of the 27th International Symposium on High-Performance Computer Architecture (HPCA), Virtual, February-March 2021.
[Slides (pptx) (pdf)]
[Short Talk Slides (pptx) (pdf)]
[Talk Video (21 minutes)]
[Short Talk Video (7 minutes)]