Gagan successfully defended his PhD thesis in March 2021. We are excited that Gagan will stay on with SAFARI as a postdoc, and we look forward to many successful collaborations with him. Congratulations, Gagan!
Gagandeep Singh, March 2021 (defended 29 March 2021)
[Slides (pptx) (pdf)]
Abstract:
The cost of moving data between the memory units and the compute units is a major contributor to the execution time and energy consumption of modern workloads in computing systems. At the same time, we are witnessing an enormous amount of data being generated across multiple application domains. Moreover, the end of Dennard scaling, the slowing of Moore’s law, and the emergence of dark silicon limit the attainable performance on current computing systems. These trends suggest a need for a paradigm shift towards a data-centric approach where computation is performed close to where the data resides. This approach allows us to overcome our current systems’ performance and energy limitations by minimizing data movement overhead and ensuring that data does not overwhelm system components. Further, a data-centric approach can enable a data-driven view where we take advantage of vast amounts of available data to improve architectural decisions. Our current systems are designed to follow rigid and simple policies that lack adaptability. As a result, current system policies fail to provide robust improvement across varying workloads and system conditions.
As a step towards modern architectures, this dissertation contributes to various aspects of the data-centric approach and further proposes several data-driven mechanisms.
First, we design NERO, a data-centric accelerator for a real-world weather prediction application. NERO overcomes the memory bottleneck of weather prediction stencil kernels by exploiting near-memory computation capability on specialized field-programmable gate array (FPGA) accelerators with high-bandwidth memory (HBM) that are attached to the host CPU.
Second, we explore the applicability of different number formats, including fixed-point, floating-point, and posit, for different stencil kernels. We search for the appropriate bit-width that reduces the memory footprint and improves the performance and energy efficiency with minimal loss in the accuracy.
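As a rough illustration of the precision trade-off described above, the following minimal sketch applies a 3-point averaging stencil to a small 1D field and measures how the error changes with the number of fractional bits in a fixed-point representation. It is not the dissertation's methodology: the stencil weights, the sine-wave input, the bit-widths, and all function names (`quantize`, `stencil3`, `max_abs_error`) are assumptions chosen for illustration, and posit arithmetic is not modeled.

```python
import math

def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (a fixed-point grid)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def stencil3(values, frac_bits=None):
    """Apply a 3-point averaging stencil to the interior points.

    If frac_bits is given, inputs and outputs are quantized to that many
    fractional bits; otherwise full double precision is used.
    """
    q = (lambda v: v) if frac_bits is None else (lambda v: quantize(v, frac_bits))
    src = [q(v) for v in values]
    out = []
    for i in range(1, len(src) - 1):
        out.append(q(0.25 * src[i - 1] + 0.5 * src[i] + 0.25 * src[i + 1]))
    return out

def max_abs_error(values, frac_bits):
    """Largest deviation of the quantized stencil from the double-precision one."""
    reference = stencil3(values)
    approx = stencil3(values, frac_bits)
    return max(abs(a - b) for a, b in zip(reference, approx))

# Hypothetical input field: one smooth signal sampled at 64 points.
field = [math.sin(0.1 * i) for i in range(64)]

# Map each candidate fractional bit-width to its worst-case error on this field.
errors = {bits: max_abs_error(field, bits) for bits in (4, 8, 12, 16)}
```

Inspecting `errors` shows the kind of search the paragraph refers to: one would pick the smallest bit-width whose error stays within the accuracy budget of the application. In this toy setup the error for `b` fractional bits is bounded by roughly `2**-b`, since the inputs and the output are each rounded once.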
Third, we propose NAPEL, an ML-based application performance and energy prediction framework for data-centric architectures. NAPEL uses ensemble learning to build a model that, once trained for a fraction of programs, can predict the performance and energy consumption of different applications.
Fourth, we present LEAPER, the first use of few-shot learning to transfer FPGA-based computing models across different hardware platforms and applications. LEAPER provides the ability to reuse a prediction model built on an inexpensive low-end local system for a new, unknown, high-end FPGA-based system.
Fifth, we propose QRator, a reinforcement learning (RL)-based data-placement technique for hybrid storage systems. QRator is a data-driven technique that uses RL to develop a data-placement policy agent. The data-placement agent decides which data should be stored on which storage device to achieve the best performance while minimizing the migration overhead, taking into account the device and the workload characteristics. Our evaluation results show that QRator significantly improves a hybrid storage subsystem’s performance compared to state-of-the-art data placement techniques.
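To make the idea of an RL-based placement agent more concrete, here is a minimal sketch using tabular Q-learning on a toy two-device system. It is not QRator's actual design: the state (a page is either "hot" or "cold"), the two devices, the access counts, latencies, penalty value, and hyperparameters are all hypothetical, and the reward is simply the negative of an assumed service cost.

```python
import random

STATES = ("hot", "cold")      # hypothetical workload feature: page hotness
ACTIONS = ("fast", "slow")    # hypothetical two-device hybrid storage system

ACCESSES = {"hot": 10, "cold": 1}        # assumed accesses per placement decision
LATENCY = {"fast": 1.0, "slow": 4.0}     # assumed per-access latency of each device
COLD_ON_FAST_PENALTY = 20.0              # assumed cost of wasting scarce fast capacity

def reward(state, action):
    """Negative service cost of placing a page of the given hotness on a device."""
    cost = ACCESSES[state] * LATENCY[action]
    if state == "cold" and action == "fast":
        cost += COLD_ON_FAST_PENALTY
    return -cost

def train(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        r = reward(state, action)
        next_state = rng.choice(STATES)  # next request arrives independently
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        state = next_state
    return q

def policy(q):
    """Greedy placement policy derived from the learned Q-table."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}

q_table = train()
learned = policy(q_table)
```

In this toy setting the agent learns to keep hot pages on the fast device and cold pages on the slow one. A realistic agent would observe richer device and workload features and would have to account for migration cost, which this sketch folds into a single fixed penalty.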
Overall, this thesis provides two key conclusions: (1) hardware acceleration on an FPGA+HBM fabric is a promising solution to overcome the data movement bottleneck of our current computing systems; (2) data should drive system and design decisions by leveraging inherent data characteristics to make our computing systems more efficient. Thus, we conclude that the mechanisms proposed by this dissertation provide promising solutions to handle data well by following a data-centric approach and further demonstrate the importance of leveraging data to devise data-driven policies. We hope that the proposed architectural techniques and detailed experimental data presented in this dissertation will enable the development of energy-efficient data-intensive computing systems and drive the exploration of new mechanisms to improve the performance and energy efficiency of future computing systems.
