Our group works on a broad range of research domains, including all aspects of computer architecture, hardware security, bioinformatics, and computer systems. More specific topic areas within these four domains include memory systems, discovery of new security vulnerabilities and defenses, emerging technologies, genome analysis, new computing and communication paradigms, acceleration of important workloads (e.g., AI, genomics, personalized medicine), computing platforms for health and medicine, fault-tolerant systems, storage systems, distributed analytics, hardware/software co-design, mobile systems, energy-efficient systems, and more. Thesis projects are available in all of these areas.
Please contact us if you are interested!
Below are even more specific potential thesis and semester project topics within the SAFARI Research Group. This list is incomplete, so if your interest is not covered by these specific projects but falls in any of the above areas, please contact us.
NEW: BioPIM joint Master’s thesis with IBM:
New In-Memory Computing Paradigms to Enable Genome Sequencing in Off-grid Environments
We are looking for enthusiastic students interested in applying new in-memory computing paradigms to bioinformatics to enable genome sequencing in off-grid environments.
- Exploring DNN implementations that take advantage of the in-memory computing device’s unique and special capabilities to improve accuracy while maintaining a small area and power footprint
- Developing new decoding hardware in HDL for converting the output of the DNN, which represents probabilities that the DNA strand contains a base at a given timestep, into the actual inferred base string
- Developing downstream analysis algorithms that run on the in-memory computing device after basecalling to extend the utility of the device beyond basecalling
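As a concrete example of the decoding step in the second bullet, the sketch below performs a CTC-style greedy decode in software, turning per-timestep base probabilities into a base string; a hardware decoder would implement the same collapse-repeats-and-drop-blanks logic in HDL. The probability layout (blank symbol at index 0) is an illustrative assumption, not the format of any specific basecaller:

```python
# Toy CTC-style greedy decoder: per-timestep base probabilities -> base string.
# The alphabet layout (blank "-" at index 0) is an illustrative assumption.

ALPHABET = ["-", "A", "C", "G", "T"]  # "-" is the CTC blank symbol

def greedy_decode(probs):
    """probs: list of per-timestep probability vectors over ALPHABET."""
    # 1) pick the most likely symbol at each timestep
    best = [max(range(len(p)), key=lambda i: p[i]) for p in probs]
    # 2) collapse consecutive repeats, then 3) drop blanks
    out = []
    prev = None
    for idx in best:
        if idx != prev and idx != 0:
            out.append(ALPHABET[idx])
        prev = idx
    return "".join(out)

# Example: four timesteps spelling "AC" (with a repeat and a blank)
probs = [
    [0.1, 0.7, 0.1, 0.05, 0.05],   # A
    [0.1, 0.6, 0.1, 0.1, 0.1],     # A (repeat, collapsed)
    [0.8, 0.05, 0.05, 0.05, 0.05], # blank
    [0.1, 0.1, 0.6, 0.1, 0.1],     # C
]
print(greedy_decode(probs))  # -> "AC"
```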
Leveraging and Optimizing Heterogeneous Computing Systems
The end of Moore’s law created the need to turn computers into heterogeneous systems, i.e., systems composed of multiple types of processors, each better suited to different workloads or parts of them. More than a decade ago, Graphics Processing Units (GPUs) became general-purpose parallel processors, making their outstanding processing capabilities available to many workloads beyond graphics. GPUs have been key to the recent development of Machine Learning and Artificial Intelligence, whose training times were prohibitive before GPUs. Field-Programmable Gate Arrays (FPGAs) are another example of a computing device that can deliver impressive benefits in terms of performance and energy efficiency. We are looking for enthusiastic students who want to work hands-on on different software, hardware, and architecture projects for heterogeneous systems, for example:
- Heterogeneous implementations (GPU, FPGA) of modern applications from important fields such as bioinformatics, machine learning, graph processing, medical imaging, etc.
- Scheduling techniques for heterogeneous systems with different general-purpose processors and accelerators, e.g., kernel offloading, memory scheduling, etc.
- Workload characterization and programming tools that enable easier and more efficient use of heterogeneous systems.
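As a small example of the kind of scheduling decision such projects involve, the sketch below uses a roofline-style time estimate (plus host-device transfer cost) to pick the fastest device for a kernel. All device numbers are illustrative placeholders, not measured specifications of any real CPU, GPU, or FPGA:

```python
# Toy offload-scheduling heuristic for a heterogeneous system: estimate each
# device's execution time for a kernel and pick the fastest, accounting for
# host-device transfer cost. All device numbers are illustrative placeholders.

DEVICES = {
    #        (peak GFLOP/s, mem BW GB/s, host link GB/s; 0 = no transfer needed)
    "CPU":  (200.0,   50.0,  0.0),
    "GPU":  (5000.0, 600.0, 16.0),
    "FPGA": (800.0,  100.0,  8.0),
}

def est_time(device, gflops, gbytes_accessed, gbytes_transferred):
    peak, bw, link = DEVICES[device]
    kernel = max(gflops / peak, gbytes_accessed / bw)  # roofline-style bound
    transfer = gbytes_transferred / link if link else 0.0
    return kernel + transfer

def pick_device(gflops, gbytes_accessed, gbytes_transferred):
    return min(DEVICES, key=lambda d: est_time(d, gflops, gbytes_accessed,
                                               gbytes_transferred))

print(pick_device(1.0, 0.5, 0.5))      # small kernel: transfer cost dominates
print(pick_device(1000.0, 10.0, 0.5))  # large compute-bound kernel
```

Under these illustrative numbers, the small kernel stays on the CPU (the transfer cost would dominate) while the large compute-bound kernel is offloaded to the GPU.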
Programming and Improving a Real-world Processing-in-Memory Architecture
Data movement between the memory units and the compute units of current computing systems is a major performance and energy bottleneck. Many modern and important workloads such as machine learning, computational biology, and graph processing suffer greatly from the data movement bottleneck. To alleviate this bottleneck, Processing-in-Memory (PIM) represents a paradigm shift from the traditional processor-centric design, where all computation takes place in the compute units, to a more data-centric design, where processing elements are placed closer to or inside where the data resides. After many years of research proposals from industry and academia, a real-world processing-in-memory architecture is now publicly available. The UPMEM PIM architecture integrates DRAM Processing Units (DPUs) inside DRAM chips. As a result, workloads can take advantage of an unprecedented memory bandwidth. Projects in this line of research span software and hardware as well as the software/hardware interface. We are looking for enthusiastic students who want to work hands-on on (1) programming and optimizing workloads on the UPMEM PIM architecture, and/or (2) proposing and implementing hardware and architecture improvements for future PIM architectures.
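A first-order model illustrates why moving computation into memory helps: in a processor-centric system, all data must cross a narrow off-chip bus, whereas per-bank PIM cores each stream only their local slice of the data at internal DRAM bandwidth. The bandwidth and core-count figures below are illustrative placeholders, not UPMEM specifications:

```python
# First-order model of the data movement bottleneck: streaming a large array
# through a processor-centric system vs. operating on it in place with
# per-bank PIM cores. All figures are illustrative, not real device specs.

def processor_centric_time(data_gb, offchip_bw_gbs=25.0):
    # all data crosses the narrow off-chip memory bus
    return data_gb / offchip_bw_gbs

def pim_time(data_gb, n_pim_cores=2048, per_core_bw_gbs=0.5):
    # each PIM core streams only its local slice at internal DRAM bandwidth
    return (data_gb / n_pim_cores) / per_core_bw_gbs

t_cpu = processor_centric_time(64.0)   # 64 GB scanned
t_pim = pim_time(64.0)
print(f"processor-centric: {t_cpu:.3f}s, PIM: {t_pim:.4f}s, "
      f"speedup: {t_cpu / t_pim:.0f}x")
```

The point of the model is that aggregate internal bandwidth scales with the number of in-memory cores, which is exactly what a processor-centric design cannot exploit.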
Machine-Learning Assisted Intelligent Architectures
Modern processors employ numerous human-driven policies such as prefetching, cache replacement, data management, and memory scheduling. These techniques rely on statically chosen design features that favor specific workload and/or device characteristics over others. However, designing a highly effective, high-performance, efficient policy that can adapt to changes in workload behavior across a broad range of workloads is usually well beyond human capability. In this project, you will help develop, implement, and evaluate machine learning-based techniques for different aspects of computer architecture.
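As a flavor of what machine learning inside a processor can look like, the sketch below implements a tiny online predictor in the spirit of perceptron-based predictors from the architecture literature: it learns on the fly to predict a binary outcome from recent history bits. The history length and training threshold are illustrative choices:

```python
# Toy online perceptron predictor: learn to predict a binary outcome
# (+1 / -1) from a shift register of recent outcomes. Sizes are illustrative.

class PerceptronPredictor:
    def __init__(self, history_len=8, threshold=10):
        self.w = [0] * (history_len + 1)   # +1 for the bias weight
        self.hist = [1] * history_len      # outcomes encoded as +1 / -1
        self.threshold = threshold

    def predict(self):
        y = self.w[0] + sum(w * h for w, h in zip(self.w[1:], self.hist))
        return y, (1 if y >= 0 else -1)

    def update(self, outcome):
        y, pred = self.predict()
        # train on a misprediction or a low-confidence correct prediction
        if pred != outcome or abs(y) <= self.threshold:
            self.w[0] += outcome
            for i, h in enumerate(self.hist):
                self.w[i + 1] += outcome * h
        self.hist = self.hist[1:] + [outcome]  # shift in the new outcome
        return pred == outcome

p = PerceptronPredictor()
# an alternating pattern is quickly learned from the history register
correct = sum(p.update(o) for o in [1, -1] * 50)
print(f"correct predictions on alternating pattern: {correct}/100")
```

The appeal of such policies is exactly what the paragraph above describes: the predictor adapts to the observed pattern at runtime instead of baking one workload's behavior into the design.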
Rethinking Virtual Memory for Modern Computing Systems
Modern computing systems heavily depend on virtual memory to provide many features, all of which are integral to the overall performance and functionality of the system. However, virtual memory faces important challenges today that put efficiently maintaining this critical component, as it is, at serious risk. These challenges arise fundamentally because virtual memory was designed decades ago, without the diversity and complexity of today’s computing systems in mind. Our goal is to fundamentally rethink and redesign virtual memory, so that the virtual memory abstraction provides the flexibility required to accommodate today’s massively diverse system configurations while preserving widely used programmer abstractions. In this project, you will be involved in (1) designing and performing evaluations to study the behavior of modern workloads and system configurations, and (2) using the insights gained from these evaluations to guide our research toward solutions to the challenges that the conventional virtual memory framework faces today.
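To make the discussion concrete, the sketch below walks a simple two-level page table, the core mechanism underlying the virtual memory abstraction. The field widths (10-bit indices, 12-bit page offset) follow a classic 32-bit layout chosen purely for illustration:

```python
# Sketch of a two-level page table walk with a classic 32-bit layout:
# 10-bit L1 index | 10-bit L2 index | 12-bit page offset.

PAGE_OFFSET_BITS = 12
INDEX_BITS = 10

class PageFault(Exception):
    pass

def translate(vaddr, l1_table):
    l1_idx = (vaddr >> (PAGE_OFFSET_BITS + INDEX_BITS)) & 0x3FF
    l2_idx = (vaddr >> PAGE_OFFSET_BITS) & 0x3FF
    offset = vaddr & 0xFFF
    l2_table = l1_table.get(l1_idx)
    if l2_table is None:
        raise PageFault("L1 entry not present")
    frame = l2_table.get(l2_idx)
    if frame is None:
        raise PageFault("L2 entry not present")
    return (frame << PAGE_OFFSET_BITS) | offset

# Map the page at L1 index 1, L2 index 0 to physical frame 0x1A2
page_tables = {1: {0: 0x1A2}}
vaddr = (1 << 22) | 0x7FF           # L1=1, L2=0, offset=0x7FF
print(hex(translate(vaddr, page_tables)))  # -> 0x1a27ff
```

Every challenge mentioned above (diverse page sizes, deeper tables, new memory devices) changes some piece of this walk, which is why rethinking it end to end is a systems problem and not a local tweak.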
Designing and Evaluating Energy-Efficient Main Memory
DRAM-based main memory is used in nearly all computers today, but its energy consumption is becoming a growing concern. DRAM energy utilization now accounts for as much as 40% of the total energy used by a computer. Our goal is to design new DRAM-based memory architectures that reduce the energy consumption significantly. This requires a principled approach, where we must measure how existing DRAM devices consume energy. Our group has developed a sophisticated energy measurement infrastructure to collect detailed information on DRAM energy usage. You will be involved with designing and conducting experiments to measure energy consumption using our infrastructure. Based on the data, you will work with other researchers to identify memory operations that consume large amounts of energy, and will design new DRAM architectures that improve the efficiency of these operations.
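As a taste of the analysis involved, the sketch below does back-of-the-envelope energy accounting over a DRAM command trace, in the style of datasheet-based power models, and shows why row-buffer locality matters for energy. The per-command energy values are illustrative placeholders, not numbers for any real DRAM device:

```python
# Back-of-the-envelope DRAM energy accounting from a command trace.
# Per-command energies are illustrative placeholders, not real device data.

ENERGY_PJ = {          # energy per command, picojoules (illustrative)
    "ACT": 900.0,      # activate a row into the row buffer
    "PRE": 600.0,      # precharge (close) the row
    "RD": 1500.0,      # read a cache line
    "WR": 1600.0,      # write a cache line
}

def trace_energy_pj(trace):
    return sum(ENERGY_PJ[cmd] for cmd in trace)

# Row-buffer locality matters: one activation serving many reads is far
# cheaper per access than activating the row again for every read.
streaming = ["ACT"] + ["RD"] * 8 + ["PRE"]   # good locality: 1 ACT, 8 RDs
random_access = ["ACT", "RD", "PRE"] * 8     # poor locality: 1 ACT per RD
print(trace_energy_pj(streaming), trace_energy_pj(random_access))
```

Identifying which operations dominate such sums on real measured data, and redesigning the DRAM architecture to make them cheaper, is exactly the loop this project follows.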
Evaluating and Enabling Processing inside Memory
Almost all data-intensive workloads are bottlenecked, in terms of both performance and energy, by the extensive data movement between the processor and memory. We are looking for an enthusiastic student who is eager to learn about and enable a paradigm shift that can eliminate this data movement bottleneck: computation inside memory (i.e., inside where the data resides). You will be involved in a project that aims to evaluate the benefits of executing data-intensive applications inside specialized logic in memory, and to develop both mechanisms and simulators for this purpose.
Exploring new algorithms and hardware architectures for Genomic Sequence Alignment
Our understanding of human genomes today depends on the ability of modern computing technology to quickly and accurately determine an individual’s entire genome. However, timely analysis of genomic data remains a challenge. One of the most fundamental computational steps in most bioinformatics analyses is genomic sequence alignment, and its execution time constitutes the main performance bottleneck in genomic data analysis. In our research group, we have developed several efficient hardware architectures and algorithmic solutions to tackle this problem. You will work with other researchers to design and analyze new algorithms and ideas. You will also implement and evaluate these new algorithms using real genomic data.
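At the core of the alignment step is a dynamic-programming kernel. The sketch below computes edit distance, the simplest such kernel; production aligners use heavily optimized (e.g., banded, bit-parallel, or hardware-accelerated) variants of the same idea:

```python
# Minimal dynamic-programming sequence comparison kernel (edit distance),
# the computation at the heart of genomic sequence alignment.

def edit_distance(a, b):
    # prev[j] holds the distance between a[:i-1] and b[:j]; only two rows
    # of the DP matrix are needed at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                # deletion
                cur[j - 1] + 1,             # insertion
                prev[j - 1] + (ca != cb),   # match / substitution
            ))
        prev = cur
    return prev[-1]

print(edit_distance("GATTACA", "GCATGCU"))  # -> 4
```

The quadratic work and tight data dependencies of this kernel are precisely what make alignment a bottleneck, and what the hardware and algorithmic solutions mentioned above attack.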
Navigating the Main Memory Landscape with Fast and Novel Infrastructures
Memory is the major performance, energy, and reliability bottleneck of all data-intensive workloads, e.g., graph processing, machine learning using large data sets, data analytics, databases, and genome analysis. The landscape of main memory is quickly changing, with many new technologies appearing and being proposed. These include 3D-stacked memory designs that are capable of processing in memory, new non-volatile memory technologies that are poised to replace DRAM, and many new types of DRAM architectures. The impact of such new technologies on systems and applications needs to be quickly evaluated and understood, using rigorous evaluation infrastructures. Our group develops and openly releases such infrastructures. A prominent example is Ramulator, a very flexible and fast open-source infrastructure for simulating DRAM architectures: https://github.com/CMU-SAFARI/ramulator. This infrastructure is widely used in both academia and industry (e.g., by Google, Apple, AMD, Samsung). Your task in this project is to first understand Ramulator and then improve and extend it. Possible extensions include support for the new technologies mentioned above (processing in memory, non-volatile memory, hybrid memories, new DRAM architectures). You will also evaluate the impact of such technologies on real workloads.
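A common way to drive a DRAM simulator such as Ramulator is with a memory trace file. As a warm-up, the sketch below generates a synthetic trace in a simple one-request-per-line "address R/W" format of the kind Ramulator's memory-trace mode accepts; verify the exact format against the repository README for the version you use:

```python
# Generate a synthetic memory trace to drive a DRAM simulator.
# The "0x<addr> R|W" one-line-per-request format is the simple memory-trace
# style used by Ramulator; check the repository README for exact details.

import random

def write_trace(path, n_requests=1000, read_ratio=0.8, seed=42):
    rng = random.Random(seed)   # fixed seed for a reproducible trace
    with open(path, "w") as f:
        for _ in range(n_requests):
            addr = rng.randrange(0, 1 << 32) & ~0x3F   # 64B-aligned
            op = "R" if rng.random() < read_ratio else "W"
            f.write(f"0x{addr:x} {op}\n")

write_trace("random.trace")
```

Sweeping parameters such as the read/write mix or the address distribution is a quick way to see how a simulated memory system responds before moving on to traces from real workloads.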
Discovering Security Vulnerabilities in DRAM and Designing Secure Memories
DRAM-based main memory is used in most computers today. Manufacturers have been optimizing DRAM capacity and bandwidth for years, but little effort has gone into designing secure memories. Our goal is to discover new security vulnerabilities in DRAM and propose new mechanisms that provide security support in DRAM. This requires characterizing DRAM under different operating conditions and testing different data and address patterns. Our group has developed a DRAM testing infrastructure for memory characterization. To design new in-memory security mechanisms, our group has also developed a DRAM simulator that allows quickly evaluating new hardware features in DRAM. You will be involved in designing and conducting experiments with other researchers, using our infrastructure. The goals are to: 1) discover new security vulnerabilities and identify new attack vectors that might compromise the security of the system, and 2) design new security mechanisms that protect against these and other vulnerabilities.
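As a toy illustration of the disturbance-based vulnerabilities such characterization can reveal, the sketch below flags at-risk rows in a RowHammer-style access-count model: rows physically adjacent to a heavily activated "aggressor" row may suffer bit flips. The threshold is illustrative; real disturbance thresholds are measured per device with infrastructures like the one described above:

```python
# Toy RowHammer-style disturbance model: count activations per row and flag
# the neighbors of rows whose activation count exceeds a threshold.
# The threshold is illustrative, not a measured value for any real device.

from collections import Counter

HAMMER_THRESHOLD = 50_000   # activations per refresh window; illustrative

def at_risk_victims(activation_trace):
    """activation_trace: iterable of activated row numbers."""
    counts = Counter(activation_trace)
    victims = set()
    for row, n in counts.items():
        if n >= HAMMER_THRESHOLD:
            victims.update({row - 1, row + 1})   # physically adjacent rows
    return victims

# Hammering row 7 puts rows 6 and 8 at risk; lightly-used row 3 is harmless.
print(at_risk_victims([7] * 50_000 + [3] * 10))
```

Real attacks and defenses are of course subtler (address scrambling, true row adjacency, data patterns, refresh timing), which is exactly what the characterization infrastructure is for.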