This shows you the differences between two versions of the page.
readings [2019/02/12 17:34] 127.0.0.1 external edit |
readings [2019/12/12 10:02] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ~~NOCACHE~~ | ||
- | ====== Readings ====== | ||
- | ===== Guides on how to review papers critically ===== | ||
- | * Lecture slides: {{onur-CompArch-f17-how-to-do-the-paper-reviews.pdf | pdf}} {{onur-CompArch-f17-how-to-do-the-paper-reviews.ppt | Slides ppt}} | ||
- | * Example reviews on "Main Memory Scaling: Challenges and Solution Directions" [[https://people.inf.ethz.ch/omutlu/pub/main-memory-scaling_springer15.pdf|(link to the paper)]] | ||
- | * {{review-chapter.pdf | Review 1}} | ||
- | * {{review-chapter-2.pdf | Review 2}} | ||
- | * Example review on "Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems" [[https://people.inf.ethz.ch/omutlu/pub/staged-memory-scheduling_isca12.pdf|(link to the paper)]] | ||
- | * {{review-sms.pdf | Review 1}} | ||
- | |||
- | ===== Lecture 1 (19.09 Wed.) ===== | ||
- | === Described in detail during lecture 1: === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mph_usenix_security07.pdf|T. Moscibroda and O. Mutlu. "Memory performance attacks: denial of memory service in multi-core systems," USENIX Security Symposium 2007}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf|J. Liu, B. Jaiyen, R. Veras, O. Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012}} | ||
- | * {{p422-bloom.pdf|B.H. Bloom, "Space/Time Trade-offs in Hash Coding with Allowable Errors," CACM, 1970}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-retention-time-characterization_isca13.pdf|J. Liu, B. Jaiyen, Y. Kim, C. Wilkerson, O. Mutlu, "An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms," ISCA 2013}} | ||
- | |||
- | === Suggested (lecture 1): === | ||
- | * {{bstj29-2-147.pdf|R.W. Hamming. "Error Detecting and Error Correcting Codes". Bell System Technical Journal, 1950}} | ||
- | * {{youandyourresearch.pdf|R.W. Hamming, "You and Your Research," Transcription of the Bell Communications Research Colloquium Seminar, 1986}} | ||
- | * [[http://www.youtube.com/watch?v=a1zDuOPkMSw|youtube]] | ||
- | * {{p128-rixner.pdf|S. Rixner, W.J. Dally, U.J. Kapasi, P. Mattson, J.D. Owens, "Memory access scheduling," ISCA 2000}} | ||
- | * {{US5630096.pdf|Zuravleff and Robinson. "Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order," US Patent 5,630,096, 1997}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/error-mitigation-for-intermittent-dram-failures_sigmetrics14.pdf|S. Khan, D. Lee, Y. Kim, A. Alameldeen, C. Wilkerson, O. Mutlu, "The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study," SIGMETRICS 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/avatar-dram-refresh_dsn15.pdf|M. Qureshi, D.H. Kim, S. Khan, P. Nair, O. Mutlu, "AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems," DSN 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/MEMCON-system-level-data-dependent-DRAM-failure-detection-mitigation_micro17.pdf|S. Khan, C. Wilkerson, Z. Wang, A.R. Alameldeen, D. Lee, O. Mutlu, "Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content," MICRO 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/reaper-dram-retention-profiling-lpddr4_isca17.pdf|M. Patel, J.S. Kim, O. Mutlu, "The Reach Profiler (REAPER): Enabling the Mitigation of DRAM Retention Failures via Profiling at Aggressive Conditions," ISCA 2017}} | ||
- | |||
- | |||
- | ===== Lecture 2 (20.09 Thu.) ===== | ||
- | === Described in detail during lecture 2: === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf|Y. Kim, R. Daly, J. Kim, C. Fallin, J.H. Lee, D. Lee, C. Wilkerson, K. Lai, O. Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," ISCA 2014}} | ||
- | |||
- | === Suggested (lecture 2): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-errors-at-facebook_dsn15.pdf|J. Meza, Q. Wu, S. Kumar, O. Mutlu, "Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field," DSN 2015}} | ||
- | * {{https://googleprojectzero.blogspot.ch/2015/03/exploiting-dram-rowhammer-bug-to-gain.html | M. Seaborn and T. Dullien, "Exploiting the DRAM rowhammer bug to gain kernel privileges," Google Project Zero, 2015}} | ||
- | * {{10.1007-978-3-319-40667-1_15.pdf| D. Gruss, C. Maurice, S. Mangard, "Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript," DIMVA 2016}} | ||
- | * {{p743-aweke.pdf| Z. B. Aweke, S. F. Yitbarek, R. Qiao, R. Das, M. Hicks, Y. Oren, T. Austin, "ANVIL: Software-Based Protection Against Next-Generation Rowhammer Attacks," ASPLOS 2016}} | ||
- | * {{0824a987.pdf| E. Bosman, K. Razavi, H. Bos, C. Giuffrida, "Dedup Est Machina: Memory Deduplication as an Advanced Exploitation Vector," IEEE S&P 2016}} | ||
- | * {{sec16_paper_razavi.pdf| K. Razavi, B. Gras, E. Bosman, B. Preneel, "Flip Feng Shui: Hammering a Needle in the Software Stack," USENIX Security 2016}} | ||
- | * {{p1675-van-der-veen.pdf| V. van der Veen, Y. Fratantonio, M. Lindorfer, D. Gruss, C. Maurice, G. Vigna, H. Bos, K. Razavi, C. Giuffrida, "Drammer: Deterministic Rowhammer Attacks on Mobile Platforms," CCS 2016}} | ||
- | * {{Grand Pwning Unit Accelerating Microarchitectural with the GPU.pdf|P. Frigo, C. Giuffrida, H. Bos, K. Razavi, "Grand Pwning Unit: Accelerating Microarchitectural Attacks with the GPU," S&P, 2018.}} | ||
- | * {{throwhammer_atc18.pdf|A. Tatar, R. Krishnan, E. Athanasopoulos, C. Giuffrida, H. Bos, K. Razavi, "Throwhammer: Rowhammer Attacks over the Network and Defenses," ATC 2018}} | ||
- | * {{1805.04956.pdf|M. Lipp, M. T. Aga, M. Schwarz, D. Gruss, C. Maurice, L. Raab, L. Lamster, "Nethammer: Inducing Rowhammer Faults through Network Requests," arXiv 2018}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/rowhammer-and-other-memory-issues_date17.pdf|O. Mutlu, "The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser," DATE 2017}} | ||
- | * {{Patt_2001.pdf | Y.N. Patt, "Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution," Proceedings of the IEEE, 2001}} | ||
- | |||
- | ===== Lecture 3a (26.09 Wed.) ===== | ||
- | === Required (lecture 3a): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mph_usenix_security07.pdf|T. Moscibroda and O. Mutlu. "Memory performance attacks: denial of memory service in multi-core systems," USENIX Security Symposium 2007}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf|Y. Kim, R. Daly, J. Kim, C. Fallin, J.H. Lee, D. Lee, C. Wilkerson, K. Lai, O. Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," ISCA 2014}} | ||
- | |||
- | === Recommended (lecture 3a): === | ||
- | * {{patt_ieee2001.pdf|Y.N. Patt. "Requirements, bottlenecks, and good fortune: agents for microprocessor evolution". Proceedings of the IEEE, 2001}} | ||
- | * {{gordon_moore_1965_article.pdf| G.E. Moore. "Cramming more components onto integrated circuits," Electronics magazine, 1965}} | ||
- | * {{https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions| T.S. Kuhn, "The Structure of Scientific Revolutions," 1962}} | ||
- | * {{Burks_vonNeumann.pdf| A.W. Burks, H.H. Goldstine, J. von Neumann, "Preliminary discussion of the logical design of an electronic computing instrument," 1946}} | ||
- | * {{04_chapter_4.pdf| Y.N. Patt and S.J. Patel, "Introduction to Computing Systems: Chapter 4, The von Neumann Model", 2004}} | ||
- | * {{p126-dennis.pdf | J.B. Dennis, D. Misunas, "A preliminary architecture for a basic data-flow processor," ISCA 1974}} | ||
- | * {{p34-gurd-2.pdf| J.R. Gurd, C.C. Kirkham, I. Watson, "The Manchester prototype dataflow computer," CACM, 1985}} | ||
- | |||
- | ===== Lecture 3b (26.09 Wed.) ===== | ||
- | === Required (lecture 3b): === | ||
- | === Recommended (lecture 3b): === | ||
- | * {{Wilkes_1965.pdf| M.V. Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Trans. On Electronic Computers, 1965}} | ||
- | * {{culler_parcomparch_5.1.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.1 (pp 269–283)}} | ||
- | * {{culler_parcomparch_5.3.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.3 (pp 291-305)}} | ||
- | * {{ph_computerorganizationanddesignthehardwaresoftwareinterface5th_5.10.pdf|P&H, Computer Organization and Design, Chapter 5.10 (pp 466-470)}} | ||
- | * {{Hamacher_Ch8_2012.pdf| C. Hamacher, Z. Vranesic, S. Zaky, N. Manjikian, "Computer Organization and Embedded Systems: Chapter 8, The memory system", 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf|J. Liu, B. Jaiyen, R. Veras, O. Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pcm_isca09.pdf|B.C. Lee, E. Ipek, O. Mutlu, D. Burger, "Architecting Phase Change Memory as a Scalable DRAM Alternative," ISCA 2009}} | ||
- | * {{phase-changetechnologyandthefutureofmainmemory.pdf|B. C. Lee, P. Zhou, J. Yang, Y. Zhang, B. Zhao, E. Ipek, O. Mutlu, and D. Burger, "Phase-Change Technology and the Future of Main Memory," IEEE Micro Top Picks 2010}} | ||
- | * {{liptay68.pdf| J.S. Liptay, "Structural aspects of the System/360 Model 85 II: the cache," IBM Systems Journal, 1968}} | ||
- | * {{p435-fotheringham.pdf| J. Fotheringham, "Dynamic Storage Allocation in the Atlas Computer, Including an Automatic Use of a Backing Store," CACM, 1961}} | ||
- | * {{Bloom62.pdf| L. Bloom, M. Cohen, S. Porter, "Considerations in the Design of a Computer with High Logic-to-Memory Speed Ratio," AIEE Gigacycle Computing Systems Winter Meeting 1962}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/qureshi_isca06.pdf|Moinuddin K. Qureshi, Daniel N. Lynch, Onur Mutlu, and Yale N. Patt, "A Case for MLP-Aware Cache Replacement," ISCA 2006}} | ||
- | * {{Belady_IBM1966.pdf| L.A. Belady, “A study of replacement algorithms for a virtual-storage computer,” IBM Systems Journal, 1966}} | ||
- | |||
- | ===== Lecture 4a (27.09 Thu.) ===== | ||
- | === Required (lecture 4a): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/qureshi_isca06.pdf| Moinuddin K. Qureshi, Daniel N. Lynch, Onur Mutlu, Yale N. Patt, "A Case for MLP-Aware Cache Replacement", ISCA 2006}} | ||
- | === Recommended (lecture 4a): === | ||
- | * {{1-jouppi.pdf| Norman P. Jouppi, "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers", ISCA 1990}} | ||
- | * {{p169-seznec.pdf| André Seznec, “A Case for Two-Way Skewed-Associative Caches,” ISCA 1993}} | ||
- | * {{p381-qureshi.pdf| Moinuddin K. Qureshi, Aamer Jaleel, Yale N. Patt, Simon C. Steely Jr., Joel Emer, “Adaptive Insertion Policies for High Performance Caching,” ISCA 2007}} | ||
- | * {{evictedaddressfilter.pdf| Vivek Seshadri, Onur Mutlu, Michael A Kozuch, Todd C Mowry, "The Evicted-Address Filter: A Unified Mechanism to Address Both Cache Pollution and Thrashing", PACT 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/bdi-compression_pact12.pdf| Gennady Pekhimenko, Vivek Seshadri, Onur Mutlu, Philip B. Gibbons, Michael A. Kozuch, and Todd C. Mowry, "Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches", PACT 2012}} | ||
- | * {{p81-kroft.pdf| David Kroft, "Lockup-Free Instruction Fetch/Prefetch Cache Organization," ISCA 1981}} | ||
- | * {{andrew_glew.pdf| Andrew Glew, “MLP Yes! ILP No!,” ASPLOS Wild and Crazy Ideas Session, 1998}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_ieee_micro03.pdf| Onur Mutlu, Jared Stark, Chris Wilkerson, and Yale N. Patt, "Runahead Execution: An Effective Alternative to Large Instruction Windows," IEEE Micro 2003}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/utility-based-hybrid-memory-management_cluster17.pdf| Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, Hui Wang, and Onur Mutlu, "Utility-Based Hybrid Memory Management," CLUSTER 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/parbs_isca08.pdf| Onur Mutlu and Thomas Moscibroda, "Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems," ISCA 2008}} | ||
- | * {{p46-dusser.pdf| Julien Dusser, Thomas Piquet, André Seznec, "Zero-Content Augmented Caches", ICS 2009}} | ||
- | * {{zero.pdf| Mafijul Md. Islam, Per Stenstrom, "Zero-Value Caches: Cancelling Loads that Return Zero", PACT 2009}} | ||
- | * {{p258-yang.pdf| Jun Yang, Youtao Zhang, Rajiv Gupta, "Frequent Value Compression in Data Caches," MICRO 2000}} | ||
- | * {{alameldeen_frequentpatterncompression_isca04.pdf| Alaa R. Alameldeen, David A. Wood, "Adaptive Cache Compression for High-Performance Processors", ISCA 2004 }} | ||
- | * {{cpack_chen_tvlsisystems10.pdf| Xi Chen, Lei Yang, Robert P. Dick, Li Shang, Haris Lekatsas, "C-Pack: A High-Performance Microprocessor Cache Compression Algorithm," T-VLSI Systems 2010}} | ||
- | |||
- | ===== Lecture 4b (27.09 Thu.) ===== | ||
- | === Required (lecture 4b): === | ||
- | |||
- | |||
- | === Recommended (lecture 4b): === | ||
- | |||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-systems-research_superfri14.pdf| Onur Mutlu, Lavanya Subramanian, "Research Problems and Opportunities in Memory Systems," SUPERFRI 2014}} | ||
- | * {{https://arxiv.org/pdf/1802.00320.pdf| Saugata Ghose, Kevin Hsieh, Amirali Boroumand, Rachata Ausavarungnirun, and Onur Mutlu, "Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions," Invited Book Chapter, to appear in 2018 }} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/rowhammer-and-other-memory-issues_date17.pdf| Onur Mutlu, "The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser," DATE 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-scaling_memcon13.pdf| Onur Mutlu, "Memory Scaling: A Systems Architecture Perspective," MemCon 2013}} | ||
- | * {{https://arxiv.org/pdf/1706.08642.pdf| Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives," Proceedings of the IEEE, Sept. 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tldram_hpca13.pdf| Donghyuk Lee, Yoongu Kim, Vivek Seshadri, Jamie Liu, Lavanya Subramanian, and Onur Mutlu, "Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/salp-dram_isca12.pdf| Yoongu Kim, Vivek Seshadri, Donghyuk Lee, Jamie Liu, and Onur Mutlu, "A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM," ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf| Jamie Liu, Ben Jaiyen, Richard Veras, and Onur Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf| Yoongu Kim, Weikun Yang, and Onur Mutlu, "Ramulator: A Fast and Extensible DRAM Simulator," IEEE CAL 2015}} | ||
- | * {{stupid_architects_look_to_future.pdf| R. Sites, "It’s the Memory, Stupid!," Microprocessor report 1996}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_hpca03.pdf| Onur Mutlu, Jared Stark, Chris Wilkerson, and Yale N. Patt, "Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors," HPCA 2003}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-errors-at-facebook_dsn15.pdf| Justin Meza, Qiang Wu, Sanjeev Kumar, and Onur Mutlu, "Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field," DSN 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf| Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," ISCA 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf| Hasan Hassan, Nandita Vijaykumar, Samira Khan, Saugata Ghose, Kevin Chang, Gennady Pekhimenko, Donghyuk Lee, Oguz Ergin, and Onur Mutlu, "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies," HPCA 2017 }} | ||
- | * {{kang-memoryforum14.pdf| Uksong Kang, Hak-soo Yu, Churoo Park, Hongzhong Zheng, John Halbert, Kuljit Bains, SeongJin Jang, and Joo Sun Choi, "Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling," The Memory Forum 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/heterogeneous-reliability-memory-for-data-centers_dsn14.pdf| Yixin Luo, Sriram Govindan, Bikash Sharma, Mark Santaniello, Justin Meza, Aman Kansal, Jie Liu, Badriddine Khessib, Kushagra Vaid, and Onur Mutlu, "Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory," DSN 2014}} | ||
- | |||
- | |||
- | ===== Lecture 5 (03.10 Wed.) ===== | ||
- | === Described in detail during lecture 5: === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf|Y. Kim, R. Daly, J. Kim, C. Fallin, J.H. Lee, D. Lee, C. Wilkerson, K. Lai, O. Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," ISCA 2014}} | ||
- | * {{isca09-disaggregate.pdf|K. Lim, J. Chang, T. Mudge, P. Ranganathan, S.K. Reinhardt, T.F. Wenisch, "Disaggregated Memory for Expansion and Sharing in Blade Servers," ISCA 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf|H. Hassan, N. Vijaykumar, S. Khan, S. Ghose, K. Chang, G. Pekhimenko, D. Lee, O. Ergin, O. Mutlu, "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies," HPCA 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pcm_isca09.pdf|B.C. Lee, E. Ipek, O. Mutlu, D. Burger, "Architecting Phase Change Memory as a Scalable DRAM Alternative," ISCA 2009}} | ||
- | |||
- | === Suggested (lecture 5): === | ||
- | * {{andrew_glew.pdf| A. Glew, “MLP Yes! ILP No!,” ASPLOS Wild and Crazy Ideas Session 1998}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_ieee_micro03.pdf| O. Mutlu, J. Stark, C. Wilkerson, Y.N. Patt, “Runahead Execution: An Effective Alternative to Large Instruction Windows,” IEEE Micro 2003}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/utility-based-hybrid-memory-management_cluster17.pdf| Y. Li, S. Ghose, J. Choi, J. Sun, H. Wang, O. Mutlu, “Utility-Based Hybrid Memory Management,” CLUSTER 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/parbs_isca08.pdf| O. Mutlu and T. Moscibroda, “Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems,” ISCA 2008}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-systems-research_superfri14.pdf| O. Mutlu and L. Subramanian, “Research Problems and Opportunities in Memory Systems,” SUPERFRI 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tldram_hpca13.pdf|D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, O. Mutlu, "Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/salp-dram_isca12.pdf|Y. Kim, V. Seshadri, D. Lee, J. Liu, O. Mutlu, "A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM," ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf|J. Liu, B. Jaiyen, R. Veras, O. Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf|Y. Kim, W. Yang, O. Mutlu, "Ramulator: A Fast and Extensible DRAM Simulator," IEEE CAL 2015}} | ||
- | * {{stupid_architects_look_to_future.pdf|R. Sites, "It’s the Memory, Stupid!," Microprocessor report 1996}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_hpca03.pdf|O. Mutlu, J. Stark, C. Wilkerson, Y.N. Patt, "Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors," HPCA 2003}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-errors-at-facebook_dsn15.pdf|J. Meza, Q. Wu, S. Kumar, O. Mutlu, "Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field," DSN 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/adaptive-latency-dram_hpca15.pdf|D. Lee, Y. Kim, G. Pekhimenko, S. Khan, V. Seshadri, K. Chang, O. Mutlu, "Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case," HPCA 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/understanding-latency-variation-in-DRAM-chips_sigmetrics16.pdf|K. Chang, A. Kashyap, H. Hassan, S. Khan, K. Hsieh, D. Lee, S. Ghose, G. Pekhimenko, T. Li, O. Mutlu, "Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization," SIGMETRICS 2016}} | ||
- | * {{kang-memoryforum14.pdf|U. Kang, H.-S. Yu, C. Park, H. Zheng, J. Halbert, K. Bains, S. Jang, J.S. Choi, "Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling," The Memory Forum 2014}} | ||
- | * {{https://arxiv.org/pdf/1706.08642.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives," Proceedings of the IEEE 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-programming-vulnerabilities_hpca17.pdf|Y. Cai, S. Ghose, Y. Luo, K. Mai, O. Mutlu, E.F. Haratsch, "Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques," HPCA 2017}} | ||
- | * {{ekman-ISCA05.pdf|M. Ekman and P. Stenstrom, "A Robust Main-Memory Compression Scheme," ISCA 2005}} | ||
- | * {{PCM_IBMJRD.pdf|S. Raoux, G. W. Burr, M. J. Breitwisch, C. T. Rentner, Y.-C. Chen, R. M. Shelby, M. Salinga, D. Krebs, S.-H. Chen, H.-L. Lung, C. H. Lam, "Phase-change random access memory: A scalable technology," IBM JRD 2008}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/heterogeneous-reliability-memory-for-data-centers_dsn14.pdf| Y. Luo, S. Govindan, B. Sharma, M. Santaniello, J. Meza, A. Kansal, J. Liu, B. Khessib, K. Vaid, O. Mutlu, "Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory," DSN 2014}} | ||
- | * {{chandra.pdf|T. Chandra, "Sibyl: A system for large scale machine learning at Google," Keynote at DSN 2014}} | ||
- | * [[https://www.youtube.com/watch?v=3SaZ5UAQrQM|youtube]] | ||
- | |||
- | ===== Lecture 7 (10.10 Wed.) ===== | ||
- | === Suggested (lecture 7): === | ||
- | * {{Flynn_1966.pdf|M.J. Flynn, “Very high-speed computing systems,” Proc. of IEEE 1966}} | ||
- | * {{p140-fisher.pdf|J.A. Fisher, "Very Long Instruction Word architectures and the ELI-512," ISCA 1983}} | ||
- | * {{p63-russell.pdf|R.M. Russell, "The CRAY-1 computer system," CACM 1978}} | ||
- | * {{p74-rau.pdf|B.R. Rau, "Pseudo-randomly interleaved memory," ISCA 1991}} | ||
- | * {{mmx_technology_1996.pdf|A. Peleg and U. Weiser, "MMX technology extension to the Intel architecture," IEEE Micro 1996}} | ||
- | * {{04523358.pdf|E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro 2008}} | ||
- | * {{30470407.pdf| W.W.L. Fung, I. Sham, G. Yuan, and T.M. Aamodt, "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," MICRO 2007}} | ||
- | |||
- | ===== Lecture 8 (11.10 Thu.) ===== | ||
- | === Suggested (lecture 8): === | ||
- | * {{https://mitpress.mit.edu/books/introduction-bioinformatics-algorithms|N.C. Jones and P.A. Pevzner, “An introduction to bioinformatics algorithms,” MIT Press, 2004}} | ||
- | * {{http://genome-scale.info|V. Mäkinen, D. Belazzougui, F. Cunial, and A.I. Tomescu, “Genome-scale algorithm design,” Cambridge University Press, 2015}} | ||
- | * {{https://academic.oup.com/bioinformatics/article/32/11/1632/1742696|H. Xin, S. Nahar, R. Zhu, J. Emmons, G. Pekhimenko, C. Kingsford, C. Alkan, and O. Mutlu, “Optimal Seed Solver: optimizing seed selection in read mapping,” Bioinformatics, 2016}} | ||
- | * {{https://www.nature.com/articles/ng.437|C. Alkan, J.M. Kidd, T. Marques-Bonet, G. Aksay, F. Antonacci, F. Hormozdiari, J.O. Kitzman, C. Baker, M. Malig, O. Mutlu, S.C. Sahinalp, R.A. Gibbs, E.E. Eichler, "Personalized copy number and segmental duplication maps using next-generation sequencing," Nature Genetics, 2009}} | ||
- | * {{https://bmcgenomics.biomedcentral.com/track/pdf/10.1186/1471-2164-14-S1-S13|H. Xin, D. Lee, F. Hormozdiari, S. Yedkar, O. Mutlu, and C. Alkan, "Accelerating read mapping with FastHASH," BMC Genomics, 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/shifted-hamming-distance_bioinformatics15.pdf|H. Xin, J. Greth, J. Emmons, G. Pekhimenko, C. Kingsford, C. Alkan, and O. Mutlu, "Shifted Hamming Distance: A Fast and Accurate SIMD-friendly Filter to Accelerate Alignment Verification in Read Mapping," Bioinformatics, 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/gatekeeper_FPGA-genome-prealignment-accelerator_bionformatics17.pdf|M. Alser, H. Hassan, H. Xin, O. Ergin, O. Mutlu, C. Alkan, "GateKeeper: A New Hardware Architecture for Accelerating Pre-Alignment in DNA Short Read Mapping," Bioinformatics, 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/magnet-understanding-improving-genome-prealignment_ipsi17.pdf|M. Alser, O. Mutlu, and C. Alkan, "MAGNET: Understanding and Improving the Accuracy of Genome Pre-Alignment Filtering," IPSI Transactions on Internet Research, 2017}} | ||
- | * {{https://arxiv.org/pdf/1809.07858.pdf|M. Alser, H. Hassan, A. Kumar, O. Mutlu, and C. Alkan, "SLIDER: Fast and Efficient Computation of Banded Sequence Alignment," arXiv 2018}} | ||
- | * {{https://arxiv.org/pdf/1711.01177.pdf|J.S. Kim, D. Senol Cali, H. Xin, D. Lee, S. Ghose, M. Alser, H. Hassan, O. Ergin, C. Alkan, and O. Mutlu, "GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping Using Processing-in-Memory Technologies," to appear in BMC Genomics, 2018}} | ||
- | * {{https://arxiv.org/pdf/1711.08774.pdf|D. Senol Cali, J.S. Kim, S. Ghose, C. Alkan, and O. Mutlu, “Nanopore Sequencing Technology and Tools for Genome Assembly: Computational Analysis of the Current State, Bottlenecks and Future Directions,” to appear in Briefings in Bioinformatics, 2018}} | ||
- | |||
- | ===== Lecture 9 (17.10 Wed.) ===== | ||
- | === Required (lecture 9): === | ||
- | * {{ :mcfarling_combining.pdf | S. McFarling, "Combining Branch Predictors" DEC WRL Technical Report 1993}} | ||
- | * {{ :two-level-branch-pred.pdf | T. Yeh and Y. Patt, "Two-Level Adaptive Training Branch Prediction" MICRO 1991}} | ||
- | |||
- | === Suggested (lecture 9): === | ||
- | * {{ :the_microarchitecture_of_superscalar_processors.pdf | J. Smith, "The Microarchitecture of Superscalar Processors" IEEE 1995}} | ||
- | * {{ :alpha_21264.pdf | R. E. Kessler, "The Alpha 21264 Microprocessor" IEEE Micro 1999}} | ||
- | * {{ :p16-pettis.pdf | Pettis and Hansen, "Profile Guided Code Positioning" PLDI 1990}} | ||
- | * {{ :p300-ball.pdf | Ball and Larus, “Branch prediction for free,” PLDI 1993}} | ||
- | * {{ :jsmith.pdf | J. Smith, "A study of Branch Prediction Strategies" ISCA 1981}} | ||
- | * {{ :r2_evers_2lbranch_isca98.pdf | M. Evers et al., "An Analysis of Correlation and Predictability: What Makes Two-level Branch Predictors Work" ISCA 1998}} | ||
- | * {{ :bf03356745.pdf |P. Chang et al., "Branch Classification: A New Mechanism for Improving Branch Predictor Performance" MICRO 1994}} | ||
- | * {{ :agree_isca24.pdf | E. Sprangle et al., "The Agree Predictor: A Mechanism for Reducing Negative Branch History Interference" ISCA 1997}} | ||
- | * {{ :optim2bcgskew.pdf | A. Seznec, "An Optimized 2bcgskew Branch Predictor" IRISA Tech. Report 2003}} | ||
- | * {{ :michaud97trading.pdf | P. Michaud et al., "Trading Conflict and Capacity Aliasing in Conditional Branch Predictors" ISCA 1997}} | ||
- | * {{ :p4-lee.pdf | C. Lee et al., "The Bi-Mode Branch Predictor" MICRO 1997}} | ||
- | * {{ :p69-eden.pdf | A. N. Eden and T. Mudge, "The YAGS Branch Prediction Scheme" MICRO 1998}} | ||
- | * {{ :seznec02.pdf | A. Seznec et al., "Design Tradeoffs for the Alpha EV8 Conditional Branch Predictor" ISCA 2002}} | ||
- | * {{ :p124-yeh.pdf | T. Yeh and Y. Patt, "Alternative implementations of two-level adaptive branch prediction" ISCA 1992}} | ||
- | * {{ :ssmt.pdf | R. Chappell et al., "Simultaneous Subordinate Microthreading (SSMT)" ISCA 1999}} | ||
- | * {{ :aaaa09d97139a5076ad0a24bd5bb69bea1e1.pdf | D. Jimenez and C. Lin, "Dynamic Branch Prediction with Perceptrons" HPCA 2001}} | ||
- | * {{ :ad48737158334a46763c8e0b29fd53975e10.pdf | A. Seznec, "Analysis of the O-GEometric History Length Branch Predictor" ISCA 2005}} | ||
- | * {{ :centrino_microarchitecture_and_performance.pdf | S. Gochman et al., "The Intel Pentium M Processor: Microarchitecture and Performance" Intel Technology Journal 2003}} | ||
- | * {{ :principles_of_neurodynamics.pdf | F. Rosenblatt, “Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms,” 1962}} | ||
- | * {{ :v8paper1.pdf | A. Seznec and P. Michaud, "A Case for (Partially) TAgged GEometric History Length Branch Prediction" JILP 2006}} | ||
- | * {{ :AndreSeznec.pdf | A. Seznec, "TAGE-SC-L Branch Predictors" CBP 2014}} | ||
- | * {{ :andresezneclimited.pdf | A. Seznec, "TAGE-SC-L Branch Predictors Again" CBP 2016}} | ||
- | * {{ :chappell_ISCA2002.pdf | R. Chappell et al., "Difficult-Path Branch Prediction Using Subordinate Microthreads" ISCA 2002}} | ||
- | * {{ :micro.confidence.pdf | Jacobsen et al., "Assigning Confidence to Conditional Branch Predictions" MICRO 1996}} | ||
- | * {{ :10.1.1.33.9918.pdf | Manne et al., "Pipeline Gating: Speculation Control for Energy Reduction" ISCA 1998}} | ||
- | |||
- | ===== Lecture 10a (18.10 Thu.) ===== | ||
- | === Required (lecture 10a): === | ||
- | |||
- | === Suggested (lecture 10a): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/rlmc_isca08.pdf|E. Ipek, O. Mutlu, J. F. Martínez, and R. Caruana, "Self-Optimizing Memory Controllers: A Reinforcement Learning Approach," ISCA 2008}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf|Y. Kim, W. Yang, and O. Mutlu, "Ramulator: A Fast and Extensible DRAM Simulator," IEEE CAL 2015}} | ||
- | |||
- | ===== Lecture 10b (18.10 Thu.) ===== | ||
- | === Required (lecture 10b): === | ||
- | |||
- | === Suggested (lecture 10b): === | ||
- | * {{Wilkes_1965.pdf| M.V. Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Trans. On Electronic Computers, 1965}} | ||
- | * {{Amdahl_1964.pdf| G.M. Amdahl, G.A. Blaauw, F.P. Brooks. "Architecture of the IBM System/360," IBM Journal of Research and Development, 1964}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tldram_hpca13.pdf|D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, O. Mutlu, "Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture," HPCA 2013}} | ||
- | * {{https://users.ece.cmu.edu/~omutlu/pub/lisa-dram_hpca16.pdf|K. K. Chang, P. J. Nair, S. Ghose, D. Lee, M. K. Qureshi, O. Mutlu, "Low-Cost Inter-Linked Subarrays (LISA): Enabling Fast Inter-Subarray Data Movement in DRAM", HPCA 2016}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/understanding-latency-variation-in-DRAM-chips_sigmetrics16.pdf|K. Chang, A. Kashyap, H. Hassan, S. Khan, K. Hsieh, D. Lee, S. Ghose, G. Pekhimenko, T. Li, O. Mutlu, "Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization," SIGMETRICS 2016}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/adaptive-latency-dram_hpca15.pdf|D. Lee, Y. Kim, G. Pekhimenko, S. Khan, V. Seshadri, K. Chang, O. Mutlu, "Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case," HPCA 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf|Y. Kim, R. Daly, J. Kim, C. Fallin, J.H. Lee, D. Lee, C. Wilkerson, K. Lai, O. Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," ISCA 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf|H. Hassan, N. Vijaykumar, S. Khan, S. Ghose, K. Chang, G. Pekhimenko, D. Lee, O. Ergin, O. Mutlu, "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies," HPCA 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/DIVA-low-latency-DRAM_sigmetrics17-paper.pdf|D. Lee, S. Khan, L. Subramanian, S. Ghose, R. Ausavarungnirun, G. Pekhimenko, V. Seshadri, O. Mutlu, "Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms," SIGMETRICS 2017}} | ||
- | |||
- | ===== Lecture 11a (24.10 Wed.) ===== | ||
- | === Recommended (lecture 11a): === | ||
- | |||
- | * {{solar-dram-for-reduced-latency-memory_iccd18.pdf| J.S. Kim, M. Patel, H. Hassan, O. Mutlu, "Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines," ICCD, 2018.}} | ||
- | * {{DIVA-low-latency-DRAM_sigmetrics17-paper.pdf| D. Lee, S. Khan, L. Subramanian, S. Ghose, R. Ausavarungnirun, G. Pekhimenko, V. Seshadri, O. Mutlu, "Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms," SIGMETRICS, 2017.}} | ||
- | * {{Voltron-reduced-voltage-DRAM-sigmetrics17-paper.pdf| K. Chang, A.G. Yaglikci, S. Ghose, A. Agrawal, N. Chatterjee, A. Kashyap, D. Lee, M. O'Connor, H. Hassan, O. Mutlu, "Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms," SIGMETRICS, 2017.}} | ||
- | * {{memory-dvfs_icac11.pdf| H. David, C. Fallin, E. Gorbatov, U.R. Hanebutte, O. Mutlu, "Memory Power Management via Dynamic Voltage/Frequency Scaling," ICAC, 2011.}} | ||
- | * {{VRL-DRAM_reduced-refresh-latency_dac18.pdf| A. Das, H. Hassan, O. Mutlu, "VRL-DRAM: Improving DRAM Performance via Variable Refresh Latency," DAC, 2018.}} | ||
- | * {{dram-latency-puf_hpca18.pdf| J. S. Kim, M. Patel, H. Hassan, O. Mutlu, "The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency-Reliability Tradeoff in Modern DRAM Devices," HPCA, 2018.}} | ||
- | * {{chargecache_low-latency-dram_hpca16.pdf| H. Hassan, G. Pekhimenko, N. Vijaykumar, V. Seshadri, D. Lee, O. Ergin, O. Mutlu, "ChargeCache: Reducing DRAM Latency by Exploiting Row Access Locality," HPCA, 2016.}} | ||
- | * {{CAL-DRAM_for-reduced-latency-memory_micro18.pdf| Y. Wang, A. Tavakkol, L. Orosa, S. Ghose, N.M. Ghiasi, M. Patel, J.S. Kim, H. Hassan, M. Sadrosadati, O. Mutlu, "Reducing DRAM Latency via Charge-Level-Aware Look-Ahead Partial Restoration," MICRO, 2018.}} | ||
- | * {{VAMPIRE-DRAM-power-characterization-and-modeling_sigmetrics18_pomacs18-twocolumn.pdf| S. Ghose, A.G. Yaglikci, R. Gupta, D. Lee, K. Kudrolli, W.X. Liu, H. Hassan, K.K. Chang, N. Chatterjee, A. Agrawal, M. O'Connor, O. Mutlu, "What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study," SIGMETRICS, 2018.}} | ||
- | |||
- | ===== Lecture 11b (24.10 Wed.) ===== | ||
- | === Recommended (lecture 11b): === | ||
- | |||
- | * {{A_THEORY_OF_HUMAN_MOTIVATION.pdf| A.H. Maslow, "A Theory of Human Motivation," Psychological Review, 1943.}} | ||
- | * {{Motivation_and_Personality-Maslow.pdf| A.H. Maslow, "Motivation and Personality," 1954.}} | ||
- | * {{burks_vonneumann.pdf |A.W. Burks, H.H. Goldstine, J. von Neumann, "Preliminary Discussion of the Logical Design of an Electronic Computing Instrument," 1946.}} | ||
- | * {{stupid_architects_look_to_future.pdf| R. Sites, "It’s the Memory, Stupid!," Microprocessor Report, 1996.}} | ||
- | * {{mutlu_hpca03.pdf| O. Mutlu, J. Stark, C. Wilkerson, Y.N. Patt, "Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors," HPCA, 2003.}} | ||
- | * {{profiling_a_warehouse-scale_computer.pdf| S. Kanev, J. P. Darago, K. M. Hazelwood, P. Ranganathan, T. Moseley, G. Wei, D. M. Brooks, "Profiling a Warehouse-scale Computer," ISCA, 2015.}} | ||
- | * {{a32-wang.pdf| S. Wang, E. Ipek, "Reducing Data Movement Energy via Online Data Clustering and Encoding," MICRO, 2016.}} | ||
- | * {{06983056.pdf| D. Pandiyan and C. Wu, "Quantifying the Energy Cost of Data Movement for Emerging Smart Phone Workloads on Mobile Platforms," IISWC, 2014.}} | ||
- | * {{Google-consumer-workloads-data-movement-and-PIM_asplos18.pdf| A. Boroumand, S. Ghose, Y. Kim, R. Ausavarungnirun, E. Shiu, R. Thakur, D. Kim, A. Kuusela, A. Knies, P. Ranganathan, and O. Mutlu, "Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks,” ASPLOS, 2018.}} | ||
- | * {{p285-rosenblum.pdf| M. Rosenblum, E. Bugnion, S.A. Herrod, E. Witchel, and A. Gupta, "The Impact of Architectural Trends on Operating System Performance," SIGOPS, 1995.}} | ||
- | * {{10.1.1.416.7554.pdf| J. Ousterhout, "Why Aren’t Operating Systems Getting Faster as Fast as Hardware?," USENIX, 1990.}} | ||
- | * {{01524129.pdf|L. Zhao, R. Iyer, S. Makineni, L. Bhuyan, D. Newell, “Hardware Support for Bulk Data Movement in Server Platforms,” ICCD, 2005.}} | ||
- | * {{Architecture_Support_for_Improving_Bulk_Memory_Cop.pdf| X. Jiang, Y. Solihin, L. Zhao, R. Iyer, “Architecture Support for Improving Bulk Memory Copying and Initialization Performance,” PACT, 2009}} | ||
- | * {{rowclone_micro13.pdf| V. Seshadri, Y. Kim, C. Fallin, D. Lee, R. Ausavarungnirun, G. Pekhimenko, Y. Luo, O. Mutlu, M.A. Kozuch, P.B. Gibbons, T.C. Mowry, "RowClone: Fast and Energy-Efficient In-DRAM Bulk Data Copy and Initialization," MICRO, 2013.}} | ||
- | * {{ambit-bulk-bitwise-dram_micro17.pdf| V. Seshadri, D. Lee, T. Mullins, H. Hassan, A. Boroumand, J. Kim, M.A. Kozuch, O. Mutlu, P.B. Gibbons, T.C. Mowry, “Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology,” MICRO, 2017.}} | ||
- | * {{in-DRAM-bulk-AND-OR-ieee_cal15.pdf| V. Seshadri, K. Hsieh, A. Boroumand, D. Lee, M.A. Kozuch, O. Mutlu, P.B. Gibbons, T.C. Mowry, "Fast Bulk Bitwise AND and OR in DRAM," IEEE CAL, 2015.}} | ||
- | * {{1711.01177.pdf| J.S. Kim, D. S. Cali, H. Xin, D. Lee, S. Ghose, M. Alser, H. Hassan, O. Ergin, C. Alkan, and O. Mutlu, "GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping Using Processing-in-Memory Technologies,” APBC, 2018.}} | ||
- | * {{p289-li.pdf| Y. Li, and J.M. Patel. "BitWeaving: Fast Scans for Main Memory Data Processing." SIGMOD 2013.}} | ||
- | * {{p605-goodwin.pdf| B. Goodwin, M. Hopcroft, D. Luu, A. Clemmer, M. Curmei, S. Elnikety, Y. He, "BitFunnel: Revisiting Signatures for Search,” SIGIR, 2017.}} | ||
- | |||
- | ===== Lecture 12 (25.10 Thu.) ===== | ||
- | === Recommended (lecture 12): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf| Yoongu Kim, Weikun Yang, and Onur Mutlu, "Ramulator: A Fast and Extensible DRAM Simulator," CAL, 2015. }} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tesseract-pim-architecture-for-graph-processing_isca15.pdf| Junwhan Ahn, Sungpack Hong, Sungjoo Yoo, Onur Mutlu, and Kiyoung Choi, "A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing," ISCA, 2015.}} | ||
- | * {{Google-consumer-workloads-data-movement-and-PIM_asplos18.pdf| A. Boroumand, S. Ghose, Y. Kim, R. Ausavarungnirun, E. Shiu, R. Thakur, D. Kim, A. Kuusela, A. Knies, P. Ranganathan, and O. Mutlu, "Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks,” ASPLOS, 2018.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/TOM-programmer-transparent-GPU-near-data-processing_isca16.pdf| Kevin Hsieh, Eiman Ebrahimi, Gwangsun Kim, Niladrish Chatterjee, Mike O'Connor, Nandita Vijaykumar, Onur Mutlu, and Stephen W. Keckler, "Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems," ISCA, 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/scheduling-for-GPU-processing-in-memory_pact16.pdf| Ashutosh Pattnaik, Xulong Tang, Adwait Jog, Onur Kayiran, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, and Chita R. Das, "Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities," PACT, 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/in-memory-pointer-chasing-accelerator_iccd16.pdf| Kevin Hsieh, Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, and Onur Mutlu, "Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation," ICCD, 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pim-enabled-instructons-for-low-overhead-pim_isca15.pdf| Junwhan Ahn, Sungjoo Yoo, Onur Mutlu, and Kiyoung Choi, "PIM-Enabled Instructions: A Low-Overhead, Locality-Aware Processing-in-Memory Architecture," ISCA, 2015.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/enhanced-memory-controller-for-dependent-loads_isca16.pdf| Milad Hashemi, Khubaib, Eiman Ebrahimi, Onur Mutlu, and Yale N. Patt, "Accelerating Dependent Cache Misses with an Enhanced Memory Controller," ISCA, 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/continuous-runahead-engine_micro16.pdf| Milad Hashemi, Onur Mutlu, and Yale N. Patt, "Continuous Runahead: Transparent Hardware Acceleration for Memory Intensive Workloads," MICRO, 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/LazyPIM-coherence-for-processing-in-memory_ieee-cal16.pdf| Amirali Boroumand, Saugata Ghose, Minesh Patel, Hasan Hassan, Brandon Lucia, Kevin Hsieh, Krishna T. Malladi, Hongzhong Zheng, and Onur Mutlu, "LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory," CAL, 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/concurrent-data-structures-for-PIM_spaa17.pdf|Zhiyu Liu, Irina Calciu, Maurice Herlihy, and Onur Mutlu, "Concurrent Data Structures for Near-Memory Computing," SPAA 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf|H. Hassan, N. Vijaykumar, S. Khan, S. Ghose, K. Chang, G. Pekhimenko, D. Lee, O. Ergin, O. Mutlu, "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies," HPCA 2017}} | ||
- | * {{1711.01177.pdf| J.S. Kim, D. S. Cali, H. Xin, D. Lee, S. Ghose, M. Alser, H. Hassan, O. Ergin, C. Alkan, and O. Mutlu, "GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping Using Processing-in-Memory Technologies,” APBC, 2018.}} | ||
- | * {{https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions| T.S. Kuhn, "The Structure of Scientific Revolutions," 1962}} | ||
- | * {{https://arxiv.org/pdf/1706.08642.pdf| Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu, "Error characterization, mitigation, and recovery in flash-memory-based solid-state drives," Proceedings of the IEEE, 2017.}} | ||
- | * {{https://arxiv.org/pdf/1802.00320.pdf| Saugata Ghose, Kevin Hsieh, Amirali Boroumand, Rachata Ausavarungnirun, and Onur Mutlu, "Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions," Invited Book Chapter, to appear in 2018 }} | ||
- | |||
- | ===== Lecture 13 (31.10 Wed.) ===== | ||
- | === Required (Lecture 13): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pcm_isca09.pdf|B. C. Lee, E. Ipek, O. Mutlu, and D. Burger, "Architecting Phase Change Memory as a Scalable DRAM Alternative," ISCA 2009.}} | ||
- | === Recommended (Lecture 13): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/sttram_ispass13.pdf |E. Kultursay, M. Kandemir, A. Sivasubramaniam, and O. Mutlu, "Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative" ISPASS 2013}} | ||
- | * {{scalablehigh-performancemainmemorysystemusingphase-changememorytechnology.pdf | Moinuddin K. Qureshi, Viji Srinivasan, and Jude A. Rivers "Scalable High-Performance Main Memory System Using Phase-Change Memory Technology" ISCA 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/rowbuffer-aware-caching_iccd12.pdf|H. Yoon, J. Meza, R. Ausavarungnirun, R. Harding, and O. Mutlu, "Row Buffer Locality Aware Caching Policies for Hybrid Memories" ICCD 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/utility-based-hybrid-memory-management_cluster17.pdf|Y. Li, S. Ghose, J. Choi, J. Sun, H. Wang, and O. Mutlu, "Utility-Based Hybrid Memory Management" CLUSTER 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/timber-fine-grained-dram-cache_ieee-cal12.pdf|J. Meza, J. Chang, H. Yoon, O. Mutlu, and P. Ranganathan, "Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management" CAL 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/persistent-memory-management_weed13.pdf|J. Meza, Y. Luo, S. Khan, J. Zhao, Y. Xie, and O. Mutlu, "A Case for Efficient Hardware-Software Cooperative Management of Storage and Memory" WEED 2013}} | ||
- | * {{phase-changetechnologyandthefutureofmainmemory.pdf |B. C. Lee, P. Zhou, J. Yang, Y. Zhang, B. Zhao, E. Ipek, O. Mutlu, and D. Burger "Phase Change Technology and the Future of Main Memory" IEEE Micro Top Picks 2010}} | ||
- | * {{pdram_ahybridpramanddrammainmemorysystem.pdf | G. Dhiman, R. Ayoub, T. Rosing "PDRAM: A hybrid PRAM and DRAM main memory system" DAC 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/banshee-bandwidth-efficient-DRAM-cache_micro17.pdf|X. Yu, C. J. Hughes, N. Satish, O. Mutlu, and S. Devadas, "Banshee: Bandwidth-Efficient DRAM Caching via Software/Hardware Cooperation" MICRO 2017}} | ||
- | * [[https://dl.acm.org/citation.cfm?id=524799|F.G. Soltis, "Inside the AS/400" Duke Press, Loveland, CO, 1996]] | ||
- | |||
- | ===== Lecture 14a (1.11 Thu.) ===== | ||
- | === Required (Lecture 14a): === | ||
- | === Recommended (Lecture 14a): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/persistent-memory-management_weed13.pdf|J. Meza, Y. Luo, S. Khan, J. Zhao, Y. Xie, and O. Mutlu, "A Case for Efficient Hardware-Software Cooperative Management of Storage and Memory" WEED 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/ThyNVM-transparent-crash-consistency-for-persistent-memory_micro15.pdf|J. Ren, J. Zhao, S. Khan, J. Choi, Y. Wu, and O. Mutlu, "ThyNVM: Enabling Software-Transparent Crash Consistency in Persistent Memory Systems," MICRO 2015}} | ||
- | * {{https://arxiv.org/pdf/1706.08642.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives," Proceedings of the IEEE 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/NVMove-byte-based-persistence-tool_inflow16.pdf|H. Chauhan, I. Calciu, V. Chidambaram, E. Schkufza, O. Mutlu, and P. Subrahmanyam, "NVMove: Helping Programmers Move to Byte-Based Persistence," INFLOW 2016}} | ||
- | |||
- | ===== Lecture 14b (1.11 Thu.) ===== | ||
- | === Required (lecture 14b): === | ||
- | === Recommended (lecture 14b): === | ||
- | * {{https://arxiv.org/pdf/1706.08642.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives," Proceedings of the IEEE 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-programming-vulnerabilities_hpca17.pdf|Y. Cai, S. Ghose, Y. Luo, K. Mai, O. Mutlu, E.F. Haratsch, "Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques," HPCA 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-correct-and-refresh_iccd12.pdf|Y. Cai, G. Yalcin, O. Mutlu, E. F. Haratsch, A. Cristal, O. Unsal, and K. Mai, "Flash Correct-and-Refresh: Retention-Aware Error Management for Increased Flash Memory Lifetime," ICCD 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/online-nand-flash-memory-channel-model_jsac16.pdf|Y. Luo, S. Ghose, Y. Cai, E.F. Haratsch, O. Mutlu, "Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory", JSAC, 2016}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-read-disturb-errors_dsn15.pdf|Y. Cai, Y. Luo, S. Ghose, E.F. Haratsch, K. Mai, O. Mutlu, "Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery", DSN, 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/warm-flash-write-hotness-aware-retention-management_msst15.pdf|Y. Luo, Y. Cai, S. Ghose, J. Choi, O. Mutlu, "WARM: Improving NAND Flash Memory Lifetime with Write-Hotness Aware Retention Management", MSST, 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-data-retention_hpca15.pdf|Y. Cai, Y. Luo, E.F. Haratsch, K. Mai, O. Mutlu, "Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery", HPCA, 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/neighbor-assisted-error-correction-in-flash_sigmetrics14.pdf|Y. Cai, G. Yalcin, O. Mutlu, E.F. Haratsch, O. Unsal, A. Cristal, K. Mai, "Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories", SIGMETRICS, 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-programming-interference_iccd13.pdf|Y. Cai, O. Mutlu, E.F. Haratsch, K. Mai, "Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation", ICCD, 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-error-analysis-and-management_itj13.pdf|Y. Cai, G. Yalcin, O. Mutlu, E.F. Haratsch, A. Cristal, O.S. Unsal, K. Mai, "Error Analysis and Retention-Aware Error Management for NAND Flash Memory", ITJ, 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-voltage-characterization_date13.pdf|Y. Cai, E.F. Haratsch, O. Mutlu, K. Mai, "Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis, and Modeling", DATE, 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-error-patterns_date12.pdf|Y. Cai, E.F. Haratsch, O. Mutlu, K. Mai, "Error Patterns in MLC NAND Flash Memory: Measurement, Characterization, and Analysis", DATE, 2012}} | ||
- | * {{https://www.usenix.org/system/files/conference/fast16/fast16-papers-schroeder.pdf|B. Schroeder, R. Lagisetty, A. Merchant, "Flash Reliability in Production: The Expected and the Unexpected.", FAST, 2016}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-chip-off-forensics-reliability_dfrws17.pdf|A. Fukami, S. Ghose, Y. Luo, Y. Cai, O. Mutlu, "Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices", Digital Investigation, 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/3D-NAND-flash-lifetime-early-retention-loss-and-process-variation_sigmetrics18_pomacs18-twocolumn.pdf|Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu, "Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation," SIGMETRICS, 2018}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/heatwatch-3D-nand-errors-and-self-recovery_hpca18.pdf|Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu, "HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness," HPCA, 2018}} | ||
- | * {{https://arxiv.org/pdf/1711.11427.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery," arXiv, 2017}} | ||
- | * {{https://users.ece.cmu.edu/~omutlu/pub/flash-memory-failures-in-the-field-at-facebook_sigmetrics15.pdf|J. Meza, Q. Wu, S. Kumar, O. Mutlu, "A Large-Scale Study of Flash Memory Failures in the Field," SIGMETRICS, 2015}} | ||
- | |||
- | ===== Lecture 15 (14.11 Wed.) ===== | ||
- | === Required (lecture 15): === | ||
- | === Recommended (lecture 15): === | ||
- | * {{https://arxiv.org/pdf/1706.08642.pdf|Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives" Proceedings of the IEEE, 2017.}} | ||
- | * {{https://arxiv.org/pdf/1711.11427.pdf|Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu, "Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery" Invited Book Chapter in Inside Solid State Drives, 2018.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-programming-interference_iccd13.pdf | Yu Cai, Onur Mutlu, Erich F. Haratsch, and Ken Mai, "Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation" Proceedings of the 31st IEEE International Conference on Computer Design (ICCD), Asheville, NC, October 2013.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/online-nand-flash-memory-channel-model_jsac16.pdf|Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, and Onur Mutlu, "Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory" to appear in IEEE Journal on Selected Areas in Communications (JSAC), 2016.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/neighbor-assisted-error-correction-in-flash_sigmetrics14.pdf|Yu Cai, Gulay Yalcin, Onur Mutlu, Erich F. Haratsch, Osman Unsal, Adrian Cristal, and Ken Mai, "Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories" Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Austin, TX, June 2014.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-read-disturb-errors_dsn15.pdf|Yu Cai, Yixin Luo, Saugata Ghose, Erich F. Haratsch, Ken Mai, and Onur Mutlu, "Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery" Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Rio de Janeiro, Brazil, June 2015.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-data-retention_hpca15.pdf|Yu Cai, Yixin Luo, Erich F. Haratsch, Ken Mai, and Onur Mutlu, "Data Retention in MLC NAND Flash Memory: Characterization, Optimization and Recovery" Proceedings of the 21st International Symposium on High-Performance Computer Architecture (HPCA), Bay Area, CA, February 2015.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-failures-in-the-field-at-facebook_sigmetrics15.pdf|Justin Meza, Qiang Wu, Sanjeev Kumar, and Onur Mutlu, "A Large-Scale Study of Flash Memory Failures in the Field" Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Portland, OR, June 2015.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-programming-vulnerabilities_hpca17.pdf|Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, and Erich F. Haratsch, "Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques" Proceedings of the 23rd International Symposium on High-Performance Computer Architecture (HPCA) Industrial Session, Austin, TX, USA, February 2017.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/heatwatch-3D-nand-errors-and-self-recovery_hpca18.pdf|Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, and Onur Mutlu, "HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature-Awareness" Proceedings of the 24th International Symposium on High-Performance Computer Architecture (HPCA), Vienna, Austria, February 2018.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/3D-NAND-flash-lifetime-early-retention-loss-and-process-variation_sigmetrics18_pomacs18-twocolumn.pdf|Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, and Onur Mutlu, "Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation" Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Irvine, CA, USA, June 2018.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/warm-flash-write-hotness-aware-retention-management_msst15.pdf|Y. Luo, Y. Cai, S. Ghose, J. Choi, O. Mutlu, "WARM: Improving NAND Flash Memory Lifetime with Write-Hotness Aware Retention Management", MSST, 2015}} | ||
- | |||
- | ===== Lecture 16 (15.11 Thu.) ===== | ||
- | === Described in detail during lecture 16: === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mph_usenix_security07.pdf|T. Moscibroda and O. Mutlu. "Memory performance attacks: denial of memory service in multi-core systems," USENIX Security Symposium 2007.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/stfm_micro07.pdf|O. Mutlu and T. Moscibroda, "Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors," MICRO 2007}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/parbs_isca08.pdf|O. Mutlu and T. Moscibroda, "Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems," ISCA 2008}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/atlas_hpca10.pdf|Y. Kim, D. Han, O. Mutlu, M. Harchol-Balter, “ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers,” HPCA 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tcm_micro10.pdf|Y. Kim, M. Papamichael, O. Mutlu, M. Harchol-Balter, “Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior,” MICRO 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/bliss-memory-scheduler_ieee-tpds16.pdf|L. Subramanian, D. Lee, V. Seshadri, H. Rastogi, O. Mutlu, “BLISS: Balancing Performance, Fairness, and Complexity in Memory Access Scheduling,” IEEE TPDS 2016}} | ||
- | |||
- | === Suggested (lecture 16): === | ||
- | * {{Kim_PACT2004.pdf|S. Kim, D. Chandra, Y. Solihin, "Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture," PACT 2004}} | ||
- | * {{p128-rixner.pdf |S. Rixner, W.J. Dally, U.J. Kapasi, P. Mattson, J.D. Owens, “Memory Access Scheduling,” ISCA 2000.}} | ||
- | * {{US5630096.pdf | W.K. Zuravleff and T. Robinson, “Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order” US Patent 5,630,096, May 1997.}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pvc-qos_micro09.pdf|B. Grot, S.W. Keckler, O. Mutlu, “Preemptive Virtual Clock: A Flexible, Efficient, and Cost-effective QoS Scheme for Networks-on-Chip,” MICRO 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/memory-channel-partitioning-micro11.pdf|S.P. Muralidhara, L. Subramanian, O. Mutlu, M. Kandemir, T. Moscibroda, “Reducing Memory Interference in Multicore Systems via Application-aware Memory Channel Partitioning,” MICRO 2011}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/parallel-memory-scheduling_micro11.pdf|E. Ebrahimi, R. Miftakhutdinov, C. Fallin, C.J. Lee, O. Mutlu, Y.N. Patt, “Parallel Application Memory Scheduling,” MICRO 2011}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/staged-memory-scheduling_isca12.pdf|R. Ausavarungnirun, K. Chang, L. Subramanian, G. Loh, O. Mutlu, “Staged Memory Scheduling: Achieving High Performance and Scalability in Heterogeneous Systems,” ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mise-predictable_memory_performance-hpca13.pdf|L. Subramanian, V. Seshadri, Y. Kim, B. Jaiyen, and O. Mutlu, "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/application-slowdown-model_micro15.pdf|L. Subramanian, V. Seshadri, A. Ghosh, S. Khan, and O. Mutlu, "The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory," MICRO 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dash_deadline-aware-heterogeneous-memory-scheduler_taco16.pdf|H. Usui, L. Subramanian, K. Chang, O. Mutlu, “DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators,” ACM TACO 2016}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/prefetchaware-shared-resources_isca11.pdf|E. Ebrahimi, C.J. Lee, O. Mutlu, Y.N. Patt, "Prefetch-Aware Shared Resource Management for Multi-Core Systems," ISCA 2011}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/coordinated-prefetching_micro09.pdf|E. Ebrahimi, O. Mutlu, C.J. Lee, Y.N. Patt, "Coordinated Control of Multiple Prefetchers in Multi-Core Systems," MICRO 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/bandwidth_lds_hpca09.pdf|E. Ebrahimi, O. Mutlu, Y.N. Patt, "Techniques for Bandwidth-Efficient Prefetching of Linked Data Structures in Hybrid Prefetching Systems," HPCA 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/prefetch-dram_micro08.pdf|C.J. Lee, O. Mutlu, V. Narasiman, Y.N. Patt, "Prefetch-Aware DRAM Controllers," MICRO 2008}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/dram-aware-caches-TR-HPS-2010-002.pdf|C.J. Lee, V. Narasiman, E. Ebrahimi, O. Mutlu, Y.N. Patt, "DRAM-Aware Last-Level Cache Writeback: Reducing Write-Caused Interference in Memory Systems" HPS Technical Report, TR-HPS-2010-002, April 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/firm-persistent-memory-scheduling_micro14.pdf|J. Zhao, O. Mutlu, Y. Xie, "FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems," MICRO 2014}} | ||
- | |||
- | ===== Lecture 17 (21.11 Wed.) ===== | ||
- | === Recommended (lecture 17): === | ||
- | |||
- | * {{staged-memory-scheduling_isca12.pdf|R. Ausavarungnirun, K. K. Chang, L. Subramanian, G. H. Loh, and O. Mutlu. "Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems." ISCA, 2012}} | ||
- | * {{dash_deadline-aware-heterogeneous-memory-scheduler_taco16.pdf|H. Usui, L. Subramanian, K. Chang, O. Mutlu, “DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators,” ACM TACO, 2016}} | ||
- | * {{cpugpumc_dac12.pdf|M.K. Jeong, M. Erez, C. Sudanthi, and N. Paver, "A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC," DAC, 2012}} | ||
- | * {{p128-rixner.pdf| S. Rixner, W.J. Dally, U.J. Kapasi, P. Mattson, J.D. Owens, "Memory access scheduling," ISCA 2000}} | ||
- | * {{us5630096.pdf| W.K. Zuravleff and T. Robinson, "Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order," US Patent 5,630,096, 1997}} | ||
- | * {{fst_asplos10.pdf|E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt, "Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems," ASPLOS 2010}} | ||
- | * {{mise-predictable_memory_performance-hpca13.pdf|L. Subramanian, V. Seshadri, Y. Kim, B. Jaiyen, and O. Mutlu, "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems," HPCA 2013}} | ||
- | * {{stfm_micro07.pdf|O. Mutlu and T. Moscibroda, "Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors," MICRO 2007}} | ||
- | * {{per-thread_cycle_accounting_in_multicore_processors.pdf|K. D. Bois, S. Eyerman, L. Eeckhout, "Per-thread cycle accounting in multicore processors," TACO 2013}} | ||
- | * {{application-slowdown-model_micro15.pdf|L. Subramanian, V. Seshadri, A. Ghosh, S. Khan, and O. Mutlu, "The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory," MICRO 2015}} | ||
- | * {{parallel-memory-scheduling_micro11.pdf|E. Ebrahimi, R. Miftakhutdinov, C. Fallin, C.J. Lee, O. Mutlu, Y.N. Patt, “Parallel Application Memory Scheduling,” MICRO 2011}} | ||
- | * {{memory-channel-partitioning-micro11.pdf|S.P. Muralidhara, L. Subramanian, O. Mutlu, M. Kandemir, T. Moscibroda, “Reducing Memory Interference in Multicore Systems via Application-aware Memory Channel Partitioning,” MICRO 2011}} | ||
- | |||
- | |||
- | ===== Lecture 18a (22.11 Thu.) ===== | ||
- | === Recommended (lecture 18a): === | ||
- | |||
- | * {{fst_asplos10.pdf|E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt, "Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems," ASPLOS 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/hetero-adaptive-source-throttling_sbacpad12.pdf|K. Chang, R. Ausavarungnirun, C. Fallin, and O. Mutlu, "HAT: Heterogeneous Adaptive Throttling for On-Chip Networks," SBAC-PAD, 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/onchip-network-congestion-scalability_sigcomm2012.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, and S. Seshan, "On-Chip Networks from a Networking Perspective: Congestion and Scalability in Many-core Interconnects," SIGCOMM, 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/application-to-core-mapping_hpca13.pdf|R. Das, R. Ausavarungnirun, O. Mutlu, A. Kumar, and M. Azimi, "Application-to-Core Mapping Policies to Reduce Memory System Interference in Multi-Core Systems," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/architecture-aware-distributed-resource-management_vee15.pdf|H. Wang, C. Isci, L. Subramanian, J. Choi, D. Qian, and O. Mutlu, "A-DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters," VEE 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/decoupled-dma_pact15.pdf|D. Lee, L. Subramanian, R. Ausavarungnirun, J. Choi, and O. Mutlu, "Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM," PACT 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf|H. Hassan, N. Vijaykumar, S. Khan, S. Ghose, K. Chang, G. Pekhimenko, D. Lee, O. Ergin, and O. Mutlu, "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies," HPCA 2017}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mise-predictable_memory_performance-hpca13.pdf|L. Subramanian, V. Seshadri, Y. Kim, B. Jaiyen, and O. Mutlu, "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/application-slowdown-model_micro15.pdf|L. Subramanian, V. Seshadri, A. Ghosh, S. Khan, and O. Mutlu, "The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory," MICRO 2015}} | ||
- | * {{https://users.ece.cmu.edu/~omutlu/pub/app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C. R. Das, "Aergia: Exploiting Packet Latency Slack in On-Chip Networks," ISCA 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pvc-qos_micro09.pdf|B. Grot, S.W. Keckler, O. Mutlu, “Preemptive Virtual Clock: A Flexible, Efficient, and Cost-effective QoS Scheme for Networks-on-Chip,” MICRO 2009}} | ||
- | * {{https://users.ece.cmu.edu/~omutlu/pub/app-aware-noc_micro09.pdf|B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu, "Kilo-NOC: A Heterogeneous Network-on-Chip Architecture for Scalability and Service Guarantees," ISCA 2011}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mecs_hpca09.pdf|B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu, "Express Cube Topologies for On-Chip Interconnects," HPCA 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/SlimNoC_asplos18.pdf|M. Besta, S. M. Hassan, S. Yalamanchili, R. Ausavarungnirun, O. Mutlu, T. Hoefler, "Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability," ASPLOS 2018}} | ||
- | * {{https://users.ece.cmu.edu/~omutlu/pub/bless_isca09.pdf|T. Moscibroda and O. Mutlu, “A Case for Bufferless Routing in On-Chip Networks,” ISCA 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/chipper_hpca11.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router," HPCA 2011}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/minimally-buffered-deflection-router_nocs12.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect," NOCS 2012}} | ||
- | * {{http://users.ece.cmu.edu/~omutlu/pub/hierarchical-rings-with-deflection_sbacpad14.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. K. Chang, G. Nazario, R. Das, G. H. Loh, O. Mutlu, “Design and Evaluation of Hierarchical Rings with Deflection Routing,” SBAC-PAD 2014}} | ||
- | * {{https://www.sciencedirect.com/science/article/pii/S0167819116000399|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "A Case for Hierarchical Rings with Deflection Routing: An Energy-Efficient On-Chip Communication Substrate," PARCO 2016}} and {{https://arxiv.org/pdf/1602.06005.pdf|arXiv.org version}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/bufferless-and-minimally-buffered-deflection-routing_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing," Invited Book Chapter in Routing Algorithms in Networks-on-Chip, Springer, 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/onchip-network-congestion-scalability_sigcomm2012.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, and S. Seshan, "On-Chip Networks from a Networking Perspective: Congestion and Scalability in Many-core Interconnects," SIGCOMM 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/on-chip-network-application-slowdown-estimation_iccd16.pdf|X. Xiang, S. Ghose, O. Mutlu, and N. Tzeng, "A Model for Application Slowdown Estimation in On-Chip Networks and Its Use for Improving System Fairness and Performance," ICCD, 2016}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/carpool-bufferless-network_ics17.pdf|X. Xiang, W. Shi, S. Ghose, L. Peng, O. Mutlu, and N. Tzeng, "Carpool: A Bufferless On-Chip Network Supporting Adaptive Multicast and Hotspot Alleviation," ICS 2017}} | ||
- | |||
- | ===== Lecture 18b (22.11 Thu.) ===== | ||
- | === Recommended (lecture 18b): === | ||
- | |||
- | * {{utility-based-partitioning.pdf|M. Qureshi and Y. Patt, "Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches," MICRO 2006}} | ||
- | * {{hpca02.pdf|G. E. Suh, S. Devadas, and L. Rudolph. "A New Memory Monitoring Scheme for Memory-Aware Scheduling and Partitioning," HPCA 2002}} | ||
- | * {{faircache.pdf|S. Kim, D. Chandra, and Y. Solihin. "Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture," PACT 2004}} | ||
- | * {{adaptive.pdf|M. K. Qureshi, "Adaptive Spill-Receive for robust high-performance caching in CMPs," HPCA 2009}} | ||
- | * {{reactivenuca.pdf|N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki, "Reactive NUCA: near-optimal block placement and replication in distributed caches," ISCA 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/qureshi_isca06.pdf| M. K. Qureshi, D. N. Lynch, O. Mutlu, Y. N. Patt, "A Case for MLP-Aware Cache Replacement," ISCA 2006}} | ||
- | * {{p381-qureshi.pdf| M. K. Qureshi, A. Jaleel, Y. N. Patt, S. C. Steely Jr., J. Emer, “Adaptive Insertion Policies for High Performance Caching,” ISCA 2007}} | ||
- | * {{evictedaddressfilter.pdf| V. Seshadri, O. Mutlu, M. A. Kozuch, T. C. Mowry, "The Evicted-Address Filter: A Unified Mechanism to Address Both Cache Pollution and Thrashing," PACT 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/bdi-compression_pact12.pdf| G. Pekhimenko, V. Seshadri, O. Mutlu, P. B. Gibbons, M. A. Kozuch, and T. C. Mowry, "Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches," PACT 2012}} | ||
- | * {{fedorova2007.pdf|A. Fedorova, M. Seltzer, M. D. Smith, "Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler," PACT 2007}} | ||
- | * {{lin2008.pdf|J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan, "Gaining Insights into Multi-Core Cache Partitioning: Bridging the Gap between Simulation and Real Systems," HPCA 2008}} | ||
- | * {{jin2006.pdf|S. Cho and L. Jin, "Managing Distributed, Shared L2 Caches through OS-Level Page Allocation," MICRO 2006}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/acs_ieee_micro10.pdf|M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, "Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures," IEEE Micro Top Picks 2010}} | ||
- | * {{p211-kim.pdf|C. Kim, D. Burger, and S. W. Keckler, "An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches," ASPLOS 2002}} | ||
- | |||
- | ===== Lecture 19a (28.11 Wed.) ===== | ||
- | === Described in detail during lecture 19a: === | ||
- | * {{adaptive.pdf|M. K. Qureshi, "Adaptive Spill-Receive for robust high-performance caching in CMPs," HPCA 2009}} | ||
- | * {{p211-kim.pdf|C. Kim, D. Burger, and S. W. Keckler. "An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches," ASPLOS '02}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/bdi-compression_pact12.pdf| G. Pekhimenko, V. Seshadri, O. Mutlu, P.B. Gibbons, M.A. Kozuch, T.C. Mowry, “Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches,” PACT 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/eaf-cache_pact12.pdf|V. Seshadri, O. Mutlu, M. A. Kozuch, and T. C. Mowry, "The evicted-address filter: A unified mechanism to address both cache pollution and thrashing," PACT'12}} | ||
- | * {{p208-jaleel.pdf|A. Jaleel, W. Hasenplaugh, M. Qureshi, J. Sebot, S. Steely, Jr., and J. Emer, "Adaptive insertion policies for managing shared caches," PACT '08}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/application-slowdown-model_micro15.pdf|L. Subramanian, V. Seshadri, A. Ghosh, S. Khan, and O. Mutlu, "The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory," MICRO 2015}} | ||
- | |||
- | === Recommended (lecture 19a): === | ||
- | |||
- | * {{cooperativecaching.pdf|J. Chang and G. S. Sohi, "Cooperative Caching for Chip Multiprocessors," ISCA '06}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/acs_asplos09.pdf|M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, "Accelerating critical section execution with asymmetric multi-core architectures," ASPLOS'09}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tldram_hpca13.pdf|D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, O. Mutlu, "Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/qureshi_isca06.pdf|M.K. Qureshi, D.N. Lynch, O. Mutlu, Y.N. Patt. "A Case for MLP-Aware Cache Replacement". ISCA 2006}} | ||
- | * {{zero.pdf|M. M. Islam and P. Stenstrom, "Zero-Value Caches: Cancelling Loads that Return Zero," PACT'09}} | ||
- | * {{p258-yang.pdf|J. Yang, Y. Zhang, and R. Gupta, "Frequent value compression in data caches," MICRO'00}} | ||
- | * {{http://ftp.cs.wisc.edu/pub/techreports/2004/TR1500.pdf|A. R. Alameldeen and D. A. Wood, "Frequent pattern compression: A significance-based compression scheme for L2 caches," Dept. Comp. Sci., Univ. Wisconsin-Madison, Tech. Rep. 1500, 2004}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/toggle-aware-compression-for-GPUs_hpca16.pdf|G. Pekhimenko, E. Bolotin, N. Vijaykumar, O. Mutlu, T. C. Mowry, and S. W. Keckler, "A Case for Toggle-Aware Compression for GPU Systems". HPCA'16}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/caba-gpu-assist-warps_isca15.pdf|N. Vijaykumar, G. Pekhimenko, A. Jog, A. Bhowmick, R. Ausavarungnirun, C. Das, M. Kandemir, T. C. Mowry, and O. Mutlu, "A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Flexible Data Compression with Assist Warps". ISCA'15}} | ||
- | * {{p93-tyson.pdf|G. Tyson, M. Farrens, J. Matthews, and A. R. Pleszkun, "A modified approach to data cache management," MICRO'95}} | ||
- | * {{deadblock.pdf|A.C. Lai, C. Fide and B. Falsafi, "Dead-block prediction & dead-block correlating prefetchers," ISCA'01}} | ||
- | * {{p422-bloom.pdf|B.H. Bloom, "Space/Time Trade-offs in Hash Coding with Allowable Errors," CACM, 1970}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/mise-predictable_memory_performance-hpca13.pdf|L. Subramanian, V. Seshadri, Y. Kim, B. Jaiyen, and O. Mutlu, "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems," HPCA 2013}} | ||
- | |||
- | |||
- | ===== Lecture 19b (28.11 Wed.) ===== | ||
- | === Recommended (lecture 19b): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf|J. Liu, B. Jaiyen, R. Veras, O. Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tldram_hpca13.pdf|D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, O. Mutlu, "Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture," HPCA 2013}} | ||
- | * {{2007.TileInterconnection.IEEEMicro.pdf|D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C. C. Miao, J. F. Brown III, and A. Agarwal, "On-Chip Interconnection Architecture of the Tile Processor". IEEE Micro 2007}} | ||
- | |||
- | |||
- | ===== Lecture 20 (29.11 Thu.) ===== | ||
- | === Recommended (lecture 20): === | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/acs_asplos09.pdf|M. Aater Suleman, Onur Mutlu, Moinuddin K. Qureshi, and Yale N. Patt, "Accelerating critical section execution with asymmetric multi-core architectures," ASPLOS'09}} | ||
- | * {{bottleneck-identification-and-scheduling_asplos12.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt, "Bottleneck Identification and Scheduling in Multithreaded Applications". ASPLOS'12}} | ||
- | * {{d7ce51c62671d5ffc1506786b0b7861ce00a.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt, "Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs". ISCA'13}} | ||
- | * {{22310236.pdf| Ed Grochowski, Ronny Ronen, John Shen, and Hong Wang, "Best of Both Latency and Throughput". ICCD 2004}} | ||
- | * {{amdahl.pdf|G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}} | ||
- | * {{05389044.pdf|J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy, "POWER4 System Microarchitecture". IBM J R&D 2002}} | ||
- | * {{719990eaab63a6bfa2988b5fd57a03b13229.pdf| Ron Kalla, Balaram Sinharoy, and Joel M. Tendler, "IBM Power5 Chip: A Dual-Core Multithreaded Processor". IEEE Micro 2004}} | ||
- | * {{ :niagara_a_32-way_multithreaded_sparc_processor.pdf | P. Kongetira, K. Aingaran, and K. Olukotun, "Niagara: A 32-Way Multithreaded Sparc Processor", IEEE Micro 2005}} | ||
- | * {{p441-suleman.pdf|M. Aater Suleman, Onur Mutlu, Jose A. Joao, Khubaib, Yale N. Patt, "Data Marshaling for Multi-Core Architectures". ISCA'10, IEEE Micro Top Picks 2011}} | ||
- | * {{dk52.pdf|Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, Parthasarathy Ranganathan, and Dean M. Tullsen, "Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction". MICRO 2003}} | ||
- | * {{01431565.pdf| M. Annavaram, E. Grochowski, J. Shen, “Mitigating Amdahl’s Law Through EPI Throttling,” ISCA 2005}} | ||
- | * {{http://people.inf.ethz.ch/omutlu/pub/onur-Asymmetry-Everywhere-talk.pdf|O. Mutlu, "Asymmetry Everywhere (with Automatic Resource Management)," CRA Workshop on Advanced Computer Architecture Research 2010}} | ||
- | * {{http://users.ece.cmu.edu/~omutlu/pub/heterogeneous-block-architecture_iccd14.pdf|C. Fallin, C. Wilkerson, O. Mutlu, "The Heterogeneous Block Architecture," ICCD'14}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/atlas_hpca10.pdf|Y. Kim, D. Han, O. Mutlu, M. Harchol-Balter, “ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers,” HPCA 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tcm_micro10.pdf|Y. Kim, M. Papamichael, O. Mutlu, M. Harchol-Balter, “Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior,” MICRO 2010}} | ||
- | * {{http://users.ece.cmu.edu/~omutlu/pub/noc-congestion_hotnets10.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, "Next Generation On-chip Networks: What Kind of Congestion Control Do We Need?" HotNets 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/timber-fine-grained-dram-cache_ieee-cal12.pdf|J. Meza, J. Chang, H. Yoon, O. Mutlu, and P. Ranganathan, "Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management," CAL 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/rowbuffer-aware-caching_iccd12.pdf|H. Yoon, J. Meza, R. Ausavarungnirun, R. Harding, and O. Mutlu, "Row Buffer Locality Aware Caching Policies for Hybrid Memories," ICCD 2012}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/heterogeneous-reliability-memory-for-data-centers_dsn14.pdf| Y. Luo, S. Govindan, B. Sharma, M. Santaniello, J. Meza, A. Kansal, J. Liu, B. Khessib, K. Vaid, O. Mutlu, "Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory," DSN 2014}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/tldram_hpca13.pdf|D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, O. Mutlu, "Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture," HPCA 2013}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/raidr-dram-refresh_isca12.pdf|J. Liu, B. Jaiyen, R. Veras, O. Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012}} | ||
- | |||
- | ===== Lecture 21 (05.12 Wed.) ===== | ||
- | === Suggested (lecture 21): === | ||
- | * {{https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html|NVIDIA, "CUDA C Programming Guide," Version 9.0, 2018}} | ||
- | * {{https://www.sciencedirect.com/science/book/9780128119860|D.B. Kirk and W.M. Hwu, "Programming Massively Parallel Processors. A Hands-on Approach," Third Edition, 2017}} | ||
- | * {{p140-fisher.pdf|J.A. Fisher, “Very Long Instruction Word architectures and the ELI-512,” ISCA 1983}} | ||
- | * {{Sung_2012.pdf|I.J. Sung, G.D. Liu, W.M. Hwu, "DL: A Data Layout Transformation System for Heterogeneous Computing," INPAR 2012}} | ||
- | * {{pseudo-randomly_interleaved_memory.pdf|B. R. Rau, "Pseudo-Randomly Interleaved Memory," ISCA 1991}} | ||
- | * {{Braak_2016.pdf|G.J.v.d. Braak, J. Gomez-Luna, J.M. Gonzalez-Linares, H. Corporaal, N. Guil, "Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs," IEEE TC, 2016}} | ||
- | * {{GomezLuna_2013.pdf|J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil, "Performance Modeling of Atomic Additions on GPU Scratchpad Memory," IEEE TPDS, 2013}} | ||
- | * {{GomezLuna_2012.pdf|J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil, "Performance Models for Asynchronous Data Transfers on Consumer Graphics Processing Units," JPDC, 2012}} | ||
- | * {{GomezLuna_2017.pdf|J. Gomez-Luna, I. E. Hajj, L. Chang, V. Garcia-Flores, S. G. de Gonzalo, T. B. Jablin, A. J. Peña, W. Hwu, "Chai: Collaborative heterogeneous applications for integrated-architectures," ISPASS 2017}} | ||
- | |||
- | ===== Lecture 22 (06.12 Thu.) ===== | ||
- | === Required (lecture 22): === | ||
- | * {{amdahl.pdf|G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}} | ||
- | * {{lamport.pdf|L. Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs," IEEE Transactions on Computers, 1979}} | ||
- | * {{a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|M. S. Papamarcos and J. H. Patel, "A low-overhead coherence solution for multiprocessors with private cache memories," ISCA 1984}} | ||
- | === Described in detail during lecture 22: === | ||
- | * {{using_cache_memory_to_reduce_processor-memory_traffic.pdf|J. R. Goodman, "Using cache memory to reduce processor-memory traffic," ISCA 1983}} | ||
- | * {{a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|M. S. Papamarcos and J. H. Patel, "A low-overhead coherence solution for multiprocessors with private cache memories," ISCA 1984}} | ||
- | * {{a_new_solution_to_coherence_problems_in_multicache_systems.pdf|L. M. Censier and P. Feautrier, "A new solution to coherence problems in multicache systems," IEEE Trans. Computers, 1978}} | ||
- | * {{token_coherence_decoupling_performance_and_correctness.pdf|M. Martin, M. D. Hill, and D. A. Wood, "Token coherence: decoupling performance and correctness," ISCA 2003}} | ||
- | === Recommended (lecture 22): === | ||
- | * {{flynn.pdf|M. J. Flynn, "Very High-Speed Computing Systems," Proc. of IEEE, 1966}} | ||
- | * {{multiprocessors-multicomputers.pdf|M. D. Hill, N. P. Jouppi, G. S. Sohi, "Multiprocessors and Multicomputers," pp. 551-560 in Readings in Computer Architecture.}} | ||
- | * {{memory_consistency_and_event_ordering_in_scalable_shared-memory_multiprocessors.pdf|K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. Hennessy, "Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors," ISCA 1990}} | ||
- | * {{two_techniques_to_enhance_the_performanc_of_memory_consistency_models.pdf|K. Gharachorloo, A. Gupta, and J. Hennessy, "Two Techniques to Enhance the Performance of Memory Consistency Models," ICPP 1991}} | ||
- | * {{bulksc_bulk_enforcement_of_sequential_consistency.pdf|L. Ceze, J. Tuck, P. Montesinos, and J. Torrellas, "BulkSC: bulk enforcement of sequential consistency," ISCA 2007}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/ThyNVM-transparent-crash-consistency-for-persistent-memory_micro15.pdf|J. Ren, J. Zhao, S. Khan, J. Choi, Y. Wu, and O. Mutlu, "ThyNVM: Enabling Software-Transparent Crash Consistency in Persistent Memory Systems," MICRO 2015}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/NVMove-byte-based-persistence-tool_inflow16.pdf|H. Chauhan, I. Calciu, V. Chidambaram, E. Schkufza, O. Mutlu, and P. Subrahmanyam, "NVMove: Helping Programmers Move to Byte-Based Persistence," INFLOW 2016}} | ||
- | * {{a_new_solution_to_coherence_problems_in_multicache_systems.pdf|L. M. Censier and P. Feautrier, "A new solution to coherence problems in multicache systems," IEEE Trans. Computers, 1978}} | ||
- | * {{using_cache_memory_to_reduce_processor-memory_traffic.pdf|J. R. Goodman, "Using cache memory to reduce processor-memory traffic," ISCA 1983}} | ||
- | * {{the_sgi_origin_a_ccnuma_highly_scalable_server.pdf|J. Laudon and D. Lenoski, "The SGI Origin: A ccNUMA Highly Scalable Server," ISCA 1997}} | ||
- | * {{token_coherence_decoupling_performance_and_correctness.pdf|M. Martin, M. D. Hill, and D. A. Wood, "Token coherence: decoupling performance and correctness," ISCA 2003}} | ||
- | * {{on_the_inclusion_properties_for_multi-level_cache_hierarchies.pdf|J. Baer and W. Wang, "On the inclusion properties for multi-level cache hierarchies," ISCA 1988}} | ||
- | * {{designofacomputer_cdc6600.pdf|J. E. Thornton, "CDC 6600: Design of a Computer," 1970}} | ||
- | * {{a_pipelined_shared_resource_mimd_computer.pdf | B. J. Smith, "A Pipelined, Shared Resource MIMD Computer", ICPP 1978}} | ||
- | * {{a_new_method_of_solving_numerical_equations_of_all_orders_by_continuous_.pdf|W. G. Horner, "A new method of solving numerical equations of all orders, by continuous approximation," Philosophical Transactions of the Royal Society, 1819}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/acs_asplos09.pdf|M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, "Accelerating critical section execution with asymmetric multi-core architectures," ASPLOS'09}} | ||
- | * {{co-operating_sequential_processes.pdf|E. W. Dijkstra, "Cooperating Sequential Processes," 1965}} | ||
- | * {{culler_parcomparch_5.1.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.1 (pp 269–283)}} | ||
- | * {{culler_parcomparch_5.3.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.3 (pp 291-305)}} | ||
- | * {{ph_computerorganizationanddesignthehardwaresoftwareinterface5th_5.10.pdf|P&H, Computer Organization and Design, Chapter 5.10 (pp 466-470)}} | ||
- | |||
- | ===== Lecture 23 (12.12 Wed.) ===== | ||
- | === Described in detail during lecture 23: === | ||
- | * {{bless_isca09.pdf|T. Moscibroda and O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks", ISCA 2009}} | ||
- | |||
- | === Suggested (lecture 23): === | ||
- | * {{app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C. R. Das, "Application-Aware Prioritization Mechanisms for On-Chip Networks", MICRO 2009}} | ||
- | * {{ultrasparc.pdf|M. Shah, J. Barreh, J. Brooks, R. Golla, G. Grohoski, N. Gura, R. Hetherington, P. Jordan, M. Luttrell, C. Olson, B. Saha, D. Sheahan, L. Spracklen, and A. Wynn, "UltraSPARC T2: A Highly-Threaded, Power-Efficient, SPARC SOC", ASSCC 2007}} | ||
- | * {{7d2822e9b7fcd60f147823478b59fcf7569e.pdf|J. H. Patel, "Processor-memory interconnections for multiprocessors", ISCA 1979}} | ||
- | * {{Ultracomputer.pdf|A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir, "The NYU Ultracomputer - Designing an MIMD Shared Memory Parallel Computer", IEEE Trans. on Comp. 1983}} | ||
- | * {{hierarchical-rings-with-deflection_sbacpad14.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}} | ||
- | * {{p272-leiserson.pdf|C.E. Leiserson, Z.S. Abuhamdeh, D.C. Douglas, C.R. Feynman, M.N. Ganmukhi, J.V. Hill, D. Hillis, B.C. Kuszmaul, M.A. St. Pierre, D.S. Wells, M.C. Wong, S.-W. Yang, R. Zak, "The Network Architecture of the Connection Machine CM-5", SPAA 1992}} | ||
- | * {{seitz_cacm_1985.pdf|C. L. Seitz, "The Cosmic Cube", CACM 1985}} | ||
- | * {{L8-TurnModel-ISCA92.pdf|C. J. Glass and L. M. Ni, "The Turn Model for Adaptive Routing", ISCA 1992}} | ||
- | * {{maze-routing_nocs15.pdf|M. Fattah, A. Airola, R. Ausavarungnirun, N. Mirzaei, P. Liljeberg, J. Plosila, S. Mohammadi, T. Pahikkala, O. Mutlu, and H. Tenhunen, "A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips", NOCS 2015}} | ||
- | * {{Baran64.pdf|P. Baran, "On Distributed Communications Networks", IEEE Trans. Comm., 1964}} | ||
- | * {{bufferless_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited) 2014}} | ||
- | * {{virtual+channel.pdf|W. J. Dally, "Virtual Channel Flow Control", ISCA 1990}} | ||
- | |||
- | |||
- | |||
- | ===== Lecture 24 (13.12 Thu.) ===== | ||
- | === Described in detail during lecture 24: === | ||
- | * {{05749724.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}} | ||
- | * {{bufferless_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}} | ||
- | * {{06209256.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}} | ||
- | |||
- | === Suggested (lecture 24): === | ||
- | * {{app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C. R. Das, "Application-Aware Prioritization Mechanisms for On-Chip Networks", MICRO 2009}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/hetero-adaptive-source-throttling_sbacpad12.pdf|K. Chang, R. Ausavarungnirun, C. Fallin, and O. Mutlu, "HAT: Heterogeneous Adaptive Throttling for On-Chip Networks," SBAC-PAD, 2012}} | ||
- | * {{bless_isca09.pdf|T. Moscibroda and O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks", ISCA 2009}} | ||
- | * {{06970669.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. H. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}} | ||
- | * {{1-s2.0-s0167819116000399-main.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. H. Loh, and O. Mutlu, "A Case for Hierarchical Rings with Deflection Routing: An Energy-Efficient On-Chip Communication Substrate", PARCO 2016}} | ||
- | * {{p106-das.pdf|R. Das, O. Mutlu, T. Moscibroda, and C.R. Das, "Aergia: Exploiting Packet Latency Slack in On-Chip Networks", ISCA 2010}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/pvc-qos_micro09.pdf|B. Grot, S.W. Keckler, O. Mutlu, "Preemptive Virtual Clock: A Flexible, Efficient, and Cost-effective QoS Scheme for Networks-on-Chip", MICRO 2009}} | ||
- | * {{p401-grot.pdf|B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu, "Kilo-NOC: A Heterogeneous Network-on-Chip Architecture for Scalability and Service Guarantees", ISCA 2011}} | ||
- | * {{https://people.inf.ethz.ch/omutlu/pub/onchip-network-congestion-scalability_sigcomm2012.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, and S. Seshan, "On-Chip Networks from a Networking Perspective: Congestion and Scalability in Many-core Interconnects," SIGCOMM, 2012}} | ||
- | * {{http://users.ece.cmu.edu/~omutlu/pub/noc-congestion_hotnets10.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, "Next Generation On-chip Networks: What Kind of Congestion Control Do We Need?" HotNets 2010}} | ||