readings

Differences

This shows you the differences between two versions of the page.

readings [2020/12/05 16:22] geraldod
readings [2021/01/04 08:45] (current) – [Lecture 26 (31.12 Thu.)] firtinac
Line 640: Line 640:
   * {{https://core.ac.uk/download/pdf/207290433.pdf| H. W. Cain and P. Nagpurkar, “Runahead Execution vs. Conventional Data Prefetching in the IBM POWER6 Microprocessor,” ISPASS 2010}}
-  * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_isca05.pdf| Onur Mutlu, Hyesoon Kim, and Yale N. Patt, "Techniques for Efficient Processing in Runahead Execution Engines" ISCA 2005, MICRO TOP PICKS 2006}} 
   * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_dissertation.pdf| Onur Mutlu, "Efficient Runahead Execution Processors" Ph.D. Dissertation 2006}}   * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_dissertation.pdf| Onur Mutlu, "Efficient Runahead Execution Processors" Ph.D. Dissertation 2006}}
   * {{https://people.inf.ethz.ch/omutlu/pub/srinath_hpca07.pdf|S. Srinath, O. Mutlu, H. Kim, Y.N. Patt, "Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers," HPCA 2007}}   * {{https://people.inf.ethz.ch/omutlu/pub/srinath_hpca07.pdf|S. Srinath, O. Mutlu, H. Kim, Y.N. Patt, "Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers," HPCA 2007}}
Line 649: Line 648:
   * {{https://people.inf.ethz.ch/omutlu/pub/informed-caching-for-prefetching_taco15.pdf|V. Seshadri, S. Yedkar, H. Xin, O. Mutlu, P. P. Gibbons, M. A. Kozuch, and T. C. Mowry, "Mitigating prefetcher-caused pollution using informed caching policies for prefetched blocks," TACO 2015}}   * {{https://people.inf.ethz.ch/omutlu/pub/informed-caching-for-prefetching_taco15.pdf|V. Seshadri, S. Yedkar, H. Xin, O. Mutlu, P. P. Gibbons, M. A. Kozuch, and T. C. Mowry, "Mitigating prefetcher-caused pollution using informed caching policies for prefetched blocks," TACO 2015}}
   * {{https://people.inf.ethz.ch/omutlu/pub/orchestrated-gpgpu-scheduling-prefetching_isca13.pdf|A. Jog, O. Kayiran, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das, "Orchestrated scheduling and prefetching for GPGPUs," ISCA 2013}}   * {{https://people.inf.ethz.ch/omutlu/pub/orchestrated-gpgpu-scheduling-prefetching_isca13.pdf|A. Jog, O. Kayiran, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das, "Orchestrated scheduling and prefetching for GPGPUs," ISCA 2013}}
-  * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_ieee_micro06.pdf| O. Mutlu, H. Kim, and Y.N. Patt, “Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance,” ISCA 2005, IEEE Micro Top Picks 2006}}+  * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_isca05.pdf| O. Mutlu, H. Kim, and Y.N. Patt, “Techniques for Efficient Processing in Runahead Execution Engines,” ISCA 2005}} 
 +  * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_ieee_micro06.pdf| O. Mutlu, H. Kim, and Y.N. Patt, “Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance,” IEEE Micro Top Picks 2006}}
   * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_micro05.pdf| Onur Mutlu, Hyesoon Kim, and Yale N. Patt, "Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns" MICRO, 2005.}}   * {{https://people.inf.ethz.ch/omutlu/pub/mutlu_micro05.pdf| Onur Mutlu, Hyesoon Kim, and Yale N. Patt, "Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns" MICRO, 2005.}}
   * {{https://www.microarch.org/micro37/papers/11_Armstrong-WrongPath.pdf| David N. Armstrong, Hyesoon Kim, Onur Mutlu, Yale N. Patt,"Wrong Path Events: Exploiting Unusual and Illegal Program Behavior for Early Misprediction Detection and Recovery" MICRO, 2004.}}   * {{https://www.microarch.org/micro37/papers/11_Armstrong-WrongPath.pdf| David N. Armstrong, Hyesoon Kim, Onur Mutlu, Yale N. Patt,"Wrong Path Events: Exploiting Unusual and Illegal Program Behavior for Early Misprediction Detection and Recovery" MICRO, 2004.}}
Line 656: Line 656:
  
 === Described in detail during lecture 19b: === === Described in detail during lecture 19b: ===
-  * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=lecture1-amdahl.pdf| G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}}+  * {{lecture1-amdahl.pdf| G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}}
      
 === Suggested (lecture 19b): === === Suggested (lecture 19b): ===
-   * {{https://safari.ethz.ch/digitaltechnik/spring2020/lib/exe/fetch.php?media=flynn_1966.pdf|M.J. Flynn, “Very high-speed computing systems,” Proc. of IEEE 1966}} +   * {{flynn_1966.pdf|M.J. Flynn, “Very high-speed computing systems,” Proc. of IEEE 1966}} 
-   * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=multiprocessors-multicomputers.pdf| M. D. Hill, N. P. Jouppi, G. S. Sohi, "Multiprocessors and Multicomputers,” pp. 551-560 in Readings in Computer Architecture.}}+   * {{multiprocessors-multicomputers.pdf| M. D. Hill, N. P. Jouppi, G. S. Sohi, "Multiprocessors and Multicomputers,” pp. 551-560 in Readings in Computer Architecture.}}
    * {{|M. D. Hill, N. P. Jouppi, G. S. Sohi, "Dataflow and Multithreading,” pp. 309-314 in Readings in Computer Architecture.}}    * {{|M. D. Hill, N. P. Jouppi, G. S. Sohi, "Dataflow and Multithreading,” pp. 309-314 in Readings in Computer Architecture.}}
-  * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=how-to-make-a-multiprocessor-computer-that-correctly-executes-multiprocess-programs.pdf|L. Lamport. "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs". IEEE Trans. 1979}} +  * {{how-to-make-a-multiprocessor-computer-that-correctly-executes-multiprocess-programs.pdf|L. Lamport. "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs". IEEE Trans. 1979}} 
-  * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|M. S. Papamarcos and J. H. Patel, "A low-overhead coherence solution for multiprocessors with private cache memories," ISCA 1984}} +  * {{a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|M. S. Papamarcos and J. H. Patel, "A low-overhead coherence solution for multiprocessors with private cache memories," ISCA 1984}} 
-   * {{https://safari.ethz.ch/architecture/fall2017/lib/exe/fetch.php?media=culler_parcomparch_5.1.pdf| Culler and Singh, Parallel Computer Architecture, Chapter 5.1 (pp 269 – 283)}} +   * {{culler_parcomparch_5.1.pdf| Culler and Singh, Parallel Computer Architecture, Chapter 5.1 (pp 269 – 283)}} 
-   * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=culler_parcomparch_5.3.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.3 (pp 291 – 305)}} +   * {{culler_parcomparch_5.3.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.3 (pp 291 – 305)}} 
-   * {{https://safari.ethz.ch/architecture/fall2017/lib/exe/fetch.php?media=p_h_ch5.pdf |P&H, Computer Organization and Design,  Chapter 5.8 (pp 534 – 538 in 4th and 4th revised eds.)}}+   * {{p_h_ch5.pdf |P&H, Computer Organization and Design,  Chapter 5.8 (pp 534 – 538 in 4th and 4th revised eds.)}}
  
 === Mentioned (lecture 19b): === === Mentioned (lecture 19b): ===
    * {{https://archive.computerhistory.org/resources/text/CDC/cdc.6600.thornton.design_of_a_computer_the_control_data_6600.1970.102630394.pdf| J. E. Thornton,  “CDC 6600: Design of a Computer,” 1970}}    * {{https://archive.computerhistory.org/resources/text/CDC/cdc.6600.thornton.design_of_a_computer_the_control_data_6600.1970.102630394.pdf| J. E. Thornton,  “CDC 6600: Design of a Computer,” 1970}}
-   * {{https://safari.ethz.ch/digitaltechnik/spring2018/lib/exe/fetch.php?media=pipelined1978smith.pdf| B. Smith, “A pipelined, shared resource MIMD computer,” ICPP 1978 }}+   * {{pipelined1978smith.pdf| B. Smith, “A pipelined, shared resource MIMD computer,” ICPP 1978 }}
    * {{https://people.inf.ethz.ch/omutlu/pub/acs_asplos09.pdf|M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, "Accelerating critical section execution with asymmetric multi-core architectures," ASPLOS'09}}    * {{https://people.inf.ethz.ch/omutlu/pub/acs_asplos09.pdf|M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, "Accelerating critical section execution with asymmetric multi-core architectures," ASPLOS'09}}
-   * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=bottleneck-identification-and-scheduling_asplos12.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu,   and Yale N. Patt, "Bottleneck Identification and Scheduling in Multithreaded Applications". ASPLOS'12}} +   * {{bottleneck-identification-and-scheduling_asplos12.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu,   and Yale N. Patt, "Bottleneck Identification and Scheduling in Multithreaded Applications". ASPLOS'12}} 
-   * {{https://safari.ethz.ch/architecture/fall2019/lib/exe/fetch.php?media=d7ce51c62671d5ffc1506786b0b7861ce00a.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt, "Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs". ISCA'13}}+   * {{d7ce51c62671d5ffc1506786b0b7861ce00a.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt, "Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs". ISCA'13}}
    * {{https://people.inf.ethz.ch/omutlu/pub/dm_ieee_micro_top_picks11.pdf|M. A. Suleman, O. Mutlu, J. A. Joao, Khubaib, and Y. N. Patt, "Data Marshaling for Multi-core Systems", ISCA 2010}}    * {{https://people.inf.ethz.ch/omutlu/pub/dm_ieee_micro_top_picks11.pdf|M. A. Suleman, O. Mutlu, J. A. Joao, Khubaib, and Y. N. Patt, "Data Marshaling for Multi-core Systems", ISCA 2010}}
  
Line 704: Line 704:
    * {{on_the_inclusion_properties_for_multi-level_cache_hierarchies.pdf|J. Baer and W. Wang, "On the inclusion properties for multi-level cache hierarchies," ISCA 1988}}    * {{on_the_inclusion_properties_for_multi-level_cache_hierarchies.pdf|J. Baer and W. Wang, "On the inclusion properties for multi-level cache hierarchies," ISCA 1988}}
    * {{https://people.inf.ethz.ch/omutlu/pub/LazyPIM-coherence-for-processing-in-memory_ieee-cal16.pdf| A. Boroumand, S. Ghose, M. Patel, H. Hassan, B. Lucia, K. Hsieh, K.T. Malladi, H. Zheng, and O. Mutlu, "LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory," CAL 2016}}    * {{https://people.inf.ethz.ch/omutlu/pub/LazyPIM-coherence-for-processing-in-memory_ieee-cal16.pdf| A. Boroumand, S. Ghose, M. Patel, H. Hassan, B. Lucia, K. Hsieh, K.T. Malladi, H. Zheng, and O. Mutlu, "LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory," CAL 2016}}
 +===== Lecture 22 (27.12 Sun.) =====
 +=== Described in detail during lecture 22 ===
 +   * {{p239-gottlieb.pdf| A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, M. Snir, "The NYU Ultracomputer - Designing an MIMD Shared Memory Parallel Computer", ISCA 1982}}
 +   * {{0211027.pdf| L. G. Valiant, "A Scheme for Fast Parallel Communication", SIAM Journal on Computing, 1982}}
 +   * {{https://people.inf.ethz.ch/omutlu/pub/bless_isca09.pdf| T. Moscibroda and O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks", ISCA 2009}}
 +   * {{chipper_hpca11.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}}
 +   * {{bufferless-and-minimally-buffered-deflection-routing_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}}
 +   * {{minimally-buffered-deflection-router_nocs12.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}}
 +=== Suggested (lecture 22): ===
 +   * {{p168-patel.pdf| J. Patel, "Performance of Processor-Memory Interconnections for Multiprocessors", ISCA 1979}}
 +   * {{https://people.inf.ethz.ch/omutlu/pub/hierarchical-rings-with-deflection_sbacpad14.pdf| R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}}
 +   * {{https://arxiv.org/pdf/1602.06005.pdf| R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "A Case for Hierarchical Rings with Deflection Routing: An Energy-Efficient On-Chip Communication Substrate", PARCO 2016}}
 +   * {{p272-leiserson.pdf| C. E. Leiserson, Z. S. Abuhamdeh, D. C. Douglas, C. R. Feynman, M. N. Ganmukhi, J. V. Hill, D. Hillis, B. C. Kuszmaul, M. A. St. Pierre, D. S. Wells, M. C. Wong, Shaw-Wen Yang, R. Zak, "The network architecture of the Connection Machine CM-5", SPAA 1992}}
 +   * {{p22-seitz.pdf| C. L. Seitz, "The cosmic cube", CACM 1985}}
 +   * {{p278-glass.pdf| C. J. Glass, L. M. Ni, "The turn model for adaptive routing", ISCA 1992}}
 +   * {{p263-valiant.pdf| L.G. Valiant, G.J. Brebner, "Universal schemes for parallel communication", STOC 1981}}
 +   * {{https://people.inf.ethz.ch/omutlu/pub/maze-routing_nocs15.pdf| M. Fattah, A. Airola, R. Ausavarungnirun, N. Mirzaei, P. Liljeberg, J. Plosila, S. Mohammadi, T. Pahikkala, O. Mutlu, and H. Tenhunen, "A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips", NOCS, 2015}}
 +   * {{P2626.pdf| P. Baran, "On Distributed Communication Networks", RAND Technical Report, 1962}}
 +   * {{app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C.R. Das, "Aergia: Exploiting Packet Latency Slack in On-Chip Networks", ISCA 2010}}
 +===== Lecture 23 (28.12 Mon.) =====
 +=== Described in detail during lecture 23 ===
 +   * {{chipper_hpca11.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}}
 +   * {{bufferless-and-minimally-buffered-deflection-routing_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}}
 +   * {{minimally-buffered-deflection-router_nocs12.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}}
 +=== Suggested (lecture 23): ===
 +   * {{https://people.inf.ethz.ch/omutlu/pub/hierarchical-rings-with-deflection_sbacpad14.pdf| R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}}
 +   * {{https://arxiv.org/pdf/1602.06005.pdf| R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "A Case for Hierarchical Rings with Deflection Routing: An Energy-Efficient On-Chip Communication Substrate", PARCO 2016}}
 +   * {{P2626.pdf| P. Baran, "On Distributed Communication Networks", RAND Technical Report, 1962}}
 +===== Lecture 24 (29.12 Tue.) =====
 +=== Suggested (lecture 24): ===
 +  * {{Flynn_1966.pdf|M.J. Flynn, “Very high-speed computing systems,” Proc. of IEEE 1966}}
 +  * {{p140-fisher.pdf|J.A. Fisher, “Very Long Instruction Word architectures and the ELI-512,” ISCA 1983}}
 +  * {{p63-russell.pdf|R.M. Russell, "The CRAY-1 computer system,” CACM 1978}}
 +  * {{p74-rau.pdf|B.R. Rau, "Pseudo-randomly interleaved memory,” ISCA 1991}}
 +  * {{mmx_technology_1996.pdf|A. Peleg and U. Weiser, "MMX technology extension to the Intel architecture,” IEEE Micro 1996}}
 +  * {{04523358.pdf|E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture,” IEEE Micro 2008}}
 +  * {{30470407.pdf| W.W.L. Fung, I. Sham, G. Yuan, and T.M. Aamodt, "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," MICRO 2007}}
 +===== Lecture 25 (30.12 Wed.) =====
 +=== Suggested (lecture 25): ===
 +  * {{cuda_c_programming_guide.pdf|NVIDIA, "CUDA C++ Programming Guide," 2019}}
 +  * {{2013_programming_massively_parallel_processors_a_hands-on_approach_2nd.pdf| Hwu and Kirk, “Programming Massively Parallel Processors,” 2017}}
 +  * {{p140-fisher.pdf|Fisher, “Very Long Instruction Word Architectures and the ELI-512,” ISCA 1983}}
 +  * {{sung_2012.pdf|I. Sung, G. D. Liu, and W. W. Hwu, “DL: A data layout transformation system for heterogeneous computing,” INPAR 2012}}
 +  * {{10.1.1.12.7149.pdf|B. R. Rau, “Pseudo-randomly interleaved memory,” ISCA 1991}}
 +  * {{configurable_xor_hash_functions_for_banked_scratchpad_memories_in_gpus.pdf|G.J. van den Braak, J. Gomez-Luna, J.M. Gonzalez-Linares, Henk Corporaal, and Nicolas Guil, “Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs,” IEEE TC 2016}}
 +  * {{gomezluna_2013.pdf|J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides Benitez, and N. G. Mata, “Performance Modeling of Atomic Additions on GPU Scratchpad Memory,” IEEE TPDS 2013}}
 +  * {{gomezluna_2012.pdf|J. Gomez-Luna, J.M. Gonzalez-Linares, J. I. Benavides, and N. Guil, “Performance models for asynchronous data transfers on consumer Graphics Processing Units,” JPDC 2012}}
 +  * {{ransac-publication.pdf|M.A. Fischler and R.C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Communications of the ACM (Graphics and Image Processing), 1981}}
 +
 +===== Lecture 26 (31.12 Thu.) =====
 +=== Suggested (lecture 26): ===
 +    * {{https://arxiv.org/pdf/1706.08642.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives," Proceedings of the IEEE 2017}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-programming-vulnerabilities_hpca17.pdf|Y. Cai, S. Ghose, Y. Luo, K. Mai, O. Mutlu, E.F. Haratsch, "Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques," HPCA 2017}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-correct-and-refresh_iccd12.pdf|Y. Cai, G. Yalcin, O. Mutlu, E. F. Haratsch, A. Cristal, O. Unsal, and K. Mai, "Flash Correct-and-Refresh: Retention-Aware Error Management for Increased Flash Memory Lifetime," ICCD 2012}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/online-nand-flash-memory-channel-model_jsac16.pdf|Y. Luo, S. Ghose, Y. Cai, E.F. Haratsch, O. Mutlu, "Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory", JSAC, 2016}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-read-disturb-errors_dsn15.pdf|Y. Cai, Y. Luo, S. Ghose, O. Mutlu, "Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery", DSN, 2015}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/warm-flash-write-hotness-aware-retention-management_msst15.pdf|Y. Luo, Y. Cai, S. Ghose, J. Choi, O. Mutlu, "WARM: Improving NAND Flash Memory Lifetime with Write-Hotness Aware Retention Management", MSST, 2015}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-data-retention_hpca15.pdf|Y. Cai, Y. Luo, E.F. Haratsch, K. Mai, O. Mutlu, "Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery", HPCA, 2015}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/neighbor-assisted-error-correction-in-flash_sigmetrics14.pdf|Y. Cai, G. Yalcin, O. Mutlu, E.F. Haratsch, O. Unsal, A. Cristal, K. Mai, "Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories", SIGMETRICS, 2014}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-programming-interference_iccd13.pdf|Y. Cai, O. Mutlu, E.F. Haratsch, K. Mai, "Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation", ICCD, 2013}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-error-analysis-and-management_itj13.pdf|Y. Cai, G. Yalcin, O. Mutlu, E.F. Haratsch, A. Cristal, O. S. Unsal, K. Mai, "Error Analysis and Retention-Aware Error Management for NAND Flash Memory", ITJ, 2013}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-voltage-characterization_date13.pdf|Y. Cai, E.F. Haratsch, O. Mutlu, K. Mai, "Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis, and Modeling", DATE, 2013}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-error-patterns_date12.pdf|Y. Cai, E.F. Haratsch, O. Mutlu, K. Mai, "Error Patterns in MLC NAND Flash Memory: Measurement, Characterization, and Analysis", DATE, 2012}}
 +    * {{https://www.usenix.org/system/files/conference/fast16/fast16-papers-schroeder.pdf|B. Schroeder, R. Lagisetty, A. Merchant, "Flash Reliability in Production: The Expected and the Unexpected.", FAST, 2016}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-chip-off-forensics-reliability_dfrws17.pdf|A. Fukami, S. Ghose, Y. Luo, Y. Cai, O. Mutlu, "Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices", Digital Investigation, 2017}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/3D-NAND-flash-lifetime-early-retention-loss-and-process-variation_sigmetrics18_pomacs18-twocolumn.pdf|Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu, “Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation," SIGMETRICS, 2018}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/heatwatch-3D-nand-errors-and-self-recovery_hpca18.pdf|Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu, "HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness," HPCA, 2018}}
 +    * {{https://arxiv.org/pdf/1711.11427.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery," arXiv, 2017}}
 +    * {{https://users.ece.cmu.edu/~omutlu/pub/flash-memory-failures-in-the-field-at-facebook_sigmetrics15.pdf|J. Meza, Q. Wu, S. Kumar, O. Mutlu, "A Large-Scale Study of Flash Memory Failures in the Field," SIGMETRICS, 2015}}
 +    * {{https://arxiv.org/pdf/1711.11427.pdf|Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu,"Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery" Invited Book Chapter in Inside Solid State Drives, 2018.}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/online-nand-flash-memory-channel-model_jsac16.pdf|Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, and Onur Mutlu, "Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory" to appear in IEEE Journal on Selected Areas in Communications (JSAC), 2016.}}
 +
 +===== Lecture 27 (4.01 Mon.) =====
 +=== Suggested (lecture 27): ===
 +  * {{1982-kung-why-systolic-architecture.pdf | H.T. Kung, “Why Systolic Architectures?,” IEEE Computer, 1982}}
 +  * {{p1-Jouppi.pdf | N. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers and R. Boyle, “In-datacenter Performance Analysis of a Tensor Processing Unit,” ISCA 2017}}
 +  * {{4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf | A. Krizhevsky, I. Sutskever, G.E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," NIPS 2012}}
 +  * {{GoogLeNet.pdf | C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, "Going Deeper with Convolutions," CVPR 2015}}
 +  * {{resnet.pdf | K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition,” CVPR 2016}}
 +  * {{p346-annaratone.pdf | M. Annaratone, E. Arnould, T. Gross, H.T. Kung, and M.S. Lam, “Warp Architecture and Implementation,” ACM SIGARCH Computer Architecture News, 1986}}
 +  * {{ADA184329.pdf | M. Annaratone, E. Arnould, T. Gross, H.T. Kung, M. Lam, O. Menzilcioglu, and J.A. Webb, “The Warp Computer: Architecture, Implementation, and Performance," IEEE TC, 1987}}
 +  * {{Smith-1982-Decoupled-Access-Execute-Computer-Architectures.pdf | J.E. Smith, “Decoupled Access/Execute Computer Architectures,” ISCA 1982}}
 +  * {{p199-smith.pdf | J.E. Smith, G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, and C. M. Rozewski, "The ZS-1 Central Processor,” ACM SIGARCH Computer Architecture News, 1987}}
 +  * {{DynamicScheduling.pdf | J.E. Smith, “Dynamic Instruction Scheduling and the Astronautics ZS-1,” IEEE Computer, 1989}}
 +  * {{microarchitecture_pentium4_2001.pdf | G. Hinton, D. Sager, M. Upton, and D. Boggs, "The Microarchitecture of the Pentium® 4 Processor," Intel Technology Journal, 2001}}
 +  * {{mutlu_hpca_2003.pdf | O. Mutlu, J. Stark, C. Wilkerson, and Y.N. Patt, "Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors,” HPCA 2003}}
 +
 +===== Lecture 28 (4.01 Mon.) =====
 +=== Suggested (lecture 28): ===
  
 +  * {{parallel1964thornton.pdf | J. Thornton, “Parallel Operation in the Control Data 6600,” AFIPS 1964.}} 
 +  * {{pipelined1978smith.pdf | B.J. Smith, “A Pipelined, Shared Resource MIMD Computer,” ICPP 1978.}}
 +  * {{kongetira05_niagara.pdf | P. Kongetira, K. Aingaran, and K. Olukotun, “Niagara: A 32-Way Multithreaded SPARC Processor,” IEEE Micro, 2005.}}
 +  * {{hep_burton.pdf | B.J. Smith, "Architecture and Applications of the HEP Multiprocessor Computer System," International Society for Optics and Photonics, 1982.}}
 +  * {{tera_alverson.pdf | R. Alverson, D. Callahan, D. Cummings, B. Koblenz, A. Porterfield, and B. Smith, "The Tera Computer System," International Conference on Supercomputing, 1990}}
readings.1607185373.txt.gz · Last modified: 2020/12/05 16:22 by geraldod