readings

Differences

This shows you the differences between two versions of the page.

readings [2020/12/31 15:48] – [Lecture 21 (04.12 Fri.)] firtinac
readings [2021/01/04 08:45] (current) – [Lecture 26 (31.12 Thu.)] firtinac
Line 709:
    * {{0211027.pdf| L. G. Valiant, "A Scheme for Fast Parallel Communication", SIAM Journal on Computing, 1982}}
    * {{https://people.inf.ethz.ch/omutlu/pub/bless_isca09.pdf| T. Moscibroda and O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks", ISCA 2009}}
-   * {{05749724.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}}
+   * {{chipper_hpca11.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}}
-   * {{bufferless_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}}
+   * {{bufferless-and-minimally-buffered-deflection-routing_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}}
-   * {{06209256.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}}
+   * {{minimally-buffered-deflection-router_nocs12.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}}
 === Suggested (lecture 22): ===
    * {{p168-patel.pdf| J. Patel, "Performance of Processor-Memory Interconnections for Multiprocessors", ISCA 1979}}
Line 722:
    * {{https://people.inf.ethz.ch/omutlu/pub/maze-routing_nocs15.pdf| M. Fattah, A. Airola, R. Ausavarungnirun, N. Mirzaei, P. Liljeberg, J. Plosila, S. Mohammadi, T. Pahikkala, O. Mutlu, and H. Tenhunen, "A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips", NOCS, 2015}}
    * {{P2626.pdf| P. Baran, "On Distributed Communication Networks", RAND Technical Report, 1962}}
-   * {{p106-das.pdf|R. Das, O. Mutlu, T. Moscibroda, and C.R. Das, "Aergia: Exploiting Packet Latency Slack in On-Chip Networks", ISCA 2010}}
+   * {{app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C.R. Das, "Aergia: Exploiting Packet Latency Slack in On-Chip Networks", ISCA 2010}}
 +===== Lecture 23 (28.12 Mon.) ===== 
 +=== Described in detail during lecture 23 === 
 +   * {{chipper_hpca11.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}} 
 +   * {{bufferless-and-minimally-buffered-deflection-routing_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}} 
 +   * {{minimally-buffered-deflection-router_nocs12.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}} 
 +=== Suggested (lecture 23): === 
 +   * {{https://people.inf.ethz.ch/omutlu/pub/hierarchical-rings-with-deflection_sbacpad14.pdf| R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}}
 +   * {{https://arxiv.org/pdf/1602.06005.pdf| R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "A Case for Hierarchical Rings with Deflection Routing: An Energy-Efficient On-Chip Communication Substrate", PARCO 2016}} 
 +   * {{P2626.pdf| P. Baran, "On Distributed Communication Networks", RAND Technical Report, 1962}} 
 +===== Lecture 24 (29.12 Tue.) ===== 
 +=== Suggested (lecture 24): === 
 +  * {{Flynn_1966.pdf|M.J. Flynn, “Very high-speed computing systems,” Proc. of IEEE 1966}} 
 +  * {{p140-fisher.pdf|J.A. Fisher, "Very Long Instruction Word architectures and the ELI-512," ISCA 1983}}
 +  * {{p63-russell.pdf|R.M. Russell, "The CRAY-1 computer system," CACM 1978}}
 +  * {{p74-rau.pdf|B.R. Rau, "Pseudo-randomly interleaved memory," ISCA 1991}}
 +  * {{mmx_technology_1996.pdf|A. Peleg and U. Weiser, "MMX technology extension to the Intel architecture," IEEE Micro 1996}}
 +  * {{04523358.pdf|E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro 2008}}
 +  * {{30470407.pdf| W.W.L. Fung, I. Sham, G. Yuan, and T.M. Aamodt, "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," MICRO 2007}} 
 +===== Lecture 25 (30.12 Wed.) ===== 
 +=== Suggested (lecture 25): === 
 +  * {{cuda_c_programming_guide.pdf|NVIDIA, "CUDA C++ PROGRAMMING GUIDE," 2019}}
 +  * {{2013_programming_massively_parallel_processors_a_hands-on_approach_2nd.pdf| Hwu and Kirk, “Programming Massively Parallel Processors,” 2017}}
 +  * {{p140-fisher.pdf|Fisher, “Very Long Instruction Word Architectures and the ELI-512,” ISCA 1983}}
 +  * {{sung_2012.pdf|I. Sung, G. D. Liu, and W. W. Hwu, “DL: A data layout transformation system for heterogeneous computing,” INPAR 2012}}
 +  * {{10.1.1.12.7149.pdf|B. R. Rau, “Pseudo-randomly interleaved memory,” ISCA 1991}}
 +  * {{configurable_xor_hash_functions_for_banked_scratchpad_memories_in_gpus.pdf|G. Braak, J. Gomez-Luna, J.M. Gonzalez-Linares, Henk Corporaal, and Nicolas Guil, “Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs,” IEEE TC 2016}}
 +  * {{gomezluna_2013.pdf|J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides Benitez, and N. G. Mata, “Performance Modeling of Atomic Additions on GPU Scratchpad Memory,” IEEE TPDS 2013}}
 +  * {{gomezluna_2012.pdf|J. Gomez-Luna, J.M. Gonzalez-Linares, J. I. Benavides, and N. Guil, “Performance models for asynchronous data transfers on consumer Graphics Processing Units,” JPDC 2012}} 
 +  * {{ransac-publication.pdf|M.A. Fischler and R.C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Communications of the ACM, 1981}}
 + 
 +===== Lecture 26 (31.12 Thu.) ===== 
 +=== Suggested (lecture 26): === 
 +    * {{https://arxiv.org/pdf/1706.08642.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid State Drives," Proceedings of the IEEE 2017}} 
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-programming-vulnerabilities_hpca17.pdf|Y. Cai, S. Ghose, Y. Luo, K. Mai, O. Mutlu, E.F. Haratsch, "Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques," HPCA 2017}} 
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-correct-and-refresh_iccd12.pdf|Y. Cai, G. Yalcin, O. Mutlu, E. F. Haratsch, A. Cristal, O. Unsal, and K. Mai, "Flash Correct-and-Refresh: Retention-Aware Error Management for Increased Flash Memory Lifetime," ICCD 2012}} 
 +    * {{https://people.inf.ethz.ch/omutlu/pub/online-nand-flash-memory-channel-model_jsac16.pdf|Y. Luo, S. Ghose, Y. Cai, E.F. Haratsch, O. Mutlu, "Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory", JSAC, 2016}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-read-disturb-errors_dsn15.pdf|Y. Cai, Y. Luo, S. Ghose, O. Mutlu, "Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery", DSN, 2015}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/warm-flash-write-hotness-aware-retention-management_msst15.pdf|Y. Luo, Y. Cai, S. Ghose, J. Choi, O. Mutlu, "WARM: Improving NAND Flash Memory Lifetime with Write-Hotness Aware Retention Management", MSST, 2015}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-data-retention_hpca15.pdf|Y. Cai, Y. Luo, E.F. Haratsch, K. Mai, O. Mutlu, "Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery", HPCA, 2015}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/neighbor-assisted-error-correction-in-flash_sigmetrics14.pdf|Y. Cai, G. Yalcin, O. Mutlu, E.F. Haratsch, O. Unsal, A. Cristal, K. Mai, "Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories", SIGMETRICS, 2014}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-programming-interference_iccd13.pdf|Y. Cai, O. Mutlu, E.F. Haratsch, K. Mai, "Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation", ICCD, 2013}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-error-analysis-and-management_itj13.pdf|Y. Cai, G. Yalcin, O. Mutlu, E.F. Haratsch, A. Cristal, O.S. Unsal, K. Mai, "Error Analysis and Retention-Aware Error Management for NAND Flash Memory", ITJ, 2013}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-voltage-characterization_date13.pdf|Y. Cai, E.F. Haratsch, O. Mutlu, K. Mai, "Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis, and Modeling", DATE, 2013}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-error-patterns_date12.pdf|Y. Cai, E.F. Haratsch, O. Mutlu, K. Mai, "Error Patterns in MLC NAND Flash Memory: Measurement, Characterization, and Analysis", DATE, 2012}}
 +    * {{https://www.usenix.org/system/files/conference/fast16/fast16-papers-schroeder.pdf|B. Schroeder, R. Lagisetty, A. Merchant, "Flash Reliability in Production: The Expected and the Unexpected.", FAST, 2016}} 
 +    * {{https://people.inf.ethz.ch/omutlu/pub/flash-memory-chip-off-forensics-reliability_dfrws17.pdf|A. Fukami, S. Ghose, Y. Luo, Y. Cai, O. Mutlu, "Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices", Digital Investigation, 2017}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/3D-NAND-flash-lifetime-early-retention-loss-and-process-variation_sigmetrics18_pomacs18-twocolumn.pdf|Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu, "Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation," SIGMETRICS, 2018}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/heatwatch-3D-nand-errors-and-self-recovery_hpca18.pdf|Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu, "HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness," HPCA, 2018}} 
 +    * {{https://arxiv.org/pdf/1711.11427.pdf|Y. Cai, S. Ghose, E.F. Haratsch, Y. Luo, O. Mutlu, "Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery," arXiv, 2017}} 
 +    * {{https://users.ece.cmu.edu/~omutlu/pub/flash-memory-failures-in-the-field-at-facebook_sigmetrics15.pdf|J. Meza, Q. Wu, S. Kumar, O. Mutlu, "A Large-Scale Study of Flash Memory Failures in the Field," SIGMETRICS, 2015}} 
 +    * {{https://arxiv.org/pdf/1711.11427.pdf|Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu, "Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery," Invited Book Chapter in Inside Solid State Drives, 2018.}}
 +    * {{https://people.inf.ethz.ch/omutlu/pub/online-nand-flash-memory-channel-model_jsac16.pdf|Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, and Onur Mutlu, "Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory" to appear in IEEE Journal on Selected Areas in Communications (JSAC), 2016.}} 
 + 
 +===== Lecture 27 (4.01 Mon.) ===== 
 +=== Suggested (lecture 27): === 
 +  * {{1982-kung-why-systolic-architecture.pdf | H.T. Kung, “Why Systolic Architectures?,” IEEE Computer, 1982}} 
 +  * {{p1-Jouppi.pdf | N. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers and R. Boyle, “In-datacenter Performance Analysis of a Tensor Processing Unit,” ISCA 2017}} 
 +  * {{4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf | A. Krizhevsky, I. Sutskever, G.E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," NIPS 2012}} 
 +  * {{GoogLeNet.pdf | C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, "Going Deeper with Convolutions," CVPR 2015}} 
 +  * {{resnet.pdf | K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition,” CVPR 2016}} 
 +  * {{p346-annaratone.pdf | M. Annaratone, E. Arnould, T. Gross, H.T. Kung, and M.S. Lam, “Warp Architecture and Implementation,” ACM SIGARCH Computer Architecture News, 1986}} 
 +  * {{ADA184329.pdf | M. Annaratone, E. Arnould, T. Gross, H.T. Kung, M. Lam, O. Menzilcioglu, and J.A. Webb, “The Warp Computer: Architecture, Implementation, and Performance,” IEEE TC, 1987}}
 +  * {{Smith-1982-Decoupled-Access-Execute-Computer-Architectures.pdf | J.E. Smith, “Decoupled Access/Execute Computer Architectures,” ISCA 1982}} 
 +  * {{p199-smith.pdf | J.E. Smith, G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, and C. M. Rozewski, "The ZS-1 Central Processor," ACM SIGARCH Computer Architecture News, 1987}}
 +  * {{DynamicScheduling.pdf | J.E. Smith, “Dynamic Instruction Scheduling and the Astronautics ZS-1,” IEEE Computer, 1989}} 
 +  * {{microarchitecture_pentium4_2001.pdf | G. Hinton, D. Sager, M. Upton, and D. Boggs, "The Microarchitecture of the Pentium® 4 Processor," Intel Technology Journal, 2001}} 
 +  * {{mutlu_hpca_2003.pdf | O. Mutlu, J. Stark, C. Wilkerson, and Y.N. Patt, "Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors," HPCA 2003}}
 + 
 +===== Lecture 28 (4.01 Mon.) ===== 
 +=== Suggested (lecture 28): ===
  
 +  * {{parallel1964thornton.pdf | J. Thornton, “Parallel Operation in the Control Data 6600,” AFIPS 1964.}} 
 +  * {{pipelined1978smith.pdf | B.J. Smith, “A Pipelined, Shared Resource MIMD Computer,” ICPP 1978.}}
 +  * {{kongetira05_niagara.pdf | P. Kongetira, K. Aingaran, and K. Olukotun, “Niagara: A 32-Way Multithreaded SPARC Processor,” IEEE Micro, 2005.}}
 +  * {{hep_burton.pdf | B.J. Smith, "Architecture and Applications of the HEP Multiprocessor Computer System," International Society for Optics and Photonics, 1982.}}
 +  * {{tera_alverson.pdf | R. Alverson, D. Callahan, D. Cummings, B. Koblenz, A. Porterfield, and B. Smith, "The Tera Computer System," International Conference on Supercomputing, 1990}}
readings.1609429700.txt.gz · Last modified: 2020/12/31 15:48 by firtinac