This shows you the differences between two versions of the page.
readings [2018/12/06 09:36] alserm [Lecture 22 (6.12 Thu.)] |
readings [2019/12/12 10:02] (current) |
||
---|---|---|---|
Line 479: | Line 479: | ||
* {{d7ce51c62671d5ffc1506786b0b7861ce00a.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt, "Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs". ISCA'13}} | * {{d7ce51c62671d5ffc1506786b0b7861ce00a.pdf| Jose A. Joao, M. Aater Suleman, Onur Mutlu, and Yale N. Patt, "Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs". ISCA'13}} | ||
* {{22310236.pdf| Ed Grochowski, Ronny Ronen, John Shen, and Hong Wang, "Best of Both Latency and Throughput". ICCD 2004}} | * {{22310236.pdf| Ed Grochowski, Ronny Ronen, John Shen, and Hong Wang, "Best of Both Latency and Throughput". ICCD 2004}} | ||
- | * {{amdahl.pdf|G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}} | + | * {{lecture1-amdahl.pdf|G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}} |
* {{05389044.pdf|J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy, "POWER4 System Microarchitecture". IBM J R&D 2002}} | * {{05389044.pdf|J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy, "POWER4 System Microarchitecture". IBM J R&D 2002}} | ||
* {{719990eaab63a6bfa2988b5fd57a03b13229.pdf| Ron Kalla, Balaram Sinharoy, and Joel M. Tendler, "IBM Power5 Chip: A Dual-Core Multithreaded Processor". IEEE Micro 2004}} | * {{719990eaab63a6bfa2988b5fd57a03b13229.pdf| Ron Kalla, Balaram Sinharoy, and Joel M. Tendler, "IBM Power5 Chip: A Dual-Core Multithreaded Processor". IEEE Micro 2004}} | ||
Line 511: | Line 511: | ||
===== Lecture 22 (6.12 Thu.) ===== | ===== Lecture 22 (6.12 Thu.) ===== | ||
=== Required (lecture 22): === | === Required (lecture 22): === | ||
- | * {{amdahl.pdf|G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}} | + | * {{lecture1-amdahl.pdf|G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS 1967}} |
* {{lamport.pdf|L. Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs," IEEE Transactions on Computers, 1979}} | * {{lamport.pdf|L. Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs," IEEE Transactions on Computers, 1979}} | ||
* {{a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|M. S. Papamarcos and J. H. Patel, "A low-overhead coherence solution for multiprocessors with private cache memories," ISCA 1984}} | * {{a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|M. S. Papamarcos and J. H. Patel, "A low-overhead coherence solution for multiprocessors with private cache memories," ISCA 1984}} | ||
Line 541: | Line 541: | ||
* {{culler_parcomparch_5.3.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.3 (pp 291-305)}} | * {{culler_parcomparch_5.3.pdf|Culler and Singh, Parallel Computer Architecture, Chapter 5.3 (pp 291-305)}} | ||
* {{ph_computerorganizationanddesignthehardwaresoftwareinterface5th_5.10.pdf|P&H, Computer Organization and Design, Chapter 5.10 (pp 466-470)}} | * {{ph_computerorganizationanddesignthehardwaresoftwareinterface5th_5.10.pdf|P&H, Computer Organization and Design, Chapter 5.10 (pp 466-470)}} | ||
+ | |||
+ | ===== Lecture 23 (12.12 Wed.) ===== | ||
+ | === Described in detail during lecture 23: === | ||
+ | * {{bless_isca09.pdf|T. Moscibroda and O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks", ISCA 2009}} | ||
+ | |||
+ | === Suggested (lecture 23): === | ||
+ | * {{app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C. R. Das, "Application-Aware Prioritization Mechanisms for On-Chip Networks", MICRO 2009}} | ||
+ | * {{ultrasparc.pdf|M. Shah, J. Barreh, J. Brooks, R. Golla, G. Grohoski, N. Gura, R. Hetherington, P. Jordan, M. Luttrell, C. Olson, B. Saha, D. Sheahan, L. Spracklen, and A. Wynn, "UltraSPARC T2: A Highly-Threaded, Power-Efficient, SPARC SOC", ASSCC 2007}} | ||
+ | * {{7d2822e9b7fcd60f147823478b59fcf7569e.pdf|J. H. Patel, "Processor-memory interconnections for multiprocessors", ISCA 1979}} | ||
+ | * {{Ultracomputer.pdf|A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir, "The NYU Ultracomputer - Designing an MIMD Shared Memory Parallel Computer", IEEE Trans. on Comp. 1983}} | ||
+ | * {{hierarchical-rings-with-deflection_sbacpad14.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}} | ||
+ | * {{p272-leiserson.pdf|C.E. Leiserson, Z.S. Abuhamdeh, D.C. Douglas, C.R. Feynman, M.N. Ganmukhi, J.V. Hill, D. Hillis, B.C. Kuszmaul, M.A. St. Pierre, D.S. Wells, M.C. Wong, S.-W. Yang, R. Zak, "The Network Architecture of the Connection Machine CM-5", SPAA 1992}} | ||
+ | * {{seitz_cacm_1985.pdf|C. L. Seitz, "The Cosmic Cube", CACM 1985}} | ||
+ | * {{L8-TurnModel-ISCA92.pdf|C. J. Glass and L. M. Ni, "The Turn Model for Adaptive Routing", ISCA 1992}} | ||
+ | * {{maze-routing_nocs15.pdf|M. Fattah, A. Airola, R. Ausavarungnirun, N. Mirzaei, P. Liljeberg, J. Plosila, S. Mohammadi, T. Pahikkala, O. Mutlu, and H. Tenhunen, "A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips", NOCS 2015}} | ||
+ | * {{Baran64.pdf|P. Baran, "On Distributed Communications Networks", IEEE Trans. Comm., 1964}} | ||
+ | * {{bufferless_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}} | ||
+ | * {{virtual+channel.pdf|W. J. Dally, "Virtual Channel Flow Control", ISCA 1990}} | ||
+ | |||
+ | |||
+ | |||
+ | ===== Lecture 24 (13.12 Thu.) ===== | ||
+ | === Described in detail during lecture 24: === | ||
+ | * {{05749724.pdf|C. Fallin, C. Craik, and O. Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router", HPCA 2011}} | ||
+ | * {{bufferless_springer14.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "Bufferless and Minimally-Buffered Deflection Routing", Routing Algorithms in Networks-on-Chip (invited book chapter), 2014}} | ||
+ | * {{06209256.pdf|C. Fallin, G. Nazario, X. Yu, K. Chang, R. Ausavarungnirun, and O. Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect", NOCS 2012}} | ||
+ | |||
+ | === Suggested (lecture 24): === | ||
+ | * {{app-aware-noc_micro09.pdf|R. Das, O. Mutlu, T. Moscibroda, and C. R. Das, "Application-Aware Prioritization Mechanisms for On-Chip Networks", MICRO 2009}} | ||
+ | * {{https://people.inf.ethz.ch/omutlu/pub/hetero-adaptive-source-throttling_sbacpad12.pdf|K. Chang, R. Ausavarungnirun, C. Fallin, and O. Mutlu, "HAT: Heterogeneous Adaptive Throttling for On-Chip Networks," SBAC-PAD, 2012}} | ||
+ | * {{bless_isca09.pdf|T. Moscibroda and O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks", ISCA 2009}} | ||
+ | * {{06970669.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. H. Loh, and O. Mutlu, "Design and Evaluation of Hierarchical Rings with Deflection Routing", SBAC-PAD 2014}} | ||
+ | * {{1-s2.0-s0167819116000399-main.pdf|R. Ausavarungnirun, C. Fallin, X. Yu, K. Chang, G. Nazario, R. Das, G. H. Loh, and O. Mutlu, "A Case for Hierarchical Rings with Deflection Routing: An Energy-Efficient On-Chip Communication Substrate", PARCO 2016}} | ||
+ | * {{p106-das.pdf|R. Das, O. Mutlu, T. Moscibroda, and C.R. Das, "Aergia: Exploiting Packet Latency Slack in On-Chip Networks", ISCA 2010}} | ||
+ | * {{https://people.inf.ethz.ch/omutlu/pub/pvc-qos_micro09.pdf|B. Grot, S.W. Keckler, O. Mutlu, "Preemptive Virtual Clock: A Flexible, Efficient, and Cost-effective QoS Scheme for Networks-on-Chip", MICRO 2009}} | ||
+ | * {{p401-grot.pdf|B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu, "Kilo-NOC: A Heterogeneous Network-on-Chip Architecture for Scalability and Service Guarantees", ISCA 2011}} | ||
+ | * {{https://people.inf.ethz.ch/omutlu/pub/onchip-network-congestion-scalability_sigcomm2012.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, and S. Seshan, "On-Chip Networks from a Networking Perspective: Congestion and Scalability in Many-core Interconnects," SIGCOMM, 2012}} | ||
+ | * {{http://users.ece.cmu.edu/~omutlu/pub/noc-congestion_hotnets10.pdf|G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, "Next Generation On-chip Networks: What Kind of Congestion Control Do We Need?" HotNets 2010}}