Design of Digital Circuits - Spring 2018
Readings
Books
Y.N. Patt and S.J. Patel, "Introduction to Computing Systems."
"Appendix A: The LC-3b ISA."
"Appendix C: The Microarchitecture of the LC-3b, Basic Machine."
"LC-3b Figures."
D. Harris and S. Harris, "Digital Design and Computer Architecture (2nd Edition)."
D. Harris and S. Harris, "Digital Design and Computer Architecture (1st Edition)."
H&H textbooks are available online (through the ETH network or the ETH VPN).
Lecture 1 (22.02 Thu.)
Reading assignments (Lecture 1):
Y.N. Patt and S.J. Patel, “Chapter 1: Welcome Aboard. Introduction to Computing Systems.”
Y.N. Patt and S.J. Patel, “Chapter 2: Bits, Data Types, and Operations. Introduction to Computing Systems.”
D. Harris and S. Harris, “Chapter 1: From Zero to One. Digital Design and Computer Architecture.”
"Binary Numbers (pdf)."
"Binary Numbers (pptx)."
Suggested readings (Lecture 1):
R.W. Hamming, "You and Your Research," Transcription of the Bell Communications Research Colloquium Seminar, 1986.
Mentioned in Lecture 1:
N.P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, R. Boyle, P.-L. Cantin, C. Chao, C. Clark, J. Coriell, M. Daley, M. Dau, J. Dean, B. Gelb, T.V. Ghaemmaghami, R. Gottipati, W. Gulland, R. Hagmann, C.R. Ho, D. Hogberg, J. Hu, R. Hundt, D. Hurt, J. Ibarz, A. Jaffey, A. Jaworski, A. Kaplan, H. Khaitan, D. Killebrew, A. Koch, N. Kumar, S. Lacy, J. Laudon, J. Law, D. Le, C. Leary, Z. Liu, K. Lucke, A. Lundin, G. MacKean, A. Maggiore, M. Mahony, K. Miller, R. Nagarajan, R. Narayanaswami, R. Ni, K. Nix, T. Norrie, M. Omernick, N. Penukonda, A. Phelps, J. Ross, M. Ross, A. Salek, E. Samadiani, C. Severn, G. Sizikov, M. Snelham, J. Souter, D. Steinberg, A. Swing, M. Tan, G. Thorson, B. Tian, H. Toma, E. Tuttle, V. Vasudevan, R. Walter, W. Wang, E. Wilcox, D.H. Yoon, "In-Datacenter Performance Analysis of a Tensor Processing Unit,” ISCA 2017.
R.W. Hamming, "Numerical Methods for Scientists and Engineers," 1962.
R.W. Hamming, "Error Detecting and Error Correcting Codes," Bell System Technical Journal, 1950.
Lecture 2 (23.02 Fri.)
Suggested readings (Lecture 2):
J.C. Dehnert, B.K. Grant, J.P. Banning, R. Johnson, T. Kistler, A. Klaiber, J. Mattson, "The Transmeta Code Morphing™ Software: Using Speculation, Recovery, and Adaptive Retranslation to Address Real-life Challenges," CGO 2003
A. Klaiber, "The technology behind Crusoe processors," Transmeta Technical Brief, 2000.
Mentioned in Lecture 2:
J. Horn, "Reading Privileged Memory with a Side-channel," Google Project Zero, 2018
Y. Kim, R. Daly, J. Kim, C. Fallin, J.H. Lee, D. Lee, C. Wilkerson, K. Lai, O. Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," ISCA 2014
M. Seaborn and T. Dullien, "Exploiting the DRAM rowhammer bug to gain kernel privileges," Google Project Zero, 2015
D. Gruss, C. Maurice, S. Mangard, "Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript," DIMVA 2016
V. van der Veen, Y. Fratantonio, M. Lindorfer, D. Gruss, C. Maurice, G. Vigna, H. Bos, K. Razavi, C. Giuffrida, "Drammer: Deterministic Rowhammer Attacks on Mobile Platforms," CCS 2016
L. Lamport, R. Shostak, M. Pease, "The Byzantine Generals Problem," ACM TOPLAS, 1982
O. Mutlu, "The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser," DATE 2017
Lecture 4 (02.03 Fri.)
Reading assignments (Lecture 4):
Y.N. Patt and S.J. Patel, “Chapter 3: Digital Logic Structures.”
D. Harris and S. Harris, “Chapter 2: Combinational Logic Design.”
Suggested readings (Lecture 4):
O. Mutlu, "The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser," Invited Paper in Proceedings of the Design, Automation, and Test in Europe Conference (DATE), Lausanne, Switzerland, March 2017.
T. Moscibroda and O. Mutlu. "Memory performance attacks: denial of memory service in multi-core systems," USENIX Security Symposium 2007.
J. Liu, B. Jaiyen, R. Veras, O. Mutlu, "RAIDR: Retention-Aware Intelligent DRAM Refresh," ISCA 2012.
B.H. Bloom, “Space/Time Trade-offs in Hash Coding with Allowable Errors”, CACM 1970.
Mentioned in Lecture 4:
S. Rixner, W.J. Dally, U.J. Kapasi, P. Mattson, J.D. Owens, “Memory access scheduling,” ISCA 2000.
W.K. Zuravleff and T. Robinson, “Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order,” US Patent 5,630,096, May 1997.
O. Mutlu and T. Moscibroda, "Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors,” MICRO 2007.
O. Mutlu and T. Moscibroda, "Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems,” ISCA 2008.
S.P. Muralidhara, L. Subramanian, O. Mutlu, M. Kandemir, T. Moscibroda, "Reducing Memory Interference in Multicore Systems via Application-Aware Memory Channel Partitioning,” MICRO 2011.
O. Mutlu, "Memory Scaling: A Systems Architecture Perspective,” Technical talk at MEMCON 2013.
K. Chang, D. Lee, Z. Chishti, A. Alameldeen, C. Wilkerson, Y. Kim, O. Mutlu, "Improving DRAM Performance by Parallelizing Refreshes with Accesses,” HPCA 2014.
Lecture 5 (08.03 Thu.)
Video lecture assignment (Lecture 5):
Onur Mutlu - Future Computing Architectures - ETH Zurich Inaugural Lecture - 15 May 2017
Reading assignments (Lecture 5):
Y.N. Patt and S.J. Patel, “Chapter 3: Digital Logic Structures.” (until Chapter 3.3)
D. Harris and S. Harris, “Chapter 2: Combinational Logic Design.”
Suggested readings (Lecture 5):
B.H. Bloom, "Space/Time Trade-offs in Hash Coding with Allowable Errors," CACM, 1970
J. Liu, B. Jaiyen, R. Veras, O. Mutlu, “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012
G.E. Moore. "Cramming more components onto integrated circuits," Electronics magazine, 1965
Lecture 6 (09.03 Fri.)
Reading assignments (Lecture 6):
Y.N. Patt and S.J. Patel,
“Chapter 3: Digital Logic Structures.” (Sections 3.1, 3.2, and 3.3)
D. Harris and S. Harris,
“Chapter 2: Combinational Logic Design”
“Chapter 4: Hardware Description Languages” (Sections 4.1, 4.2, 4.3, and 4.5)
Suggested readings (Lecture 6):
D. Harris and S. Harris,
“Chapter 3: Sequential Logic Design”
“Chapter 5: Digital Building Blocks” (Sections 5.1 and 5.2)
Lecture 7 (15.03 Thu.)
Reading assignments (Lecture 7):
Y.N. Patt and S.J. Patel,
“Chapter 3: Digital Logic Structures” (Sections 3.4 until the end)
D. Harris and S. Harris,
“Chapter 3: Sequential Logic Design”
“Chapter 4: Hardware Description Languages”
“Chapter 5: Digital Building Blocks” (note: reading spans multiple lectures)
Lecture 6: Slides 102-120
Lecture 7: Slides 138-152
Lecture 8 (16.03 Fri.)
Reading assignments (Lecture 8):
D. Harris and S. Harris,
“Chapter 2: Combinational Logic Design” (Section 2.9)
“Chapter 3: Sequential Logic Design” (Section 3.5)
“Chapter 5: Digital Building Blocks” (note: reading spans multiple lectures)
Suggested readings (Lecture 8):
P.E. Gronowski, W.J. Bowhill, R.P. Preston, M.K. Gowan, R.L. Allmon, "High-Performance Microprocessor Design," IEEE Journal of Solid-State Circuits 1998
A. Abdelhadi, R. Ginosar, A. Kolodny, E.G. Friedman, "Timing–Driven Variation–Aware Nonuniform Clock Mesh Synthesis," GLSVLSI 2010
Lecture 9 (22.03 Thu.)
Reading assignments (Lecture 9):
Y.N. Patt and S.J. Patel,
“Chapter 4: The Von Neumann Model”
“Chapter 5: The LC-3”
"Appendix A: The LC-3b ISA."
"Appendix C: The Microarchitecture of the LC-3b, Basic Machine."
D. Harris and S. Harris,
“Chapter 5: Digital Building Blocks” (note: reading spans multiple lectures)
“Chapter 6: Architecture”
“Appendix B: MIPS Instructions”
Suggested readings (Lecture 9):
A.W. Burks, H.H. Goldstine, J. von Neumann, "Preliminary discussion of the logical design of an electronic computing instrument," 1946
Lecture 10 (23.03 Fri.)
Reading assignments (Lecture 10):
Y.N. Patt and S.J. Patel,
“Chapter 5: The LC-3”
“Chapter 6: Programming”
"Appendix A: The LC-3b ISA."
"Appendix C: The Microarchitecture of the LC-3b, Basic Machine."
D. Harris and S. Harris,
“Chapter 5: Digital Building Blocks” (note: reading spans multiple lectures)
“Chapter 6: Architecture”
“Appendix B: MIPS Instructions”
Lecture 11 (29.03 Thu.)
Reading assignments (Lecture 11):
Y.N. Patt and S.J. Patel,
"Appendix A: The LC-3b ISA."
"Appendix C: The Microarchitecture of the LC-3b, Basic Machine."
D. Harris and S. Harris,
“Chapter 7: Microarchitecture” (Sections 7.1-7.4)
Suggested readings (Lecture 11):
A.W. Burks, H.H. Goldstine, J. von Neumann, "Preliminary discussion of the logical design of an electronic computing instrument," 1946
J.R. Gurd, C.C. Kirkham, I. Watson, "Manchester data flow computer," CACM, 1985
J.B. Dennis, D. Misunas, "A preliminary architecture for a basic data-flow processor," ISCA 1974
Y.N. Patt, "Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution," IEEE Micro 2001
Suggested video lecture (Lecture 11):
Onur Mutlu - Lec 22 - Dataflow - Parallel Computer Architecture - CMU - 2012
Lecture slides
Lecture 12 (12.04 Thu.)
Reading assignments (Lecture 12):
Y.N. Patt and S.J. Patel,
“Appendices A and C: Multi-cycle microarchitecture”
“Appendices A and C: Microprogramming”
D. Harris and S. Harris,
“Chapter 7.4: Multi-cycle microarchitecture”
“Chapter 7.5: Pipelining”
Lecture 13 (13.04 Fri.)
Reading assignments (Lecture 13):
Y.N. Patt and S.J. Patel,
"Appendix A: The LC-3b ISA."
"Appendix C: The Microarchitecture of the LC-3b, Basic Machine."
Suggested readings (Lecture 13):
M.V. Wilkes, "The best way to design an automatic calculating machine," Proc. Manchester Univ. Computer Inaugural Conf., 1951
W.T. Wilner, “Microprogramming environment on the Burroughs B1700,” CompCon, 1972
L.C. Heller and M.S. Farrell, "Millicode in an IBM zSeries processor," IBM Journal of Research and Development, 2004
Lecture 14 (19.04 Thu.)
Reading assignments (Lecture 14):
D. Harris and S. Harris,
“Chapter 7.5: Pipelining”
“Chapter 7.7: Advanced Microarchitecture”
Suggested readings (Lecture 14):
J.E. Smith and G.S. Sohi, "The Microarchitecture of Superscalar Processors," Proc. of the IEEE, 1995
Lecture 15 (20.04 Fri.)
Reading assignments (Lecture 15):
D. Harris and S. Harris,
“Chapter 7.5: Pipelining”
“Chapter 7.7: Advanced Microarchitecture”
Suggested readings (Lecture 15):
J.E. Smith and A.R. Pleszkun, "Implementing Precise Interrupts in Pipelined Processors," IEEE Trans on Computers 1988 and ISCA 1985
Lecture 16 (26.04 Thu.)
Reading assignments (Lecture 16):
J.E. Smith and G.S. Sohi, "The Microarchitecture of Superscalar Processors," Proc. of the IEEE, 1995
D. Harris and S. Harris,
“Chapters 7.6-7.9”
Suggested readings (Lecture 16):
R.E. Kessler, "The Alpha 21264 Microprocessor," IEEE Micro, 1999
R. M. Tomasulo, "An Efficient Algorithm for Exploiting Multiple Arithmetic Units," IBM Journal of Research and Development, 1967
Y.N. Patt, W.M. Hwu, M.C. Shebanow, "HPS, a new microarchitecture: rationale and introduction," MICRO 1985
Y.N. Patt, S.W. Melvin, W.M. Hwu, M.C. Shebanow, "Critical issues regarding HPS, a high performance microarchitecture," MICRO 1985
R.P. Colwell, "The Pentium Chronicles: introduction," Computer, 2006
Lecture 17 (27.04 Fri.)
Reading assignments (Lecture 17):
J.E. Smith and G.S. Sohi, "The Microarchitecture of Superscalar Processors," Proc. of the IEEE, 1995
D. Harris and S. Harris,
“Chapters 7.8-7.9”
Suggested readings (Lecture 17):
R.E. Kessler, "The Alpha 21264 Microprocessor," IEEE Micro, 1999
G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, P. Roussel, “The Microarchitecture of the Pentium 4 Processor,” Intel Technology Journal, 2001.
O. Mutlu, J. Stark, C. Wilkerson, Y.N. Patt, “Runahead Execution,” HPCA 2003.
K.C. Yeager, “The MIPS R10000 Superscalar Microprocessor,” IEEE Micro, April 1996
J.M. Tendler, J.S. Dodson, J.S. Fields, Jr., H. Le, B. Sinharoy, “POWER4 system microarchitecture,” IBM J R&D, 2002.
R. Kalla, B. Sinharoy, J.M. Tendler, “IBM Power5 Chip: A Dual-Core Multithreaded Processor,” IEEE Micro 2004.
Lecture 18 (03.05 Thu.)
Reading assignments (Lecture 18):
J.E. Smith and G.S. Sohi, "The Microarchitecture of Superscalar Processors," Proc. of the IEEE, 1995
D. Harris and S. Harris,
“Chapters 7.8-7.9”
Suggested readings (Lecture 18):
R.E. Kessler, "The Alpha 21264 Microprocessor," IEEE Micro, 1999
S. McFarling, "Combining Branch Predictors," DEC WRL Technical Report 1993
T. Ball and J. R. Larus, "Branch Prediction for Free," PLDI 1993
J. Smith, "A study of Branch Prediction Strategies," ISCA 1981
T. Yeh and Y. Patt, "Two-Level Adaptive Training Branch Prediction," MICRO 1991
Lecture 19 (04.05 Fri.)
Reading assignments (Lecture 19):
J.E. Smith and G.S. Sohi, "The Microarchitecture of Superscalar Processors," Proc. of the IEEE, 1995
D. Harris and S. Harris,
“Chapters 7.8-7.9”
Suggested readings (Lecture 19):
"A Remote Hack Hijacks Android Phones via Electric Leaks in their Memory," Wired, 03.05.2018.
"Drive-by Rowhammer attack uses GPU to compromise an Android phone," Arstechnica, 03.05.2018.
P. Frigo, C. Giuffrida, H. Bos, K. Razavi, "Grand Pwning Unit: Accelerating Microarchitectural Attacks with the GPU," S&P, 2018.
R.E. Kessler, "The Alpha 21264 Microprocessor," IEEE Micro, 1999.
S. McFarling, "Combining Branch Predictors," DEC WRL Technical Report 1993.
D. Jimenez and C. Lin, “Dynamic Branch Prediction with Perceptrons,” HPCA 2001.
A. Seznec, “Analysis of the O-Geometric History Length Branch Predictor,” ISCA 2005.
S. Gochman, “The Intel Pentium M Processor: Microarchitecture and Performance,” Intel Technology Journal, May 2003.
F. Rosenblatt, “Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms,” 1962.
A. Seznec and P. Michaud, “A Case for (Partially) Tagged Geometric History Length Branch Prediction,” JILP, 2006.
A. Seznec, “TAGE-SC-L Branch Predictors,” CBP, 2014.
A. Seznec, “TAGE-SC-L Branch Predictors Again,” CBP, 2016.
E. Jacobsen, E. Rotenberg, and J.E. Smith, “Assigning Confidence to Conditional Branch Predictions,” MICRO, 1996.
J.R. Allen, K. Kennedy, C. Porterfield, and J. Warren, “Conversion of Control Dependence to Data Dependence,” POPL 1983.
H. Kim, O. Mutlu, J. Stark, and Y.N. Patt, “Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution,” MICRO, 2005.
E. Riseman and C. Foster, “The Inhibition of Potential Parallelism by Conditional Jumps,” IEEE Transactions on Computers, 1972.
P. Chang, E. Hao, and Y.N. Patt, “Target Prediction for Indirect Jumps,” ISCA 1997.
J. Fisher, “Very Long Instruction Word architectures and the ELI-512,” ISCA 1983.
W. Hwu, S. Mahlke, W. Chen, P. Chang, N. Warter, R. Bringmann, R. Ouellette, R. Hank, T. Kiyohara, G. Haab, and J. Holm, "The Superblock: An Effective Technique for VLIW and Superscalar Compilation," The Journal of Supercomputing, 1993.
P. Chang, S. Mahlke, W. Chen, N. Warter, and W. Hwu, "IMPACT: an Architectural Framework for Multiple-instruction-issue Processors," ISCA 1991.
J. Thornton, “Parallel Operation in the Control Data 6600,” AFIPS 1964.
B.J. Smith, “A Pipelined, Shared Resource MIMD Computer,” ICPP 1978.
P. Kongetira, K. Aingaran, and K. Olukotun, “Niagara: A 32-Way Multithreaded SPARC Processor,” IEEE Micro, 2005.
B.J. Smith, "Architecture and Applications of the HEP Multiprocessor Computer System," International Society for Optics and Photonics, 1982.
R. Alverson, D. Callahan, D. Cummings, B. Koblenz, A. Porterfield, and B. Smith, "The Tera Computer System," International Conference on Supercomputing, 2014
Suggested video lecture (Lecture 19):
Onur Mutlu - Lecture 16. Static Instruction Scheduling - CMU - 2015
Lecture 20 (11.05 Fri.)
Reading assignments (Lecture 20):
A. Peleg and U. Weiser, "MMX Technology Extension to the Intel Architecture," IEEE Micro, 1996
E. Lindholm, J. Nickolls, S. Oberman, J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, 2008
Suggested readings (Lecture 20):
"Packets over a LAN are all it takes to trigger serious Rowhammer bit flips," ArsTechnica, 10.05.2018
A. Tatar, R. Krishnan, E. Athanasopoulos, C. Giuffrida, H. Bos, K. Razavi, "Throwhammer: Rowhammer Attacks over the Network and Defenses," ATC 2018
O. Mutlu, "The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser," DATE 2017
M. Flynn, “Very High-Speed Computing Systems,” Proc. of IEEE, 1966
J.A. Fisher, “Very Long Instruction Word architectures and the ELI-512,” ISCA 1983
Cray Research Inc., “The CRAY X-MP Series of Computer Systems,” 1985
Lecture 21 (17.05 Thu.)
Reading assignments (Lecture 21):
A. Peleg and U. Weiser, "MMX Technology Extension to the Intel Architecture," IEEE Micro, 1996
E. Lindholm, J. Nickolls, S. Oberman, J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, 2008
Suggested readings (Lecture 21):
M. Flynn, “Very High-Speed Computing Systems,” Proc. of IEEE, 1966
B. R. Rau, "Pseudo-Randomly Interleaved Memory," ISCA 1991
W.W.L. Fung, I. Sham, G. Yuan, T.M. Aamodt, “Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow,” MICRO 2007
Lecture 22 (18.05 Fri.)
Suggested readings (Lecture 22):
NVIDIA, "CUDA C Programming Guide," Version 9.0, 2018
D.B. Kirk and W.M. Hwu, "Programming Massively Parallel Processors. A Hands-on Approach," Third Edition, 2017
J.A. Fisher, “Very Long Instruction Word architectures and the ELI-512,” ISCA 1983
I.J. Sung, G.D. Liu, W.M. Hwu, "DL: A Data Layout Transformation System for Heterogeneous Computing," INPAR 2012
B. R. Rau, "Pseudo-Randomly Interleaved Memory," ISCA 1991
G.J.v.d. Braak, J. Gomez-Luna, J.M. Gonzalez-Linares, H. Corporaal, N. Guil, "Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs," IEEE TC, 2016
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil, "Performance Modeling of Atomic Additions on GPU Scratchpad Memory," IEEE TPDS, 2013
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil, "Performance Models for Asynchronous Data Transfers on Consumer Graphics Processing Units," JPDC, 2012
Lecture 23a (24.05 Thu.)
Reading assignments (Lecture 23a):
H.T. Kung, “Why Systolic Architectures?,” IEEE Computer, 1982
Suggested readings (Lecture 23a):
N. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers and R. Boyle, “In-datacenter Performance Analysis of a Tensor Processing Unit,” ISCA 2017
A. Krizhevsky, I. Sutskever, G.E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," NIPS 2012
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, "Going Deeper with Convolutions," CVPR 2015
K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image Recognition,” CVPR 2016
M. Annaratone, E. Arnould, T. Gross, H.T. Kung, and M.S. Lam, “Warp Architecture and Implementation,” ACM SIGARCH Computer Architecture News, 1986
M. Annaratone, E. Arnould, T. Gross, H.T. Kung, M. Lam, O. Menzilcioglu, and J.A. Webb, “The Warp Computer: Architecture, Implementation, and Performance," IEEE TC, 1987
J.E. Smith, “Decoupled Access/Execute Computer Architectures,” ISCA 1982
J.E. Smith, G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, and C. M. Rozewski, "The ZS-1 Central Processor,” ACM SIGARCH Computer Architecture News, 1987
J.E. Smith, “Dynamic Instruction Scheduling and the Astronautics ZS-1,” IEEE Computer, 1989
G. Hinton, D. Sager, M. Upton, and D. Boggs, "The Microarchitecture of the Pentium® 4 Processor," Intel Technology Journal, 2001
O. Mutlu, J. Stark, C. Wilkerson, and Y.N. Patt, "Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors,” HPCA 2003
Lecture 23b (24.05 Thu.)
Reading assignments (Lecture 23b):
D. Harris and S. Harris,
“Chapters 8.1-8.3”
Y.N. Patt and S.J. Patel,
“Chapter 3.5”
Suggested readings (Lecture 23b):
M. Wilkes, “Slave Memories and Dynamic Storage Allocation,” IEEE Trans. On Electronic Computers, 1965
A.W. Burks, H.H. Goldstine, J. von Neumann, “Preliminary Discussion of the Logical Design of an Electronic Computing Instrument,” The Origins of Digital Computers, 1946
J.S. Liptay, “Structural Aspects of the System/360 Model 85 II: The Cache,” IBM Systems Journal, 1968
J. Fotheringham, “Dynamic Storage Allocation in the Atlas Computer, Including an Automatic Use of a Backing Store,” CACM 1961
L. Bloom, M. Cohen, S. Porter, “Considerations in the Design of a Computer with High Logic-to-Memory Speed Ratio,” Proceedings of Gigacycle Computing Systems, 1962
M. Qureshi, D. Lynch, O. Mutlu, Y.N. Patt, “A Case for MLP-Aware Cache Replacement,” ISCA, 2006
L. Belady, “A Study of Replacement Algorithms for a Virtual-Storage Computer,” IBM Systems Journal, 1966
Lecture 24 (25.05 Fri.)
Reading assignments (Lecture 24):
D. Harris and S. Harris,
“Chapters 8.1-8.3”
Y.N. Patt and S.J. Patel,
“Chapter 3.5”
Suggested readings (Lecture 24):
M. Wilkes, “Slave Memories and Dynamic Storage Allocation,” IEEE Trans. On Electronic Computers, 1965
J.S. Liptay, “Structural Aspects of the System/360 Model 85 II: The Cache,” IBM Systems Journal, 1968
J. Fotheringham, “Dynamic Storage Allocation in the Atlas Computer, Including an Automatic Use of a Backing Store,” CACM 1961
L. Bloom, M. Cohen, S. Porter, “Considerations in the Design of a Computer with High Logic-to-Memory Speed Ratio,” Proceedings of Gigacycle Computing Systems, 1962
M. Qureshi, D. Lynch, O. Mutlu, Y.N. Patt, “A Case for MLP-Aware Cache Replacement,” ISCA, 2006
Lecture 25a (31.05 Thu.)
Reading assignments (Lecture 25a):
D. Harris and S. Harris,
“Chapters 8.1-8.3”
Y.N. Patt and S.J. Patel,
“Chapter 3.5”
Suggested readings (Lecture 25a):
M. Wilkes, “Slave Memories and Dynamic Storage Allocation,” IEEE Trans. On Electronic Computers, 1965
Lecture 25b (31.05 Thu.)
Reading assignments (Lecture 25b):
D. Harris and S. Harris,
“Chapter 8.4”
· Last modified: 2019/02/12 17:34 (external edit)