Table of Contents
- Cover image
- Title page
- Table of Contents
- In Praise of Computer Organization and Design: The Hardware/Software Interface, Sixth Edition
- Copyright
- Dedication
- Preface
- About This Book
- About the Other Book
- Changes for the Sixth Edition
- Instructor Support
- Concluding Remarks
- Acknowledgments for the Sixth Edition
- 1. Computer Abstractions and Technology
- 1.1 Introduction
- 1.2 Seven Great Ideas in Computer Architecture
- 1.3 Below Your Program
- 1.4 Under the Covers
- 1.5 Technologies for Building Processors and Memory
- 1.6 Performance
- 1.7 The Power Wall
- 1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
- 1.9 Real Stuff: Benchmarking the Intel Core i7
- 1.10 Going Faster: Matrix Multiply in Python
- 1.11 Fallacies and Pitfalls
- 1.12 Concluding Remarks
- 1.13 Historical Perspective and Further Reading
- 1.14 Self-Study
- 1.15 Exercises
- 2. Instructions: Language of the Computer
- 2.1 Introduction
- 2.2 Operations of the Computer Hardware
- 2.3 Operands of the Computer Hardware
- 2.4 Signed and Unsigned Numbers
- 2.5 Representing Instructions in the Computer
- 2.6 Logical Operations
- 2.7 Instructions for Making Decisions
- 2.8 Supporting Procedures in Computer Hardware
- 2.9 Communicating with People
- 2.10 MIPS Addressing for 32-bit Immediates and Addresses
- 2.11 Parallelism and Instructions: Synchronization
- 2.12 Translating and Starting a Program
- 2.13 A C Sort Example to Put It All Together
- 2.14 Arrays versus Pointers
- 2.15 Advanced Material: Compiling C and Interpreting Java
- 2.16 Real Stuff: ARMv7 (32-bit) Instructions
- 2.17 Real Stuff: ARMv8 (64-bit) Instructions
- 2.18 Real Stuff: RISC-V Instructions
- 2.19 Real Stuff: x86 Instructions
- 2.20 Going Faster: Matrix Multiply in C
- 2.21 Fallacies and Pitfalls
- 2.22 Concluding Remarks
- 2.23 Historical Perspective and Further Reading
- 2.24 Self-Study
- 2.25 Exercises
- 3. Arithmetic for Computers
- 3.1 Introduction
- 3.2 Addition and Subtraction
- 3.3 Multiplication
- 3.4 Division
- 3.5 Floating Point
- 3.6 Parallelism and Computer Arithmetic: Subword Parallelism
- 3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86
- 3.8 Going Faster: Subword Parallelism and Matrix Multiply
- 3.9 Fallacies and Pitfalls
- 3.10 Concluding Remarks
- 3.11 Historical Perspective and Further Reading
- 3.12 Self-Study
- 3.13 Exercises
- 4. The Processor
- 4.1 Introduction
- 4.2 Logic Design Conventions
- 4.3 Building a Datapath
- 4.4 A Simple Implementation Scheme
- 4.5 A Multicycle Implementation
- 4.6 An Overview of Pipelining
- 4.7 Pipelined Datapath and Control
- 4.8 Data Hazards: Forwarding versus Stalling
- 4.9 Control Hazards
- 4.10 Exceptions
- 4.11 Parallelism via Instructions
- 4.12 Putting It All Together: The Intel Core i7 6700 and ARM Cortex-A53
- The ARM Cortex-A53
- Performance of the A53 Pipeline
- The Intel Core i7 6700
- Performance of the i7
- 4.13 Going Faster: Instruction-Level Parallelism and Matrix Multiply
- 4.14 An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
- 4.15 Fallacies and Pitfalls
- 4.16 Concluding Remarks
- 4.17 Historical Perspective and Further Reading
- 4.18 Self-Study
- Self-Study Answers
- 4.19 Exercises
- 5. Large and Fast: Exploiting Memory Hierarchy
- 5.1 Introduction
- 5.2 Memory Technologies
- 5.3 The Basics of Caches
- 5.4 Measuring and Improving Cache Performance
- 5.5 Dependable Memory Hierarchy
- 5.6 Virtual Machines
- 5.7 Virtual Memory
- 5.8 A Common Framework for Memory Hierarchy
- 5.9 Using a Finite-State Machine to Control a Simple Cache
- 5.10 Parallelism and Memory Hierarchy: Cache Coherence
- 5.11 Parallelism and the Memory Hierarchy: Redundant Arrays of Inexpensive Disks
- 5.12 Advanced Material: Implementing Cache Controllers
- 5.13 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Memory Hierarchies
- Performance of the Cortex-A53 and Core i7 Memory Hierarchies
- 5.14 Going Faster: Cache Blocking and Matrix Multiply
- 5.15 Fallacies and Pitfalls
- 5.16 Concluding Remarks
- 5.17 Historical Perspective and Further Reading
- 5.18 Self-Study
- Self-Study Answers
- 5.19 Exercises
- 6. Parallel Processors from Client to Cloud
- 6.1 Introduction
- 6.2 The Difficulty of Creating Parallel Processing Programs
- 6.3 SISD, MIMD, SIMD, SPMD, and Vector
- 6.4 Hardware Multithreading
- 6.5 Multicore and Other Shared Memory Multiprocessors
- 6.6 Introduction to Graphics Processing Units
- 6.7 Domain-Specific Architectures
- 6.8 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors
- 6.9 Introduction to Multiprocessor Network Topologies
- 6.10 Communicating to the Outside World: Cluster Networking
- 6.11 Multiprocessor Benchmarks and Performance Models
- 6.12 Real Stuff: Benchmarking the Google TPUv3 Supercomputer and an NVIDIA Volta GPU Cluster
- 6.13 Going Faster: Multiple Processors and Matrix Multiply
- 6.14 Fallacies and Pitfalls
- 6.15 Concluding Remarks
- 6.16 Historical Perspective and Further Reading
- 6.17 Self-Study
- Answers to Self-Study
- 6.18 Exercises
- Appendices
- Appendix A. Assemblers, Linkers, and the SPIM Simulator
- A.1 Introduction
- A.2 Assemblers
- A.3 Linkers
- A.4 Loading
- A.5 Memory Usage
- A.6 Procedure Call Convention
- A.7 Exceptions and Interrupts
- A.8 Input and Output
- A.9 SPIM
- A.10 MIPS R2000 Assembly Language
- A.11 Concluding Remarks
- A.12 Exercises
- Further Reading
- Appendix B. The Basics of Logic Design
- B.1 Introduction
- B.2 Gates, Truth Tables, and Logic Equations
- B.3 Combinational Logic
- B.4 Using a Hardware Description Language
- B.5 Constructing a Basic Arithmetic Logic Unit
- B.6 Faster Addition: Carry Lookahead
- B.7 Clocks
- B.8 Memory Elements: Flip-Flops, Latches, and Registers
- B.9 Memory Elements: SRAMs and DRAMs
- B.10 Finite-State Machines
- B.11 Timing Methodologies
- B.12 Field Programmable Devices
- B.13 Concluding Remarks
- B.14 Exercises
- Further Reading
- Appendix C. Graphics and Computing GPUs
- C.1 Introduction
- C.2 GPU System Architectures
- C.3 Programming GPUs
- C.4 Multithreaded Multiprocessor Architecture
- C.5 Parallel Memory System
- C.6 Floating-point Arithmetic
- C.7 Real Stuff: The NVIDIA GeForce 8800
- C.8 Real Stuff: Mapping Applications to GPUs
- C.9 Fallacies and Pitfalls
- C.10 Concluding Remarks
- C.11 Historical Perspective and Further Reading
- Further Reading
- Appendix D. Mapping Control to Hardware
- D.1 Introduction
- D.2 Implementing Combinational Control Units
- D.3 Implementing Finite-State Machine Control
- D.4 Implementing the Next-State Function with a Sequencer
- D.5 Translating a Microprogram to Hardware
- D.6 Concluding Remarks
- D.7 Exercises
- Appendix E. Survey of Instruction Set Architectures
- E.1 Introduction
- E.2 A Survey of RISC Architectures for Desktop, Server, and Embedded Computers
- E.3 The Intel 80x86
- E.4 The VAX Architecture
- E.5 The IBM 360/370 Architecture for Mainframe Computers
- E.6 Historical Perspective and References
- Glossary
- Further Reading
- Index
- MIPS Reference Data Card (“Green Card”)