PPT Slides - Edward Bosworth, Ph.D.

Modern Computers
von Neumann Machines
and
Other Computers
Computer Types
• Recall that a “von Neumann” machine is
a standard stored-program computer.
• Most of the computers we use are of
this type.
• We shall discuss the essential organization
of such a computer, and then mention a few
changes to yield a faster CPU.
• We then mention several other designs, called
“non Von” – not plain stored program devices.
2
The von Neumann Inheritance
• The EDVAC, designed in 1945, was one of the
first stored program machines.
• All modern computers are modifications of
this design; only the technical implementation
has been changed.
• Other designs have been suggested, emulated
on a von Neumann machine, and abandoned
because the emulation worked so well.
3
Types of Modern Computers
1. General numerical computation, involving
both integers and real numbers.
2. Device automation and control.
3. Message switching, including routers and
firewalls on the Internet.
4. Computer–generated graphics.
5. Graphics–based computer games.
6. Computer–enhanced video.
(How about those extra lines superimposed
on football fields?)
4
General-purpose and
Special-purpose
• Most computers are general-purpose devices,
tailored to a given purpose by the software.
• Sometimes a high-volume market demand
justifies a special-purpose design. Examples:
1. Routers used on the Internet.
2. Graphics cards, such as the NVIDIA cards.
• NVIDIA has launched CUDA (Compute
Unified Device Architecture), allowing the
graphics card to be used for general-purpose
computation as well as for graphics.
5
Components of a Stored Program
Computer
• The four major components of a modern
stored program computer are:
1. The Central Processing Unit (CPU)
2. The Primary Memory (also called “core
memory” or “main memory”)
3. The Input / Output system
4. One or more system busses to allow
the components to communicate.
6
Components of a Stored Program
Computer (Page 2)
7
Major Components
• The system memory (of which this computer has
512 MB) is used for transient storage of programs
and data. This is accessed much like an array,
with the memory address serving the function of
an array index.
• The Input / Output system (I/O System) is used
for the computer to save data and programs and
for it to accept input data and communicate
output data. Technically the hard drive is an I/O
device. It is also considered a memory device.
8
Major Components (Page 2)
The Central Processing Unit (CPU) handles execution of the program. It has
four main components:
1. The ALU (Arithmetic Logic Unit), which performs all of the
arithmetic and logical operations of the CPU, including logic
tests for branching.
2. The Control Unit, which causes the CPU to follow the instructions
found in the machine language program being executed.
3. The register file, which stores data internally in the CPU. There
are general purpose (user) registers and special purpose
registers used by the Control Unit.
4. A set of 3 internal busses to allow the CPU units to communicate.
This number is set by the ALU structure.
9
The ALU
• The Arithmetic Logic Unit is designed to
handle standard arithmetic, such as the
dyadic (2 input) operation C = A + B.
• It has two inputs and one output.
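As a rough illustration of the two-input, one-output structure, the following C sketch models a dyadic ALU as a function. The names `alu`, `alu_op`, and the `ALU_*` operation codes are hypothetical, chosen only for this example. A real ALU is hardware and typically also produces status flags, which are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation codes, used only in this sketch. */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR } alu_op;

/* A dyadic ALU: two inputs (a and b), one output (the return value). */
uint32_t alu(alu_op op, uint32_t a, uint32_t b)
{
    switch (op) {
    case ALU_ADD: return a + b;   /* C = A + B */
    case ALU_SUB: return a - b;   /* C = A - B, wraps modulo 2^32 */
    case ALU_AND: return a & b;   /* bitwise AND */
    case ALU_OR:  return a | b;   /* bitwise OR */
    }
    assert(0 && "unknown ALU operation");
    return 0;
}
```

For example, `alu(ALU_ADD, 2, 3)` returns 5.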
10
More on the CPU
• The CPU has a number of special-purpose
registers, including the IP, IR (Instruction
Register), MAR (Memory Address Register),
and MBR (Memory Buffer Register).
• The IR holds the instruction being executed.
It is accessed by the control unit, which emits
control signals to cause the CPU to execute
the instruction properly.
• The IP (Instruction Pointer) holds the address
of the instruction to be executed next.
11
Structure of the CPU
• The control unit interprets the instruction in
the Instruction Register.
12
The Simple Top Level Model
Logically speaking, the computer has one bus
used to allow all components to communicate.
Early designs, such as the PDP-11, had only one
bus. For these slow machines, it worked well.
13
Realistic Organizations
• The design on the previous slide is logically
correct, but IT WON’T WORK.
• IT IS TOO SLOW. Problem: A single system
level bus cannot handle the load.
• Modern gamers demand fast video; this
requires a fast bus to the video chip.
• The memory system is always a performance
bottleneck. We need a dedicated memory bus
in order to allow acceptable performance.
14
First Revision of the Design
• Here we have separate dedicated busses,
one for memory and one for the graphics system.
• Here, the slow I/O devices do not hold back
the very much faster ones.
15
Second Revision of the Design
• Legacy I/O devices of various speeds must
be accommodated by the design.
• Here an I/O Control Hub (ICH) manages two
busses, one for legacy devices.
16
Current State of the Design
17
The Motherboard
• In personal computers, a motherboard is the
central printed circuit board (PCB) used to
connect many of the system components.
• All busses external to the CPU are on the
motherboard.
• The motherboard includes connector sockets
for the CPU and many I/O peripherals.
• The motherboard must be matched to the
CPU, especially in the bus speed.
18
Picture of a Motherboard
19
The Chip Set
• The term “chip set” refers to a set of VLSI
chips that are designed to work together.
• The chip set is specific to a given CPU.
It mostly relates to chips on the motherboard.
• For systems based on Intel Pentium-class
microprocessors, the term “chipset” often refers
to a specific pair of chips on the motherboard:
the Northbridge and the Southbridge.
20
North and South
• The Northbridge links the CPU to very high-speed devices, especially main memory and
graphics controllers, and the Southbridge
connects to lower-speed peripheral buses
(such as PCI or ISA). In many modern chipsets,
the Southbridge contains some on-chip
integrated peripherals, such as Ethernet, USB,
and audio devices.
21
The Memory Component
• The memory stores the instructions and data for
an executing program.
• Memory is characterized by the smallest
addressable unit:
Byte addressable - the smallest unit is an 8–bit
byte.
Word addressable - the smallest unit is a word,
usually 16 or 32 bits in length.
• Most modern computers are byte addressable,
facilitating access to character data.
22
Logical View of Memory
• Logically, computer memory should be
considered as an array.
The index into this array is called the address
or “memory address”.
• A logical view of such a byte addressable
memory might be written in code as:
Const MemSize = 65536
byte Memory[MemSize]
// Indexed 0 … (MemSize – 1)
23
Simplistic View of Memory
• Memory is seen as a single linear array.
• The MAR holds the address; the MBR, data.
24
Memory Operations
• Read sequence:
Put address into MAR; command a READ.
• Then copy the contents of the MBR
into an internal CPU register.
• Write sequence:
Put address into MAR; data into the MBR.
• Then command a WRITE.
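The read and write sequences can be sketched in C as follows. This is a minimal model, assuming a 65536-byte byte-addressable memory; the function names `mem_read`, `mem_write`, `load_byte`, and `store_byte` are hypothetical, and real hardware signals READ and WRITE over control lines rather than through function calls.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 65536u            /* assumed memory size for this sketch */

static uint8_t  memory[MEM_SIZE];  /* byte-addressable main memory */
static uint32_t MAR;               /* Memory Address Register */
static uint8_t  MBR;               /* Memory Buffer Register */

/* READ: memory places the byte at address MAR into the MBR. */
static void mem_read(void)
{
    assert(MAR < MEM_SIZE);
    MBR = memory[MAR];
}

/* WRITE: memory stores the contents of the MBR at address MAR. */
static void mem_write(void)
{
    assert(MAR < MEM_SIZE);
    memory[MAR] = MBR;
}

/* Read sequence: address into MAR, command READ, copy MBR to a register. */
uint8_t load_byte(uint32_t address)
{
    MAR = address;
    mem_read();
    return MBR;        /* the return value stands in for a CPU register */
}

/* Write sequence: address into MAR, data into MBR, command WRITE. */
void store_byte(uint32_t address, uint8_t data)
{
    MAR = address;
    MBR = data;
    mem_write();
}
```

Calling `store_byte(0x1234, 0xAB)` and then `load_byte(0x1234)` returns 0xAB.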
25
Multi-Level Memory
• What we want is a very large memory, in which
each memory element is fabricated from very fast
components. But fast means expensive.
• What we can afford is a very large memory, in
which each memory element is fabricated from
moderately fast, but inexpensive, components.
• Modern computers achieve good performance
from a large, moderately fast, main memory by
using a multi-level cache memory.
• Quite often, there are 2 or 3 levels of cache.
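One common textbook way to see why a small fast cache helps is to compute an average access time. The slides do not give this formula; the sketch below assumes a simplified single-level model, and the function name `effective_access_time` and the figures in the usage note are invented for illustration.

```c
#include <assert.h>

/* Simplified model (an assumption, not taken from the slides):
 *   t_eff = h * t_cache + (1 - h) * t_main
 * where h is the fraction of accesses satisfied by the cache. */
double effective_access_time(double hit_rate, double t_cache, double t_main)
{
    assert(hit_rate >= 0.0 && hit_rate <= 1.0);
    return hit_rate * t_cache + (1.0 - hit_rate) * t_main;
}
```

With an assumed hit rate of 0.95, a 1 ns cache, and a 100 ns main memory, the result is about 5.95 ns, much closer to the cache speed than to the main memory speed.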
26
The Principle of Locality
• It is the principle of locality that allows cache
memory to deliver faster performance.
• This principle is based on observed memory
access patterns of real programs.
• When a program accesses a memory location:
1. It will likely access memory close to it.
2. It will likely access that location again.
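Both kinds of locality show up in even a very simple loop. The C sketch below (the function name `sum_array` is hypothetical) walks an array in index order: neighboring elements are touched one after another, and the same few variables are reused on every pass.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array in index order.
 * Access to nearby memory: data[i] and data[i + 1] are adjacent.
 * Repeated access: total and i are reused on every iteration. */
long sum_array(const int *data, size_t count)
{
    assert(data != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += data[i];
    }
    return total;
}
```

A cache that loads a block of adjacent memory on the first access can then serve the following accesses without going back to main memory.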
27
A Modern Two-Level Cache
• All Pentium designs have at least two levels
of cache memory. L2 holds 1 to 2 MB.
• The L1 cache is split, with an Instruction Cache
and a Data Cache.
28
The Split L1 Cache
• Memory can do only one thing at a time.
• The split L1 cache is realized as 2 independent
very fast memories. The CPU can access both
at the same time.
• This parallel access allows modern CPU design
tricks, especially pipelining.
• Note that the CPU does not write back to the
Instruction Cache. That makes it simpler and faster.
29
Multicore Designs Have 3 Cache Levels
30
A Reasonably Modern Memory Chip
31
The Fetch-Execute Cycle
• Each instruction is fetched from the memory
and executed. The IP (Instruction Pointer)
holds the address of the instruction.
• The common fetch cycle follows this plan.
• MAR ← IP. // Get the address of the instruction
// and put it into the MAR.
• READ. // Read memory.
• IR ← MBR. // Place the instruction into IR.
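The fetch steps can be sketched in C as register transfers. This is a minimal model under stated assumptions: the `machine` structure and the `fetch` function are hypothetical, each instruction is assumed to occupy one byte, and the final step that advances the IP is an assumption added so that the IP again holds the address of the next instruction.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 65536u     /* assumed memory size for this sketch */

/* Hypothetical machine state; one byte per instruction for simplicity. */
typedef struct {
    uint8_t  memory[MEM_SIZE];
    uint32_t IP;    /* Instruction Pointer */
    uint32_t MAR;   /* Memory Address Register */
    uint8_t  MBR;   /* Memory Buffer Register */
    uint8_t  IR;    /* Instruction Register */
} machine;

/* One fetch: MAR <- IP; READ; IR <- MBR. */
void fetch(machine *m)
{
    m->MAR = m->IP;                 /* MAR <- IP */
    assert(m->MAR < MEM_SIZE);
    m->MBR = m->memory[m->MAR];     /* READ */
    m->IR  = m->MBR;                /* IR <- MBR */
    m->IP  = m->IP + 1;             /* assumed: advance to the next instruction */
}
```

After `fetch`, the control unit would decode the contents of `IR` and execute the instruction; that part is not modeled here.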
32
Memory Stores Both
Instructions and Data
• A stored program
computer can allow
instructions and data to
be intermixed.
• This normally causes no
problems, but it does
have that potential.
33
Conventional Division of Memory
• One conventional way
to handle memory is to
divide it into logical
segments.
• Here we see 2 segments
for one program: data
and instructions.
• For historical reasons,
the instructions may be
called “text”.
34
Two Programs in Memory
• Here is a depiction of 2
programs in memory.
• Each has a unique
subroutine, as well as
using common trig
functions.
• Note the two
copies of the trig
functions.
35
Share Common Subprograms
• Dynamic Link Libraries
are an outgrowth of the
idea that common
subprograms can be
shared.
• Any shared subprogram
must be written in a
special style, called
“reentrant”.
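The difference between reentrant and non-reentrant code can be sketched in C. Both function names below are hypothetical. The first keeps its state in a single static variable shared by every caller; the second keeps no state of its own, so one copy of the code can safely serve several programs at once.

```c
#include <assert.h>
#include <stddef.h>

/* NOT reentrant: the running total lives in one static variable that
 * every caller shares, so interleaved callers disturb each other. */
int add_to_total_shared(int value)
{
    static int total = 0;
    total += value;
    return total;
}

/* Reentrant: all state is supplied by the caller, so each caller
 * works only on its own data. */
int add_to_total(int *total, int value)
{
    assert(total != NULL);
    *total += value;
    return *total;
}
```

Two callers using `add_to_total` with their own totals get independent results, while two callers of `add_to_total_shared` see each other's additions.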
36