03 Top Level View of Computer Function and Interconnection


William Stallings
Computer Organization
and Architecture
8th Edition
Chapter 3
Top Level View of Computer
Function and Interconnection
Program Concept
• Hardwired systems are inflexible
• General purpose hardware can do
different tasks, given correct control
signals
• Instead of re-wiring, supply a new set of
control signals
What is a program?
• A sequence of steps
• For each step, an arithmetic or logical
operation is done
• For each operation, a different set of
control signals is needed
Basic Programme Code
Function of Control Unit
• For each operation a unique code is
provided
—e.g. ADD, MOVE
• A hardware segment accepts the code and
issues the control signals
• We have a computer!
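As a rough illustration of the idea above, the control unit can be pictured as a lookup from an operation code to a set of control signals. The sketch below is hypothetical: the signal names (`alu_add`, `reg_write`, `bus_transfer`) and the two-entry table are invented for illustration and do not describe any real processor.

```python
# Hypothetical mapping from operation codes to control signals.
# The signal names are illustrative assumptions, not a real design.
CONTROL_SIGNALS = {
    "ADD": {"alu_add", "reg_write"},
    "MOVE": {"bus_transfer", "reg_write"},
}


def control_unit(opcode: str) -> set[str]:
    """Accept an operation code and return the control signals to assert."""
    try:
        return CONTROL_SIGNALS[opcode]
    except KeyError:
        raise ValueError(f"unknown operation code: {opcode}") from None


# A program is a sequence of steps; each step yields its own signal set.
program = ["MOVE", "ADD"]
signals_per_step = [control_unit(op) for op in program]
```

Changing the program changes only the sequence of codes fed in; the hardware (here, the table) stays the same.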
Components
• The Control Unit and the Arithmetic and
Logic Unit constitute the Central
Processing Unit
• Data and instructions need to get into the
system and results out
—Input/output
• Temporary storage of code and results is
needed
—Main memory
Computer Components:
Top Level View
Instruction Cycle
• Two steps:
—Fetch
—Execute
Fetch Cycle
• Program Counter (PC) holds address of
next instruction to fetch
• Processor fetches instruction from
memory location pointed to by PC
• Increment PC
—Unless told otherwise
• Instruction loaded into Instruction
Register (IR)
• Processor interprets instruction and
performs required actions
Execute Cycle
• Processor-memory
—data transfer between CPU and main memory
• Processor I/O
—Data transfer between CPU and I/O module
• Data processing
—Some arithmetic or logical operation on data
• Control
—Alteration of sequence of operations
—e.g. jump
• Combination of above
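The fetch and execute steps can be sketched as a small simulator. This is a minimal sketch under assumed conventions: a 16-bit instruction word with a 4-bit opcode and a 12-bit address, a single accumulator `ac`, and three opcodes (1 = load AC from memory, 2 = store AC to memory, 5 = add memory to AC). The function name `run` and the sample memory contents are illustrative.

```python
def run(memory: dict[int, int], pc: int, steps: int) -> tuple[int, int]:
    """Run `steps` instruction cycles; return the final (pc, ac)."""
    ac = 0
    for _ in range(steps):
        # Fetch: load the instruction at PC into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Decode: top 4 bits are the opcode, low 12 bits the address.
        opcode, address = ir >> 12, ir & 0xFFF
        # Execute.
        if opcode == 0x1:      # load AC from memory
            ac = memory[address]
        elif opcode == 0x2:    # store AC to memory
            memory[address] = ac
        elif opcode == 0x5:    # add memory word to AC
            ac = (ac + memory[address]) & 0xFFFF
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
    return pc, ac


# Three instructions at 0x300 that add the words at 0x940 and 0x941.
memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}
pc, ac = run(memory, pc=0x300, steps=3)
```

After the three cycles, location 0x941 holds the sum of the two data words and the PC points just past the last instruction.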
Hypothetical Machine
Example of Program Execution
Instruction Cycle State Diagram
Interrupts
• Mechanism by which other modules (e.g.
I/O) may interrupt normal sequence of
processing
• Program
—e.g. overflow, division by zero
• Timer
—Generated by internal processor timer
—Used in pre-emptive multi-tasking
• I/O
—from I/O controller
• Hardware failure
—e.g. memory parity error
Program Flow Control
Interrupt Cycle
• Added to instruction cycle
• Processor checks for interrupt
—Indicated by an interrupt signal
• If no interrupt, fetch next instruction
• If interrupt pending:
—Suspend execution of current program
—Save context
—Set PC to start address of interrupt handler
routine
—Process interrupt
—Restore context and continue interrupted
program
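The interrupt cycle above can be sketched by adding a check after each execute step. This is a simplified model, not a real processor: instructions are just labels, the "context" is only the saved PC, and the names `run_with_interrupts` and `interrupts_at` are hypothetical.

```python
from collections import deque


def run_with_interrupts(program, interrupts_at):
    """Return a trace of an instruction cycle with an interrupt check.

    program       -- list of instruction labels (hypothetical)
    interrupts_at -- maps an instruction index to a handler name that
                     is raised while that instruction executes
    """
    trace = []
    pending = deque()
    pc = 0
    while pc < len(program):
        # Fetch and execute the current instruction.
        trace.append(f"execute {program[pc]}")
        if pc in interrupts_at:
            pending.append(interrupts_at[pc])
        pc += 1
        # Interrupt cycle: check for a pending interrupt signal.
        if pending:
            saved_pc = pc                      # save context
            handler = pending.popleft()
            trace.append(f"handle {handler}")  # process interrupt
            pc = saved_pc                      # restore context
    return trace


trace = run_with_interrupts(["i0", "i1", "i2"], {1: "io_done"})
```

The handler runs between `i1` and `i2`, and the interrupted program then continues where it left off.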
Transfer of Control via Interrupts
Instruction Cycle with Interrupts
Program Timing: Short I/O Wait
Program Timing: Long I/O Wait
Instruction Cycle (with Interrupts) State Diagram
Multiple Interrupts
• Disable interrupts
—Processor will ignore further interrupts whilst
processing one interrupt
—Interrupts remain pending and are checked
after first interrupt has been processed
—Interrupts handled in sequence as they occur
• Define priorities
—Low priority interrupts can be interrupted by
higher priority interrupts
—When higher priority interrupt has been
processed, processor returns to previous
interrupt
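The two approaches can be compared with a small time-stepped simulation. This is a sketch under stated assumptions: time advances in whole units, every handler needs at least one unit, a larger priority number means more urgent, and the device names and timings in the example are made up.

```python
def service_order(requests, nested):
    """Return the order in which interrupt handlers finish.

    requests -- list of (arrival_time, name, priority, duration);
                duration is in whole time units, at least 1
    nested   -- False: interrupts disabled while a handler runs
                True:  a higher-priority request preempts the handler
    """
    remaining = {name: duration for _, name, _, duration in requests}
    priority = {name: prio for _, name, prio, _ in requests}
    waiting = []      # arrived but not yet started, in arrival order
    stack = []        # handlers in progress; top of stack is running
    finished = []
    t = 0
    while len(finished) < len(requests):
        waiting += [name for at, name, _, _ in requests if at == t]
        if nested:
            # Start the most urgent waiting request if it outranks
            # whatever is running now.
            best = max(waiting, key=priority.get, default=None)
            if best and (not stack or priority[best] > priority[stack[-1]]):
                waiting.remove(best)
                stack.append(best)
        elif waiting and not stack:
            # Interrupts disabled: take pending requests as they occurred.
            stack.append(waiting.pop(0))
        if stack:
            remaining[stack[-1]] -= 1
            if remaining[stack[-1]] == 0:
                finished.append(stack.pop())
        t += 1
    return finished


requests = [(0, "printer", 2, 4), (1, "comms", 5, 3), (2, "disk", 4, 3)]
sequential = service_order(requests, nested=False)
prioritised = service_order(requests, nested=True)
```

With interrupts disabled the handlers finish in arrival order; with priorities the urgent `comms` handler finishes first and the `printer` handler, which started first, finishes last.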
Multiple Interrupts - Sequential
Multiple Interrupts – Nested
Time Sequence of Multiple Interrupts
Connecting
• All the units must be connected
• Different type of connection for different
type of unit
—Memory
—Input/Output
—CPU
PCI Express bus card slots (from top to bottom: x4, x16, x1 and
x16), compared to a traditional 32-bit PCI bus card slot
(bottom).
Computer Modules
Memory Connection
• Receives and sends data
• Receives addresses (of locations)
• Receives control signals
—Read
—Write
—Timing
Input/Output Connection(1)
• Similar to memory from computer’s
viewpoint
• Output
—Receive data from computer
—Send data to peripheral
• Input
—Receive data from peripheral
—Send data to computer
Input/Output Connection(2)
• Receive control signals from computer
• Send control signals to peripherals
—e.g. spin disk
• Receive addresses from computer
—e.g. port number to identify peripheral
• Send interrupt signals (control)
CPU Connection
• Reads instruction and data
• Writes out data (after processing)
• Sends control signals to other units
• Receives (& acts on) interrupts
Buses
• There are a number of possible
interconnection systems
• Single and multiple BUS structures are
most common
• e.g. Control/Address/Data bus (PC)
• e.g. Unibus (DEC-PDP)
What is a Bus?
• A communication pathway connecting two
or more devices
• Usually broadcast
• Often grouped
—A number of channels in one bus
—e.g. 32 bit data bus is 32 separate single bit
channels
• Power lines may not be shown
Data Bus
• Carries data
—Remember that there is no difference between
“data” and “instruction” at this level
• Width is a key determinant of
performance
—8, 16, 32, 64 bit
Address bus
• Identify the source or destination of data
• e.g. CPU needs to read an instruction
(data) from a given location in memory
• Bus width determines maximum memory
capacity of system
—e.g. 8080 has 16 bit address bus giving 64k
address space
—$2^{16} = 2^{10} \times 2^{6} = 2^{6} \times 2^{10} = 64\,\mathrm{k}$
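The relationship between address bus width and address space is easy to check numerically. A minimal sketch; the function name `address_space` is illustrative.

```python
def address_space(address_bus_width: int) -> int:
    """Number of distinct addresses reachable with the given bus width."""
    return 2 ** address_bus_width


locations = address_space(16)      # 16-bit address bus, as on the 8080
in_k = locations // 2 ** 10        # express the count in units of 1k = 2^10
```

Each extra address line doubles the number of locations that can be addressed.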
Control Bus
• Control and timing information
—Memory read/write signal
—Interrupt request
—Clock signals
Bus Interconnection Scheme
Big and Yellow?
• What do buses look like?
—Parallel lines on circuit boards
—Ribbon cables
—Strip connectors on mother boards
– e.g. PCI
—Sets of wires
Bus
Physical Realization of Bus Architecture
Single Bus Problems
• Lots of devices on one bus leads to:
—Propagation delays
– Long data paths mean that co-ordination of bus use
can adversely affect performance
– The bus can become a bottleneck if aggregate data transfer demand approaches bus capacity
• Most systems use multiple buses to
overcome these problems
Traditional (ISA) (with cache)
High Performance Bus
Bus Types
• Dedicated
—Separate data & address lines
• Multiplexed
—Shared lines
—Address valid or data valid control line
—Advantage - fewer lines
—Disadvantages
– More complex control
– Lower ultimate performance (address and data share lines, so they cannot be sent in parallel)
Bus Arbitration
• More than one module controlling the bus
• e.g. CPU and DMA controller
• Only one module may control bus at one
time
• Arbitration may be centralised or
distributed
Centralised or Distributed Arbitration
• Centralised
—Single hardware device controlling bus access
– Bus Controller
– Arbiter
—May be part of CPU or separate
• Distributed
—Each module may claim the bus
—Control logic on all modules
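A centralised arbiter can be sketched as a single object that decides which module controls the bus. The fixed-priority policy, the class name `CentralArbiter`, and its methods are assumptions made for illustration; real arbiters may use other policies.

```python
class CentralArbiter:
    """Hypothetical fixed-priority bus arbiter (illustrative only)."""

    def __init__(self, priority_order):
        # Earlier entries win when several modules request at once.
        self.priority_order = list(priority_order)
        self.owner = None   # module currently controlling the bus

    def request(self, requesters):
        """Grant the bus to one requester if the bus is free."""
        if self.owner is None:
            for module in self.priority_order:
                if module in requesters:
                    self.owner = module
                    break
        return self.owner

    def release(self):
        """Current owner gives up the bus."""
        self.owner = None


arbiter = CentralArbiter(["DMA", "CPU"])
first = arbiter.request({"CPU", "DMA"})   # both ask; DMA has priority
second = arbiter.request({"CPU"})         # bus busy, DMA keeps it
arbiter.release()
third = arbiter.request({"CPU"})          # bus free, CPU is granted
```

In a distributed scheme the same decision logic would be replicated on every module instead of living in one place.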
Timing
• Co-ordination of events on bus
• Synchronous
—Events determined by clock signals
—Control Bus includes clock line
—A single 1-0 transition of the clock is a bus cycle
—All devices can read clock line
—Usually sync on leading edge
—Usually a single cycle for an event
Synchronous Timing Diagram
Asynchronous Timing – Read Diagram
Asynchronous Timing – Write Diagram
Point-to-Point Interconnect
• Principal reason for change was the electrical constraints encountered with increasing the frequency of wide synchronous buses
• At higher and higher data rates it becomes increasingly difficult to perform the synchronization and arbitration functions in a timely fashion
• A conventional shared bus on the same chip magnified the difficulties of increasing bus data rate and reducing bus latency to keep up with the processors
• Has lower latency, higher data rate, and better scalability
Quick Path Interconnect (QPI)
• Introduced in 2008
• Multiple direct connections
—Direct pairwise connections to other components, eliminating the need for arbitration found in shared transmission systems
• Layered protocol architecture
—These processor-level interconnects use a layered protocol architecture rather than the simple use of control signals found in shared bus arrangements
• Packetized data transfer
—Data are sent as a sequence of packets, each of which includes control headers and error control codes
Figure 3.20 Multicore Configuration Using QPI (cores A to D, each with its own DRAM on a memory bus; QPI links between cores and to two I/O hubs; I/O devices attached to the I/O hubs via PCI Express)
QPI Layers
Figure 3.21 QPI Layers (top to bottom: Protocol, Routing, Link, Physical; packets are exchanged at the protocol layer, flits at the link layer, phits at the physical layer)
Flit – flow control unit
Phit – physical unit
Physical Interface of the Intel QPI Interconnect
Figure 3.22 Physical Interface of the Intel QPI Interconnect (Component A and Component B each have an Intel QuickPath Interconnect port; each direction has its own transmission and reception lanes plus a forwarded clock, Fwd Clk at the sender and Rcv Clk at the receiver)
QPI Multilane Distribution
Figure 3.23 QPI Multilane Distribution (the bit stream of flits is distributed across QPI lanes 0 to 19: lane 0 carries bits #1, #n+1, #2n+1; lane 1 carries #2, #n+2, #2n+2; lane 19 carries #n, #2n, #3n)
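The multilane distribution can be sketched as dealing bits across the 20 lanes in round-robin order. This is only a model of the ordering, not of the electrical interface; the function names are illustrative and the stream length is assumed to be a multiple of the lane count.

```python
LANES = 20  # QPI lanes 0 to 19


def distribute(bits: list[int], lanes: int = LANES) -> list[list[int]]:
    """Deal a bit stream across lanes in round-robin order.

    Assumes len(bits) is a multiple of `lanes`.
    """
    return [bits[lane::lanes] for lane in range(lanes)]


def collect(per_lane: list[list[int]]) -> list[int]:
    """Inverse of distribute: interleave the lanes back into one stream."""
    return [bit for group in zip(*per_lane) for bit in group]


# Number the bits 1..60 so the lane contents are easy to read.
stream = list(range(1, 61))
per_lane = distribute(stream)
```

With 20 lanes, lane 0 carries bits 1, 21, 41 and lane 19 carries bits 20, 40, 60, so an 80-bit flit occupies four bit times on every lane.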
QPI Link Layer
• Performs two key functions: flow control and error control
—Operate on the level of the flit (flow control unit)
—Each flit consists of a 72-bit message payload and an 8-bit error control code called a cyclic redundancy check (CRC)
• Flow control function
—Needed to ensure that a sending QPI entity does not overwhelm a receiving QPI entity by sending data faster than the receiver can process the data and clear buffers for more incoming data
• Error control function
—Detects and recovers from bit errors, and so isolates higher layers from experiencing bit errors
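The flit layout (72-bit payload plus 8-bit CRC) can be sketched as follows. The polynomial 0x07 is a generic CRC-8 choice assumed for illustration, not the polynomial QPI actually uses, and the recovery step (retransmission) is not modelled; the function names are hypothetical.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0 (polynomial is assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_flit(payload: bytes) -> bytes:
    """Append an 8-bit CRC to a 72-bit (9-byte) payload: 80 bits total."""
    if len(payload) != 9:
        raise ValueError("payload must be exactly 72 bits (9 bytes)")
    return payload + bytes([crc8(payload)])


def flit_ok(flit: bytes) -> bool:
    """Receiver-side check: recompute the CRC and compare."""
    return len(flit) == 10 and crc8(flit[:9]) == flit[9]


flit = make_flit(bytes(range(9)))
corrupted = bytes([flit[0] ^ 0x01]) + flit[1:]   # flip one payload bit
```

A receiver that sees `flit_ok` fail knows the flit was damaged in transit and can ask for it again, which is how the link layer hides bit errors from the layers above.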
QPI Routing and Protocol Layers
Routing Layer
• Used to determine the
course that a packet will
traverse across the
available system
interconnects
• Routing tables are defined by
firmware and describe the
possible paths that a
packet can follow
Protocol Layer
• Packet is defined as the
unit of transfer
• One key function
performed at this level
is a cache coherency
protocol which deals
with making sure that
main memory values
held in multiple caches
are consistent
• A typical data packet
payload is a block of
data being sent to or
from a cache
PCI Bus
• Peripheral Component Interconnect
• Intel released to public domain
• 32 or 64 bit
• 50 lines
PCI Express Configuration
Figure 3.24 Typical Configuration Using PCIe (cores and memory attached to a chipset; PCIe links from the chipset to Gigabit Ethernet, a PCIe–PCI bridge, memory, and a switch; the switch fans out to PCIe endpoints and a legacy endpoint)
PCIe Protocol Layers
Figure 3.25 PCIe Protocol Layers (Transaction, Data Link, and Physical layers on each side of the link; transaction layer packets (TLP) pass between the Transaction layers, data link layer packets (DLLP) between the Data Link layers)
PCIe Protocol layers
• Physical: Actual wires carrying the signals, and
circuitry and logic to support the transmission and
receipt of the 1s and 0s.
• Data link: Responsible for reliable transmission
and flow control. Data packets are called Data Link Layer
Packets (DLLPs).
• Transaction: Generates and consumes data packets
for load/store data transfer mechanisms, manages the
flow control of packets between the two components
on a link. Data packets are called Transaction Layer
Packets (TLPs).
PCIe Multilane Distribution
Figure 3.26 PCIe Multilane Distribution (the byte stream B0 to B7 is distributed round robin across PCIe lanes 0 to 3: lane 0 carries B0 and B4, lane 1 carries B1 and B5, lane 2 carries B2 and B6, lane 3 carries B3 and B7; each lane applies 128b/130b encoding)
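The round-robin byte distribution in Figure 3.26 can be sketched in a few lines, together with the overhead of 128b/130b encoding (2 extra bits per 128-bit block). The function names are illustrative, and the encoding helper only counts bits; it does not perform scrambling or framing.

```python
def stripe_bytes(data: bytes, lanes: int = 4) -> list[bytes]:
    """Distribute a byte stream round robin across the given lanes."""
    return [data[lane::lanes] for lane in range(lanes)]


def encoded_bits(payload_bits: int) -> int:
    """Bits on the wire for 128b/130b: 2 extra bits per 128-bit block.

    Assumes payload_bits is a whole number of 128-bit blocks.
    """
    blocks = payload_bits // 128
    return blocks * 130


lanes = stripe_bytes(bytes(range(8)))   # bytes B0..B7
```

Here `lanes[0]` holds B0 and B4 and `lanes[3]` holds B3 and B7, matching the figure.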
The TL supports four address spaces:
• Memory
— The memory space
includes system main
memory and PCIe I/O
devices
— Certain ranges of
memory addresses map
into I/O devices
• Configuration
— This address space
enables the TL to
read/write
configuration registers
associated with I/O
devices
• I/O
— This address space is
used for legacy PCI
devices, with reserved
address ranges used to
address legacy I/O
devices
• Message
— This address space is
for control signals
related to interrupts,
error handling, and
power management
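PCIe Protocol Data Unit Format
Figure 3.28 PCIe Protocol Data Unit Format (field sizes in octets)
(a) Transaction Layer Packet
—STP framing: 1 (appended by Physical Layer)
—Sequence number: 2 (appended by Data Link Layer)
—Header: 12 or 16 (created by Transaction Layer)
—Data: 0 to 4096 (created by Transaction Layer)
—ECRC: 0 or 4 (created by Transaction Layer)
—LCRC: 4 (appended by Data Link Layer)
—STP framing: 1 (appended by Physical Layer)
(b) Data Link Layer Packet
—Start: 1 (appended by PL)
—DLLP: 4 (created by DLL)
—CRC: 2 (created by DLL)
—End: 1 (appended by PL)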
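Using the field sizes from Figure 3.28 (framing 1 + 1, sequence number 2, header 12 or 16, data 0 to 4096, ECRC 0 or 4, LCRC 4 octets), the size of a TLP on the link can be worked out per layer. A minimal sketch; the function name `tlp_octets` and the example values are illustrative.

```python
def tlp_octets(header: int, data: int, ecrc: bool) -> dict[str, int]:
    """Octets contributed by each layer for one TLP on the link."""
    if header not in (12, 16):
        raise ValueError("header must be 12 or 16 octets")
    if not 0 <= data <= 4096:
        raise ValueError("data must be 0 to 4096 octets")
    transaction = header + data + (4 if ecrc else 0)
    data_link = 2 + 4        # sequence number + LCRC
    physical = 1 + 1         # framing at start and end
    return {
        "transaction": transaction,
        "data_link": data_link,
        "physical": physical,
        "total": transaction + data_link + physical,
    }


sizes = tlp_octets(header=16, data=64, ecrc=True)
```

For small payloads the fixed per-packet overhead is a noticeable share of the total, which is one reason larger transfers use the link more efficiently.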
TLP Memory Request Format
Figure 3.29 TLP Memory Request Format (16 octets, drawn as four 32-bit rows; fields: Fmt, Type, Traffic Class, TD, EP, Attr, Length, Requestor ID, Tag, Last DW BE, First DW BE, Address [63:32], Address [31:2]; R marks reserved bits)
Summary
• Computer
components
• Computer function
— Instruction fetch and
execute
— Interrupts
— I/O function
• Interconnection
structures
• Bus interconnection
— Bus structure
— Multiple bus hierarchies
— Elements of bus design
— Point-to-point
interconnect
— QPI physical layer
— QPI link layer
— QPI routing layer
— QPI protocol layer
— PCI Express
— PCI physical and
logical architecture
— PCIe physical layer
— PCIe transaction layer
— PCIe data link layer