Modern Computer Architecture (Šiuolaikinių kompiuterių architektūra)

COMPUTER ARCHITECTURE (P175B125)
Assoc. Prof. Stasys Maciulevičius
Computer Dept.
Virtual memory

Modern computers can run several programs
simultaneously (in parallel or pseudo-parallel mode)

Each such program (process) has a separate code
and data area

Virtual memory is the mechanism that lets several
processes run simultaneously: it shares memory among
them correctly and translates the logical addresses of
information into physical addresses
2009-2013
©S.Maciulevičius
2
Virtual memory

Moreover, virtual memory ensures that the information
a process needs (program code and data) is loaded into
main memory at the appropriate time, and it protects the
memory space of each process from other processes

From a physical point of view, virtual memory is the main
memory plus part of the external memory, together
with the tools for address translation and for
exchanging information between these levels

From a logical point of view, virtual memory is an extended
memory space with contiguous addressing
Virtual memory
Two main principles are used to implement
virtual memory:
1) Segmentation - an application's virtual address space is
divided into variable-length segments. A virtual address
consists of a segment number and an offset within the
segment. A task may have several segments: code
(program), data, stack.
2) Paging - an application's virtual address space is divided
into fixed-size pages; a page is a block of contiguous
virtual memory addresses, usually at least 4 KB in size.
The pages do not have to be contiguous in physical memory
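As a rough illustration of paging, the sketch below splits a virtual address into a page number and a byte offset and maps it through a page table. The 4 KB page size comes from the slide; the contents of `page_table` and the function name `translate` are made-up values for illustration only.

```python
PAGE_SIZE = 4 * 1024                      # 4 KB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 offset bits for a 4 KB page

# Hypothetical page table: virtual page number -> physical frame number.
# Neighbouring pages deliberately map to non-neighbouring frames.
page_table = {0: 7, 1: 2, 2: 9}


def translate(vaddr):
    """Return the physical address for a virtual address."""
    page = vaddr >> OFFSET_BITS       # upper bits: page number
    offset = vaddr & (PAGE_SIZE - 1)  # lower 12 bits: byte offset
    frame = page_table[page]          # a KeyError here means "page not mapped"
    return (frame << OFFSET_BITS) | offset
```

For example, `translate(0x1234)` looks up page 1 and keeps the offset `0x234` unchanged.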
Main and external memory
[Diagram: memory hierarchy from CPU registers through cache and main memory to external memory. A segment (page) is swapped in from external memory to main memory and swapped out from main memory to external memory]
Segmented virtual memory
[Diagram: several snapshots of main memory, each holding the operating system and the segments of different processes (1st to 6th) as processes are swapped in and out]
Segmenting
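Virtual address (32 bits): bits 31-24 hold the segment No. (8 bits), bits 23-0 hold the byte address (24 bits)

Segment table (32-bit base per entry):
Segm. No.   Base
0           20000H
1           4F000H
2           8000H
…           …

[Diagram: the segment No. selects a base in the segment table; in memory, segment 2 starts at 8000, segment 0 at 20000 and segment 1 at 4F000]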
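The lookup on the Segmenting slide can be sketched as follows. The 8-bit/24-bit address split and the three base addresses are taken from the slide; the function name `translate` is an assumption, and the sketch ignores segment limits.

```python
# Segment bases from the slide's segment table (hexadecimal).
SEGMENT_TABLE = {0: 0x20000, 1: 0x4F000, 2: 0x8000}


def translate(vaddr):
    """Map a 32-bit virtual address to a physical address."""
    segment = (vaddr >> 24) & 0xFF   # bits 31-24: segment number
    offset = vaddr & 0xFFFFFF        # bits 23-0: byte address within the segment
    return SEGMENT_TABLE[segment] + offset
```

A virtual address with segment number 1 and byte address 10H therefore lands at 4F010H.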
Simple mechanism of segmentation
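[Diagram: the segment field of the address selects, through one multiplexer (MUX), the length of the program, data or stack segment and, through a second MUX, the corresponding base. A comparator checks the offset against the selected length and raises a fault (labeled "Page fault" on the slide) if it is out of range; a summator adds the offset to the selected base to form the physical address]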
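A minimal software sketch of this mechanism, assuming three segments with made-up bases and lengths: the dictionary lookup plays the role of the two multiplexers, the comparison is the comparator and the addition is the summator. All names and numbers here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base: int    # start of the segment in physical memory
    length: int  # segment size in bytes


class SegmentFault(Exception):
    """Raised when the offset lies outside the selected segment."""


# Hypothetical segment registers for one task.
SEGMENTS = {
    "program": Segment(base=0x1000, length=0x400),
    "data": Segment(base=0x2000, length=0x800),
    "stack": Segment(base=0x8000, length=0x100),
}


def translate(segment, offset):
    seg = SEGMENTS[segment]            # MUX: pick base and length
    if offset >= seg.length:           # comparator: bounds check
        raise SegmentFault(f"offset {offset:#x} outside {segment} segment")
    return seg.base + offset           # summator: base + offset
```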
Segmentation principles in IA-32
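[Diagram: a logical address consists of a 16-bit selector (bits 15-0) and a 32-bit effective address (bits 31-0). The selector picks a segment descriptor in the descriptor table; the base address from the descriptor is added to the effective address to give the 32-bit linear (physical) address]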
Segmentation mechanism in IA-32
In the IA-32 architecture segmentation is supported by the
following tools:
• a mechanism for calculating the physical address
• segment descriptor tables:
  • local descriptor table (LDT)
  • global descriptor table (GDT)
  • interrupt descriptor table (IDT)
• a privilege system
Each table has a processor register assigned to it,
which holds:
• a 16-bit limit (size of the table)
• a 32-bit base address (table location in memory)
Address spaces in IA-32
The IA-32 architecture has three address spaces:
• logical address space; a logical (or virtual) address
consists of two integers: a 16-bit segment selector and a
32-bit offset; the space size is 2^14 selectors × 4 GB = 64 TB
• linear address space; a linear address appears at the
output of the segmentation unit, as the result of logical
address translation;
• physical address space; a physical address appears at
the output of the paging unit; when paging is not
used, the physical address equals the linear address; this
address (bits BE7-BE0 and A31-A3) goes to main
memory
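The size of the logical address space can be checked with a few lines of arithmetic. The selector layout used below (13-bit index, one table-indicator bit, 2-bit requested privilege level) is the standard IA-32 format and is not spelled out on the slide; the function name `decode_selector` is an assumption.

```python
def decode_selector(selector):
    """Split a 16-bit IA-32 segment selector into its fields."""
    return {
        "index": selector >> 3,                       # bits 15-3: descriptor index
        "table": "LDT" if selector & 0x4 else "GDT",  # bit 2: table indicator
        "rpl": selector & 0x3,                        # bits 1-0: privilege level
    }


# 14 selector bits (index + table indicator) pick a segment,
# and each segment can span a full 32-bit offset range.
SEGMENTS = 1 << 14
SEGMENT_SPAN = 1 << 32                  # 4 GB
LOGICAL_SPACE = SEGMENTS * SEGMENT_SPAN
LOGICAL_SPACE_TB = LOGICAL_SPACE // (1 << 40)
```

`LOGICAL_SPACE_TB` evaluates to 64, matching the figure on the slide.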
Simple mechanism of paging
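[Diagram: the logical address is split into a logical page number and a byte offset. The page number indexes the page table, whose entry holds protection bits and a page frame number; the page frame number together with the unchanged byte offset forms the physical address]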
Segmentation with paging in IA-32
[Diagram: the 16-bit selector supplies a 14-bit descriptor index and the effective address supplies a 32-bit offset to the segmentation unit, which outputs a 32-bit linear address; the optional paging unit translates it into the 32-bit physical address]
Paging in IA-32
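[Diagram: the 32-bit linear address is split into a 10-bit directory index, a 10-bit page No. and a 12-bit byte offset. Control register CR3 (PDBR, the page directory physical base address) points to the page directory; the directory index selects a PDE, which points to a page table; the page No. selects a PTE, which points to a 4 KB page; the byte offset selects the target within that page]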
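The two-level walk can be sketched in software as below. The 10/10/12 bit split follows the slide; the nested dictionary standing in for the page directory and page tables, and the frame address in it, are hypothetical. Real PDEs and PTEs also carry flag bits, which this sketch leaves out.

```python
def split_linear(addr):
    """Split a 32-bit linear address into (directory, page, offset)."""
    directory = (addr >> 22) & 0x3FF  # bits 31-22: page directory index
    page = (addr >> 12) & 0x3FF       # bits 21-12: page table index
    offset = addr & 0xFFF             # bits 11-0: byte offset in the 4 KB page
    return directory, page, offset


# Hypothetical structures: directory index -> page table,
# page table index -> physical base address of a 4 KB frame.
page_directory = {1: {2: 0x0009A000}}


def translate(addr):
    directory, page, offset = split_linear(addr)
    page_table = page_directory[directory]  # PDE lookup (starts from CR3 in hardware)
    frame_base = page_table[page]           # PTE lookup
    return frame_base | offset
```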
Address translation
Address translation is the transformation of a virtual
address into a physical one.
The problem is the extra step: access to the page
table. How can memory access be sped up?
1. Storing the whole page table in the processor is unrealistic,
because the page table takes up megabytes. For example, if the page size is 4 KB,
4 GB of memory takes 4 GB / 4 KB = 1024 K pages!
2. Storing part of the page table in a special cache in the
processor. Each entry in this cache gives fast access
to a whole page, that is, to about 1000 words
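The arithmetic behind the first point can be written out as a short sketch. The 4-byte page table entry is an assumption (it matches 32-bit IA-32 paging but is not stated on the slide), as is the function name.

```python
def page_table_size(memory_bytes, page_bytes, entry_bytes=4):
    """Return (number of pages, page table size in bytes)."""
    pages = memory_bytes // page_bytes
    return pages, pages * entry_bytes


pages, table_bytes = page_table_size(4 * 2**30, 4 * 2**10)
table_mb = table_bytes / 2**20   # size of a flat page table in MB
```

With 4 GB of memory and 4 KB pages this gives 1024 K pages and, under the 4-byte assumption, a 4 MB table, far too large to keep inside the processor.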
Translation lookaside buffer (TLB)
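[Diagram: the effective address from the CPU is split into a logical page number and a byte offset. The page number is looked up in the TLB. On a hit, the TLB supplies the page frame number. On a miss, the page table is consulted and the entry is loaded into the TLB; if the page is not in main memory, it is swapped in from the hard disk. The page frame number (from the TLB or from the page table) and the byte offset form the physical address sent to main memory]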
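The hit/miss behaviour can be imitated with a small software model. This is only a sketch: the capacity, the least-recently-used replacement policy and all names are assumptions, and a missing page table entry (the "page not in main memory" case) simply surfaces as a `KeyError`.

```python
from collections import OrderedDict


class TLB:
    """Tiny TLB model: page number -> frame number, LRU replacement."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table   # full table, the slow path
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:                 # TLB hit
            self.hits += 1
            self.entries.move_to_end(page)       # mark as most recently used
            return self.entries[page]
        self.misses += 1                         # TLB miss: walk the page table
        frame = self.page_table[page]
        self.entries[page] = frame               # load the entry into the TLB
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)     # evict the least recently used
        return frame


def translate(tlb, vaddr, offset_bits=12):
    frame = tlb.lookup(vaddr >> offset_bits)
    return (frame << offset_bits) | (vaddr & ((1 << offset_bits) - 1))
```

Repeated accesses to the same page hit in the TLB and skip the page table walk, which is the speed-up the slide describes.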
TLB and cache
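[Diagram: the logical page number of the effective address is looked up in the TLB (index and tag comparison, "=?") while the index part of the address selects a cache line. On a TLB hit the resulting physical address is compared ("=?") with the cache tag; on a cache hit the data line is read and the byte No. selects the byte within it]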
Memory management unit
A memory management unit (MMU) is a
computer hardware component responsible for
handling accesses to memory requested by the
CPU
Its functions include:
• translation of virtual addresses to physical addresses
(i.e., virtual memory management)
• memory protection
• cache control
• bus arbitration
• bank switching, in simpler computer architectures
(especially 8-bit systems)
AMD K7 microarchitecture
Parity checking
• Parity checking is the use of parity bits
to check that data has been written and read
accurately
• A parity bit is added to every data unit
(typically a byte)
• The parity bit for each unit is set so that the unit,
together with its parity bit, has:
  • an odd number of set bits (odd parity), or
  • an even number of set bits (even parity)
Parity checking
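Example: data byte 0 1 0 1 0 1 1 0 followed by parity bit k
k = b0 ⊕ b1 ⊕ … ⊕ b7 (even parity)
or
k = 1 ⊕ b0 ⊕ b1 ⊕ … ⊕ b7 (odd parity)
[Diagram: the address register (AR) drives the address bus; n data bits pass between the data bus and the data register (DR) through a unit that generates and checks parity bits; k parity bits are stored in memory next to the data bytes, and a mismatch on reading raises the Error signal. Usually k = n/8]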
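The two parity formulas can be tried out with a short sketch. The function names are assumptions; the bit count modulo 2 is the same thing as XOR-ing all eight bits together.

```python
def parity_bit(byte, odd=False):
    """Return the parity bit k for one data byte."""
    k = bin(byte & 0xFF).count("1") & 1   # b0 xor b1 xor ... xor b7
    return k ^ 1 if odd else k            # odd parity adds the constant 1


def check(byte, k, odd=False):
    """True if the stored parity bit still matches the data byte."""
    return parity_bit(byte, odd) == k


k_even = parity_bit(0b01010110)           # the byte from the slide
k_odd = parity_bit(0b01010110, odd=True)
```

The byte 01010110 has four set bits, so its even-parity bit is 0 and its odd-parity bit is 1; flipping any single bit makes `check` fail.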
Error-checking and correcting
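Hamming code: an error-correcting code that can correct single-bit errors
and, with an additional parity bit, also detect double-bit errors
The number of check bits k for n data bits is usually k = log2 n + 1;
with the additional parity bit, k = log2 n + 2
[Diagram: on a write, the n data bits from the data register (DR) go to a unit that generates the k-bit Hamming code, and the k ECC bits are stored in memory next to the data bytes; on a read, a Hamming check unit recomputes the k bits, a correction unit repairs the n data bits, and Error signals are raised. AR is the address register driving the address bus]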
Hamming code
Bit   Binary   Check bit   Data bit
12    1100                 D8
11    1011                 D7
10    1010                 D6
9     1001                 D5
8     1000     K8
7     0111                 D4
6     0110                 D3
5     0101                 D2
4     0100     K4
3     0011                 D1
2     0010     K2
1     0001     K1
Hamming code detects up to two simultaneous bit
errors, and corrects single-bit errors
Generating of Hamming code
K1 = D1 ⊕ D2 ⊕ D4 ⊕ D5 ⊕ D7
K2 = D1 ⊕ D3 ⊕ D4 ⊕ D6 ⊕ D7
K4 = D2 ⊕ D3 ⊕ D4 ⊕ D8
K8 = D5 ⊕ D6 ⊕ D7 ⊕ D8
Let the data byte be 00111001, with bit D1 at the right. Then:
K1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
K2 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
K4 = 0 ⊕ 0 ⊕ 1 ⊕ 0 = 1
K8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
The data byte is saved in memory together with the control bits:
001101001111
Error correction
Suppose an error occurs, e.g., instead of:
001101001111
we have
001101101111
Recalculate the Hamming code from the received data bits:
K1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
K2 = 1 ⊕ 1 ⊕ 1 ⊕ 1 ⊕ 0 = 0
K4 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
K8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
Add the stored and the recalculated control bits modulo 2:
               K8 K4 K2 K1
stored          0  1  1  1
recalculated    0  0  0  1
sum             0  1  1  0   - this is the number of the faulty bit (0110 = bit 6)
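The generation and correction steps can be reproduced with a small sketch of the 12-bit code used on these slides. The coverage of each check bit and the bit positions are taken from the slides; the function names and the dictionary representation of a code word are assumptions made for readability.

```python
# Check bit -> data bits (D1..D8) it covers, as in the slide equations.
COVERAGE = {1: (1, 2, 4, 5, 7), 2: (1, 3, 4, 6, 7), 4: (2, 3, 4, 8), 8: (5, 6, 7, 8)}
# Position (1..12) of each data bit D1..D8 in the code word, from the slide table.
DATA_POS = {1: 3, 2: 5, 3: 6, 4: 7, 5: 9, 6: 10, 7: 11, 8: 12}


def check_bits(word):
    """Compute K1, K2, K4, K8 from the data bits of a code word."""
    return {k: sum(word[DATA_POS[d]] for d in ds) % 2 for k, ds in COVERAGE.items()}


def encode(byte):
    """Build the code word (position -> bit) for one data byte, D1 = lowest bit."""
    word = {DATA_POS[d]: (byte >> (d - 1)) & 1 for d in range(1, 9)}
    word.update(check_bits(word))       # K1, K2, K4, K8 sit at positions 1, 2, 4, 8
    return word


def syndrome(word):
    """Return the position of a single faulty bit, or 0 if the word is consistent."""
    recomputed = check_bits(word)
    return sum(k for k in COVERAGE if recomputed[k] != word[k])


def correct(word):
    """Return a copy of the word with a single-bit error repaired."""
    fixed = dict(word)
    position = syndrome(word)
    if position:
        fixed[position] ^= 1
    return fixed


def to_string(word):
    return "".join(str(word[p]) for p in range(12, 0, -1))
```

Encoding 00111001 gives the stored word from the slide, and flipping bit 6 yields a syndrome of 6, so `correct` restores the original word. This sketch handles single-bit errors only; it has no extra parity bit for double-error detection.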
Error correction: redundancy
Data    Single error correction      Single error correction,
bits                                 double error detection
        ECC bits   Redundancy, %     ECC bits   Redundancy, %
8       4          50.00             5          62.50
16      5          31.25             6          37.50
32      6          18.75             7          21.88
64      7          10.94             8          12.50
128     8          6.25              9          7.03
256     9          3.52              10         3.91
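The rows of the table follow from the formulas k = log2 n + 1 and k = log2 n + 2 given earlier. A minimal sketch, assuming n is a power of two (the function names are made up):

```python
def ecc_bits(n, double_detect=False):
    """Check bits for n data bits (n a power of two)."""
    log2_n = n.bit_length() - 1          # exact log2 for powers of two
    return log2_n + (2 if double_detect else 1)


def redundancy_percent(n, double_detect=False):
    """ECC bits as a percentage of the data bits."""
    return round(100 * ecc_bits(n, double_detect) / n, 2)


sec_bits = [ecc_bits(n) for n in (8, 16, 32, 64, 128, 256)]
```

For 64 data bits this gives 7 check bits and a redundancy of 7/64, about 10.94%.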
External memory
The following are used for long-term storage in
computers:
• hard drives
• CD-ROM, CDs (optical compact discs)
• DVDs
• flash memory
• floppy disks (outdated)
• streamers (tape drives).
External memory
Access modes:
• direct access
• sequential access
Parameters:
• capacity
• access time
• data transfer speed
• relative price
First HD
• IBM announced the IBM 350 storage unit
as a component of the RAMAC 305
computer system on September 13, 1956
• Assembled with covers, the 350 was 60
inches long, 68 inches high and 29 inches
deep
• It was configured with 50 magnetic disks
containing 50,000 sectors, each of which
held 100 alphanumeric characters, for a
capacity of 5 million characters
First HD
• 50 platters
• 1 head
First HD
• Disks rotated at 1,200 rpm; tracks (20 to
the inch) were recorded at up to 100 bits per
inch, and typical head-to-disk spacing was
800 microinches
• The execution of a "seek" instruction
positioned a read-write head to the track
that contained the desired sector and
selected the sector for a later read or write
operation
• Seek time averaged about 600 milliseconds
Hard disk drive
• Platters vary in size, and hard disk drives come in
two form factors, 5.25 in or 3.5 in
• Typically two, three or more platters are stacked
on top of each other on a common spindle
• The head flies a fraction of a millimetre above the
disk. On early hard disk drives this distance was
around 0.2 mm. In modern-day drives this has been
reduced to 0.07 mm or less
• There is a read/write head for each side of each
platter, mounted on arms
Hard disk drive
• The disk controller controls the drive's servomotors
and translates the fluctuating voltages from
the head into digital data for the CPU
• More often than not, the next set of data to be read
is sequentially located on the disk. For this reason,
hard drives contain between 256 KB and 8 MB of
cache buffer in which to store all the information in
a sector or cylinder in case it is needed. This is very
effective in speeding up both throughput and
access times
Technical specifications
• Capacity: Amount of data which can be stored on
a hard drive
• Transfer rate: Quantity of data which can be read
from or written to the disk per unit of time. It is
expressed in megabits per second (Mb/s)
• Rotational speed: The speed at which the platters
turn, expressed in rotations per minute (rpm for
short). Hard drive speeds are on the order of 7200
to 15000 rpm. The faster a drive rotates, the higher
its transfer rate. On the other hand, a hard drive
which rotates quickly tends to be louder and heats
up more easily
Technical specifications
• Latency (also called rotational delay): The length
of time that passes between the moment when the
head reaches the track and the moment the
data arrives under it
• Average access time: Average amount of time it
takes the read head to find the right track and
access the data. In other words, it represents the
average length of time it takes the disk to provide
data after having received the order to do so. It
must be as short as possible
• Radial density: number of tracks per inch (tpi).
• Linear density: number of bits per inch (bpi) on a given track.
• Surface density: product of the linear density and the radial density
(expressed in bits per square inch).
Technical specifications
• Cache memory: Amount of memory located on
the hard drive. Cache memory is used to store the
drive's most frequently accessed data, in order to
improve overall performance
• Interface: the connections used by the hard drive.
The main hard drive interfaces are IDE/ATA,
SATA, SCSI
Information on disk
• The data is organised in concentric circles called
"tracks"
• The tracks are divided into areas called
sectors, containing data (generally at least 512
bytes per sector)
• The term cylinder refers to all data found on the
same track of different platters
• The term cluster (also called allocation unit)
refers to the minimum area that a file can take up on
the hard drive
Hard disk
Formatted and unformatted disk
capacity
Capacity = Number_of_cylinders ×
Number_of_surfaces ×
Number_of_sectors_per_track ×
sector_size
Modern disk capacity is at least 500 GB;
advanced disks even reach 4 TB
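A small sketch of the capacity formula. It assumes the sector count is given per track on one surface, so that each factor is counted once; the geometry used in the example is hypothetical and not taken from the slides, and the function names are made up.

```python
def disk_capacity_bytes(cylinders, surfaces, sectors_per_track, sector_size=512):
    """Capacity = cylinders x surfaces x sectors per track x sector size."""
    return cylinders * surfaces * sectors_per_track * sector_size


def capacity_from_sectors(total_sectors, sector_size=512):
    """Capacity when only the total sector count is known."""
    return total_sectors * sector_size


# Hypothetical geometry: 16383 cylinders, 16 surfaces, 63 sectors per track.
example_bytes = disk_capacity_bytes(16383, 16, 63)
example_gb = example_bytes / 10**9
```

Real drives record more sectors on outer tracks than on inner ones, so for them the total sector count is the more reliable starting point, as in `capacity_from_sectors`.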
Hard disk
Access time depends on the following
parameters:
• cylinder seek time
• rotational delay (latency)
• transfer time
Information transfer time depends on:
• recording density and
• disk rotational speed
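The components of access time can be added up in a short sketch. It assumes that, on average, the disk has to turn half a revolution before the sector arrives under the head; this is a common textbook approximation rather than something stated on the slide, and the function names are made up.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational delay: half of one revolution, in milliseconds."""
    revolution_ms = 60_000 / rpm
    return revolution_ms / 2


def access_time_ms(seek_ms, rpm, transfer_ms=0.0):
    """Seek time + average rotational delay + transfer time."""
    return seek_ms + average_rotational_latency_ms(rpm) + transfer_ms
```

For example, `access_time_ms(9.5, 5400)` combines a 9.5 ms average seek with the roughly 5.6 ms average rotational delay of a 5400 rpm drive.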
Old disk - MD Maxtor 33073H3
Capacity                   30 GB
Integrated interface       ATA-5 / Ultra ATA/100
Buffer size / type         2 MB SDRAM
Surfaces / Heads           3
Platters                   2
Areal density              14.7 Gb / sq. in. max
Track density              34 000 tpi
Linear density             354 - 431 kb/in
Bytes per sector / Block   512
Sectors in track           373 - 746
Sectors in disk            60 032 448
Old disk - MD Maxtor 33073H3
Seek time (read op.)
  Track-track                                      1.0 ms
  Average                                          9.5 ms
Rotational speed (+ 0.1%)                          5400 RPM
Data transfer rate
  To/from interface (Ultra ATA/100, DMA M5)        up to 100 MB/s
  To/from interface (PIO 4 / Multi-word DMA M5)    up to 16.7 MB/s
  To/from medium                                   up to 46.7 MB/s
Start time                                         8.5 s
Recording methods
Traditional recording method: horizontal (longitudinal) recording
[Diagram: magnetic domains lying along the track, with their N and S poles side by side]
Now a new recording method is in use: vertical
(perpendicular) recording. The bits are arranged
vertically instead of horizontally in order to
take up less space. By 2010, perpendicular
densities are expected to exceed 500 Gb/sq. in.
[Diagram: magnetic domains standing perpendicular to the disk surface, with N and S poles stacked vertically]
Disk density
• Disk density is also called areal density
• How is this density calculated? It is usually
expressed through bits per inch (BPI) along a track
and tracks per inch (TPI)
• Multiplying TPI by BPI gives the
areal density
• RAMAC had an areal density of 2,000 bit/in²
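The product can be checked against figures quoted elsewhere in these slides: RAMAC's 20 tracks per inch at 100 bits per inch, and the Maxtor drive's 34 000 tpi at up to 431 kb per inch. The function name is an assumption.

```python
def areal_density(tpi, bpi):
    """Areal density in bits per square inch = tracks/inch x bits/inch."""
    return tpi * bpi


ramac = areal_density(20, 100)
maxtor = areal_density(34_000, 431_000)
maxtor_gb_per_sq_in = maxtor / 10**9
```

`ramac` comes out at 2,000 bit/in², and the Maxtor figure lands close to the 14.7 Gb/sq. in. maximum listed in its specification.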
Disk density - 2008
• In 2012 the highest areal density was around 625 Gb/in².
• HDD areal densities, which measure data-storage capacity, are
projected to climb to a maximum of 1800 Gb/in² per platter
by 2016, up from 744 Gb/in² in 2011, as shown in the
figure below
• This means that from 2011 to 2016, the five-year compound
annual growth rate (CAGR) for HDD areal densities will be
equivalent to 19%
• For this year, HDD areal densities are estimated to reach
780 Gb/in² per platter, and then rise to 900 Gb/in² next
year.
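The quoted growth rate can be verified from the two endpoint densities with the usual compound-growth formula. The function name is an assumption.

```python
def cagr(start, end, years):
    """Compound annual growth rate as a fraction."""
    return (end / start) ** (1 / years) - 1


growth_percent = cagr(744, 1800, 5) * 100   # 2011 -> 2016
```

Rounded to a whole number, `growth_percent` agrees with the 19% stated on the slide.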
Disk density
Disk capacity
Flash memory
• Flash memory refers to a particular type of EEPROM
(Electrically Erasable Programmable Read-Only
Memory). It is a memory chip that maintains stored
information without requiring a power source
• Flash memory differs from EEPROM in that
EEPROM erases its content one byte at a time. This
makes it slow to update. Flash memory can erase its
data in entire blocks, making it a preferable
technology for applications that require frequent
updating of large amounts of data
Flash memory
Flash memory combines several useful
features:
• high packing density (the cell is 30% smaller than
a DRAM cell)
• maintaining stored information without requiring a
power supply
• erasing and recording information using electrical
signals
• low energy consumption
• high reliability and
• low price
Flash memory
• Flash memory is used primarily as:
  • rarely rewritten memory (e.g., BIOS)
  • compact exchangeable memory in computers (USB
  keys)
  • compact exchangeable memory in PDAs, digital
  cameras, digital audio players etc.
• E.g., Kingston:
  • DataTraveler 200 has 32 GB-128 GB capacity
  (DataTraveler 300: 256 GB), 20 MB/s read and
  10 MB/s write speed
  • DataTraveler Vault has 256-bit AES hardware-based
  encryption and 2 GB-32 GB capacity
Solid state memory
• Solid state memory or a solid state drive (SSD) is a
device that uses no moving parts to store data
• The first ferrite memory SSD devices, or auxiliary memory
units as they were called at the time, emerged during the
era of vacuum tube computers
• In the 1970s and 1980s, SSDs were implemented in
semiconductor memory for early supercomputers of IBM,
Amdahl and Cray; however, their high price
meant that they were seldom used
• RAM "disks" were popular as boot media in the 1980s,
when hard drives were expensive and floppy drives were slow
Solid state memory
• 2004: Texas Memory Systems' RamSan-325 can carry
out 250,000 I/O operations a second.
• Available in capacities of 128, 96, 64, and 32 gigabytes,
RamSan-325 accelerates I/O intensive applications by
delivering random data at sustained rates exceeding 1.5
Gbps
• The non-volatile product has a high-availability architecture with
redundant and hot-swappable power supplies and redundant
batteries
• However, built from 512 Mb DDR RAM chips, the device was
quite expensive: a 16 GB device cost $36,000
Solid state memory
• Now flash memory is the medium for building solid state
memory devices
• These devices can range up to 512 GB (or even
more)
• Flash memory used as a hard drive has many
advantages over a traditional hard drive
• It is silent, much smaller than a traditional hard
drive, and highly portable, with a much faster
access time
• However, the advantages of a traditional hard
drive are price and capacity
SSD
• Most SSD manufacturers use non-volatile flash memory to
create more compact devices for the consumer market
• These flash memory-based SSDs, also known as flash
drives, do not require batteries. They are often packaged
in standard disk drive form factors (1.8-, 2.5-, and 3.5-inch)
• In addition, non-volatility allows flash SSDs to retain
memory even during sudden power outages, ensuring data
persistence
• Flash memory SSDs are slower than DRAM SSDs and
some designs are slower than even traditional HDDs on
large files, but flash SSDs have no moving parts and thus
seek times and other delays inherent in conventional
electro-mechanical disks are negligible
SSD prices – some facts
• In March 2007 SanDisk announced it was offering its 32 GB
2.5" SATA SSD to OEMs for $350. In July 2008 OCZ said
its fast Core series 2.5" SSDs were available at a price
of $169 for 32 GB
• October 2009: Active Media Products launched its Aviator
312 line of bus-powered fast USB 3.0 external SSDs with
R/W speeds up to 240 MB/s and 160 MB/s respectively.
Capacity options include 16 GB ($89), 32 GB ($119) and
64 GB ($209)
• 2013: Kingston SSDNow V300 Series SV300S37A/120G
2.5" 120 GB SATA III Internal SSD - $102.99
• SanDisk Extreme SDSSDX-480G-G25 2.5" 480 GB SATA
III SSD - $369.99
SSD and HD
Hybrid hard drive
• Certain technologies sit halfway between a hard
drive and a solid-state drive, such as the hybrid drive
and ReadyBoost
• A hybrid drive, sometimes called a hybrid hard drive,
uses a small SSD as a cache. The SSD is often
flash memory
• ReadyBoost is a part of the Microsoft Windows
Vista operating system that uses compatible flash
memory as a drive for a disk cache
• A random disk read from the cache is generally 80
to 100 times faster than a random disk read from a
traditional hard drive
Hybrid hard drive
[Diagram: a hybrid drive consists of an interface, a controller, a cache controller, flash memory and a hard disk]
Solid State Hybrid Drives
Adaptive Memory™ technology
• Adaptive Memory™ technology from Seagate
selectively tackles data that is frequently read and
time-consuming to fetch. Seagate SSHD drives
can then copy this data into the flash
• Adaptive Memory technology makes such efficient
use of the drive's solid state memory that only 4 GB
to 8 GB of flash capacity is actually needed. This
reduces costs so much that it is now practical to
employ enterprise-class SLC NAND flash memory,
the fastest and most reliable type of flash memory
on the market
Intel Smart Response Technology
• Smart Response Technology (SRT) is a proprietary
caching mechanism introduced in 2011 by Intel for
their Z68 chipset (for the Sandy Bridge-series
processors), which allows a SATA solid-state drive
(SSD) to function as cache for a (conventional,
magnetic) hard disk drive
• This provides the advantage of having a hard disk
drive (or a RAID volume) for maximum storage
capacity while delivering an SSD-like overall
system performance experience
Intel Smart Response Technology
Time to run          Cold boot   Unigine   Fallout 3   Photoshop Elements
No SSD Cache         28 sec      40 sec    13 sec      19 sec
SSD Cache - Pass 1   23 sec      35 sec    13 sec      19 sec
SSD Cache - Pass 2   18 sec      24 sec    8 sec       19 sec
SSD Cache - Pass 3   16 sec      24 sec    7 sec       18 sec
SSD Cache - Pass 4   15 sec      24 sec    7 sec       18 sec