Windows 2000 IO Performance
Leonard Chung & Jim Gray
4/5/2000
Study Goals

Repeat and extend the Riedel et al. paper.
Many things have changed:
– Software: Windows 2000 instead of NT4SP3
– Hardware: New, faster drives and standards

3 main testing scenarios:
– old-old: “old” machine with NT4SP6
– old-new: “old” machine with Win2000
– new-new: “new” machine with Win2000
Hardware Configurations

“old” hardware:
– 333 MHz PII
– 4 x 7200 RPM UW SCSI drives
– 128 MB SDRAM

“new” hardware:
– 2 x 733 MHz PIII
– 4 x 10,000 RPM Ultra160 SCSI drives
– 256 MB RDRAM
– 4 x 5400 RPM UltraATA/66 IDE drives on a 3ware card
Primary Test Tools
SQLIO – the primary test tool
CacheFlush – buffered sequential
DiskCache – PCI/host adapter throughput
Memspeed – memory subsystem
Testing Methodology

Before each test:
– Drive formatted
– Test files copied in same order
– Test run

Sequential test files are placed on the outer edge of the disk, giving the disk’s maximum performance and consistent results.
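The charts later in this deck plot throughput against request size. As a rough illustration of that kind of sweep, the sketch below times sequential reads of one file at several request sizes. It is not SQLIO, whose options are not shown in these slides; the function names, the 4 MB temporary file, and the use of ordinary file reads (which still go through the operating system's file cache) are all assumptions made for illustration, so the numbers it prints say nothing about the unbuffered results in the study.

```python
import os
import tempfile
import time


def sequential_read_mb_s(path, request_kb):
    """Read the file front to back in request_kb-sized requests; return MB/s."""
    request_bytes = request_kb * 1024
    total = 0
    start = time.perf_counter()
    # buffering=0 removes Python's own buffer; the OS file cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(request_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


def sweep(path, request_sizes_kb=(2, 4, 8, 16, 32, 64, 128, 256)):
    """Measure throughput at each request size (sizes match the chart axes)."""
    return {kb: sequential_read_mb_s(path, kb) for kb in request_sizes_kb}


if __name__ == "__main__":
    # Hypothetical 4 MB test file; the study used far larger files on a freshly formatted disk.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 * 1024 * 1024))
    try:
        for kb, mb_s in sweep(tmp.name).items():
            print(f"{kb:>4} KB requests: {mb_s:10.1f} MB/s")
    finally:
        os.remove(tmp.name)
```

Reformatting the drive and copying files in the same order before each run, as the methodology above describes, is what keeps such a sweep comparable from one run to the next.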
Media Banding

Modern disks are zoned
– More bits stored on outer tracks + constant angular velocity = fast outer tracks
– We’ve measured inner tracks on some drives being up to 40% slower than the outer tracks
– A “normal” disk map…
[Figure: Media Banding, a “normal” disk map]
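The banding effect follows from simple arithmetic: at a constant rotation rate, the media transfer rate of a track is the number of bytes on that track times revolutions per second. The sketch below works this through for one outer and one inner zone. The sector counts (300 and 180 per track) are hypothetical values chosen only so that the inner zone comes out 40% slower; they are not taken from any drive in the study.

```python
def track_rate_mb_s(sectors_per_track, rpm, sector_bytes=512):
    """Media rate of one track: bytes per revolution times revolutions per second."""
    revs_per_second = rpm / 60
    return sectors_per_track * sector_bytes * revs_per_second / 1e6


# Hypothetical zone geometry for a 10,000 RPM drive (illustrative only).
outer = track_rate_mb_s(sectors_per_track=300, rpm=10_000)
inner = track_rate_mb_s(sectors_per_track=180, rpm=10_000)
slowdown = 1 - inner / outer

print(f"outer zone: {outer:.1f} MB/s")
print(f"inner zone: {inner:.1f} MB/s")
print(f"inner tracks are {slowdown:.0%} slower")
```

This is why the sequential test files were kept on the outer edge of the disk: the same file placed on an inner zone would report a lower rate for reasons unrelated to the software under test.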
Overall Findings

Changes in throughput performance are incremental rather than radical
– Trendlines have the same general shape
– Most of Riedel’s model still holds
Hardware Bandwidth (RAP)
System Bandwidth: What Riedel Saw
in megabytes per second
Hard Disk: 9 per disk | SCSI: 30 | PCI: 72 | Memory: 140 | Processor
Hardware Bandwidth (PAP)
System Bandwidth Yesterday
in megabytes per second
Hard Disk: 15 per disk | SCSI: 40 | PCI: 133 | Memory: 422 | Processor
The familiar bandwidth pyramid: the farther from the CPU, the less the bandwidth.
Hardware Bandwidth (PAP)
System Bandwidth Today
in megabytes per second
Hard Disk: 26 per disk | SCSI: 160 | PCI: 133 | Memory: 1,600 | Processor
The familiar pyramid is gone! PCI is now the bottleneck!
In practice, 3 disks can reach saturation using sequential IO.
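The "pyramid" claim can be checked mechanically: a pyramid means bandwidth never decreases as data moves from the disk toward the CPU, and the bottleneck is the narrowest link that all disks share. The sketch below applies both checks to the peak advertised figures from the "Yesterday" and "Today" slides. Which number belongs to which stage is an interpretation of the flattened figures, and the function names are illustrative.

```python
# Peak advertised bandwidth (MB/s) as read off the slides; the per-stage
# assignment of each number is an interpretation of the flattened figures.
STAGES = ("Hard Disk", "SCSI", "PCI", "Memory")
PAP_YESTERDAY = (15, 40, 133, 422)
PAP_TODAY = (26, 160, 133, 1600)


def is_pyramid(bandwidths):
    """True when bandwidth never drops while moving toward the CPU."""
    return all(a <= b for a, b in zip(bandwidths, bandwidths[1:]))


def narrowest_shared_link(bandwidths):
    """Slowest stage after the individual disk (SCSI, PCI, or memory)."""
    shared = zip(STAGES[1:], bandwidths[1:])
    return min(shared, key=lambda stage: stage[1])


for label, bw in (("yesterday", PAP_YESTERDAY), ("today", PAP_TODAY)):
    name, mb_s = narrowest_shared_link(bw)
    print(f"{label}: pyramid={is_pyramid(bw)}, narrowest shared link={name} ({mb_s} MB/s)")
```

With yesterday's figures the SCSI bus is the narrowest shared link and the ordering is a pyramid; with today's figures the 160 MB/s SCSI bus sits in front of a 133 MB/s PCI bus, so the ordering breaks and PCI becomes the limit.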
Hardware Bandwidth (PAP)
System Bandwidth Today
in megabytes per second
Hard Disk: 26 per disk | SCSI: 160 | PCI: 133 | Memory: 1,600 | Processor
Possible solutions:
– A fatter, 64bit 66MHz PCI bus (532 in place of 133), or…
– multiple PCI busses (133 each, each with its own 160 SCSI bus)
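The peak figures for the two PCI variants come from bus width times clock rate, with one transfer per clock. A minimal sketch of that arithmetic follows; the clock values 33.33 and 66.66 MHz are the usual nominal PCI clocks and are an assumption here, since the slides give only "33MHz" and "66MHz".

```python
def pci_peak_mb_s(width_bits, clock_mhz):
    """Peak PCI transfer rate in MB/s: one bus-width transfer per clock."""
    return (width_bits / 8) * clock_mhz


standard = pci_peak_mb_s(32, 33.33)   # PC-standard 32bit, 33MHz bus
wide_fast = pci_peak_mb_s(64, 66.66)  # 64bit, 66MHz bus

print(f"32bit/33MHz: {standard:.0f} MB/s")
print(f"64bit/66MHz: {wide_fast:.0f} MB/s")
print(f"ratio: {wide_fast / standard:.1f}x")
```

Doubling both the width and the clock gives four times the peak rate, which matches the slide's 532 MB/s as four times 133 MB/s.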
Hardware Bandwidth (RAP)
System Bandwidth Today (reads)
Numbers we’ve seen
in megabytes per second
Hard Disk: 24 each | SCSI: 98.5 | PCI: 98.5 | Memory: 975 | Processor
old-old:
NT4SP3 vs. NT4SP6
Unbuffered reads and WCE (write cache enabled) writes no longer show a decrease in throughput
Buffered read bug is gone
Overheads are different

[Figures: NT4SP3 vs. NT4SP6 unbuffered read throughput at various request depths (1 fast disk); NT4SP3 vs. NT4SP6 buffered throughput and buffered overhead. Axes: Throughput (MB/s) or Overhead (CPU ms/MB) vs. Request Size (KB). Series: read, write, write (WCE), at request depths from 1 to 8.]
old-new:
Windows 2000

Software: Major changes, minor differences
– Dmio: The volume manager for Win2K
• More fixed overhead than ftdisk due to longer code paths
• More features than ftdisk (dynamically size volumes, etc.)
– In the end, performance is the same.
• Processors are fast enough that there are more than enough cycles to spare.
new-new:
Windows 2000

Hardware: The American Way
– Faster, bigger, cheaper
• Disks are now 4 times bigger and 3 times faster.
• SCSI bus bandwidth has surpassed the PC-standard 32bit, 33MHz PCI bus bandwidth.
• Random IO is unaffected by the PCI bottleneck.
• An additional SMP processor provided no additional throughput gains.
new-new:
Windows 2000 Scalability
PCI Bottleneck
[Figures: Win2K Dynamic 1 disk, 2 disk, and 3 disk unbuffered throughput; Win2K 3 disk Dynamic, 1 disk Basic: 4 disk unbuffered throughput. Axes: Throughput (MB/s) vs. Request Size (KB). Series: read, write, write (WCE).]
new-new:
Windows 2000 & IDE

The real IO revolution: RAID priced for the masses!
The good news:
– IDE disks are cheap
• We bought 5400 RPM IDE 27GB drives for $209 ($7.75/GB) while our 10,000 RPM 18GB SCSI drive cost $534 ($30/GB)
• IDE costs $3.17 per Kaps while SCSI costs $5.09 per Kaps.
• Today, IDE is $6,500 per TB while SCSI costs $16,000 per TB.
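The per-gigabyte prices quoted above are simply purchase price divided by capacity. A small sketch of that arithmetic, using only the prices and capacities from the slide (the function name is illustrative):

```python
def dollars_per_gb(price_usd, capacity_gb):
    """Purchase price divided by capacity."""
    return price_usd / capacity_gb


ide = dollars_per_gb(209, 27)    # 5400 RPM IDE drive
scsi = dollars_per_gb(534, 18)   # 10,000 RPM SCSI drive

print(f"IDE:  ${ide:.2f}/GB")
print(f"SCSI: ${scsi:.2f}/GB")
print(f"SCSI costs {scsi / ide:.1f}x as much per GB")
```

The exact quotients are about $7.74/GB and $29.67/GB, which the slide rounds to $7.75 and $30.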
new-new:
Windows 2000 & IDE

IDE Performance:
– Single disk random IO performance on a 5400 RPM IDE is much slower than a 10,000 RPM SCSI.
• However, multiple IDE disks can provide up to 60% more Kaps for the same price as a single SCSI disk.
[Figures: Fireball IDE IO/s vs. Depth; Atlas 10K SCSI IO/s vs. Depth. Axes: IO/s vs. Request Depth (1 to 32). Series: read, write.]
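The "60% more Kaps for the same price" figure is consistent with the cost-per-Kaps numbers quoted earlier in the deck ($3.17 for IDE, $5.09 for SCSI): accesses per dollar is the reciprocal of dollars per access. The sketch below does that division; reading Kaps as kilobyte accesses per second is an assumption, since the slides do not define the term.

```python
# Dollars per Kaps, as quoted in the deck.
IDE_DOLLARS_PER_KAPS = 3.17
SCSI_DOLLARS_PER_KAPS = 5.09


def kaps_per_dollar(dollars_per_kaps):
    """Random-access capability bought by one dollar."""
    return 1.0 / dollars_per_kaps


advantage = kaps_per_dollar(IDE_DOLLARS_PER_KAPS) / kaps_per_dollar(SCSI_DOLLARS_PER_KAPS) - 1
print(f"IDE delivers {advantage:.0%} more Kaps per dollar than SCSI")
```

So even though one IDE disk is slower at random IO, spending the SCSI drive's price on several IDE disks buys more total accesses per second.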
new-new:
Windows 2000 & IDE

IDE Performance:
– Single disk sequential IO throughput on a 5400 RPM IDE drive is 80% of the more expensive 10,000 RPM SCSI drive.
[Figures: Win2K 1 disk 3ware IDE unbuffered throughput (1 deep read, 2 deep read, 1 deep write (WCE)); Win2K 1 disk SCSI unbuffered throughput (1 and 2 deep read, 1 and 2 deep write). Axes: Throughput (MB/s) vs. Request Size (KB).]
new-new:
Windows 2000 & IDE

Price/Performance for IDE is hard to beat
– Performance
• For sequential and random IO, IDE is the price/performance leader
• Overhead for SCSI and 3ware/DMA IDE is the same.
– Capacity
• 69GB (~2.5 disks worth) of Quantum Fireball lct08s costs the same as one Quantum Atlas 10K 18GB disk.
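The capacity comparison can be reproduced from the prices quoted earlier in the deck, assuming the Fireball lct08 is the $209, 27GB IDE drive and the Atlas 10K is the $534, 18GB SCSI drive. That pairing is an inference from the matching numbers, not something the slides state outright.

```python
# Prices and capacity quoted earlier in the deck; the drive-model pairing is assumed.
IDE_PRICE_USD, IDE_CAPACITY_GB = 209, 27
SCSI_PRICE_USD = 534

ide_disks_for_scsi_price = SCSI_PRICE_USD / IDE_PRICE_USD
ide_capacity_gb = ide_disks_for_scsi_price * IDE_CAPACITY_GB

print(f"{ide_disks_for_scsi_price:.2f} IDE disks for the price of one SCSI disk")
print(f"{ide_capacity_gb:.0f} GB of IDE capacity for the same money")
```

The result, roughly 2.5 disks and 69GB, matches the figures on the slide.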
new-new:
Windows 2000 & IDE

The bad news about IDE
– The quality of IDE controllers varies
Revolutions are being missed due to slow controller
new-new:
Windows 2000 & IDE

The bad news about IDE
[Figure: Western Digital Caviar 30GB 3ware unbuffered read throughput, 1 deep. Axes: Throughput (MB/s) vs. Request Size (KB). Callouts: “Missing every other revolution” and “Missing multiple revolutions”.]
High controller overhead is causing the disk to miss revolutions at small request sizes.
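One way to picture a missed revolution is a toy model of 1-deep sequential reads: if the next request arrives after the head has already passed the next sector, the disk waits whole revolutions for that sector to come around again. The sketch below is such a model, not a reconstruction of the measured curve. The 5400 RPM rotation rate, the 20 MB/s media rate, and the 0.5 ms per-request overhead are all hypothetical, and the model ignores any read-ahead buffering in the drive.

```python
import math


def one_deep_read_mb_s(request_kb, rpm, media_mb_s, overhead_ms):
    """Toy model: each request pays its transfer time plus whole lost revolutions."""
    rev_ms = 60_000.0 / rpm
    request_mb = request_kb / 1024.0
    transfer_ms = request_mb / media_mb_s * 1000.0
    # Any overhead at all costs at least one full revolution in this model.
    lost_revs = math.ceil(overhead_ms / rev_ms)
    period_ms = transfer_ms + lost_revs * rev_ms
    return request_mb / (period_ms / 1000.0)


# Hypothetical drive and controller parameters (illustrative only).
for kb in (2, 4, 8, 16, 32, 64, 128, 256):
    rate = one_deep_read_mb_s(kb, rpm=5400, media_mb_s=20.0, overhead_ms=0.5)
    print(f"{kb:>4} KB requests: {rate:5.2f} MB/s")
```

In this model the lost revolution is a fixed cost per request, so small requests suffer most and throughput climbs toward the media rate as requests grow, the same general shape the slide attributes to high controller overhead.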
new-new:
Windows 2000 & IDE (3ware)

The bad news about IDE
– IDE RAID isn’t as mature as SCSI
• Driver bugs and incompatibilities
• Problems with multiple IDE drives
– IDE spec gives 18” as the max cable length: getting cables to drives can be a chore
– Avoid master/slave: reliability and possibly performance are lost
– No hot swap
new-new:
Windows 2000 & IDE (3ware)

The bad news about IDE
– RAID isn’t as mature as SCSI
• 3ware’s card peaks at 55 MB/s for reads and 40 MB/s for writes; it takes 3 disks to reach that peak for reads and 2 for writes.
[Figures: Win2K 2 disk, 3 disk, and 4 disk 3ware hardware RAID0 unbuffered read and unbuffered write throughput. Axes: Throughput (MB/s) vs. Request Size (KB). Series: 1, 2, 4, and 8 deep.]
Where do we go from here?

Network IO over Gigabit
– OOB performance and slight tuning

Sqlio2: a complete rewrite of SQLIO
And in conclusion…

NT4SP6
– Unbuffered requests at 2KB and 4KB request sizes no longer show a throughput dip
– Buffered read request bug gone
– Buffered overhead appears to be lower

Windows 2000
– Despite dmio replacing ftdisk, throughput remains unaffected
And in conclusion…

new-new SCSI performance
– PCI is now the bottleneck with 3 drives able to reach saturation

new-new IDE
– IDE shows a lot of promise: cheap storage and good performance
– Difficulty lies with multiple disks
• IDE RAID cards not quite ready for prime time
• Physically wiring the drives