Progress Dump & Loads

Download Report

Transcript Progress Dump & Loads

Database Dump/Load
Ben Holmes
Eaton Corp
© 2002 Eaton Corporation. All rights reserved.
About Ben

Currently with Eaton Corp FPG division as a Sr. Progress
DBA for the past 11 years

Started Programming with Progress in 1989, version 4.2

Progress Consultant/DBA for over 12 years.

Worked at QAD as Technical Service Manager.

Currently administrate over 80 production environments
(4-5 db’s each, over 320)
Sizes range from 20GB to 480GB

2
Who are You?
3

Progress Version

Largest Database

Database Operating System
Reasons NOT to Dump & Load

Because it is X months since the last time.

Because the vendor said we should.

Because we must in order to upgrade Progress.

Because we are told it will improve performance.

When’s the next 3 day weekend?

When is the best time based upon various statistics
 Scatter Factor
 Fragmentation Factor
 Application and/or Utility Performance
4
Why I would Dump/Load

Improve Performance by reducing Table Scatter Factor &
Record Fragmentation

Reclaim Space in the Database


Migrate between different Progress Versions (e.g. from V9 to
V10 with new storage area)
Migrate between different Platforms (e.g. Windows to HP/UX)

When is the best time based upon various statistics
 Scatter Factor
 Fragmentation Factor
 Application and/or Utility Performance
5
When you need to D&L?
6

Change the Database Blocksize (V8.2+)

Change the Records per Block (RPB)

Convert to Multiple Storage Areas (V9+)

Convert to OE10 Type 2 Storage Areas

Recover Data from a Corrupted Database
Physical Scatter Factor
7

How close are the Records in a Table to each
other physically

Obtained from proutil dbanalys/tabanalys

Progress recommendations are way low for most
production databases; you will spend more
weekends doing D&Ls then seeing your family
Scatter Factor

For Large, Frequently Accessed Tables:
 .9
Data overlap problem (bug)
 1.0
Green Flag - Perfect
 1.1-2.0 Probably OK
 2.1-3.0 Yellow Flag - Deteriorating
 3.1-4.0 Performance problems are
probably
 4.1+
8
being experienced
Red Flag - take action ASAP
Example
RECORD BLOCK SUMMARY FOR AREA "GLTRHIST" : 35
-------------------------------------------------------Record Size (B)- ---Fragments--- Scatter
Table
Records
Size
Min
Max Mean
Count Factor
PUB.gltr_hist
118323698
24.4G
127
241
221 198899694
1.0
PUB.absc_det
477
19.1K
41
41
41
477
1.0
PUB.absr_det
89324
10.1M
97
195
118
89324
1.0
PUB.acd_det
133684
9.2M
48
91
72
133684
1.0
PUB.anl_det
1664
99.5K
35
79
61
1664
1.0
PUB.ans_det
2224 127.3K
44
76
58
2224
1.0
PUB.an_mstr
1991 126.9K
42
82
65
1991
1.0
9
Factor
3.9
6.2
4.0
3.8
5.7
5.5
5.6
Fragmentation Factor




10
A Fragment is a Record
A Record can consist of one or more Fragments
The more Fragments per Record the more I/O is
required to read and update a Record
VSTs (_ActRecord) can tell you how much Excess
Fragment I/O is actually occurring but not by table;
also the statistics are somewhat suspicious
General Setup

Archive & Purge as much data as possible
 The D&L will take less time
 The Scattering will be eliminated by the D&L

Perform an Index Rebuild (if possible)
 Might improve performance enough to avoid a D&L but
unlikely in my experience
 Might reveal hidden DB corruption that needs to be dealt
with first (i.e. 1124 errors)
 Should make the dump process go faster
11
Dump/Load Setup

Run & Time some Heavy Duty Reports (multi-table,
many records, reproducible)

Run & Time proutil dbanalys
 Use Record Counts before the Dump and after the
Load to Validate that the Load was successful
 Might Reveal Corruption (i.e. 1124 errors)
12
Disk Preparation

Consider Disabling Disk Mirroring
 Extra Dump Space
 Availability of 2nd Disk Controller
 Potential (usually minor) Performance Gain
 But don’t forget there may be overhead of resyncing the mirrors

Dump to non-database disks if possible
 AI/BI/Temp File Disks are good choices
13
Dump/Load Options

“Classic” Dictionary Dump & Load

Bulk Loader

Binary Dump & Load

Parallel – Multi
 Many Benefits
 Difficult to Effectively Balance
14

Automating the process

Buffer COPY
Classic / 4GL Dump

Progress Dictionary/Custom 4GL Code
 Dictionary has a Simple Interface
 Usually the Slowest Method
 Dictionary Dump/Load Programs can be run noninteractively
 Custom Coding requires 4GL experience
 Dump files are subject to 2GB limit but can code
around it
15
Binary Dump

Binary Dump (V8.2 and later)
 Documented & Supported
 proutil command can only Dump/Load one table at a time (one
proutil per Table)
 Multi-Threading - parallel dump are possible
 Dump files are portable
 No 2GB limit (.b1 b.2….)
 Larger than 2GB single dump file on V9.1B and later
 Possibility exists of “carrying” Database corruption to the new
Database
16
Binary Dump Tips
17

Use the Read Only (-RO)

Benchmark

Don’t specify a large -B with -RO

Use bin.p (or bin-nt.p) on the BravePoint web site to
generate D&L scripts (UNSUPPORTED) or
dumpload.p in DBA Resource Kit

OE v10:proutil dbname -C dump table -index 0

Thread by storage area.
What Dump to Use

Do Parallel Dumps if…
 You want to go home earlier
 You have multiple CPUs
 You have multiple (non-DB) disks with free space
18

1-4 dump threads per CPU is usually safe

Once a table is finished dumping, you can start
loading into a database on the other system or
set of disks (I use the TEST DB)
What to Consider




19
Forman stats “If a table is large (10+ million
records), the Binary dump might not be the
fastest method because it is single threaded”
Consider multiple, parallel, 4GL dumps
Bulk Load the Data (proutil bulkload)
Alternative in V9.1D+; you can use proutil
dumpspecified instead of dump but major
limitations – more in a later slide
Deep Thoughts

Sometimes using a non-primary index is faster
particularly if the secondary index is ‘smaller’

Don’t forget to dump Sequences and the User
table (can’t Binary dump)

Don’t forget SQL92 Privileges and Database
_user table.
20
proutil dumpspecified
21

Can’t have multiple streams dumping a table to
the directory at the same time (table.bd)

-index does not work

Primary Index needs to be a single Field

The Index must have the same name as the
Field
Load - Dictionary

Dictionary Load/Custom 4GL Code
 Same Advantages & Disadvantages as
the Dictionary/custom Dump
 Slowest Option (except in Parallel maybe...)
22
Load - BulkLoad

Bulk Load
 Option on proutil
 Can load Dictionary or 4GL dump (.d) files
 Very Fast but not quite as fast as Binary
 Single Threaded only
 Index Rebuild is required
23
LOAD - Binary

Single or Parallel loads are possible

Start the Database Broker to:
 Observe the speed (Records Created)
 Avoid BI Recovery for every LOAD
24

Performance use the No Integrity (-i) Option

A number of problems(bugs) in the older
versions (below 8.3C)
Load - Parallel

V8 - NO, Increases the Scatter Factor; slow

V9/OE10 - One load thread per Area

Don’t Forget




25
No Integrity (-i)
APWs, BIW
-bibufs
-directio (maybe)
Big Cluster Size (16mb+)
-spin
No AI
Index Rebuild



26
Backup the Database BEFORE you start the Index Rebuild
If the Index Rebuild fails you might not be able to restart it
Disk Sort Method
 Fastest (but single threaded)
 Builds a more compact index
 A Sort file is created on disk
 Sort File on Non-DB Disks = 20% Faster
Disk Sort

Disk Sort Method
 Sort File Size Estimate: 1X-2X Data Size to be
absolutely safe
 The Sort file is subject to the 2GB limit until
V9.1D SP07 or 08
 V8.* allows Multi-Volume Sort files
 V8: Don't put TABs in the .srt file
27
Memory Sort





28
Much Slower than Disk Sort
Less Compact Index
Use V9+ idxcompact to compact the Index
 A compaction percentage can be specified
 Can be run online or offline
 Only in V9.1E and above
No Disk Space Required
Use a larger -B (but not too large)
Index Rebuild Options
29

-TB 31
Disk Sort

-TM 32
Disk Sort

db.srt
Multi-volume Sort file

-t
Disk Sort; Unix

-B
Useful for Memory Sort only

-SS
V9.1B ‘build indexes’ option (shortly)

-SG
Sort Group; SP07; default 48; use 64
Belt & Suspenders
 Compare Record Counts
 Check Logs for Known & Unknown Errors
• grep for fail, error, (1124)
 Check for Success Binary
• grep “ 0 errors” /tmp/dl/*.log | wc –l
• grep “ dumped” /tmp/dl/*.log
• grep “ loaded” /tmp/dl/*.log
• ls –l /tmp/dl/*.d
30
Buffer-Copy & Raw-Transfer



Very Fast
Eliminates The Middle Man (temp file IO
operations)
Provides fine-grained, ABL level of control
 Allows on the fly data manipulation
 Useful when merging databases



Can use with remote connections to bridge
version numbers.
In OE10.1 performance is essentially equal
Cannot use RAW-TRANSFER with -RO
Benchmarks (by others)
32

Dict Load/Idx Inactive/idxbuild: 27:07

Dict Load/Idx Active (3 threads):23:57

Bulk Load/idxbuild:
16:10

Serial Binary/idxbuild:
15:15

Parallel Binary/idxbuild:
15:29

Serial Binary with -SS:
19:56

Parallel Binary with -SS:
38:04
Eaton Benchmark
33

80 GB MFG/PRO Database

Progress 8.3E

Single thread Binary dump time:6hrs

Multi-thread Binary dump time: 4.5hrs

Single thread Binary LOAD:10hrs

Multi-thread Binary Load:4.5 hrs

Index Rebuild: 12 hrs
Eaton Benchmark

Progress 10.1C DB: 2.3GB Tab: 3mins

Data Dict Dump: 1:15 hrs

Longet table:47 min = multi-thread 47 mins

Single thread Binary dump time:00:17 hrs

Multi-thread Binary dump time:00:09hrs

ON NAS


All Var Ext Single thread Binary LOAD:9 min idx 12min

All Var Ext Multi-thread Binary Load:10min idx 12min

Fixed Ext Multi-thread Binary Load:13 mins indx 10 mins
On EMC disk
 Multi-thread load 4 mins idx 5 mins
34
Dictionary Dump Compare
dbname=phroaux
tfile=/cit/log/qad/$dbname.tabanalys.130916
lst=`ls -1 /prog_bkup_new/phr/ascii/*.d`
for i in $lst
do
xdump=0
xtabdump=0
xtable=`grep filename= $i|cut -f2 -d"=" `
xdump=`grep records= $i|cut -f2 -d"="`
xtabdump=$(grep -i "PUB.$xtable " $tfile|tr -s " " "" |cut -f2 -d" ")
xdump=$(expr $xdump \* 1)
xtabdump=$(expr $xtabdump \* 1)
[[ $xdump -eq $xtabdump ]] && echo "Matched " $xtable :$xtabdump:$xdump:
[[ ! $xdump -eq $xtabdump ]] && echo $i $xtable NO Match :${xtabdump}"-"${xdump}:
[[ ! $xdump -eq $xtabdump ]] && read a
done
35
Binary Dump Multi-Thread
DEF VAR X AS INT.
clear all.
DEF VAR xx AS INT.
update w-dbname label "DB Name"
DEF VAR xcnt as INT.
def var w-str as char.
def var w-dbname as char format "x(50)".
def var w-dumpdir as char format "x(78)".
help "Enter Full path and DB name ie:
/mfgpro/phr/db1/phrpmfg " skip
w-dumpdir label "Dump dir"
def var b as int.
def var bb as int.
DEF STREAM out1.
DEF STREAM out2.
help "Enter Full path of the Dump directory"
with frame a side-labels scrollable.
Assign X = 1
DEF STREAM out3.
xx = 0.
DEF STREAM out4.
DEF STREAM out5.
OUTPUT STREAM out1 TO "bdump1.sh".
OUTPUT STREAM out2 TO "bdump2.sh".
OUTPUT STREAM out3 TO "bdump3.sh".
OUTPUT STREAM out4 TO "bdump4.sh".
OUTPUT STREAM out5 TO "bdump5.sh".
36
/* find the total number of tables first */
for each _file no-lock
where ( _file-number > 0 and _file-number < 30000):
xcnt = xcnt + 1.
end.
Binary 2
/* create dump scripts */
b = 2.
IF X = 2 THEN PUT STREAM out2 UNFORMATTED
"$DLC/bin/proutil " + w-dbname + " -C dump "
+ _file-name + " " + w-dumpdir + w-str skip.
FOR EACH _file NO-LOCK
WHERE (_file-number > 0 and _file-number < 30000)
break by _file-name:
xx = xx + 1.
w-str = " > " + _file-name + ".log ".
if xx mod b = 0 then assign b = b + 2
w-str = w-str + " &".
bb = xx mod b.
IF xx MOD (xcnt / 5) = 0 THEN X = X + 1.
/* display xx xcnt x b bb . pause. */
IF X = 1
THEN PUT STREAM out1 UNFORMATTED "$DLC/bin/proutil "
+ w-dbname + " -C dump “
+ _file-name + " " + w-dumpdir + w-str skip.
IF X = 3
THEN PUT STREAM out3 UNFORMATTED
"$DLC/bin/proutil " + w-dbname + " -C dump "
+ _file-name + " " + w-dumpdir + w-str skip.
IF X = 4
THEN PUT STREAM out4 UNFORMATTED
"$DLC/bin/proutil " + w-dbname + " -C dump "
+ _file-name + " " + w-dumpdir + w-str skip.
IF X = 5
THEN PUT STREAM out5 UNFORMATTED
"$DLC/bin/proutil " + w-dbname + " -C dump "
+ _file-name + " " + w-dumpdir + w-str skip.
END.
OUTPUT STREAM out1 CLOSE.
OUTPUT STREAM out2 CLOSE.
OUTPUT STREAM out3 CLOSE.
OUTPUT STREAM out4 CLOSE.
OUTPUT STREAM out5 CLOSE.
37
Additional Resources

http://www.greenfieldtech.com/download
s.shtml


Buffer-Copy
Pro D&L : BravePoint
 minimize D&L down time
38
Questions?
39
Thank you for
your time
40