OpenVMS Solutions Center Lab Project - Spring 2004 : Oracle 9i RAC DT/HA in a distributed OpenVMS Environment Phase I – Failover.
RAC DT/HA – Goals – Phase I
First: Demonstrate that Oracle 9iRAC continues to run during simulated network failure using LAN Failover and failSAFE IP configurations.
Second: Measure the latency effect of failover when RAC instances are connected over long distance (100km).
RAC DT/HA – What is Failover?
Oracle RAC failover: The ability to resume work on an alternate instance upon instance failure
Oracle TAF (Transparent Application Failover): Runtime failover which enables client applications to automatically reconnect to the database if the connection fails
LAN Failover: Hardware failover from a failed network interface card (NIC) to another NIC configured as part of the LAN failover set
failSAFE IP: Address failover to alternate interfaces
RAC DT/HA – Hardware Config
2 4-CPU GS160 systems, with a shared cluster system disk and a shared Oracle install disk on Enterprise Storage Array, connected via Fibre SAN A switch
DE602-AA (EIA) NICs, using twisted pair on 100 Mbit LAN, Extreme Summit4 switch
5 DEGPA-SA and 1 DEGXA-SA (EWA-D) NICs, 1 Gbit fiber on 1 Gbit LAN, Digital Networks DNSwitch 800
100km cable, Gbit SCS, Extreme Summit 7i switch
RAC DT/HA – Server Config
OpenVMS 7.3-2, TCPIP 5.4
Oracle Server 9.2.0.4, with Oracle patch for bug fix 3026720: Excessive CPU and BUFIO for LMD0 and SMON processes when >2cpu
Running 2 RAC instances, in 2 node cluster
Requires the INIT
RAC DT/HA – Client Config
9.2 SQLNet Client, on PC running Windows 2000
Benchmark/Load Generating software:
• Swingbench 2.1f: an ‘unofficial’, Java based, client load generating tool from Oracle, which allows a ‘load’ to be generated and the transactions/response times to be charted
• Configured to connect 100 clients, load balanced between the 2 instances, and run 50,000 ‘typical’ Order Entry transactions
RAC DT/HA – Test Plan
Restore from disk backup before each test run to ensure same starting point
Ensure RAC instances are communicating over the specified network interface
Run 3 iterations of same benchmark load while collecting data
• Run Benchmark load, no failures
• Run Benchmark load, fail instance
• Run Benchmark load, fail network connection between instances
RAC DT/HA – Data collection
T4 running on both nodes, 10sec sampling interval
Saved Swingbench data results after each run
Executed and ‘saved’ output of VMS commands during network failures to see status of network devices and Oracle processes
$ MC LANCP SHOW DEVICE/CHARACTERISTICS LLA0
$ TCPIP SHOW INTERFACES/FULL
$ PIPE SHO SYS|SEA TT: ORA_CPU
Tabular Timeline Tracking Tool – T4
Created by OpenVMS Sustaining Engineers to help diagnose OS functionality.
Uses OpenVMS Monitor data, stored in Comma Separated Value file format (.csv file), which can then be used by a variety of applications (spreadsheets, TLViz, etc.)
Download from web. Shipped with OpenVMS 7.3-2, in SYS$ETC directory
http://h71000.www7.hp.com/openvms/products/t4/index.html
Users are able to queue data collection and configure data collection frequency
Helpful in establishing baseline performance footprint which can then be used in before and after comparisons of system changes
T4 ‘hooks’ for Oracle and Rdb Server being created
RAC DT/HA – EIA Network
[Diagram: Oracle RAC network connection using the EIA devices. GS160 QBB0 (EIA0 161.114.69.7) and GS160 QBB3 (EIA0 161.114.69.8) are connected over the 100 Mbit LAN (Extreme Summit 4 switch). The EVA holds the common system disk and the shared Oracle 9i database behind the Fiber SAN A switch. PC Swingbench clients 1 through 100 generate the load.]
RAC DT/HA – T4 data - EIA
[Chart: EIA0 - Baseline. T4 plot of [NET.EIA0:]Bytes Recv/Sec on node QBB0, 16:20 to 16:45 on 26-Mar-2004; y-axis 100,000 to 1,400,000.]
RAC DT/HA - LAN Failover Network
[Diagram: Oracle RAC connection using the LLA0 device for LAN Failover. GS160 QBB0: EIA0 161.114.69.7; LLA0 10.3.3.1 over EWA0 and EWB0. GS160 QBB3: EIA0 161.114.69.8; LLA0 10.3.3.2 over EWA0 and EWB0. The LLA0 devices use the Gbit LAN (Digital Networks DNswitch 800); the EIA0 devices use the 100 Mbit LAN (Extreme Summit 4 switch). The EVA holds the common system disk and the shared Oracle 9i database behind the Fiber SAN A switch. PC Swingbench clients 1 through 100 generate the load.]
RAC DT/HA – LAN Failover DCL
$ MCR LANCP SHOW DEVICE/CHAR LLA0
Before NIC ‘fails’

Device Characteristics LLA0:
     Value  Characteristic
     -----  -------------
       256  Max receive buffers
       Yes  Full duplex enable
       . . .
      1000  Line speed (mbps)
    "EWB0"  Failover device
    "EWA0"  Failover device (active)
       . . .
         0  Failover priority

After NIC ‘fails’

Device Characteristics LLA0:
     Value  Characteristic
     -----  -------------
       256  Max receive buffers
       Yes  Full duplex enable
       . . .
      1000  Line speed (mbps)
    "EWB0"  Failover device (active)
    "EWA0"  Failover device
       . . .
         0  Failover priority
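The two listings differ only in which failover set member carries the "(active)" marker, so a saved copy of the LANCP output can be checked mechanically. The following is a minimal Python sketch of that check, not part of the original test harness: the function name and the regular expression are illustrative assumptions, and the sample text is limited to the lines shown on the slide.

```python
import re

# Matches the member flagged as active, e.g.:  "EWA0" Failover device (active)
# (pattern is an assumption based only on the output shown on the slide)
ACTIVE_MEMBER = re.compile(r'"([^"]+)"\s+Failover device \(active\)')

def active_failover_device(lancp_output):
    """Return the name of the active failover set member, or None if absent."""
    match = ACTIVE_MEMBER.search(lancp_output)
    return match.group(1) if match else None

# Excerpts of the two listings shown on the slide.
BEFORE = '''
1000 Line speed (mbps)
"EWB0" Failover device
"EWA0" Failover device (active)
'''

AFTER = '''
1000 Line speed (mbps)
"EWB0" Failover device (active)
"EWA0" Failover device
'''

# Report the switch of the active member across the simulated NIC failure.
print(active_failover_device(BEFORE), "->", active_failover_device(AFTER))
```

Comparing the value before and after each cable pull gives a simple pass/fail signal that LLA0 moved to the surviving NIC.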
RAC DT/HA-T4 LAN Failover EWA/B
[Chart: LAN Failover - Pull Cable. T4 plot of [NET.EWA0:]Bytes Recv/Sec and [NET.EWB0:]Bytes Recv/Sec on node QBB0, 17:05 to 17:30 on 7-Apr-2004; y-axis 0 to 1,900,000. Annotations mark "EWA0 cable pulled" and "EWB0 cable pulled".]
RAC DT/HA-T4 LAN Failover LLA0
[Chart: LAN Failover - Pull Cable. T4 plot of [NET.LLA0:]Bytes Recv/Sec on node QBB0, 17:05 to 17:30 on 7-Apr-2004; y-axis 100,000 to 1,900,000.]
RAC DT/HA-T4 Overlay of EWA/LLA0
[Chart: LAN Failover - Pull Cable. T4 overlay of [NET.EWA0:], [NET.EWB0:] and [NET.LLA0:] Bytes Recv/Sec on node QBB0, 17:05 to 17:30 on 7-Apr-2004; y-axis 0 to 1,900,000.]
RAC DT/HA – failSAFE IP Network
[Diagram: Oracle RAC connection using the EWD0/EWE0 devices with failSAFE IP. GS160 QBB0: EIA0 161.114.69.7; EWA0, EWB0; 10.4.4.1. GS160 QBB3: EIA0 161.114.69.8; EWD0 10.4.4.2; EWE0 10.4.4.3. 10.4.4.2 and 10.4.4.3 are configured for FailSafeIP. The Gbit LAN runs through the Digital Networks DNswitch 800 and the 100 Mbit LAN through the Extreme Summit 4 switch. The EVA holds the common system disk and the shared Oracle 9i database behind the Fiber SAN A switch. PC Swingbench clients 1 through 100 generate the load.]
RAC DT/HA – failSAFE IP DCL
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1  UGS    0    7999  IE0
10.4.4/24    10.4.4.2      U    274  408185  WE3
10.4.4/24    10.4.4.3      U    274  445714  WE4
10.4.4.2     10.4.4.2      UHL    0       0  WE3
10.4.4.3     10.4.4.3      UHL    0      14  WE4
WE3: flags=c43
RAC DT/HA – failSAFE IP DCL Failed 1
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1  UGS    0    7999  IE0
10.4.4/24    10.4.4.2      U    274  408185  WE3
10.4.4/24    10.4.4.3      U    274  445714  WE4
10.4.4.2     10.4.4.2      UHL    0       0  WE3
10.4.4.3     10.4.4.3      UHL    0      14  WE4
WE3: flags=c43
RAC DT/HA – failSAFE IP DCL Failed 2
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1  UGS    0    7999  IE0
10.4.4/24    10.4.4.2      U    274  408185  WE3
10.4.4/24    10.4.4.3      U    274  445714  WE4
10.4.4.2     10.4.4.2      UHL    0       0  WE3
10.4.4.3     10.4.4.3      UHL    0      14  WE4
WE3: flags=c43
failSAFE IP Addresses:
inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
*inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE3)
RAC DT/HA – T4 data failSAFE IP
[Chart: FailSafeIP - Pull Cable. T4 plot of [NET.EWD0:]Bytes Recv/Sec and [NET.EWE0:]Bytes Recv/Sec on node QBB3, 14:40 to 15:05 on 14-Apr-2004; y-axis 0 to 1,600,000. Annotations mark "EWD0 cable pulled" and "EWE0 cable pulled".]
RAC DT/HA – 100km cable Network
[Diagram: Oracle RAC connection using the EWA0 devices, separated by 100km. GS160 QBB0 (EIA0 161.114.69.7, EWA0) and GS160 QBB3 (EIA0 161.114.69.8, EWA0) are joined by a Gbit SCS link: Extreme Summit 7i switch, 100km fiber cable, Extreme Summit 7i switch. The EIA0 devices use the 100 Mbit LAN (Extreme Summit 4 switch). The EVA holds the common system disk and the shared Oracle 9i database behind the Fiber SAN A switch. PC Swingbench clients 1 through 100 generate the load.]
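For scale, the propagation delay that a 100km fiber run adds to the interconnect can be estimated from the speed of light in glass. The Python sketch below is a back-of-the-envelope calculation only: the refractive index of 1.47 is an assumed typical figure for single-mode fiber, not a value from the lab, and switch and NIC latency are ignored.

```python
# Rough propagation delay over a fiber link (cable only, no switch or NIC latency).
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_REFRACTIVE_INDEX = 1.47       # ASSUMED typical value for single-mode fiber

def propagation_delay_ms(distance_km, refractive_index=FIBER_REFRACTIVE_INDEX):
    """One-way propagation delay in milliseconds over distance_km of fiber."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return distance_km / speed_in_fiber_km_s * 1000.0

one_way_ms = propagation_delay_ms(100)   # the 100km cable used in this test
round_trip_ms = 2 * one_way_ms           # request plus reply between instances

print(f"one-way {one_way_ms:.3f} ms, round trip {round_trip_ms:.3f} ms")
```

Under these assumptions the cable contributes roughly half a millisecond each way, so every request/reply exchange between the two instances pays about one millisecond more than on a local link.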
RAC DT/HA – T4 EWA0 w/100km cable
[Chart: EWA0 with 100km Fiber cable between instances. T4 plot of [NET.EWA0:]Bytes Recv/Sec on node QBB0, 12:25 to 12:55 on 16-Apr-2004; y-axis 100,000 to 1,700,000.]
RAC DT/HA – T4 EIA compared w/ EWA
[Chart: Bytes/sec of EIA NIC over UTP and EWA NIC over 100km. T4 overlay of [NET.EWA0:]Bytes Recv/Sec(# 1) and [NET.EWA0:]Bytes Recv/Sec(# 2), time axis 16:20 to 16:40 on 26-Mar-2004; y-axis 0 to 1,700,000.]
Note: the red graph is labeled [NET.EWA0], but it is really [NET.EIA0].
RAC DT/HA – Load Generation Data
50k Transactions, no RAC or Network Failure

Network Interface                Total duration   TPS
Baseline (EIA 161.114.69.x)      30:08            27.8
Lan Failover (EWA 10.3.3.x)      30:02            27.9
FailSafe IP (EWD 10.4.4.x)       30:02            27.9
100 km Baseline (EWA 10.3.3.x)   29:52            28.0
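As a rough cross-check on the no-failure results, the average rate can be recomputed as 50,000 transactions divided by the total duration. The Python sketch below does only that arithmetic, using the durations and TPS figures from the slide. The slides do not say how Swingbench derives its TPS figure, so the simple ratio is an assumption and is not expected to match exactly.

```python
# Cross-check of the no-failure runs: transactions divided by total duration.
TRANSACTIONS = 50_000

# (configuration, total duration mm:ss, reported TPS), copied from the slide
RUNS = [
    ("Baseline (EIA)", "30:08", 27.8),
    ("LAN Failover (EWA)", "30:02", 27.9),
    ("FailSafe IP (EWD)", "30:02", 27.9),
    ("100 km Baseline (EWA)", "29:52", 28.0),
]

def to_seconds(mmss):
    """Convert a 'mm:ss' duration string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def average_tps(mmss, transactions=TRANSACTIONS):
    """Average transactions per second over the whole run."""
    return transactions / to_seconds(mmss)

for name, duration, reported in RUNS:
    computed = average_tps(duration)
    print(f"{name:24s} {duration}  computed {computed:.2f}  reported {reported}")
```

The recomputed averages fall within 0.2 TPS of the reported values in every configuration, which is consistent with the four runs being effectively equal in throughput.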
RAC DT/HA – Load Generation Data
50k Transactions, Network failover

Network Interface                Total duration   TPS
Baseline (EIA 161.114.69.x)      N/A              N/A
Lan Failover (EWA 10.3.3.x)      30:02            27.9
FailSafe IP (EWD 10.4.4.x)       30:13            27.7
100 km Baseline (EWA 10.3.3.x)   N/A              N/A
RAC DT/HA – Load Generation Data
50k Transactions, 1 RAC instance failed

Network Interface                Total Duration   TPS    50client Failover
Baseline (EIA 161.114.69.x)      33:25            25.0   00:37
Lan Failover (EWA 10.3.3.x)      29:54            28.0   00:39
FailSafe IP (EWD 10.4.4.x)       30:02            27.7   00:39
100 km Baseline (EWA 10.3.3.x)   29:39            28.0   00:43
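To see how much an instance failure cost each configuration, the total durations here can be set against the no-failure durations reported earlier for the same configurations. The Python sketch below is plain subtraction on the slide figures. It assumes the values pair with the configurations in the order the slides list them; the helper names are illustrative.

```python
# Extra run time with one RAC instance failed, relative to the no-failure run.

def to_seconds(mmss):
    """Convert a 'mm:ss' duration string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# (configuration, duration with no failure, duration with 1 instance failed)
# ASSUMPTION: values are paired with configurations in slide order.
RUNS = [
    ("Baseline (EIA)", "30:08", "33:25"),
    ("LAN Failover (EWA)", "30:02", "29:54"),
    ("FailSafe IP (EWD)", "30:02", "30:02"),
    ("100 km Baseline (EWA)", "29:52", "29:39"),
]

def extra_seconds(no_failure, with_failure):
    """Seconds added (positive) or saved (negative) by the failed-instance run."""
    return to_seconds(with_failure) - to_seconds(no_failure)

DELTAS = {name: extra_seconds(ok, failed) for name, ok, failed in RUNS}

for name, delta in DELTAS.items():
    print(f"{name:24s} {delta:+d} s")
```

On these figures only the EIA baseline run was noticeably longer with an instance failed; the other three finished within a few seconds of their no-failure times.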
RAC DT/HA – Conclusions
RAC seemed to have no problems when running with network configured to use LAN Failover or failSAFE IP (on the same node).
There seems to be a definite distributing effect on network traffic when the Oracle init.ora parameter CLUSTER_INTERCONNECTS is used.
RAC DT/HA – Phase II and III
Phase II: Configure Oracle 9iRAC 2-node cluster using RAID-1 Shadow Sets for database and logfiles, and test recently released Host Based Mini-Merge (HBMM) functionality in a variety of configurations. Refer to: http://h71000.www7.hp.com/news/hbmm.htm
Phase III: Distribute nodes in cluster over 100km+ distance and test failover and HBMM functionality.
RAC DT/HA - References
OpenVMS Technical Journal:
Matt Muggeridge’s July 2003 - V2 Article: Configuring TCP/IP for High Availability
http://h71000.www7.hp.com/openvms/journal/v2/articles/tcpip.pdf
Steve Lieman’s January 2004 - V3 Article: TimeLine-Driven Collaboration with T4 & Friends: A Time-saving Approach to OpenVMS Performance
http://h71000.www7.hp.com/openvms/journal/v3/t4.pdf
RAC DT/HA – References (con’t)
TCPIP docs: http://h71000.www7.hp.com/doc/tcpip54.html
OpenVMS docs: http://h71000.www7.hp.com/doc/os732_index.html
HP TCP/IP Services for OpenVMS Management: Chapter 5, Configuring and Managing FailSAFE IP
http://h71000.www7.hp.com/doc/732final/documentation/pdf/aa-lu50m-te.pdf
RAC DT/HA – References (con’t)
HP OpenVMS System Management Utilities Reference Manual: Chapter 13, LAN Control Program (LANCP) Utility
http://h71000.www7.hp.com/doc/732FINAL/DOCUMENTATION/PDF/aa-pv5ph-tk.PDF
HP OpenVMS System Manager’s Manual, Volume 2 - Tuning, Monitoring, and Complex Systems: Chapter 10, Managing the Local Area Network (LAN) Software
http://h71000.www7.hp.com/doc/732FINAL/aa-pv5nh-tk/aa-pv5nh-tk.pdf
RAC DT/HA – References (con’t)
Oracle References:
Swingbench: an ‘unofficial’ load generating benchmarking tool, developed in Java, which allows a load to be generated and the transactions/response times to be charted
http://www.dominicgiles.com/swingbench.php
OTN: otn.oracle.com
Real 24/7: Use Oracle9i RAC and TAF to guarantee availability.
http://otn.oracle.com/oramag/oracle/02-may/o32clusters.html
RAC DT/HA – References (con’t)
Oracle Metalink articles: metalink.oracle.com
Note 183340.1 - Frequently Asked Questions About the CLUSTER_INTERCONNECTS Parameter in 9i.
Note 220970.1 - “Which network is Oracle using for RAC traffic?”
Note 162725.1 - OPS/RAC VMS: Using alternate TCP Interconnects on 8i OPS and 9i RAC on OpenVMS.
Note 226880.1 - Configuration of Load Balancing and Transparent Application Failover.
OpenVMS Solutions Lab
Available to customers to test new hardware, software, applications
Alpha and Integrity systems available for use
To get the most benefit from the Lab, customer is expected to be prepared with exact list of hardware and software requirements, test plan and goals