
Upgrade to Oracle Real Application Clusters 11g Release 2
Key Success Factors
Bill Burton, Joshua Ort
Oracle RAC Assurance Team,
Real Application Clusters Product Development
The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.
Agenda
• Upgrade Concepts and Considerations
• Grid Infrastructure
– Upgrade paths
– Planning, Preparation, and Prerequisites
– Completing the upgrade and post upgrade validation
• RAC Database
– Upgrade paths
– Planning, Preparation, and Prerequisites
– Completing the upgrade and post upgrade validation
• Utilizing New Features/Tools
– OCR and Voting on ASM
– ASM Cluster File System
– Cluster Health Monitor
Upgrade Concepts
• Out-of-Place Upgrade for Software
– New software is installed into a new home
– Current software is left in place and resources requiring it remain running
– Allows for shorter downtime, as we can complete the install before the
actual upgrade
– Can be for single-instance database, RAC, or clusterware homes
– Goal: same configuration, new binaries
• New clusterware features can be adopted later
Upgrade Concepts
• Rolling Upgrade (Clusterware/ASM)
– Deals with the upgrade of resources that utilize a clustered home.
– Normally done after an out-of-place software upgrade
– On each node in turn, resources are:
• Stopped
• Upgraded
• Restarted
Upgrade Considerations
• With Oracle Grid Infrastructure 11.2, ASM and Oracle
Clusterware are installed into a single home directory, which is
referred to as the “Grid Infrastructure home”.
– However, Oracle Clusterware and ASM remain separate products.
• The grid infrastructure version must be greater than or equal to
the version of the resources it manages, e.g., ASM and RDBMS
• All 11.2 upgrades are out of place
Upgrade Considerations
• Oracle Home/Base
– ORACLE_BASE for GI should be different than the
ORACLE_BASE for Oracle Database.
– Each installation user should have its own Oracle Base
– The 11.2 grid infrastructure home must be owned by the
same user as the pre-11.2 clusterware home.
– rootupgrade.sh will change the ownership of the grid
infrastructure home directory and its parents to root
Upgrade Paths – Grid Infrastructure
• Pre Oracle 10g Release 1
– Did not have Oracle Clusterware, so install 11.2.0.1
– No ASM, so no upgrade considerations.
• Oracle 10gR1
– Clusterware and ASM support direct upgrade from 10.1.0.5
– Earlier versions require upgrade to 10.1.0.5 first
– Clusterware can be rolling upgraded
– ASM cannot be rolling upgraded
Upgrade Paths – Grid Infrastructure
• Oracle 10gR2
– Clusterware and ASM support direct upgrade from 10.2.0.3
– Earlier versions require upgrade to 10.2.0.3 or higher first
– Clusterware can be rolling upgraded
– ASM cannot be rolling upgraded
• Oracle 11gR1
– Clusterware and ASM support direct upgrade from 11.1.0.6 and
11.1.0.7
– Clusterware can be rolling upgraded
– ASM can be rolling upgraded from 11.1.0.6
• Requires patch for bug 6872001 on the 11.1.0.6 ASM home
for 11.1.0.6 ASM to be rolling upgraded.
– ASM can also be rolling upgraded from 11.1.0.7, no patch
required
Planning, Preparation, and Prerequisites
Installation and Upgrade Guides
• Always the first place to go …
• http://www.oracle.com/technology/documentation/database.html
– View Library ( though you can download the whole set )
• Check the Installing and Upgrading Section
• Oracle Grid Infrastructure Installation Guide
• Real Application Clusters Installation Guide for Linux and UNIX
• Clusterware Administration and Deployment Guide
• Real Application Clusters Administration and Deployment Guide
• Storage Administrator's Guide
• 11g Upgrade Guide
• New Features Guide
• Check Metalink for Upgrade Issues
• Check Upgrade Companion Note: 785351.1
• Check Certification Matrix
– http://www.oracle.com/technology/support/metalink/index.html
Planning, Preparation, and Prerequisites
RAC Starter Kits for 11.2
• The goal of the Oracle Real Application Clusters (RAC) Starter Kit is to
provide you with the latest information on generic and platform specific
best practices for implementing an Oracle RAC cluster.
• Meant to supplement the Oracle documentation set
• See Metalink Note: 45715.1
RAC Assurance Support Team: 11gR2 RAC Starter Kit and Best Practices
Planning, Preparation, and Prerequisites
• The 11.2 grid infrastructure home cannot reside on a shared cluster
file system, e.g., OCFS2 or Veritas CFS
• NFS based shared storage is supported
• Installer will allow a move from 10.2 on CFS to 11.2 on non-CFS storage
• All cluster nodes must be up and running
• Remove any down nodes, or start them if possible.
• Unset ORACLE_HOME, ORACLE_BASE in the environment for
the installing user as the install scripts handle these.
• Avoid Installer AttachHome issues
• Set the following parameter in the SSH daemon configuration
file "/etc/ssh/sshd_config" on all cluster nodes before running
OUI (see the sketch after this list)
• LoginGraceTime 0
• Restart sshd
• Provision network resources for Single Client Access Name
(SCAN)
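A minimal sketch of that sshd change on Linux, run as root on every node (the restart command varies by platform, so treat it as an assumption):
root> echo "LoginGraceTime 0" >> /etc/ssh/sshd_config
root> /sbin/service sshd restart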
Planning , Preparation and Prerequisites
Single Client Access Name (SCAN)
• Oracle Database 11g Release 2 clients connect to the database
using SCAN VIPs.
• The SCAN is associated with the entire cluster rather than an
individual node.
• Resolves to up to 3 IP addresses in DNS or GNS
– IP addresses are returned in a round-robin manner.
• SCAN listeners run under the grid infrastructure home.
• Provides load balancing and failover for client connections
• Check this white paper for more details
– http://www.oracle.com/technology/products/database/clustering/pdf/scan.pdf
Planning, Preparation, and Prerequisites
Single Client Access Name (SCAN)
• SCAN VIPs - Network Requirement
– A single client access name (SCAN) configured in DNS.
[root@cluster1 oracle]# nslookup mycluster-scan1
Server:   120.20.190.70
Address:  120.20.190.70#53
Name: mycluster-scan1.mydomain.com
Address: 10.148.46.79
Name: mycluster-scan1.mydomain.com
Address: 10.148.46.77
Name: mycluster-scan1.mydomain.com
Address: 10.148.46.78
Completing the upgrade
Top Level Flow
• Verify the Hardware/Software Environment
• Install the Software
• Configure the Software
• Finalize the Upgrade
Completing the upgrade
Verify the Hardware/Software Environment
• Secure Shell
– We recommend using OUI to set up ssh.
• Old ssh setup is not always considered valid by the 11.2 OUI due to
tighter restrictions, but OUI will correct it for you.
– OUI will validate ssh before allowing you to continue
• Watch out for stty commands or profile messages that may
cause the automatic setup of ssh to fail.
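A common workaround is to guard interactive-only commands in login scripts so that the non-interactive ssh sessions OUI opens skip them; a minimal sketch for a Bourne-style profile (the stty setting is just a placeholder):
# Only run terminal-dependent commands when stdin is a real terminal
if [ -t 0 ]; then
  stty intr ^C
  echo "Welcome to $(hostname)"
fi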
Completing the upgrade
Verify the Hardware/Software Environment
• Cluster Verification Utility
– Integrated into OUI
– Still recommended to run before an Install/upgrade.
– Now has “fixup scripts” to correct certain failures such as
kernel parameter settings
– The most recent version is available from OTN
• http://www.oracle.com/technology/products/database/clustering/cvu/cvu_download_homepage.html
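A typical standalone pre-install/upgrade check, assuming the downloaded CVU has been unpacked locally and the cluster nodes are node1 and node2 (both names hypothetical):
$ ./runcluvfy.sh stage -pre crsinst -n node1,node2 -fixup -verbose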
Completing the upgrade
Install the Software
• Oracle Universal Installer – runInstaller
– Should find existing Oracle clusterware and suggest upgrade
to Grid Infrastructure
– Must run installer as the previous version software owner
– If you need to collect debug tracing (on request from support)
• ./runInstaller -debug
• Output is written to stdout by default
• Use the script command to capture the output (see the sketch below)
• Following screenshots are from a 10.2.0.4 upgrade
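For example, capturing the debug output with the script command (the log file name is arbitrary):
$ script /tmp/runInstaller_debug.log
$ ./runInstaller -debug
$ exit    # ends the script capture session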
Completing the upgrade - Install the Software
If upgrade is not highlighted, a previous clusterware version was not detected
Completing the upgrade
Install the Software
Completing the upgrade
Install the Software
• OUI detects existing ASM instances and responds
with this message
– “INS-40413 Existing ASM Instance Detected.”
• Tells you that if you intend to upgrade ASM, you must shut
down all ASM and therefore all database instances
– What does this mean to you?
• Once you understand there will be a complete outage
while asmca is running, hit YES
• ASM and database should remain up at this point.
• rootupgrade.sh will shut down as required (rolling)
• ASMCA will shut down instances as required:
– 11.1: in rolling fashion
– 10g: not rolling (complete outage)
Completing the upgrade
Install the Software – Node Selection and SSH
Completing the upgrade
Install the Software – Node Selection and SSH
Completing the upgrade
Install the Software – Set up ASM Roles
Completing the upgrade
Install the Software – Cluster Verification
Completing the upgrade
Install the Software – Cluster Verification
Completing the upgrade
Install the Software – Cluster Verification
root> /tmp/CVU_11.2.0.1.0_grid/runfixup.sh
Response file being used is :/tmp/CVU_11.2.0.1.0_grid/fixup.response
Enable file being used is :/tmp/CVU_11.2.0.1.0_grid/fixup.enable
Log file location: /tmp/CVU_11.2.0.1.0_grid/orarun.log
Setting Kernel Parameters...
fs.file-max = 327679
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.wmem_max = 262144
net.core.wmem_max = 1048576
uid=501(grid) gid=502(oinstall) groups=502(oinstall),503(asmadmin),504(asmdba)
Completing the upgrade
Install the Software – Cluster Verification
Completing the upgrade
Configure the Software – run the root scripts
• Run GI_HOME/rootupgrade.sh on each node of the
cluster as root
– High level logging to stdout
– See log file in GI_HOME/cfgtoollogs/crsconfig/ for more detail.
– Must be run on all nodes:
• First node on its own to successful completion
• All but the last node can then be run in parallel
• Last node on its own.
– Does a rolling upgrade of Clusterware stack
– Only finalizes the upgrade as the last node completes
• Prior to that, the clusterware is effectively running on the
old version
Completing the upgrade
Configure the Software – run rootupgrade.sh
• Actions Automatically Performed on all Nodes
• Instantiate scripts, create directories, set perms
• Retrieve upgrade configuration
• Old Clusterware stack stopped
• Local configuration completed
• New Clusterware stack started
• ohasd process replaces init.cssd, init.crsd, init.evmd, and oprocd
• Starts CSSD in exclusive mode.
• Will only work on the first node; all others get…
• CRS-4402: The CSS daemon was started in exclusive mode
but found an active CSS daemon on node ratus-vm1, number
1, and is terminating. An active cluster was found during
exclusive startup, restarting to join the cluster
• Upgrades CSS voting disks.
• Starts/Restarts CSS clustered, and other Clusterware daemons.
• Finalizes the local node's upgrade
Completing the upgrade
Configure the Software – run rootupgrade.sh
• Actions automatically performed on First Node
• In addition to performing all “each node” steps
• Push Cluster profile to all nodes
Completing the upgrade
Configure the Software – run rootupgrade.sh
• Actions automatically performed on Last Node
• Actions performed when all other nodes already have new
software version installed and have completed
rootupgrade.sh
• Sets the new active version
• Cluster is effectively on old release until this point
• ASM resource added if needed (not present previously)
• New resources added e.g. SCAN, SCAN Listener
• Nodeapps and new resources started
Completing the upgrade
Upgrade ASM
• If you chose to upgrade ASM:
• ASMCA runs silently after you acknowledge completion of the last
node's rootupgrade.sh within OUI.
Post-upgrade validation…
• Clusterware processes
–
$ ps -ef|grep -v grep |grep d.bin
oracle 9824 1 0 Jul14 ? 00:00:00 /u01/app/grid11gR2/bin/oclskd.bin
root 22161 1 0 Jul13 ? 00:00:15 /u01/app/grid11gR2/bin/ohasd.bin reboot
oracle 24161 1 0 Jul13 ? 00:00:00 /u01/app/grid11gR2/bin/mdnsd.bin
oracle 24172 1 0 Jul13 ? 00:00:00 /u01/app/grid11gR2/bin/gipcd.bin
oracle 24183 1 0 Jul13 ? 00:00:03 /u01/app/grid11gR2/bin/gpnpd.bin
oracle 24257 1 0 Jul13 ? 00:01:26 /u01/app/grid11gR2/bin/ocssd.bin
root 24309 1 0 Jul13 ? 00:00:06 /u01/app/grid11gR2/bin/octssd.bin
root 24323 1 0 Jul13 ? 00:01:03 /u01/app/grid11gR2/bin/crsd.bin reboot
root 24346 1 0 Jul13 ? 00:00:00 /u01/app/grid11gR2/bin/oclskd.bin
oracle 24374 1 0 Jul13 ? 00:00:03 /u01/app/grid11gR2/bin/evmd.bin
• Clusterware checks
–
$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
$ crsctl check has
CRS-4638: Oracle High Availability Services is online
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.1.0]
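It is also worth confirming the SCAN resources created during the upgrade, for example:
$ srvctl config scan
$ srvctl status scan
$ srvctl status scan_listener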
Post-upgrade validation…
• $ crsctl check cluster -all
**************************************************************
rat-rm2-ipfix006:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rat-rm2-ipfix007:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rat-rm2-ipfix008:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
Post-upgrade validation
• $CRS_HOME/bin/crs_stat is deprecated in 11.2; use crsctl instead
– $ crsctl status resource -t
NAME                TARGET   STATE    SERVER            STATE_DETAILS
--------------------------------------------------------------------------------
ora.DATA.dg         <- ASM disk group (new resource)
                    ONLINE   ONLINE   rat-rm2-ipfix006
                    ONLINE   ONLINE   rat-rm2-ipfix007
                    ONLINE   ONLINE   rat-rm2-ipfix008
ora.LISTENER.lsnr
                    ONLINE   ONLINE   rat-rm2-ipfix006
                    ONLINE   ONLINE   rat-rm2-ipfix007
                    ONLINE   ONLINE   rat-rm2-ipfix008
ora.asm
                    ONLINE   ONLINE   rat-rm2-ipfix006  Started
                    ONLINE   ONLINE   rat-rm2-ipfix007  Started
                    ONLINE   ONLINE   rat-rm2-ipfix008  Started
ora.eons            <- new resource
                    ONLINE   ONLINE   rat-rm2-ipfix006
                    ONLINE   ONLINE   rat-rm2-ipfix007
                    ONLINE   ONLINE   rat-rm2-ipfix008
ora.gsd             <- OFFLINE is normal unless running 9i RAC too
                    OFFLINE  OFFLINE  rat-rm2-ipfix006
                    OFFLINE  OFFLINE  rat-rm2-ipfix007
                    OFFLINE  OFFLINE  rat-rm2-ipfix008
ora.net1.network    <- new resource
                    ONLINE   ONLINE   rat-rm2-ipfix006
                    ONLINE   ONLINE   rat-rm2-ipfix007
                    ONLINE   ONLINE   rat-rm2-ipfix008
That was so easy, a Neanderthal man that inhabits a cave could do it!
(Carefully reworded to avoid copyright issues)
Diagnosing a failed Upgrade
• Root script logs
• Output from running rootupgrade.sh
• <grid home>/cfgtoollogs/crsconfig/rootcrs_<host>.log
• srvctl command logs
• Modifying nodeapps, adding SCAN, etc.
• <grid home>/cfgtoollogs/crsconfig/srvmcfg.#.log
• Daemon logs, including clusterware alert log
• <grid home>/log/<hostname>/
• ASM logs for 11.2 are in the grid user's Oracle Base
• <obase>/diag/asm/+asm/+ASM<n>
Clusterware Downgrade
• Restores OCR from the backup taken before the upgrade
• rootcrs.pl -downgrade [-force] on all but the last node
• rootcrs.pl -downgrade -lastnode -oldcrshome <CH> -version 11.1.0.6.0 [-force]
• This restores the OCR
• Run root.sh from the previous-version Clusterware home,
one node at a time
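Before attempting the downgrade, it is worth listing the available OCR backups, for example:
root> ocrconfig -showbackup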
Interoperability Success Factors
11.2 GI with older Database
• If you want to create a database in the old DB home
after GI Upgrade
• Apply the patch for Bug 8288940 on 10.2.0.4 or 11.1 to create a
database using dbca from that release; otherwise dbca fails to
see that the 11.2 ASM instance is running.
• All nodes must be pinned (the GI upgrade does this; see the check after this list)
• If your Database home owner is not the GI owner
• Ensure the Database Home owner can access the
listener.ora file in the GI home and can place a backup in the
GI Home network/admin directory to prevent dbca failure.
• Use 11.2 asmca to create any new disk groups you need.
• dbca will offer disk group creation, but the database owner
only has asmdba privileges, so DG creation is not possible.
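A quick way to confirm node pin status after the GI upgrade (node names and numbers below are illustrative):
$ olsnodes -t -n
node1   1   Pinned
node2   2   Pinned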
RAC Database Upgrade
Upgrade Concepts – RAC Database
• Installer controlled RAC rolling upgrade not available
– Other rolling upgrade options may be considered
– See White papers from the MAA team
• High Availability with Oracle Database 11g Release 2
http://www.oracle.com/technology/deploy/availability/pdf/twp_databaseha_11gr2.pdf
• Oracle Data Guard with Oracle Database 11g Release 2
http://www.oracle.com/technology/deploy/availability/pdf/twp_dataguard_11gr2.pdf
Upgrade Paths – RAC Database
• Oracle 9i Release 2 and Earlier
– Database supports direct upgrade to 11.2 from 9.2.0.8
– Earlier Database versions require upgrade to 9.2.0.8 first
– See the 9.2.0.8 upgrade guide for supported upgrade paths
• Oracle 10gR1
– Database supports direct upgrade from 10.1.0.5
– Earlier versions require upgrade to 10.1.0.5 first
• Oracle 10gR2
– Database supports direct upgrade from 10.2.0.2
– Earlier versions require upgrade to 10.2.0.2 or higher first
• Oracle 11gR1
– Database supports direct upgrade from 11.1.0.6 and 11.1.0.7
Planning, Preparation, and Prerequisites
RAC Database
• Backup everything
• Define and test a fallback strategy
• Be aware of known issues (Metalink Note: 161818.1)
• Make decisions regarding the database COMPATIBLE init
parameter; some changes cannot be reverted (see the note after this list)
• Check for and apply critical patches to 11.2 home
prior to database upgrade
• Test, test and test until comfortable with the results
• Gather baseline stats (OS, DB, etc.)
• After upgrade, check logs
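For example, raising the COMPATIBLE parameter is effectively one-way once the database has run with the new value; the setting shown is illustrative only:
SQL> ALTER SYSTEM SET COMPATIBLE='11.2.0' SCOPE=SPFILE;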
Planning, Preparation, and Prerequisites
RAC Database
• Verify you meet the minimum requirements for shared
memory pool sizes
• Check for invalid SYS- and SYSTEM-owned objects (see the query after this list)
• Disable all jobs that are executed at the OS level or
by third parties before the database upgrade
• See upgrade guide for full list of requirements
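A minimal pre-check for invalid objects, run as SYSDBA, with utlrp.sql available to recompile anything found:
SQL> SELECT owner, object_name, object_type
     FROM dba_objects
     WHERE status = 'INVALID' AND owner IN ('SYS','SYSTEM');
SQL> @?/rdbms/admin/utlrp.sql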
Completing the upgrade
Install the Software
• Oracle Universal Installer – runInstaller
• Out-of-place upgrade, but not rolling
– You can install into a new Home and keep services up whilst the
Install is done
– All instances must be down at the same time to complete the
database upgrade with dbua
• Oracle Home should be within ORACLE_BASE
– Installer verifies this.
• CVU is still used to check O/S prerequisites, and fixup scripts are
generated.
• Software copied to all selected nodes prior to dbua
running.
Completing the upgrade
Database Upgrade Assistant
• Similar to Single Instance (Non RAC)
• Runs database pre-upgrade checks (these can also be run
manually beforehand; see the sketch after this list)
– Provides a warning report with required actions
– Allows you to carry on despite warnings
• Allows you to modify
– Flash Recovery Area
– Diagnostic Destination
• Provides detailed summary report prior to upgrade
• Shuts down all database instances to perform the
upgrade
• Provides detailed results report.
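The pre-upgrade checks can also be run manually before launching DBUA, using the 11.2 Pre-Upgrade Information Tool against the database while it still runs on the old release; a sketch, where the 11.2 home path is an assumption:
SQL> SPOOL /tmp/preupgrade_info.log
SQL> @/u01/app/oracle/product/11.2.0/dbhome_1/rdbms/admin/utlu112i.sql
SQL> SPOOL OFF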
Completing the upgrade – DBUA
Completing the upgrade – DBUA Warnings
Completing the upgrade – Destinations
Completing the upgrade – DBUA Summary
Completing the upgrade – DBUA Completion
[Diagram: ASM File System, Binaries, OCR & Voting Files, DB Datafiles]
New Features and Tools
Utilizing New Features/Tools
OCR on ASM
• OCR
– In 11.2, OCR files can be placed on ASM Disk groups.
– Oracle recommends moving OCR files to ASM post upgrade.
• Upgrade process cannot do this automatically
• Raw devices are only supported for existing files
– Up to 5 locations allowed
• Each file must go in a separate Disk Group
– To move OCR to ASM
• Set ASM diskgroup compatibility to 11.2.0.1.0
– asmca, asmcmd or sqlplus
• Move OCR to ASM using ocrconfig (on any node)
[root@rat-rm2-ipfix006 bin]# ./ocrconfig -add +DATA1
[root@rat-rm2-ipfix006 bin]# ./ocrconfig -delete /dev/raw/raw1
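If the disk group compatibility still needs raising, it can be set from SQL*Plus, for example (disk group name taken from the commands above):
SQL> ALTER DISKGROUP DATA1 SET ATTRIBUTE 'compatible.asm' = '11.2.0.1.0';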
Utilizing New Features/Tools
Voting Files on ASM
• Voting Files
– In 11.2 Voting files can be placed on ASM Disk groups
– We recommend moving voting files to ASM
• Upgrade process cannot do this automatically
• Raw devices are only supported for existing files
– Command can be run from any node
– Handles removal of old files and creation of new ones
[root bin]$ ./crsctl replace votedisk +DATA
– Disk group redundancy affects how many voting files can reside on
an ASM diskgroup.
• External: 1 voting File
• Normal: 3 voting Files (minimum of 3 failure groups)
• High: 5 voting Files (minimum of 3 failure groups)
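After the move, the voting file locations can be confirmed from any node; crsctl lists each voting file and the disk group it resides in:
$ crsctl query css votedisk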
Utilizing New Features/Tools
ASM Cluster File System (ACFS)
• ACFS can be used for
– Shared DB homes (not grid infrastructure)
– General Purpose CFS
• ASM file systems created on ASM Volumes
• ASM Volumes created in ASM Disk Groups
• ACFS requires disk groups with compatibility 11.2 or greater
• Managed from:
– asmcmd - many new commands in 11.2
– asmca - new to 11.2
– sqlplus - connect as /sysasm
Utilizing New Features/Tools
ACFS Creation using ASMCA
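The same file system can also be created from the command line; a minimal sketch on Linux, where the disk group, volume name, size, device path, and mount point are all illustrative:
$ asmcmd volcreate -G DATA -s 10G acfsvol1
$ asmcmd volinfo -G DATA acfsvol1          # note the reported volume device
root> mkfs -t acfs /dev/asm/acfsvol1-123
root> mkdir -p /u02/acfs
root> mount -t acfs /dev/asm/acfsvol1-123 /u02/acfs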
New Features/Tools
Cluster Health Monitor
• Daemon that collects information about system load
• Tracks the OS resource consumption at each node, process,
and device level continuously.
• Berkeley database stores the information/statistics
• GUI interface for install on client machine(s)
• Primarily for problem diagnosis, e.g., evictions
• Previously called IPD/OS (Instantaneous Problem Detector for
Clusters)
• Currently available on Linux and Windows
• Available for download from OTN
– http://www.oracle.com/technology/products/database/clustering/ipd_download_homepage.html
QUESTIONS
ANSWERS
The preceding is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.