Session id: 40136
RAC Best Practices on Linux
Kirk McGowan, Technical Director, RAC Pack, Server Technologies, Oracle Corporation
Roland Knapp, Principal Member Technical Staff, RAC Pack, Server Technologies, Oracle Corporation
Agenda
Planning Best Practices
- Architecture
- Expectation setting
- Objectives and success criteria
- Project plan
Implementation Best Practices
- Infrastructure considerations
- Installation/configuration
- Database creation
- Application considerations
Operational Best Practices
- Backup & Recovery
- Performance Monitoring and Tuning
- Production Migration
Planning
Understand the Architecture
- Cluster terminology
- Functional basics
  - HA by eliminating node & Oracle as SPOFs
  - Scalability by making additional processing capacity available incrementally
- Hardware components
  - Private interconnect/network switch
  - Shared storage/concurrent access/storage switch
- Software components
  - OS, Cluster Manager, DBMS/RAC, Application
  - Differences between cluster managers
RAC Hardware Architecture
[Diagram. Labels: Users; Network; Centralized Management Console; High Speed Switch or Interconnect (low latency interconnect, i.e. VIA or proprietary); Clustered Database Servers; Hub or Switch Fabric; Storage Area Network; Mirrored Disk Subsystem. Caption: No Single Point Of Failure.]
RAC Software Architecture
[Diagram: Shared Data Model. Several instance boxes, each labelled "Shared Memory/Global Area" (shared SQL, log buffer) with GES & GCS, all attached to one Shared Disk Database.]
RAC on Linux HW & SW Components
[Diagram. Node1a: Oracle9i RAC instance 1, DB cache, ORACM, Unbreakable Linux. Node2a: Oracle9i RAC instance 2, DB cache, ORACM, Unbreakable Linux. Further nodes N3, N4 ... Nn: more nodes = higher availability. The nodes are joined by the public network and by the cluster interconnect (cache to cache). Shared storage, with concurrent access from every node = "scale out", holds redo log instance 1 ..., redo log instance 2 ..., control files and database files.]
Linux Cluster Hardware
- Cluster interconnects: FastEthernet, Gigabit Ethernet
- Public networks: Ethernet, FastEthernet, Gigabit Ethernet
- Memory, swap & CPU recommendations: each server should have a minimum of 512 MB of memory, at least 1 GB of swap space, and two CPUs
- Fiber Channel, SCSI, or NAS storage connectivity
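The memory, swap and CPU minimums above can be checked mechanically on each node. The following is a minimal sketch in Python, assuming the text of /proc/meminfo and /proc/cpuinfo is passed in as strings; the function names are hypothetical and not part of any Oracle tool. Note that MemTotal excludes memory the kernel reserves, so a machine with exactly 512 MB installed may report slightly less; the sketch applies no tolerance for that.

```python
import re

# Minimums quoted on the slide above.
MIN_MEM_KB = 512 * 1024      # 512 MB of memory
MIN_SWAP_KB = 1024 * 1024    # 1 GB of swap space
MIN_CPUS = 2                 # two CPUs


def parse_meminfo(text):
    """Return {field: kB} for 'Name:  12345 kB' lines of /proc/meminfo."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s*kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def count_cpus(cpuinfo_text):
    """Count 'processor : N' entries in /proc/cpuinfo text."""
    return len(re.findall(r"^processor\s*:", cpuinfo_text, flags=re.MULTILINE))


def check_minimums(meminfo_text, cpuinfo_text):
    """Return a list of shortfalls; an empty list means all minimums are met."""
    mem = parse_meminfo(meminfo_text)
    problems = []
    if mem.get("MemTotal", 0) < MIN_MEM_KB:
        problems.append("memory below 512 MB")
    if mem.get("SwapTotal", 0) < MIN_SWAP_KB:
        problems.append("swap below 1 GB")
    if count_cpus(cpuinfo_text) < MIN_CPUS:
        problems.append("fewer than two CPUs")
    return problems
```

On a node, the two arguments would come from reading the two /proc files; an empty result means the node meets the stated minimums.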
Unbreakable Linux Distributions
- Red Hat Enterprise Linux AS and ES
- United Linux 1.0
  - SuSE Linux Enterprise Server 8 (SuSE Linux AG)
  - Conectiva Linux Enterprise Edition (Conectiva S.A.)
  - SCO Linux Server 4.0 (The SCO Group)
  - Turbolinux Enterprise Server 8 (Turbolinux)
Oracle will support Oracle products running with other distributions but will not support the operating system.
RAC Certification for Unbreakable Linux
Certification
- Enterprise class OS distribution (e.g. RH AS, United Linux 1.0)
- Clusterware (Oracle Cluster Manager only)
- Network Attached Storage (e.g. Network Appliance filers)
- Most SCSI and SAN storage are compatible
- 32 bit and 64 bit Itanium 2 Intel based servers are certified.
For more details on software certification: http://technet.oracle.com/support/metalink/content.html
Discuss hardware configuration with your HW vendor
Linux IA64 requirements
Operating System Requirements
- Red Hat Linux Advanced Server 2.1 operating system with kernel 2.4.18-e.14.ia64.rpm
- glibc 2.2.4-29
- Gnu gcc 2.96.0 release
- Linux Header Patch 2.4.18 (available from Intel)
- asynch libraries libaio-0.3.92-1
(Source: Oracle9i Release Notes Release 2 (9.2.0.2.0) for Linux Intel on Itanium (64-bit), Part No. B10567-02)
Set Expectations Appropriately
If your application will scale transparently on SMP, then it is realistic to expect it to scale well on RAC, without having to make any changes to the application code.
RAC eliminates the database instance, and the node itself, as a single point of failure, and ensures database integrity in the case of such failures
Planning: Define Objectives
Objectives need to be quantified/measurable
- HA objectives
  - Planned vs unplanned
  - Technology failures vs site failures vs human errors
- Scalability objectives
  - Speedup vs scaleup
  - Response time, throughput, other measurements
- Server/Consolidation objectives
  - Often tied to TCO
  - Often subjective
Build your Project Plan
- Partner with your vendors
  - Multiple stakeholders, shared success
- Build detailed test plans
  - Confirm application scalability on SMP before going to RAC; optimize first for single instance
- Address knowledge gaps and training
  - Clusters, RAC, HA, Scalability, systems management
  - Leverage external resources as required
- Establish strict System and Application Change control
  - Apply changes to one system element at a time
  - Apply changes first to the test environment
  - Monitor impact of application changes on underlying system components
- Define Support mechanisms and escalation procedures
Agenda
Planning Best Practices
- Architecture
- Expectation setting
- Objectives and success criteria
- Project plan
Implementation Best Practices
- Infrastructure considerations
- Installation/configuration
- Database creation
- Application considerations
Operational Best Practices
- Backup & Recovery
- Performance Monitoring and Tuning
- Production Migration
Infrastructure Considerations
Architecture/Design
- Eliminate SPOFs (Single Points of Failure)
- Workload distribution (load balancing) strategy
- Systems management framework for monitoring and managing to SLAs
Hardware/Software
- Processing nodes: sufficient CPU to accommodate failure
- Scalable I/O subsystem: use S.A.M.E. (Stripe And Mirror Everything)
- Private interconnect: GigE, UDP, switched
- Patch levels and certification
Implementation Flowchart
[Flowchart. Boxes in the order extracted; the arrows between them were lost:]
- Configure HW
- Install cluster manager 9.2.0.1
- Create database
- Configure private interconnect
- Install Oracle 9.2.0.1
- Install Unbreakable Linux
- Install 9.2.0.3 cluster manager
- Configure storage and install OCFS
- Install Oracle 9.2.0.3
Installation Flowchart for Red Hat Linux AS 2.1
[Flowchart. Boxes in the order extracted; the arrows between them were lost:]
- Boot
- Use DRUID for Partition Setup
- Account Configuration
- Choose Language
- Select Keyboard & Mouse
- Choose Advanced Server Option
- Select Boot Loader
- Configure Network
- Configure Timezone
- Select Graphic Mode
- Boot Floppy Creation
- Installation Complete / Reboot
Install tips for Red Hat Linux AS 2.1
As documented in: – “Tips and Techniques: Install and Configure Oracle9i on Red Hat Linux Advanced Server” by Deepak Patel, Oracle http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf
- Boot options: always use the Advanced Server install. Install required packages as needed. CDs 1 to 3 have all rpm packages, CDs 3 and 4 have source packages, and CD 5 includes docs.
- Memory: the smp or enterprise kernel is installed based on the physical memory in the machine (<= 4 GB: smp kernel; > 4 GB: enterprise kernel).
- Post installation: add users, configure the network and perform other administrative tasks after installation.
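The memory rule in the tips above is a simple threshold. As a small illustration only, here it is written out in Python; the function name is hypothetical, and the sketch covers just the two flavors this slide mentions, not everything the installer may consider.

```python
def rhas21_kernel_flavor(physical_mem_gb):
    """Apply the slide's rule: <= 4 GB selects 'smp', > 4 GB selects 'enterprise'."""
    if physical_mem_gb <= 0:
        raise ValueError("physical memory must be positive")
    return "smp" if physical_mem_gb <= 4 else "enterprise"
```

The boundary case matters in practice: a node with exactly 4 GB gets the smp kernel under this rule.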
Install tips for United Linux 1.0
You must install the latest UnitedLinux kernel update! Oracle was certified against an update kernel, the original UL-1.0 kernel is NOT certified!
After installing United Linux 1.0, install Service Pack 2a from: ftp://suse.us.oracle.com/pub/suse/i386/unitedlinux-1.0-iso/
You will also need to have the basic development tools installed, like make, gcc_old (2.95.3), and the binutils package.
Full installation instructions: ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/docs/920_sles8_install.pdf
Install tips for United Linux 1.0
Install the orarun.rpm package from either the SP2 CD or from ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/sles-8/orarun.rpm
orarun.rpm
- Updates kernel parameters (e.g. shmmax, shmmin)
- UDP settings (256K)
- Installs and configures hangcheck-timer
Prepare Linux Environment
Follow these steps on EACH node of the cluster
- Set kernel parameters in /etc/sysctl.conf
- Add hostnames to the /etc/hosts file
- Establish a file system or location for ORACLE_HOME (writable for the oracle userid)
- Set up host equivalence for the oracle userid (.rhosts)
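Because these steps are repeated on every node, it can help to verify the result rather than trust manual edits. The sketch below, in Python, checks the text of an /etc/sysctl.conf for the 256K UDP send/receive buffer sizes that this deck recommends elsewhere (net.core.* = 262144). The function names are hypothetical, and the required-settings table is an assumption limited to those four keys; other kernel parameters a given install needs are not covered.

```python
# Settings and minimum values taken from the deck's UDP buffer recommendation (256K).
REQUIRED = {
    "net.core.rmem_default": 262144,
    "net.core.rmem_max": 262144,
    "net.core.wmem_default": 262144,
    "net.core.wmem_max": 262144,
}


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blanks and '#' or ';' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_or_low(text, required=REQUIRED):
    """Return {key: found_value_or_None} for settings that are absent or too small."""
    settings = parse_sysctl_conf(text)
    problems = {}
    for key, minimum in required.items():
        value = settings.get(key)
        if value is None or not value.isdigit() or int(value) < minimum:
            problems[key] = value
    return problems
```

Running the same check against each node's file gives a quick way to spot a node that was missed.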
Installation Flowchart for OCFS
Download the latest OCFS rpm’s from www.ocfs.org
- Create partition on the primary node
- Install the rpm's on all nodes
- Run ocfstool to format and mount your new filesystem
- Run ocfstool as root (configures /etc/ocfs.conf) on all nodes
- Run load_ocfs (insmod will load ocfs.o) on all nodes
- Mount the new filesystem on all nodes
- Edit rc.local or equivalent, add load_ocfs and 'mount -t ocfs
OCFS and Unbreakable Linux
Red Hat
- Currently ships 4 flavors of the AS 2.1 kernel, viz., UP, SMP, Enterprise and Summit (IBM x440)
- Oracle provides a separate OCFS module for each of the kernel flavors
- Minor revisions of the kernel do not need a fresh build of ocfs
- e.g., ocfs built for e.12 will work for e.16, e.18, etc.
United Linux
- United Linux ships 3 flavors of its kernel: the 2.4.19-64GB SMP, the 2.4.19-4GB and the 2.4.19-4GB-SMP kernel
- OCFS 1.0.9 is supported on UL 1.0 Service Pack 2a or higher
- OCFS build is not currently upward compatible with the kernel (pre SP3); you must ensure an OCFS build exists for each new kernel version prior to upgrading the kernel
OCFS and RAC
- Maintains cache coherency across nodes for the filesystem metadata only
  - Does not synchronize the data cache buffers across nodes, lets RAC handle that
- OCFS journals filesystem metadata changes only
  - Filedata changes are journalled by RAC (log files)
- Overcomes some limitations of raw devices on Linux
  - No limit on number of files
  - Allows for very large files (max 2TB)
  - Max volume size 32G (4K block) to 8T (1M block)
- Oracle DB performance is comparable to raw
Install Tips for OCFS
- Kernel e.25 is strongly recommended for use with OCFS 1.0.9 (remove old kernel tuning parameters)
- Ensure OCFS rpm corresponds to kernel version
  - uname -r (i.e. 2.4.19-4GB)
- Remember to also download rpm's for OCFS "Support Tools" and "Additional Tools"
- Download the dd/tar/cp rpm that supports o_direct
- Use rpm -Uv to install all 4 rpm's on all nodes
- Use OCFS for Oracle DB files only, not Oracle binaries (OCFS 1.0.x was not designed as a general purpose filesystem).
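The two volume-size endpoints quoted for OCFS (32G with a 4K block, 8T with a 1M block) both work out to the same block count: 32 GiB / 4 KiB = 8 TiB / 1 MiB = 2^23 = 8,388,608 blocks. The Python sketch below simply makes that arithmetic explicit. Treating the limit as a fixed block count for intermediate block sizes is an inference from those two endpoints, not a documented OCFS formula, and the names are hypothetical.

```python
# Inferred from the slide's endpoints: both imply 2**23 blocks per volume.
MAX_BLOCKS = 2 ** 23


def max_volume_bytes(block_size_bytes):
    """Largest volume size implied by a fixed block count (an inference, see above)."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    return MAX_BLOCKS * block_size_bytes


KIB, MIB, GIB, TIB = 2 ** 10, 2 ** 20, 2 ** 30, 2 ** 40
```

With these definitions, `max_volume_bytes(4 * KIB)` equals `32 * GIB` and `max_volume_bytes(1 * MIB)` equals `8 * TIB`, matching the slide.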
Installation Flowchart for oracm and Oracle
[Flowchart. Boxes in the order extracted; the arrows between them were lost:]
- Configure private interconnect and quorum device
- Install 9.2.0.1 software with the RAC option
- Install the oracm from the 9.2.0.3 patchset
- Install the oracm from the 9.2.0.1 CD-ROM
- Kill the oracm and watchdog process
- Install the 9.2.0.3 patchset
- Configure ocmargs.ora and cmcfg.ora
- Load the softdog and start the cluster manager with ./ocmstart.sh on both nodes
- Modify ocmargs.ora and cmcfg.ora (remove watchdog)
- Load the hangcheck-timer module with insmod (lsmod lists the loaded modules)
- Fix empty directory bug
- Start the cluster manager with ./ocmstart.sh
Hangcheck NM, and CM Flow (After V9.2.0.2)
- User mode: Oracle instance; Cluster Manager (including Node Monitor). Oracm maintains both the node status view and the instance status view.
- Kernel mode: hangcheck-timer. The hangcheck-timer monitors the kernel for hangs, and resets the node if needed.
Post Installation
- To enable asynchronous I/O, Oracle must be re-linked to use skgaioi.o
- Adjust UDP send / receive buffer size to 256K
- Larger buffer cache
  - Create an in-memory file system on /dev/shm (mount -t shm shmfs -o size=8g /dev/shm)
  - To enable the extended buffer cache feature, set the init.ora parameter USE_INDIRECT_DATA_BUFFERS = true
- Increasing address space
  - Default 1.7 GB of address space for its SGA.
  - See Metalink Note: 200266.1 for details and a sample program.
Create RAC database using DBCA
- Use DBCA to simplify DB creation
  - Start gsd (global services daemon) on all nodes, if it is not already running.
- Set MAXINSTANCES, MAXLOGFILES, MAXLOGMEMBERS, MAXLOGHISTORY, MAXDATAFILES (auto with DBCA)
- Create tablespaces as locally managed (auto with DBCA)
- Create all tablespaces with ASSM (auto with DBCA)
- Configure automatic UNDO management (auto with DBCA)
- Use SPFILE instead of multiple init.ora's (auto with DBCA)
Validate RAC Configuration
- Instances running on all nodes
  SQL> select * from gv$instance
- RAC communicating over the private interconnect
  SQL> oradebug setmypid
  SQL> oradebug ipc
  SQL> oradebug tracefile_name
  /home/oracle/admin/RAC92_1/udump/rac92_1_ora_1343841.trc
  - Check the trace file in the user_dump_dest:
    SSKGXPT 0x2ab25bc flags info for network 0
    socket no 10 IP 204.152.65.33 UDP 49197
    sflags SSKGXPT_UP
    info for network 1
    socket no 0 IP 0.0.0.0 UDP 0
    sflags SSKGXPT_DOWN
- RAC is using the desired IPC protocol: check the alert log
  ...
  cluster interconnect IPC version:Oracle UDP/IP
  IPC Vendor 1 proto 2 Version 1.0
  PMON started with pid=2
  ...
- Use cluster_interconnects only if necessary
Configure srvconfig / srvctl
- SRVCTL uses information from srvconfig
  - Reads $ORACLE_HOME/srvm/config/srvConfig.loc information
  - File can be a RAW device or OCFS file
  - srvconfig -init
- gsd must be running
- Add ORACLE_HOME
  $ srvctl add database -d db_name -o oracle_home [-m domain_name] [-s spfile]
- Add instances (for each instance enter the command)
  $ srvctl add instance -d db_name -i sid -n node
Application Deployment
- Same guidelines as single instance
  - SQL tuning
  - Sequence caching
  - Partition large objects
  - Use different block sizes
  - Tune instance recovery
  - Avoid DDL
  - Use LMTs and ASSM as noted earlier
Agenda
Planning Best Practices
- Architecture
- Expectation setting
- Objectives and success criteria
- Project plan
Implementation Best Practices
- Infrastructure considerations
- Installation/configuration
- Database creation
- Application considerations
Operational Best Practices
- Backup & Recovery
- Performance Monitoring and Tuning
- Production Migration
Operations
- Same DBA procedures as single instance, with some minor, mostly mechanical differences.
- Managing the Oracle environment
  - Starting/stopping cluster services (ocmstart.sh)
  - Starting/stopping gsd
  - Managing multiple redo log threads
- Startup and shutdown of the database
  - Use srvctl
- Backup and recovery
- Performance Monitoring and Tuning
- Production migration
Operations: srvconfig / srvctl
- Use SRVCTL to administer your RAC database environment.
  - OEM and the Oracle Intelligent Agent use the configuration information that SRVCTL generates to discover and monitor nodes in your cluster.
- Global Services Daemon (GSD) receives requests from SRVCTL to execute administrative job tasks, such as startup or shutdown.
  - GSD must be started on all the nodes in your RAC environment so that the manageability features and tools operate properly. (GSDCTL)
Operations: Backup & Recovery
- RMAN is the most efficient option for Backup & Recovery
  - Managing the snapshot control file location.
  - Managing the control file autobackup feature.
  - Managing archived logs in RAC: choose a proper archiving scheme.
  - Node Affinity Awareness
- RMAN and Oracle Net in RAC: you cannot specify a net service name that uses Oracle Net features to distribute RMAN connections to more than one instance.
- Oracle Enterprise Manager
  - GUI interface to Recovery Manager
Performance Monitoring and Tuning
- Tune first for single instance 9i
- Use Statspack:
  - Separate 1 GB tablespace for Statspack
  - Snapshots at 10-20 min intervals during stress testing, hourly during normal operations
  - Run on all instances, staggered
- Supplement with scripts/tracing
  - Monitor V$SESSION_WAIT to see which blocks are involved in wait events
  - Trace events like 10046/8 can provide additional wait event details
  - Monitor alert logs and trace files, as on single instance
- Oracle Performance Manager
  - RAC-specific views
- Supplement with system-level monitoring
  - CPU utilization never 100%
  - I/O service times never > acceptable thresholds
  - CPU run queues at optimal levels
Performance Monitoring and Tuning
- Obvious application deficiency on a single node can't be solved by multiple nodes.
  - Single points of contention.
  - Not scalable on SMP
  - I/O bound on single instance DB
- Tuning on single instance DB to ensure applications are scalable first
  - Identify/tune contention using v$segment_statistics to identify objects involved
  - Concentrate on the top 5 Statspack timed events if the majority of time is spent waiting
  - Concentrate on bad SQL if CPU bound
- Maintain a balanced load on underlying systems (DB, OS, storage subsystem, etc.)
  - Excessive load on individual components can invoke aberrant behaviour.
Performance Monitoring and Tuning
- Deciding if RAC is the performance bottleneck
  - Amount of cross-instance traffic
    - Type of requests
    - Type of blocks
  - Latency
    - Block receive time
    - Buffer size factor
    - Bandwidth factor
Production Migration
- Adhere to strong Systems Life Cycle disciplines
  - Comprehensive test plans (functional and stress)
  - Rehearsed production migration plan
  - Change control
    - Separate environments for Dev, Test, QA/UAT, Production
    - System AND application change control
    - Log changes to spfile
  - Backup and recovery procedures
  - Security controls
  - Support procedures
Next Steps….
- Recommended sessions
  - List 1 or 2 sessions that complement this session
- Recommended demos and/or hands-on labs
  - List one or two demos or labs that will let them see this product in action.
- See Your Business in Our Software
  - Visit the DEMOgrounds for a customized architectural review, see a customized demo with Solutions Factory, or receive a personalized proposal. Visit the DEMOgrounds for more information.
- Relevant web sites to visit for more information
  - List urls here.
Resources
- RedHat Linux: http://www.redhat.com/oracle/
- Linux Center - Technical White Papers & Documentation: http://otn.oracle.com/tech/linux/tech_wp.html
  - "Tips and Techniques: Install and Configure Oracle9i on Red Hat Linux Advanced Server" by Deepak Patel, Oracle Corporation: http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf
  - "Tips and Techniques: Install and Configure Oracle9i on SLES8 / United Linux 1.0": http://www.suse.com/en/business/certifications/certified_software/oracle/db/9iR2_sles8.html
United Linux 1.0 Resources
- United Linux: http://www.unitedlinux.com
- SuSE: http://www.suse.com/us/business/products/server/sles/index.html
- SCO Group (formerly Caldera System): http://www.ebizenterprises.com/page1.asp?p=463
- Conectiva: http://www.connectiva.com
- TurboLinux: http://www.turbolinux.com/
Recommended one-off patches
- Bug 2820871 - ORA-29740 NODE EVICTION DESIGN ALGORITHM AND ABRUPT TIME CHANGE
  ARU: 9.2.0.3 ARU 4161735 completed for LINUX Intel
- Bug 2420930 - GET ORA-600 [KSXPMPRP1] DURING STARTUP IN RAC MODE WITH LARGER BUFFERS. This was mysteriously included in 9.2.0.2, but not in 9.2.0.3. Bug 2875050 was opened for this issue.
  ARU: 9.2.0.3 ARU 4202164 completed for LINUX Intel
- Bug 2420930 - GET ORA-600 [KSXPMPRP1] DURING STARTUP IN RAC MODE WITH LARGER BUFFERS
- Bug 2922471 - Fractured
- Bug 2844009 - MISSING LIBCXA.SO.3 LIBRARY ISSUE IN PSR 9203.
  ARU: 9.2.0.3 ARU 4046387 completed for LINUX Intel
- Bug 2779294 - node_list is not populated into oraInventory/ContentsXML/inventory.xml, so an opatch install will only apply to the local node. The workaround is editing inventory.xml, documented in bug 2742686.
- Bug 2646914, 2675090, 2706220 and 2695783 - ORA-600 [KCCSBCK_FIRST], [2] on Linux and W2K platforms after installing 9.2.0.2. Very important patch, missing from 9.2.0.3.
  ARU: 9.2.0.3 ARU 4110670 completed for LINUX
Hangcheck-timer and Oracle Cluster Manager
- Download Patch 2594820 from Metalink
  - #rpm -ivh
- cmcfg.ora parameters affected:
  - WatchdogTimerMargin
  - WatchdogSafetyMargin
  - KernelModuleName=hangcheck-timer
  - CMDiskFile from optional to mandatory: CM quorum partition of cluster participation.
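The hangcheck-timer slides tie the cluster manager's MissCount to the kernel module's two timers: MissCount must be greater than hangcheck_tick + hangcheck_margin, which is 30 + 180 = 210 with the values this deck uses. A minimal Python sketch of that consistency check follows. It assumes cmcfg.ora is a plain file of Key=Value lines and that lines starting with '#' are comments (the comment syntax is an assumption); the function names are hypothetical and this is not an Oracle utility.

```python
def parse_cmcfg(text):
    """Parse 'Key=Value' lines into a dict with lower-cased keys."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # '#' comments are an assumption
            continue
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip().lower()] = value.strip()
    return params


def check_cmcfg(cmcfg_text, hangcheck_tick=30, hangcheck_margin=180):
    """Return a list of problems; an empty list means the settings are consistent."""
    params = parse_cmcfg(cmcfg_text)
    problems = []
    if params.get("kernelmodulename") != "hangcheck-timer":
        problems.append("KernelModuleName is not hangcheck-timer")
    if "cmdiskfile" not in params:
        problems.append("CmDiskFile is missing")
    threshold = hangcheck_tick + hangcheck_margin
    miss = params.get("misscount", "")
    if not miss.isdigit() or int(miss) <= threshold:
        problems.append(f"MissCount must be greater than {threshold}")
    return problems
```

The defaults mirror the insmod arguments shown in this deck; if the module is loaded with different values, pass them in so the threshold is computed from what is actually in use.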
Hangcheck-timer and Oracle Cluster Manager
- Remove or comment out from the /etc/rc.local file:
  /sbin/insmod softdog nowayout=0 soft_noboot=1 soft_margin=60
- Add to rc.local, or execute as root, to load the hangcheck-timer kernel module:
  /sbin/insmod hangcheck-timer.o hangcheck_tick=30 hangcheck_margin=180
Hangcheck-timer and Oracle Cluster Manager
Parameter          Service           Value
-----------------  ----------------  -------------------------------------------
hangcheck_tick     hangcheck-timer   30 seconds
hangcheck_margin   hangcheck-timer   180 seconds
KernelModuleName   oracm             hangcheck-timer
MissCount          oracm             > hangcheck_tick + hangcheck_margin (> 210)
Hangcheck-timer and Oracle Cluster Manager
cmcfg.ora example
  HeartBeat=15000
  ClusterName=Oracle Cluster Manager, version 9i
  KernelModuleName=hangcheck-timer
  PollInterval=1000
  MissCount=215
  PrivateNodeNames=int-node1 int-node2
  PublicNodeNames=node1 node2
  ServicePort=9998
  CmDiskFile=/ocfsdisk1/quorum/quorumfile
  HostName=int-node1
Parameters for ocmargs.ora
  oracm
  norestart 1800
Linux Monitoring and Configuration Tools
Area                        Tools / files
--------------------------  ------------------------------------
Overall tools               sar, vmstat
CPU                         /proc/cpuinfo, mpstat, top
Memory                      /proc/meminfo, /proc/slabinfo, free
Disk I/O                    iostat
Network                     /proc/net/dev, netstat, mii-tool
Kernel version and rel.     cat /proc/version
Types of I/O cards          lspci -vv
Kernel modules loaded       lsmod, cat /proc/modules
List all PCI devices (HW)   lspci -v
Startup changes             /etc/sysctl.conf, /etc/rc.local
Kernel messages             /var/log/messages, /var/log/dmesg
OS error codes              /usr/src/linux/include/asm/errno.h
OS calls                    /usr/sbin/strace -p
Post Installation
Increasing Address Space
- Default 1.7 GB of address space for its SGA.
- Shut down all instances of Oracle
- cd $ORACLE_HOME/lib
- cp -a libserver9.a libserver9.a.org (to make a backup copy)
- cd $ORACLE_HOME/rdbms/lib
- genksms -s 0x15000000 >ksms.s (lower SGA base to 0x15000000)
- make -f ins_rdbms.mk ksms.o (compile in new SGA base address)
- make -f ins_rdbms.mk ioracle (relink)
Post Installation
Increasing Address Space Cont.
- sysctl -w kernel.shmmax=3000000000
- Lower process base
  - Find out the pid of the process (shell) from where oracle will be started using ps (Oracle - echo $$)
  - Change /proc/$pid/mapped_base to 0x10000000 and restart oracle
- Metalink Note: 200266.1
Lowering of mapped base
[Diagram: process address space, "Default" vs "After Relink". In both columns 0xFFFFFFFF down to 0xC0000000 is reserved for the kernel, with Variable SGA, DB Buffers (SGA) and Code, etc. below it. Addresses marked: 0x50000000, 0x40000000, 0x00000000. Labels: sga_base (relink Oracle); mapped_base (/proc/ (label truncated in the source).]
Post Installation
Larger Buffer Cache
- does buffer cache increase with larger SGA
- Create an in-memory file system on /dev/shm
  mount -t shm shmfs -o size=8g /dev/shm
- To enable the extended buffer cache feature, set the init.ora parameter USE_INDIRECT_DATA_BUFFERS = true
- Don't use dynamic cache parameters
  - DB_CACHE_SIZE
  - DB_#K_CACHE_SIZE
- Limitations apply to the extended buffer cache feature on Linux:
  - You cannot change the size of the buffer cache while the instance is running.
  - You cannot create or use tablespaces with non-standard block sizes.
Post Installation
- Adjust send / receive buffer size to 256K
- Tuning the default and maximum window sizes:
  /proc/sys/net/core/rmem_default - default receive window
  /proc/sys/net/core/rmem_max - maximum receive window
  /proc/sys/net/core/wmem_default - default send window
  /proc/sys/net/core/wmem_max - maximum send window
  sysctl -w net.core.rmem_max=262144
  sysctl -w net.core.wmem_max=262144
  sysctl -w net.core.rmem_default=262144
  sysctl -w net.core.wmem_default=262144
Post Installation
- To enable asynchronous I/O, Oracle must be re-linked to use skgaioi.o
  - cd to $ORACLE_HOME/rdbms/lib
  - make -f ins_rdbms.mk async_on
  - make -f ins_rdbms.mk ioracle
- set 'disk_asynch_io=true' (default value is true)
- set 'filesystemio_options=asynch' (RAW Only)
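The address-space figures in the Increasing Address Space slides can be reproduced with simple subtraction. The Python sketch below assumes the layout shown in the mapped-base diagram: the kernel reserves everything from 0xC0000000 upward, the default SGA base is 0x50000000, and the relinked base is the 0x15000000 passed to genksms. The result is only a ceiling on what the SGA can occupy, since other mappings also live in that range; the function names are hypothetical.

```python
KERNEL_START = 0xC0000000       # start of the kernel-reserved range (from the diagram)
DEFAULT_SGA_BASE = 0x50000000   # default sga_base shown in the diagram
RELINKED_SGA_BASE = 0x15000000  # value passed to 'genksms -s' in the slides


def sga_ceiling_bytes(sga_base, kernel_start=KERNEL_START):
    """Address range between the SGA base and the kernel-reserved area."""
    if not 0 <= sga_base < kernel_start:
        raise ValueError("sga_base must lie below the kernel-reserved range")
    return kernel_start - sga_base


def to_gib(num_bytes):
    """Convert bytes to GiB as a float."""
    return num_bytes / 2 ** 30


default_gib = to_gib(sga_ceiling_bytes(DEFAULT_SGA_BASE))    # 1.75
relinked_gib = to_gib(sga_ceiling_bytes(RELINKED_SGA_BASE))  # 2.671875
```

The default works out to 1.75 GiB, in line with the "1.7 GB" quoted in the slides, and lowering the base raises the ceiling to about 2.67 GiB (2,868,903,936 bytes).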
Reminder – please complete the OracleWorld online session survey Thank you.