What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation that ensures high availability of databases by providing instance failover and media failover features.
Mention the Oracle RAC software components:
Oracle RAC is composed of two or more database instances. Each instance is composed of memory structures and background processes, the same as a single-instance database.
Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable cache fusion.
Oracle RAC instances are composed of the following background processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain records of the status of each datafile and each cached block using the Global Resource Directory. This process is referred to as cache fusion and helps maintain data integrity.
Give details on cache fusion:
Oracle RAC is composed of two or more instances. When a block of data has been read from a datafile by one instance in the cluster and another instance needs the same block, it is faster to get the block image from the instance that already holds the block in its SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes use of interconnects. The Global Cache Service (GCS), through the LMS processes, transfers block images between instances for cache fusion, while the Global Enqueue Service (GES) and the instance enqueue process manage the related non-cache resources.
Give details on ACMS:
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures distributed SGA memory updates, i.e. SGA updates are globally committed on success or globally aborted in the event of a failure.
Give details on GTX0-j:
This process provides transparent support for XA global transactions in a RAC environment. The database autotunes the number of these processes based on the workload of XA global transactions.
Give details on LMON:
This process is called the Global Enqueue Service Monitor. It monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.
Give details on LMD:
This process is called the Global Enqueue Service Daemon. It manages incoming remote resource requests within each instance.
Give details on LMS:
This process is called the Global Cache Service Process. It maintains the status of datafiles and of each cached block by recording information in the Global Resource Directory (GRD). It also controls the flow of messages to remote instances, manages global data block access, and transmits block images between the buffer caches of different instances. This processing is part of the cache fusion feature.
Give details on LCK0:
This process is called the Instance Enqueue Process. It manages non-cache-fusion resource requests such as library and row cache requests.
Give details on RMSn:
These processes are called the Oracle RAC Management Processes. They perform manageability tasks for Oracle RAC, including the creation of resources related to Oracle RAC when new instances are added to the cluster.
Give details on RSMN:
This process is called the Remote Slave Monitor. It manages background slave process creation and communication on remote instances. The background slave processes perform tasks on behalf of a coordinating process running in another instance.
What components in RAC must reside in shared storage?
All datafiles, control files, SPFILEs and redo log files must reside on cluster-aware shared storage.
What is the significance of using cluster-aware shared storage in an Oracle RAC environment?
All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs and redo log files when these files are hosted on cluster-aware shared storage, which is a group of shared disks.
Give a few examples of solutions that support cluster storage:
ASM (Automatic Storage Management), raw disk devices, network file system (NFS), OCFS2 and OCFS (Oracle Cluster File Systems).
What is an interconnect network?
An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses one or more switches that only the nodes in the cluster can access.
How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram Sockets) protocols. Windows clusters use the TCP protocol.
Can we use crossover cables with Oracle Clusterware interconnects?
No, crossover cables are not supported with Oracle Clusterware interconnects.
What is the use of cluster interconnect?
The cluster interconnect is used by cache fusion for inter-instance communication.
How do users connect to the database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through one or more middle tiers, with or without connection pooling. Users can use the Oracle services feature to connect to the database.
What is the use of a service in an Oracle RAC environment?
Applications should use the services feature to connect to the Oracle database. Services enable us to define rules and characteristics to control how users and applications connect to database instances.
What are the characteristics controlled by the Oracle services feature?
High availability characteristics.
What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.
What is a virtual IP address or VIP?
A virtual IP address or VIP is an alternate IP address that client connections use instead of the standard public IP address. To configure VIP addresses, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.
What is the use of VIP?
If a node fails, then the node's VIP address fails over to another node, on which the VIP address can accept TCP connections but cannot accept Oracle connections.
Give situations under which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.
What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid "connection refused" error. They don't have to wait for TCP connection timeout messages.
What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using OEM (Enterprise Manager), SQL*Plus, Server Control (SRVCTL), Cluster Verification Utility (CVU), DBCA and NETCA.
How do we verify that RAC instances are running?
Issue the following query from any one node after connecting through SQL*Plus:
SQL> connect sys/sys as sysdba
SQL> select * from V$ACTIVE_INSTANCES;
The query returns the instance number in the INST_NUMBER column and host_name:instance_name in the INST_NAME column.
What is FAN?
Fast Application Notification (FAN) relates to events concerning instances, services and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service-level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.
Where can we apply FAN UP and DOWN events?
FAN UP and FAN DOWN events can be applied to instances, services and nodes.
State the use of FAN events in case of a cluster configuration change?
During cluster configuration changes, the Oracle RAC high availability framework publishes a FAN event immediately when a state change occurs in the cluster, so applications can receive FAN events and react immediately. This saves applications from polling the database and detecting a problem only after such a state change.
Why should we have separate homes for the ASM instance?
It is a good practice to keep the ASM home separate from the database home (ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database software independently of each other. Also, we can deinstall the Oracle database software independently of the ASM instance.
What is the advantage of using ASM?
ASM is the Oracle-recommended storage option for RAC databases because it maximizes performance by managing the storage configuration across the disks. ASM does this by distributing the database files across all of the available storage within the cluster database environment.
What is rolling upgrade?
It is a new ASM feature introduced in Oracle Database 11g. ASM instances in Oracle Database 11g releases (from 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability. During a rolling upgrade we can maintain a functional cluster while one or more of the nodes in the cluster are running different software versions.
Can rolling upgrade be used to upgrade from 10g to 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).
State the initialization parameters that must have the same value for every instance in an Oracle RAC database:
Some initialization parameters are critical at database creation time and must have the same values. Their values must be specified in the SPFILE or PFILE for every instance. The parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORD_FILE
UNDO_MANAGEMENT
Do DML_LOCKS and RESULT_CACHE_MAX_SIZE have to be identical on all instances?
These parameters must be identical on all instances only if their values are set to zero.
What two parameters must be set at the time of starting up an ASM instance in a RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.
Mention the components of Oracle Clusterware:
Oracle Clusterware is made up of components such as the voting disk and the Oracle Cluster Registry (OCR).
What is a CRS resource?
Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources are a database, an instance, a service, a listener, a VIP address, an application process, etc.
What is the use of OCR?
The OCR (Oracle Cluster Registry) stores the configuration information of CRS resources.
How does Oracle Clusterware manage CRS resources?
Oracle Clusterware manages CRS resources based on the configuration information of CRS resources stored in the OCR (Oracle Cluster Registry).
Name some Oracle Clusterware tools and their uses?
OIFCFG - allocating and deallocating network interfaces
OCRCONFIG - command-line tool for managing the Oracle Cluster Registry
OCRDUMP - identify the interconnect being used
CVU - Cluster Verification Utility to get the status of CRS resources
What are the modes of deleting instances from Oracle Real Application Clusters databases?
We can delete instances in silent mode or interactive mode using DBCA (Database Configuration Assistant).
How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify that ASM has been removed by issuing the following command:
srvctl config asm -n node_name
How do we verify that an instance has been removed from OCR after deleting an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat
How do we verify an existing current backup of OCR?
We can verify the current backup of OCR using the following command:
ocrconfig -showbackup
What are the performance views in an Oracle RAC environment?
We have V$ views that are instance-specific. In addition, we have GV$ views, called global views, that have an INST_ID column of numeric data type. GV$ views obtain information from the individual V$ views.
What are the types of connection load-balancing?
There are two types of connection load balancing: server-side load balancing and client-side load balancing.
What is the difference between server-side and client-side connection load balancing?
Client-side load balancing happens on the client, which distributes connection requests across the listeners. In server-side load balancing, the listener uses a load-balancing advisory to redirect connections to the instance providing the best service.
Give the usage of srvctl:
srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount
Cluster daemons
A daemon is a background process.
1) crsd - updates the OCR file and manages the node applications (ONS, VIP, GSD).
2) cssd - updates the voting disk file and monitors node membership. Whenever the heartbeat (HB) fails between the nodes, the cssd process reboots the node.
3) evmd - updates the diagnostic information.
http://www.idevelopment.info/data/Oracle/DBA_tips/Oracle10gRAC/CLUSTER_65.shtml
Overview
Oracle Clusterware 10g, formerly known as Cluster Ready Services (CRS) is software that when
installed on servers running the same operating system, enables the servers to be bound together
to operate and function as a single server or cluster. This infrastructure simplifies the
requirement for an Oracle Real Application Clusters (RAC) database by providing cluster
software that is tightly integrated with the Oracle Database.
The Oracle Clusterware requires two critical clusterware components: a voting disk to record
node membership information and the Oracle Cluster Registry (OCR) to record cluster
configuration information:
Voting Disk
The voting disk is a shared partition that Oracle Clusterware uses to verify cluster node
membership and status. Oracle Clusterware uses the voting disk to determine which instances are
members of a cluster by way of a health check and arbitrates cluster ownership among the
instances in case of network failures. The primary function of the voting disk is to manage node
membership and prevent what is known as Split Brain Syndrome in which two or more instances
attempt to control the RAC database. This can occur in cases where there is a break in
communication between nodes through the interconnect.
The voting disk must reside on a shared disk(s) that is accessible by all of the nodes in the
cluster. For high availability, Oracle recommends that you have multiple voting disks. Oracle
Clusterware can be configured to maintain multiple voting disks (multiplexing) but you must
have an odd number of voting disks, such as three, five, and so on. Oracle Clusterware supports a
maximum of 32 voting disks. If you define a single voting disk, then you should use external
mirroring to provide redundancy.
A node must be able to access more than half of the voting disks at any time. For example, if you
have five voting disks configured, then a node must be able to access at least three of the voting
disks at any time. If a node cannot access the minimum required number of voting disks it is
evicted, or removed, from the cluster. After the cause of the failure has been corrected and access
to the voting disks has been restored, you can instruct Oracle Clusterware to recover the failed
node and restore it to the cluster.
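The majority rule described above reduces to simple integer arithmetic. The shell sketch below is illustrative only: the function name min_voting_disks is hypothetical and is not part of Oracle Clusterware.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): given the number of configured
# voting disks, print the minimum number a node must be able to access
# (strictly more than half) in order to stay in the cluster.
min_voting_disks() {
    total=$1
    echo $(( total / 2 + 1 ))
}

min_voting_disks 5   # prints 3
min_voting_disks 3   # prints 2
```

With an even count such as four, the result is three, so a fourth disk tolerates no more failures than three disks do, which is one way to see why an odd number is required.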
Oracle Cluster Registry (OCR)
The OCR maintains cluster configuration information as well as configuration information about any
cluster database within the cluster. OCR is the repository of configuration information for the
cluster that manages information such as the cluster node list and instance-to-node mapping
information. This configuration information is used by many of the processes that make up the
CRS as well as other cluster-aware applications which use this repository to share information
among them. Some of the main components included in the OCR are:
• Node membership information
• Database instance, node, and other mapping information
• ASM (if configured)
• Application resource profiles such as VIP addresses, services, etc.
• Service characteristics
• Information about processes that Oracle Clusterware controls
• Information about any third-party applications controlled by CRS (10g R2 and later)
The OCR stores configuration information in a series of key-value pairs within a directory tree
structure. To view the contents of the OCR in a human-readable format, run the ocrdump
command. This will dump the contents of the OCR into an ASCII text file in the current
directory named OCRDUMPFILE.
The OCR must reside on a shared disk(s) that is accessible by all of the nodes in the cluster.
Oracle Clusterware 10g Release 2 allows you to multiplex the OCR and Oracle recommends that
you use this feature to ensure cluster high availability. Oracle Clusterware allows for a maximum
of two OCR locations; one is the primary and the second is an OCR mirror. If you define a single
OCR, then you should use external mirroring to provide redundancy. You can replace a failed
OCR online, and you can update the OCR through supported APIs such as Enterprise Manager,
the Server Control Utility (SRVCTL), or the Database Configuration Assistant (DBCA).
This article provides a detailed look at how to administer the two critical Oracle Clusterware
components — the voting disk and the Oracle Cluster Registry (OCR). The examples described
in this guide were tested with Oracle RAC 10g Release 2 (10.2.0.4) on the Linux x86 platform.
Example Configuration
The example configuration used in this article consists of a two-node RAC with a clustered
database named racdb.idevelopment.info running Oracle RAC 10g Release 2 on the Linux
x86 platform. The two node names are racnode1 and racnode2, each hosting a single Oracle
instance named racdb1 and racdb2 respectively. For a detailed guide on building the example
clustered database environment, please see:
Building an Inexpensive Oracle RAC 10g Release 2 on Linux - (CentOS 5.3 / iSCSI)
The example Oracle Clusterware environment is configured with a single voting disk and a
single OCR file on an OCFS2 clustered file system. Note that the voting disk is owned by the
oracle user in the oinstall group with 0644 permissions while the OCR file is owned by
root in the oinstall group with 0640 permissions:
[oracle@racnode1 ~]$ ls -l /u02/oradata/racdb
total 16608
-rw-r--r-- 1 oracle oinstall 10240000 Aug 26 22:43 CSSFile
drwxr-xr-x 2 oracle oinstall     3896 Aug 26 23:45 dbs/
-rw-r----- 1 root   oinstall  6836224 Sep  3 23:47 OCRFile
Check Current OCR File
[oracle@racnode1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       4660
         Available space (kbytes) :     257460
         ID                       :    1331197
         Device/File Name         : /u02/oradata/racdb/OCRFile
                                    Device/File integrity check succeeded
                                    Device/File not configured
         Cluster registry integrity check succeeded
Check Current Voting Disk
[oracle@racnode1 ~]$ crsctl query css votedisk
 0.     0    /u02/oradata/racdb/CSSFile
located 1 votedisk(s).
Preparation
To prepare for the examples used in this guide, five new iSCSI volumes were created from the
SAN and will be bound to RAW devices on all nodes in the RAC cluster. These five new
volumes will be used to demonstrate how to move the current voting disk and OCR file from an
OCFS2 file system to RAW devices:
Five New iSCSI Volumes and their Local Device Name Mappings
iSCSI Target Name                          Local Device Name          Disk Size
iqn.2006-01.com.openfiler:racdb.ocr1       /dev/iscsi/ocr1/part       512 MB
iqn.2006-01.com.openfiler:racdb.ocr2       /dev/iscsi/ocr2/part       512 MB
iqn.2006-01.com.openfiler:racdb.voting1    /dev/iscsi/voting1/part    32 MB
iqn.2006-01.com.openfiler:racdb.voting2    /dev/iscsi/voting2/part    32 MB
iqn.2006-01.com.openfiler:racdb.voting3    /dev/iscsi/voting3/part    32 MB
After creating the new iSCSI volumes from the SAN, they now need to be configured for access
and bound to RAW devices by all Oracle RAC nodes in the database cluster.
1. From all Oracle RAC nodes in the cluster as root, discover the five new iSCSI volumes
from the SAN which will be used to store the voting disks and OCR files.
[root@racnode1 ~]# iscsiadm -m discovery -t sendtargets -p openfiler1-san
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.crs
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting3
[root@racnode2 ~]# iscsiadm -m discovery -t sendtargets -p openfiler1-san
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.crs
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting3
2. Log in to the five new iSCSI targets from all Oracle RAC nodes in the cluster:
[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr1 -p 192.168.2.195 -l
[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr2 -p 192.168.2.195 -l
[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting1 -p 192.168.2.195 -l
[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting2 -p 192.168.2.195 -l
[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting3 -p 192.168.2.195 -l
[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr1 -p 192.168.2.195 -l
[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr2 -p 192.168.2.195 -l
[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting1 -p 192.168.2.195 -l
[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting2 -p 192.168.2.195 -l
[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting3 -p 192.168.2.195 -l
3. Create a single primary partition on each of the five new iSCSI volumes that spans the
entire disk. Perform this from only one of the Oracle RAC nodes in the cluster:
[root@racnode1 ~]# fdisk /dev/iscsi/ocr1/part
[root@racnode1 ~]# fdisk /dev/iscsi/ocr2/part
[root@racnode1 ~]# fdisk /dev/iscsi/voting1/part
[root@racnode1 ~]# fdisk /dev/iscsi/voting2/part
[root@racnode1 ~]# fdisk /dev/iscsi/voting3/part
4. Re-scan the SCSI bus from all Oracle RAC nodes in the cluster:
[root@racnode2 ~]# partprobe
5. Create a shell script (/usr/local/bin/setup_raw_devices.sh) on all Oracle RAC
nodes in the cluster to bind the five Oracle Clusterware component devices to RAW
devices as follows:
# +---+
# | FILE: /usr/local/bin/setup_raw_devices.sh |
# +---+

# +---+
# | Bind OCR files to RAW device files. |
# +---+
/bin/raw /dev/raw/raw1 /dev/iscsi/ocr1/part1
/bin/raw /dev/raw/raw2 /dev/iscsi/ocr2/part1
sleep 3
/bin/chown root:oinstall /dev/raw/raw1
/bin/chown root:oinstall /dev/raw/raw2
/bin/chmod 0640 /dev/raw/raw1
/bin/chmod 0640 /dev/raw/raw2

# +---+
# | Bind voting disks to RAW device files. |
# +---+
/bin/raw /dev/raw/raw3 /dev/iscsi/voting1/part1
/bin/raw /dev/raw/raw4 /dev/iscsi/voting2/part1
/bin/raw /dev/raw/raw5 /dev/iscsi/voting3/part1
sleep 3
/bin/chown oracle:oinstall /dev/raw/raw3
/bin/chown oracle:oinstall /dev/raw/raw4
/bin/chown oracle:oinstall /dev/raw/raw5
/bin/chmod 0644 /dev/raw/raw3
/bin/chmod 0644 /dev/raw/raw4
/bin/chmod 0644 /dev/raw/raw5
6. From all Oracle RAC nodes in the cluster, change the permissions of the new shell script
to execute:
[root@racnode1 ~]# chmod 755 /usr/local/bin/setup_raw_devices.sh
[root@racnode2 ~]# chmod 755 /usr/local/bin/setup_raw_devices.sh
7. Manually execute the new shell script from all Oracle RAC nodes in the cluster to bind
the voting disks to RAW devices:
[root@racnode1 ~]# /usr/local/bin/setup_raw_devices.sh
/dev/raw/raw1: bound to major 8, minor 97
/dev/raw/raw2: bound to major 8, minor 17
/dev/raw/raw3: bound to major 8, minor 1
/dev/raw/raw4: bound to major 8, minor 49
/dev/raw/raw5: bound to major 8, minor 33
[root@racnode2 ~]# /usr/local/bin/setup_raw_devices.sh
/dev/raw/raw1: bound to major 8, minor 65
/dev/raw/raw2: bound to major 8, minor 49
/dev/raw/raw3: bound to major 8, minor 33
/dev/raw/raw4: bound to major 8, minor 1
/dev/raw/raw5: bound to major 8, minor 17
8. Check that the character (RAW) devices were created from all Oracle RAC nodes in the
cluster:
[root@racnode1 ~]# ls -l /dev/raw
total 0
crw-r----- 1 root   oinstall 162, 1 Sep 24 00:48 raw1
crw-r----- 1 root   oinstall 162, 2 Sep 24 00:48 raw2
crw-r--r-- 1 oracle oinstall 162, 3 Sep 24 00:48 raw3
crw-r--r-- 1 oracle oinstall 162, 4 Sep 24 00:48 raw4
crw-r--r-- 1 oracle oinstall 162, 5 Sep 24 00:48 raw5
[root@racnode2 ~]# ls -l /dev/raw
total 0
crw-r----- 1 root   oinstall 162, 1 Sep 24 00:48 raw1
crw-r----- 1 root   oinstall 162, 2 Sep 24 00:48 raw2
crw-r--r-- 1 oracle oinstall 162, 3 Sep 24 00:48 raw3
crw-r--r-- 1 oracle oinstall 162, 4 Sep 24 00:48 raw4
crw-r--r-- 1 oracle oinstall 162, 5 Sep 24 00:48 raw5
[root@racnode1 ~]# raw -qa
/dev/raw/raw1: bound to major 8, minor 97
/dev/raw/raw2: bound to major 8, minor 17
/dev/raw/raw3: bound to major 8, minor 1
/dev/raw/raw4: bound to major 8, minor 49
/dev/raw/raw5: bound to major 8, minor 33
[root@racnode2 ~]# raw -qa
/dev/raw/raw1: bound to major 8, minor 65
/dev/raw/raw2: bound to major 8, minor 49
/dev/raw/raw3: bound to major 8, minor 33
/dev/raw/raw4: bound to major 8, minor 1
/dev/raw/raw5: bound to major 8, minor 17
9. Include the new shell script in /etc/rc.local to run on each boot from all Oracle RAC
nodes in the cluster:
[root@racnode1 ~]# echo "/usr/local/bin/setup_raw_devices.sh" >> /etc/rc.local
[root@racnode2 ~]# echo "/usr/local/bin/setup_raw_devices.sh" >> /etc/rc.local
10. Once the raw devices are created, use the dd command to zero out the device and make
sure no data is written to the raw devices. Only perform this action from one of the
Oracle RAC nodes in the cluster:
[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw1
dd: writing to '/dev/raw/raw1': No space left on device
1048516+0 records in
1048515+0 records out
536839680 bytes (537 MB) copied, 773.145 seconds, 694 kB/s
[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw2
dd: writing to '/dev/raw/raw2': No space left on device
1048516+0 records in
1048515+0 records out
536839680 bytes (537 MB) copied, 769.974 seconds, 697 kB/s
[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw3
dd: writing to '/dev/raw/raw3': No space left on device
65505+0 records in
65504+0 records out
33538048 bytes (34 MB) copied, 47.9176 seconds, 700 kB/s
[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw4
dd: writing to '/dev/raw/raw4': No space left on device
65505+0 records in
65504+0 records out
33538048 bytes (34 MB) copied, 47.9915 seconds, 699 kB/s
[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw5
dd: writing to '/dev/raw/raw5': No space left on device
65505+0 records in
33538048 bytes (34 MB) copied, 48.2684 seconds, 695 kB/s
It is highly recommended to take a backup of the voting disk and OCR file before making
any changes! Instructions are included in this guide on how to perform backups of the
voting disk and OCR file.
CRS_home
The Oracle Clusterware binaries included in this article (i.e. crs_stat, ocrcheck, crsctl,
etc.) are being executed from the Oracle Clusterware home directory which for the purpose
of this article is /u01/app/crs. The environment variable $ORA_CRS_HOME is set for both the
oracle and root user accounts to this directory and is also included in the $PATH:
[root@racnode1 ~]# echo $ORA_CRS_HOME
/u01/app/crs
[root@racnode1 ~]# which ocrcheck
/u01/app/crs/bin/ocrcheck
Administering the OCR File
View OCR Configuration Information
Two methods exist to verify how many OCR files are configured for the cluster as well as their
location. If the cluster is up and running, use the ocrcheck utility as either the
oracle or root user account:
[oracle@racnode1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       4660
         Available space (kbytes) :     257460
         ID                       :    1331197
         Device/File Name         : /u02/oradata/racdb/OCRFile   <-- OCR (primary)
                                    Device/File integrity check succeeded
                                    Device/File not configured   <-- OCR Mirror (not configured)
         Cluster registry integrity check succeeded
If CRS is down, you can still determine the location and number of OCR files by viewing the file
ocr.loc, whose location is somewhat platform dependent. For example, on the Linux platform it
is located in /etc/oracle/ocr.loc while on Sun Solaris it is located at
/var/opt/oracle/ocr.loc:
[root@racnode1 ~]# cat /etc/oracle/ocr.loc
ocrconfig_loc=/u02/oradata/racdb/OCRFile
local_only=FALSE
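When CRS is down and several nodes need to be checked, the lookup above can be scripted. The following is a minimal sketch, assuming the key=value layout shown in the sample file; the function name ocr_locations is hypothetical, and treating a mirror entry as a key ending in config_loc (for example ocrmirrorconfig_loc) is an assumption rather than something shown in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the OCR location(s) recorded in an ocr.loc file.
# Assumption: every location line has a key ending in "config_loc", as in
# "ocrconfig_loc=..." above; other lines (e.g. local_only=FALSE) are skipped.
ocr_locations() {
    loc_file=${1:-/etc/oracle/ocr.loc}
    # Strip the key and '=' from matching lines and print only the value.
    sed -n 's/^[a-z]*config_loc=//p' "$loc_file"
}

# Demonstration against a temporary copy of the sample file shown above.
demo_file=$(mktemp)
printf 'ocrconfig_loc=/u02/oradata/racdb/OCRFile\nlocal_only=FALSE\n' > "$demo_file"
ocr_locations "$demo_file"
rm -f "$demo_file"
```

Called without an argument, the function reads the Linux default path /etc/oracle/ocr.loc; on Sun Solaris the path /var/opt/oracle/ocr.loc would be passed instead.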
To view the actual contents of the OCR in a human-readable format, run the ocrdump command.
This command requires the CRS stack to be running. Running the ocrdump command will dump
the contents of the OCR into an ASCII text file in the current directory named
OCRDUMPFILE:
[root@racnode1 ~]# ocrdump
[root@racnode1 ~]# ls -l OCRDUMPFILE
-rw-r--r-- 1 root root 250304 Oct 2 22:46 OCRDUMPFILE
The ocrdump utility also allows for different output options:
#
# Write OCR contents to specified file name.
#
[root@racnode1 ~]# ocrdump /tmp/`hostname`_ocrdump_`date +%m%d%y:%H%M`
#
# Print OCR contents to the screen.
#
[root@racnode1 ~]# ocrdump -stdout -keyname SYSTEM.css
#
# Write OCR contents out to XML format.
#
[root@racnode1 ~]# ocrdump -stdout -keyname SYSTEM.css -xml > ocrdump.xml
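The host-and-timestamp file name used in the first option above can be built in a small function so that scheduled dumps are named consistently. This is only a sketch: the function name ocrdump_target is hypothetical, and uname -n is used here as a portable stand-in for hostname.

```shell
#!/bin/sh
# Hypothetical helper: build a dump file path of the form
# /tmp/<host>_ocrdump_<MMDDYY:HHMM>, following the naming used above.
# Both arguments are optional and default to the local host and current time.
ocrdump_target() {
    host=${1:-$(uname -n)}
    stamp=${2:-$(date +%m%d%y:%H%M)}
    printf '/tmp/%s_ocrdump_%s\n' "$host" "$stamp"
}

# Only the path is printed here; an actual dump would be run as root with:
#   ocrdump "$(ocrdump_target)"
ocrdump_target racnode1 100209:2246   # prints /tmp/racnode1_ocrdump_100209:2246
```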
Add an OCR File
Starting with Oracle Clusterware 10g Release 2 (10.2), users now have the ability to multiplex
(mirror) the OCR. Oracle Clusterware allows for a maximum of two OCR locations; one is the
primary and the second is an OCR mirror. To avoid simultaneous loss of multiple OCR files,
each copy of the OCR should be placed on a shared storage device that does not share any
components (controller, interconnect, and so on) with the storage devices used for the other OCR
file.
Before attempting to add a mirrored OCR, determine how many OCR files are currently
configured for the cluster as well as their location. If the cluster is up and running, use the
ocrcheck utility as either the oracle or root user account:
[oracle@racnode1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       4660
         Available space (kbytes) :     257460
         ID                       :    1331197
         Device/File Name         : /u02/oradata/racdb/OCRFile   <-- OCR (primary)
                                    Device/File integrity check succeeded
                                    Device/File not configured   <-- OCR Mirror (not configured yet)
         Cluster registry integrity check succeeded
If CRS is down, you can still determine the location and number of OCR files by viewing the file
ocr.loc, whose location is somewhat platform dependent. For example, on the Linux platform it
is located in /etc/oracle/ocr.loc while on Sun Solaris it is located at
/var/opt/oracle/ocr.loc:
[root@racnode1 ~]# cat /etc/oracle/ocr.loc
ocrconfig_loc=/u02/oradata/racdb/OCRFile
local_only=FALSE
The results above indicate I have only one OCR file and that it is located on an OCFS2 file
system. Since we are allowed a maximum of two OCR locations, I intend to create an OCR
mirror and locate it on the same OCFS2 file system in the same directory as the primary OCR.
Please note that I am doing this only for the sake of brevity. The OCR mirror should always be
placed on a separate device from the primary OCR file to guard against a single point of failure.
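As a rough sketch of the same check in script form, the helper below reads an ocr.loc file and lists the locations it names. The function names ocr_locations and ocr_location_count are hypothetical (they are not part of Oracle Clusterware), and the sketch assumes the simple key=value layout shown above.

```shell
# Hypothetical helpers (not Oracle tools): list the OCR locations named in an
# ocr.loc file. $1 is the path to the file, e.g. /etc/oracle/ocr.loc on Linux.
ocr_locations() {
    awk -F= '
        $1 == "ocrconfig_loc"       { print "primary " $2 }
        $1 == "ocrmirrorconfig_loc" { print "mirror "  $2 }
    ' "$1"
}

# Count how many OCR locations the file names (0, 1 or 2).
ocr_location_count() {
    ocr_locations "$1" | wc -l | tr -d ' '
}
```

For example, ocr_locations /etc/oracle/ocr.loc prints one line per configured location, which works even while CRS is down.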
Note that the Oracle Clusterware stack should be online and running on all nodes in the cluster
while adding, replacing, or removing an OCR location. These operations therefore do not require
any system downtime.
The operations performed in this section affect the OCR for the entire cluster. However, the
ocrconfig command cannot modify OCR configuration information for nodes that are shut
down or for nodes on which Oracle Clusterware is not running. So, you should avoid shutting
down nodes while modifying the OCR using the ocrconfig command. If, for any reason, any
of the nodes in the cluster are shut down while modifying the OCR using the ocrconfig
command, you will need to perform a repair on the stopped node before it can be brought online
to join the cluster. Please see the section "Repair an OCR File on a Local Node" for
instructions on repairing the OCR file on the affected node.
You can add an OCR mirror after an upgrade or after completing the Oracle Clusterware
installation. The Oracle Universal Installer (OUI) allows you to configure either one or two OCR
locations during the installation of Oracle Clusterware. If you already mirror the OCR, then you
do not need to add a new OCR location; Oracle Clusterware automatically manages two OCRs
when you configure normal redundancy for the OCR. As previously mentioned, Oracle RAC
environments do not support more than two OCR locations; a primary OCR and a secondary
(mirrored) OCR.
Run the following command to add or relocate an OCR mirror using either destination_file or
disk to designate the target location of the additional OCR:
ocrconfig -replace ocrmirror <destination_file>
ocrconfig -replace ocrmirror <disk>
You must be logged in as the root user to run the ocrconfig command.
Please note that ocrconfig -replace is the only way to add/relocate OCR files/mirrors.
Attempting to copy the existing OCR file to a new location and then manually
adding/changing the file pointer in the ocr.loc file is not supported and will actually fail to
work.
For example:
#
# Verify CRS is running on node 1.
#
[root@racnode1 ~]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
#
# Verify CRS is running on node 2.
#
[root@racnode2 ~]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
#
# Configure the shared OCR destination_file/disk before
# attempting to create the new ocrmirror on it. This example
# creates a destination_file on an OCFS2 file system.
# Failure to pre-configure the new destination_file/disk
# before attempting to run ocrconfig will result in the
# following error:
#
# PROT-21: Invalid parameter
#
[root@racnode1 ~]# cp /dev/null /u02/oradata/racdb/OCRFile_mirror
[root@racnode1 ~]# chown root /u02/oradata/racdb/OCRFile_mirror
[root@racnode1 ~]# chgrp oinstall /u02/oradata/racdb/OCRFile_mirror
[root@racnode1 ~]# chmod 640 /u02/oradata/racdb/OCRFile_mirror
#
# Add new OCR mirror.
#
[root@racnode1 ~]# ocrconfig -replace ocrmirror /u02/oradata/racdb/OCRFile_mirror
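Because the preparation steps must be repeated for every new destination file, it can help to generate them from a single path. The sketch below only prints the commands from the example above so they can be reviewed before being run as root; the function name ocr_prepare_commands is hypothetical, and the owner, group and mode are simply the values used in this example.

```shell
# Hypothetical helper: print (do not run) the commands that pre-create an
# empty OCR mirror destination file and then register it as the OCR mirror.
# $1 = full path of the new destination file on shared storage.
ocr_prepare_commands() {
    printf 'cp /dev/null %s\n' "$1"
    printf 'chown root %s\n' "$1"
    printf 'chgrp oinstall %s\n' "$1"
    printf 'chmod 640 %s\n' "$1"
    printf 'ocrconfig -replace ocrmirror %s\n' "$1"
}
```

Printing rather than executing keeps the sketch safe to try on any machine; the output can be pasted into a root shell once the path has been checked.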
After adding the new OCR mirror, check that it can be seen from all nodes in the cluster:
#
# Verify new OCR mirror from node 1.
#
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /u02/oradata/racdb/OCRFile
                               Device/File integrity check succeeded
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror <-- New OCR Mirror
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
[root@racnode1 ~]# cat /etc/oracle/ocr.loc
#Device/file getting replaced by device /u02/oradata/racdb/OCRFile_mirror
ocrconfig_loc=/u02/oradata/racdb/OCRFile
ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror
#
# Verify new OCR mirror from node 2.
#
[root@racnode2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /u02/oradata/racdb/OCRFile
                               Device/File integrity check succeeded
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror <-- New OCR Mirror
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
[root@racnode2 ~]# cat /etc/oracle/ocr.loc
#Device/file getting replaced by device /u02/oradata/racdb/OCRFile_mirror
ocrconfig_loc=/u02/oradata/racdb/OCRFile
ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror
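Comparing the ocrcheck output from each node by eye becomes tedious on larger clusters. One possible shortcut, sketched below, is to reduce the output to just the configured device names so the lists from different nodes can be compared directly. The function name ocrcheck_devices is hypothetical, and the sketch assumes the "Device/File Name : path" layout shown in the output above.

```shell
# Hypothetical helper: read `ocrcheck` output on stdin and print each
# configured device or file, one per line. Anything after the path (such as
# the "<-- ..." annotations used in this article) is dropped.
ocrcheck_devices() {
    sed -n 's/^[[:space:]]*Device\/File Name[[:space:]]*:[[:space:]]*\([^[:space:]]*\).*$/\1/p'
}
```

Running ocrcheck | ocrcheck_devices on every node and comparing the results shows quickly whether all nodes see the same primary OCR and mirror.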
As mentioned earlier, you can have at most two OCR files in the cluster; the primary OCR and a
single OCR mirror. Attempting to add an extra mirror will actually relocate the current OCR
mirror to the new location specified in the command:
[root@racnode1 ~]# cp /dev/null /u02/oradata/racdb/OCRFile_mirror2
[root@racnode1 ~]# chown root /u02/oradata/racdb/OCRFile_mirror2
[root@racnode1 ~]# chgrp oinstall /u02/oradata/racdb/OCRFile_mirror2
[root@racnode1 ~]# chmod 640 /u02/oradata/racdb/OCRFile_mirror2
[root@racnode1 ~]# ocrconfig -replace ocrmirror /u02/oradata/racdb/OCRFile_mirror2
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /u02/oradata/racdb/OCRFile
                               Device/File integrity check succeeded
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror2 <-- Mirror was Relocated!
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
Relocate an OCR File
Just as a new OCR mirror can be added while the CRS stack is online, an OCR file or OCR
mirror can be relocated while the CRS stack is online, so relocation does not require any system
downtime.
You can relocate OCR only when the OCR is mirrored. A mirror copy of the OCR file is
required to move the OCR online. If there is no mirror copy of the OCR, first create the
mirror using the instructions in the previous section.
Attempting to relocate OCR when an OCR mirror does not exist will produce the following
error:
ocrconfig -replace ocr /u02/oradata/racdb/OCRFile
PROT-16: Internal Error
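To avoid running into this error, a script can check for a mirror before attempting the relocation. The following is a minimal sketch that inspects an ocr.loc file; the function name ocr_is_mirrored is hypothetical, and the check assumes the ocrconfig_loc and ocrmirrorconfig_loc entries shown earlier in this section.

```shell
# Hypothetical guard: exit status 0 only when the ocr.loc file given as $1
# names both a primary OCR and an OCR mirror.
ocr_is_mirrored() {
    grep -q '^ocrconfig_loc=.' "$1" && grep -q '^ocrmirrorconfig_loc=.' "$1"
}
```

It could be used as: ocr_is_mirrored /etc/oracle/ocr.loc || echo "Add an OCR mirror before relocating the OCR".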
If the OCR mirror is not required in the cluster after relocating the OCR, it can be safely
removed.
Run the following command as the root account to relocate the current OCR file to a new
location using either destination_file or disk to designate the new target location for the OCR:
ocrconfig -replace ocr <destination_file>
ocrconfig -replace ocr <disk>
Run the following command as the root account to relocate the current OCR mirror to a new
location using either destination_file or disk to designate the new target location for the OCR
mirror:
ocrconfig -replace ocrmirror <destination_file>
ocrconfig -replace ocrmirror <disk>
The following example assumes the OCR is mirrored and demonstrates how to relocate the
current OCR file (/u02/oradata/racdb/OCRFile) from the OCFS2 file system to a new raw
device (/dev/raw/raw1):
#
# Verify CRS is running on node 1.
#
[root@racnode1 ~]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
#
# Verify CRS is running on node 2.
#
[root@racnode2 ~]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
#
# Verify current OCR configuration.
#
[root@racnode2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /u02/oradata/racdb/OCRFile <-- Current OCR to Relocate
                               Device/File integrity check succeeded
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
#
# Verify new raw storage device exists, is configured with
# the correct permissions, and can be seen from all nodes
# in the cluster.
#
[root@racnode1 ~]# ls -l /dev/raw/raw1
crw-r----- 1 root oinstall 162, 1 Oct 2 19:54 /dev/raw/raw1
[root@racnode2 ~]# ls -l /dev/raw/raw1
crw-r----- 1 root oinstall 162, 1 Oct 2 19:54 /dev/raw/raw1
#
# Clear out the contents from the new raw device.
#
[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw1
#
# Relocate primary OCR file to new raw device. Note that
# there is no deletion of the old OCR file but simply a
# replacement.
#
[root@racnode1 ~]# ocrconfig -replace ocr /dev/raw/raw1
After relocating the OCR file, check that the change can be seen from all nodes in the cluster:
#
# Verify new OCR file from node 1.
#
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /dev/raw/raw1 <-- Relocated OCR
                               Device/File integrity check succeeded
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
[root@racnode1 ~]# cat /etc/oracle/ocr.loc
#Device/file /u02/oradata/racdb/OCRFile getting replaced by device /dev/raw/raw1
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror
#
# Verify new OCR file from node 2.
#
[root@racnode2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /dev/raw/raw1 <-- Relocated OCR
                               Device/File integrity check succeeded
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
[root@racnode2 ~]# cat /etc/oracle/ocr.loc
#Device/file /u02/oradata/racdb/OCRFile getting replaced by device /dev/raw/raw1
ocrconfig_loc=/dev/raw/raw1
After verifying the relocation was successful, remove the old OCR file at the OS level:
[root@racnode1 ~]# rm -v /u02/oradata/racdb/OCRFile
removed '/u02/oradata/racdb/OCRFile'
Repair an OCR File on a Local Node
As mentioned in the previous section, the ocrconfig command cannot modify OCR
configuration information for nodes that are shut down or for nodes on which Oracle Clusterware
is not running. You may need to repair an OCR configuration on a particular node if your OCR
configuration changes while that node is stopped. For example, you may need to repair the OCR
on a node that was shut down while you were adding, replacing, or removing an OCR.
To repair an OCR configuration, run the following command as root from the node on which
you have stopped the Oracle Clusterware daemon:
ocrconfig -repair ocr device_name
To repair an OCR mirror configuration, run the following command as root from the node on
which you have stopped the Oracle Clusterware daemon:
ocrconfig -repair ocrmirror device_name
You cannot perform this operation on a node on which the Oracle Clusterware daemon is
running. The CRS stack must be shut down before attempting to repair the OCR
configuration on the local node.
The ocrconfig -repair command changes the OCR configuration only on the node from
which you run this command. For example, if the OCR mirror was relocated to a disk named
/dev/raw/raw2 from racnode1 while the node racnode2 was down, then use the command
ocrconfig -repair ocrmirror /dev/raw/raw2 on racnode2 while the CRS stack is down
on that node to repair its OCR configuration:
#
# Shut down the CRS stack on node 2 and verify the CRS stack is not up.
#
[root@racnode2 ~]# crsctl stop crs
Stopping resources. This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[root@racnode2 ~]# ps -ef | grep d.bin | grep -v grep
#
# Relocate OCR mirror to new raw device from node 1. Note
# that node 2 is down (actually CRS down on node 2) while
# we relocate the OCR mirror.
#
[root@racnode1 ~]# ocrconfig -replace ocrmirror /dev/raw/raw2
#
# Verify relocated OCR mirror from node 1.
#
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /dev/raw/raw1
                               Device/File integrity check succeeded
    Device/File Name         : /dev/raw/raw2 <-- Relocated OCR Mirror
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
[root@racnode1 ~]# cat /etc/oracle/ocr.loc
#Device/file /u02/oradata/racdb/OCRFile_mirror getting replaced by device /dev/raw/raw2
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw2
#
# Node 2 does not know about the OCR mirror relocation.
#
[root@racnode2 ~]# cat /etc/oracle/ocr.loc
#Device/file /u02/oradata/racdb/OCRFile getting replaced by device /dev/raw/raw1
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror
#
# While the CRS stack is down on node 2, perform a local OCR
# repair operation to inform the node of the relocated OCR
# mirror. The ocrconfig -repair option will only update the
# OCR configuration information on node 2. If there were
# other nodes shut down during the relocation, they too will
# need to be repaired.
#
[root@racnode2 ~]# ocrconfig -repair ocrmirror /dev/raw/raw2
#
# Verify the repair updated the OCR configuration on node 2.
#
[root@racnode2 ~]# cat /etc/oracle/ocr.loc
#Device/file /u02/oradata/racdb/OCRFile_mirror getting replaced by device /dev/raw/raw2
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw2
#
# Bring up the CRS stack on node 2.
#
[root@racnode2 ~]# crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
#
# Verify node 2 is back online.
#
[root@racnode2 ~]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.racdb.db   application    ONLINE    ONLINE    racnode1
ora....b1.inst application    ONLINE    ONLINE    racnode1
ora....b2.inst application    ONLINE    ONLINE    racnode2
ora....srvc.cs application    ONLINE    ONLINE    racnode1
ora....db1.srv application    ONLINE    ONLINE    racnode1
ora....db2.srv application    ONLINE    ONLINE    racnode2
ora....SM1.asm application    ONLINE    ONLINE    racnode1
ora....E1.lsnr application    ONLINE    ONLINE    racnode1
ora....de1.gsd application    ONLINE    ONLINE    racnode1
ora....de1.ons application    ONLINE    ONLINE    racnode1
ora....de1.vip application    ONLINE    ONLINE    racnode1
ora....SM2.asm application    ONLINE    ONLINE    racnode2
ora....E2.lsnr application    ONLINE    ONLINE    racnode2
ora....de2.gsd application    ONLINE    ONLINE    racnode2
ora....de2.ons application    ONLINE    ONLINE    racnode2
ora....de2.vip application    ONLINE    ONLINE    racnode2
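When several nodes were down during an OCR change, working out which repair each one needs can be scripted. The sketch below compares a stale ocr.loc with an up-to-date copy taken from a healthy node and prints the matching ocrconfig -repair commands without running them. The function name ocr_repair_commands is hypothetical, how the up-to-date copy reaches the stale node is left open, and the sketch only handles locations that changed (not a mirror that was removed).

```shell
# Hypothetical helper: $1 = up-to-date ocr.loc copied from a healthy node,
# $2 = ocr.loc on the node that was down. Prints the repair commands needed;
# run them as root on the stale node while its CRS stack is stopped.
ocr_repair_commands() (
    for key in ocrconfig_loc ocrmirrorconfig_loc; do
        want=$(sed -n "s/^$key=//p" "$1")
        have=$(sed -n "s/^$key=//p" "$2")
        # Skip entries that are absent from the reference or already match.
        [ -n "$want" ] && [ "$want" != "$have" ] || continue
        case $key in
            ocrconfig_loc)       echo "ocrconfig -repair ocr $want" ;;
            ocrmirrorconfig_loc) echo "ocrconfig -repair ocrmirror $want" ;;
        esac
    done
)
```

With the two ocr.loc files from the example above as input, the function prints the single command ocrconfig -repair ocrmirror /dev/raw/raw2.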
Remove an OCR File
To remove an OCR, you need to have at least one OCR online. You may need to perform this to
reduce overhead or for other storage reasons, such as stopping a mirror to move it to SAN, RAID
etc. Carry out the following steps:
• Check if at least one OCR is online
• Verify the CRS stack is online, preferably on all nodes
• Remove the OCR or OCR mirror
• If using a clustered file system, remove the deleted file at the OS level
Run the following command as the root account to delete the current OCR or the current OCR
mirror:
ocrconfig -replace ocr
or
ocrconfig -replace ocrmirror
For example:
#
# Verify CRS is running on node 1.
#
[root@racnode1 ~]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
#
# Verify CRS is running on node 2.
#
[root@racnode2 ~]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
#
# Get the existing OCR file information by running ocrcheck
# utility.
#
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /dev/raw/raw1
                               Device/File integrity check succeeded
    Device/File Name         : /dev/raw/raw2 <-- OCR Mirror to be Removed
                               Device/File integrity check succeeded
    Cluster registry integrity check succeeded
#
# Delete OCR mirror from the cluster configuration.
#
[root@racnode1 ~]# ocrconfig -replace ocrmirror
After removing the OCR mirror, check that the change is seen from all nodes in the cluster:
#
# Verify OCR mirror was removed from node 1.
#
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /dev/raw/raw1
                               Device/File integrity check succeeded
                               Device/File not configured <-- OCR Mirror Removed
    Cluster registry integrity check succeeded
#
# Verify OCR mirror was removed from node 2.
#
[root@racnode2 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /dev/raw/raw1
                               Device/File integrity check succeeded
                               Device/File not configured <-- OCR Mirror Removed
    Cluster registry integrity check succeeded
Removing the OCR or OCR mirror from the cluster configuration does not remove the
physical file at the OS level when using a clustered file system.
Backup the OCR File
There are two methods for backing up the contents of the OCR and each backup method can be
used for different recovery purposes. This section discusses how to ensure the stability of the
cluster by implementing a robust backup strategy.
The first type of backup relies on automatically generated OCR file copies which are sometimes
referred to as physical backups. These physical OCR file copies are automatically generated by
the CRSD process on the master node and are primarily used to recover the OCR from a lost or
corrupt OCR file. Your backup strategy should include procedures to copy these automatically
generated OCR file copies to a secure location which is accessible from all nodes in the cluster in
the event the OCR needs to be restored.
The second type of backup uses manual procedures to create OCR export files, also known as
logical backups. Creating a manual OCR export file should be performed both before and after
making significant configuration changes to the cluster, such as adding or deleting nodes from
your environment, modifying Oracle Clusterware resources, or creating a database. If a
configuration change made to the OCR causes errors, the OCR can be restored to a
previous state by performing an import of the logical backup taken before the configuration
change. Please note that an OCR logical export can also be used to restore the OCR from a lost
or corrupt OCR file.
Unlike the methods used to back up the voting disk, attempting to back up the OCR by
copying the OCR file directly at the OS level is not a valid backup and will result in errors
after the restore!
Because of the importance of OCR information, Oracle recommends that you make copies of the
automatically created backup files and an OCR export at least once a day. The following is a
working UNIX script that can be scheduled in CRON to back up the OCR File(s) and the Voting
Disk(s) on a regular basis:
crs_components_backup_10g.ksh
Automatic OCR Backups
Oracle Clusterware automatically creates OCR physical backups every four hours. At any
one time, Oracle retains the three most recent of these four-hourly backup copies. The
CRSD process that creates these backups also creates and retains an OCR backup for each full
day and at the end of each week. You cannot customize the backup frequencies or the number of
OCR physical backup files that Oracle retains.
The default location for generating physical backups on UNIX-based systems is
CRS_home/cdata/cluster_name where cluster_name is the name of your cluster (for example,
crs). Use the ocrconfig -showbackup command to view all current OCR physical backups that
were taken from the master node:
[oracle@racnode1 ~]$ ocrconfig -showbackup
racnode1 2009/09/29 13:05:22 /u01/app/crs/cdata/crs
racnode1 2009/09/29 09:05:22 /u01/app/crs/cdata/crs
racnode1 2009/09/29 05:05:22 /u01/app/crs/cdata/crs
racnode1 2009/09/28 05:05:21 /u01/app/crs/cdata/crs
racnode1 2009/09/22 05:05:13 /u01/app/crs/cdata/crs
[oracle@racnode1 ~]$ ls -l $ORA_CRS_HOME/cdata/crs
total 59276
-rw-r--r-- 1 root root 8654848 Sep 29 13:05 backup00.ocr <-- Most recent physical backup
-rw-r--r-- 1 root root 8654848 Sep 29 09:05 backup01.ocr
-rw-r--r-- 1 root root 8654848 Sep 29 05:05 backup02.ocr
-rw-r--r-- 1 root root 8654848 Sep 29 05:05 day_.ocr
-rw-r--r-- 1 root root 8654848 Sep 28 05:05 day.ocr <-- One day old
-rw-r--r-- 1 root root 8654848 Sep 29 05:05 week_.ocr
-rw-r--r-- 1 root root 8654848 Sep 22 05:05 week.ocr <-- One week old
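A monitoring script may want to confirm that a recent physical backup exists. As a small sketch, the helper below picks the most recent entry from ocrconfig -showbackup output; the function name latest_ocr_backup is hypothetical, and the sketch assumes the four-column layout (node, date, time, directory) shown above.

```shell
# Hypothetical helper: read `ocrconfig -showbackup` output on stdin and print
# the line describing the most recent physical backup.
# The YYYY/MM/DD and HH:MM:SS fields sort correctly as plain text.
latest_ocr_backup() {
    awk 'NF >= 4' | LC_ALL=C sort -b -r -k2,3 | head -n 1
}
```

For example, ocrconfig -showbackup | latest_ocr_backup prints the newest entry regardless of the order in which the backups are listed.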
You can change the location where the CRSD process writes the physical OCR copies to using:
ocrconfig -backuploc <new_dirname>
Restoring the OCR from an automatic physical backup is accomplished using the
ocrconfig -restore command. Note that the CRS stack needs to be shut down on all nodes in the
cluster prior to running the restore operation:
ocrconfig -restore <backup_file_name>
You cannot restore the OCR from a physical backup using the -import option. The only method
to restore the OCR from a physical backup is to use the -restore option.
The Master Node
As documented in Doc ID: 357262.1 on the My Oracle Support web site, the CRSD process only
creates automatic OCR physical backups on one node in the cluster, which is the OCR master
node. It does not create automatic backup copies on the other nodes; only from the OCR master
node. If the master node fails, the OCR backups will be created from the new master node. You
can determine which node in the cluster is the master node by examining the
$ORA_CRS_HOME/log/<node_name>/cssd/ocssd.log file on any node in the cluster. In this log
file, check for reconfiguration information (reconfiguration successful) after which you will see
which node is the master and how many nodes are active in the cluster:
Node 1 - (racnode1)
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 2 nodes
[ CSSD]CLSS-3001: local node number 1, master node number 1
Node 2 - (racnode2)
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 2 nodes
[ CSSD]CLSS-3001: local node number 2, master node number 1
Another quick approach is to use either of the following methods:
Node 1 - (racnode1)
# grep -i "master node" $ORA_CRS_HOME/log/racnode?/cssd/ocssd.log | tail -1
[ CSSD]CLSS-3001: local node number 1, master node number 1
Node 2 - (racnode2)
# grep -i "master node" $ORA_CRS_HOME/log/racnode?/cssd/ocssd.log | tail -1
[ CSSD]CLSS-3001: local node number 2, master node number 1
#
# If not found in the ocssd.log, then look through all
# of the ocssd archives:
#
Node 1 - (racnode1)
# for x in `ls -tr $ORA_CRS_HOME/log/racnode?/cssd/ocssd.*`
do grep -i "master node" $x; done | tail -1
[ CSSD]CLSS-3001: local node number 1, master node number 1
Node 2 - (racnode2)
# for x in `ls -tr $ORA_CRS_HOME/log/racnode?/cssd/ocssd.*`
do grep -i "master node" $x; done | tail -1
[ CSSD]CLSS-3001: local node number 2, master node number 1
#
# The master node information is confirmed by the
# ocrconfig -showbackup command:
#
# ocrconfig -showbackup
racnode1 2009/09/29 13:05:22 /u01/app/crs/cdata/crs
racnode1 2009/09/29 05:05:22 /u01/app/crs/cdata/crs
racnode1 2009/09/28 05:05:21 /u01/app/crs/cdata/crs
racnode1 2009/09/22 05:05:13 /u01/app/crs/cdata/crs
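If only the master node number is needed, for instance to decide in a script which node to copy backups from, the CLSS-3001 message can be reduced to that number. This is a minimal sketch; the function name master_node_number is hypothetical and the pattern assumes the message format shown in the log excerpts above.

```shell
# Hypothetical helper: read ocssd.log lines on stdin and print the master node
# number from the last CLSS-3001 message, or nothing if there is none.
master_node_number() {
    sed -n 's/.*CLSS-3001:.*master node number \([0-9][0-9]*\).*/\1/p' | tail -n 1
}
```

It reads from standard input, so it can be fed a single log file or the concatenated ocssd archives in chronological order.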
Because of the importance of OCR information, Oracle recommends that you make copies of the
automatically created backup files at least once a day from the master node to a different device
from where the primary OCR resides. You can use any backup software to copy the
automatically generated physical backup files to a stable backup location:
[root@racnode1 ~]# cp -p -v -f -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RACNODE1
'/u01/app/crs/cdata/crs/day_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/day_.ocr'
'/u01/app/crs/cdata/crs/backup02.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/backup02.ocr'
'/u01/app/crs/cdata/crs/backup01.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/backup01.ocr'
'/u01/app/crs/cdata/crs/week_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/week_.ocr'
'/u01/app/crs/cdata/crs/day.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/day.ocr'
'/u01/app/crs/cdata/crs/backup00.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/backup00.ocr'
'/u01/app/crs/cdata/crs/week.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/week.ocr'
[root@racnode2 ~]# cp -p -v -f -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RACNODE2
'/u01/app/crs/cdata/crs/day_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/day_.ocr'
'/u01/app/crs/cdata/crs/backup02.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/backup02.ocr'
'/u01/app/crs/cdata/crs/backup01.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/backup01.ocr'
'/u01/app/crs/cdata/crs/week_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/week_.ocr'
'/u01/app/crs/cdata/crs/day.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/day.ocr'
'/u01/app/crs/cdata/crs/backup00.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/backup00.ocr'
'/u01/app/crs/cdata/crs/week.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/week.ocr'
It is possible, and recommended, to use shared storage for the backup location(s). Keep in
mind that if the master node goes down and cannot be rebooted, it is possible to lose all OCR
physical backups if they were all on that node. The OCR backup process, however, will start on
the new master node within four hours for all new backups. It is highly recommended that you
integrate OCR backups with your normal database backup strategy. If possible, use a backup
location that is shared by all nodes in the cluster.
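One way to keep earlier copies when the backups are copied on a schedule is to place each run in its own dated directory. The sketch below does this for the *.ocr files only; the function name archive_ocr_backups and the directory naming scheme are assumptions for illustration, not part of Oracle Clusterware.

```shell
# Hypothetical helper: copy every *.ocr file from the backup directory $1 into
# a new sub-directory of $2 named after the current date and time, preserving
# file timestamps. Prints the directory that was created.
archive_ocr_backups() (
    src=$1
    dest=$2/$(date +%Y%m%d_%H%M%S)
    mkdir -p "$dest" || exit 1
    for f in "$src"/*.ocr; do
        [ -e "$f" ] || continue      # skip the literal pattern when nothing matches
        cp -p "$f" "$dest"/ || exit 1
    done
    echo "$dest"
)
```

Using the paths from this article, it might be called as archive_ocr_backups /u01/app/crs/cdata/crs /u02/crs_backup/ocrbackup/RACNODE1 from a root cron job on the master node.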
Performing a manual export of the OCR should be done before and after making significant
configuration changes to the cluster, such as adding or deleting nodes from your environment,
modifying Oracle Clusterware resources, or creating a database. This type of backup is often
referred to as a logical backup. If a configuration change made to the OCR
causes errors, the OCR can be restored to its previous state by performing an import of the
logical backup taken before the configuration change. For example, if you have unresolvable
configuration problems, or if you are unable to restart your cluster database after such changes,
then you can restore your configuration by importing the saved OCR content from a valid
configuration.
Please note that an OCR logical export can also be used to restore the OCR from a lost or corrupt
OCR file.
To export the contents of the OCR to a dump file, use the following command, where
backup_file_name is the name of the OCR logical backup file you want to create:
ocrconfig -export <backup_file_name>
For example:
[root@racnode1 ~]# ocrconfig -export /u02/crs_backup/ocrbackup/RACNODE1/exports/OCRFileBackup.dmp
# A second export is not strictly required, however, there is no such thing as too many backups!
[root@racnode2 ~]# ocrconfig -export /u02/crs_backup/ocrbackup/RACNODE2/exports/OCRFileBackup.dmp
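Since exports are recommended both before and after a configuration change, a fixed file name such as OCRFileBackup.dmp would be overwritten by the second export. A simple way around this, sketched here, is to build a dated file name; the function name ocr_export_path and the naming pattern are hypothetical.

```shell
# Hypothetical helper: build a dated export file name of the form
# <dir>/<host>_ocr_export_<YYYYMMDD_HHMM>.dmp so earlier exports are kept.
# $1 = export directory, $2 = host name to embed in the file name.
ocr_export_path() {
    printf '%s/%s_ocr_export_%s.dmp\n' "$1" "$2" "$(date +%Y%m%d_%H%M)"
}
```

The result can be passed straight to the export command, for example ocrconfig -export "$(ocr_export_path /u02/crs_backup/ocrbackup/RACNODE1/exports racnode1)".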
To restore the OCR from an export/logical backup, use the ocrconfig -import command. Note
that the CRS stack needs to be shut down on all nodes in the cluster prior to running the restore
operation. In addition, the total space required for the restored OCR location (typically 280MB)
has to be pre-allocated. This is especially important when the OCR is located on a clustered file
system like OCFS2.
ocrconfig -import <export_file_name>
You cannot restore the OCR from a logical backup using the -restore option. The only method
to restore the OCR from a logical export is to use the -import option.
You must be logged in as the root user to run the ocrconfig command.
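Because the two backup types are not interchangeable, a restore script can pick the matching option from the file name. The sketch below assumes a naming convention in which physical backups end in .ocr (as the automatic backups do) and logical exports end in .dmp (as in the export example above); the function name ocr_restore_command is hypothetical, and the command is only printed, not run.

```shell
# Hypothetical helper: print the ocrconfig command that matches the backup
# type of the file given as $1, judged only by its extension (an assumption).
# The CRS stack must be shut down on all nodes before running the result.
ocr_restore_command() {
    case $1 in
        *.ocr) echo "ocrconfig -restore $1" ;;
        *.dmp) echo "ocrconfig -import $1" ;;
        *)     echo "unknown backup type: $1" >&2; return 1 ;;
    esac
}
```

An extension check is only a convenience; if your exports use a different suffix, adjust the patterns accordingly.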
Recover the OCR File
If an application fails, then before attempting to restore the OCR, restart the application. As a
definitive verification that the OCR failed, run the ocrcheck command:
[root@racnode1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 2
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 4668
    Available space (kbytes) : 257452
    ID                       : 1331197
    Device/File Name         : /u02/oradata/racdb/OCRFile
                               Device/File integrity check succeeded <-- OCR (primary) Valid
    Device/File Name         : /u02/oradata/racdb/OCRFile_mirror
                               Device/File integrity check succeeded <-- OCR (mirror) Valid
    Cluster registry integrity check succeeded
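The same verification can be turned into a pass/fail check for use in scripts. The sketch below reads ocrcheck output and succeeds only when the cluster registry integrity check is reported as succeeded. The function name ocrcheck_ok is hypothetical, and the "integrity check failed" wording it looks for is an assumption about how a failure is reported; only the success messages appear in the output above.

```shell
# Hypothetical helper: read `ocrcheck` output on stdin. Exit status 0 only if
# the cluster registry integrity check succeeded and no line reports a failed
# integrity check (the failure wording is an assumption).
ocrcheck_ok() {
    awk '
        /Cluster registry integrity check succeeded/ { ok = 1 }
        /integrity check failed/                    { bad = 1 }
        END { exit (ok && !bad) ? 0 : 1 }
    '
}
```

For example: ocrcheck | ocrcheck_ok || echo "OCR needs attention".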