• No results found

What is RAC

N/A
N/A
Protected

Academic year: 2021

Share "What is RAC"

Copied!
67
0
0

Loading.... (view fulltext now)

Full text

(1)

What is RAC?

RAC stands for Real Application cluster. It is a clustering solution from Oracle

Corporation that ensures high availability of databases by providing instance failover, media failover features.

Mention the Oracle RAC software

components:-Oracle RAC is composed of two or more database instances. They are composed of Memory structures and background processes same as the single instance database.

Oracle RAC instances use two processes GES(Global Enqueue Service), GCS(Global Cache Service) that enable cache fusion.

Oracle RAC instances are composed of following background processes: ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process LMON—Global Enqueue Service Monitor LMD—Global Enqueue Service Daemon LMS—Global Cache Service Process LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn) RSMN—Remote Slave Monitor

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintains records of the statuses of each datafile and each cahed block using global resource directory.This process is referred to as cache fusion and helps in data integrity.

Give Details on Cache

Fusion:-Oracle RAC is composed of two or more instances. When a block of data is read from datafile by an instance within the cluster and another instance is in need of the same block,it is easy to get the block image from the insatnce which has the block in its SGA rather than reading from the disk. To enable inter instance communication Oracle RAC makes use of interconnects. The Global Enqueue Service(GES) monitors and Instance enqueue process manages the cahce fusion.

Give Details on

ACMS:-ACMS stands for Atomic Controlfile Memory Service.In an Oracle RAC environment ACMS is an agent that ensures a distributed SGA memory update(ie)SGA updates are globally committed on success or globally aborted in event of a failure.

(2)

Give details on GTX0-j

:-The process provides transparent support for XA global transactions in a RAC environment.The database autotunes the number of these processes based on the workload of XA global transactions.

Give details on

LMON:-This process monitors global enques and resources across the cluster and performs global enqueue recovery operations.This is called as Global Enqueue Service Monitor.

Give details on

LMD:-This process is called as global enqueue service daemon. LMD:-This process manages incoming remote resource requests within each instance.

Give details on

LMS:-This process is called as Global Cache service process.LMS:-This process maintains statuses of datafiles and each cahed block by recording information in a Global Resource

Dectory(GRD).This process also controls the flow of messages to remote instances and manages global data block access and transmits block images between the buffer caches of different instances.This processing is a part of cache fusion feature.

Give details on

LCK0:-This process is called as Instance enqueue process.LCK0:-This process manages non-cache fusion resource requests such as libry and row cache requests.

Give details on

RMSn:-This process is called as Oracle RAC management process.These pocesses perform managability tasks for Oracle RAC.Tasks include creation of resources related Oracle RAC when new instances are added to the cluster.

Give details on

RSMN:-This process is called as Remote Slave Monitor.RSMN:-This process manages background slave process creation andd communication on remote instances. This is a background slave process.This process performs tasks on behalf of a co-ordinating process running in another instance.

What components in RAC must reside in shared storage?

All datafiles, controlfiles, SPFIles, redo log files must reside on cluster-aware shred storage.

(3)

What is the significance of using cluster-aware shared storage in an Oracle RAC environment?

All instances of an Oracle RAC can access all the datafiles,control files, SPFILE's, redolog files when these files are hosted out of cluster-aware shared storage which are group of shared disks.

Give few examples for solutions that support cluster

storage:-ASM(automatic storage management),raw disk devices,network file system(NFS), OCFS2 and OCFS(Oracle Cluster Fie systems).

What is an interconnect network?

an interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch/multiple switches that only the nodes in the cluster can access.

How can we configure the cluster interconnect?

Configure User Datagram Protocol(UDP) on Gigabit ethernet for cluster interconnect.On unia and linux systems we use UDP and RDS(Reliable data socket) protocols to be used by Oracle Clusterware.Windows clusters use the TCP protocol.

Can we use crossover cables with Oracle Clusterware interconnects? No, crossover cables are not supported with Oracle Clusterware intercnects.

What is the use of cluster interconnect?

Cluster interconnect is used by the Cache fusion for inter instance communication.

How do users connect to database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or through one or more middle tiers ,with or without connection pooling.Users can use oracle services feature to connect to database.

What is the use of a service in Oracle RAC environemnt?

Applications should use the services feature to connect to the Oracle database.Services enable us to define rules and characteristics to control how users and applications connect to database instances.

What are the characteriscs controlled by Oracle services feature?

(4)

high availability characteristics.

Which enable the load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connectins use instead of the standard public IP address. To configureVIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node on which the VIP address can accept TCP connections but it cannot accept Oracle connections.

Give situations under which VIP address failover

happens:-VIP addresses failover happens when the node on which the happens:-VIP address runs fails, all interfaces for the VIP address fails, all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?

When a VIP address failover happens, Clients that attempt to connect to the VIP address receive a rapid connection refused error .They don't have to wait for TCP connection timeout messages.

What are the administrative tools used for Oracle RAC environments?

Oracle RAC cluster can be administered as a single image using OEM(Enterprise Manager),SQL*PLUS,Servercontrol(SRVCTL),clusterverificationutility(cvu),DBCA,NE TCA

How do we verify that RAC instances are running?

Issue the following query from any one node connecting through SQL*PLUS. $connect sys/sys as sysdba

SQL>select * from V$ACTIVE_INSTANCES;

The query gives the instance number under INST_NUMBER column,host_:instancename under INST_NAME column.

(5)

What is FAN?

Fast application Notification as it abbreviates to FAN relates to the events related to instances,services and nodes.This is a notification mechanism that Oracle RAc uses to notify other processes about the configuration and service level information that includes service status changes such as,UP or DOWN events.Applications can respond to FAN events and take immediate action.

Where can we apply FAN UP and DOWN events?

FAN UP and FAN DOWN events can be applied to instances,services and nodes.

State the use of FAN events in case of a cluster configuration change?

During times of cluster configuration changes,Oracle RAC high availability framework publishes a FAN event immediately when a state change occurs in the cluster.So applications can receive FAN events and react immediately.This prevents applications from polling database and detecting a problem after such a state change.

Why should we have seperate homes for ASm instance?

It is a good practice to have ASM home seperate from the database

hom(ORACLE_HOME).This helps in upgrading and patching ASM and the Oracle database software independent of each other.Also,we can deinstall the Oracle database software independent of the ASM instance.

What is the advantage of using ASM?

Having ASM is the Oracle recommended storage option for RAC databases as the ASM maximizes performance by managing the storage configuration across the disks.ASM does this by distributing the database file across all of the available storage within our cluster database environment.

What is rolling upgrade?

It is a new ASM feature from Database 11g.ASM instances in Oracle database 11g

release(from 11.1) can be upgraded or patched using rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability.During a rolling upgrade we can maintain a functional cluster while one or more of the nodes in the cluster are running in different software versions.

Can rolling upgrade be used to upgrade from 10g to 11g database? No,it can be used only for Oracle database 11g releases(from 11.1).

(6)

State the initialization parameters that must have same value for every instance in an Oracle RAC

database:-Some initialization parameters are critical at the database creation time and must have same values.Their value must be specified in SPFILE or PFILE for every instance.The list of parameters that must be identical on every instance are given below:

ACTIVE_INSTANCE_COUNT ARCHIVE_LAG_TARGET COMPATIBLE CLUSTER_DATABASE CLUSTER_DATABASE_INSTANCE CONTROL_FILES DB_BLOCK_SIZE DB_DOMAIN DB_FILES DB_NAME DB_RECOVERY_FILE_DEST DB_RECOVERY_FILE_DEST_SIZE DB_UNIQUE_NAME INSTANCE_TYPE (RDBMS or ASM) PARALLEL_MAX_SERVERS REMOTE_LOGIN_PASSWORD_FILE UNDO_MANAGEMENT

Can the DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?

These parameters can be identical on all instances only if these parameter values are set to zero.

What two parameters must be set at the time of starting up an ASM instance in a RAC environment?

The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Mention the components of Oracle

clusterware:-Oracle clusterware is made up of components like voting disk and clusterware:-Oracle Cluster Registry(OCR).

What is a CRS resource?

Oracle clusterware is used to manage high-availability operations in a cluster.Anything that Oracle Clusterware manages is known as a CRS resource.Some examples of CRS resources are database,an instance,a service,a listener,a VIP address,an application process etc.

What is the use of OCR?

(7)

CRS resources stored in OCR(Oracle Cluster Registry).

How does a Oracle Clusterware manage CRS resources?

Oracle clusterware manages CRS resources based on the configuration information of CRS resources stored in OCR(Oracle Cluster Registry).

Name some Oracle clusterware tools and their uses? OIFCFG - allocating and deallocating network interfaces

OCRCONFIG - Command-line tool for managing Oracle Cluster Registry OCRDUMP - Identify the interconnect being used

CVU - Cluster verification utility to get status of CRS resources

What are the modes of deleting instances from ORacle Real Application cluster Databases?

We can delete instances using silent mode or interactive mode using DBCA(Database Configuration Assistant).

How do we remove ASM from a Oracle RAC environment?

We need to stop and delete the instance in the node first in interactive or silent mode.After that asm can be removed using srvctl tool as follows:

srvctl stop asm -n node_name srvctl remove asm -n node_name

We can verify if ASM has been removed by issuing the following command: srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?

Issue the following srvctl command: srvctl config database -d database_name cd CRS_HOME/bin

./crs_stat

How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command : ocrconfig-showbackup

What are the performance views in an Oracle RAC environment?

We have v$ views that are instance specific. In addition we have GV$ views called as global views that has an INST_ID column of numeric data type.GV$ views obtain

(8)

information from individual V$ views.

What are the types of connection load-balancing?

There are two types of connection load-balancing:server-side load balancing and client-side load balancing.

What is the differnece between server-side and client-side connection load balancing? Client-side balancing happens at client side where load balancing is done using listener.In case of server-side load balancing listener uses a load-balancing advisory to redirect connections to the instance providing best service.

Give the usage of

srvctl:-srvctl start instance -d db_name -i "inst_name_list" [-o start_options]srvctl:-srvctl stop instance -d name -i "inst_name_list" [-o stop_options]srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediatesrvctl start database -d name [-o start_options]srvctl stop database -d name [-o stop_options]srvctl start database -d orcl -o mount

cluster demons

Demon means background process

1) crsd - it is update the ocr file. it use for node apps work(ons,vip,gsd) 2) cssd - it is update the vd file. it is monitor the node membership

when ever HB(heart beat) fail between the nodes the cssd process reboot the node 3) evmd - update the diag information

http://www.idevelopment.info/data/Oracle/DBA_tips/Oracle10gRAC/CLUSTER_65.sht ml

Overview

Oracle Clusterware 10g, formerly known as Cluster Ready Services (CRS) is software that when

installed on servers running the same operating system, enables the servers to be bound together

to operate and function as a single server or cluster. This infrastructure simplifies the

requirement for an Oracle Real Application Clusters (RAC) database by providing cluster

software that is tightly integrated with the Oracle Database.

The Oracle Clusterware requires two critical clusterware components: a voting disk to record

node membership information and the Oracle Cluster Registry (OCR) to record cluster

configuration information:

(9)

Voting Disk

The voting disk is a shared partition that Oracle Clusterware uses to verify cluster node

membership and status. Oracle Clusterware uses the voting disk to determine which instances are

members of a cluster by way of a health check and arbitrates cluster ownership among the

instances in case of network failures. The primary function of the voting disk is to manage node

membership and prevent what is known as Split Brain Syndrome in which two or more instances

attempt to control the RAC database. This can occur in cases where there is a break in

communication between nodes through the interconnect.

The voting disk must reside on a shared disk(s) that is accessible by all of the nodes in the

cluster. For high availability, Oracle recommends that you have multiple voting disks. Oracle

Clusterware can be configured to maintain multiple voting disks (multiplexing) but you must

have an odd number of voting disks, such as three, five, and so on. Oracle Clusterware supports a

maximum of 32 voting disks. If you define a single voting disk, then you should use external

mirroring to provide redundancy.

A node must be able to access more than half of the voting disks at any time. For example, if you

have five voting disks configured, then a node must be able to access at least three of the voting

disks at any time. If a node cannot access the minimum required number of voting disks it is

evicted, or removed, from the cluster. After the cause of the failure has been corrected and access

to the voting disks has been restored, you can instruct Oracle Clusterware to recover the failed

node and restore it to the cluster.

Oracle Cluster Registry (OCR)

Maintains cluster configuration information as well as configuration information about any

cluster database within the cluster. OCR is the repository of configuration information for the

cluster that manages information about like the cluster node list and instance-to-node mapping

information. This configuration information is used by many of the processes that make up the

CRS as well as other cluster-aware applications which use this repository to share information

amoung them. Some of the main components included in the OCR are:

Node membership information

Database instance, node, and other mapping information

ASM (if configured)

Application resource profiles such as VIP addresses, services, etc.

Service characteristics

Information about processes that Oracle Clusterware controls

Information about any third-party applications controlled by CRS (10g R2

and later)

The OCR stores configuration information in a series of key-value pairs within a directory tree

structure. To view the contents of the OCR in a human-readable format, run the

ocrdump

command. This will dump the contents of the OCR into an ASCII text file in the current

directory named

OCRDUMPFILE

.

(10)

The OCR must reside on a shared disk(s) that is accessible by all of the nodes in the cluster.

Oracle Clusterware 10g Release 2 allows you to multiplex the OCR and Oracle recommends that

you use this feature to ensure cluster high availability. Oracle Clusterware allows for a maximum

of two OCR locations; one is the primary and the second is an OCR mirror. If you define a single

OCR, then you should use external mirroring to provide redundancy. You can replace a failed

OCR online, and you can update the OCR through supported APIs such as Enterprise Manager,

the Server Control Utility (SRVCTL), or the Database Configuration Assistant (DBCA).

This article provides a detailed look at how to administer the two critical Oracle Clusterware

components — the voting disk and the Oracle Cluster Registry (OCR). The examples described

in this guide were tested with Oracle RAC 10g Release 2 (10.2.0.4) on the Linux x86 platform.

Example Configuration

The example configuration used in this article consists of a two-node RAC with a clustered

database named

racdb.idevelopment.info

running Oracle RAC 10g Release 2 on the Linux

x86 platform. The two node names are

racnode1

and

racnode2

, each hosting a single Oracle

instance named

racdb1

and

racdb2

respectively. For a detailed guide on building the example

clustered database environment, please see:

Building an Inexpensive Oracle RAC 10

g Release 2 on Linux -

(CentOS 5.3 / iSCSI)

The example Oracle Clusterware environment is configured with a single voting disk and a

single OCR file on an OCFS2 clustered file system. Note that the voting disk is owned by the

oracle

user in the

oinstall

group with 0644 permissions while the OCR file is owned by

root

in the

oinstall

group with 0640 permissions:

[oracle@racnode1 ~]$ ls -l /u02/oradata/racdb total 16608

-rw-r--r-- 1 oracle oinstall 10240000 Aug 26 22:43 CSSFile

drwxr-xr-x 2 oracle oinstall 3896 Aug 26 23:45 dbs/

-rw-r--- 1 root oinstall 6836224 Sep 3 23:47 OCRFile

Check Current OCR File

[oracle@racnode1 ~]$ ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4660 Available space (kbytes) : 257460 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile

Device/File integrity check succeeded Device/File not configured

Cluster registry integrity check succeeded

(11)

[oracle@racnode1 ~]$ crsctl query css votedisk 0. 0 /u02/oradata/racdb/CSSFile

located 1 votedisk(s).

Preparation

To prepare for the examples used in this guide, five new iSCSI volumes were created from the

SAN and will be bound to RAW devices on all nodes in the RAC cluster. These five new

volumes will be used to demonstrate how to move the current voting disk and OCR file from an

OCFS2 file system to RAW devices:

Five New iSCSI Volumes and their Local Device Name Mappings

iSCSI Target Name

Local Device Name

Disk Size

iqn.2006-01.com.openfiler:racdb.ocr1 /dev/iscsi/ocr1/part 512 MB

iqn.2006-01.com.openfiler:racdb.ocr2 /dev/iscsi/ocr2/part 512 MB

iqn.2006-01.com.openfiler:racdb.voting1 /dev/iscsi/voting1/part 32 MB iqn.2006-01.com.openfiler:racdb.voting2 /dev/iscsi/voting2/part 32 MB iqn.2006-01.com.openfiler:racdb.voting3 /dev/iscsi/voting3/part 32 MB

After creating the new iSCSI volumes from the SAN, they now need to be configured for access

and bound to RAW devices by all Oracle RAC nodes in the database cluster.

1.

From all Oracle RAC nodes in the cluster as

root

, discover the five new iSCSI volumes

from the SAN which will be used to store the voting disks and OCR files.

[root@racnode1 ~]# iscsiadm -m discovery -t sendtargets -p openfiler1-san 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm1 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm2 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.crs 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr1 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr2 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting1 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting2 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting3

[root@racnode2 ~]# iscsiadm -m discovery -t sendtargets -p openfiler1-san 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm1 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm2 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.crs 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr1 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.ocr2 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting1 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting2 192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.voting3

(12)

[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr1 -p

192.168.2.195 -l

[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr2 -p

192.168.2.195 -l

[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting1

-p 192.168.2.195 -l

[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting2

-p 192.168.2.195 -l

[root@racnode1 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting3

-p 192.168.2.195 -l

[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr1 -p

192.168.2.195 -l

[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.ocr2 -p

192.168.2.195 -l

[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting1

-p 192.168.2.195 -l

[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting2

-p 192.168.2.195 -l

[root@racnode2 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.voting3

-p 192.168.2.195 -l

3. Create a single primary partition on each of the five new iSCSI volumes that span the

entire disk. Perform this from only one of the Oracle RAC nodes in the cluster:

[root@racnode1 ~]# fdisk /dev/iscsi/ocr1/part [root@racnode1 ~]# fdisk /dev/iscsi/ocr2/part [root@racnode1 ~]# fdisk /dev/iscsi/voting1/part [root@racnode1 ~]# fdisk /dev/iscsi/voting2/part [root@racnode1 ~]# fdisk /dev/iscsi/voting3/part

4. Re-scan the SCSI bus from all Oracle RAC nodes in the cluster:

[root@racnode2 ~]# partprobe

5.

Create a shell script (

/usr/local/bin/setup_raw_devices.sh

) on all Oracle RAC

nodes in the cluster to bind the five Oracle Clusterware component devices to RAW

devices as follows:

# +---+ # | FILE: /usr/local/bin/setup_raw_devices.sh | # +---+ # +---+ # | Bind OCR files to RAW device files. | # +---+ /bin/raw /dev/raw/raw1 /dev/iscsi/ocr1/part1

/bin/raw /dev/raw/raw2 /dev/iscsi/ocr2/part1 sleep 3

/bin/chown root:oinstall /dev/raw/raw1 /bin/chown root:oinstall /dev/raw/raw2 /bin/chmod 0640 /dev/raw/raw1

/bin/chmod 0640 /dev/raw/raw2

(13)

# | Bind voting disks to RAW device files. | # +---+ /bin/raw /dev/raw/raw3 /dev/iscsi/voting1/part1

/bin/raw /dev/raw/raw4 /dev/iscsi/voting2/part1 /bin/raw /dev/raw/raw5 /dev/iscsi/voting3/part1 sleep 3

/bin/chown oracle:oinstall /dev/raw/raw3 /bin/chown oracle:oinstall /dev/raw/raw4 /bin/chown oracle:oinstall /dev/raw/raw5 /bin/chmod 0644 /dev/raw/raw3

/bin/chmod 0644 /dev/raw/raw4 /bin/chmod 0644 /dev/raw/raw5

6. From all Oracle RAC nodes in the cluster, change the permissions of the new shell script

to execute:

[root@racnode1 ~]# chmod 755 /usr/local/bin/setup_raw_devices.sh [root@racnode2 ~]# chmod 755 /usr/local/bin/setup_raw_devices.sh

7. Manually execute the new shell script from all Oracle RAC nodes in the cluster to bind

the voting disks to RAW devices:

[root@racnode1 ~]# /usr/local/bin/setup_raw_devices.sh /dev/raw/raw1: bound to major 8, minor 97

/dev/raw/raw2: bound to major 8, minor 17 /dev/raw/raw3: bound to major 8, minor 1 /dev/raw/raw4: bound to major 8, minor 49 /dev/raw/raw5: bound to major 8, minor 33

[root@racnode2 ~]# /usr/local/bin/setup_raw_devices.sh /dev/raw/raw1: bound to major 8, minor 65

/dev/raw/raw2: bound to major 8, minor 49 /dev/raw/raw3: bound to major 8, minor 33 /dev/raw/raw4: bound to major 8, minor 1 /dev/raw/raw5: bound to major 8, minor 17

8. Check that the character (RAW) devices were created from all Oracle RAC nodes in the

cluster:

[root@racnode1 ~]# ls -l /dev/raw total 0

crw-r--- 1 root oinstall 162, 1 Sep 24 00:48 raw1 crw-r--- 1 root oinstall 162, 2 Sep 24 00:48 raw2 crw-r--r-- 1 oracle oinstall 162, 3 Sep 24 00:48 raw3 crw-r--r-- 1 oracle oinstall 162, 4 Sep 24 00:48 raw4 crw-r--r-- 1 oracle oinstall 162, 5 Sep 24 00:48 raw5 [root@racnode2 ~]# ls -l /dev/raw

total 0

crw-r--- 1 root oinstall 162, 1 Sep 24 00:48 raw1 crw-r--- 1 root oinstall 162, 2 Sep 24 00:48 raw2 crw-r--r-- 1 oracle oinstall 162, 3 Sep 24 00:48 raw3 crw-r--r-- 1 oracle oinstall 162, 4 Sep 24 00:48 raw4

(14)

crw-r--r-- 1 oracle oinstall 162, 5 Sep 24 00:48 raw5 [root@racnode1 ~]# raw -qa

/dev/raw/raw1: bound to major 8, minor 97 /dev/raw/raw2: bound to major 8, minor 17 /dev/raw/raw3: bound to major 8, minor 1 /dev/raw/raw4: bound to major 8, minor 49 /dev/raw/raw5: bound to major 8, minor 33 [root@racnode2 ~]# raw -qa

/dev/raw/raw1: bound to major 8, minor 65 /dev/raw/raw2: bound to major 8, minor 49 /dev/raw/raw3: bound to major 8, minor 33 /dev/raw/raw4: bound to major 8, minor 1 /dev/raw/raw5: bound to major 8, minor 17

9.

Include the new shell script in

/etc/rc.local

to run on each boot from all Oracle RAC

nodes in the cluster:

[root@racnode1 ~]# echo "/usr/local/bin/setup_raw_devices.sh" >> /etc/rc.local [root@racnode2 ~]# echo "/usr/local/bin/setup_raw_devices.sh" >> /etc/rc.local 10.

Once the raw devices are created, use the

dd

command to zero out the device and make

sure no data is written to the raw devices. Only perform this action from one of the

Oracle RAC nodes in the cluster:

[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw1 dd: writing to '/dev/raw/raw1': No space left on device 1048516+0 records in

1048515+0 records out

536839680 bytes (537 MB) copied, 773.145 seconds, 694 kB/s [root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw2

dd: writing to '/dev/raw/raw2': No space left on device 1048516+0 records in

1048515+0 records out

536839680 bytes (537 MB) copied, 769.974 seconds, 697 kB/s [root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw3

dd: writing to '/dev/raw/raw3': No space left on device 65505+0 records in

65504+0 records out

33538048 bytes (34 MB) copied, 47.9176 seconds, 700 kB/s [root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw4 dd: writing to '/dev/raw/raw4': No space left on device 65505+0 records in

65504+0 records out

33538048 bytes (34 MB) copied, 47.9915 seconds, 699 kB/s [root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw5 dd: writing to '/dev/raw/raw5': No space left on device 65505+0 records in

(15)

33538048 bytes (34 MB) copied, 48.2684 seconds, 695 kB/s

It is highly recommended to take a backup of the voting disk and OCR file before making

any changes! Instruction are included in this guide on how to perform backups of the

voting

disk

and

OCR file

.

CRS_home

The Oracle Clusterware binaries included in this article (i.e.

crs_stat

,

ocrcheck

,

crsctl

,

etc.) are being executed from the Oracle Clusterware home directory which for the purpose

of this article is

/u01/app/crs

. The environment variable

$ORA_CRS_HOME

is set for both the

oracle

and

root

user accounts to this directory and is also included in the

$PATH

:

[root@racnode1 ~]# echo $ORA_CRS_HOME /u01/app/crs

[root@racnode1 ~]# which ocrcheck /u01/app/crs/bin/ocrcheck

Administering the OCR File

View OCR Configuration Information

Two methods exist to verify how many OCR files are configured for the cluster as well as their

location. If the cluster is up and running, use the

ocrcheck

utility as either the

oracle

or

root

user account:

[oracle@racnode1 ~]$ ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4660 Available space (kbytes) : 257460 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile <-- OCR (primary)

Device/File integrity check succeeded

Device/File not configured <-- OCR Mirror (not configured)

Cluster registry integrity check succeeded

If CRS is down, you can still determine the location and number of OCR files by viewing the file

(16)

is located in

/etc/oracle/ocr.loc

while on Sun Solaris it is located at

/var/opt/oracle/ocr.loc

:

[root@racnode1 ~]# cat /etc/oracle/ocr.loc ocrconfig_loc=/u02/oradata/racdb/OCRFile

local_only=FALSE

To view the actual contents of the OCR in a human-readable format, run the

ocrdump

command.

This command requires the CRS stack to be running. Running the

ocrdump

command will dump

the contents of the OCR into an ASCII text file in the current directory named

OCRDUMPFILE

:

[root@racnode1 ~]# ocrdump

[root@racnode1 ~]# ls -l OCRDUMPFILE

-rw-r--r-- 1 root root 250304 Oct 2 22:46 OCRDUMPFILE

The

ocrdump

utility also allows for different output options:

#

# Write OCR contents to specified file name. #

[root@racnode1 ~]# ocrdump /tmp/'hostname'_ocrdump_'date +%m%d%y:%H%M'

#

# Print OCR contents to the screen. #

[root@racnode1 ~]# ocrdump -stdout -keyname SYSTEM.css

#

# Write OCR contents out to XML format. #

[root@racnode1 ~]# ocrdump -stdout -keyname SYSTEM.css -xml > ocrdump.xml

Add an OCR File

Starting with Oracle Clusterware 10g Release 2 (10.2), users now have the ability to multiplex

(mirror) the OCR. Oracle Clusterware allows for a maximum of two OCR locations; one is the

primary and the second is an OCR mirror. To avoid simultaneous loss of multiple OCR files,

each copy of the OCR should be placed on a shared storage device that does not share any

components (controller, interconnect, and so on) with the storage devices used for the other OCR

file.

Before attempting to add a mirrored OCR, determine how many OCR files are currently

configured for the cluster as well as their location. If the cluster is up and running, use the

ocrcheck

utility as either the

oracle

or

root

user account:

[oracle@racnode1 ~]$ ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120

(17)

Used space (kbytes) : 4660 Available space (kbytes) : 257460 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile <-- OCR (primary)

Device/File integrity check succeeded

Device/File not configured <-- OCR Mirror (not configured yet)

Cluster registry integrity check succeeded

If CRS is down, you can still determine the location and number of OCR files by viewing the file

ocr.loc

, whose location is somewhat platform dependent. For example, on the Linux platform it

is located in

/etc/oracle/ocr.loc

while on Sun Solaris it is located at

/var/opt/oracle/ocr.loc

:

[root@racnode1 ~]# cat /etc/oracle/ocr.loc ocrconfig_loc=/u02/oradata/racdb/OCRFile

local_only=FALSE

The results above indicate I have only one OCR file and that it is located on an OCFS2 file

system. Since we are allowed a maximum of two OCR locations, I intend to create an OCR

mirror and locate it on the same OCFS2 file system in the same directory as the primary OCR.

Please note that I am doing this for the sake brevity. The OCR mirror should always be placed on

a separate device than the primary OCR file to guard against a single point of failure.

Note that the Oracle Clusterware stack should be online and running on all nodes in the cluster

while adding, replacing, or removing the OCR location and hence does not require any system

downtime.

The operations performed in this section affect the OCR for the entire cluster. However, the

ocrconfig

command cannot modify OCR configuration information for nodes that are shut

down or for nodes on which Oracle Clusterware is not running. So, you should avoid shutting

down nodes while modifying the OCR using the

ocrconfig

command. If for any reason, any

of the nodes in the cluster are shut down while modifying the OCR using the

ocrconfig

command, you will need to perform a

repair

on the stopped node before it can brought online

to join the cluster. Please see the section "

Repair an OCR File on a Local Node

" for

instructions on repairing the OCR file on the affected node.

You can add an OCR mirror after an upgrade or after completing the Oracle Clusterware

installation. The Oracle Universal Installer (OUI) allows you to configure either one or two OCR

locations during the installation of Oracle Clusterware. If you already mirror the OCR, then you

do not need to add a new OCR location; Oracle Clusterware automatically manages two OCRs

when you configure normal redundancy for the OCR. As previously mentioned, Oracle RAC

environments do not support more than two OCR locations; a primary OCR and a secondary

(mirrored) OCR.

(18)

Run the following command to add or relocate an OCR mirror using either destination_file or

disk to designate the target location of the additional OCR:

ocrconfig -replace ocrmirror <destination_file> ocrconfig -replace ocrmirror <disk>

You must be logged in as the

root

user to run the

ocrconfig

command.

Please note that

Attempting to copy the existing OCR file to a new location and then manually

ocrconfig -replace

is the only way to add/relocate OCR files/mirrors.

adding/changing the file pointer in the

ocr.loc

file is not supported and will actually fail to

work.

For example:

#

# Verify CRS is running on node 1. #

[root@racnode1 ~]# crsctl check crs CSS appears healthy

CRS appears healthy EVM appears healthy

#

# Verify CRS is running on node 2. #

[root@racnode2 ~]# crsctl check crs CSS appears healthy

CRS appears healthy EVM appears healthy

#

# Configure the shared OCR destination_file/disk before # attempting to create the new ocrmirror on it. This example # creates a destination_file on an OCFS2 file system.

# Failure to pre-configure the new destination_file/disk # before attempting to run ocrconfig will result in the # following error:

#

# PROT-21: Invalid parameter

#

[root@racnode1 ~]# cp /dev/null /u02/oradata/racdb/OCRFile_mirror [root@racnode1 ~]# chown root /u02/oradata/racdb/OCRFile_mirror [root@racnode1 ~]# chgrp oinstall /u02/oradata/racdb/OCRFile_mirror [root@racnode1 ~]# chmod 640 /u02/oradata/racdb/OCRFile_mirror

#

# Add new OCR mirror. #

[root@racnode1 ~]# ocrconfig -replace ocrmirror

(19)

After adding the new OCR mirror, check that it can be seen from all nodes in the cluster:

#

# Verify new OCR mirror from node 1. #

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile

Device/File integrity check succeeded

Device/File Name : /u02/oradata/racdb/OCRFile_mirror <-- New OCR Mirror

Device/File integrity check succeeded Cluster registry integrity check succeeded

[root@racnode1 ~]# cat /etc/oracle/ocr.loc

#Device/file getting replaced by device /u02/oradata/racdb/OCRFile_mirror ocrconfig_loc=/u02/oradata/racdb/OCRFile

ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror #

# Verify new OCR mirror from node 2. #

[root@racnode2 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile

Device/File integrity check succeeded

Device/File Name : /u02/oradata/racdb/OCRFile_mirror <-- New OCR Mirror

Device/File integrity check succeeded Cluster registry integrity check succeeded

[root@racnode2 ~]# cat /etc/oracle/ocr.loc

#Device/file getting replaced by device /u02/oradata/racdb/OCRFile_mirror ocrconfig_loc=/u02/oradata/racdb/OCRFile

ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror

As mentioned earlier, you can have at most two OCR files in the cluster; the primary OCR and a

single OCR mirror. Attempting to add an extra mirror will actually relocate the current OCR

mirror to the new location specified in the command:

(20)

[root@racnode1 ~]# cp /dev/null /u02/oradata/racdb/OCRFile_mirror2 [root@racnode1 ~]# chown root /u02/oradata/racdb/OCRFile_mirror2 [root@racnode1 ~]# chgrp oinstall /u02/oradata/racdb/OCRFile_mirror2 [root@racnode1 ~]# chmod 640 /u02/oradata/racdb/OCRFile_mirror2 [root@racnode1 ~]# ocrconfig -replace ocrmirror

/u02/oradata/racdb/OCRFile_mirror2

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile

Device/File integrity check succeeded Device/File Name : /u02/oradata/racdb/OCRFile_mirror2 <-- Mirror was Relocated!

Device/File integrity check succeeded Cluster registry integrity check succeeded

Relocate an OCR File

Just as we were able to add a new ocrmirror while the CRS stack was online, the same holds true

when relocating an OCR file or OCR mirror and therefore does not require any system

downtime.

You can relocate OCR only when the OCR is mirrored. A mirror copy of the OCR file is

required to move the OCR online. If there is no mirror copy of the OCR, first create the

mirror using the instructions in the previous section.

Attempting to relocate OCR when an OCR mirror does not exist will produce the following

error:

ocrconfig -replace ocr /u02/oradata/racdb/OCRFile PROT-16: Internal Error

If the OCR mirror is not required in the cluster after relocating the OCR, it can be safely

removed.

Run the following command as the

root

account to relocate the current OCR file to a new

location using either destination_file or disk to designate the new target location for the OCR:

ocrconfig -replace ocr <destination_file> ocrconfig -replace ocr <disk>

Run the following command as the

root

account to relocate the current OCR mirror to a new

location using either destination_file or disk to designate the new target location for the OCR

mirror:

(21)

ocrconfig -replace ocrmirror <destination_file> ocrconfig -replace ocrmirror <disk>

The following example assumes the OCR is mirrored and demonstrates how to relocate the

current OCR file (

/u02/oradata/racdb/OCRFile

) from the OCFS2 file system to a new raw

device (

/dev/raw/raw1

):

#

# Verify CRS is running on node 1. #

[root@racnode1 ~]# crsctl check crs CSS appears healthy

CRS appears healthy EVM appears healthy

#

# Verify CRS is running on node 2. #

[root@racnode2 ~]# crsctl check crs CSS appears healthy

CRS appears healthy EVM appears healthy

#

# Verify current OCR configuration. #

[root@racnode2 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile <-- Current OCR to Relocate

Device/File integrity check succeeded Device/File Name : /u02/oradata/racdb/OCRFile_mirror Device/File integrity check succeeded Cluster registry integrity check succeeded

#

# Verify new raw storage device exists, is configured with # the correct permissions, and can be seen from all nodes # in the cluster.

#

[root@racnode1 ~]# ls -l /dev/raw/raw1

crw-r--- 1 root oinstall 162, 1 Oct 2 19:54 /dev/raw/raw1 [root@racnode2 ~]# ls -l /dev/raw/raw1

crw-r--- 1 root oinstall 162, 1 Oct 2 19:54 /dev/raw/raw1

#

# Clear out the contents from the new raw device. #

(22)

[root@racnode1 ~]# dd if=/dev/zero of=/dev/raw/raw1

#

# Relocate primary OCR file to new raw device. Note that # there is no deletion of the old OCR file but simply a # replacement.

#

[root@racnode1 ~]# ocrconfig -replace ocr /dev/raw/raw1

After relocating the OCR file, check that the change can be seen from all nodes in the cluster:

#

# Verify new OCR file from node 1. #

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /dev/raw/raw1 <-- Relocated OCR

Device/File integrity check succeeded Device/File Name : /u02/oradata/racdb/OCRFile_mirror Device/File integrity check succeeded Cluster registry integrity check succeeded

[root@racnode1 ~]# cat /etc/oracle/ocr.loc

#Device/file /u02/oradata/racdb/OCRFile getting replaced by device /dev/raw/raw1

ocrconfig_loc=/dev/raw/raw1

ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror

#

# Verify new OCR file from node 2. #

[root@racnode2 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /dev/raw/raw1 <-- Relocated OCR

Device/File integrity check succeeded Device/File Name : /u02/oradata/racdb/OCRFile_mirror Device/File integrity check succeeded Cluster registry integrity check succeeded

[root@racnode2 ~]# cat /etc/oracle/ocr.loc

#Device/file /u02/oradata/racdb/OCRFile getting replaced by device /dev/raw/raw1

ocrconfig_loc=/dev/raw/raw1

(23)

After verifying the relocation was successful, remove the old OCR file at the OS level:

[root@racnode1 ~]# rm -v /u02/oradata/racdb/OCRFile removed '/u02/oradata/racdb/OCRFile'

Repair an OCR File on a Local Node

It was mentioned in the

previous section

that the

ocrconfig

command cannot modify OCR

configuration information for nodes that are shut down or for nodes on which Oracle Clusterware

is not running. You may need to repair an OCR configuration on a particular node if your OCR

configuration changes while that node is stopped. For example, you may need to repair the OCR

on a node that was shut down while you were adding, replacing, or removing an OCR.

To repair an OCR configuration, run the following command as

root

from the node on which

you have stopped the Oracle Clusterware daemon:

ocrconfig –repair ocr device_name

To repair an OCR mirror configuration, run the following command as

root

from the node on

which you have stopped the Oracle Clusterware daemon:

ocrconfig –repair ocrmirror device_name

You cannot perform this operation on a node on which the Oracle Clusterware daemon is

running. The CRS stack must be shutdown before attempting to repair the OCR

configuration on the local node.

The

ocrconfig –repair

command changes the OCR configuration only on the node from

which you run this command. For example, if the OCR mirror was relocated to a disk named

/dev/raw/raw2

from

racnode1

while the node

racnode2

was down, then use the command

ocrconfig -repair ocrmirror /dev/raw/raw2

on

racnode2

while the CRS stack is down

on that node to repair its OCR configuration:

#

# Shutdown CRS stack on node 2 and verify the CRS stack is not up. #

[root@racnode2 ~]# crsctl stop crs

Stopping resources. This could take several minutes. Successfully stopped CRS resources.

Stopping CSSD.

Shutting down CSS daemon.

Shutdown request successfully issued.

[root@racnode2 ~]# ps -ef | grep d.bin | grep -v grep

#

# Relocate OCR mirror to new raw device from node 1. Note # that node 2 is down (actually CRS down on node 2) while # we relocate the OCR mirror.

#

(24)

#

# Verify relocated OCR mirror from node 1. #

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197 Device/File Name : /dev/raw/raw1

Device/File integrity check succeeded Device/File Name : /dev/raw/raw2 <-- Relocated OCR Mirror

Device/File integrity check succeeded Cluster registry integrity check succeeded

[root@racnode1 ~]# cat /etc/oracle/ocr.loc

#Device/file /u02/oradata/racdb/OCRFile_mirror getting replaced by device /dev/raw/raw2

ocrconfig_loc=/dev/raw/raw1

ocrmirrorconfig_loc=/dev/raw/raw2 #

# Node 2 does not know about the OCR mirror relocation. #

[root@racnode2 ~]# cat /etc/oracle/ocr.loc

#Device/file /u02/oradata/racdb/OCRFile getting replaced by device /dev/raw/raw1 ocrconfig_loc=/dev/raw/raw1

ocrmirrorconfig_loc=/u02/oradata/racdb/OCRFile_mirror #

# While the CRS stack is down on node 2, perform a local OCR # repair operation to inform the node of the relocated OCR # mirror. The ocrconfig -repair option will only update the # OCR configuration information on node 2. If there were # other nodes shutdown during the relocation, they too will # need repaired.

#

[root@racnode2 ~]# ocrconfig -repair ocrmirror /dev/raw/raw2

#

# Verify the repair updated the OCR configuration on node 2. #

[root@racnode2 ~]# cat /etc/oracle/ocr.loc

#Device/file /u02/oradata/racdb/OCRFile_mirror getting replaced by device /dev/raw/raw2

ocrconfig_loc=/dev/raw/raw1

ocrmirrorconfig_loc=/dev/raw/raw2 #

# Bring up the CRS stack on node 2. #

[root@racnode2 ~]# crsctl start crs Attempting to start CRS stack

(25)

The CRS stack will be started shortly

#

# Verify node 2 is back online. #

[root@racnode2 ~]# crs_stat -t

Name Type Target State Host

---ora.racdb.db application ONLINE ONLINE racnode1 ora....b1.inst application ONLINE ONLINE racnode1 ora....b2.inst application ONLINE ONLINE racnode2 ora....srvc.cs application ONLINE ONLINE racnode1 ora....db1.srv application ONLINE ONLINE racnode1 ora....db2.srv application ONLINE ONLINE racnode2 ora....SM1.asm application ONLINE ONLINE racnode1 ora....E1.lsnr application ONLINE ONLINE racnode1 ora....de1.gsd application ONLINE ONLINE racnode1 ora....de1.ons application ONLINE ONLINE racnode1 ora....de1.vip application ONLINE ONLINE racnode1 ora....SM2.asm application ONLINE ONLINE racnode2 ora....E2.lsnr application ONLINE ONLINE racnode2 ora....de2.gsd application ONLINE ONLINE racnode2 ora....de2.ons application ONLINE ONLINE racnode2 ora....de2.vip application ONLINE ONLINE racnode2

Remove an OCR File

To remove an OCR, you need to have at least one OCR online. You may need to perform this to

reduce overhead or for other storage reasons, such as stopping a mirror to move it to SAN, RAID

etc. Carry out the following steps:

Check if at least one OCR is online

Verify the CRS stack is online — preferably on all nodes

Remove the OCR or OCR mirror

If using a clustered file system, remove the deleted file at the OS level

Run the following command as the

root

account to delete the current OCR or the current OCR

mirror:

ocrconfig -replace ocr or

ocrconfig -replace ocrmirror

For example:

#

# Verify CRS is running on node 1. #

[root@racnode1 ~]# crsctl check crs CSS appears healthy

CRS appears healthy EVM appears healthy

(26)

#

# Verify CRS is running on node 2. #

[root@racnode2 ~]# crsctl check crs CSS appears healthy

CRS appears healthy EVM appears healthy

#

# Get the existing OCR file information by running ocrcheck # utility.

#

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197 Device/File Name : /dev/raw/raw1

Device/File integrity check succeeded

Device/File Name : /dev/raw/raw2 <-- OCR Mirror to be Removed

Device/File integrity check succeeded Cluster registry integrity check succeeded

#

# Delete OCR mirror from the cluster configuration. #

[root@racnode1 ~]# ocrconfig -replace ocrmirror

After removing the new OCR mirror, check that the change is seen from all nodes in the cluster:

#

# Verify OCR mirror was removed from node 1. #

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197 Device/File Name : /dev/raw/raw1

Device/File integrity check succeeded

Device/File not configured <-- OCR Mirror Removed

Cluster registry integrity check succeeded

#

# Verify OCR mirror was removed from node 2. #

(27)

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197 Device/File Name : /dev/raw/raw1

Device/File integrity check succeeded

Device/File not configured <-- OCR Mirror Removed

Cluster registry integrity check succeeded

Removing the OCR or OCR mirror from the cluster configuration does not remove the

physical file at the OS level when using a clustered file system.

Backup the OCR File

There are two methods for backing up the contents of the OCR and each backup method can be

used for different recovery purposes. This section discusses how to ensure the stability of the

cluster by implementing a robust backup strategy.

The first type of backup relies on automatically generated OCR file copies which are sometimes

referred to as physical backups. These physical OCR file copies are automatically generated by

the CRSD process on the

master node

and are primarily used to recover the OCR from a lost or

corrupt OCR file. Your backup strategy should include procedures to copy these automatically

generated OCR file copies to a secure location which is accessible from all nodes in the cluster in

the event the OCR needs to be restored.

The second type of backup uses manual procedures to create OCR export files; also known as

logical backups. Creating a manual OCR export file should be performed both before and after

making significant configuration changes to the cluster, such as adding or deleting nodes from

your environment, modifying Oracle Clusterware resources, or creating a database. If in the

event a configuration change is made to the OCR that causes errors, the OCR can be restored to a

previous state by performing an import of the logical backup taken before the configuration

change. Please note that an OCR logical export can also be used to restore the OCR from a lost

or corrupt OCR file.

Unlike the methods used to

copying the OCR file directly at the OS level is not a valid backup and will result in errors

backup the voting disk

, attempting to backup the OCR by

after the restore!

Because of the importance of OCR information, Oracle recommends that you make copies of the

automatically created backup files and an OCR export at least once a day. The following is a

working UNIX script that can be scheduled in CRON to backup the OCR File(s) and the Voting

Disk(s) on a regular basis:

(28)

crs_components_backup_10g.ksh

Automatic OCR Backups

The Oracle Clusterware automatically creates OCR physical backups every four hours. At any

one time, Oracle always retains the last 3 backup copies of the OCR that are 4 hours old. The

CRSD process that creates these backups also creates and retains an OCR backup for each full

day and at the end of each week. You cannot customize the backup frequencies or the number of

OCR physical backup files that Oracle retains.

The default location for generating physical backups on UNIX-based systems is

CRS_home/cdata/cluster_name

where cluster_name is the name of your cluster (i.e.

crs

). Use

the

ocrconfig -showbackup

command to view all current OCR physical backups that were

taken from the

master node

:

[oracle@racnode1 ~]$ ocrconfig -showbackup

racnode1 2009/09/29 13:05:22 /u01/app/crs/cdata/crs racnode1 2009/09/29 09:05:22 /u01/app/crs/cdata/crs racnode1 2009/09/29 05:05:22 /u01/app/crs/cdata/crs racnode1 2009/09/28 05:05:21 /u01/app/crs/cdata/crs racnode1 2009/09/22 05:05:13 /u01/app/crs/cdata/crs [oracle@racnode1 ~]$ ls -l $ORA_CRS_HOME/cdata/crs total 59276

-rw-r--r-- 1 root root 8654848 Sep 29 13:05 backup00.ocr <-- Most recent physical backup

-rw-r--r-- 1 root root 8654848 Sep 29 09:05 backup01.ocr -rw-r--r-- 1 root root 8654848 Sep 29 05:05 backup02.ocr -rw-r--r-- 1 root root 8654848 Sep 29 05:05 day_.ocr

-rw-r--r-- 1 root root 8654848 Sep 28 05:05 day.ocr <-- One day old

-rw-r--r-- 1 root root 8654848 Sep 29 05:05 week_.ocr

-rw-r--r-- 1 root root 8654848 Sep 22 05:05 week.ocr <-- One week old

You can change the location where the CRSD process writes the physical OCR copies to using:

ocrconfig -backuploc <new_dirname>

Restoring the OCR from an automatic physical backup is accomplished using the

ocrconfig -restore

command. Note that the CRS stack needs to be shutdown on all nodes in the cluster

prior to running the restore operation:

ocrconfig -restore <backup_file_name>

You cannot restore the OCR from a physical backup using the

-import

option. The only method

(29)

The Master Node

As documented in Doc ID:

357262.1

on the

My Oracle Support

web site, the CRSD process only

creates automatic OCR physical backups on one node in the cluster, which is the OCR master

node. It does not create automatic backup copies on the other nodes; only from the OCR master

node. If the master node fails, the OCR backups will be created from the new master node. You

can determine which node in the cluster is the master node by examining the

$ORA_CRS_HOME/log/<node_name>/cssd/ocssd.log

file on any node in the cluster. In this log

file, check for reconfiguration information (reconfiguration successful) after which you will see

which node is the master and how many nodes are active in the cluster:

Node 1 - (racnode1)

[ CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 2 nodes [ CSSD]CLSS-3001: local node number 1, master node number 1

Node 2 - (racnode2)

[ CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 2 nodes [ CSSD]CLSS-3001: local node number 2, master node number 1

Another quick approach is to use either of the following methods:

Node 1 - (racnode1)

# grep -i "master node" $ORA_CRS_HOME/log/racnode?/cssd/ocssd.log | tail -1 [ CSSD]CLSS-3001: local node number 1, master node number 1

Node 2 - (racnode2)

# grep -i "master node" $ORA_CRS_HOME/log/racnode?/cssd/ocssd.log | tail -1 [ CSSD]CLSS-3001: local node number 2, master node number 1

# If not found in the ocssd.log, then look through all # of the ocssd archives:

Node 1 - (racnode1)

# for x in 'ls -tr $ORA_CRS_HOME/log/racnode?/cssd/ocssd.*'

do grep -i "master node" $x; done | tail -1

[ CSSD]CLSS-3001: local node number 1, master node number 1 Node 2 - (racnode2)

# for x in 'ls -tr $ORA_CRS_HOME/log/racnode?/cssd/ocssd.*'

do grep -i "master node" $x; done | tail -1

[ CSSD]CLSS-3001: local node number 2, master node number 1

# The master node information is confirmed by the # ocrconfig -showbackup command:

# ocrconfig -showbackup

racnode1 2009/09/29 13:05:22 /u01/app/crs/cdata/crs

(30)

racnode1 2009/09/29 05:05:22 /u01/app/crs/cdata/crs

racnode1 2009/09/28 05:05:21 /u01/app/crs/cdata/crs

racnode1 2009/09/22 05:05:13 /u01/app/crs/cdata/crs

Because of the importance of OCR information, Oracle recommends that you make copies of the

automatically created backup files at least once a day from the master node to a different device

from where the primary OCR resides. You can use any backup software to copy the

automatically generated physical backup files to a stable backup location:

[root@racnode1 ~]# cp -p -v -f -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RACNODE1 '/u01/app/crs/cdata/crs/day_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/day_.ocr' '/u01/app/crs/cdata/crs/backup02.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/backup02.ocr' '/u01/app/crs/cdata/crs/backup01.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/backup01.ocr' '/u01/app/crs/cdata/crs/week_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/week_.ocr' '/u01/app/crs/cdata/crs/day.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/day.ocr' '/u01/app/crs/cdata/crs/backup00.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/backup00.ocr' '/u01/app/crs/cdata/crs/week.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE1/cdata/crs/week.ocr' [root@racnode2 ~]# cp -p -v -f -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RACNODE2 '/u01/app/crs/cdata/crs/day_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/day_.ocr' '/u01/app/crs/cdata/crs/backup02.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/backup02.ocr' '/u01/app/crs/cdata/crs/backup01.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/backup01.ocr' '/u01/app/crs/cdata/crs/week_.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/week_.ocr' '/u01/app/crs/cdata/crs/day.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/day.ocr' '/u01/app/crs/cdata/crs/backup00.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/backup00.ocr' '/u01/app/crs/cdata/crs/week.ocr' -> '/u02/crs_backup/ocrbackup/RACNODE2/cdata/crs/week.ocr'

It is possible and recommended that shared storage be used for the backup location(s). Keep in

mind that if the master node goes down and cannot be rebooted, it is possible to loose all OCR

physical backups if they were all on that node. The OCR backup process, however, will start on

the new master node within four hours for all new backups. It is highly recommended that you

integrate OCR backups with your normal database backup strategy. If possible, use a backup

location that is shared by all nodes in the cluster.

(31)

Performing a manual export of the OCR should be done before and after making significant

configuration changes to the cluster, such as adding or deleting nodes from your environment,

modifying Oracle Clusterware resources, or creating a database. This type of backup is often

referred to as a logical backup. If in the event a configuration change is made to the OCR that

causes errors, the OCR can be restored to its previous state by performing an import of the

logical backup taken before the configuration change. For example, if you have unresolvable

configuration problems, or if you are unable to restart your cluster database after such changes,

then you can restore your configuration by importing the saved OCR content from a valid

configuration.

Please note that an OCR logical export can also be used to restore the OCR from a lost or corrupt

OCR file.

To export the contents of the OCR to a dump file, use the following command, where

backup_file_name is the name of the OCR logical backup file you want to create:

ocrconfig -export <backup_file_name>

For example:

[root@racnode1 ~]# ocrconfig -export

/u02/crs_backup/ocrbackup/RACNODE1/exports/OCRFileBackup.dmp

# A second export is not strictly required, however, there is no such thing as too many backups!

[root@racnode2 ~]# ocrconfig -export

/u02/crs_backup/ocrbackup/RACNODE2/exports/OCRFileBackup.dmp

To restore the OCR from an export/logical backup, use the

ocrconfig –import

command. Note

that the CRS stack needs to be shutdown on all nodes in the cluster prior to running the restore

operation. In addition, the total space required for the restored OCR location (typically 280MB)

has to be pre-allocated. This is especially important when the OCR is located on a clustered file

system like OCFS2.

ocrconfig –import <export_file_name>

You cannot restore the OCR from a logical backup using the

-restore

option. The only method

to restore the OCR from a logical export is to use the

-import

option.

You must be logged in as the root user to run the

ocrconfig

command.

Recover the OCR File

If an application fails, then before attempting to restore the OCR, restart the application. As a

definitive verification that the OCR failed, run the

ocrcheck

command:

(32)

[root@racnode1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 262120 Used space (kbytes) : 4668 Available space (kbytes) : 257452 ID : 1331197

Device/File Name : /u02/oradata/racdb/OCRFile

Device/File integrity check succeeded <-- OCR (primary) Valid

Device/File Name : /u02/oradata/racdb/OCRFile_mirror

Device/File integrity check succeeded <-- OCR (mirror) Valid

Cluster registry integrity check succeeded

The example above indicates that both the primary OCR and OCR mirror checks were successful

and that no problems exist with the OCR configuration.

If the

ocrcheck

command does not display the message '

Device/File integrity check succeeded

' for at least one copy of the OCR, then both the primary OCR and the OCR mirror

have failed. In this case, the CRS stack must be brought down on all nodes in the cluster to

restore the OCR from a previous

physical backup copy

or an

OCR export

.

If there is at least one copy of the OCR available (either the primary OCR or the OCR mirror),

you can use that valid copy to restore the contents of the other copy of the OCR. The restore in

this case can be accomplished using the

ocrconfig -replace

command and does not require

the applications or CRS stack to be down.

This section describes a number of possible OCR recovery scenarios using the OCR

configuration described by the output of the ocrcheck command above. Both the primary OCR

and the OCR mirror are located on an OCFS2 file system in the same directory. The recovery

scenarios demonstrated in this section will make use of both types of OCR backups —

automatically generated OCR file copies

and

manually created OCR export files

.

Although it should go without saying, DO NOT perform these recovery scenarios on a

critical system like production!

Recover OCR from Valid OCR Mirror

This section demonstrates how to restore the OCR when only one of the OCR copies is missing

or corrupt. The restore process will use the good OCR copy (whether its the primary OCR or the

OCR mirror) to restore the missing/corrupt copy. Remember that if there is at least one copy of

the OCR available, you can use that valid copy to restore the contents of the other copy of the

OCR. The best part about this type of recovery is that it doesn't require any downtime! Oracle

Clusterware and the applications can remain online during the recovery process.

References

Related documents

The lift to drag ratio increases as the angle of attack increased on both wings, for rear wing the lift to drag ratio is reduced when compared to that of front wing due to

Finally, we assessed whether setting, grade, simulation, difficulty reported, learning, or amount of thought needed to complete the simulation were correlated with the degree to

We evaluate the scheme under real network conditions using collected keyboard timings for ssh, and our evaluation demonstrates that the steganographic covert channel provides

Quality: We measure quality (Q in our formal model) by observing the average number of citations received by a scientist for all the papers he or she published in a given

Different from existing source term estimation methods, based on the sensor characteristic of chemical sensors, the zero readings of sensors are exploited in our algorithm where

In ad- dition, most protective mechanisms shaped by selection against cancer are costly and come with trade- offs (Hochberg, Thomas, Assenat, &amp; Hibner, 2013); thus,

Eight morpho-anatomical properties of two-year-old needles of Pinus heldreichii (Bosnian pine) from the Scardo-Pindic mountain massif in Serbia (Kosovo, Mt. Ošljak)

In each of the columns including a control for work experience, overeducation implies a lower wage penalty as clearly the number of years spent in the labor market