Automatic workload management for services

If your application scales transparently on SMP machines, then it is realistic to expect it to scale well on RAC, without having to make any changes to the application code.

RAC eliminates the database instance, and the node itself, as a single point of failure, and ensures database integrity in the case of such failures.

The following are some scalability examples:

• Allow more simultaneous batch processes

• Allow larger degrees of parallelism and more parallel executions to occur

• Allow large increases in the number of connected users in online transaction processing (OLTP) systems

With Oracle RAC, you can build a cluster that fits your needs, whether it is made up of two-CPU commodity servers or of servers with 32 or 64 CPUs each. The Oracle parallel execution feature allows a single SQL statement to be divided among multiple processes, each of which completes a subset of the work. In an Oracle RAC environment, you can define the parallel processes to run only on the instance where the user is connected or to run across multiple instances in the cluster.

Clusters and Scalability

[Figure: SMP model (CPUs with caches sharing memory, kept consistent by cache coherency) compared with the RAC model (SGAs with background processes (BGP) sharing storage, kept consistent by Cache Fusion)]

Successful implementation of cluster databases requires optimal scalability on four levels:

• Hardware scalability: Interconnectivity is the key to hardware scalability, which greatly depends on high bandwidth and low latency.

• Operating system scalability: Methods of synchronization in the operating system can determine the scalability of the system. In some cases, potential scalability of the hardware is lost because of the operating system’s inability to handle multiple resource requests simultaneously.

• Database management system scalability: A key factor in parallel architectures is whether the parallelism is affected internally or by external processes. The answer to this question affects the synchronization mechanism.

• Application scalability: Applications must be specifically designed to be scalable. A bottleneck occurs in systems in which every session is updating the same data most of the time. Note that this is not RAC-specific and is true on single-instance systems, too.

It is important to remember that if any of the preceding areas are not scalable (no matter how scalable the other areas are), then parallel cluster processing may not be successful. A typical cause for the lack of scalability is one common shared resource that must be accessed often. This causes the otherwise parallel operations to serialize on this bottleneck. High latency in the synchronization increases the cost of synchronization, thereby counteracting the benefits of parallelization. This is a general limitation and not a RAC-specific limitation.

Levels of Scalability

• Hardware: Disk input/output (I/O)

• Internode communication: High bandwidth and low latency

• Operating system: Number of CPUs

• Database management system: Synchronization

• Application: Design

Scaleup is the ability to sustain the same performance levels (response time) when both workload and resources increase proportionally:

Scaleup = (volume parallel) / (volume original)

For example, if 30 users consume close to 100 percent of the CPU during normal processing, then adding more users would cause the system to slow down due to contention for limited CPU cycles. However, by adding CPUs, you can support extra users without degrading performance.

Speedup is the effect of applying an increasing number of resources to a fixed amount of work to achieve a proportional reduction in execution times:

Speedup = (time original) / (time parallel)

Speedup results in resource availability for other tasks. For example, if queries usually take ten minutes to process, and running in parallel reduces the time to five minutes, then additional queries can run without introducing the contention that might occur if they were to run concurrently.
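The two ratios defined above can be worked through in a few lines. The following is a minimal sketch, not part of the original course material: the function names are hypothetical, the speedup figures come from the ten-minute query example, and the 60-user scaleup case is an assumed illustration of doubling both users and CPUs.

```python
def speedup(time_original: float, time_parallel: float) -> float:
    """Speedup = (time original) / (time parallel)."""
    if time_parallel <= 0:
        raise ValueError("time_parallel must be positive")
    return time_original / time_parallel


def scaleup(volume_parallel: float, volume_original: float) -> float:
    """Scaleup = (volume parallel) / (volume original)."""
    if volume_original <= 0:
        raise ValueError("volume_original must be positive")
    return volume_parallel / volume_original


# Query example from the text: ten minutes originally, five minutes in parallel.
query_speedup = speedup(time_original=10.0, time_parallel=5.0)

# Assumed illustration: 30 users originally, 60 users at the same
# response time after the CPU count is doubled.
user_scaleup = scaleup(volume_parallel=60, volume_original=30)

print(f"speedup = {query_speedup:.1f}")
print(f"scaleup = {user_scaleup:.1f}")
```

A value of 2.0 in either case corresponds to the ideal outcome for doubled resources; measured values on a real system would typically be lower because of synchronization cost.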

Scaleup and Speedup

[Figure: Original system: one hardware unit completes 100% of the task in a given time. Cluster system scaleup: additional hardware completes up to 200% or up to 300% of the task in the same time. Cluster system speedup: additional hardware completes 100% of the task in half the time (Time/2).]

The type of workload determines whether scaleup or speedup capabilities can be achieved using parallel processing.

Online transaction processing (OLTP) and Internet application environments are characterized by short transactions that cannot be further broken down and, therefore, no speedup can be achieved. However, by deploying greater amounts of resources, a larger volume of transactions can be supported without compromising the response.

Decision support systems (DSS) and parallel query options can attain speedup, as well as scaleup, because they essentially support large tasks without conflicting demands on resources. The parallel query capability within the Oracle database can also be leveraged to decrease overall processing time of long-running queries and to increase the number of such queries that can be run concurrently.

In an environment with a mixed workload of DSS, OLTP, and reporting applications, scaleup can be achieved by running different programs on different hardware. Speedup is possible in a batch environment, but may involve rewriting programs to use the parallel processing capabilities.

Speedup/Scaleup and Workloads

Workload                   Speedup    Scaleup
OLTP and Internet          No         Yes
DSS with parallel query    Yes        Yes
Batch (mixed)              Possible   Yes

To make sure that a system delivers the I/O demand that is required, all system components on the I/O path need to be orchestrated to work together.

The weakest link determines the I/O throughput.

On the left, you see a high-level picture of a system. This is a system with four nodes, two Host Bus Adapters (HBAs) per node, and two Fibre Channel switches, each of which is attached to four disk arrays. The components on the I/O path are the HBAs, cables, switches, and disk arrays. Performance depends on the number and speed of the HBAs, switch speed, controller quantity, and speed of disks. If any one of these components is undersized, the system throughput is determined by this component. Assuming you have 2 Gb HBAs, the nodes can read about 8 × 200 MB/s = 1.6 GB/s. Likewise, assuming that each disk array has one controller, all eight arrays can also deliver 8 × 200 MB/s = 1.6 GB/s. Therefore, each of the Fibre Channel switches also needs to deliver at least 2 Gb/s per port, for a total of 800 MB/s throughput per switch. The two switches will then deliver the needed 1.6 GB/s.
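The weakest-link rule in this example can be checked with a short calculation. This is a minimal sketch under the figures stated in the text (about 200 MB/s per 2 Gb link, four nodes with two HBAs each, two switches at 800 MB/s, eight single-controller arrays); the function name and the undersized variant with 400 MB/s switches are assumptions added for illustration.

```python
from typing import Dict, Tuple

LINK_MB_PER_S = 200  # approximate throughput of one 2 Gb link, as used in the text


def weakest_link(stages: Dict[str, int]) -> Tuple[str, int]:
    """Return the name and throughput (MB/s) of the slowest stage on the I/O path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


# Configuration from the text: 4 nodes x 2 HBAs, 2 switches, 8 arrays x 1 controller.
balanced = {
    "HBAs": 4 * 2 * LINK_MB_PER_S,
    "switches": 2 * 800,
    "disk array controllers": 8 * 1 * LINK_MB_PER_S,
}

# Assumed undersized variant: each switch passes only 400 MB/s.
undersized = dict(balanced, switches=2 * 400)

balanced_result = weakest_link(balanced)
undersized_result = weakest_link(undersized)

print(balanced_result)    # every stage delivers 1600 MB/s, so no stage limits the others
print(undersized_result)  # the switches cap the whole system at 800 MB/s
```

In the balanced case all stages tie at 1600 MB/s, so the function simply reports the first stage. In the undersized case, the HBAs and controllers could still move 1600 MB/s, but the system as a whole is limited to what the switches can pass.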

Note: When sizing a system, also take the system limits into consideration. For instance, the number of bus slots per node is limited and may need to be shared between HBAs and network cards. In some cases, dual port cards exist if the number of slots is exhausted. The number of HBAs per node determines the maximum number of Fibre Channel switches. And the total number of ports on a switch limits the number of HBAs and disk controllers.

I/O Throughput Balanced: Example

[Figure: FC switches connected to Disk array 1 through Disk array 8]

Each machine has 2 CPUs: 2 × 200 MB/s × 4 = 1600 MB/s

Each machine has 2 HBAs: 8 × 200 MB/s = 1600 MB/s

Each switch needs to support 800 MB/s to guarantee a total system throughput of 1600 MB/s.

Each disk array has one 2 Gb controller: 8 × 200 MB/s = 1600 MB/s

In single-instance environments, locking coordinates access to a common resource such as a row in a table. Locking prevents two processes from changing the same resource (or row) at the same time.

In RAC environments, internode synchronization is critical because it maintains proper coordination between processes on different nodes, preventing them from changing the same resource at the same time. Internode synchronization guarantees that each instance sees the most recent version of a block in its buffer cache.

The slide shows you what would happen in the absence of cache coordination. RAC prevents this problem. Resource coordination is performed for the buffer cache, library cache, row cache, and results cache and will be explored in later lessons.
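The lost-update problem can be reproduced with a toy model. The sketch below is an illustration only and does not represent how Cache Fusion is implemented: it assumes two instances that each add 1 to a block value of 1008, and the function names are hypothetical.

```python
def uncoordinated(start: int) -> int:
    """Both instances cache the block, change their own copy, and write it back."""
    disk = {"block": start}
    cache1 = disk["block"]   # instance 1 reads the block into its cache
    cache2 = disk["block"]   # instance 2 reads the same version
    cache1 += 1              # instance 1 changes its copy
    cache2 += 1              # instance 2 changes its stale copy
    disk["block"] = cache1   # instance 1 writes
    disk["block"] = cache2   # instance 2 overwrites: one update is lost
    return disk["block"]


def coordinated(start: int) -> int:
    """Each instance obtains the most recent version before changing it."""
    current = start
    cache1 = current         # instance 1 gets the current version
    cache1 += 1
    current = cache1         # the change becomes the current version
    cache2 = current         # instance 2 gets the version instance 1 produced
    cache2 += 1
    current = cache2
    return current


lost = uncoordinated(1008)
kept = coordinated(1008)
print(lost, kept)
```

Without coordination, the two changes collapse into one and the block ends at 1009. When the second instance works from the most recent version, both changes survive and the block ends at 1010.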

Necessity of Global Resources

[Figure: four numbered steps in which SGA1 and SGA2 each cache a block with value 1008 and change it to 1009 without coordination. Lost updates!]

RAC-specific memory is mostly allocated in the shared pool at SGA creation time. Because blocks may be cached across instances, you must also account for bigger buffer caches. Therefore, when migrating your Oracle Database from single instance to RAC, keeping the workload requirements per instance the same as with the single-instance case, about 10% more buffer cache and 15% more shared pool are needed to run on RAC. These values are heuristics, based on RAC sizing experience. However, these values are mostly upper bounds. If you use the recommended automatic memory management feature as a starting point, then you can reflect these values in your SGA_TARGET initialization parameter.

However, consider that memory requirements per instance are reduced when the same user population is distributed over multiple nodes.
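The sizing heuristic above amounts to simple arithmetic. The following minimal sketch applies the 10% and 15% upper bounds to a pair of single-instance sizes; the function name and the 4096 MB and 1024 MB starting values are assumptions chosen for illustration, not recommendations.

```python
BUFFER_CACHE_EXTRA_PCT = 10  # heuristic upper bound from the text
SHARED_POOL_EXTRA_PCT = 15   # heuristic upper bound from the text


def grow(size_mb: int, extra_pct: int) -> int:
    """Increase size_mb by extra_pct percent, rounded up to a whole megabyte."""
    return (size_mb * (100 + extra_pct) + 99) // 100


# Assumed single-instance sizes, in megabytes.
single_buffer_cache_mb = 4096
single_shared_pool_mb = 1024

rac_buffer_cache_mb = grow(single_buffer_cache_mb, BUFFER_CACHE_EXTRA_PCT)
rac_shared_pool_mb = grow(single_shared_pool_mb, SHARED_POOL_EXTRA_PCT)

# Additional memory that could be reflected in SGA_TARGET.
extra_mb = (rac_buffer_cache_mb - single_buffer_cache_mb) + (
    rac_shared_pool_mb - single_shared_pool_mb
)

print(rac_buffer_cache_mb, rac_shared_pool_mb, extra_mb)
```

Because the percentages are mostly upper bounds, and because per-instance requirements fall when the same user population is spread over several nodes, the result is best read as a starting point to be checked against actual usage.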

Actual resource usage can be monitored by querying the CURRENT_UTILIZATION and MAX_UTILIZATION columns for the Global Cache Services (GCS) and Global Enqueue Services (GES) entries in the V$RESOURCE_LIMIT view of each instance. You can monitor the exact RAC memory resource usage of the shared pool by querying V$SGASTAT as shown in the slide.

Additional Memory Requirement for RAC

In document Oracle 12c RAC Administration D81250GC10_sg (Page 52-59)