Complex data partitioning with triggers and views

Chapter 4. HDR: Advanced use scenarios

4.4 Data partitioning with multiple HDR

4.4.2 Complex data partitioning with triggers and views

Partitioning data based on a table between multiple servers may not yield the best performance for some applications. It is possible that an application may be accessing data from different tables that may be partitioned across multiple servers. This could lead to quite a lot of data movement between the servers and thus to poor performance. A transaction spanning servers will internally follow the two-phase commit protocol.

Simple table partitioning does not maintain locality of data. For example, one IDS server instance will contain all data in a table. It is quite likely that the data in the table belongs to different geographical regions. If different servers are located in different geographies, it may be best to hold the active data for a particular geography on the server in that region. This leads to the notion of partitioning data based on location.

Even if the data is partitioned based on location, the application should be allowed to access all of the data from anywhere. For performance reasons, it

Customer Item Inst A (pri) Inst B (pri) Item customer

may be useful to contain application data specific to a region on a server in that region. But it is quite likely that an application from a different region may want to access data from any server. For example, there could be call centers for different geographies and each of them might have their own server. Then, an application from a local call center would connect to the local server and access data. During off-peak hours, only one call center might be open and all calls will be routed to that region. That means an application will be accessing all data, not data specific to a particular region. It would be good if a single application worked all the time without switching from server to server.

To achieve data transparency, but still partition the data based on locality, a complex mechanism can be used. It is assumed here that multiple IDS Instances are used for scalability as well as high availability. This can be achieved by designing the schema using views, triggers and stored procedures, which is functionality that is available in IDS. This mechanism of partitioning can be best explained using an example. So, here we extend the example from the previous section.

Assume that the Customer table will be partitioned and table data will be physically located on the server where the data logically belongs. In this case the Customer table will be partitioned based on the

cust_no

column in the table. Any customer record whose

cust_id

is less than 10000 will be stored in one server and the rest of the customer records are stored in another server. For simplicity we partition the data into two servers, creating two base tables c1 and c2 (c1 in Inst A and c2 in Inst B). As explained in the previous section, synonyms will be created on both servers to make sure that both tables are visible in each server. Figure 4-12 shows the physical location of the two tables and their synonyms on the other servers.

Figure 4-12 Creating two base tables c1 and c2 and their synonyms

Table c1 holds customer record with

cust_id

less than 10000 and c2 holds customer record with

cust_id

>= 10000. This requirement has not yet been

Create table c1

Create synonym c2 Create table c2Create synonym c1

Inst A (pri) Inst B (pri) Inst A (sec) Inst B (sec)

C2 C1

enforced in the schema. So, now to view all customer records, create a union all view with the following commands:

create view customer as

select * from c1 union all select * from c2;

This will create a view using all of the customer table that can be accessed. Any application can use the Customer table without knowing the location of the data, which makes data transparent to the application. Figure 4-13 shows the logical and physical relationship between various database entities.

Figure 4-13 Creating partitioned data

So far, accessing data is not an issue, but what about updating? Insert, update, and delete will not work on view, so how will an application insert, update, and delete rows from the Customer view? To achieve this we use a stored procedure and a trigger. Here, the actual work is performed inside of the stored procedure. There will be trigger and stored procedure for each operation.

This is illustrated in Example 4-1, which shows a stored procedure and trigger that performs an insert into the customer table. The stored procedure is very simple, and ensures that the customer record is inserted into the correct base table. The stored procedure and trigger must be created on views on each server. This technique can be extended to perform update and delete operations as well.

Create View customer

Customer1 UNION ALL customer2 Create View customerCustomer1 UNION ALL customer2

Inst A (pri) Inst B (pri) Inst A (sec) Inst B (sec) C2 C1 C1 C2 Customer Customer _Customer Customer

Example 4-1 Stored procedure and trigger for handling inserts

Any application can treat the customer view as the customer table for all work. The application does not have to deal with location-based data and need not worry about actual location of the data. The IDS server will route the query properly and access the data in an optimized way to avoid data movement. The beauty of this solution is that partitioning can be extended to any number of servers. It can start with single server and as the data volume grows, the number of servers can be expanded. When adding servers, the underlying schema remains transparent to the applications.

In document Informix Dynamic Server V10: (Page 64-67)