Materialized View Creation
and Transformation of
Schemas in Highly Available
Database Systems
Thesis for the degree philosophiae doctor
Trondheim, October 2007
Norwegian University of Science and Technology
Faculty of Information Technology,
Mathematics and Electrical Engineering
NTNU
Norwegian University of Science and Technology Thesis for the degree philosophiae doctor
Faculty of Information Technology, Mathematics and Electrical Engineering Department of Computer and Information Science
© Jørgen Løland
ISBN 978-82-471-4381-0 (printed version) ISBN 978-82-471-4395-7 (electronic version) ISSN 1503-8181
Doctoral theses at NTNU, 2007:199 Printed by NTNU-trykk
This thesis is submitted to the Norwegian University of Science and Technol-ogy in partial fulfillment of the degree PhD. The work has been carried out at the Database System Group, Department of Computer and Information Science (IDI). The study was funded by the Faculty of Information Tech-nology, Mathematics and Electrical Engineering through the “forskerskolen” program.
Acknowledgements
First, I would like to thank my advisor Professor Svein-Olaf Hvasshovd for his guidance and ideas, and for providing valuable comments to drafts of the thesis and papers. I would also like to thank my co-advisors Dr. Ing. Øystein Torbjørnsen and Professor Svein Erik Bratsberg for constructive feedback and interesting discussions regarding the research.
During the years I have been working on this thesis, I have received help from many people. In particular, I would like to thank Heine Kolltveit and Jeanine Lilleng for many interesting discussions. In addition, Professor Kjetil Nørv˚ag has been a seemingly infinite source of information when it comes to academic publishing. I would also like to thank the members of the Database System Group in general for providing a good environment for PhD students. I sincerely thank Rune Havnung Bakken, Jon Olav Hauglid and Associate Professor Roger Midtstraum for proofreading and commenting drafts of the thesis. Your feedback have been invaluable.
I would also like to thank my parents and sister for their inspiration and encouragements. Finally, I express my deepest thanks to my wife Ingvild for her constant love and support.
Relational database systems are used in thousands of applications every day, including online web shops, electronic medical records and for mobile tele-phone tracking. Many of these applications have high availability require-ments, allowing the database system to be offline for only a few minutes each year.
In existing DBMSs, user transactions get blocked during creation of mate-rialized views (MVs) and non-trivial schema transformations. Blocking user transactions is not an option in database systems requiring high availability. A non-blocking method to perform these operations is therefore needed.
Our research has focused on how the MV creation and schema transfor-mation operations can be performed in database systems with high avail-ability requirements. We have examined existing solutions to MV creation and schema transformations, and identified requirements. Most important among these requirements were that the method should not have blocking effects, and should degrade performance of concurrent transactions to the smallest possible extent.
The main contribution of this thesis is a method for creation of derived tables (DTs) using relational operators. Furthermore, we show how these DTs can be used to create MVs and to perform schema transformations. The method is non-blocking, and may be executed as a low priority background process to minimize performance degradation.
The MV creation and schema transformation methods have been imple-mented in a prototype DBMS. By performing thorough empirical validation experiments on this prototype, we show that the method works correctly. Furthermore, through extensive performance experiments, we show that the method incurs little response time and throughput degradation under mod-erate workloads. Thus, the method provides a way to create MVs and to transform the database schema that can be used in highly available database systems.
I
Background and Context
3
1 Introduction 5
1.1 Motivation . . . 5
1.1.1 The Derived Table Creation Problem . . . 7
1.2 Research Questions . . . 10
1.3 Research Methodology . . . 11
1.4 Organization of this thesis . . . 12
2 Derived Table Creation Basics 14 2.1 Database Systems - An Introduction . . . 14
2.2 Concurrency Control . . . 16
2.3 Recovery . . . 17
2.4 Record Identification Policy . . . 21
3 A Survey of Technologies Related to Non-Blocking Derived Table Creation 23 3.1 Ronstr¨om’s Schema Transformations . . . 23
3.1.1 Simple Schema Changes . . . 25
3.1.2 Complex Schema Changes . . . 25
3.1.3 Cost Analysis of Ronstr¨om’s Method . . . 36
3.2 Fuzzy Table Copying . . . 40
3.3 Materialized View Maintenance . . . 41
3.3.1 Snapshots . . . 41
3.3.2 Materialized Views . . . 42
3.4 Schema Transformations and DT creation in Existing DBMSs 44 3.5 Summary . . . 45
II
Derived Table Creation
47
x CONTENTS
4.1 Overview of the Framework . . . 49
4.2 Step 1: Preparation . . . 51
4.3 Step 2: Initial Population . . . 53
4.4 Step 3: Log Propagation . . . 53
4.5 Step 4: Synchronization . . . 54
4.6 Considerations for Schema Transformations . . . 56
4.6.1 A lock forwarding improvement for schema transfor-mations . . . 59
4.7 Summary . . . 59
5 Common DT Creation Problems 60 5.1 Missing Record and State Identification . . . 60
5.2 Missing Record Pre-States . . . 61
5.3 Lock Forwarding During Transformations . . . 63
5.4 Inconsistent Source Records . . . 66
5.4.1 Repairing Inconsistencies . . . 68
5.5 Summary . . . 69
6 DT Creation using Relational Operators 70 6.1 Difference and Intersection . . . 71
6.1.1 Preparation . . . 71
6.1.2 Initial Population . . . 73
6.1.3 Log Propagation . . . 73
6.1.4 Synchronization . . . 75
6.2 Horizontal Merge with Duplicate Inclusion . . . 77
6.2.1 Preparation . . . 79
6.2.2 Initial Population . . . 79
6.2.3 Log Propagation . . . 79
6.2.4 Synchronization . . . 80
6.3 Horizontal Merge with Duplicate Removal . . . 81
6.3.1 Preparation Step . . . 83
6.3.2 Initial Population Step . . . 83
6.3.3 Log Propagation Step . . . 83
6.3.4 Synchronization Step . . . 85
6.4 Horizontal Split Transformation . . . 86
6.4.1 Preparation . . . 86 6.4.2 Initial Population . . . 87 6.4.3 Log propagation . . . 87 6.4.4 Synchronization . . . 89 6.5 Vertical Merge . . . 89 6.5.1 Preparation . . . 91
6.5.2 Initial Population . . . 91
6.5.3 Log Propagation . . . 91
6.5.4 Synchronization . . . 93
6.6 Vertical Split over a Candidate Key . . . 95
6.6.1 Preparation . . . 96
6.6.2 Initial Population . . . 96
6.6.3 Log Propagation . . . 96
6.6.4 Synchronization . . . 97
6.7 Vertical Split over a Functional Dependency . . . 97
6.7.1 Preparation . . . 99
6.7.2 Initial Population . . . 100
6.7.3 Log Propagation . . . 101
6.7.4 Synchronization . . . 102
6.7.5 How to Handle Inconsistent Data - An Extension to Vertical Split . . . 103
6.8 Summary . . . 105
III
Implementation and Evaluation
109
7 Implementation Alternatives 111 7.1 Alternative 1 - Simulation . . . 1127.2 Alternative 2 - Open Source DBMS . . . 112
7.3 Alternative 3 - Prototype . . . 115
7.4 Implementation Alternative Discussion . . . 117
8 Design of the Non-blocking DBMS 120 8.1 The Non-blocking DBMS Server . . . 123
8.1.1 Database Communication Module . . . 123
8.1.2 SQL Parser Module . . . 123
8.1.3 Relational Manager Module . . . 124
8.1.4 Scheduler Module . . . 127
8.1.5 Recovery Manager Module . . . 128
8.1.6 Data Manager Module . . . 129
8.1.7 Effects of the Simplifications . . . 130
8.2 Client and Administrator Programs . . . 132
8.3 Summary . . . 132
9 Prototype Testing 134 9.1 Test Environment . . . 134 9.2 Empirical Validation of the Non-Blocking DT Creation Methods139
xii CONTENTS
9.3 Performance Testing . . . 142
9.3.1 Log Propagation - Difference and Intersection . . . 147
9.3.2 Log Propagation - Vertical Merge . . . 154
9.3.3 Low Performance Degradation or Short Execution Time?156 9.3.4 Other Steps of DT Creation . . . 157
9.3.5 Performance Experiment Summary . . . 159
9.4 Discussion . . . 160
10 Discussion 162 10.1 Contributions . . . 162
10.1.1 A General DT Creation Framework . . . 163
10.1.2 DT Creation for Many Relational Operators . . . 163
10.1.3 Support for both Schema Transformations and Mate-rialized Views . . . 164
10.1.4 Solutions to Common DT Creation Problems . . . 165
10.1.5 Implemented and Empirically Validated . . . 165
10.1.6 Low Degree of Performance Degradation . . . 165
10.1.7 Based on Existing DBMS Functionality . . . 168
10.1.8 Other Considerations - Total Amount of Data . . . 169
10.2 Answering the Research Question . . . 169
10.2.1 Summary . . . 171
11 Conclusion and Future Work 172 11.1 Research Contributions . . . 172
11.2 Future Work . . . 173
11.3 Publications . . . 175
IV
Appendix
177
A Non-blocking Database: SQL Syntax 179
B Performance Graphs 181
Glossary 191
2.1 Database System . . . 15
2.2 Compensation Log Records provide valid State Identifiers . . . 18
3.1 Ronstr¨oms Horizontal Merge Method . . . 27
3.2 Ronstr¨oms Horizontal Split Method . . . 28
3.3 Examples of Vertical Merge Schema Change . . . 30
3.4 Chain of Triggers in Ronstr¨oms Vertical Merge Method . . . . 32
3.5 Ronstr¨oms Vertical Split Method . . . 33
3.6 Ronstr¨oms Vertical Split Transformation . . . 34
3.7 Ronstr¨oms Vertical Split Method and Inconsistent Data . . . . 35
3.8 Example MV Consistency Problem . . . 42
4.1 The four steps of DT creation. . . 50
5.1 Solving the Record and State Identification Problems . . . 61
5.2 Solving the Missing Record Pre-State Problem . . . 62
5.3 Example Simple Lock Forwarding (SLF) . . . 64
5.4 Lock Compatibility Matrix . . . 65
5.5 Example Many-to-One Lock Forwarding (M1LF) . . . 66
5.6 Example Many-to-Many Lock Forwarding (MMLF) . . . 67
5.7 Inconsistent Source Records . . . 67
6.1 Difference and Intersection DT Creation . . . 72
6.2 Horizontal Merge DT Creation . . . 77
6.3 Horizontal Merge - Duplicate Inclusion . . . 78
6.4 Horizontal Merge - Duplicate Inclusion with type attribute . . 80
6.5 Horizontal Merge - Duplicate Removal. . . 82
6.6 Horizontal Split DT Creation . . . 86
6.7 Example vertical merge DT creation. . . 90
6.8 Synchronization of a Vertical Merge Schema Transformation . 94 6.9 Vertical split over a Candidate Key. . . 95
xiv LIST OF FIGURES
7.1 Possible Modular Design of Prototype. . . 116 8.1 Modular Design Overview of the Non-blocking DBMS. . . 121 8.2 UML Class Diagram of the Non-blocking Database System. . . 122 8.3 Sequence Diagram - Relational Manager Processing a Query . 126 8.4 Organization of the log. . . 128 8.5 Organization of data records in a table. . . 129 8.6 Screen shot of the Client program in action. . . 133 9.1 Response time and throughput for difference and intersection . 145 9.2 Response time distribution for difference and intersection. . . . 148 9.3 Response time - difference log propagation . . . 151 9.4 Throughput - difference log propagation . . . 153 9.5 Response time and throughput - vertical merge DT creation . 154 9.6 Comparison of vertical merge and difference/intersection . . . 155 9.7 Time vs Degradation . . . 157 9.8 Response Time Summary . . . 160 11.1 Example of Schema Transformation performed in two steps. . 175 11.2 Example interface for dynamic priorities for DT creation. . . . 175 B.1 Response time and throughput - horizontal merge DT creation 182 B.2 Response time - horizontal split DT creation . . . 183 B.3 Throughput - horizontal split DT creation . . . 184 B.4 Response time - vertical merge DT creation . . . 185 B.5 Response time - vertical merge DT creation, varying table size 186 B.6 Throughput - vertical merge DT creation . . . 187 B.7 Response time - vertical split DT creation . . . 188 B.8 Throughput - vertical split DT creation . . . 189
3.1 The three dimensions of Ronstr¨om’s schema transformations. . 25
3.2 Legend for Tables 3.3 and 3.4. . . 36
3.3 Cost Incurred by Ronstr¨om’s Vertical Merge Schema Trans-formation Method . . . 37
3.4 Added Cost by Ronstr¨om’s Vertical Split Schema Transforma-tion Method . . . 38
5.1 DT Creation Problems and Solutions . . . 69
6.1 DT Creation Operators . . . 71
6.2 Problems and solutions for DT Creation methods . . . 107
7.1 Evaluation - Open Source DBMSs . . . 115
7.2 Evaluation of implementation alternatives. . . 118
9.1 Hardware and Software Environment for experiments. . . 135
9.2 Transaction Mix 1 . . . 137
9.3 Transaction Mix 2 . . . 137
9.4 Transaction Mix 3 . . . 138
9.5 Table Sizes used in the experiments . . . 138
9.6 Response Time Distribution Summary . . . 149
9.7 Response Time Initial Population and Log Propagation . . . . 158
Introduction
The topic of this thesis is schema transformations and materialized view cre-ation in relcre-ational database systems with high availability requirements. The main focus is on how creation of derived tables can be used to perform both operations while incurring minimal performance degradation for concurrent transactions.
In this chapter, the motivation for the topic is presented, and the research questions and methodology are discussed.
1.1
Motivation
Relational database systems have had tremendous success since Ted Codd introduced the relation concept in the famous paper “A relational model of data for large shared data banks” in 1970 (Codd, 1970). Today, this type of database system is so dominant that “database system” is close to synonymous with “relational database system”.
Relational database systems1, or simply database systems, are used in
virtually all kinds of applications, spanning from simple personal database systems to huge and complex business database systems. Personal database systems, including CD or book archives and contact information for friends and family, typically contain few tuples (order of hundreds). The conse-quences of unavailability2 for such database systems are, in general, not crit-ical; it would still be possible to play music even if the CD archive was
1Throughout this thesis, the term database is used to denote a collection of related
data. ADatabase Management System (DBMS) is a program used to manage these data,
while database system denotes a collection of data managed by a DBMS (Elmasri and
Navathe, 2004).
2In this thesis, database systems are consideredavailable when they can be fully
6 1.1. MOTIVATION
unavailable. Because people normally consider sporadic downtime of such systems acceptable, these systems have low requirements for availability.
At the other end of the scale, database systems used in business appli-cations may be very large, often in the order of millions or even billions of tuples. Business database systems are involved in everything from stock exchanges, web shops, banking and airline ticket booking to patient histo-ries at hospitals3, Enterprise Resource Planning (ERP) and Home Location
Registries (HLR) used to keep track of mobile telephones in a network. While database system availability is not critical to all business applica-tions, it certainly is to many. Database systems are, e.g., required for the exchange of stocks at NASDAQ and for customers to shop at Amazon.com. Even more critical; the HLR database system is required for any mobile phone to work in a network. These systems should not be unavailable for long periods of time.
Database Operations
Users interact with database systems by performing transactions, which may consist of one or more database operations (Elmasri and Navathe, 2004). Examples of basic database operations include inserting a new patient into a hospital database and querying a patient’s medical record for possible dan-gerous allergies before a surgery.
In modern database systems, e.g. Microsoft SQL Server 2005 (Microsoft TechNet, 2006) and IBM DB2 version 9 (IBM Information Center, 2006), the basic operations are designed to achieve high degrees of concurrency (Garcia-Molina et al., 2002). While the operations performed by one user may block other users from accessing the very same data items at the same time, other data items are still accessible.
A database operation is said to be blocking if it keeps other transactions from executing their update (and possibly read) operations, effectively mak-ing the involved data unavailable. Short term blockmak-ing of small parts of the database may not be problematic. However, blocking a huge amount of data for a long period of time seriously reduces availability. This is obviously unwanted in highly available database systems.
In the following section, two database operations,database schema trans-formations andcreation of materialized views, are described. Neither of these can be performed without blocking the involved tables for a long period of time in existing DBMSs (Løland, 2003).
3Patient history databases may, e.g., describe previous treatment, allergies, x-ray
1.1.1
The Derived Table Creation Problem
”Due to planned technical maintenance and upgrades, the online bank will be unavailable from Saturday 8 p.m. to Sunday 10 a.m. Our contact center will not be able to help with account infor-mation as internal systems are affected as well.
We are sorry for the inconvenience this may cause for our cus-tomers.”
Norwegian Online Bank, October 2006 Schema Transformations
Database schemas4 are typically designed to model the relevant parts and aspects of the world at design time. The schema may be excellent for the intended usage at the time it is design, but many applications change over time. New kinds of products appear, departments which had one head of department suddenly has a board, or new laws that affect the company are introduced by the government. These are examples of changes that may require a transformation of the database schema.
In addition to changing needs as a source for transformations, designers may also have been unsuccessful in designing a good schema in the first place. After being used for some time, it may turn out that a schema does not work as well as it was intended to. Often, the reason for this is that the design is a compromise between many factors, some of which include readability of the E/R diagram, removal of anomalies and optimization of runtime efficiency. It may very well turn out that the schema is too inefficient or that the designers just forgot or misinterpreted something.
In a study of seven applications, Marche (Marche, 1993) reports of sig-nificant changes to relational database schemas over time. Six of the studied schemas had more than 50% of their attributes changed. The evolution con-tinued after the development period had ended. A similar study of a health management system came to the same conclusion (Sjøberg, 1993).
As should be clear, a database schema may sometimes have to be changed after the database has been populated with data. In this thesis, we refer to such changes as “schema transformations”. An important shortcoming of all but the least complex schema transformations is that they must be performed in a blocking way in todays DBMSs (Lorentz and Gregoire, 2003b; Microsoft TechNet, 2006; IBM Information Center, 2006). This will be elaborated on in Section 3.4.
8 1.1. MOTIVATION
Materialized Views
A database view is a table derived from other tables, and is defined by a database query called the view query (Elmasri and Navathe, 2004). Views may be either virtual or materialized. Virtual views do not physically store any records, but can still be queried like normal tables. This is done by using the view queries to rewrite the user queries (Elmasri and Navathe, 2004).
Depending on the complexity of the view query, querying a virtual view may be much more costly than querying a normal table. To remedy this, most modern DBMSs support Materialized Views (MVs)5 (Løland, 2003). Unlike a virtual view, the result of the view query is stored physically in an MV (Elmasri and Navathe, 2004).
MVs have many uses in addition to speeding up queries (Alur et al., 2002). They can be used to store historical information, e.g. sales reports for each quarter of a year. They are also frequently used in Data Warehouses. Because of the great performance advantages of MVs and their widespread use, much research has been conducted on how to keep MVs consistent with the source tables (Løland and Hvasshovd, 2006c). However, in current DBMSs, the MVs still have to be created in a way that blocks all updates to the source tables while the MV is created (Lorentz and Gregoire, 2003b; IBM Information Center, 2006; Microsoft TechNet, 2006).
Using Derived Tables for Schema Transformations and Material-ized View Creation
The blocking MV creation and schema transformation methods described in the previous sections may take minutes or more for tables with large amounts of data. If either of these operations is required, the database administrator is forced to choose between unavailability while performing the operation, or to not perform it at all. Both choices may, however, be unacceptable. This is especially the case when the database system has high availability requirements.
A derived table (DT) is, as the name suggests, a database table containing records derived from other tables6 (Elmasri and Navathe, 2004). A table
“Sales Report” that stores a one-year history of all sales by all employees is an example of a DT. Hence, a materialized view is obviously one type of DT. A less intuitive application of DTs is to redirect operations from source
5Materialized Views are called Indexed Views by Microsoft (Microsoft TechNet, 2006)
and Materialized Query Table by IBM (IBM Information Center, 2007)
6Throughout this thesis, the tables that records are derived from will be calledsource
tables to derived tables and thereby perform a schema transformation. A method to create DT is therefore likely to be usable for both operations.
Both schema transformations and Materialized Views are defined by a query (Microsoft TechNet, 2006; Lorentz and Gregoire, 2003a; Løland, 2003), and we therefore focus on DT creation using relational operators7, also called
relational algebra operations. Relational operators can be categorized in two groups: non-aggregate and aggregate operators (Elmasri and Navathe, 2004). The non-aggregate operators are cartesian product, various joins, projection, union, selection, difference, intersection and division. Aggregate operators are mathematical functions that apply to collections of records. Both non-aggregate and aggregate operators can be used to define schema transformations and MVs. However, aggregate operators are typically not used without non-aggregate operators (Alur et al., 2002), and we therefore consider non-aggregate operators the best starting point for DT creation.
The main topic of this thesis is to develop a method that solves the unavailability problem of creating derived tables, using common relational operators. Due to time constraints, we will focus on six operators: full outer equijoin (one-to-many and many-to-many relationships), projection, union, selection, difference and intersection8. Full outer equijoin is chosen because
these can later be reduced to any inner/left/right join simply by removing records from the result, and because equality is the most commonly used comparison type in joins (Elmasri and Navathe, 2004). Furthermore, in terms of derived table creation, cartesian product is actually a simpler full outer join in which no attribute comparison is performed. The final non-aggregate operator, division, can be expressed in terms of the other operators, and is therefore considered less important.
The suggested method must solve any problem associated with utilizing the DTs as materialized views and in schema transformations. To gain insight in the field, the work must include a thorough study of existing solutions to the described and closely related problems. Existing DBMS functionality should be used to the greatest possible extent to ease the integration of the method into existing DBMSs. Since the goal is to develop a method that incurs little performance degradation to concurrent transactions, the performance implications of the method needs to be tested.
7Relational operators are the building blocks used in queries (Elmasri and Navathe,
2004).
8Due to naming conventions in the literature (Ronstr¨om, 1998), we use the names
ver-tical merge and split, horizontal merge and split, difference and intersection, respectively, when these relational operators are used in DT creation.
10 1.2. RESEARCH QUESTIONS
1.2
Research Questions
Based on the discussion in the previous section, the main research question of the thesis is:
How can we create derived tables and use these for schema transformation and materialized view creation purposes while incurring minimal performance degradation to transactions operating concurrently on the involved source ta-bles.
We realize that this is a research question with many aspects. To be able to answer it, the research question is therefore refined into four key challenges: Q1: Current situation
What is the current status of related research designed to address the main research question or part of it?
Q2: System Requirements
What DBMS functionality is required for non-blocking DT creation to work?
Q3: Approach and solutions
How can derived tables be created with minimal performance degra-dation, and be used for schema transformation and MV creation pur-poses?
• How can we create derived tables using the chosen six relational operators.
• What is required for the DTs to be used a) as materialized views? b) for schema transformations?
• To what extent can the solution be based on standard DBMS func-tionality and thereby be easily integradable in existing DBMSs? Q4: Performance
Is the performance of the solution satisfactory?
• How much does the proposed solution degrade performance for user transactions operating concurrently?
• With the inevitable performance degradation in mind; under which circumstances is the proposed solution better than a) other solu-tions? b) performing the schema transformation or MV creation in the traditional, blocking way?
1.3
Research Methodology
Denning et al. divides computer science research into three paradigms: the-ory, abstraction and design (Denning et al., 1989).
Theory is rooted in mathematics and aims at developing validated theories. The paradigm consists of four steps:
1. characterize objects of study
2. hypothesize possible relationships among them, i.e., form theorem 3. determine whether the relationships are true, i.e., proof
4. interpret results
Abstraction is rooted in experimental scientific method. In this method, a phenomenon is investigated by collecting and analyzing experimental results. It consists of four steps:
1. form a hypothesis
2. construct a model and make a prediction 3. design an experiment and collect data 4. analyze results
Design is rooted in engineering, and aims at constructing a system that solves a problem.
1. state requirements 2. state specifications
3. design and implement the system 4. test the system
The research presented in this thesis fits naturally into the Design paradigm. The research aims at solving theproblem that creation of derived tables is a blocking operation.
For our suggested solution to be useful, the method must fit into common DBMS design. Hence, the first step in solving the research question is to understand commonly used DBMS technology that is somehow related to the research question. This will enable us to state requirements.
The next step is to state specifications for a method that can be used to create derived tables in a non-blocking way. The method should be designed
12 1.4. ORGANIZATION OF THIS THESIS
to fit into existing DBMSs to the greatest possible extent, and to degrade performance as little as possible.
To verify validity, and to test the actual performance degradation, a DBMS and the suggested method is then designed and implemented. The implementation is then subjected to thorough performancetesting.
1.4
Organization of this thesis
The thesis is divided into three parts with different focus. The focus in Part I is on the background for the research. This includes the research question, required functionality and a survey of related work. Part II presents our solution to the derived table creation problem, and shows how the DTs can be used to transform the schema and to create materialized views. In Part III, we discuss the results of experiments on a prototype DBMS. This part also includes a discussion of the research contributions, and suggests further work.
Part I - Background and Context contains an introduction to derived table creation. The focus in this part is on research from the literature and existing systems that is relevant to our solution of the research question and suggestions for further work.
Chapter 1 contains this introduction. The chapter states the motiva-tion for the research, and the research methodology that is used in the work.
Chapter 2 introduces the DBMS fundamentals required to perform non-blocking derived table creations.
Chapter 3 is a survey of existing solutions to the non-blocking DT creation problem and related problems.
Part II - DT Creation Framework presents our solution for non-blocking creation of derived tables, and how these derived tables can be used for schema transformations and materialized view creation.
Chapter 4 introduces our framework for non-blocking derived table creation.
Chapter 5 identifies problems that are encountered when derived ta-bles are created as described in Chapter 4. The chapter also shows how these problems can be solved.
Chapter 6 describes in detail how the DT creation framework pre-sented in Chapter 4 can be used for non-blocking creation of de-rived tables using the six relational operators that have been cho-sen. The chapter also describes what needs to be done to use these DTs for schema transformations or as materialized views.
Part III - Implementation and Testing presents the design of our pro-totype DBMS. The propro-totype is capable of performing non-blocking DT creation as described in Part II. Results from performance testing of this prototype are also presented.
Chapter 7 evaluates three alternatives for implementation of the DT creation method.
Chapter 8 describes the design of a prototype DBMS capable of per-forming the DT creation method developed in Part II.
Chapter 9 discusses experiment types the results of performing the required experiments on the prototype.
Chapter 10 contains a discussion of the results of the research. Chapter 11 presents an overall conclusion and the contributions of
Chapter 2
Derived Table Creation Basics
This chapter describes basic Database Management System concepts that are used by or are otherwise relevant to our non-blocking DT creation method. A thorough description of DBMSs is out of the scope of this thesis. For further details, the reader is referred to one of the many text books on the subject, e.g. “Database Systems The Complete Book” (Garcia-Molina et al., 2002) or “Fundamentals of Database Systems” (Elmasri and Navathe, 2004).
2.1
Database Systems - An Introduction
A database (DB) is a collection of data items1, each having a value (Bernstein
et al., 1987). All access to the database goes through the Database Manage-ment System (DBMS). As illustrated in Figure 2.1, a database managed by a DBMS is called a database system (Elmasri and Navathe, 2004).
Database access is performed by executing special transaction programs, which have a defined start and end, set bystart and either commit orabort operation requests. A commit ensures that all operations of the transaction are executed and safely stored, while an abort call removes all effects of the transaction (Elmasri and Navathe, 2004).
In its most common use, transactions have four properties, known as the “ACID” properties (Gray, 1981; Haerder and Reuter, 1983):
Atomicity - The transaction must execute successfully, or must appear not to have executed at all. This is also referred to as the “all or nothing” property of transactions. Thus, the DBMS must be able to undo all operations performed by a transaction that is aborted.
1The data items are calledtuples or rows in the relational data model. Internally in
database systems, they are calledrecords(Garcia-Molina et al., 2002). To avoid confusion,
Database
Database System
Database Management System
User
Figure 2.1: Conceptual Model of a Database System.
Consistency - A transaction must always transform the database from one consistent state2 to another.
Isolation - It should appear to each transaction that other transactions either appeared before or after it, but not both (Gray and Reuter, 1993).
Durability - The results of a transaction are permanent once the transac-tion has committed.
Broadly speaking, the ACID properties are enforced mainly by concur-rency control, which is used to achieve isolation, and recovery which is used for atomicity and durability. Consistency means that transactions must pre-serve constraints. The following two sections give a brief introduction to common concurrency control and recovery mechanisms.
2A consistent state is a state where database constraints are not broken (Garcia-Molina
16 2.2. CONCURRENCY CONTROL
2.2
Concurrency Control
In a database system where only one transaction is active at any time, con-currency control is not needed. In this scenario, the operations from each transaction are executed in serial (i.e. sequential) order, and two transactions can never interfere with each other. The isolation property is therefore im-plicitly guaranteed. However, this scenario is seldom used since the database system is normally only able to use small parts of the available resources at any time (Bernstein et al., 1987).
When concurrent transactions are allowed, the operations of the various transactions must be executedas if the execution was serial (Garcia-Molina et al., 2002). A sequence, or history, of operations that gives the same result as serial execution is called serializable. It is the responsibility of the “Scheduler”, or “Concurrency Controller”, to enforce serializable histories, which in turn is a guarantee for isolation (Bernstein et al., 1987).
Schedulers
Schedulers can be either optimistic or pessimistic. With theoptimistic strat-egy, transactions perform operations right away without first checking for conflicts. When the transaction requests a commit, however, its history is checked. If the transaction has been involved in any non-serializable opera-tions, the transaction is forced to abort. Timestamp ordering, serialization graph testing and locking can all be used for optimistic scheduling (Bernstein et al., 1987).
The most common form of scheduling is pessimistic, however. With this strategy, transactions are not allowed to perform operations that will form non-serializable histories in the first place. Thus, the scheduler has to check every operation to see if it conflicts with any operation executed by another currently active transaction. When a conflict is found, the scheduler may decide to either delay or reject the operation (Bernstein et al., 1987).
The pessimistic Two Phase Locking (2PL) strategy has become the de facto scheduling standard in commercial DBMSs. It is, e.g., used in Oracle Database 10g (Cyran and Lane, 2003). In 2PL, a lock must be set on a data item before a transaction is allowed to operate on it. Two lock types, shared and exclusive, are typically used. The idea is that multiple transac-tions should be allowed to concurrently read the same record, while only one transaction should at any time be allowed to write to a record. Thus, read operations are allowed if the transaction has a shared or exclusive lock on the record, while write operations are allowed only if the transaction has an exclusive lock on it.
As the name indicates, 2PL works in two phases: locks are acquired during the first phase, and released during the second phase. This implies that a transaction is not allowed to set new locks once it has started releasing locks. Unless the transaction pre-declares all operations it will execute, the scheduler does not know if a transaction is done operating on a particular object, or if it will need more locks in the future. Locks are therefore typically not released until the transaction terminates. This is known as Strict 2PL (Garcia-Molina et al., 2002). The derived table creation method described in this document assumes that 2PL is used, although it may be tailored to suit other scheduling strategies as well.
Two-phase commit is a commonly used protocol for commit handling in distributed DBMSs used to ensure that the transaction either commits on all nodes or aborts on all nodes. The protocol works in two phases: in the prepare phase, the transaction coordinator asks all transaction participants if they are ready to commit. If they all agree to commit, the coordinator completes the transaction by sending a commit message to all participant (Gray, 1978). This is called the commit phase.
2.3
Recovery
In a database system, failure may occur on three levels. Transaction failure happens when as transaction either chooses to, or is forced to, abort. System failure happens when the contents of volatile storage is lost or corrupted. A power failure is a typical reason for such failures. Media failure happens when the contents in non-volatile storage is either lost or corrupted, e.g. because of a disk crash. In what follows, “memory” and “disk” will be used instead of volatile and non-volatile storage, respectively.
Physical Logging
Recovery managers are constructed to correct the three types of failure. The idea behind almost all recovery managers is that information for how to re-cover the database to the correct state must be stored safely at any time. This information is typically stored in a log, which can either be physical, logical or physiological (Haerder and Reuter, 1983; Bernstein et al., 1987). Physical logging, orvalue logging, writes the before and after value of a changed object to the log (Gray, 1978). The physical unit that is logged is typically a disk block or a record. Assuming that records are smaller than disk blocks, the former produces drastically higher log volumes than the latter (Haerder and Reuter, 1983). Since the log records contain before and after values of the
18 2.3. RECOVERY Block:27 LSN:10 R1=3 R2=6 Block:27 LSN:12 R1=10 R2=15 Block:27 LSN:?? R1=3 R2=15 Disk Block History 11: T1 - R1=10 12: T2 - R2=15 13: T2 commit 14: T1 abort
Figure 2.2: Two records in the same disk block are updated by different trans-actions, T1 and T2. After T1 aborts, there is no valid state identifier for the
block.
changed object, logged operations are idempotent which means that redoing the operation multiple times yields the same result as redoing it once.
Logical Logging
Logical logging, or operation logging, logs the operation that is performed instead of the before and after value (Haerder and Reuter, 1983). This strat-egy produces much smaller log volumes than the physical methods (Bernstein et al., 1987). A Log Sequence Number (LSN) is assigned to each log record, and data items are tagged with the LSN of the latest operation that has changed it. This is done to ensure that changes are applied only once to each record since logically logged operations are not idempotent in general. LSNs may be assigned to block (block state identifier, BSI) or record (record state identifier, RSI) level (Bernstein et al., 1987). The former requires slightly less disk space whereas the latter is better suited in replicated database systems based on log redo since this allows for different physical organization at the different nodes (Bratsberg et al., 1997a).
Two common methods to increase the degree of concurrency are fine-granularity locking and semantically rich locks. Fine-granularity locks are normal locks that are set on small data items, i.e. records (Weikum, 1986; Mohan et al., 1992). Semantically rich locks allow multiple transactions to lock the same data item provided that the operations are commutative (Korth, 1983). Operations that commute may be performed in any order, e.g., “increase” and “decrease”. When these methods are combined with logical logging, Compensating Log Records (Crus, 1984) are required (Mohan et al., 1992). The reason for this is that there is no correct LSN that can be used as BSI after certain undo operations, as illustrated in Figure 2.2; after the abort of transaction 1, the LSN of the block cannot be set to what it was before the update because that would not reflect the change performed by
transaction 2. Neither can the LSN of the abort log record be used since this invalidates the one-to-one correspondence between updates and log records (Gray and Reuter, 1993). Thus, Compensating Log Records (CLR) (Crus, 1984) are written to the log when an undo operation is performed due to any of the failure scenarios presented in Section 2.3. The CLR describes the undo action that takes place (Gray, 1978). It also keeps the LSN of the log record that was undone. E.g., if the insert of recordX is undone, a CLR describing the deletion of X is written to the log. LSNs are assigned to CLRs, thus the state identifier of a record or disk block will increase even when undo operations are performed.
Logical logging is considered better than physical logging because of the reduced log volume and because the state identifiers reduces recovery work. However, it has one major flaw: the logged operations are not action consis-tent3 since they are not atomic. One insert may, e.g., require that large parts of a B-tree is restructured. This can be solved by using a two-level recovery scheme where the low-level system provides action consistent operations to the high-level logical logging scheme (Gray and Reuter, 1993). Shadowing is an example low-level scheme. With this method, blocks are copied before a change is applied. The method is complex and requires locks to be set on blocks since this is the granularity of the copies (Gray and Reuter, 1993). Physiological Logging
Physiological logging (Mohan et al., 1992), also called physical-to-a-page logical-within-a-page, is a compromise between physical and logical logging. It uses logical logging to describe operations on the physical objects; blocks. In the shadowing strategy, non-atomic operations are executed by mini-transactions. The mini-transactions consist of atomic operations, each of which are physiologically logged. Thus, the log records are small, while the problems of logical logging are avoided (Gray and Reuter, 1993).
(No-)Steal and (No-)Force Cache Managers
The log may provide information to undo or redo an operation. Which kind of information is required for recovery to work depends heavily on the strat-egy of another DBMS component, the cache manager, which is responsible for copying data items between memory and disk. Two parameters are of particular relevance to the recovery manager. The first determines whether
3This means that one logical operation may involve multiple physical operations.
Hence, a database system may crash when only parts of a logically logged operation has been performed (Gray and Reuter, 1993).
20 2.3. RECOVERY
or not updates from uncommitted transactions may be written to disk. If uncommitted updates are allowed on disk, the cache manager uses a steal strategy; otherwise ano-steal strategy is used (Gray, 1978). Since memory is almost always a limited resource, stealing gives the cache manager valuable freedom in choosing which data items should be moved out. The problem is that if a failure occurs, data items on disk may have been changed by trans-actions that have not committed. The atomicity property requires that the effects of these transactions should be removed. Hence, if stealing is allowed, undo information of uncommitted writes must be forced to the log before the updated records are written to disk.
The second parameter determines if data items updated by a transac-tion must be forced to disk or not (i.e. no-force) before the transaction is committed (Gray, 1978). If force is used, the disk must be accessed during critical parts of transaction execution. This may lead to inefficient cache management. If no-force is used, redo information of all committed changes must be written to the log, and the log must then be forced to disk. This is known as Force Log at Commit (Mohan et al., 1992).
It is common to use a steal/no-force cache manager since this provides the maximum freedom and highest performance. This is also the case for a well-known recovery strategy, ARIES (Mohan et al., 1992), which is briefly described in the next section.
The ARIES Recovery Strategy
ARIES (Mohan et al., 1992) (Algorithm for Recovery and Isolation Exploit-ing Semantics) is a recovery method for the steal/no-force cache manager strategy described above. ARIES uses 2PL for concurrency. It is in common use in many commercial DBMS, e.g. SQL Server 2005 (Microsoft TechNet, 2006) and IBM DB2 version 9 (IBM Information Center, 2006). The princi-ples are also used in the derived table creation method, which is the topic of this thesis.
ARIES uses the Write-Ahead Logging (WAL) protocol (Gray, 1978), which requires that a log record describing a change to a data item is written to disk before the change itself is written. One sequential log, containing both undo and redo log records, is used. A unique, ascending Log Sequence Number (LSN) is assigned to each record in this log. The LSN is also used to tag blocks so that a disk block is known to reflect a logged change if and only if the LSN of the disk block is equal to or greater than that of a log record. Log records are initially added to the volatile part of the log file, and are forced to disk either when a commit request is processed or when the cache manager writes changed data items to disk.
The ARIES protocol can be used with both logical and physiological logging. It supports fine-granularity locks and semantically rich locks (Mohan et al., 1992).
2.4
Record Identification Policy
The DBMS needs a way to uniquely identify records so that transactional operations and recovery work is applied to the correct record. Each record is therefore assigned a Record Identifier (RID). The mapping from RID to the physical record is called the access path. There are four identification techniques (Gray and Reuter, 1993): Relative Byte Address, Tuple Identifier, Database Key, and Primary Key.
Physical Identification Policies
Relative Byte Addresses (RBA) consist of a block address and an offset, i.e. the byte number within that block. RBA is fast since it points directly to the correct physical address. Physical location is not very stable, however. E.g., an update may increase a records size, which may change the offset or block. An address that is as unstable as this is not well suited as a RID (Gray and Reuter, 1993).
Tuple Identifiers (TID) consists of a block address and a logical ID within the block. Each block has an index used to map the ID to the correct offset. Hence, a record may be relocated within a block without changing the RID. A pointer to the new address is used if a record is relocated to another block. When a pointer is followed, the access path to the record becomes more costly, however. Hence, relocated records should eventually receive a new TID reflecting the actual location. This reorganization must be executed online, i.e. in parallel to normal processing, and represents an overhead. This seems to be the most common record identification technique; it is used by, e.g., IBM DB2 v9 (IBM Information Center, 2006), SQL Server 2005 (Microsoft TechNet, 2006) and Oracle 10g (Cyran and Lane, 2003).
Logical Identification Policies
Database Keys are unique, ascending integers assigned to records by the DBMS. A translation table maps database keys to the physical location of the records. The database key works as an index in the array-like translation table, and therefore requires only one block access. This mapping ensures that a record can be relocated to any block without having to change its RID. The extra lookup incurs an access path overhead, however.
22 2.4. RECORD IDENTIFICATION POLICY
Since all records in a relational database system are required to have a uniqueprimary key4, the primary keys may serve as RIDs as well. Addressing
is indirect, but in contrast to the previous method, primary keys can not be used as an index in a translation table since primary keys do not have monotonically increasing values. Thus, a B-tree is used to map the primary key to the physical location of the record. The access path is approximately as costly as database keys, but has a number of advantages. These include that access to records is often done through primary keys, so the primary key mapping to record must be done either way. The uniqueness of primary keys must also be guaranteed by the DBMS. This is efficient to do when primary key is used as RID (Gray and Reuter, 1993). This technique is used, e.g, in Oracle 10g if the table is index-organized (Cyran and Lane, 2003).
When creating derived tables, the physical location of records residing in DTs are not the same as in the source tables. The blocks are obviously different. The location within the blocks may also be different since a rela-tional operator is applied. Hence, the DT creation method described in this thesis assumes that a logical record identification scheme is used, i.e. either Database Keys or Primary Keys. Using physical identification policies, i.e. RBA or TID, is also possible, but requires an additional address mapping table.
A Survey of Technologies
Related to Non-Blocking
Derived Table Creation
This chapter describes the state of the art in non-blocking creation of derived tables (DT). The aim of the survey is to evaluate the functionality and cost of existing methods used for this purpose. Some of the ideas presented here will later be used in our non-blocking DT creation method. This will be explicitly commented on.
Three related areas of research are discussed. First, a schema transfor-mation method that can be used for some of the relational operators is de-scribed. To the author’s knowledge, this is the only research on non-blocking transformations in relational database systems in the literature. Next, we describe fuzzy copying, which is a method for non-blocking creation of DTs, but without the ability to apply relational operators. Third, maintenance techniques for Materialized Views (MV) are discussed. The motivation for this is that an MV is a type of DT, and some of the research in MV mainte-nance is therefore applicable to our suggested DT creation method. Finally, methods for schema transformations and DT creation available in existing DBMSs are described.
3.1
Ronstr¨
om’s Schema Transformations
Ronstr¨om (Ronstr¨om, 2000) presents a non-blocking method that uses both a reorganizer and triggers within users’ transactions to perform schema trans-formations, called schema changes by Ronstr¨om (Ronstr¨om, 1998). It is argued that there are three dimensions to schema transformations. These
24 3.1. RONSTR ¨OM’S SCHEMA TRANSFORMATIONS
are soft vs. hard schema changes, simple vs. complex schema changes and simple vs. complex conversion functions (Ronstr¨om, 1998). A summary can be found in Table 3.1.
The soft vs. hard schema change dimension determines whether or not new transactions are allowed to use the new schema before all transactions on the old schema have terminated. Thus, with soft schema changes, trans-actions that were started before the new schema was created continue pro-cessing on the old schema while new transactions start using the transformed one (Ronstr¨om, 1998). With hard schema changes, new transactions are not allowed to start processing on the affected parts of the transformed schema until all transactions on the old schema have terminated.
Soft schema changes are desirable since with this strategy, new transac-tions are not blocked while the old transactransac-tions finish processing. In some cases, soft schema changes can not be used, however. This happens when-ever the new schema does not contain enough information to trigger updates back to the old schema, i.e. when a mapping function from the transformed attributes to the old attributes does not exist (Ronstr¨om, 1998).
The second dimension to schema changes divides transformations into simple andcomplex schema changes. Simple schema changes are short lived, and typically involve changes to the schema description only (Ronstr¨om, 1998). Complex schema changes involve many records and take considerable time. With this method, complex schema changes should not be executed as one single blocking transaction due to their long execution time (Ron-str¨om, 2000). Instead, complex schema changes are organized using SAGAs1
(Garcia-Molina and Salem, 1987).
The third dimension to schema changes is that ofsimple vs. complex con-version functions. In simple conversions, all the information needed to apply an operation in the transformed schema is found in the operated-on record in the old schema. Complex conversions, on the other hand, need information from other records before the operation can be applied. This information may be found in other tables (Ronstr¨om, 2000). Complex conversions can only be performed by complex schema changes (Ronstr¨om, 2000).
The following sections describe how transformations are performed in Ronstr¨om’s method. The description divides transformations into simple and complex changes, i.e. along the second dimension. Even though not all of the complex schema changes actually create DTs, all changes that can be performed by the method are presented for readability. A thorough cost analysis of schema transformation operators is presented in Section 3.1.3.
1SAGA is a method to organize long running transactions (Garcia-Molina and Salem,
Schema Change
Soft New transactions are allowed to start accessing the new tables while the old transactions are accessing the old tables.
Hard Transactions that try to access the new tables are blocked until all transactions accessing the old tables have completed. Simple Short lived, typically only changes schema description,
exe-cuted as one transaction.
Complex Long lived, involves many records, executed using triggers and SAGA transactions.
Conversion Function
Simple A record can be added to the new schema by reading only the operated-on record in the old schema.
Complex Adding a record to the new schema may require information from multiple records in the old schema. Always executed as a complex schema change.
Table 3.1: The three dimensions of Ronstr¨om’s schema transformations.
3.1.1
Simple Schema Changes
Simple schema changes only change the schema description of the database. The changes are organized in a way similar to the two-phase commit protocol, described in Section 2.2. First, the system coordinator sends the new schema description to all nodes. If the transformation is hard, each node will wait for ongoing transactions to finish processing. The nodes then lock the involved parts of the schema, update the schema description and log the change before acknowledging the change request. When all nodes have acknowledged the change, the coordinator sends commit to all nodes, including a new schema version number. New transactions will from now on use the transformed schema.
Examples of simple schema changes include adding and dropping a table, adding an attribute with a default value, and dropping an attribute or index. None of these operations involve creation of derived tables.
3.1.2
Complex Schema Changes
Schema changes involving many records are considered complex in Ron-str¨om’s method. This includes adding attributes, with values derived from other attributes, to a table. It also includes adding a secondary index. Ad-ditionally, both horizontal and vertical merge and split of tables can be per-formed (Ronstr¨om, 1998). In terms of relational operators, horizontal merge
26 3.1. RONSTR ¨OM’S SCHEMA TRANSFORMATIONS
corresponds to the union operator, while vertical merge corresponds to the left outer join operator. The split methods are inverses of the merge methods. All complex schema changes go through three phases. The schema is first changed by adding the necessary tables, attributes, triggers, indices and constraints (Ronstr¨om, 2000). Second, the involved tables are operated on by reading and performing necessary operations one record at a time. The required operations depend on the transformation being performed. Involved tables are left unlocked for the entire transformation, whereas the records are locked temporarily. To ensure that the transformation does not lock records for long periods of time, only one record is operated on per transaction. All these transactions, each operating on one record, are organized by using SAGAs. While transactions read records in the source tables and perform the operations necessary for the transformation, triggers ensure that insert, delete and update operations in the old schema are performed in the new schema as well (Ronstr¨om, 1998).
The third phase is started once the SAGA organized transactions have completed. If the schema change is soft, new user transactions start using the new schema immediately while active transactions are allowed to fin-ish execution on the old schema. Since both schemas are in use, triggers have to forward operations both from the old to the new schema and vice versa. If the schema change is hard, transactions are not allowed to use the new schema until all transactions on the old schema have completed. When all transactions that were using the old schema have terminated, obsolete triggers, tables, attributes, indices and constraints are removed (Ronstr¨om, 2000).
In what follows, all complex schema changes that can be performed by Ronstr¨om’s method are described in detail. Ordered by increasing com-plexity, these are horizontal merge and split, and vertical merge and split transformations.
Horizontal Merge Schema Change
In Ronstr¨om’s schema transformation framework, horizontal merge corre-sponds to the UNION relational operator without duplicate removal. The transformation is performed by creating a new table in which records from both source tables are inserted. Hence, this is a derived table.
As illustrated in Figure 3.1, records from both source tables may have identical primary keys. This is a problem because two records with the same primary key are not allowed to coexist in the new table. This may be solved by using a non-derived primary key, e.g. an additional attribute with an auto-incremented number, in the new table. Alternatively, the primary key of the
Foreign key
Record ("new table")
FK Album Root Down Come away... Kind of... Root Down The look... Artist Smith, Jimmy Jones, Norah Davis, Miles Smith, Jimmy Krall, Diana FromTbl Vin Vin Vin CD CD CD ("original table 2") Album Root Down The look... Artist Smith, Jimmy Krall, Diana FK
Vinyl ("original table 1")
FK Album Root Down Come away... Kind of... Artist Smith, Jimmy Jones, Norah Davis, Miles
Figure 3.1: Horizontal Merge. Two tables, “Vinyl” and “CD”, are merged into one new table, “Record”. The primary key of both tables is <artist,album>. The new table includes an attribute “FromTable” so that identical primary key values from the two source tables may coexist.
new table may be a combination of the primary key from the old schema and an attribute identifying which table the record belonged to (Ronstr¨om, 2000).
The method starts by creating the new table. Foreign keys are then added to both the old tables and to the new table. Since duplicates are not removed, there is a one-to-one relationship between records in the old and the new schema. Thus, triggers in both schemas will only have to operate on one record in the other schema.
Update and delete operations in one of the source tables trigger the same operation on the record referred to in the new table. For soft transformation, updates and deletes in the new table have similar triggers. Insert operations in a source table simply trigger an equal insert into the new table. Inserts into the new table should trigger inserts into one of the old tables as well. If the new table contains an attribute that identifies which old table it should belong to, this is straightforward. If this is not the case, e.g. because a non-derived key is used in the new table, the transformation cannot be performed softly.
In the second step, records from the old tables are read and inserted into the new table one record at a time. When all records have been copied, new transactions are given access to the new schema. The old tables are deleted once all old transactions have finished processing.
28 3.1. RONSTR ¨OM’S SCHEMA TRANSFORMATIONS
Foreign key
Employee ("original table")
FK Salary $40’ $32’ $42’ $35’ S.Name Valiante Olsen Oaks Pine F.Name Hanna Erik Markus Peter
LowSalary ("new table2")
Salary $32’ $35’ S.Name Olsen Pine F.Name Erik Peter FK
HighSalary ("new table1")
Salary $40’ $42’ S.Name Valiante Oaks F.Name Hanna Markus FK
Figure 3.2: Horizontal Split. One table, “Employee”, is split into two tables based on salary.
Horizontal Split Schema Change
Horizontal split is the inverse of horizontal merge; it splits one table into two or more tables by copying records to the new tables depending on a condition. An example transformation is that of splitting “Employee” into “High Salary Employee” and “Low Salary Employee” based on conditions like “salary >= $40.000” and “salary < $40.000”. This transformation is illustrated in Figure 3.2. Only horizontal split transformations where all source table records match the condition of one and only one new table is described by Ronstr¨om (Ronstr¨om, 2000). Because of this, records in the old table refer to exactly one record in the new schema, thus simplifying the transformation.
The new tables are first added to the schema, thus this method creates derived tables. Foreign keys, initially set to null, are then added both to the old and new tables. Once a record has been copied to the new schema, the foreign keys in both schemas are updated to point to the record in the other schema.
The transformation can easily be made soft by adding triggers to both the old and the new tables. Because of the one-to-one relationship between records in the old and new schema, these triggers are straightforward: deletes and updates perform the same operation on the record referred to by the foreign key. Insert operations into the old table trigger an insert into the new table that has a matching condition. Ronstr¨om does not discuss how to handle an insert into a new table if the old table already contains a record with the same primary key. In all other cases, inserts into the new table
simply result in an insert into the old table as well.
When the triggers have been added, the transformation is executed as described in the general method by copying the data in the old table one record at a time.
Vertical Merge Schema Changes
The vertical merge schema change uses the left outer join relational opera-tor. Since records without a join match in the left table of the join are not included, this transformation is not lossless. The method requires that the tables have the same primary key, or that one table has a foreign key to the other table (Ronstr¨om, 2000). This is illustrated in Figures 3.3(a) and 3.3(b), respectively. These requirements imply that the method cannot perform a join of many-to-many relations, nor a full outer join.
Since the join is performed by adding attributes to one of the existing tables, a DT is not created. Hence, the method cannot be used for other purposes than schema transformations.
The transformation starts by adding the attributes belonging to the joined record to the left table of the join. This table is called the left table in the old schema and themerged table in the new schema. The left table is repre-sented by “Person” in both Figures 3.3(a) and 3.3(b). A foreign key to the right table of the join, called theoriginating table, is also added if it does not already exist. The originating table is represented by “Salary” and “PAd-dress” in Figures 3.3(a) and 3.3(b), respectively. In addition, an attribute indicating whether the record has already been transformed is added. During the transformation, transactions that operate on the old schema do not see the attributes added to the left table.
Triggers are then added to the originating table. Update and delete op-erations trigger update opop-erations of all records referring to it in the merged table. The trigger on deletes also removes the foreign key of referring records. Insert operations trigger updates of all records in the merged table match-ing the join attributes. All these triggers also set the has-been-transformed attribute to true (Ronstr¨om, 2000).
Since old transactions are free to operate on the left table, a number of triggers must be added there as well. The trigger of insert operations reads the matching record in the originating table so that the value of the added attributes can be updated accordingly. Update operations that change the foreign key, trigger a read of the new join match and update the added attributes to keep it consistent. In addition, all modifying operations2 have
30 3.1. RONSTR ¨OM’S SCHEMA TRANSFORMATIONS Person Firstname Surname Address Zip Code Salary "Left table"
Old Schema New Schema
"Originating table" "Merged table" Person Firstname Surname Address Zip Code Salary Firstname Surname Salary Salary
(a) Vertical merge with same primary key. “Firstname, Surname” is the pri-mary key in all tables. The merged table is created by adding the salary attribute to “Person”. Person Firstname Surname Address Zip Code Person Firstname Surname Address Zip Code City PAddress Zip Code City City
Old Schema New Schema
"Left table"
"Originating table"
"Merged table"
(b) Vertical Merge of functional dependency. Person.ZipCode is a foreign key to PAddress.ZipCode. The merged table is created by adding city to “Person”.
to update the foreign key reference in the original table.
Once all necessary attributes and triggers are in place, the records in the left table are processed one at a time. A transaction reads a record and uses the foreign key to find the record referred to in the originating table. The attribute values of that record are then written to the added attributes in the merged table, and the has-been-transformed attribute is set (Ronstr¨om, 2000).
When all records in the merged table have been processed, it contains a left outer join of the two tables. A hard schema change is simply achieved by letting old transactions complete on the old schema. Triggers, attributes and foreign keys no longer in use are then dropped, before new transactions are allowed to use the new schema (Ronstr¨om, 2000).
Performing a soft vertical merge transformation implies adding triggers to the merged table as well. These triggers make sure that write operations executed by transactions operating on the new schema are also visible for transactions using the old schema. Thus, updates on the added attributes of records in the merged table must trigger updates to the referred record in the originating table.
Insert operations into the merged table trigger a scan of the originating table to see if it already contains a matching record. If so, only the foreign key of the inserted record is updated. Otherwise, a record is also inserted into the originating table.
A problem arises if a record is inserted into the merged table and the trigger scanning the originating table finds that an inconsistent record already exists. The same case is encountered if the added attributes of a merged table record are updated while multiple records refer to the same originating record. Since two records with the same primary key cannot exist in the originating table at the same time, a simple insert of a new record is not possible. Furthermore, it would not be correct to just update the originating record since the other records referring to it would disagree on its attribute values. This problem is not addressed in the method, but there are at least two possible solutions: the first possibility is to abort the transaction trying to insert or update the record in the merged table. This would be a serious restriction to operations in the merged table. The second is to update the record in the originating table, which in turn triggers updates on all records in the merged table that refers to it. This is illustrated in Example 3.1.1: Example 3.1.1 (Triggered Updates During Soft Vertical Merge)
Consider the vertical merge illustrated in Figure 3.4. During a soft schema change, an attribute added to the merged table, “City”, is updated. This triggers an update to the originating table, “Postal Address”, again triggering