• No results found

Structure-Modification Operations

data item that has been updated by an overlapping but already committed transaction T′. Alomari et al. [2] describe how the PostgreSQL database

implements snapshot isolation by having each transaction take exclusive locks on any modified data items; and by aborting any transaction T that tries to modify a data item for which the latest version(k, [v1, v2), w) has

been modified after transaction T begun, so that v1> snap(T).

We do not presuppose that snapshot isolation be used in connection with the database structures and algorithms presented in this disserta- tion. However, we do assume that such a multiversion concurrency-control algorithm is used that allows read-only transactions to operate without blocking updating transactions and vice versa. We will use the snapshot isolation algorithm in our examples.

2.8

Structure-Modification Operations

When a database page becomes filled with data items, it needs to be split into two separate pages using a page-split operation. Similarly, when enough data items in a page have been marked as deleted, the page needs to be merged with a sibling page using a page-merge operation to main- tain an appropriate minimum number of live data items in each page. These operations are called structure-modification operations, or SMOs. In snapshot database theory, the algorithms of the ARIES family use nested top actions [64, 66] when executing structure-modification opera- tions. When a structure-modification operation begins, the log sequence number (LSN) of the last action performed prior to the SMO is recorded. All the individual operations required by the SMO are logged using nested top actions. When the SMO finishes, the Undo-Next-LSN [64] of the last nested top action is set to the LSN value recorded at the beginning of the SMO, so that the action chain determined by the Undo-Next-LSN values skips the operations performed by the SMO, if the SMO has finished. The effect of this technique is that it breaks the standard backward chaining of log records, so that partial and total rollbacks skip the nested top ac- tions. The SMOs are not undone even if the transaction that triggered the SMO aborts and is rolled back.

There are, however, some challenges with nested top actions. They often require tree latches or special structure-modification bits to be set on the database pages so that concurrent transactions can notice that a structure-modification operation is ongoing. They may also be undone if a system crash occurs before all the required nested top actions have finished. Jaluta et al. have proposed a more straightforward method for

performing structure-modification operations for database index struc- tures [39–41]. In their approach, each structure-modification operation targets a single page of the index structure, requiring modifications to be made to at most three pages at two adjacent levels (for the B+-tree

index): one parent page, and two sibling child pages. This also means that at most three different pages need to be latched at a time (again, for the B+-tree; different index structures may require different number

of pages to be latched at a time). If an SMO at a lower level causes an SMO in an upper level, the SMO at the upper level is applied before the SMO at the lower level.

Each structure-modification operation is logged using a single redo- only log record. In contrast to nested top actions, interrupted SMOs are never undone in this approach, and the index structure remains consistent after each SMO, so that each SMO transforms a structurally consistent and balanced index tree into another structurally consistent and balanced index tree. The log records must contain sufficient information of all the entries moved or copied between pages, so that the effect of the operation can be redone on any single page involved in the operation, as is required by the ARIES algorithm [66]. This approach does not require any special structure-modification bits or tree latches for applying the modification. We will use this approach to apply structure-modification operations to the index structures discussed in this dissertation.

CHAPTER 3

Multiversion Index Structures

We have now discussed the theory behind temporal databases, concentrat- ing mostly on multiversion databases. In this chapter, we describe some of the index structures used in multiversion databases. We begin the chap- ter by defining a few general properties of multiversion index structures in Section 3.1. In Section 3.2, we will demonstrate that a single-version in- dex is not an efficient structure for indexing multiversion data. To be able to properly determine the efficiency of multiversion index structures, Sec- tion 3.3 defines what we mean by an optimal multiversion index structure, and Section 3.4 lists common design ideas used in efficient multiversion indexes. In Section 3.5, we describe some of the early multiversion index structures. For a comprehensive presentation and comparison of differ- ent multiversion access methods, the reader is referred to Salzberg and Tsotras [78], and ¨Ozsoyoˇglu and Snodgrass [94]. The rest of this chapter is dedicated to different kinds of structures that have been used to index multiversion data or are otherwise related: spatial indexes (Section 3.6), hashing structures (Section 3.7), version-control systems (Section 3.8), and other structures (Section 3.9).

3.1

Properties of Multiversion Indexes

To begin, let us define what we mean by a multiversion index structure: Definition 3.1. A multiversion index structure is a transaction-time index that is partially persistent and enables efficient x/−/point queries on the data items, where x is either point , range, or ∗. The index is a collection of nodes that forms a tree or a directed acyclic graph (DAG). The nodes of the graph are fixed-size database pages. The graph contains one or possibly many root pages, which serve as starting points for search operations. Pages that have child pages are called index pages or parent

pages, and pages that do not have child pages are called leaf pages. Each page contains entries that represent either data items (called data entries, see Definition 2.7) or routers to child pages (called index entries). Page capacity B tells how many entries fit into the page. The capacity is dictated by the entry format and the page size, but for the simplicity of the theoretical discussion, we assume that the page capacity B is the same for all index and leaf pages. The data entries stored in the index may contain either the actual data stored with the key (the row in the relation), in the case of a primary or sparse index ; or a pointer to a separate storage location, in the case of a secondary or dense index. ◻

The multiversion index structure defines the way the data items are stored and accessed. Similar to the B+-tree, most often the index pages of

a multiversion index contain only index entries, and the leaf pages contain only data entries. Searches in a multiversion index follow the same logic as searches in a single-version index structure: each node has a number of child nodes, and each child page covers a more restricted area of the search space. The search spaces of sibling pages usually do not overlap, but there are exceptions. In a multiversion index, the search space is the key-version space. Each page thus covers a region in key-version space. If the multiversion index contains a single root page, then that root page covers the entire key-version space. A child page’s search-space region overlaps with the parent page’s region, and often the child page’s region is a subset of the parent page’s region. A key k that is part of version v (alive at version v) is located at the leaf page whose key-version region covers the key-version coordinate (k, v).

In multiversion index structures, the most important property to op- timize is the number of pages that an action needs to read or write to perform an action, because I/O operations on disk storage are still the most significant bottleneck in most database applications [44]. A good index structure requires a minimal number of page accesses for its actions. If a search operation on a database index requires access to m pages to locate key k, then m is normally logarithmic in the number of data items indexed by the structure, if the index is a tree structure. For analyzing the performance of index structures, we define the cost of an action:

Definition 3.2. The cost of an action or an operation is the number of index-structure pages the action needs to access (read and/or write). In the case of a sparse index, this includes all the pages the actions need to access. In the case of a dense index, this includes only the pages of the index structure itself, and not the data pages that may need to be