PLP on Future Hardware and Conclusions

Unlike conventional systems, which either embrace fully shared-everything or shared-nothing philosophies, physiological partitioning takes the best features of both to produce a hybrid system that operates nearly latch- and lock-free, while still retaining the convenience of a common underlying storage pool and log. We achieve this result with a new multi-rooted B+Tree structure and careful assignment of threads to data.

As multicore hardware trends evolve, PLP becomes increasingly attractive for several reasons. Conventional OLTP is ill-suited to modern and upcoming hardware because:

• the code of an OLTP system is full of unbounded critical sections [96,99],

• the access patterns are so unpredictable [174] that even the most advanced prefetchers fail to detect data access patterns for a transaction [175],

• the majority of the accesses are shared read-write; hence, they under-perform on caches with non-uniform access latency [20,69,70].

As we have seen, PLP, combined with previous advances in logging, addresses all three problems. The majority of the unbounded critical sections are completely eliminated, access patterns are regularized by the thread assignments, and threads no longer share data to communicate, eliminating the shared R/W problem. This regularity will become increasingly important as hardware continues to make more and more demands of the software. OLTP will only be able to utilize these new architectures effectively if it can eliminate the majority of accesses that are shared among multiple processors. In short, by eliminating a large class of unbounded/unscalable communication, PLP leaves OLTP engines much better poised to take advantage of the upcoming hardware, whatever form it may take.

4 Dynamic Load Balancing for PLP

Partitioning is an increasingly popular solution for scaling up the performance of database management systems even within a single (multicore or multisocket) machine. However, it is not a panacea, since there are many challenges associated with it. This chapter focuses on one of the most troublesome challenges for partitioning-based transaction processing systems: their behavior under skewed and dynamically changing workloads. Such workloads are the norm rather than the exception and are highly problematic for statically partitioned systems.

We demonstrate the non-optimal performance of single-node partitioning-based transaction processing systems and analyze the costs and challenges toward robust and efficient dynamic load balancing mechanisms for such systems. This analysis highlights that physiologically-partitioned (PLP) shared-everything online transaction processing systems offer a good infrastructure for lightweight repartitioning. Based on this observation, we propose a dynamic load balancing mechanism (called DLB) specialized for the PLP design. Evaluation on different multicore machines shows that the overhead of DLB is low in normal operation (in the worst case at most 8%), while it enhances the system with robust behavior achieving very low response times in both detecting and handling load imbalances.¹

4.1 Introduction

Database management systems need to provide enough execution parallelism to exploit modern multicore and multisocket hardware. Unfortunately, achieving high execution parallelism is not easy, even for transaction processing workloads, which are characterized by high concurrency at the request level. In particular, conventional transaction processing results in complicated and unpredictable access patterns [145]. In order for the system to maintain the consistency of the data shared by the parallel processes, it needs to employ synchronization points, which form critical sections that serialize transaction execution. Critical sections not only hurt single-thread performance, especially in transaction processing workloads [76], but they also quickly become scalability bottlenecks [96].

The common solution for improved scalability is to either remove critical sections completely or reduce the contention on them. A very popular technique to achieve that is partitioning. The database is broken into multiple partitions and the data that belong to one partition are operated on by just one worker thread. As a result, the number of threads that share some part of the data is reduced along with the contention on the critical sections that protect that data. If only a single thread accesses each partition, then the need for critical sections is eliminated [107,145,179].
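The single-owner idea described above can be illustrated with a minimal sketch. It is not the design of any particular system: the class `PartitionedStore`, the range boundaries, and the `add` operation are hypothetical names chosen for illustration. Each partition has one request queue and one worker thread, and only that worker ever touches the partition's data, so the data itself needs no lock.

```python
import bisect
import queue
import threading


class PartitionedStore:
    """Illustrative range-partitioned store with one owner thread per partition."""

    def __init__(self, boundaries):
        # Assumption: boundaries[i] is the smallest key of partition i + 1.
        self.boundaries = list(boundaries)
        n = len(self.boundaries) + 1
        self.data = [dict() for _ in range(n)]  # each dict is touched only by its owner
        self.queues = [queue.Queue() for _ in range(n)]
        self.workers = [
            threading.Thread(target=self._run, args=(pid,), daemon=True)
            for pid in range(n)
        ]
        for worker in self.workers:
            worker.start()

    def partition_of(self, key):
        # Route a key to the partition whose range contains it.
        return bisect.bisect_right(self.boundaries, key)

    def _run(self, pid):
        local = self.data[pid]
        while True:
            item = self.queues[pid].get()
            if item is None:  # shutdown signal
                break
            key, delta, done = item
            # No lock is taken here: this thread is the only one that accesses `local`.
            local[key] = local.get(key, 0) + delta
            done.set()

    def add(self, key, delta):
        # Enqueue the request at the owning partition; the caller may wait on the event.
        done = threading.Event()
        self.queues[self.partition_of(key)].put((key, delta, done))
        return done

    def close(self):
        for q in self.queues:
            q.put(None)
        for worker in self.workers:
            worker.join()
```

A caller would submit requests with `store.add(key, delta)` and wait on the returned event. Note that the request queues are still synchronized internally; what the sketch removes is locking on the partition data.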

If configured correctly, a partitioned database system (shared-nothing [45,179] or shared-everything [145], Chapter 3) can perform better than corresponding non-partitioned systems. Achieving high performance, however, is not a simple task when running realistic, dynamically changing workloads. Depending on the access patterns, the load of each partition might be different. Skewed access patterns can lead to load imbalance and reduce or eliminate any benefits due to partitioning. Therefore, system designers need to be careful in order to benefit from and not to be hindered by partitioning.
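To make the effect of skew concrete, the following sketch counts accesses per partition under equal-width range partitioning and reports how far the busiest partition is above the average. The function names, the key space of 1000 integer keys, and the synthetic 80/20 access trace are all assumptions made for illustration; they are not taken from the evaluation in this chapter.

```python
from collections import Counter


def range_partition(key, num_keys, num_partitions):
    """Map an integer key in [0, num_keys) to one of num_partitions equal-width ranges."""
    return key * num_partitions // num_keys


def partition_load(accesses, num_keys, num_partitions):
    """Count how many accesses land on each partition."""
    counts = Counter(range_partition(k, num_keys, num_partitions) for k in accesses)
    return [counts.get(p, 0) for p in range(num_partitions)]


def imbalance(load):
    """Ratio of the busiest partition's load to the average load (1.0 means balanced)."""
    return max(load) / (sum(load) / len(load))


# Uniform trace: every key is accessed exactly once.
uniform = list(range(1000))

# Hypothetical skewed trace: 800 accesses hit the first 200 keys,
# and the remaining 200 accesses are spread over the other 800 keys.
skewed = [i % 200 for i in range(800)] + [200 + 4 * i for i in range(200)]

uniform_load = partition_load(uniform, num_keys=1000, num_partitions=4)
skewed_load = partition_load(skewed, num_keys=1000, num_partitions=4)
```

With four partitions, the uniform trace spreads evenly, while the skewed trace sends most requests to the first partition, whose single worker thread becomes the bottleneck while the others sit mostly idle.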

There are two orthogonal ways to attack the problem of skewed access in partitioning-based transaction processing systems:

• proactively, by configuring the system with an appropriate initial partitioning scheme, and

• reactively, by using a dynamic balancing mechanism.

Starting with the appropriate partitioning configuration is key. If the workload characteristics are known a priori, previously proposed techniques [39,164] can be used to create effective initial configurations. If the workload characteristics are not known, then simpler approaches like round-robin, hash-based, and range-based partitioning [45] would work. As time progresses, however, skewed access patterns gradually lead to load imbalance during execution. The initial configuration eventually becomes useless no matter how carefully it is chosen. Thus, it is far more important and challenging to dynamically balance the load through repartitioning based on the observed, and ever changing, access patterns. A robust dynamic load balancing mechanism should eliminate any bad choices that might be made during initial assignments.

In this chapter, we focus on dynamic load balancing and online repartitioning in the context of partitioned database management systems within a single node. After a thorough comparison of different partitioning mechanisms in terms of their repartitioning costs, we design a lightweight yet effective dynamic load balancing and repartitioning mechanism, called DLB, for physiologically-partitioned (PLP) OLTP systems. To collect information about the current access patterns and load in a workload, DLB uses the existing request queues of the partitions and employs a new data structure, called an aging two-level histogram. These structures help in observing recent load and data access patterns across and within partitions. DLB also exploits the multi-rooted B+Tree (MRBTree) index structure that is at the core of PLP (Chapter 3) for efficient reorganization of partitions.
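As a rough illustration of what an aging two-level histogram could look like, consider the sketch below. It is only one possible reading of the name, under stated assumptions: the top level keeps one counter per partition, the second level splits each partition's key range into equal-width sub-buckets, and aging multiplies all counters by a decay factor so that recent accesses dominate. The class name, the `sub_buckets` and `decay` parameters, and the method names are hypothetical; the structure actually used by DLB may differ.

```python
import bisect


class AgingTwoLevelHistogram:
    """Illustrative histogram: per-partition counts plus per-sub-range counts, with decay."""

    def __init__(self, boundaries, sub_buckets=4, decay=0.5):
        # Assumption: integer keys; partition i covers [boundaries[i], boundaries[i + 1]).
        self.boundaries = list(boundaries)
        self.sub_buckets = sub_buckets
        self.decay = decay
        n = len(self.boundaries) - 1
        self.top = [0.0] * n                                # load across partitions
        self.sub = [[0.0] * sub_buckets for _ in range(n)]  # load within each partition

    def _locate(self, key):
        p = bisect.bisect_right(self.boundaries, key) - 1
        if not 0 <= p < len(self.top):
            raise KeyError(key)
        lo, hi = self.boundaries[p], self.boundaries[p + 1]
        s = (key - lo) * self.sub_buckets // (hi - lo)
        return p, s

    def record(self, key):
        # Count one access at both levels.
        p, s = self._locate(key)
        self.top[p] += 1
        self.sub[p][s] += 1

    def age(self):
        # Scale down all counters so that older accesses weigh less than recent ones.
        self.top = [c * self.decay for c in self.top]
        self.sub = [[c * self.decay for c in row] for row in self.sub]

    def hottest(self):
        # Return (partition, sub-bucket) with the highest recent load.
        p = max(range(len(self.top)), key=self.top.__getitem__)
        s = max(range(self.sub_buckets), key=self.sub[p].__getitem__)
        return p, s
```

In such a design, the top level would indicate which partition is overloaded, and the second level would suggest where inside that partition a new boundary could be placed. Calling `age()` periodically lets the histogram track changing access patterns rather than the whole history.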