
In-memory database systems, NVDIMMs and data durability

Steve Graves - July 23, 2014

Application example: IMDS and Industrial Controller

In an industrial control system, integration of an IMDS within a controller supports a ‘flattened’ control system architecture in which data is stored and processed, and some control decisions occur, at the level of individual controllers; in the opposing (and traditional) hierarchical system architecture, data stored at the controller level is typically limited to control variables.

Database management system (DBMS) software is increasingly common in electronics, spurred by growing data management demands within technology ranging from communications equipment to avionics gear and industrial controllers, and facilitated by these devices’ increasing on-board CPU, RAM and storage resources. The size of on-device databases varies, ranging from a few gigabytes of data to support a telecom billing/credit system’s rating and balance management application, to 10+ GB for an IP router’s control plane database, and more than 100 GB for a telecom call routing database.

And DBMSs – once associated almost entirely with business, desktop and Web-based applications – have evolved to meet the needs of today’s electronics. Designers often turn to in-memory database systems (IMDSs), which store records in main memory, eliminating sources of latency such as caching and file management that are hard-wired into DBMSs that store data persistently on hard disk or flash (these sources of latency are shown in Figure 1, below). As a result, IMDSs perform orders of magnitude faster than traditional “on-disk” DBMSs; their simpler design minimizes demand for CPU cycles, permitting the use of less powerful – and less costly – processors.

Figure 1. Sources of latency in a traditional (on-disk) database system.

Volatility, however, is sometimes a concern. In the event of power loss or system failure, main memory’s contents are gone. Some applications can tolerate this risk. For example, a RAM-based electronic programming guide stored in a set-top box will be lost if power fails, but can be re-built quickly with information from the cable head-end or satellite transponder.

However, other electronics require a higher level of database durability and recoverability. For example, some medical devices require a record of vital signs over time, to support clinical decisions – this data can’t just vanish in the event of power failure. Network routers and switches store configuration data persistently, usually in flash. Keeping this configuration data in memory would make sense, to facilitate faster rebooting – but the data would need to be recoverable. Also challenged by DRAM’s volatility are scanners that “read” fingerprints or faces, and match these with biometric data in an on-device IMDS, in order to grant or deny access to secured facilities. If the access control system goes down, it must recover quickly.

Solutions to IMDS Volatility

Solutions have emerged to address this volatility. Non-volatile memory in the form of battery-backed RAM enables data held on a DRAM chip to survive a system power loss, but has not caught on widely, due to restrictive temperature requirements, risk of leakage, finite battery shelf life and other drawbacks.

The IMDS software itself can provide mechanisms for data durability. For example, with a transaction logging feature, the database system creates a record of transactions (groups of changes to the database that must complete or fail as units) in a log file, which can be used to restore the database after failure. But logging itself requires writes to persistent storage, and therefore carries a performance penalty.

Another IMDS feature to mitigate volatility is database replication, in which one or more standby in-memory databases on independent nodes are kept synchronized with the “master” or main database. If the master node goes down, one of these replicas takes over its role. Synchronization can take place quickly, although some latency is imposed by the processing that manages synchronization (and failover, if it occurs) and by communication between the nodes. The performance cost grows as the number of replicas or the physical distance between nodes increases.

Different replication strategies can be used to manage latency. Synchronous or “2-safe” replication requires a database transaction to complete on replica nodes concurrently with completion on the master, while asynchronous or “1-safe” replication allows transactions to commit on the main database before they’re finalized on replicas. The asynchronous approach offers shorter resource holding time and hence faster performance, but with weaker consistency and durability.
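The difference between the two strategies can be shown with a small in-process simulation. This is a sketch under simplifying assumptions: the `Master` and `Replica` classes are hypothetical, replicas are plain Python objects rather than networked nodes, and failover is not modeled. What it does show is where the commit call returns in each mode.

```python
from collections import deque


class Replica:
    """Standby copy of the database (here just a dict)."""

    def __init__(self):
        self.data = {}

    def apply(self, changes):
        self.data.update(changes)


class Master:
    """Main database that forwards committed changes to its replicas."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = deque()  # committed locally, not yet on replicas

    def commit(self, changes):
        self.data.update(changes)
        if self.synchronous:
            # 2-safe: commit() returns only after every replica has the
            # transaction, so the caller waits for all of them.
            for replica in self.replicas:
                replica.apply(changes)
        else:
            # 1-safe: commit() returns at once; replicas catch up later.
            self.pending.append(changes)

    def ship_pending(self):
        # Background step in the 1-safe case: forward queued transactions.
        while self.pending:
            changes = self.pending.popleft()
            for replica in self.replicas:
                replica.apply(changes)
```

In the asynchronous case, anything still sitting in `pending` when the master fails is missing from the replicas, which is the weaker durability mentioned above.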

NVDIMMs: Non-Volatile RAM, Minus the Battery

The emergence of non-volatile dual in-line memory modules, or NVDIMMs, adds a new tool for in-memory database durability. NVDIMMs take the form of standard memory sticks that plug into existing DIMM sockets, simplifying integration into off-the-shelf platforms. Typically they combine standard DRAM with NAND flash and an ultracapacitor power source. In normal operation, this technology provides the capabilities of high speed DRAM. In the event of power loss, the ultracapacitor provides a burst of electricity that is used to write main memory contents to the NAND flash chip, where they can be held virtually indefinitely. Upon recovery, the NVDIMM restores data from NAND flash to DRAM.

For in-memory databases, NVDIMMs’ promise is similar to that of battery-backed RAM, but without the battery and its shortcomings. McObject had previously added “hooks” enabling its eXtremeDB IMDS to work with battery-backed RAM, and was eager to try the IMDS using NVDIMMs as main memory storage. Several vendors now offer NVDIMMs. We tested eXtremeDB using the product from AgigA Tech because of our familiarity with its parent company, Cypress Semiconductor, and we limited our testing to their NVDIMMs (not testing, for example, the NVDIMMs from Viking Technology and Smart Modular Technologies) due largely to our limited time and resources. Therefore the tests described in this article amount not to a product shootout so much as to a proof-of-concept that an IMDS can operate with an NVDIMM as storage, achieve performance comparable to using conventional DRAM, and leverage the NVDIMMs’ recovery capability to restore an in-memory database that has been “lost” due to system failure.

Performance advantage

The tests addressed another question that often comes up when considering use of an IMDS in an application that requires both low latency and data recoverability, namely, to what extent will an IMDS with transaction logging retain its performance advantage over a disk-based DBMS? For these latter tests involving persistent storage (of the IMDS’s transaction log, and the entire database in the case of the on-disk DBMS) the storage “device” consisted of a RAM-disk configured using the AGIGARAM NVDIMM. The reasons for using a RAM-disk instead of a conventional hard disk drive or solid state drive are described below.

The AgigA Tech NVDIMMs used in the tests are designed for use with Intel’s Romley and Grantley platforms (taking in Sandy Bridge, Ivy Bridge, Haswell and Broadwell processor architectures). McObject used the 4GB AGIGARAM® DDR3-1600 NVDIMM in an Intel Oak Creek Canyon reference motherboard with Intel Pentium Dual Core CPU 1407 @ 2.8 GHz processor and 8 GB Kingston conventional DDR3-1333 DRAM, running Debian Linux 2.6.32.5.

The test application performed five database operations, with each loop constituting a database transaction and containing at least two instances of the operation (see Figure 2). The benchmark application recorded the number of loops accomplished per millisecond for each of the two database types (on-disk DBMS and IMDS with transaction logging, or “IMDS+TL”) and both types of memory (NVDIMM and conventional DRAM). The test application used eXtremeDB’s native C/C++ API.

Figure 2. Test application operations
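The loops-per-millisecond metric can be sketched as a small timing harness. This is not the benchmark code used in the tests: the function names `loops_per_ms` and `performance_multiple`, and the `min_seconds` parameter, are assumptions chosen for illustration. The `transaction` callable stands in for one loop of the test application.

```python
import time


def loops_per_ms(transaction, min_seconds=0.05):
    """Run `transaction` repeatedly; return completed loops per millisecond."""
    loops = 0
    start = time.perf_counter()
    while True:
        transaction()  # one loop = one database transaction
        loops += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return loops / (elapsed * 1000.0)


def performance_multiple(rate_a, rate_b):
    """How many times faster configuration A is than configuration B."""
    return rate_a / rate_b


if __name__ == "__main__":
    table = {}

    def toy_transaction():
        # Stand-in workload: two inserts into an in-memory dict.
        table[len(table)] = "row"
        table[len(table)] = "row"

    print(f"{loops_per_ms(toy_transaction):.1f} loops/ms")
```

Dividing the rate measured for one configuration by the rate for another gives the kind of performance multiple reported with the results.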

Test application code enabling database recovery leveraged an eXtremeDB capability originally added to enable its use with battery-backed RAM as storage. This feature enables a process to re-connect to an NVRAM-hosted eXtremeDB database after a system reboots, initiate any needed cleanup, and resume normal operation. An application’s recovery algorithm assumes that the memory block of the database memory device assigned as MCO_MEMORY_ASSIGN_DATABASE can be re-used after an application crash or power failure by re-opening it with the additional flag MCO_DB_OPEN_EXISTING.
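The "re-open the existing memory block" idea can be illustrated by analogy. The sketch below does not use the eXtremeDB API. It stands in for the NVDIMM with a memory-mapped file, and the names `open_region`, `increment`, `MAGIC` and `HEADER` are hypothetical. The `open_existing` argument plays a role loosely similar to the flag described above: when it is set and the region holds a valid header, the previous contents are reused rather than reinitialized.

```python
import mmap
import os
import struct
import tempfile

MAGIC = b"IMDB"                 # hypothetical marker for an initialized region
HEADER = struct.Struct("<4sI")  # magic bytes + a record counter


def open_region(path, size, open_existing):
    """Map a persistent region; reuse its contents if requested and valid."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)  # new or short file: extend with zeros
        region = mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor closes
    magic, _count = HEADER.unpack_from(region, 0)
    if not (open_existing and magic == MAGIC):
        # No usable prior state (or the caller wants a fresh database):
        # write an empty header.
        HEADER.pack_into(region, 0, MAGIC, 0)
    return region


def increment(region):
    """Bump the counter stored in the region and return the new value."""
    magic, count = HEADER.unpack_from(region, 0)
    HEADER.pack_into(region, 0, magic, count + 1)
    return count + 1
```

A process that calls `increment` twice, exits, and later calls `open_region(path, size, open_existing=True)` finds the counter still at 2. A real recovery path would follow this with a consistency check, for which the magic-bytes comparison here is only a trivial stand-in.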

Benchmark Results

Recovery from failure was tested by rebooting the test system mid-execution. When the system came back up, the test application re-started automatically, accessed the eXtremeDB database in its pre-failure state (upon recovery, the NVDIMM had loaded it from its flash into its DRAM), checked for database consistency and resumed operation, accessing the database from the same NVDIMM memory space that was used prior to the system restart.

In the tests comparing the speed of a “pure” IMDS (no transaction logging) with NVDIMM as main memory storage, to the same database configuration using conventional DRAM, any gap between the two storage types was negligible. The difference in performance on all the database operations tested – inserts, updates, deletes, index searches and table traversals – was within the margin of error for the measurement technique used. One might attribute this equivalence to the entire database being loaded into CPU cache and data access occurring from there rather than from DRAM or NVDIMM. However, at approximately 12MB, test database size greatly exceeded the 5MB CPU cache size, and the test application relied on random keys to look up random pages from the database.

Effect of transaction logging

The remaining tests focused on the effect of transaction logging on IMDS performance. IMDS vendors offer transaction logging to mitigate the volatility of “pure” in-memory data storage. However, transaction logging requires persistent storage (for the log) which could impact IMDS performance. For this reason, IMDS vendors are often asked whether their products, when deployed with transaction logging, still outperform on-disk DBMSs.

The test sought to answer this question. The “hard disk” used for persistent storage was actually a RAM disk (a memory-based analog of disk storage) using the NVDIMM as memory. This was done partly to further test AgigA Tech’s product (i.e. to confirm if it would work to create a RAM-disk and have a database system interact with it), but also to shed light on the reason why an IMDS with transaction logging outperforms an on-disk DBMS.

In-memory database systems differ from on-disk DBMSs in important ways beyond the storage devices they utilize (hard disk or solid state drive for on-disk DBMSs vs. DRAM for IMDSs). An IMDS eliminates cache management, file I/O and other sources of overhead inherent in traditional DBMS architecture. Eliminating the hard disk – replacing it with a RAM disk – eliminates overhead stemming from physical operation of the storage device, in order to highlight the latency effect of the IMDS’s streamlined design vs. the on-disk DBMS’s more complex processing.

The test showed that for insert, update and delete operations, the IMDS with transaction logging significantly outperformed the traditional on-disk DBMS (again, with both of these using a RAM-disk for their “persistent” storage). Figure 3 shows results in loops/ms for each of the configurations, as well as the performance multiple exhibited by the IMDS+TL over the on-disk DBMS. For example, in the test of database deletes, the IMDS+TL was 12.77 times faster than the on-disk DBMS. Figure 3 also shows the performance impact of turning off transaction logging and having eXtremeDB perform the operations as a “pure” IMDS using the NVDIMM as main memory storage.

Figure 3. Results

Database index searches and table traversals showed little to no performance change when moving from on-disk DBMS to IMDS+TL. This result was expected, because such database “reads” do not change database contents, and are typically much less costly, in performance terms, than insert, update and delete operations.

Discussion

NVDIMMs match the speed of conventional DRAM when used as IMDS storage, while delivering full in-memory database durability. Why, then, would anyone consider using an IMDS with latency-inducing transaction logging? There are several reasons, including cost, since GB for GB, NVDIMMs cost more than DRAM; a desire to use platforms other than Intel’s Romley and Grantley; and required database size (AgigA Tech’s NVDIMMs support up to a 128GB total memory size). As shown in the numbers presented above, adding transaction logging to achieve data durability slows IMDS performance, but the IMDS+TL combination still outperforms a traditional on-disk DBMS for insert, update and delete operations.

Another question for prospective users is whether their chosen in-memory database system supports the use of NVDIMMs as main memory storage. As mentioned above, McObject’s eXtremeDB IMDS includes features – added early in the product’s development to support its interaction with battery-backed RAM – that enabled database recovery to occur seamlessly with NVDIMMs. Using an IMDS without such features may entail more complexity, with significant development and testing required before reaching a workable solution.

It should also be noted that the database durability discussed in this article – that is, the assurance that the database and all committed transactions can be recovered in the event of system failure – differs from high availability, or the ability to operate without downtime. While both techniques aim to enable databases to withstand failure, high availability is usually achieved via replication, as described above, with failover time measured in milliseconds. In contrast, durability – achieved in IMDSs with transaction logging or use of NVDIMMs as main memory storage – carries no such guarantee of eliminating downtime. Database recovery using either NVDIMMs or transaction logging is usually automated, but the most likely usage scenario for either is following an unexpected system shutdown, which implies a cold re-start (e.g. re-booting), a minutes-long process. Developers should understand the distinction between database high availability and durability when considering techniques to tame volatility.
