HHB+tree Index for Functional Enhancement of NAND Flash Memory-Based Database

Academic year: 2021

Huijeong Ju¹ and Sungje Cho²

¹,²Department of Education, Dongbang Culture Graduate University
60, Seongbuk-ro 28-gil, Seongbuk-gu, Seoul, 136-823, South Korea

[email protected], [email protected]

Abstract

Unlike disk-based databases, memory-based databases mostly run on small-capacity memory and NAND flash memory so that data can be processed quickly in low-power, battery-driven environments. NAND flash memory is robust against physical shock and offers fast access, but its performance degrades because it cannot update data in place, which slows processing. To address this performance degradation, the hybrid hash index technique was suggested; it uses an extendible hash index to minimize split operations on overflow buckets. When a split would otherwise occur, the technique immediately allocates an overflow bucket, which reduces the number of additional operations. However, this technique has a performance problem of its own: search and update operations on the additional overflow buckets proceed sequentially, as if processing a sequential file. This study therefore proposes a hybrid hash B+tree index technique that improves performance by storing overflow bucket information in a B+tree index. Finally, the paper shows through experiments and analysis that the proposed technique compares favorably with the previous one.

Keywords: B+tree index, Hybrid Hash Index, NAND Flash Memory, Overflow Bucket

1. Introduction

Flash memory, a type of nonvolatile memory, is small and consumes little power, which makes it well suited to portable electronics. It can also store a large amount of data relative to its size and allows rapid access to data [1, 2]. Among the types of flash memory, NAND flash memory allows direct input and output without loading data into main memory, and it is robust against external shock [2, 3]. However, NAND flash memory writes more slowly than it reads, because it can store data only in initialized (erased) blocks, and when no initialized block is available it needs extra time to initialize one [4]. Moreover, whenever an update occurs, it must erase the selected block before writing the new data, instead of updating in place as a hard disk does, so updates take longer [5].

To compensate for these characteristics of NAND flash memory, the FTL (Flash Translation Layer), which allows file systems designed for disks to be used on NAND flash memory, has been studied. The FTL spreads usage evenly across the NAND flash memory, and it also handles address remapping for update operations and the different processing times of read, write and update operations [5]. However, if update operations occur in succession or frequently within a certain period, the overall performance of NAND flash memory degrades, because each requested in-place update has to be carried out as an erase followed by a write.

To solve these problems, the hybrid hash index technique was proposed [6]. This technique reduces the number of NAND flash memory operations by using a hybrid hash index to allocate an overflow bucket and delay split operations instead of performing them immediately. As a result, the performance degradation problem is addressed with fewer additional read and write operations.

This paper proposes the Hybrid Hash B+tree index, which manages the overflow section separately with a B+tree in order to speed up the search, insertion, update and deletion operations of the previously studied hybrid hash index technique.

The paper is organized as follows. Section 2 describes related studies, and Section 3 explains the Hybrid Hash B+tree index proposed in this paper. Section 4 compares the proposed technique with the existing one through a simulation. Finally, Section 5 concludes the paper.

2. Related Studies

2.1. Circular Hashing Model [7]

The Circular Hashing model adapts earlier hash index techniques to the characteristics of NAND flash memory. It improves performance by minimizing the use of overflow chains and by making use of the free space inside the bucket list. When an insertion targets a bucket that is already full, Circular Hashing does not allocate an additional overflow bucket but looks for another bucket in which to complete the operation. This search is repeated at most O(log N) times, where N is the total number of buckets, and, like other hashing techniques, the model sets a threshold on the number of stored keys at which buckets are split. During a split, all overflow chains are eliminated before the keys are re-entered into the bucket list. A bucket is split as a whole, so existing buckets are not modified. This reduces the occurrence of overflow buckets and therefore the number of null pages. Furthermore, a modulo-N function is used as the hash function to make effective use of the circular structure. Circular Hashing thus addresses the performance degradation of these operations by minimizing the occurrence of overflow buckets [7]. However, it still suffers from an increased number of erase operations, because the overflow chain is used frequently during bucket splitting and pages are nullified often.
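As a rough illustration of the probing behaviour described above, the following Python sketch inserts a key with a modulo-N hash and, when the home bucket is full, tries further buckets around the circle. The probe order (the next buckets in sequence), the probe limit of ceil(log2 N) + 1, and the names circular_insert and capacity are assumptions made for illustration only; [7] may define them differently.

```python
import math


def circular_insert(buckets, key, capacity):
    """Insert key into the first non-full bucket reached by circular probing.

    Assumptions (not taken from the paper): probing visits the next buckets
    in sequence, and at most ceil(log2(N)) + 1 buckets are examined.
    Returns the index of the bucket that received the key, or None when all
    probed buckets are full (a caller would then trigger a split).
    """
    n = len(buckets)
    max_probes = max(1, math.ceil(math.log2(n))) + 1
    home = key % n  # modulo-N hash function
    for step in range(max_probes):
        idx = (home + step) % n  # wrap around the bucket list
        if len(buckets[idx]) < capacity:
            buckets[idx].append(key)
            return idx
    return None


buckets = [[] for _ in range(4)]
placed = [circular_insert(buckets, k, capacity=2) for k in (0, 4, 8, 12, 16, 20, 24)]
```

In this run every key hashes to bucket 0, so later keys spill into buckets 1 and 2 without any overflow bucket being allocated, and the last key is rejected once the probe limit is reached.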

2.2. Flash Memory-based B+tree for Effective Range Search [8]

When a B+tree is applied to NAND flash memory, information is recorded in a separate area, so range search over the leaf (terminal) nodes is not possible, and this leads to excessive in-place updates. The flash memory-based B+tree technique uses the p-node to solve this problem. A p-node stores a parent node and its leaf nodes in the same block, together with information on any related update operations. When nodes belonging to a p-node are first loaded into memory, the whole p-node block is read, so B+tree consistency is kept even without a separate step for updating the parent node. This prevents node updates from propagating bottom-up and helps use memory space effectively. Moreover, because leaf nodes are stored contiguously, range search over the leaf node links becomes possible, and internal fragmentation is eliminated by handling leaf nodes in byte units [8]. However, this technique must still search through all nodes when the parent node is not cached, and a shortage of updated nodes or an overflow of leaf nodes tends to trigger initialization of the p-node block, which degrades performance.

2.3. NAND Flash Memory-based Hybrid Hash Index [6]

records when performing insertion, deletion and update operations, by reducing write and erase operations. When an overflow occurs, the hybrid hash index reduces the number of additional split operations by performing a merge operation, or a split operation after merging. Likewise, when performing a deletion, it does not delete the selected record from the index directly; it inserts a deletion key into a bucket and carries out the actual deletion later, when a merge operation or a split after merging is performed. Delaying direct deletions in this way alleviates the performance degradation caused by in-place updates. For a search, it locates the bucket address using the hash value calculated by the hash function and starts from the bucket in which the latest record was inserted, which shortens the search time [6]. However, the NAND flash memory-based hybrid hash index searches the records of buckets allocated in the overflow section sequentially, which lengthens the search time.

3. HHB+tree Index with B+tree

3.1. Fundamental Idea

NAND flash memory processes transactions on records in units of blocks. Because NAND flash memory cannot overwrite data, existing data must be erased before data can be added to a block, and deleting a single record in a block requires erasing the whole block. Erase operations therefore take a long time, a disadvantage that stems from the impossibility of in-place updates [6, 12-14]. One of the techniques proposed to solve this problem is the hybrid hash index technique using overflow buckets [6]. When an overflow occurs, this technique allocates an overflow bucket to each bucket instead of performing a split immediately, which reduces the cost of additional operations [6]. Because the overflow buckets allocated to each bucket are processed like sequential files, the number of additional erase operations drops, but it takes a long time to find where data is stored when further operations follow the reads and writes. This study therefore proposes the Hybrid Hash B+tree index (HHB+tree index), which uses a B+tree to shorten the time needed to process transactions on the overflow section that would otherwise be handled like sequential files.
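The cost of the missing in-place update can be made concrete with a deliberately naive model. The Python sketch below changes one page of a block by reading every page, erasing the block, and writing everything back, while counting the operations. The function name update_in_block and the counters dictionary are hypothetical, and no FTL remapping is modelled, so this is an upper-bound illustration rather than a description of how a real device behaves.

```python
def update_in_block(block, page_index, new_value, counters):
    """Change one page of a NAND block using erase-before-write.

    Naive model (an assumption for illustration): all pages are read,
    the block is erased once, and all pages are written back.
    """
    survivors = list(block)          # read every page that must survive
    counters["read"] += len(block)
    counters["erase"] += 1           # erase works on whole blocks only
    survivors[page_index] = new_value
    block[:] = survivors             # write all pages back
    counters["write"] += len(block)


block = list(range(32))              # one block of 32 pages
counters = {"read": 0, "write": 0, "erase": 0}
update_in_block(block, 5, "new", counters)
```

Under this model a single-page change in a 32-page block costs 32 reads, 32 writes and one erase, which is the kind of overhead that delaying splits and deletions is meant to avoid.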

The following [Diagram 1] presents the overview of Hybrid Hash B+tree index.

The HHB+tree index consists of a hybrid hash index and a B+tree that manages the overflow section. The hybrid hash index performs search, insertion and deletion through extendible hashing, whereas operations on overflow buckets go through the B+tree index.

When an insertion occurs, the hybrid hash index is first used to find a bucket that can accept the record. If no bucket is available for the insertion, a bucket from the overflow section is allocated. In that case, the B+tree built over the overflow section is used to locate the bucket.

In the proposed technique, the B+tree that manages the buckets of the overflow section stores only key values in its internal nodes, and stores the key value and the address of the corresponding bucket in its leaf nodes. This allows efficient use of memory space and, because the leaf nodes are linked in sequence, supports both direct and range processing [9-11].
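The leaf-level layout described above can be sketched in Python without building a full B+tree. In the sketch below, a sorted key array searched with bisect stands in for the routing done by the internal nodes, and a parallel array holds the bucket address stored next to each key in the leaves. The class name OverflowDirectory and its methods are hypothetical; node splitting and page layout are not modelled.

```python
import bisect


class OverflowDirectory:
    """Stand-in for the B+tree leaf level: sorted keys with bucket addresses."""

    def __init__(self):
        self.keys = []       # sorted key values (what internal nodes route on)
        self.addresses = []  # addresses[i] is the overflow bucket of keys[i]

    def put(self, key, bucket_address):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.addresses[i] = bucket_address   # key already present: update
        else:
            self.keys.insert(i, key)
            self.addresses.insert(i, bucket_address)

    def get(self, key):
        """Direct lookup: return the bucket address for key, or None."""
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.addresses[i]
        return None

    def range(self, low, high):
        """Range lookup: (key, address) pairs with low <= key <= high."""
        lo = bisect.bisect_left(self.keys, low)
        hi = bisect.bisect_right(self.keys, high)
        return list(zip(self.keys[lo:hi], self.addresses[lo:hi]))


directory = OverflowDirectory()
for key, address in [(30, 2), (10, 0), (20, 1)]:
    directory.put(key, address)
```

Because the entries stay in key order, a range query is a single contiguous slice, which mirrors walking the linked leaf nodes of a B+tree.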

3.2. Insertion Operation

An insertion first uses the hash function to find the bucket in which the record should preferably be stored. If no bucket is available, the B+tree is used to allocate an overflow bucket. The insertion procedure is as follows.

First, an available bucket is looked up using the hash value produced by the hash function.

Second, if the bucket selected for insertion is full, the B+tree that manages the overflow section is accessed to allocate an overflow bucket.

Third, the B+tree is used to find a bucket with free space for the data. If the selected bucket has free space, the key value is inserted and the insertion is complete.

Fourth, if there is no free space for the key value, an additional block is allocated and the key value is stored in the free space of the last bucket.
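The four insertion steps can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: a plain dict stands in for the B+tree over the overflow section, a modulo hash is used, only the most recently allocated overflow bucket is checked for free space, and the names HHBIndex, overflow_index and capacity are hypothetical.

```python
class HHBIndex:
    """Toy hybrid hash index with a separately indexed overflow section."""

    def __init__(self, num_buckets=4, capacity=2):
        self.capacity = capacity
        self.primary = [[] for _ in range(num_buckets)]
        self.overflow = []        # overflow buckets in allocation order
        self.overflow_index = {}  # key -> overflow bucket number (B+tree stand-in)

    def insert(self, key):
        # Step 1: find the home bucket from the hash value.
        home_no = key % len(self.primary)
        home = self.primary[home_no]
        if len(home) < self.capacity:
            home.append(key)
            return ("primary", home_no)
        # Steps 2 and 3: the home bucket is full, so use the overflow section
        # and look for an overflow bucket that still has free space.
        if not self.overflow or len(self.overflow[-1]) >= self.capacity:
            # Step 4: no free space anywhere, so allocate a new bucket.
            self.overflow.append([])
        address = len(self.overflow) - 1
        self.overflow[address].append(key)
        self.overflow_index[key] = address  # remember where the key went
        return ("overflow", address)


index = HHBIndex(num_buckets=4, capacity=2)
locations = [index.insert(k) for k in (0, 4, 8, 12, 16, 1)]
```

Note that no split is performed when the home bucket fills up; the key is simply redirected to the overflow section and its location is recorded in the index.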

3.3. Deletion Operation

The procedure for deleting a record whose key value is stored in the HHB+tree index structure is as follows.

First, to avoid a direct erase operation on the selected bucket, the deletion information is recorded in a separate bucket.

Second, once a split operation or a commit operation is processed, the deletions recorded in the separate bucket are actually carried out.

This process delays erase operations on NAND flash memory as long as possible, which improves system performance.
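The deferred deletion described above can be illustrated with a small Python sketch. The class DeferredDeleteStore, its delete_log set and the assumption that a whole batch of deletions costs a single erase are simplifications introduced here for illustration; the paper does not specify these details.

```python
class DeferredDeleteStore:
    """Toy store that logs deletions and applies them only at commit time."""

    def __init__(self):
        self.records = {}        # key -> value, stands in for the data buckets
        self.delete_log = set()  # deletion records kept in a separate bucket
        self.erase_count = 0     # physical erase operations performed so far

    def put(self, key, value):
        self.records[key] = value
        self.delete_log.discard(key)  # a re-inserted key is live again

    def delete(self, key):
        # Step 1: only record the deletion; nothing is erased yet.
        if key in self.records:
            self.delete_log.add(key)

    def get(self, key):
        if key in self.delete_log:    # logically deleted
            return None
        return self.records.get(key)

    def commit(self):
        # Step 2: apply every logged deletion in one batch.
        if not self.delete_log:
            return 0
        applied = len(self.delete_log)
        for key in self.delete_log:
            del self.records[key]
        self.delete_log.clear()
        self.erase_count += 1         # assumption: one erase per batch
        return applied


store = DeferredDeleteStore()
store.put(1, "a")
store.put(2, "b")
store.delete(1)
erases_before_commit = store.erase_count
visible_before_commit = store.get(1)
applied = store.commit()
```

Until commit is called the deleted record is hidden from lookups but still physically present, so several deletions can share the cost of one erase.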

3.4. Search Operation

The search procedure on the HHB+tree index is as follows.

First, the record holding the key value is looked up using the hash value obtained from the hash function.

Second, if the record is not found through the hash value, the B+tree that manages the overflow section is accessed to continue the search.

Third, because the desired key value is likely to be in a recently allocated bucket, the search starts from the most recently allocated bucket.
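To show why indexing the overflow section helps, the Python sketch below compares a newest-first sequential scan of the overflow buckets with a lookup through an index. A plain dict stands in for the B+tree, the function names are hypothetical, and the bucket-read counts ignore the cost of reading index nodes, so the numbers are only indicative.

```python
def search_newest_first(key, overflow_buckets):
    """Scan overflow buckets from the most recently allocated one.

    Returns (bucket address or None, number of buckets read).
    """
    reads = 0
    for address in range(len(overflow_buckets) - 1, -1, -1):
        reads += 1
        if key in overflow_buckets[address]:
            return address, reads
    return None, reads


def search_indexed(key, primary, overflow_index):
    """Try the hash bucket first, then the overflow index (dict stand-in).

    Returns (location or None, number of data buckets read).
    """
    home_no = key % len(primary)
    if key in primary[home_no]:
        return ("primary", home_no), 1
    address = overflow_index.get(key)
    if address is None:
        return None, 1
    return ("overflow", address), 2


primary = [[0, 4], [1], [], []]
overflow = [[8, 12], [16], [20, 24]]
overflow_index = {8: 0, 12: 0, 16: 1, 20: 2, 24: 2}

scan_result = search_newest_first(8, overflow)
indexed_result = search_indexed(8, primary, overflow_index)
```

With three overflow buckets, the sequential scan reads all three to find key 8 in the oldest one, while the indexed search reads the home bucket and then exactly one overflow bucket.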

4. Performance Evaluation

This section evaluates the performance of the previously suggested hybrid hash index and of the HHB+tree index proposed in this paper. The hybrid hash index searches overflow buckets sequentially, whereas the HHB+tree index uses a B+tree to search them.

4.1. Measuring the Test Performance

To analyze the performance of the hybrid hash index and the HHB+tree index, both indexes were run on top of a simulator that counts flash memory operations. The simulator models a small-block SLC NAND flash memory in which each block consists of 32 pages, so that 32 key values are stored per record. To measure the index generation time, records with 10,000 key values were inserted and the read, write and erase operations were counted. To obtain the final index generation time, the read, write and erase latencies of the flash memory were set to 25 μs, 200 μs and 1.5 ms, respectively.
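Given operation counts from such a simulator, the generation time follows directly from the three latencies stated above. The Python sketch below assumes the total is a simple weighted sum of the counts; the function name generation_time_us and the example counts are hypothetical and are not measurements from the paper.

```python
READ_US = 25      # read latency in microseconds
WRITE_US = 200    # write latency in microseconds
ERASE_US = 1500   # erase latency: 1.5 ms expressed in microseconds


def generation_time_us(reads, writes, erases):
    """Total index generation time in microseconds for the given counts."""
    return reads * READ_US + writes * WRITE_US + erases * ERASE_US


# Hypothetical counts, chosen only to show the arithmetic:
# 100 * 25 + 10 * 200 + 1 * 1500 = 2500 + 2000 + 1500 microseconds.
example_total = generation_time_us(reads=100, writes=10, erases=1)
```

Since one erase costs as much as 60 reads under these latencies, reducing erase operations has the largest effect on the computed generation time.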

4.2. Comparative Analysis and Performance Evaluation

The index generation times were measured and compared by counting the search, insertion and erase operations that occur when a hybrid hash index or an HHB+tree index is generated on flash memory. For this purpose, 10,000 random key values were inserted under varying update and deletion ratios, the search, insertion and erase operations were counted, and the final generation time was calculated.

Figure 2. Comparison of the Number of Operations Depending on the Update and Deletion Ratio

The hybrid hash index performs a split operation whenever an overflow occurs, so it requires more read, write and erase operations than the HHB+tree index. In addition, the higher the update and deletion ratio of the input trace, the more in-place updates occur. The hybrid hash index cannot avoid the erase-before-write process whenever an in-place update occurs, so additional write and erase operations are required.

In the hybrid hash index, the split operations of all overflow buckets occur at once; as a result, the number of write and erase operations is lower for the HHB+tree index than for the hybrid hash index.

5. Conclusion

This paper proposed the HHB+tree index, which is designed to improve on the previous technique for NAND flash memory-based memory systems. The HHB+tree index uses a B+tree index to manage the overflow bucket section, which reduces the time spent on write and erase operations caused by frequent updates and deletions of records during overflow, as well as the time spent on split operations. Moreover, the B+tree enables range search through the links between leaf nodes and allows efficient use of space. Future work will study how to implement the proposed index structure more effectively and stably in order to further improve the performance of NAND flash memory.

References

[1] M. J. Moon, H. C. Roh and S. H. Park, “Database-based Flash Memory File System for Mobile Devices”, Korea Computer Congress, vol. 36, no. 1, (2009), pp. 49-52.

[2] E. D. Hwang and J. H. Cha, “A Recovery Mechanism applying the Shadow-Paging technique to Flash Memory based LFS”, Korea Computer Congress, vol. 31, no. 2, (2004), pp. 199-201.

[3] K. Atsuo, N. Shingo and M. Hiroshi, “A Flash-Memory Based File System”, Proceedings of the 1995 USENIX Technical Conference.

[4] D. Brian and L. Markus, “Designing with Flash Memory”, Annabooks, (1993).

[5] Y. H. Bae, “Design of A High Performance Flash Memory-based Solid State Disk”, Journal of Computing Science and Engineering, vol. 25, no. 6, (2007), pp. 18-28.

[6] M. H. Yoo, B. K. Kim and D. H. Lee, “Hybrid Hash Index for NAND Flash Memory-based Storage Systems”, Database, vol. 2, no. 39, (2012), pp. 120-128.

[7] D. Y. Han and K. S. Kim, “A Circular Hashing Index for Flash Memory Storage”, Korea Computer Congress, vol. 39, no. 1, (2012), pp. 180-182.

[8] S. C. Lim and C. S. Park, “A Flash Memory B+Tree for Efficient Range Searches”, JKCA, vol. 13, no. 9, (2013), pp. 28-38.

[9] R. Bayer and E. McCreight, “Organization and Maintenance of Large Ordered Indexes”, Acta Informatica, vol. 1, (1972), pp. 173-189.

[10] D. Comer, “The Ubiquitous B-Tree”, ACM Computing Surveys, vol. 11, no. 2, (1979), pp. 121-137.

[11] D. Knuth, “The Art of Computer Programming”, Addison-Wesley Publishing Co. Inc., (1973).

[12] C. H. Wu, T. W. Kuo and L. P. Chang, “An efficient B-tree Layer Implementation for Flash Memory Storage Systems”, ACM Transactions on Embedded Computing Systems, vol. 6, no. 19, (2007), pp. 1-20.

[13] J. H. Nam and D. J. Park, “Design and Implementation of the B-Tree on Flash Memory”, Korea Information Science Society, vol. 34, no. 2, (2007), pp. 109-118.

[14] H. J. Ju and S. J. Cho, “A Hybrid B+tree Hash Index for Efficiency Improvement in a NAND Flash Memory”, Advanced Science and Technology Letters, vol. 108, (2015).

Authors

Huijeong Ju, Department of Education, Dongbang Culture Graduate University, Seoul, Korea. Her current research areas are Database, Mobile and SQLite.

Sungje Cho, Professor in Education, Department of Education, Dongbang Culture Graduate University, Seoul, Korea. His current research areas are Cultural Contents, Multimedia Education Method and Information Security.
