Architecture Based Materialized View Evolution: A Review


Procedia Computer Science 48 ( 2015 ) 256 – 262

1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of scientific committee of International Conference on Computer, Communication and Convergence (ICCC 2015) doi: 10.1016/j.procs.2015.04.179

ScienceDirect

International Conference on Intelligent Computing, Communication & Convergence (ICCC-2014)

Conference Organized by Interscience Institute of Management and Technology, Bhubaneswar, Odisha, India

Architecture Based Materialized View Evolution: A Review

Anjana Gosain^a, Sangeeta Sabharwal^b, Rolly Gupta^c,*

^a Professor, USICT, Guru Gobind Singh Indraprastha University, Delhi, India

^b Professor, NSIT, Delhi University, Delhi, India

^c Research Scholar, NSIT, Delhi University, Delhi, India

Abstract

Data warehouse (DW) evolution is a critical problem today because of continuous transactions and structural changes driven by users' continually evolving requirements. Handling all types of change properly is crucial, as the DW forms the core component of a modern decision support system (DSS). The DW therefore has to be updated periodically in line with the different types of evolution of its information sources. The problem of evolving an appropriate set of views is referred to as the materialized view evolution problem. Many materialized view evolution methods have been proposed in the literature to address it. This paper surveys these methods, studying materialized view evolution in relational databases and data warehouses as well as in a distributed setting. It frames the materialized view evolution problem by identifying the three main dimensions on which the classification of materialized view evolution methods is based: (i) Framework, (ii) Architecture and (iii) Model/Design Model. The study reviews architecture based materialized view evolution methods and identifies their respective potential and limits.

Keywords: Architecture; View Maintenance; Materialized view evolution

* Corresponding author. Tel.: 0-98-99-148-597

E-mail address:[email protected]

1. Introduction

Materialized views act as a data cache that gathers information from distributed databases and supports fast, reliable access to already computed intermediate result sets (i.e. responses to queries). Evolution in a data warehouse may be triggered by changes in schema, changes in software and changes in data warehouse requirements. Materialized view evolution approaches focus on choosing materialized views during the design of a data warehouse, on maintaining a materialized view in response to changes in the data or in the data sources, and sometimes on monitoring DW quality under schema evolution. Whenever an underlying base relation is modified, the corresponding materialized view also evolves in reaction to those changes so that it can present quality data at the view level.
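As a minimal sketch of this idea, the example below emulates a materialized view as a table that stores a precomputed query result and is recomputed after the base relation changes. It assumes SQLite, which has no native materialized views, and the names `sales`, `mv_sales_by_region`, `refresh_view` and `read_view` are hypothetical. It shows full recomputation only and is not a method taken from any of the surveyed papers.

```python
import sqlite3


def refresh_view(conn: sqlite3.Connection) -> None:
    """Recompute the emulated materialized view from the base relation."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM mv_sales_by_region")
        conn.execute(
            "INSERT INTO mv_sales_by_region (region, total) "
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )


def read_view(conn: sqlite3.Connection) -> dict:
    """Answer a query from the stored result instead of the base relation."""
    rows = conn.execute("SELECT region, total FROM mv_sales_by_region")
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.execute(
    "CREATE TABLE mv_sales_by_region (region TEXT PRIMARY KEY, total INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("north", 10), ("south", 5)])
refresh_view(conn)
before = read_view(conn)

# The base relation changes; the view is stale until it is refreshed.
conn.execute("INSERT INTO sales VALUES ('north', 7)")
stale = read_view(conn)
refresh_view(conn)
after = read_view(conn)
```

Between the base update and the refresh, `stale` still holds the old totals, which is the gap that view maintenance methods aim to close.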

The materialized view evolution issue has been investigated in several contexts: query optimization, warehouse design, data placement in a distributed setting, web databases, etc. Many diverse solutions to the materialized view evolution problem have been proposed and analyzed through surveys [Dhote et al. 2009, Halevy 2001, Labrinidis et al. 2009]. However, none of these surveys classifies materialized view evolution approaches in order to identify their advantages and disadvantages. Our survey fills this gap.

The goal of materialized view evolution is to simplify the design, implementation, maintenance and management of data warehousing approaches. We therefore classify materialized view evolution along the following dimensions: Framework, Architecture and Model/Design Model. Within each dimension, methods can be categorized further according to how they evolve materialized views. The taxonomy used for this further classification is: View Evolution, Basic View Maintenance, Incremental VM, Self Maintainable Maintenance, Not Self Maintainable Maintenance, View Selection, View Synchronization and View Adaptation. We present a comparative study of the research works on architecture based dimensions and methods. The rest of the paper is organized as follows: Section 2 presents the comparative study. Section 3 presents the review and results. Finally, Section 4 contains the conclusion and discusses open issues.
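To make the distinction between basic and incremental view maintenance in this taxonomy concrete, the following sketch maintains a SUM/COUNT aggregate view from inserted and deleted rows alone, without rereading the base relation. The class and function names are hypothetical, and the example is an illustration under simple assumptions (single-table aggregate, integer amounts) rather than an algorithm from the surveyed papers. Keeping the row count per group is what allows deletes to be handled from the delta alone; a MIN or MAX view would in general need access to the base data when the current extreme value is deleted.

```python
from dataclasses import dataclass, field


@dataclass
class SumCountView:
    """Aggregate view keyed by group, maintained from deltas only."""

    totals: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)

    def apply_insert(self, group: str, amount: int) -> None:
        self.totals[group] = self.totals.get(group, 0) + amount
        self.counts[group] = self.counts.get(group, 0) + 1

    def apply_delete(self, group: str, amount: int) -> None:
        self.totals[group] -= amount
        self.counts[group] -= 1
        if self.counts[group] == 0:  # last row of the group is gone
            del self.totals[group]
            del self.counts[group]


def recompute(rows) -> SumCountView:
    """Reference result: rebuild the view from the full base relation."""
    view = SumCountView()
    for group, amount in rows:
        view.apply_insert(group, amount)
    return view


base = [("north", 10), ("south", 5)]
view = recompute(base)

# Apply the same changes to the base relation and, as deltas, to the view.
base.append(("north", 7))
view.apply_insert("north", 7)
base.remove(("south", 5))
view.apply_delete("south", 5)
```

After both changes, the incrementally maintained `view` matches what `recompute(base)` would produce from scratch.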

2. Comparative Study

We have analyzed architecture based materialized view evolution methods on several parameters and presented their comparative results in the table below:

Table 1: Comparison of architecture based MVE methods

| S. No | Authors | Techniques/Category Adapted | Issues Addressed/Changes Handled | Architecture Support/Perspective | Method's Activities/Goals | Addressed Attributes | Applicable Framework Stage | Advantages | Disadvantages | Types of Queries/Operation | Tool Support/Implementation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Sumi Helal, et al. | IVM | manual and automatic hoarding | 3-Tier Architecture | flexible synchronization | accessibility, availability, and consistency | VM | Ubiquitous data access | fierce competition addressing | RM | Coda-based |
| 2 | Janet L. Wiener, et al. | VM | autonomous sources + constantly update | WHIP prototype | Distributed CORBA objects | Modularity, scalability, consistency | VM | Modular and scalable | Issues of crash recovery | RM | C++ and C |
| 3 | Ching-Ming Chao, et al. | VM | Change detection and warehouse maintenance | OO Data Model/Architecture | Incremental and deferred maintenance | Response time and storage space efficiency | VM | Storage and maintenance of warehouse | Problem of querying web warehouse | OOM | Java/A SDK |
| 4 | Hartmut Liefke, et al. | VM | hierarchical semistructured data and relational data | WHAX Framework | Optimization and restructuring operations | Data efficiency | VM | Multi-linearity, generalizes several techniques | --- | Hierarchical & RM | VDL |
| 5 | Aristides Triantafillakis, et al. | IVM | Refreshment in federation of DW | Multi-agent architecture | Triggers, Hyper-view approach | Referential integrity, Information Quality, Adopting system | VM | Complete refreshment process | Seek Data cleansing, merging and customization | RM | SQL Server 2000 |
| 6 | Miranda Chan, et al. | IVM | Aggregation and summary information | Web based architecture | MIN/MAX process | Availability, Better performance | VM | No recomputation | Refresh-Join operation can be handled | RM | Experimental Analysis |
| 7 | I. Stanoi, et al. | VM | Cache consistency | Architecture for mobile views | Multiversioning | Consistency | VM | Multiversioning approach | No DW mobility/source mobility | Hierarchical Model | --- |
| 8 | Gary Yeung | IVM | Continuous updating of DW views as transaction | IIVM Architecture | Multi-agent approach, Fuzzy agent scheduling system | Availability, Data consistency, scalability, maintainability | VM | Availability of data | connection pooling issues, System message transmission disorders | RM and OO Model | Java, SQL; Experimental Analysis |
| 9 | Cecile Favre, et al. | VE | optimization strategies | Workload Updating Architecture | Algorithmic | Coherent | D/VM | Pro-active method | Tool is needed | RM | C/S Architecture, ORACLE, PHP |
| 10 | J. A. Nasir, et al. | D/VM | Support dynamics | Virtualization based architecture | synthetic warehouse | scalability | D/VM | scalable architecture | Implementation not done | RM | Example based study |
| 11 | Joseph M. Firestone | VE | DSS/DW Architectural evolution | Evolution Architecture | Defined architecture | Data quality, Information quality | D/VM | analyzed architectures & tools | Only Theoretical | RM | --- |
| 12 | Abdessamad Mouzoune | VM | E-maintenance | CogAff architecture | Intelligent Agents | Better Performance | VM | E-maintenance conceptual framework | Tool development | RM | Case study |
| 13 | M. Levent Koc, et al. | IVM | Incrementally maintain classification | Novel hybrid architecture | Deterministic Algorithms | Scalability | IVM | non-incremental approaches | employ other techniques | RM | PostgreSQL 8.4 |
| 14 | Tho Manh Nguyen, et al. | D | Automate & control feedback loops in minimal latency | Zero latency DWA | Grid based sense & response services | scalability, availability | D/VM | Tackles resource limitation, BIA processing | Scalability & transaction recovery issues | RM | Experimental Approach |
| 15 | Diva de S. e Silva, et al. | D | Use of heterogeneous DBMS | HEROS | CCE process | High quality | D/VM | heterogeneous architecture DBMS | Implementation can be done | RM | Case study |
| 16 | Iain Bate, et al. | D | Distributed type design | Architecture for DIS | Architectural based | Reduce complexity | D/VM | Development of distributed types of system | Need to study the technical aspects also | OO Model | DAME Case study |
| 17 | Vayu | D/VM | Easy maintenance & extension of system | EDW Architecture | Heterogeneous technologies | Efficiency | D/VM | real time application into ODS | Require Implementation | OO Model | Case study |
| 18 | Catalin, et al. | D | Storing & processing of large data sets | Distributed parallel architecture | Distributed & Parallel technologies | Efficiency | D | Proposed an architecture | Require Implementation | OO Model | --- |
| 19 | José Samos, et al. | D | Evolutionary aspects | Integrated DB Architecture | Extended architecture | --- | D | Database architecture | Require Implementation | RM | --- |
| 20 | Hugh J. Watson, et al. | D | Understand factors of DWA selection | DW Architecture | Survey based | Efficient | D | Studied topics of Data warehousing field | More study required | --- | --- |
| 21 | Erik Veerman | D/VM | Creation of SMP based DW | Fast track DW Architecture | Reference configuration, BI | Efficient design | D | pre-configured architecture | Require further study | OO Model | Example-based study |
| 22 | Rajdeep Chowdhury, et al. | D | Implement hybrid based architecture | Hybrid DWA | Data Model | --- | D | Proposed hybrid architecture | Require implementation | --- | Case study |
| 23 | Joseph M. Firestone | D | Collaboration of AKBMS | EAKMS architecture | AKM approach | --- | D | Evolve DW into AKMS components | More work to be done | --- | --- |
| 24 | Hamid R. Nemati, et al. | D | Integration of KM, DSS, AI & DW | Knowledge warehouse architecture | Integration technique | Improve DM | D | Extension of DW model | Further work required | --- | --- |

IVM: Incremental View Maintenance, VM: View Maintenance, VE: View Evolution, D: Design, RM: Relational Model, OOM: Object Oriented Model.

3. Review and Results

The following sub-sections review materialized view evolution in the data warehousing architecture dimension under these aspects: technique, tool support, attributes addressed, types of queries, applicable framework, issues addressed, method supported and process support, along with advantages and disadvantages.

3.1. Technique

In this aspect, we try to simplify the design, implementation, maintenance and management of data warehousing approaches by summarizing which methods for evolving materialized views receive the most attention in the data warehousing architecture dimension, namely View Evolution, Basic View Maintenance, Incremental VM, Self Maintainable Maintenance, Not Self Maintainable Maintenance, View Selection, View Synchronization and View Adaptation. The study found 2 authors discussing view evolution [9, 11], 5 authors discussing basic VM [2, 3], 5 authors discussing IVM [1] and 12 authors discussing design [23], but none discussing View Synchronization, Self Maintainable Maintenance, Not Self Maintainable Maintenance, View Selection or View Adaptation. This gives a total of 24 papers under the stated methods in the data warehousing architecture dimension (Fig. 1).
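The per-technique counts can be cross-checked against the Techniques/Category column of Table 1. The sketch below tallies that column as transcribed from the table. Two points are assumptions of this sketch rather than statements from the paper: the label for row 8, printed as "lVM", is read as IVM, and the labels "D" and "D/VM" are both counted under design.

```python
from collections import Counter

# Techniques/Category column of Table 1, in row order (rows 1 to 24).
TECHNIQUES = [
    "IVM", "VM", "VM", "VM", "IVM", "IVM", "VM", "IVM",  # rows 1-8
    "VE", "D/VM", "VE", "VM", "IVM", "D", "D", "D",      # rows 9-16
    "D/VM", "D", "D", "D", "D/VM", "D", "D", "D",        # rows 17-24
]


def tally(techniques) -> Counter:
    """Count papers per technique, grouping D and D/VM under 'Design'."""
    counts = Counter()
    for label in techniques:
        # Assumption: both "D" and "D/VM" count as design papers.
        key = "Design" if label.startswith("D") else label
        counts[key] += 1
    return counts


counts = tally(TECHNIQUES)
total = sum(counts.values())
```

Under these assumptions the tally accounts for all 24 rows of the table.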

3.2. Tool supported

To be effective and useful, the methods involved in the evolution of materialized views in the data warehousing architecture dimension must be implemented or analyzed. The analysis can be theoretical or experimental. Theoretical analysis checks that a method measures what it purports to measure. Experimental analysis involves carrying out controlled experiments, case studies, etc. to gather empirical data about a method and then using statistical techniques to gain confidence in the gathered data. The authors of [2] implemented their materialized view evolution methods using front-end or back-end languages. Theoretical analysis was carried out for the methods in [10], while experimental analysis was carried out in [6], in each case by the respective authors. The methods in [7], however, have been neither analyzed nor implemented by their authors. As can be inferred, theoretical analysis has been used more than any other analysis technique for methods in the data warehousing architecture dimension (Fig. 2).

Fig. 1. Classification of Technique Supported

Fig. 2. Tool Supported Classification

3.3. Attributes Addressed

The materialized view evolution methods considered in this study focus on various external quality attributes such as accessibility [1], scalability [2], consistency [7] and effectiveness [3]. However, most of the papers lack implementation results that validate their claims. Some materialized view evolution methods have not been associated with any external quality attribute.

3.4. Types of queries

Based on the types of queries, the relational model [1] has been used most frequently by the authors for addressing materialized view evolution methods, while others used the object oriented [3] or hierarchical model. Some materialized view evolution methods have not been associated with any particular model by their authors (Fig. 3).

Fig.3. Classification of types of queries

3.5. Applicable framework

The materialized view evolution methods considered in this study also focus on the applicable framework stage. The study found 11 authors discussing view maintenance as the applicable framework stage and 12 authors discussing design as the applicable framework stage. This leads to the conclusion that VM is the most applicable framework stage apart from the others.

3.6. Issues addressed

In general, there are two types of algorithms for imposing constraints: view maintenance algorithms and view update algorithms. Some authors have addressed issues using immediate and deferred data synchronization algorithms [1, 10, 14], but many authors have used other techniques [3] for imposing constraints in order to handle the required changes in the distributed database environment.
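The difference between immediate and deferred synchronization can be sketched as follows. This is an illustrative example with hypothetical names (`MaintainedView`, `on_source_update`, `refresh`), assuming a view that keeps a running total per key. It is not the algorithm of [1], [10] or [14].

```python
class MaintainedView:
    """Per-key running total kept in sync with a source of updates."""

    def __init__(self, deferred: bool) -> None:
        self.deferred = deferred
        self.totals = {}
        self.pending = []  # deltas logged but not yet applied

    def on_source_update(self, key: str, delta: int) -> None:
        if self.deferred:
            self.pending.append((key, delta))  # log now, apply at refresh
        else:
            self._apply(key, delta)  # immediate: apply with the update

    def refresh(self) -> None:
        """Apply all logged deltas in arrival order (no-op when immediate)."""
        for key, delta in self.pending:
            self._apply(key, delta)
        self.pending.clear()

    def _apply(self, key: str, delta: int) -> None:
        self.totals[key] = self.totals.get(key, 0) + delta


immediate = MaintainedView(deferred=False)
deferred = MaintainedView(deferred=True)
for view in (immediate, deferred):
    view.on_source_update("north", 10)
    view.on_source_update("north", 7)

stale_totals = dict(deferred.totals)  # deferred view has not caught up yet
deferred.refresh()
```

The immediate view pays the maintenance cost on every source update, while the deferred view stays stale until `refresh` is called and then converges to the same state.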

3.7. Method Supported

Immediate and deferred data synchronization algorithms can be further classified by the method's activities as optimization [1], data integration [2] or others [3].

3.8. Architecture supported

Different authors have proposed different types of architecture for materialized view evolution. Reflecting various perspectives on the role of architecture, they carry different names such as WHIP [2], WHAX and IIVM. These architectures are all useful in different situations; however, each qualifies for a different perspective or kind of architecture support.

3.9. Advantages & Disadvantages

Almost all the authors handle maintenance anomalies caused by source data updates while preserving changes, and have addressed the design of algorithms for the materialized view evolution problem in order to reduce recomputation. Some have provided an architecture with practical implications for materialized view evolution [14]. However, some authors have neither designed an algorithm nor provided validation or implementation studies [15] for materialized view evolution.

4. Conclusion & Future Scope

This study provides a critical survey of the different approaches by which materialized view evolution has been studied in relational databases and data warehouses as well as in a distributed setting. We have formally defined the materialized view evolution problem, identified the main materialized view evolution dimensions and classified materialized view evolution methods along them. Based on this classification, we have discussed architecture based materialized view evolution methods.

Analysis of the state of the art of materialized view evolution shows that there is very little work on materialized view evolution in distributed databases and data warehouses [Bauer et al. 2003, Chaves et al. 2009, Yang et al. 2005] and no effective solution for peer to peer systems. Indeed, [Gribble et al. 2001] seems to be the only paper that deals with the view evolution problem in a peer to peer environment. It provides a full definition of the problem but no algorithm or detail on how to select an effective set of views to materialize and place them at appropriate peers. Thus, one challenging direction for future work is to address the materialized view evolution problem more efficiently in distributed settings and semantic web databases.

References

1. Sumi Helal, Joachim Hammer, Jinsuo Zhang, and Abhinav Khushraj, “A Three-tier Architecture for Ubiquitous Data Access”. AICCSA '01 Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications , Page 177.

2. Janet L. Wiener, Himanshu Gupta, Wilburt J. Labio, Yue Zhuge, Hector Garcia-Molina, Jennifer Widom, “A System Prototype for Warehouse View Maintenance”, In Proceedings of the ACM Workshop on Materialized Views: Techniques and Applications, Montreal, Canada, June 7, 1996.

3. Ching-Ming Chao, Po-Zung Chen, Shih-Yang Yang, “Change Detection and Maintenance of an XML Web Warehouse”, Tamkang Journal of Science and Engineering, Vol. 8, No 4, pp. 299-312 (2005).

4. Hartmut Liefke, Susan B. Davidson, “View Maintenance for Hierarchical Semistructured Data”, Lecture Notes in Computer Science, Volume 1874, Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery 2000 (DaWaK 2000), pages 114-123, 2000.

5. Triantafillakis, A., Kanellis, P & Martakos, D (2002), “Data Warehouse Clustering on the Web”. In the Proceedings of the 13th International Workshop on Database and Expert Systems Applications, Aix-en-Provence, France, pp.800-804.

6. Miranda Chan, Hong Va Leong, Antonio Si, “Incremental Update to Aggregated Information for Data Warehouses over Internet” DOLAP 2000 ACM.

7. I. Stanoi, D. Agrawal, A. El Abbadi, S. H. Phatak, B. R. Badrinath, “Data Warehousing Alternatives for Mobile Environments”, ACM 1999.

8. Gary Yeung, “Multi-Agent Framework for Immediate Incremental View Maintenance in Data Warehousing”, IEEE Transaction on Systems, Man, and Cybernetics Society, March 2005.

9. Cécile Favre, Fadila Bentayeb, and Omar Boussaid, “Evolution of Data Warehouses’ Optimization: A Workload Perspective”, DaWaK 2007, LNCS 4654, pp. 13–22, 2007.

10. J. A. Nasir, M. Khurram Shahzad, “Architecture for Virtualization in Data Warehouse” Innovations and Advanced Techniques in Computer and Information Sciences and Engineering, 243–248. 2007 Springer.

11. Joseph M. Firestone, “Architectural Evolution in Data Warehousing and Distributed Knowledge Management Architecture”, White Paper No. Eleven, July 1, 1998.

12. Abdessamad Mouzoune, “Towards an intelligence based conceptual framework for e-maintenance” (2012).

13. M. Levent Koc, Christopher Ré, “Incrementally Maintaining Classification using an RDBMS”, Proceedings of the VLDB Endowment, Vol. 4, No. 5, August 29th - September 3rd 2011, Seattle, Washington.

14. Tho Manh Nguyen, A Min Tjoa, “Zero-Latency Data Warehousing (ZLDWH): the State-of-the-art and experimental implementation approaches”, International Conference on Research, Innovation and Vision for the Future, Feb. 12-16, 2006, Pages 167 – 176.

15. Diva de S. e Silva, Sean W. M. Siqueira, Elvira Mª A. Uchôa, Mª Helena L. B. Braz, Rubens N. Melo, “An Architecture for Data Warehouse Systems Using a Heterogeneous Database Management System”, published in book on Heterogeneous Information Exchange and Organizational Hubs, 2002.

16. Iain Bate, Malihe Tabatabaie, “Architecture of Distributed information system (Using DAME case study)”, Thesis, 2007.

17. Vayu, “Architecture of Enterprise Data Warehouse”, Report 2005-2008.

18. C. Boja, A. Pocovnicu, “Distributed Parallel Architecture for Storing and Processing Large Datasets”, Recent Researches in Engineering Education and Software Engineering, ISBN: 978-1-61804-070-1.

19. José Samos, Fèlix Saltor, Jaume Sistac, and Agustí Bardés, “Database Architecture for Data Warehousing: An Evolutionary Approach”, In proceedings of 9th Int. Conf. on Database and Expert Systems Applications (DEXA'98).

20. Hugh J. Watson, Thilini Ariyachandra, “Data Warehouse Architectures: Factors in the Selection Decision and the Success of the Architectures”, report, July 2005.

21. Erik Veerman, “An Introduction to Fast Track Data Warehouse Architectures” SQL Server 2008.

22. Rajdeep Chowdhury, Bikramjit Pa, “Proposed Hybrid Data Warehouse Architecture Based on Data Model”, International Journal of Computer Science & Communication, Vol. 1, No. 2, July-December 2010, pp. 211-213.

23. Joseph M. Firestone, “Knowledge Base Management Systems and The Knowledge Warehouse: A "Strawman"”, KM ANSI/ISO Standards Committee Meeting, January 29, 1999.

24. Hamid R. Nemati, David M. Steiger, Lakshmi S. Iyer , Richard T. Herschel, “Knowledge warehouse: an architectural integration of knowledge management, decision support, artificial intelligence and data warehousing”, Decision Support Systems 33 (2002) 143– 161.
