Applying MDA and universal data models for data warehouse modeling

MARIS KLIMAVICIUS

Department of Applied Computer Science

Riga Technical University

Meza iela 1/3-506, LV-1048, Riga

LATVIA

maris.klimavicius@gmail.com

ULDIS SUKOVSKIS

Department of Applied Computer Science

Riga Technical University

Meza iela 1/3-506, LV-1048, Riga

LATVIA

uldis.sukovskis@cs.rtu.lv

Abstract: - Business process monitoring provides an invaluable means for an enterprise to adapt to changing conditions. A data warehouse stores the process data that is the foundation for business process monitoring applications. Developing such applications with traditional methods is challenging because of the complexity of integrating business processes and existing information systems. Different modeling approaches have been proposed to overcome the design pitfalls of data warehouse development. Model driven architecture, on the other hand, is an approach to developing applications from domain-specific models to platform-sensitive models, and so bridges the gap between business processes and information technology. It is a standard framework for software development that addresses the complete life cycle of designing, deploying, integrating, and managing applications by using models. The authors propose to use a model driven approach for data warehouse development. The concept of universal data models is also introduced in order to ease data warehouse development by providing standard data objects. This paper introduces the overall concept of applying model driven architecture and universal data models to the data warehouse development process.

Key-Words: - Data warehouse, Business process modeling, Model driven architecture, Universal data models

1 Introduction

During the last ten years, interest in analyzing data has increased significantly because of the competitive advantages that information can provide for the decision-making process. Data warehouse systems represent a single source of information for analyzing the development and results of an enterprise organization in a changing environment. The data in the data warehouse describes events and the status of business processes, products and services, goals and organizational units. Nowadays, a key to survival in the business world is the ability to analyze, plan and react to changing business conditions as fast as possible. However, the ability to change is bound by many constraints, such as staff knowledge and business supporting systems. Business operations depend on enterprise information systems, which means that changes in business processes require changes in the supporting information systems. A change in operational information systems in turn requires changes in the data warehouse, which is a central repository of atomic and summarized data from different operational systems.

Data warehouses integrate data from multiple heterogeneous information sources and transform them into a multidimensional representation for decision support applications.

Research in the field of data warehousing and online analytical processing has produced important technologies for the design, management, and use of information systems for decision support. Much of the interest and success in this area can be attributed to the need for software and tools to improve data management and analysis. However, despite the continued success and maturing of the field, much research remains to be done across many different areas of data warehousing.

In particular, data warehousing applications require improved and standardized conceptual modeling techniques as well as novel approaches to dealing with data quality issues. The data that needs to be stored in the warehouse is becoming more and more complex in both structure and semantics, while the analysis must keep up with the demands of new applications. Therefore, considerable effort is still needed to develop advanced methods and standards for a data warehouse development framework. The proposed approach is based on the idea that requirements for a data warehouse can be elicited from business process models [12].


2 Related work

Different approaches to the conceptual and logical design of data warehouse systems have been proposed in the last few years. In this section, the authors briefly discuss some of the important approaches.

While the standardization of metadata is discussed in numerous domains, resulting in different metadata standards, the specific requirements of data warehousing solutions are usually addressed insufficiently [1].

In [2], the authors present the multidimensional model, a logical model for OLAP (On Line Analytical Processing) systems, and show how it can be used in the design of multidimensional databases. The authors also propose a general design method aimed at building a multidimensional schema starting from an operational database described by an ER schema. Although the design steps are described in a logical and coherent way, the data warehouse design is based only on the operational data sources. We consider this insufficient, because end users’ requirements and business processes are very important in data warehouse design.

In [14], the authors present an approach to business metadata that is based on the relationship between the data warehouse data and the structure and behavior of the enterprise. They use models to derive business metadata, which forms an additional level of abstraction on top of the data-oriented data warehouse structure. The authors of the present work also establish a relationship between an organization’s processes and the related data, but use business processes and MDA to accomplish it.

There are also several works which address model driven architecture as the solution for data warehouse implementation.

One of the first works developed to align the design of the data warehouse with the general MDA paradigm is the model driven data warehouse [3]. This approach is based on the Common warehouse metamodel [4], which is a platform-independent metamodel definition for interchanging data warehouse specifications between different platforms and tools. However, the Common warehouse metamodel is too generic to represent all peculiarities of multidimensional modeling in a conceptual model and too complex to be handled by both business users and designers [5]. In [6], the authors describe how to align the whole data warehouse development process to MDA. They define multidimensional model driven architecture, an approach for applying the MDA framework to one of the stages of data warehouse development: multidimensional modeling. They also describe how to build the different MDA artifacts by using extensions of the UML. In this approach, transformations between models are clearly and formally established by using the Query/View/Transformation approach. However, the authors do not address the requirements gathering stage: requirements are specified at the CIM stage, which is performed manually.

3 Data warehouse development

Most techniques that organizations use to build a data warehousing system follow either a top-down or a bottom-up development approach. In the top-down approach [8], an enterprise data warehouse is built in an iterative manner, business area by business area, and underlying dependent data marts are created as required from the enterprise data warehouse content. In the bottom-up approach [9], independent data marts are created with a view to integrating them into an enterprise data warehouse at some time in the future. There is still a lot of discussion about the similarities and differences between these architectures, but despite the differences there are two main, very closely connected steps in data warehouse development: requirements gathering and information modeling (design). Figure 1 shows a typical data warehouse architecture, which applies to either approach. Basically, a data warehouse has five layers, which can be defined as follows: the source layer (operational information systems), the integration layer (extraction, transformation and loading of data into the data warehouse), the data warehouse layer (central data storage), the data mart layer (data customized according to the needs of users), and the application layer (applications for end users to analyze data).

Fig.1. Data warehouse architecture

In this paper the authors address the development of the central data warehouse component, the data warehouse layer.
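The five layers described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the two source systems, the row fields and the function names are all hypothetical, and each layer is reduced to a few lines so that only the flow of data between layers is visible.

```python
from collections import defaultdict

# Source layer: rows as they might arrive from two hypothetical operational systems.
billing_rows = [{"invoice_id": 1, "customer": "ACME", "amount": "120.50"}]
crm_rows = [
    {"invoice_id": 2, "customer": "acme", "amount": "79.50"},
    {"invoice_id": 3, "customer": "Globex", "amount": "10.00"},
]


def integrate(*sources):
    """Integration layer: extract rows and transform them to one common format."""
    for rows in sources:
        for row in rows:
            yield {
                "invoice_id": row["invoice_id"],
                "customer": row["customer"].upper(),
                "amount": float(row["amount"]),
            }


# Data warehouse layer: central storage of the integrated, atomic rows.
warehouse = list(integrate(billing_rows, crm_rows))


def build_mart(rows, customer):
    """Data mart layer: a subset customized to the needs of one group of users."""
    return [r for r in rows if r["customer"] == customer]


def total_by_customer(rows):
    """Application layer: a simple analysis offered to end users."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


acme_mart = build_mart(warehouse, "ACME")
print(total_by_customer(acme_mart))  # {'ACME': 200.0}
```

The sketch keeps the warehouse layer as the only place where integrated data is stored, which is the component the rest of the paper concentrates on.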


4 Concept of Model Driven Architecture

The idea of Model Driven Architecture (MDA) was introduced by the OMG (Object Management Group) as an approach to system specification and interoperability, and is inspired by the use of several formal models. The key concepts of MDA are its default viewpoints on a system: computation independent, platform independent, platform specific, and code.

[Figure: the CIM, PIM, PSM and CODE levels connected by transformations (T); one PIM is transformed into several PSMs, and each PSM into CODE]

Fig.2. Model driven architecture

In MDA, platform-independent models (PIM) are initially expressed in a platform-independent modelling language. The platform-independent model is subsequently translated to a platform-specific model (PSM) by mapping the PIM to some implementation language or platform using formal rules.

CIM (computation-independent model)

A CIM is also often referred to as a business model because it uses a vocabulary that is familiar to the subject matter experts. It presents exactly what the system is expected to do, but hides all information technology related specifications to remain independent of how that system will be implemented.

PIM (platform-independent model)

A PIM has a sufficient degree of independence so as to enable its mapping to one or more platforms. This is commonly achieved by defining a set of services in a way that abstracts out technical details. That means that PIM does not contain any information specific to the platform or the technology that is used to realize it.

PSM (platform-specific model)

A PSM combines the specifications in the PIM with the details required to stipulate how a system uses a particular type of platform. If the PSM does not include all of the details necessary to produce an implementation of that platform, it is considered abstract.
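The mapping of a PIM to a PSM "using formal rules" can be sketched as follows. This is only an illustration under stated assumptions: the abstract type names, the two target platforms and their type tables are hypothetical and are not defined by the MDA specification or by this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PimAttribute:
    """One attribute of a platform-independent model (PIM)."""
    name: str
    abstract_type: str  # platform-independent type, e.g. "identifier", "text", "date"


# Hypothetical transformation rules: one type table per target platform.
TYPE_RULES = {
    "platform_a": {"identifier": "INTEGER", "text": "VARCHAR(255)", "date": "DATE"},
    "platform_b": {"identifier": "NUMBER(10)", "text": "VARCHAR2(255)", "date": "DATE"},
}


def pim_to_psm(attributes, platform):
    """Transformation T: apply the platform's rules to every PIM attribute."""
    rules = TYPE_RULES[platform]
    return [(a.name, rules[a.abstract_type]) for a in attributes]


def psm_to_code(table_name, psm):
    """Transformation T: render a platform-specific model as a DDL statement."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in psm)
    return f"CREATE TABLE {table_name} ({columns});"


pim = [PimAttribute("INVOICE_ID", "identifier"), PimAttribute("INVOICE_DATE", "date")]
psm_a = pim_to_psm(pim, "platform_a")
psm_b = pim_to_psm(pim, "platform_b")
print(psm_to_code("INVOICE", psm_a))
print(psm_to_code("INVOICE", psm_b))
```

The same PIM yields two different PSMs, which mirrors the one-to-many relationship between PIM and PSM in Fig.2.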

5 Concept of universal data models

The concept of universal data models was introduced by Len Silverston [7] as an approach to system modeling and is inspired by the use of proven components. A universal data model is a template data model that can be used as a building block to start development of the logical data model or data warehouse data model.

Effective methods for incorporating the universal data models can be summarized as follows:

• Develop the enterprise data model by customizing and adding to the universal data models using the business terms that are commonly known in the enterprise and adding appropriate information requirements.

• Build the appropriate logical data models for each project according to the business requirements for that specific application.

• Create the necessary physical database designs based on the logical data model and the technical requirements.

• Customize the database design to the appropriate target DBMS (database management system).

One of the key information issues today is how to develop integrated systems that facilitate consistent information for use by the enterprise. When projects develop their database designs independent of an overall model, the same information items are often implemented in separate tables and sometimes with different meanings, leading to redundant, inconsistent data and non-integrated systems.

The universal data models can be used to start an enterprise data model effort, providing the enterprise with a "road map" of their information and showing how information relates to other information. This approach can lead to much more data consistency, data quality, and ultimately to better information to be used to improve the operations of the enterprise. Universal data models can also serve as the basis for a data warehouse design and implementation. Eventually, if universal data models are suitable for enterprise application development, then these data structures are also valid for data warehouse development.

An example of a universal data model of invoices and invoice items is shown in Fig.3.


[Figure: entities and attributes of the model, where # marks an identifier, * a mandatory attribute and - an optional attribute]

INVOICE: # INVOICE ID, * INVOICE DATE, - MESSAGE, - DESCRIPTION

INVOICE ITEM: # INVOICE ITEM SEQ ID, * TAXABLE FLAG, - QUANTITY, - AMOUNT, - ITEM DESCRIPTION (subtypes: SALES INVOICE ITEM, PURCHASE INVOICE ITEM)

INVOICE ITEM TYPE: # INVOICE ITEM TYPE ID, * DESCRIPTION

PRODUCT FEATURE: # PRODUCT FEATURE ID, * DESCRIPTION

PRODUCT: # PRODUCT ID, * NAME, - INTRODUCTION DATE, - SALES DISCONTINUATION DATE, - SUPPORT DISCONTINUATION DATE, - COMMENT

INVENTORY ITEM: # INVENTORY ITEM ID (subtypes: SERIALIZED INVENTORY ITEM with * SERIAL NUM, NON SERIALIZED INVENTORY ITEM with - QUANTITY ON HAND)

Relationship labels: adjusted by / the adjustment for, sold with / sold for, described by / the description for, billed via, the change for, part of / composed of

Fig.3. Universal data model of invoice item

Invoices, like shipments and orders, may have many items showing the detailed information about the goods or services that are charged to parties. The items on an invoice may be for products, features, work efforts, time entries, or adjustments such as sales tax, shipping and handling charges, fees, and so on.
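As an illustration, the INVOICE and INVOICE ITEM entities of Fig.3 could be written down as Python data classes. This is a sketch, not part of the universal data model itself: it assumes that # marks an identifier, * a mandatory attribute and - an optional one, and the Python field names, types and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class InvoiceItem:
    invoice_item_seq_id: int                 # '#' identifier
    taxable_flag: bool                       # '*' mandatory
    quantity: Optional[float] = None         # '-' optional
    amount: Optional[float] = None           # '-' optional
    item_description: Optional[str] = None   # '-' optional


@dataclass
class SalesInvoiceItem(InvoiceItem):
    """Subtype of INVOICE ITEM listed in Fig.3."""


@dataclass
class PurchaseInvoiceItem(InvoiceItem):
    """Subtype of INVOICE ITEM listed in Fig.3."""


@dataclass
class Invoice:
    invoice_id: int                          # '#' identifier
    invoice_date: date                       # '*' mandatory
    message: Optional[str] = None            # '-' optional
    description: Optional[str] = None        # '-' optional
    # "composed of" relationship: an invoice may have many items.
    items: List[InvoiceItem] = field(default_factory=list)


invoice = Invoice(invoice_id=1, invoice_date=date(2008, 1, 15))
invoice.items.append(
    SalesInvoiceItem(1, taxable_flag=True, quantity=2, amount=50.0,
                     item_description="Product")
)
invoice.items.append(
    SalesInvoiceItem(2, taxable_flag=False, amount=5.0,
                     item_description="Shipping and handling")
)
print(len(invoice.items))  # 2
```

An adjustment such as a shipping charge is stored as an ordinary item without a quantity, which is why QUANTITY is modeled as optional here.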

6 Alignment of MDA and universal data models to data warehouse development framework

The previous sections described the MDA approach and the concept of universal data models. The purpose of this section is to combine these approaches into a data warehouse development framework. MDA presents computation-independent, platform-independent, and platform-specific viewpoints. Following these considerations, the authors present an MDA oriented data warehouse development framework.

Fig.4. MDA approach of data warehouse development

The MDA viewpoints can be represented in the data warehouse development framework as follows:

• CIM defines the requirements for the data warehouse. It is a viewpoint of the data warehouse from the business process perspective. Business processes have a crucial role in data warehouse development. Business processes and universal data models bridge the gap between those who are experts in the domain and process and those who are experts in the design and construction of the data warehouse.

• PIM defines the data warehouse from a conceptual viewpoint. The major aim at this level is to represent the main data warehouse architecture, that is, the logical data warehouse data structures with appropriate attributes, without taking into account any specific technology.

• PSM defines the data warehouse design from the viewpoint of a certain platform. For example, a data warehouse can be implemented for different platforms, such as the Common warehouse metamodel (CWM) standard or SQL statements for a particular warehouse platform.

• CODE defines the implementation code.

6.1 CIM implementation

A business process model serves as the basis for the CIM. Fig.5 shows the business process diagram for the seller-initiated invoice transaction process. This is not the only way in which the process may occur, but it represents a primary process. Intermediaries, including routing hubs and/or networks, may be involved if necessary.

Fig.5. Invoice transaction business process

6.2 PIM implementation

A platform independent model is a view of a system from the platform independent viewpoint [4]. This means that the model describes the system while hiding the details necessary for a particular platform. From the perspective of data warehouse development, this view is the logical data warehouse data model. The platform independent model consists of an integrated view of the business process model and the appropriate universal data model.

[Figure: process elements (Reconciliation Process, Invoice, Create invoice) linked to the logical data structures below, where # marks an identifier, * a mandatory attribute and - an optional attribute]

INVOICE: # INVOICE ID, # INVOICE ITEM SEQ ID, * INVOICE DATE, * CUSTOMER ID, * BILL TO ADDRESS ID, * ORGANIZATION ID, * ORG ADDRESS ID, * PRODUCT ID, * QUANTITY, * AMOUNT, * EXTENDED AMOUNT, - PRODUCT COST, * LOAD DATE

CUSTOMER: # CUSTOMER ID, # SNAPSHOT DATE, * CUSTOMER NAME, - AGE, - MARITAL STATUS, * CREDIT RATING

SUPPLIER: # SUPPLIER ID, # ADDRESS ID, * SUPPLIER NAME, - POSTAL CODE, * LOAD DATE

Fig. 6. Example of logical data structure of data warehouse
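One way to picture the "integrated view of business process model and appropriate universal data model" is a small selection step: only those universal data model entities that a process activity actually touches enter the PIM, and each gets a warehouse load date. The sketch below is a hypothetical illustration, not the authors' transformation: the activity names, the entity catalogue and the function `derive_pim` are assumptions introduced here.

```python
# Hypothetical business process model (CIM): activities and the business
# objects each of them touches.
process_model = {
    "Create invoice": ["INVOICE", "CUSTOMER"],
    "Reconcile invoice": ["INVOICE"],
}

# Hypothetical catalogue of universal data model entities and attributes.
universal_models = {
    "INVOICE": ["INVOICE ID", "INVOICE DATE"],
    "CUSTOMER": ["CUSTOMER ID", "CUSTOMER NAME"],
    "PRODUCT": ["PRODUCT ID", "NAME"],
}


def derive_pim(process_model, universal_models, load_attribute="LOAD DATE"):
    """Keep only entities referenced by the process and record which activities use them."""
    pim = {}
    for activity, objects in process_model.items():
        for obj in objects:
            if obj not in universal_models:
                continue  # no template entity available for this business object
            entry = pim.setdefault(
                obj,
                {"attributes": universal_models[obj] + [load_attribute], "used_by": []},
            )
            entry["used_by"].append(activity)
    return pim


pim = derive_pim(process_model, universal_models)
for entity, info in pim.items():
    print(entity, info["attributes"], info["used_by"])
```

PRODUCT is left out of the result because no activity in this hypothetical process refers to it, which shows how the process model scopes the data model.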

6.3 PSM implementation

A platform specific model is a view of a system from the platform specific viewpoint. This model represents the platform independent model from the perspective of how it will be implemented on the chosen platform.

The platform independent model can be implemented in different ways, for example as an XML description of the data warehouse data structures.
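A minimal sketch of such an XML description, using the CUSTOMER structure from Fig. 6, is given below. The element and attribute names (`dataWarehouse`, `table`, `column`, `kind`) are hypothetical and do not follow the CWM interchange format or any other standard schema.

```python
import xml.etree.ElementTree as ET

# Part of the logical structure from Fig. 6; the "kind" labels are an assumed
# reading of the #, * and - markers.
pim = {
    "CUSTOMER": [
        ("CUSTOMER ID", "identifier"),
        ("SNAPSHOT DATE", "identifier"),
        ("CUSTOMER NAME", "mandatory"),
        ("AGE", "optional"),
        ("MARITAL STATUS", "optional"),
        ("CREDIT RATING", "mandatory"),
    ],
}


def pim_to_xml(pim):
    """Render the logical data structures as an XML string."""
    root = ET.Element("dataWarehouse")
    for entity, attributes in pim.items():
        table = ET.SubElement(root, "table", name=entity)
        for name, kind in attributes:
            ET.SubElement(table, "column", name=name, kind=kind)
    return ET.tostring(root, encoding="unicode")


xml_text = pim_to_xml(pim)
print(xml_text)
```

Because the output is plain XML, it can be parsed back with `ET.fromstring(xml_text)` and handed to a further transformation step that produces platform code.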

7 Conclusion

In this paper the authors have introduced an MDA oriented framework for data warehouse development. The framework addresses the design of the data warehouse system by aligning every development stage of the data warehouse with the different MDA viewpoints. The authors also introduced the use of universal data models in the MDA oriented framework. The use of universal data models speeds up and facilitates the development of a data warehouse system. This approach is useful when a process oriented data warehouse is developed. The authors consider that the advantage of the approach lies in combining the model driven approach and the universal data model approach in one data warehouse development framework. Both MDA and universal data models are designed to accelerate software development. The authors plan to evolve this approach to include transformations between the different MDA viewpoints. The aim is to develop a fully automated transformation process.

8 Acknowledgments

This work has been partly supported by the European Social Fund within the National Programme „Support for the carrying out doctoral study programm’s and post-doctoral researches” project „Support for the development of doctoral studies at Riga Technical University”.

References:

[1] Staudt, M., Vaduva, A., & Vetterli, T., Metadata Management and Data Warehousing (No. Technischer Report 99.04 Institut für Informatik). Zürich: Universität Zürich, 1999

[2] Cabibbo L., Torlone R. A Logical Approach to Multidimensional Databases. In: Proc. of the 6th Intl. Conf. on Extending Database Technology (EDBT’98). Volume 1377 of LNCS, pp. 183-197. Valencia, Spain. 1998.

[3] Poole J. Model Driven Data Warehouse (MDDW). www.cwmforum.org/POOLEIntegrate2003.pdf, 2003.

[4] OMG Common Warehouse Metamodel (CWM) Specification 1.0.1. http://www.omg.org/cgi-bin/doc?formal/03-03-02, 2002.

[5] Medina E., Trujillo J. A Standard for Representing Multidimensional Properties: The Common Warehouse Metamodel (CWM). In proceedings of the 6th East-European Conference on Advances in Databases and Information Systems (ADBIS’02), volume 2435 of Lecture Notes in Computer Science, pages 232-247, Bratislava, Slovakia. September, Springer-Verlag, 2002.

[6] J.Mazón, J.Trujillo, An MDA approach for the development of data warehouses, 1st issue, Vol. 45, Elsevier Science Publishers, 2008.

[7] L.Silverston, The Data Model Resource Book Revised Edition Volume 1, Wiley, 2001

[8] W.H.Inmon, Building the Data Warehouse, 4th edition, Wiley, 2005

[9] R.Kimball, L.Reeves, M.Ross, W.Thornthwaite - The Data Warehouse Lifecycle Toolkit, John Wiley & Sons (1998)

[10] S.Kent, Model Driven Engineering, Lecture Notes in Computer Science, Vol. 2335, Springer, 2002.


[11] Marco, D., & Jennings, M., Universal Meta Data Models. New York et al.: Wiley Publishing., 2004.

[12] M.Klimavicius, U.Sukovskis, Business process driven data warehouse development, Scientific Proceedings of Riga Technical University, Computer Science Series, Applied Computer Systems, 6th issue, Vol. 22, Riga, Latvia, RTU, 2005.

[13] M. Klimavicius, Towards Development of Solution for Business Process-Oriented Data Analysis, Proceedings of World Academy of Science, Engineering and Technology, Volume 27, Cairo, Egypt, 2008

[14] V.Stefanov and B.List, Business Metadata for the Data Warehouse - Weaving Enterprise Goals and Multidimensional Models, International Workshop on Models for Enterprise Computing at the 10th International Enterprise Distributed Object Computing Conference, Hong Kong, China, 2006
