THE SUCCESSION OF DATA WAREHOUSE USING OBJECT ORIENTED APPROACH

Academic year: 2020

1. Dr. (Mrs.) Pushpa Suri

Reader, Deptt. of Computer Science & Applications, Kurukshetra University, Kurukshetra, Haryana, India

2. Mrs. Meenakshi Sharma

Sr. Lecturer, Deptt. of Computer Science & Engineering, H.C.T.M, Kaithal, Haryana, India

Abstract:

This paper describes techniques for designing both the front end and the back end of a data warehouse so that companies can continue to evolve their warehouse and query tools as their business and customer demands change, instead of continually having to restructure and rewrite their existing tools. The object-oriented approach is driving the current development of ODBMSs that handle complex objects, inheritance and other features that enable direct implementation; this technology is being extended to combine data management capabilities with the application of logical rules to provide more refined information to management. The object-oriented approach is used in different areas, i.e. software engineering and database management systems, and most recently in data warehousing. Object-oriented development involves more than learning a program; it includes capability for development. A purely object-oriented tool like Java or Smalltalk, or the object-oriented use of C++, requires that problem domains be conceptualized in terms of the paradigm inherent in object technology. Most research in object-oriented programming (OOP), including object-oriented database management systems (OODBMS), is concerned with supporting users who are not served well by more conventional technology. Our research takes a different point of view: our primary motivation is to show how existing applications can be enhanced using object-oriented technology.

Keywords: Data warehouse, Object-Oriented, Data Model, Object relational, Database

1. Preamble

2. Interface in Object Oriented System

The user interface is the visual representation of the data in the warehouse. No matter how well you structure your data warehouse, if the user does not have an easy-to-use interface, structured so that changes can be made quickly and cost-effectively, the warehouse will die. An object is the basic building block of object-oriented programming. In OOP the concept of a class is followed, and objects are variables of class type. The benefit of this technique is that objects can be added and removed without affecting other pieces of the application. The benefit in the design of OLAP tools is that parts of the tools can be changed or removed without making major changes to the entire application, because all interaction occurs through the model.

2.1 Front End Tools

A popular conceptual model that influences the front-end tools, database design, and the query engines for OLAP is the multidimensional view of data in the warehouse. In a multidimensional data model, there is a set of numeric measures that are the objects of analysis. Examples of such measures are sales, budget, revenue, inventory, ROI (return on investment). Each of the numeric measures depends on a set of dimensions, which provide the context for the measure. For example, the dimensions associated with a sale amount can be the city, product name, and the date when the sale was made. The dimensions together are assumed to uniquely determine the measure. Thus, the multidimensional data views a measure as a value in the multidimensional space of dimensions. Each dimension is described by a set of attributes. For example, the Product dimension may consist of four attributes: the category and the industry of the product, year of its introduction, and the average profit margin. For example, the soda Surge belongs to the category beverage and the food industry, was introduced in 1996, and may have an average profit margin of 80%. The attributes of a dimension may be related via a hierarchy of relationships.
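The multidimensional view described above can be illustrated with a minimal Python sketch. It is not taken from any particular OLAP product: the names `facts`, `build_cube` and `roll_up`, and all the sample values, are assumptions made for illustration. Each fact row is keyed by its dimensions (city, product, date), which together determine the measure (sale amount), and a roll-up aggregates the measure over a subset of the dimensions.

```python
from collections import defaultdict

# Hypothetical fact rows: the dimensions (city, product, date) together
# determine one numeric measure (amount). All values are invented.
DIMENSIONS = ["city", "product", "date"]
facts = [
    {"city": "Delhi", "product": "Surge", "date": "2020-01-05", "amount": 120.0},
    {"city": "Delhi", "product": "Cola", "date": "2020-01-05", "amount": 80.0},
    {"city": "Kaithal", "product": "Surge", "date": "2020-01-06", "amount": 50.0},
]


def build_cube(rows, dimensions, measure):
    """Map each coordinate in dimension space to its measure value."""
    cube = {}
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        if key in cube:
            # The dimensions are assumed to determine the measure uniquely.
            raise ValueError(f"duplicate coordinate: {key}")
        cube[key] = row[measure]
    return cube


def roll_up(cube, dimensions, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    positions = [dimensions.index(d) for d in keep]
    totals = defaultdict(float)
    for key, value in cube.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


cube = build_cube(facts, DIMENSIONS, "amount")
sales_by_product = roll_up(cube, DIMENSIONS, ["product"])
sales_by_city = roll_up(cube, DIMENSIONS, ["city"])
```

Storing the cube as a dictionary keyed by coordinate tuples keeps the sketch short; a real warehouse would use a dedicated multidimensional or star-schema store.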

3. Cargo Space in Data warehouse

A data warehouse is a mechanism for data storage and data retrieval. Data can be stored and retrieved with a multidimensional structure (a hypercube), a relational star schema structure, or several other data storage techniques. In the interest of space, we leave out the discussion of cleansing, transformation, replication and metadata; however, these are also important issues that need to be addressed and implemented in your data warehouse to ensure success.

3.1 Back End Tools and Utilities

Data warehousing systems use a variety of data extraction and cleaning tools, and load and refresh utilities for populating warehouses. Data extraction from “foreign” sources is usually implemented via gateways and standard interfaces (such as Information Builders EDA/SQL, ODBC, Oracle Open Connect, Sybase Enterprise Connect, Informix Enterprise Gateway).

Data Cleaning

Since a data warehouse is used for decision making, it is important that the data in the warehouse be correct. However, since large volumes of data from multiple sources are involved, there is a high probability of errors and anomalies in the data. Therefore, tools that help to detect data anomalies and correct them can have a high payoff. Some examples where data cleaning becomes necessary are inconsistent field lengths, inconsistent descriptions, inconsistent value assignments, missing entries and violations of integrity constraints. Not surprisingly, optional fields in data entry forms are significant sources of inconsistent data.
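As a rough illustration of the kinds of checks listed above, the following Python sketch scans rows for missing entries, over-long fields, inconsistent value assignments and integrity violations. The function name `find_anomalies`, the rule parameters and the sample customer rows are all hypothetical; commercial cleaning tools are far more elaborate.

```python
def find_anomalies(rows, required, max_lengths, allowed_values, valid_keys):
    """Return a list of (row_index, field, problem) tuples."""
    problems = []
    for i, row in enumerate(rows):
        # Missing entries in fields that must be filled in.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((i, field, "missing entry"))
        # Field lengths that exceed the agreed maximum.
        for field, limit in max_lengths.items():
            value = row.get(field)
            if isinstance(value, str) and len(value) > limit:
                problems.append((i, field, "field too long"))
        # Value assignments outside the agreed code list.
        for field, allowed in allowed_values.items():
            value = row.get(field)
            if value not in (None, "") and value not in allowed:
                problems.append((i, field, "inconsistent value"))
        # Integrity constraints: the value must refer to a known key.
        for field, keys in valid_keys.items():
            value = row.get(field)
            if value not in (None, "") and value not in keys:
                problems.append((i, field, "integrity violation"))
    return problems


# Invented sample data for illustration only.
customers = [
    {"customer_id": 1, "name": "Asha", "gender": "F"},
    {"customer_id": 2, "name": "", "gender": "female"},
    {"customer_id": 9, "name": "Ravi Kumar Sharma", "gender": "M"},
]
report = find_anomalies(
    customers,
    required=["name"],
    max_lengths={"name": 12},
    allowed_values={"gender": {"M", "F"}},
    valid_keys={"customer_id": {1, 2, 3}},
)
```

The sketch only detects anomalies; deciding how to correct them (for example, mapping "female" to "F") is a separate, domain-specific step.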

Load

builds up a new database. While it is in progress, the current database can still support queries; when the load transaction commits, the current database is replaced with the new one. Using periodic checkpoints ensures that if a failure occurs during the load, the process can restart from the last checkpoint.
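The load behaviour described here (queries keep running against the current database, the new database replaces it on commit, and a failed load restarts from the last checkpoint) can be sketched in Python as follows. The `Warehouse` class, its attributes and the `fail_at` parameter used to simulate a crash are assumptions made for illustration, not a description of any specific load utility.

```python
class Warehouse:
    """Toy warehouse whose load builds a new database beside the current one."""

    def __init__(self, current=None):
        self.current = list(current or [])  # database that serves queries
        self._staging = []                  # new database being built
        self._checkpoint = 0                # source rows safely loaded so far

    def query(self):
        # Queries always read the current database, even during a load.
        return list(self.current)

    def load(self, source_rows, checkpoint_every=2, fail_at=None):
        # Restart from the last checkpoint and discard later partial work.
        position = self._checkpoint
        del self._staging[position:]
        while position < len(source_rows):
            if fail_at is not None and position == fail_at:
                raise RuntimeError("simulated failure during load")
            self._staging.append(source_rows[position])
            position += 1
            if position % checkpoint_every == 0:
                self._checkpoint = position
        # Commit: the new database replaces the current one in a single step.
        self.current = self._staging
        self._staging = []
        self._checkpoint = 0


warehouse = Warehouse(current=["old"])
rows = ["a", "b", "c", "d", "e"]
try:
    warehouse.load(rows, fail_at=3)   # fails after the checkpoint at row 2
except RuntimeError:
    pass
during_failure = warehouse.query()    # the old database is still served
warehouse.load(rows)                  # resumes from the checkpoint
after_commit = warehouse.query()
```

In a real system the checkpoint and the staging database would be persisted to disk; here they live in memory so that the control flow stays visible.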

Refresh

Refreshing a warehouse consists in propagating updates on source data to correspondingly update the base data and derived data stored in the warehouse. There are two sets of issues to consider: when to refresh, and how to refresh. Usually, the warehouse is refreshed periodically (e.g., daily or weekly). Only if some OLAP queries need current data (e.g., up to the minute stock quotes), is it necessary to propagate every update. The refresh policy is set by the warehouse administrator, depending on user needs and traffic, and may be different for different sources. Refresh techniques may also depend on the characteristics of the source and the capabilities of the database servers. Extracting an entire source file or database is usually too expensive, but may be the only choice for legacy data sources. Most contemporary database systems provide replication servers that support incremental techniques for propagating updates from a primary database to one or more replicas. Such replication servers can be used to incrementally refresh a warehouse when the sources change. There are two basic replication techniques: data shipping and transaction shipping.
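A minimal sketch of incremental refresh, under the assumption that the source delivers a batch of change records, is given below in Python. It does not implement data shipping or transaction shipping as such; it only shows updates being propagated to both the base data and one derived total. The function `refresh` and the record layout are hypothetical.

```python
def refresh(base, totals, changes):
    """Propagate source changes to base data and to a derived per-city total.

    base:    dict mapping sale_id -> (city, amount)
    totals:  dict mapping city -> sum of amounts (derived data)
    changes: list of (operation, sale_id, city, amount) records
    """
    for operation, sale_id, city, amount in changes:
        if operation == "insert":
            base[sale_id] = (city, amount)
            totals[city] = totals.get(city, 0.0) + amount
        elif operation == "delete":
            old_city, old_amount = base.pop(sale_id)
            totals[old_city] -= old_amount
        else:
            raise ValueError(f"unknown operation: {operation}")


# Invented sample state and one periodic batch of changes.
base = {1: ("Delhi", 100.0)}
totals = {"Delhi": 100.0}
batch = [
    ("insert", 2, "Kaithal", 40.0),
    ("insert", 3, "Delhi", 10.0),
    ("delete", 1, None, None),
]
refresh(base, totals, batch)
```

Only the changed rows are touched, which is the point of incremental techniques compared with extracting an entire source file or database.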

4. Utilization of Object Oriented Approach in Different Areas

4.1 Object Oriented Software

Object-oriented software is all about objects. An object is a "black box" which receives and sends messages. A black box actually contains code (sequences of computer instructions) and data (information which the instructions operate on). Traditionally, code and data have been kept apart. For example, in the C language, units of code are called functions, while units of data are called structures. Functions and structures are not formally connected in C. A C function can operate on more than one type of structure, and more than one function can operate on the same structure. Not so for object-oriented software! In o-o (object-oriented) programming, code and data are merged into a single indivisible thing -- an object. This has some big advantages, as you'll see in a moment. But first, here is why SDC developed the "black box" metaphor for an object. A primary rule of object-oriented programming is this: as the user of an object, you should never need to peek inside the box! [10].
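The "black box" idea can be shown with a small Python sketch. The class `SalesLedger` and its methods are invented for illustration: the data (a list of entries) and the code that operates on it live in one object, and users interact only through the messages `record` and `total`.

```python
class SalesLedger:
    """Code and data merged into one object; callers never look inside."""

    def __init__(self):
        # Internal data; the leading underscore marks it as private by convention.
        self._entries = []

    def record(self, amount):
        """Message: add one sale to the ledger."""
        if amount < 0:
            raise ValueError("amount must not be negative")
        self._entries.append(amount)

    def total(self):
        """Message: report the sum of all recorded sales."""
        return sum(self._entries)


ledger = SalesLedger()
ledger.record(10)
ledger.record(5.5)
ledger_total = ledger.total()
```

Because callers depend only on `record` and `total`, the internal list could later be replaced by another structure without changing any code that uses the ledger.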

4.2 Object Oriented DBMS

An object-oriented database management system (OODBMS), sometimes shortened to ODBMS for object database management system, is a database management system (DBMS) that supports the modelling and creation of data as objects. This includes some kind of support for classes of objects and the inheritance of class properties and methods by subclasses and their objects. There is currently no widely agreed-upon standard for what constitutes an OODBMS, and OODBMS products are considered to be still in their infancy. In the meantime, the object-relational database management system (ORDBMS), based on the idea that object-oriented database concepts can be superimposed on relational databases, is more commonly encountered in available products. An object-oriented database interface standard is being developed by an industry group, the Object Data Management Group (ODMG). The Object Management Group (OMG) has already standardized an object-oriented data brokering interface between systems in a network.

Figure 1: Object Relational Database

4.3 Object-Relational Database

an object environment to users of their system. In general, we can regard the O-RDBMS architecture as shown in figure 1. As in any other system, the O-R interfaces accept data requests and deliver the corresponding object data from the O-RDBMS to the applications. These interface components ensure transparent access from applications outside the system to the data stored in its databases. The object-relational engine is an object-based environment that bridges the object environment and the relational database. It manages not only the native SQL data types (such as integer, number, date, char) but also object data types, which are user-defined or system-predefined object types. Like classes in an object-oriented programming language, these object types include ‘attributes’ holding the data and ‘methods’ defining their behavior. Consequently, object-relational query languages tend to support user-defined functions and operators. [11]

For example, consider the following object-relational schema and a typical object-relational query [Ston97]:

Create EMP-OR (name=C12, age=int, salary=int, dept=C12, location=point, picture=image);

Select name
From EMP-OR
Where beard(picture) > 0.7 and age > 60 and location in circle(“10,10”, 5);

Compared with a traditional relation, the two additional fields, location and picture, hold data of two new data types: “geographic point” and “image”. In the query, “beard(picture)” and “in” are user-defined operators.
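To make the role of user-defined types and operators concrete, the following Python sketch mimics the query over in-memory objects. It is not an O-RDBMS: `Point`, `Image`, `Emp`, `beard` and `in_circle` are hypothetical stand-ins, the employee rows are invented, and the `beard_score` field simply replaces the image analysis that a real `beard` operator would perform.

```python
import math
from dataclasses import dataclass


@dataclass
class Point:
    """User-defined type standing in for the 'geographic point' data type."""
    x: float
    y: float


@dataclass
class Image:
    """User-defined type standing in for 'image'; holds a precomputed score."""
    beard_score: float


@dataclass
class Emp:
    name: str
    age: int
    salary: int
    dept: str
    location: Point
    picture: Image


def beard(picture):
    """User-defined function: how strongly the picture shows a beard (0 to 1)."""
    return picture.beard_score


def in_circle(point, center, radius):
    """User-defined operator: is the point inside the given circle?"""
    return math.hypot(point.x - center.x, point.y - center.y) <= radius


def bearded_seniors_nearby(employees):
    """Python counterpart of the Select ... Where query in the text."""
    return [
        e.name
        for e in employees
        if beard(e.picture) > 0.7
        and e.age > 60
        and in_circle(e.location, Point(10, 10), 5)
    ]


# Invented rows for illustration only.
employees = [
    Emp("Ram", 65, 1000, "sales", Point(12, 11), Image(0.9)),
    Emp("Sham", 62, 1200, "sales", Point(20, 20), Image(0.9)),
    Emp("Mohan", 45, 900, "hr", Point(10, 10), Image(0.95)),
    Emp("Gita", 70, 1500, "hr", Point(10, 12), Image(0.1)),
]
matches = bearded_seniors_nearby(employees)
```

Each condition in the filter corresponds to one predicate of the Where clause, so extending the query means adding another user-defined function rather than changing the engine.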

5. O-R Data warehouse

In this section we propose an O-R DW architecture, given in figure 2, based on the logical architecture proposed in [WuBu97]. The differences between these architectures are the object-oriented approach and the new metadata layer. With the object-oriented approach, every layer of this architecture except the “Data Store” layer consists of many objects of various object types, which perform the underlying functions of each component. In this architecture, the data flow is similar to that of other data warehouse architectures [KRRT98], [Fire97a], [Orr96], [WuBu97], where data is collected from diverse operational database systems, summarized, aggregated and integrated in a data warehouse, and used as read-only data to support complex analysis. This architecture consists of the components described as follows:

Figure 2: The ORDW Architecture

1. Application interface layer

of the data warehouse, the objects of this layer will provide functions to manage user services and to control the updating and maintenance processes of the data warehouse. That means new user services can be added in this layer to support new user requirements if needed.

2. Data Acquisition:

The Data Acquisition component can be considered as a tool that constructs the data engine of the data warehouse. The data acquisition objects will extract, transform and transfer data from different legacy operational data stores (ODS) to the data warehouse O-R database.

The functions of this component are divided into suitable sub-function levels that are performed by pattern object types; e.g., this component has various classes, such as Extracting Service, Transforming Service, Loading Service, etc.
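One possible way to organise such classes is sketched below in Python. The paper names the classes but not their interfaces, so the method names `extract`, `transform`, `load` and `run`, the coordinating class `DataAcquisition`, and the sample rows are all assumptions.

```python
class ExtractingService:
    """Reads raw rows from one operational data store (here, a plain list)."""

    def __init__(self, source):
        self.source = source

    def extract(self):
        return list(self.source)


class TransformingService:
    """Normalises raw rows into the form expected by the warehouse."""

    def transform(self, rows):
        return [
            {"city": row["city"].strip().title(), "amount": float(row["amount"])}
            for row in rows
        ]


class LoadingService:
    """Appends transformed rows to the warehouse store (here, a plain list)."""

    def __init__(self, store):
        self.store = store

    def load(self, rows):
        self.store.extend(rows)
        return len(rows)


class DataAcquisition:
    """Coordinates the three services: extract, then transform, then load."""

    def __init__(self, extractor, transformer, loader):
        self.extractor = extractor
        self.transformer = transformer
        self.loader = loader

    def run(self):
        rows = self.extractor.extract()
        return self.loader.load(self.transformer.transform(rows))


# Invented legacy rows for illustration only.
legacy_rows = [
    {"city": " delhi ", "amount": "100"},
    {"city": "KAITHAL", "amount": "40.5"},
]
warehouse_store = []
acquisition = DataAcquisition(
    ExtractingService(legacy_rows),
    TransformingService(),
    LoadingService(warehouse_store),
)
loaded_count = acquisition.run()
```

Because each step is its own object, a new source or a new transformation rule can be supported by swapping one service without touching the others.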

3. Data Warehouse Management:

As a component of the data management layer, this component directly accesses data of the data warehouse from the O-R database. It provides services, which bridge the application interface layer and the O-R database. In this component, different methods can be applied to access data stored in the O-R database. Furthermore, the database access methods can be updated or added to improve the performance of the data warehouse. The division of the data management layer into two individual components allows us to clearly distinguish between read-only data processing in data warehouse and data input processing. The functions of this component are mainly to read available data, and to create new materialized views based on this data.

4. Metadata

Treating metadata in an object-oriented way, we define the behaviors of metadata objects depending on their roles. For instance, a metadata object can itself count how often it is accessed, keep statistics of query usage, and so on. This means that many questions about warehouse operations can be answered easily by querying the metadata directly, e.g., how many reports were created in a day? How often is one kind of data used?
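A metadata object with this kind of behavior might look like the following Python sketch. The class `MetadataObject`, its method names and the sample dates are hypothetical; the point is only that the object keeps its own usage statistics and can answer questions about them.

```python
from collections import Counter


class MetadataObject:
    """Describes one kind of warehouse data and tracks how it is used."""

    def __init__(self, name):
        self.name = name
        self.access_count = 0            # how often this data was accessed
        self.accesses_by_day = Counter() # accesses grouped by day

    def record_access(self, day):
        """Called whenever a query or report touches the described data."""
        self.access_count += 1
        self.accesses_by_day[day] += 1

    def accesses_on(self, day):
        """Answer: how often was this data used on the given day?"""
        return self.accesses_by_day[day]


# Invented usage for illustration only.
sales_meta = MetadataObject("monthly_sales")
sales_meta.record_access("2020-03-01")
sales_meta.record_access("2020-03-01")
sales_meta.record_access("2020-03-02")
```

With this design, a question such as "how often is one kind of data used?" becomes a direct method call on the metadata object instead of an analysis of external logs.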

5. Data store

The data stored in an O-R DW differs primarily from that of a DW in a relational environment and from that of an object-oriented data warehouse. Depending on the requirements and data types, O-R DW designers can decide to model it as a “cube”, as in MOLAP (Multidimensional OLAP), or as an object hierarchy, as in O3LAP (Object-Oriented OLAP). For instance, in an O-R DW, simple data can be modeled in multidimensional structures similar to those used in relational database systems [KRRT98], [WuBu97], [Fire97b]. Complex data and user-defined data, on the other hand, can be modeled in object hierarchical structures as suggested for OODBMS [BeMa93]. Furthermore, the objects of any layer, particularly metadata objects, can be modeled in the O-R database.

6. Conclusion

References

[1] Bertino E. (1991): Method precomputation in object-oriented databases. Proceedings of ACM-SIGOIS and IEEE-TC-OA International Conference on Organizational Computing Systems.

[2] Bertino E., Catania B., Chiesa L. (1998): Definition and Analysis of Index Organizations for Object-Oriented Database Systems. Information Systems, Vol. 23, No. 2, pp. 65-108.

[3] Dobrovnik M., Eder J. (1994): Adding view support to ODMG-93. Proceedings of the International Workshop on Advances in Databases and Information Systems.

[4] Dobrovnik M., Eder J. (1996): Logical data independence and modularity through views in OODBMS. Proceedings of the Engineering Systems Design and Analysis Conference, Vol. 2, pp. 13-20.

[5] Dobrovnik M., Eder J. (1998): Partial Replication of Object-Oriented Databases. Proceedings of the Second East-European Conference on Advances in Databases and Information Systems, ADBIS'98. Poland, LNCS No. 1475, pp. 260-271.

[6] Eder J., Frank H., Liebhart W. (1994): Optimization of Object-Oriented Queries by Inverse Methods. Proceedings of East/West Database Workshop, Austria.

[7] Bertino E. et al. (1992): Object Oriented Query Languages: The Notion and Issues. IEEE TKDE, Vol. 4, No. 3, June.

[8] Gupta A., Mumick I.S. (1999): Materialized Views: Techniques, Implementations, and Applications. The MIT Press.

[9] Hansen G.W., Hansen J.V.: “Data Base Management System”.

[10] Montlick T. (1999): “Object Oriented Software”. Copyright by Software Design Consultants, LLC.
