By Narasimhaiah Gorla
In order for an organization to achieve competitive advantage, voluminous data needs to be managed, analyzed, and fed into the decision-making process. Data warehouses provide decision support to organizations with the help of analytical databases and online analytical processing (OLAP) tools. Incorporating OLAP tools into decision models as part of decision support systems improves decision making [10]. Decision makers can: access analyti- cal databases through an OLAP interface and are able to analyze corporate data on various dimensions; view corporate changes over a period of time, to obtain a macro view of the business operations as well as perform a microanalysis in a specific sub-function; perform various what-if analyses; and drill-down and discover the pattern of sales of certain products in a given period of time or find how the sales performance of an individual salesperson affects the company’s revenues.
These time-order/aggregation/disaggre- gation features provide decision makers valuable insights into the customer/busi- ness behavior, which are of fundamental importance for better decision making.
OLAP tools have benefited organizations in different ways. For example, Lockheed Martin has used OLAP tools in aircraft design and manufacturing data and cut their analyst labor costs up to 20% [4]. As a result of using data warehouse technol- ogy, First American Corporation has trans- formed itself and improved its financial performance from losses to profits [3].
Data warehouse technology in conjunction with OLAP has been useful in improving decision making in the community health care realm, as shown in [1].
However, despite potential benefits of data warehousing and OLAP tools, such projects were difficult to use and failed to realize benefits [9]. Corporations that invest in data warehouses often do not provide tools to end users that they can use easily, resulting in users not utilizing the tools, millions of dollars worth of unused software, and unrealized return on investment [8]. The most important determinants of new technology accep- tance are perceived ease of use (PEU) and perceived usefulness (PU) [6]. PU is defined as the degree to which a person believes that using a particular system
F EATURES TO
C ONSIDER IN A D ATA
W AREHOUSING S YSTEM
Evaluating
and assessing
the important
distinctions
between data
processing
capability
and data
currency.
would enhance his or her job performance. PEU is defined as the degree to which a person believes that using a particular system would be free of effort. In order to derive benefits from OLAP technology, it is important to assess whether the OLAP tools, as an integral part of data warehousing, help or hinder the usage by the end user. Thus, this article is intended: To find the effect of OLAP features on perceived easy of use (PEU) and the perceived usefulness (PU) of OLAP; to provide suggestions for appropriate contexts for use of ROLAP and MOLAP systems; and to pro- vide guidelines for better design of data warehouses with OLAP technology.
Data Warehouse and OLAP
Codd et al. first coined the term OLAP in 1993 as “the dynamic synthesis, analysis, and consolidation of large vol- umes of multidimensional data.” OLAP technology can organize data in multidimen- sional tables called data cubes and provides access to the data warehouse through an interactive GUI (see Figure 1).
Some of the common capabilities of OLAP include:
multidimensionality, aggregation, drill-down and roll-up (view detailed and aggregated data), and slic- ing and dicing.
The most common types of OLAP technology are Multidimensional OLAP
(MOLAP) and Relational OLAP (ROLAP). The differences between the two concern data process- ing capability and data currency [9]. In MOLAP, the data is cleaned, aggre- gated in multiple dimen- sions, and uploaded into a data cube periodically.
The data is stored in mul- tidimensional arrays [2], thus the database has precompiled organization and data arrays that can be accessed directly and
faster. In ROLAP, data is aggregated and stored along with relational databases. ROLAP relies on indices to be built on tables for data access. Users generate queries using SQL on the fly, offering more flexibility in query generation and data currency.
Research Methodology
Measures of PU and PEU. Data was collected for each feature of OLAP and ease of use/usefulness of OLAP system as perceived by users. Perceived Use- fulness is measured based on the potential OLAP benefits [6]: improves decision making, provides accurate analysis, provides all required information, improves working efficiency, and increases user pro- ductivity. Perceived Ease of Use is measured based on whether users feel that learning OLAP was easy, the system was user-friendly, OLAP was easy to use, and it was easy to get information.
Measures of OLAP Features. Seven types of tools or features are normally offered in an OLAP system. Based on previous literature, each feature and its components are described here.
Visualization allows users to create summary tables and charts interactively. This is measured using the presence of multidimensional tables and graphics.
Summarization refers to
“degree of aggregation” of infor- mation. We measure this fea- ture using number of hierarchies allowed, level of detail, and the capability to swap between summarized and detailed levels.
Navigation refers to its capa- bility to drill-down or drill-up between levels of detail. This is measured by shareability (num- ber of concurrent users allowed), data navigatability (availability of drill-down, slic- ing-dicing, and drag-drop facil- ities), and ability to extract detailed and real-time data.
Query Function: Query engines extract data from multidimensional data- bases and generate outputs in 3D graphics. This is measured using preconstructed query capability,
ROLAP DATA
ROLAP DATA OLAP Server User Interface
Data
Warehouse Operational Database
Figure 1. Data warehouse and OLAP.
Visualization
Navigation
Query
Performance Sophisticated
Analysis
-.48
+60 +.39 +.38
+79 Summarization +.39
Dimensionality
Only significant relationships shown Ease of
Use
Usefulness
Figure 2. MOLAP model.
simple query building with click-select feature, query building with query languages, and concur- rent run of queries.
Sophisticated Analysis: This feature is measured by six most common types of analyses used in decision support: statistical profiling (for example, list cus- tomers with highest combined sales); moving aver- ages; cross dimension comparison (compare product sales by region over a period of time); queries with self-defined formula; exception condition; and what- if analysis.
Dimensionality is mea- sured using the number of allowable dimensions, capabil- ity to redefine dimension, and time for data refresh after redefinition.
Performance includes response times for four basic functions:
standard report generation, customized report generation, graphic/chart generation, and data navigation.
Data Collection. To examine the effect of OLAP features on perceived ease of use (PEU) and perceived use-
fulness (PU), a questionnaire-based survey was con- ducted in Hong Kong. The questionnaire considered user demographics, measures of PEU and PU, and features of OLAP in place. Users were queried about their positions, departments, and OLAP systems they used. Questions regarding PEU and PU, for example, “OLAP system increases my productivity,” used a five-point Likert scale ranging from 1 (strongly agree) to 5 (strongly disagree).
Questions regarding OLAP features inquired about their satisfaction, for example, “Flexibility to swap between summarized and detailed data” (1=strongly unsatisfied to 5=very satisfied). Alternatively, the respondents for each feature may choose “not used”
and “not applicable,” as appropriate. The question- naire was sent to two groups of people—ROLAP and MOLAP users. Seventy-eight questionnaires were sent to four companies with two of each using ROLAP or MOLAP systems. Approximately 58 questionnaires were returned providing a 74%
return rate. Pearson correlation analysis was used to examine the relationship between OLAP features and PEU and PU.
Four companies were selected for the survey, two of which used MOLAP software and the other two used ROLAP-related software. The companies using MOLAP used either Cognos Software’s PowerPlay
or Oracle Express from Oracle Corporation. Power- Play software stores the analytical data in multidi- mensional data sets called PowerCubes that are stored either on clients or on servers and are updated periodically by running a batch job. The PowerPlay analytical engine is aided by Impromptu reporting system and Visualizer visualization technology. Ora- cle Express also stores data in multidimensional
“physical cubes” and allows users to “slice and dice”
the data cubes. The companies using ROLAP used either Business Objects or had modules that were custom- developed internally using SQLBase RDBMS. When employing Business Objects software, a user submits a request for information through a semantic layer, which is converted to an SQL statement submitted to the database engine that accesses relational database and returns the result that is transformed into a cube for the user.
Results and Discussion The significant relationships between OLAP features and PEU and PU for MOLAP and ROLAP systems are shown in Figures 2 and 3, respectively. All features (except Query function) of ROLAP are perceived as useful. On the contrary, only two features (Visual- ization and Summarization) of MOLAP are per- ceived to be useful. Furthermore, in a ROLAP system, PEU is significantly related to PU; thus, when the ROLAP features are perceived as easy to use and user-friendly, it positively impacted the use- fulness of ROLAP.
The visualization feature has a positive effect on ease of use with ROLAP software, while it has a neg- ative effect on ease of use with MOLAP. Visualiza- tion features are less prevalent in ROLAP, so any improvements in visualization with help of graphical user interfaces and help menus aided ease of use in ROLAP. On the other hand, MOLAP systems usu- ally have adequate visualization effects. Cognos’
MOLAP system, PowerPlay, presents data to users in a variety of modes, such as cross-tabs, pie charts, and graphs using Visualizer technology. In addition, users can change various visualization effects, such as, colors, formats, fonts, and so forth. It is possible that excessive presence of visualization effects in MOLAP could confuse users, resulting in a negative
Visualization
Query
Performance Sophisticated
Analysis
+.61
+.48 +.62
+.66 +.74
+.60
+.60
+.79 Summarization +.75
Dimensionality
Only significant relationships shown Ease of
Use
Usefulness Navigation
Figure 3. ROLAP model.
relationship with PEU. The visualization features of ROLAP and MOLAP have positive significant effects on the usefulness of the OLAP tools.
The summarization feature has a positive signifi- cant relationship with both PEU and PU in ROLAP and MOLAP. This implies that with increasing number of permissible detail-levels and flexibility in swapping between levels, the use of OLAP will improve.
The data navigation feature has a significant posi- tive effect on PEU in MOLAP. Since there are only limited levels available for drill-down and slice-dice, MOLAP allowed users to navigate easily. This limita- tion of MOLAP resulted in a nonsignificant relation- ship with PU. The situation is the reverse for ROLAP.
The Mercury of Business Objects, a ROLAP system, lets the users define their own dimensions, lets them perform queries at various levels of detail, and offers various reporting facilities. Since these flexible naviga- tion facilities (real-time data access, detail data extrac- tion, or drill-down) are possible for ROLAP, this feature has a positive effect on PU.
The Query function showed a significant posi- tive relationship only with PEU for MOLAP. Since all reports have been predesigned in MOLAP, users need only to click and select the report.
Impromptu, a companion of Cognos’ PowerPlay, is easy-to-use software, but the data cube has to be built by either a database administrator or a data- base analyst. This predefined data cube may not meet the query needs of a user, in which case, the user needs to wait for the database specialist to modify the data cube. Although MOLAP is easy to use, users did not find it useful because of its lack of flexibility.
Sophisticated analysis has a significant positive effect on PU in ROLAP and not on PU in MOLAP.
This is because ROLAP provided users with more useful functions: ad hoc queries down to detail data, customized reports, and what-if analysis. The Set Analyzer of Business Objects allows users to build complex queries from large databases as index tables, thereby enabling users to build sophisticated and flexible queries that also run quickly. Set Ana- lyzer allows users to maintain a hierarchy of evolv- ing queries, giving them the capability to perform sophisticated analyses.
Dimensionality for ROLAP systems has a signif- icant effect on PU. Since ROLAP systems operate on transactional data, users could get current data in their required dimensions. In MOLAP systems, pre-aggregation has limited the flexibility of chang- ing the definition of dimensions, resulting in users not perceiving it as useful. Oracle Express allowed
users to create relationships among the existing dimensions and to define the top-level dimension.
However, the users did not perceive these facilities as relevant or useful.
The positive correlation between Performance and PU signifies the importance of system perfor- mance in ROLAP. Since it takes time to execute the SQL queries for manipulating voluminous data, users perceived performance to be critical. With ROLAP systems (for example, Business Objects), large amounts of data are queried by the clients against large data sets—this further results in increase in network traffic, leading to high response times of queries.
Choice Between MOLAP and ROLAP This study evaluated OLAP tools for ease of using the system and for usefulness. Following are some guidelines in choosing between MOLAP or ROLAP:
• Choose MOLAP for non-sophisticated computer users and ROLAP for the sophisticated users.
Our study found more features of MOLAP have positive effects on ease of use, compared to those of ROLAP.
• Users who use only preset reports and have no need to monitor the daily transaction data could deploy a MOLAP system. On the other hand, users that need to analyze the market information regularly would require a ROLAP system; it is suitable for the retailing industry or manufactur- ers with a variety of products and a large volume of data.
• If the information needs of users are relatively consistent over a period of time, MOLAP is pre- ferred. If the requirements change frequently, ROLAP should be adopted because of its flexible query capability.
• Since MOLAP uses a multidimensional data cube that is generated periodically, the data is not cur- rent. Hence, MOLAP should be used where data is relatively nonvolatile. Customers can use MOLAP for inquiring about the products, their descriptions, and prices. For a volatile data envi- ronment, for example, as in sales transaction data, they would need more current data than is possi- ble through a ROLAP system.
• In the initial stages of adoption of OLAP tech- nology in organizations, MOLAP systems are rec- ommended because of their ease of use. After considerable experience, a ROLAP system is pre- ferred because of its flexibility and ability to han- dle complex queries.
Effective OLAP for Data Warehouses Based on the OLAP users’ perception, our findings indicate MOLAP tools make the data warehouse sys- tem easy to use but not useful; ROLAP tools make the data warehouse useful but not easy to use. Sug- gestions for improving the design of data warehouses with OLAP include:
Do proper planning : Because the system designs for MOLAP and ROLAP systems are quite different, IT professionals should be aware of this in require- ment planning. User requirements for MOLAP sys- tems should be clearly defined in advance so that pre-aggregated formats can be set appropriately.
Make ROLAP user-friendly: The flexibility of ROLAP system should be complemented with easy- to-use features. Software vendors should design ROLAP tools using better GUI and drag-drop tech- nologies, so that the software is more user-friendly.
Align IT strategy with business: OLAP tools should be designed considering alignment of IT strategy with business strategy [7]. First American corpora- tion implemented a data warehouse that is aligned with its business strategy and improved financial per- formance [3]. By determining information needs based on the proper alignment, OLAP tools can be made more useful for organizations and individuals.
This is especially true in case of MOLAP tools, since only a few features are related to PU.
Physical data warehouse design: Better physical data warehouse design is needed in order to improve the performance of ROLAP tools. Data warehouses may be designed integrating the ROLAP relational structure and the MOLAP multidimensional cube—one way to implement this is by using a dense-region-based data cube [2]. Performance of data warehouses can also be improved by using phys- ical design techniques, such as partitioning and access method selection [12] and parallel query pro- cessing techniques [5].
Personalize : OLAP tools should be personaliz- able. Personalization is an evolutionary concept in designing personal end-user tools [11]. This may be done by unbundling the features of OLAP and pro- viding the software interface to the user that will allow access to a set of OLAP tools selected depend- ing on the skill level and the information needs of the specific user. This will improve both ease of use and usefulness of the system.
Integrate ROLAP and MOLAP : Data warehouses should include both ROLAP and OLAP in an inte- grated fashion, since information needs generally comprise both batch output and online inquiries.
Batch outputs could be done with MOLAP, while online ad hoc needs can be met with ROLAP tools.
Integrate OLAP with decision models: In order to make data warehouses and the associated OLAP tools more useful for decision support, analyses need to be made of the decisions to be supported, the decision processes involved, and the relevant deci- sion models. Using decision-making processes and decision models, appropriate queries can be designed and incorporated into OLAP tools, thereby benefit- ing decision makers.
Improve data currency: Since a drawback of MOLAP is not having current data in its database, these data warehouses should be updated as fre- quently as possible, which will ensure the outputs from the data warehouse are more current. However, updating the data warehouses is time consuming and costly. So, an optimal updating frequency should be computed and used in practice.
Use data mining to improve OLAP : Data mining can extract rules based on historical data. By using these rules, the materialized views for OLAP can be designed. Since these rules are extracted from previ- ous transaction profile, the predesigned queries or materialized views in MOLAP tend to be more use- ful. Furthermore, by using data-mining rules, indexes can be selected intelligently for ROLAP.
References
1. Berndt, D.J., Hevner, A.R., and Studnicki, J. The Catch data ware- house: Support for community health care decision-making. Decision Support Systems 35, 3 (June 2003), 367–384.
2. Cheung, D.W., et al. Towards the building of a dense-region-based OLAP system. Data and Knowledge Engineering 36, (2001), 1–27.
3. Cooper, B.L., et al. Data warehousing supports corporate strategy at First American Corporation. MIS Quarterly 24, 4 (Dec. 2000), 547–567.
4. Cope, J. New tools help Lockheed Martin prepare for takeoff. Com- puterworld (Mar. 17, 2000).
5. Datta, A., VanderMeer, D., and Ramamritham, K. Parallel star join + DataIndexes: Efficient query processing in data warehouses and OLAP.
IEEE Trans. On Knowledge and Data Engineering 14, 6 (Nov./Dec.
2002), 1299–1316.
6. Davis, D.G. Perceived usefulness, perceived ease of use, and user accep- tance of information technology. MIS Quarterly, (Sept. 1989), 319–339.
7. Gardner, S.R. Building the data warehouse. Commun. ACM 41, 9 (Sept. 1998), 52–60.
8. Glassey, K. Seducing the end user. Commun. ACM 41, 9 (Sept. 1998), 62–69.
9. Hasan, H. and Hyland, P. Using OLAP and mltidimensional data for decision making. IT Pro, (Sept./Oct. 2001), 44–50.
10. Koutsoukis, N., Mitra, G., and Lucas, C. Adapting on-line analytical processing for decision modelling: The interaction of information and decision technologies. Decision Support Systems 26, (1999), 1–30.
11. Riecken, D. Personal end-user tools. Commun. ACM 43, 8 (Aug.
2000), 89–91.
12. Song, S. and Gorla, N. A transaction-based genetic algorithm approach to vertical fragmentation in relational databases. The Computer Journal, 43, 1 (2000), 81–93.
Narasimhaiah Gorla ([email protected]) is an associate professor of IS at Wayne State University in Detroit, MI.
© 2003 ACM 0002-0782/03/1100 $5.00
c