IMPACT OF A DATA WAREHOUSE MODEL FOR IMPROVED DECISION-MAKING PROCESS IN HEALTHCARE

Pubudika Kumari Mawilmada

BBus (IT Management), MIT

Submitted in fulfilment of the requirements for the degree of Master of Information Technology (Research)

Computer Science Discipline Faculty of Science and Technology Queensland University of Technology

Keywords

Cardiology, Clinical Decision Support Systems, Data marts, Data warehouse, Decision-making, Information systems, Healthcare, Star schema, Snowflake schema.

Abstract

The health system is one sector dealing with a deluge of complex data. Many healthcare organisations struggle to utilise these volumes of health data effectively and efficiently. Many also still rely on stand-alone systems that are not integrated for information management and decision-making. This shows that there is a need for an effective system to capture, collate and distribute health data. Implementing the data warehouse concept in healthcare is therefore one potential solution for integrating health data. Data warehousing has been used to support business intelligence and decision-making in many other sectors, such as engineering, defence and retail.

The research problem addressed is: “how can data warehousing assist the decision-making process in healthcare”. To address this problem, the researcher narrowed the investigation to a cardiac surgery unit, using the cardiac surgery unit at The Prince Charles Hospital (TPCH) as the case study. The cardiac surgery unit at TPCH uses a stand-alone database of patient clinical data, which supports clinical audit, service management and research functions. However, much of the time, the interaction between the cardiac surgery unit information system and those of other units is minimal. There is only a limited and basic two-way interaction with the other clinical and administrative databases at TPCH that support decision-making processes. The aims of this research are to investigate what decision-making issues healthcare professionals face with the current information systems, and how decision-making might be improved within this healthcare setting by implementing an aligned data warehouse model or models. As part of the research, the researcher proposes and develops a suitable data warehouse prototype based on the needs of the cardiac surgery unit, integrating the Intensive Care Unit database, the Clinical Costing unit database (Transition II) and the Quality and Safety unit database [electronic discharge summary (e-DS)]. The goal is to improve the current decision-making processes. The main objectives of this research are to improve access to integrated clinical and financial data, providing potentially better information for decision-making for both improved management and patient care, and to provide greater efficiency in supporting current similar processes.

The methodology used to conduct this research consisted of five stages. The first stage reviewed the literature to establish background knowledge about data warehousing and to identify different data warehouse models, the factors leading to model selection and the application of the data warehouse concept in the healthcare environment. In the second stage, a survey was conducted to gather information on the current data repositories, the current decision-making process, current decision-making issues and the requirements for data warehouse prototype development. The main survey methods used were a questionnaire and unstructured interviews. A total of ten questionnaires were distributed to stakeholders in the cardiac surgical decision-making processes. The questionnaire consisted of twelve questions producing data for four categories of inquiry, namely: current data repositories, decision-making process, current issues, and data storage and analysis needs. An 80% response rate was achieved (8 out of 10). Although 30% (3 of 10) did not wish to participate further, 70% (7 of 10) contributed to subsequent unstructured interviews used to clarify and extend the survey results. These were analysed thematically and a number of decision-making knowledge gaps were ascertained. The survey and literature review data were then integrated to select a model. In the third stage, the model prototype was developed, and in the fourth stage the integrated data was analysed and information products were created. Finally, the information products were reviewed by hospital staff and feedback was obtained to evaluate the utility of the warehouse prototype.

According to the survey conducted in this research, it is apparent that end users (clinicians, unit managers and data managers from the cardiac surgery, ICU, quality and safety, and clinical costing units) have limited access to data repositories other than their own database. For instance, most of the time clinicians or unit managers have to contact data custodians to extract and collate information from other data repositories. They then have to integrate the data manually prior to analysis and reporting. This limits the interaction between the ICU, cardiac surgery (CARPIA), quality and safety (e-DS), and clinical costing unit databases.

All these issues create inefficiencies in the decision-making process. After analysis of further data from the questionnaire, the user requirements were summarised for the data warehouse prototype development. Based on the analysed questionnaire results and the literature, the results indicate that a centralised data warehouse model suits the cardiac surgery unit at this stage. A centralised data warehouse model addresses current needs and can also be upgraded to an enterprise wide warehouse model or federated data warehouse model, as discussed in many of the publications consulted. The data warehouse prototype was developed using SAS enterprise data integration studio 4.2 and the data was analysed using SAS enterprise edition 4.3. In the final stage, the data warehouse prototype was evaluated by collecting feedback from the end users. This was achieved by using output created from the data warehouse prototype as examples of the data desired and possible in a data warehouse environment. According to the feedback collected from the end users, a data warehouse was seen as a useful tool to inform management options, provide a more complete representation of the factors related to a decision scenario and potentially reduce information product development time.

However, many constraints exist in this research. Examples include technical issues such as data incompatibilities and the integration of the cardiac surgery database and e-DS database servers; Queensland Health information restrictions (information related policies, patient data confidentiality and ethics requirements); limited availability of support from IT technical staff; and time restrictions. These factors have influenced the process for the warehouse model development, necessitating an incremental approach. This highlights the presence of many practical barriers to data warehousing and integration at the clinical service level. Limitations included the use of a small convenience sample of survey respondents, and a single site case report study design.

As mentioned previously, the proposed data warehouse is a prototype and was developed using only four database repositories. Despite this constraint, the research demonstrates that by implementing a data warehouse at the service level, decision-making is supported and data quality issues related to access and availability can be reduced, providing many benefits. Output reports produced from the data warehouse prototype demonstrated usefulness for the improvement of decision-making in the management of clinical services, and quality and safety monitoring for better clinical care. However, in the future, the centralised model selected can be upgraded to an enterprise wide architecture by integrating with additional hospital units’ databases.

Table of Contents

Keywords
Abstract
Table of Contents
List of Figures
List of Tables
List of Abbreviations
Statement of Original Authorship
Acknowledgments
Dedication

CHAPTER 1: INTRODUCTION
1.1 Research background
1.2 Problem
1.3 Research questions
1.4 Significance, Scope and Definitions
1.5 Thesis outline

CHAPTER 2: LITERATURE REVIEW
2.1 Review methodology
2.1.1 Literature search sources
2.1.2 Information search strategies
2.2 Background theory
2.2.1 The data warehouse concept
2.2.2 Main components of the data warehouse
2.2.3 Data warehouse modelling
2.2.4 Data warehouse methodologies
2.2.5 Data warehouse lifecycle
2.2.6 Operational systems vs data warehouses
2.2.7 Data marts
2.3 Different types of data warehouse models
2.3.1 Centralised data warehouse
2.3.2 Independent data marts
2.3.3 Federated architecture
2.3.4 Hub and spoke architecture
2.3.5 Data mart bus architecture
2.4 Data warehouse architecture/model selection factors
2.5 Health information management
2.5.1 Healthcare decision-making
2.5.2 Healthcare information systems and decision-making
2.6 Data warehousing and healthcare
2.6.1 Data warehouse implementation examples
2.6.2 Data warehouse implementation challenges

CHAPTER 3: RESEARCH DESIGN
3.1 Methodology and Research Design
3.1.1 Methodology
3.1.2 Research Design
3.2 Participants
3.3 Instruments
3.4 Procedure and Timeline
3.5 Analysis
3.6 Ethics and Limitations
3.7 Intellectual Property Rights
3.8 Health and safety

CHAPTER 4: RESULTS ANALYSIS
4.1 Current decision-making process
4.2 Decision-making issues
4.3 Application development requirements analysis

CHAPTER 5: DATA WAREHOUSE PROTOTYPE DEVELOPMENT
5.1 Business intelligence tools
5.1.1 SAS/Warehouse Administrator 4.3
5.1.2 SAS data integration studio
5.2 Data analysis tools
5.2.2 SAS enterprise guide
5.3 Cardiac surgery data warehouse prototype selection and development
5.3.1 Model selection Rationale
5.3.2 Development process
5.4 Data analysis using the data warehouse prototype
5.5 Data warehouse prototype evaluation

CHAPTER 6: DISCUSSION
6.1 Limitations of the study

CHAPTER 7: CONCLUSION
7.1 Recommendations and future directions

BIBLIOGRAPHY

APPENDICES
Appendix A: Questionnaire

List of Figures

Figure 1: Components of the data warehouse
Figure 2: Multidimensional data
Figure 3: Star schema data model
Figure 4: Snowflakes schema data model
Figure 5: Data warehouse system life cycle
Figure 6: Data warehouse architectural types
Figure 7: Different types of data warehouse architectures
Figure 8: Results of the survey
Figure 9: The distribution of the architectures
Figure 10: Research model for data warehouse architecture selection
Figure 11: An integrated model for data warehouse architecture selection
Figure 12: Decision-making levels within an organisation
Figure 13: Timelining Health Information Systems Evaluation
Figure 14: Advantages and disadvantages of data integration architectures
Figure 15: Current use of BI/CI by healthcare organisations
Figure 16: Top 3 barriers to the use of business/clinical intelligence applications
Figure 17: Top 3 IT challenges to implementing/deploying business intelligence applications
Figure 18: Current support from the IS’s for decision making
Figure 19: Decision-making issues with current IS’s
Figure 20: Data quality issues in current decision-making process
Figure 21: Security and privacy concerns for DW prototype development
Figure 22: VHA corporate data warehouse visual architecture
Figure 23: Medical federated data warehouse model
Figure 24: CDW architecture for traditional Chinese medicine
Figure 25: Proposed data warehouse model for the TPCH Cardiac surgery unit
Figure 26: Risk score star schema
Figure 27: Cost star schema
Figure 28: Cardiac Surgery unit data warehouse model
Figure 29: Comparison of risk scores – group by PREDMORT
Figure 30: Interaction of risk scores
Figure 31: The actual expenditure per episode of care according to the certain clinical group
Figure 32: Cost of reoperation for bleeding as an example of post operational complications

List of Tables

Table 1: Literature search sources
Table 2: Comparison of data warehouse with OLTP systems
Table 3: Differences between data mart and data warehouse
Table 4: Combined reasons for data warehouse failure
Table 5: Methodology stages
Table 6: Decisions/ Problems would like to address by end users

List of Abbreviations

BI: Business Intelligence
CDSS: Clinical Decision Support Systems
CI: Clinical Intelligence
CIO: Chief Information Officer
CMS: Centers for Medicare & Medicaid Services
DM: Data Marts
DW: Data Warehouse
e-DS: Electronic Discharge Summary
FED: Federated Data Warehouse
HBCIS: Hospital Based Corporate Information System
ICU: Intensive Care Unit
IDM: Independent Data Marts
IS: Information Systems
IT: Information Technology
ITI: Information Technology Infrastructure
OIPT: Organizational Information Processing Theories
OLAP: Online Analytical Processing
OLTP: Online Transaction Processing
RCT: Randomised Controlled Trials
TPCH: The Prince Charles Hospital
VHA: Veterans Health Administration

Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signature: _________________________

Acknowledgments

I would like to thank my principal supervisor, Dr. Tony Sahama, for the guidance and support given to me in conducting this research. I would also like to thank my associate supervisor Craig Huxley for his guidance and advice. I would like to acknowledge my associate supervisor Susan Smith for the advice and support given throughout this project. I appreciated her assistance and guidance, and her patience in answering my questions. To The Prince Charles Hospital cardiac surgery unit data managers Gai Harris, Lesley Drake and Kay Watson, Clinical costing manager Allan Rowe, Senior costing officer Diana Lal, Applications infrastructure manager Brad Day and ICU health information manager Lynette Munck: I would like to thank you for supporting my research project in many ways.

Dedication

This thesis is dedicated to my parents U.B. Mawilmada and N.K. Mawilmada

Chapter 1: Introduction

This chapter outlines the background (section 1.1), the research problem addressed by the research project (section 1.2) and the research questions (section 1.3). Section 1.4 describes the significance and scope of this research. Finally, section 1.5 includes an outline of the remaining chapters of the thesis.

1.1 RESEARCH BACKGROUND

“Healthcare is an information intensive business generating huge volumes of data from hospitals, primary care surgeries, clinics and laboratories” (Grimson, Grimson, & Hasselbring, 2000, p. 49). According to Sahama and Croll (2007), data acquisition and the distribution of information create a challenging situation for people engaged in the medical sector. Information Technology (IT) today plays a major role in healthcare through the introduction of systems such as electronic health records and telemedicine. Integration of stand-alone systems would benefit health organisations. However, many healthcare organisations still have stand-alone Information Systems (IS) (de Mul, Alons, van der Velde, Konings, Bakker, & Hazelzet, 2010). Integrating stand-alone systems will become a more complex task as stored data is increasingly used for decision-making in clinical care, quality assurance, research and management (de Mul et al., 2010). Jani, Davis and Fox (2007) stated that, despite recent advances in database development, their impact is limited because there are few opportunities to link these databases. Although many clinical ISs have been designed or are available, most benefit hands-on care for individual patients in transactional systems rather than supporting the analysis of data (de Mul et al., 2010; Sanders & Protti, 2008). As stated by Albert, Walter, Arnrich, Hassanein, Rosendahl, Bauer and Ennker (2004, p. 312), “Clinicians are encouraged to improve their methods of investigation and analysis of outcomes, which still tend to be underdeveloped in comparison to methods available in industry”.

1.2 PROBLEM

The problem to be addressed is: “how does data warehousing assist the decision-making process in healthcare”. To address this problem, the scope of the research was narrowed to an investigation focusing on a cardiac surgical unit. Arigon et al. (2007) describe the data used in cardiac surgery as consisting of alphanumeric data, images and signals. These may come from a number of data repositories. The analysis environment for such data must include processing methods in order to compute or extract the knowledge embedded in the raw data (Arigon et al., 2007). A data warehouse is a potential solution which may provide a better environment for the analysis of these data.

All clinical care units are accountable for providing quality of care. Many models of quality measures have been developed (de Mul et al., 2010). However, these sometimes require complex queries to analyse the data, which is a time consuming process. Moreover, as stated by Albert et al. (2004), predictive models often cannot consider all patient characteristics, and do not include non-patient related factors. Therefore, there is a need for a system to analyse cardiac data from different perspectives. However, most of the time cardiac information systems, such as those for cardiac surgery, have minimal interaction with other units. By combining the cardiac surgery unit data repository with those of clinical units such as the Intensive Care Unit (ICU) and anaesthesia, and with financial units, clinicians could gain more benefits. The implementation of a data warehouse concept is one potential solution to efficiently facilitate easy analysis of data (de Mul et al., 2010).

Finally, although most clinicians believe that using the data warehouse concept in cardiac surgery units can lead to efficient decision-making, high quality patient care and safer processes, only a small proportion of this technology has been adopted (de Mul et al., 2010).

This research has used the cardiac surgery unit at the Prince Charles Hospital (TPCH) as a case study. The cardiac surgery program at the Prince Charles Hospital uses a stand-alone database of patient clinical data, which supports Clinical Audit, Service Management and Research functions. There is a limited two way interaction with other clinical and administrative databases at TPCH to support these decision-making processes. This research aims to propose a suitable data warehouse model for the cardiac surgery unit at TPCH, in order to improve the decision-making process.

The main databases employed to develop a data warehouse prototype are the cardiac surgery register database (CARPIA), the ICU database, a quality and safety unit database and the enterprise clinical costing unit database. The cardiac surgery register database stores cardiac surgical patients’ demographic data, patient history, preoperative data, procedural (surgical) data, post-operative outcomes data, test results, diagnoses, risk scores and so on. The data for this database is derived from several sources; however, most data are collected and entered manually into the system by trained clinical data managers. Some basic patient information is derived from the Hospital Based Corporate Information System (HBCIS), which is the enterprise hospital patient administration system. The pathology system and the main theatre information system also provide information for the CARPIA database. The Quality and Safety unit database of interest is known as the electronic Discharge Summary database (e-DS). This database contains hospital wide discharge summaries of all patients; it is a small transactional database deriving information from HBCIS and clinician entry.

The Clinical Costing unit already employs the State level enterprise data warehouse known as Transition II. The main data sources for this database are HBCIS and the other management feeder systems, such as the Emergency Department Information System (EDIS), the Operating Room Management Information System (ORMIS), the Enterprise pathology results information system (Auslab) and the Trendcare system (patient-nurse dependency). The Transition II database manages data at three levels: the financial level, the departmental level and the patient level, although little actual clinical data are captured. The ICU database contains data on patients admitted to the ICU. Manually entered data are the main source of information for this database and include patient clinical data such as morbidity scores, risk scores, procedural data and physiological measurements.

1.3 RESEARCH QUESTIONS

One of the main aims of this research is to develop background knowledge of data warehousing and its application to healthcare. Data warehousing plays a major role in businesses today in contributing to improved decision-making. As in other businesses, the data warehouse concept is also becoming popular in the healthcare industry, as making appropriate, well informed decisions is the basis of effective healthcare, which will lead to improvements in the quality of service and reduce the costs of healthcare. However, there are still many healthcare organisations which have disparate information systems that are not integrated and do not support improved decision-making processes. Therefore, it is important to identify the issues with the current information systems that relate to the impediment of better decision-making and to the potential for improvement. Hence, the first question asked would be:

“What decision-making issues exist or are faced by healthcare professionals with the current information systems?”

There are different alternatives of data warehouse architecture available which support various decision-making structures and purposes. Therefore, it is important to consider selection of a suitable data warehouse model, which will facilitate quality decisions in the Cardiac Surgical context. This will be the key to the next question:

“How might decision-making be improved within healthcare services by implementing a more aligned data warehousing model or models?”

This research will develop a suitable data warehouse model for the Cardiac surgery unit at The Prince Charles Hospital, in order to improve decision-making processes.

1.4 SIGNIFICANCE, SCOPE AND DEFINITIONS

This research presents four different outcomes. As discussed above, a data warehouse prototype will be developed for the Cardiac surgery unit at the Prince Charles Hospital. This will:

• improve access to administrative, financial and clinical information.

• potentially improve decision-making for the management of the clinical services.

• potentially improve quality and safety monitoring to assist healthcare accountability and better clinical care.

1.5 THESIS OUTLINE

Chapter 1 provides details about the research background, the research problem, its purpose and the outcomes of the research. Four outcomes are highlighted as part of the completion of this project.

Chapter 2 presents the review of literature on data warehousing, including different data warehouse architectural types, how a data warehouse differs from operational systems and data marts, data warehouse modelling, and data warehouse model selection factors. Furthermore, this chapter provides details on healthcare information management, healthcare decision-making issues and the application of the data warehouse concept in healthcare, with some examples.

Chapter 3 describes the research design of this research project. It covers research methodology, research design, participants and instruments used in the research. The research methodology consists of five stages and each stage is explained in detail.

Chapter 4 presents the analysis of the survey findings. It covers the current decision-making process, issues related to the current decision-making process and also identifies the user requirements for data warehouse prototype development.

Chapter 5 presents the cardiac surgery data warehouse prototype development. Firstly, it briefly describes the business intelligence tools used to develop the data warehouse prototype and the benefits of those tools. The next section explains the data warehouse development steps using the SAS data integration studio 4.2 software.

Chapter 6 provides a discussion of survey results analysis. This chapter contains a full discussion and evaluation of the results with reference to the literature and the limitations.

Chapter 7 concludes the thesis by providing information on the research process, its benefits to TPCH cardiac surgical unit, constraints and limitations faced during the project and recommendations and future directions.

Chapter 2: Literature Review

This chapter reviews the literature on the following topics. The first section (2.1) gives a brief introduction to the review methodology, covering the literature search sources (2.1.1) and information search strategies (2.1.2). The second section (2.2) discusses the background theory of data warehousing in general and gives detailed information about data warehouse components, data warehouse modelling, data marts and how data warehouses differ from operational systems. The third section (2.3) discusses different types of data warehouse models, and the next section (2.4) identifies the factors and issues related to data warehouse model selection. The fifth section (2.5) reviews the literature on health information management. This covers decision-making and issues related to healthcare and healthcare information systems. The following section (2.6) discusses the data warehouse concept in healthcare, some real examples of data warehouse implementation and its benefits. Section 2.7 studies the implications from the literature and develops a framework for the research.

2.1 REVIEW METHODOLOGY

2.1.1 LITERATURE SEARCH SOURCES

Many information sources were used to search the literature widely. The primary literature search sources used were publisher databases. The publisher databases provide information from many formal sources such as journal articles, research papers and conference papers. They also provide a major source of traditional academic information. Most of the information sources from the Queensland University of Technology (QUT) library are stored as books, magazines and e-books. Moreover, general web search engines such as Google Scholar, Google, Scirus and Infopeople provide important e-books and peer reviewed articles, as well as non-peer reviewed industry and ‘grey’ literature related to the research field. The following table shows the information sources that were used.

Table 1: Literature search sources

Search material | Source type | Main information source
Journal articles and conference papers | Databases | ScienceDirect, Web of Science, ACM portal, SpringerLink, ProQuest, IEEE Xplore, CiteSeerX, EBSCO host, Elsevier, JAMA
Books | Libraries | Queensland University of Technology
Books | Online providers | Google books
Case studies | | Australian Digital Thesis
Web sites | Web search engines | www.google.com, http://scholar.google.com, www.scirus.com, www.infopeople.org, http://au.search.yahoo.com, http://www.bing.com, http://au.altavista.com, http://www.webwombat.com, http://www.dwinfocenter.org/getstart.html
Web groups | Web search engines | http://www.technologyreview.com/blog/, http://blog.kalido.com/, http://tdwi.org/, http://www.information-management.com/, http://www.sas.com/, http://www.bi-dw.info/, http://www.dwaa.org.au/layout-8.html

2.1.2 INFORMATION SEARCH STRATEGIES

Many strategies were used to search widely for information related to the research topic and research questions. The search terms “data warehouse”, “data warehousing” and “data integration” were used to find basic articles about data warehouses. These searches returned a number of articles. The next step was to combine the initial terms with other terms such as “healthcare”, “decision-making” and “models” to narrow down the search. Search strategies included the use of boolean operators and proximity operators, such as 1W/nn (ScienceDirect, ProQuest) and the Near operator (N) in EBSCO Host, which helped to narrow down the search results. Abstracts were reviewed and, if certain criteria (e.g. relevance to the research questions) were met in the abstract, the full paper was included in the literature review. Citation indexes were used to search for related publications. They also helped to identify the latest research trends and to obtain the broadest approach to addressing the research topic. Moreover, the citation indexes were useful in gathering information about authors, journal articles and specialised areas of publication.

2.2 BACKGROUND THEORY

2.2.1 THE DATA WAREHOUSE CONCEPT

Data warehousing technology aims to structure data in an appropriate way so that it can be accessed and used in an efficient and effective manner (Dias, Tait, Menolli, & Pacheco, 2008). As stated by Kerkri, Quantin, Allaert, Cottin, Charve, Jouanot and Yétongnon (2001), the data warehouse is responsible for the consistency of information. The integration of tools such as query tools, reporting tools and analysis tools provides the opportunity to handle the coherence of information. The aim of data warehousing is to organise the gathering of a wide range of data and store it in a single repository (Kerkri et al., 2001). Currently, data warehousing plays a major role in the business community at large. It is also relevant to healthcare, as mentioned in del Hoyo-Barbolla and Lees (2002, p. 43): “in a competitive climate, if healthcare organisations are to keep their customers, knowing and managing information about them is essential and organisations realized that it is crucial to access viable and timely data.” Furthermore, integrating data from different sources and converting them into valuable information is a way to obtain competitive advantage (del Hoyo-Barbolla & Lees, 2002).

Data warehousing is “a collection of decision support technologies aimed at enabling the knowledge worker (executive, manager, analyst) to make better and faster decisions” (Chaudhuri & Dayal, 1997, p. 1). According to Inmon (2005, p. 29), a data warehouse is a “subject-oriented, integrated, time-variant and non-volatile collection of data in support of management decisions”. March and Hevner (2007) argued that the three components of intelligence, namely understanding, adaptability and profiting from experience, are important considerations when designing the data warehouse. These authors also mentioned that the data warehouse should allow managers to gather information that helps them identify and understand different situations and the reasons for their occurrence. Further, they argued that the data warehouse should “enable a manager to locate and apply the relevant organizational knowledge and to predict and measure the impact of decision over time” (March & Hevner, 2007, p. 1035). However, as mentioned by March and Hevner (2007), these arguments form the challenges that need to be considered when implementing a data warehouse.

2.2.2 MAIN COMPONENTS OF THE DATA WAREHOUSE

According to Kimball and Ross (2002), the data warehouse environment is formed from a small number of components (Figure 1), each of which provides a specific function. The main components are:

• Operational source systems

• Data staging area

• Data presentation area

• Data access tools

Operational Source Systems

The operational source systems are mainly concerned with processing performance and availability. Generally, a source system maintains only a small amount of historical data. The queries run against source systems are narrow, one-record-at-a-time queries that form part of the normal transaction flow and are constrained by the demands on the operational system (Kimball & Ross, 2002).


Figure 1: Components of the data warehouse (Kimball & Ross, 2002, p. 7)

Data Staging Area

The data staging area holds data in temporary storage (Kimball & Ross, 2002). This area is also known as the Extract, Transform, Load (ETL) area because it is where data extraction, transformation and loading are carried out. In other words, the data staging area can be regarded as everything between the operational source systems and the data presentation area (Kimball & Ross, 2002). The first step in transferring data to the data warehouse is extraction. During this step it is important to read and understand the source data and copy it to the staging area of the data warehouse for further processing. After the data has been extracted to the staging area, many transformations take place, such as cleansing the data (correcting misspellings, resolving domain conflicts, dealing with missing elements, or parsing into standard formats), combining data from multiple sources, deduplicating data, and assigning warehouse keys (Kimball & Ross, 2002). The data is then loaded into the presentation area of the data warehouse (Kimball & Ross, 2002).
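The extract, transform and load steps described above can be sketched as follows. This is a minimal illustration only: the two source lists, the field names, the spelling lookup and the two date formats are all invented for the example and do not come from Kimball and Ross (2002) or from any TPCH system.

```python
from datetime import datetime

# Hypothetical rows from two operational source systems (invented data).
SOURCE_A = [
    {"patient": "  alice SMITH ", "admitted": "2010-03-01", "unit": "Cardiac Surgery"},
    {"patient": "Bob Jones", "admitted": "01/04/2010", "unit": "Cardaic Surgery"},
]
SOURCE_B = [
    {"patient": "Alice Smith", "admitted": "2010-03-01", "unit": "Cardiac Surgery"},
    {"patient": "Carol White", "admitted": "2010-05-12", "unit": None},
]

# Illustrative lookup used to correct a known misspelling.
SPELLING_FIXES = {"Cardaic Surgery": "Cardiac Surgery"}


def extract():
    """Copy rows from every source into the staging area (here, a list)."""
    return [dict(row) for source in (SOURCE_A, SOURCE_B) for row in source]


def parse_date(text):
    """Parse the two date formats assumed to occur in the sources."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def transform(staged):
    """Cleanse, standardise, deduplicate and assign warehouse keys."""
    seen, result = set(), []
    for row in staged:
        name = " ".join(row["patient"].split()).title()  # standard name format
        unit = row["unit"] or "Unknown"                   # missing element
        unit = SPELLING_FIXES.get(unit, unit)             # correct misspelling
        admitted = parse_date(row["admitted"])            # standard date format
        natural_key = (name, admitted, unit)
        if natural_key in seen:                           # deduplicate
            continue
        seen.add(natural_key)
        result.append({"warehouse_key": len(result) + 1, "patient": name,
                       "admitted": admitted, "unit": unit})
    return result


def load(rows, presentation_area):
    """Append the transformed rows to the presentation area."""
    presentation_area.extend(rows)
    return presentation_area


presentation_area = load(transform(extract()), [])
```

In this sketch the duplicate record for the same patient and admission date is dropped during transformation, so four extracted rows become three loaded rows, each with a newly assigned warehouse key.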

Data Presentation

The data presentation area is the place where data is organized, stored, and made available to the users. In addition, the data presentation area is the place where business communities see data and gain access to it using data access tools. As stated by Kimball and Ross (2002), this area can be referred to as a series of integrated data marts. Each of these data marts presents the data from a single business process (Kimball & Ross, 2002).


Data Access Tools

The data access tools are the final component of the data warehouse. This component provides business users with many capabilities for querying the presentation area for analytic decision-making. A data access tool can be as simple as a query tool or as complex as a data mining application (Kimball & Ross, 2002).

2.2.3 DATA WAREHOUSE MODELLING

Generally, data warehouse modelling is used to:

• Identify the data warehouse, data mart, and decision support system data and information requirements

• Represent the data warehouse view

• Design the data warehouse schema according to the information requirements. (Borysowich, 2007)

In the data warehouse, the information to be stored in the data warehouse/data mart is designed after the business queries and subject areas have been identified (Borysowich, 2007). Designing the data warehouse/data mart structure is different from designing operational systems. According to Mohania, Samtani, Roddick and Kambayashi (1999), operational systems consist of simple pre-defined queries. In data warehousing environments, on the other hand, queries join more tables, require more computation time and are more ad hoc (Mohania et al., 1999). This has led to the emergence of a new view of data modelling design. As a result, the multidimensional model, or data cube, has become the suitable data model for the data warehousing environment. As stated by Chaudhuri and Dayal (1997), a multidimensional view of the data is important when designing front end tools, databases and query engines for online analytical processing (OLAP).


As stated by Ramakrishnan and Gehrke (as cited in Tan, 2006, p. 876), “Online analytical processing (OLAP) is a term that describes a technology that uses a multidimensional view of aggregate data to provide quick access to strategic information for the purposes of advanced analysis”. Generally, OLAP supports queries and data analysis by collecting, managing and processing multidimensional data (Tan, 2006). In multidimensional data modelling, data is stored as facts and dimensions. Facts can be numerical or factual data and represent an activity that is specific to the business. On the other hand, “a dimension represents a single perspective of the data” (Mohania et al., 1999, p. 44), and each dimension is characterised by its attributes. For instance, a customer dimension can consist of the name of the customer, the address and the city. Figure 2 shows the multidimensional data view. Two modelling techniques, the star schema and the snowflakes schema, are used to represent multidimensional data.

Star schema

The star schema modelling consists of a central table (fact table) and other tables which directly link to it. These tables are known as dimension tables. According to Chaudhuri and Dayal (1997), star schema is used in most data warehouses to represent the multidimensional data model.

Figure 3: Star schema data model (ExecutionMih, 2010, p. 2)

In general, the fact table contains the keys and measurements. For example, the sales fact table in Figure 3 contains keys such as time_key, item_key, branch_key and location_key, and measures such as units_sold, dollars_sold and avg_sales. In addition, the dimension tables are related to the sales fact table by the time, branch, item and location fields. Each of these dimension tables contains the attributes related to that dimension (ExecutionMih, 2010).
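A star schema of the kind described for Figure 3 can be sketched with Python's built-in sqlite3 module. The key and measure names follow the text; the dimension attributes (year, quarter, item_name, branch_name, city) and all row values are assumptions made for illustration, and avg_sales is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per member, descriptive attributes are assumed.
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);

-- Central fact table: foreign keys to every dimension plus the measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    units_sold   INTEGER,
    dollars_sold REAL
);

-- Invented sample data.
INSERT INTO time_dim VALUES (1, 2010, 'Q1'), (2, 2010, 'Q2');
INSERT INTO item_dim VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO branch_dim VALUES (1, 'North');
INSERT INTO location_dim VALUES (1, 'Brisbane');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 1, 10, 100.0),
    (1, 2, 1, 1,  5, 250.0),
    (2, 1, 1, 1, 20, 200.0);
""")

# A typical star-schema query: join the fact table to one dimension
# and aggregate the measures along that dimension.
sales_by_quarter = conn.execute("""
    SELECT t.quarter, SUM(f.units_sold), SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    GROUP BY t.quarter
    ORDER BY t.quarter
""").fetchall()
```

Every dimension is reached from the fact table with a single join, which is the property that distinguishes the star schema from the normalised alternative.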

Snowflakes schema

The snowflakes schema is a more complex data warehouse model than the star schema. Like the star schema, the snowflakes schema consists of fact tables and dimension tables. However, in the snowflakes schema the dimension tables are normalised, so a dimension table may itself be linked to further dimension tables (Chaudhuri & Dayal, 1997).

Figure 4: Snowflakes schema data model (ExecutionMih, 2010, p. 2)
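The normalisation of a dimension table can be illustrated with a small sketch, again using sqlite3. Here a hypothetical location dimension is split so that city attributes move to their own table; the table names, columns and values are assumptions for the example rather than the contents of Figure 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The city attributes are normalised out of the location dimension.
CREATE TABLE city_dim (city_key INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE location_dim (
    location_key INTEGER PRIMARY KEY,
    street       TEXT,
    city_key     INTEGER REFERENCES city_dim(city_key)
);
CREATE TABLE sales_fact (
    location_key INTEGER REFERENCES location_dim(location_key),
    units_sold   INTEGER
);

-- Invented sample data.
INSERT INTO city_dim VALUES (1, 'Brisbane', 'QLD'), (2, 'Sydney', 'NSW');
INSERT INTO location_dim VALUES
    (1, '1 Main St', 1), (2, '2 High St', 1), (3, '3 King St', 2);
INSERT INTO sales_fact VALUES (1, 10), (2, 5), (3, 7);
""")

# Reaching the city attribute now needs two joins instead of one.
units_by_city = conn.execute("""
    SELECT c.city, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN location_dim AS l ON l.location_key = f.location_key
    JOIN city_dim AS c ON c.city_key = l.city_key
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()
```

The sketch shows the usual trade-off: the city name is stored once rather than repeated on every location row, at the cost of an extra join in each query that uses it.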

2.2.4 DATA WAREHOUSE METHODOLOGIES

There are two basic methodological approaches to data warehouse design: the top-down approach and the bottom-up approach (Golfarelli & Rizzi, 2009). In the top-down approach, the requirements of all users are analysed and the data warehouse is planned, designed and implemented as a whole. However, this approach has many problems, such as high costs, the difficulty of analysing and reconciling all the sources, the difficulty of collecting the specific needs of every organisational department, and a longer development time. In the bottom-up approach, the data warehouse is built incrementally, with several data marts created iteratively. Because each step takes only a partial picture of the whole application, there is a risk involved with this method (Golfarelli & Rizzi, 2009). Nevertheless, the bottom-up approach is the method accepted by most users. Moreover, List et al. (2002) have identified three types of data warehouse methodology: data-driven, goal-driven and user-driven methodologies.

Data driven methodologies

As stated by List, Bruckner, Machaczek and Schiefer (2002), “Bill Inmon, the founder of data warehousing argues that data warehouse environments are data driven, in comparison to classical systems, which have a requirement driven development life cycle”. Also, as mentioned by Inmon (as cited in List et al., 2002), user requirements are the last thing to be considered in the decision support system life cycle.

Goal driven methodologies

List et al. (2002) discussed the Semantic Object Model (SOM) process modelling technique presented by Böhnlein and Ulbrich-vom Ende. The first stage of the technique identifies the company's goals and services. The SOM schema is then applied to analyse the business processes. This helps to track the company’s customers and their business transactions. At the next stage these transactions are transformed into existing dependencies that refer to information systems. The final step identifies the measures and dimensions according to the transactions and dependencies (List et al., 2002).

User driven methodologies

According to Westerman (as cited in List et al., 2002), the user driven methodology is the Wal-Mart approach. This approach mainly focuses on implementing a business strategy. “The methodology assumes that the company goal is the same for everyone and the entire company will therefore be pursuing the same direction” (List et al., 2002, p. 205). The first prototype is developed according to the business needs. Business people first set goals and then identify and prioritise the business questions that support those goals. The most important questions are then classified along with the data elements they require.

Moreover, many development methodologies have been introduced by different authors and organisations. As described by Kimball et al. (as cited in Golfarelli & Rizzi, 2009), the business dimensional life cycle is used to design, develop and implement data warehouse systems. The rapid warehousing methodology is another approach to managing data warehousing projects. This approach was introduced by the SAS Institute, a leader in the statistical analysis industry. The rapid warehousing methodology consists of seven phases: assessment; requirements; design; construction and final test; deployment; maintenance and administration; and review (Golfarelli & Rizzi, 2009).

2.2.5 DATA WAREHOUSE LIFECYCLE

The data warehouse life cycle plays a major role in the development of a data warehouse. Figure 5 shows the basic phases of the data warehouse life cycle, which takes the bottom-up approach. The main phases of this life cycle are setting goals and planning, designing infrastructures, and designing and developing data marts (Golfarelli & Rizzi, 2009). The first phase involves a feasibility study. In this phase many activities take place, such as setting system goals and estimating the costs of building the data warehouse. The next phase analyses and compares the architectural solutions for the data warehouse design (Golfarelli & Rizzi, 2009). The designer must also consider the available tools and technologies when preparing the design plan. The final phase involves designing and developing the data marts. In this phase, new data marts are created and added to the data warehouse system (Golfarelli & Rizzi, 2009).

Figure 5: Data warehouse system life cycle (Golfarelli & Rizzi, 2009, p. 46). Phases shown: setting goals and planning; designing infrastructures; designing and developing data marts.

2.2.6 OPERATIONAL SYSTEMS VS DATA WAREHOUSES

There are many differences between operational systems and the data warehouse. The primary difference is that operational systems are designed to support online transaction processing (OLTP), whereas data warehousing systems are designed to support online analytical processing (OLAP). The users of operational systems deal with one record at a time and perform the same operational tasks repetitively. A data warehouse, on the other hand, is capable of handling large volumes of data at a time and helps users to make decisions in a timely and consistent manner with accurate and up to date information (Kimball & Ross, 2002).
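The contrast between the two workloads can be sketched with two queries against the same table. The admissions table, its columns and its values are invented for the illustration, and in practice the two queries would normally run on separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (
    admission_id INTEGER PRIMARY KEY,
    unit         TEXT,
    year         INTEGER,
    cost         REAL
);
-- Invented sample data.
INSERT INTO admissions VALUES
    (1, 'Cardiac Surgery', 2009, 1000.0),
    (2, 'Cardiac Surgery', 2010, 1500.0),
    (3, 'Intensive Care',  2010, 2000.0),
    (4, 'Cardiac Surgery', 2010,  500.0);
""")

# OLTP-style access: fetch one record at a time by its key.
one_record = conn.execute(
    "SELECT unit, cost FROM admissions WHERE admission_id = ?", (2,)
).fetchone()

# OLAP-style access: scan many records and aggregate them.
cost_by_unit_year = conn.execute("""
    SELECT unit, year, COUNT(*), SUM(cost)
    FROM admissions
    GROUP BY unit, year
    ORDER BY unit, year
""").fetchall()
```

The first query touches a single row and returns immediately, while the second reads every row to produce a summary, which is the kind of question a decision-maker is more likely to ask.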

The following table shows the differences between an online transaction processing (OLTP) system and a data warehouse.


According to Inmon (2005), there are many challenges in the use of current information systems. These include a lack of data credibility, issues with productivity and an inability to transform data into information. The lack of credibility has many causes, such as time discrepancies, algorithmic differences, the level of data extraction, problems with external data and the absence of a common source of data from the beginning (Inmon, 2005). This leads to many incompatibilities in the reports generated by the different departments of an organisation. Productivity, on the other hand, becomes a major issue when an organisation needs to analyse the same data across all its departments (Inmon, 2005). This is because many programs must be written and there are many technological barriers to overcome (Inmon, 2005).

2.2.7 DATA MARTS

A data mart and a data warehouse have different architectural structures. On some occasions there is a need to perform standardised data analysis and to organise data so that simple usage patterns can be identified. As a result, the data warehouse is arranged into smaller units called data marts (Bonifati, Cattaneo, Ceri, Fuggetta, & Paraboschi, 2001). As mentioned by Inmon (1999), “a data mart is a collection of subject areas organised for decision support based on the needs of a given department”. Each department therefore has its own understanding of how its data mart should look, and each data mart is designed according to the department’s needs (Inmon, 1999). The following table shows the structure of, and the differences between, data marts and the data warehouse.

Data Mart | Data Warehouse
Departmental | Corporate
High level of granularity | Low level of granularity
Star join structure | Star join/Snowflake structure
Modest amount of historical data | Robust amount of historical data
Technology optimal for access and analysis | Technology optimal for holding, and managing massive volumes of data
Each department has a different structure | Structure suits corporate understanding of data


2.3 DIFFERENT TYPES OF DATA WAREHOUSE MODELS

Different types of data warehouse models can be identified. Ponniah (2010) describes the basic data warehouse architectural types available (Figure 6) and introduces five different architectural designs: centralised data warehouse architecture, independent data marts (IDM), federated architecture (FED), hub and spoke architecture and data mart bus architecture. Also, as mentioned by Ariyachandra and Watson (2010), these are reference architectural types which provide guidance when creating a new design.

2.3.1 CENTRALISED DATA WAREHOUSE

The centralised data warehouse model considers enterprise level information requirements. The warehouse contains atomic level data maintained in third normal form, and summarised data is sometimes stored as well. No separate data marts are developed in this architecture (Ponniah, 2010).

2.3.2 INDEPENDENT DATA MARTS

Independent data marts are developed to meet the needs of individual organisational units (Ariyachandra & Watson, 2005). However, these data marts do not provide a ‘single version of the truth’. As stated by Marco (2000), several features can be identified in the independent data marts architecture. These features include:

- Each data mart is sourced directly from the operational systems.

- In general, data marts are built independently from one another by autonomous teams (Independent teams will typically deploy tools, software, hardware, and processes).

Also, inconsistent data definitions and the use of different dimensions and measures in each IDM prevent data from being analysed across the data marts (Ariyachandra & Watson, 2005). Moreover, Marco (2000) identified problems with this architecture such as redundant data, redundant processing, limited scalability and non-integration.

2.3.3 FEDERATED ARCHITECTURE (FED)

As stated by Ariyachandra and Watson (2010, p. 13), “this architecture leaves existing decision support structures (e.g., operational systems) in place”. The data in the warehouse is integrated logically or physically using methods such as shared keys, global metadata and distributed queries. According to Jindal and Acharya (as cited in Ariyachandra & Watson, 2010), this architecture is more suitable for firms that have pre-existing, complex decision support systems.

2.3.4 HUB AND SPOKE ARCHITECTURE

This architecture is similar to the centralised architecture. It contains atomic (detail) level data normalised into third normal form. In addition, dependent data marts are attached to the centralised data warehouse and acquire their data from it. The centralised data warehouse acts as the hub and the dependent data marts act as the spokes. The data marts are developed for different purposes within the organisation (Ponniah, 2010).
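The relationship between the hub and its spokes can be sketched in a few lines of Python. The hub rows, the two data marts and the function names below are hypothetical and serve only to show that each data mart is derived from the central warehouse rather than from the operational systems.

```python
from collections import defaultdict

# Hub: atomic-level rows held in the central warehouse (invented data).
HUB = [
    {"patient_id": 1, "unit": "Cardiac Surgery", "procedure": "CABG", "cost": 1000.0},
    {"patient_id": 2, "unit": "Cardiac Surgery", "procedure": "Valve", "cost": 1500.0},
    {"patient_id": 3, "unit": "Intensive Care", "procedure": "Ventilation", "cost": 2000.0},
]


def build_costing_mart(hub):
    """Spoke for a costing unit: summarised total cost per clinical unit."""
    totals = defaultdict(float)
    for row in hub:
        totals[row["unit"]] += row["cost"]
    return dict(totals)


def build_clinical_mart(hub, unit):
    """Spoke for one clinical unit: its procedures, without cost details."""
    return [
        {"patient_id": row["patient_id"], "procedure": row["procedure"]}
        for row in hub
        if row["unit"] == unit
    ]


costing_mart = build_costing_mart(HUB)
cardiac_mart = build_clinical_mart(HUB, "Cardiac Surgery")
```

Because both data marts read from the same hub, they share one set of definitions, which is what distinguishes this arrangement from independently built data marts.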

2.3.5 DATA MART BUS ARCHITECTURE

The data mart bus architecture is designed according to the business requirements of the organisation (Ponniah, 2010). At the beginning, data mart architecture is designed with dimensions and measurements and later on, measurement data marts are added to it. The data marts consist of atomic and summarised data and are organised in star schemas (Ponniah, 2010).


Figure 7: Different types of data warehouse architectures (Sen & Sinha, 2005, p. 80)

Moreover, Sen and Sinha (2005) discussed several other types of data warehouse architecture (Figure 7). Some of these are extended versions of the architectural types mentioned above. Examples include the enterprise warehouse with operational data store and the hub and spoke data mart architecture.

2.4 DATA WAREHOUSE ARCHITECTURE/MODEL SELECTION FACTORS

According to a survey by Forrester (as cited in Agosta, 2005) of 213 practitioners at the Data Warehousing Institute conference in San Diego in August 2004, most respondents selected the “Hub and Spoke” data warehouse architecture as the most suitable architecture (see Figure 8).

Agosta (2005) stated that, “the survey did not ask about data modelling philosophy, and this survey is perfectly consistent with practitioners implementing dimensional models in different architectures - centralised, hub-and-spoke, as well as "conformed" designs”. However, Agosta (2005) argued that no data warehousing architecture is right or wrong in itself, because most of the alternative architectures (models) have been implemented successfully.


Figure 8: Results of the survey (Agosta, 2005)

Another survey conducted by Ariyachandra and Watson (2005) among 454 participants, on data warehouse architecture selection among companies, showed that 39% selected the hub and spoke architecture and only a small percentage selected the federated architecture (Figure 9).

Figure 9: The distribution of the architectures (Ariyachandra & Watson, 2005, p. 24)

According to Ariyachandra and Watson (2010, p. 1), “data warehouse selection decision is a subset of IT infrastructure (ITI) design”. However, little research has been conducted on ITI design, and most findings are derived from case studies or from recommendations developed from observation or indications. As stated by Ariyachandra and Watson (2010), most of the research does not address the factors that influence data warehouse design. Ariyachandra and Watson (2010) have introduced a research model for data warehouse architecture selection, which is shown in Figure 10.

Figure 10: Research model for data warehouse architecture selection (Ariyachandra & Watson, 2010, p. 4)

Their research on this model and further analysis show that a combination of several factors affects the selection of a data warehouse architecture. They have introduced an overall model for data warehouse selection, built from the selection factors that were found to be most important. As stated by Ariyachandra and Watson (2010), based on organizational information processing theory (OIPT), information processing needs arise from a combination of interdependence and task routineness. Both the sponsorship level and the information processing needs shape the strategic view of the warehouse (Ariyachandra & Watson, 2010). Moreover, the facilitating conditions, namely resource constraints, the perceived ability of IT staff and urgency, also influence the warehouse architecture selection (Ariyachandra & Watson, 2010) (Figure 11).


2.5 HEALTH INFORMATION MANAGEMENT

As mentioned by Johns (2002), information management is defined in several ways by different authors. Synott and Gruber (as cited in Johns, 2002) state that the information management function provides control and management over information resources. Scheyman (as cited in Johns, 2002, p. 4) states that information management “refers to information characteristics such as information ownership, content, quality and appropriateness”. The information management tasks traditionally performed in healthcare organisations are highly quantitative and departmentally focused (Johns, 2002). The role of the health information manager includes responsibility for managing health information in the given context. The traditional activities of the health information manager include planning, developing and implementing systems designed to control, monitor, store and retrieve data on a departmental basis (Johns, 2002). Today, the tasks of the information manager are changing alongside the increasing complexity of information in healthcare, and information managers now act as brokers of information services such as information engineering, retrieval and analysis.

2.5.1 HEALTHCARE DECISION-MAKING

Figure 12 shows the decision-making levels of an organisation. The top level involves strategic decision-making (Johns, 2002). At this level managers make decisions about the overall goals of the organisation. For instance, decisions made at this level include which services need to be provided (such as acute, ambulatory or long term care) and at which geographical level to operate (such as local, state or national) (Johns, 2002). The second level concerns tactical decision-making. The decisions made at this level relate to the tactical units of the organisation, such as patient care services and marketing (Johns, 2002). The third level concerns the day to day decisions of the organisation, such as hiring employees, ordering supplies and medications, and processing bills (Johns, 2002).


Figure 12: Decision-making levels within an organisation (Johns, 2002, p. 36)

2.5.2 HEALTHCARE INFORMATION SYSTEMS AND DECISION-MAKING

The importance of information technology to healthcare services can be seen differently from the perspectives of patients, professionals, and government and funding agencies. A patient expects easy access to personal information, knowledge to support self care, timely access to healthcare professionals, privacy and up to date care. Professionals’ expectations of Information Technology (IT), on the other hand, are different from those of patients. Professionals expect focused information, support for the effective use of IT, decision support tools, and new education and training. From the perspective of government or funders, accountability, efficiency, sustainability and scalability are expected from the implementation of IT in healthcare services (B. Barraclough, personal communication, March 31, 2009). The important issue, therefore, is how to meet these needs by integrating IT with healthcare services. As stated by Lenz and Reichert (2007), to offer IT support effectively it is vital to understand the characteristics of healthcare processes.

According to Johns (2002), healthcare information systems were paper based for more than a century. The first use of computers in healthcare was reported between the 1960s and the early 1970s. The evolution of healthcare information systems is shown in Figure 13. Many information system applications are used in healthcare today. As stated by Johns (2002), most of these applications are clinically oriented systems such as patient monitoring systems, nursing information systems and laboratory information systems. There are also applications that support the operational or managerial activities of a healthcare institution, such as accounting information systems, human resource management information systems and materials management systems. Some information systems, on the other hand, are external to the organisation. As stated by Johns (2002), the information manager of an institution needs to understand the components of information systems and how the systems affect the organisation and others outside it.

In the late 1980s, hospitals started to implement many systems to support strategic decision-making, managerial decision-making and quality improvement (Johns, 2002). According to Grimson et al. (2000), healthcare organisations previously consisted of individual units which operated independently of one another, and the need for information sharing was seen as less of a priority than it is today. However, the inability to share information across systems and organisations creates major barriers to progress on shared care as well as cost containment (Grimson et al., 2000). Moreover, as mentioned by Johns (2002), although the transactional databases contain a wealth of information, it is impossible to extract information from them for high level decision-making. The absence of integrated healthcare also leads to risks of medical treatment errors, lack of coordination, multiple examinations and increased therapy costs (Stolba & Schanner, 2007). Furthermore, according to Kerr, Norris and Stockdale (2007, p. 1017), “in the healthcare sector lack of data quality has far-reaching effects. Planning and delivery of services rely on data from different sources such as clinical, administrative and management sources”. Higher-quality data therefore helps to yield better information (Kerr et al., 2007).


Figure 13: Timelining Health Information Systems Evaluation (Johns, 2002, p. 61)

As mentioned by Landrum, Peachey, Huscroft and Hall (2008), many technological advances, such as Decision Support Systems (DSS), are in use or under development to improve decision-making in the healthcare industry. These systems help the hospital to operate efficiently by reducing medical or prescription errors, organizing staff and patients so that patients' waiting time is reduced, and facilitating effective diagnosis of patients' symptoms (Landrum et al., 2008). Some of the common DSS in healthcare are marketing systems, cost accounting systems and case-mix systems (Johns, 2002). These systems consist of tools that support the manipulation of data and “what if” scenario analysis for strategic decision-making (Johns, 2002). According to Arigon et al. (2007), Clinical Decision Support Systems (CDSS) were introduced to assist decision-making in healthcare. However, their scope is limited when compared to clinical data warehousing (Arigon et al., 2007).


Also, as stated by Rajan and Ramaswamy (2010), because health data are derived from different environments there is a significant probability of errors and uncertainty. Moreover, factors such as poor data quality, inconsistent representation and complicated domain knowledge cause clinical decision-making to be a labour-intensive and error-prone task (Zhou, Chen, Liu, Zhang, Wang, Li, Guo, Zhang, Gao, & Yan, 2010). Effectively integrating health data from different sources is therefore becoming recognised as a crucial factor (Shams & Farishta, 2001).

There are a number of technologies available to integrate data. These include data warehouses, database federations, database federations with mediated schemas and peer data management systems (Louie, Mork, Martin-Sanchez, Halevy, & Tarczy-Hornoch, 2007). As mentioned earlier, data warehouses integrate data from different sources into a single repository. In a database federation, disparate sources are integrated using software programs that interface with each source (Louie et al., 2007). Database federations with mediated schemas address the problems faced by database federations when integrating data from different sources. They use mediated schemas, which act as middleware in the database federation. In a peer data management system (PDMS), “each data sources provides semantic mapping to either one or a small set of other data sources or peers” (Louie et al., 2007, p. 8). Each of these data integration technologies has advantages and disadvantages, as shown in Figure 14. In all of these architectures, data are integrated using data and knowledge formalisms such as relational schemas, semi-structured data and ontologies (Louie et al., 2007).


However, data governance is also an important factor to consider when implementing a data integration project. “Data governance refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise” (Federal Student Aid, 2007). Data governance can improve the consistency of data used in decision-making, improve data security, reduce regulatory fines and assign accountability for data quality (Delgado, 2011). Although standards and security and compliance frameworks are available for the healthcare industry, healthcare organisations should incorporate privacy programs into their data governance programs (Delgado, 2011). To implement an effective privacy program, basic elements such as a formal policy governance structure, written policies, funding and procedures to handle complaints need to be addressed (Delgado, 2011).

2.6 DATA WAREHOUSING AND HEALTHCARE

“In recent years, medical professionals are witnessing an explosive growth in data collected by various organisations and institutions” (Kerkri et al., 2001). Hence, effective systems are needed to manage healthcare data. As mentioned before, OLTP systems are not designed to support ad hoc queries. Although transaction systems are rich in information, it is very difficult to obtain appropriately linked and analysed information from them for higher-level decision-makers such as managers and executives. One solution that many organisations turn to is implementing a data warehouse (Scheese, 1998). According to Wah and Sim (2009, p. 530), “data warehousing is becoming an indispensible component in data mining process and business intelligence”. The increasing quantity of healthcare data is not the only problem; healthcare expenditure is another. Healthcare expenditure is increasing and is a burden for both individuals and governments (Yan & Jianli, 2005). For instance, the U.S. allocates a trillion dollars annually to healthcare expenditure (Berndt, Fisher, Hevner, & Studnicki, 2001). There is therefore a need for a strategy to reduce healthcare expenditure and to improve quality of care.

In the context of hospital systems, healthcare data comes from disparate sources such as hospital administration systems, clinical databases and financial systems, and appears in many forms such as spreadsheets, published books and other data formats (Berndt et al., 2001). The data warehouse provides an opportunity to integrate these separate systems and to support efficient decision-making. According to a survey of 36 healthcare provider chief information officers (CIOs) conducted by Health Industry Insights in the U.S.A. (Figure 15), roughly 40% reported that their current use of business and clinical intelligence is limited to the deployment of data marts or data cubes (Holland, 2009). Also, 35% indicated limited use of business intelligence (BI)/clinical intelligence (CI) tools that are incorporated into their packaged software applications (e.g. electronic medical records (EMR), financial applications).

Figure 15: Current use of BI/CI by healthcare organisations (Holland, 2009, p.9)

2.6.1 DATA WAREHOUSE IMPLEMENTATION EXAMPLES

According to Winter (2007), data warehousing has been implemented successfully in the healthcare environment, both in the private sector and in some government agencies in the USA. He describes real examples of successful data warehouse implementations in hospitals and among commercial healthcare providers. As stated by Winter (2007), most of these healthcare organisations gained benefits from implementing data warehousing. Some of these examples are outlined below.

The Midwestern Health Insurance Company in the USA uses its data warehouse to identify and encourage optimal practices. The company found that the mortality rate in cardiac surgery was lower for some healthcare providers. Subsequently, the significant finding was that the mortality rate for bypass surgery for this insurer’s members declined by 75%, from 4% to 1%. Another example involves commercial pharmacy savings of forty million dollars achieved in one six-month period with their data warehouse based program (Winter, 2007).

The Veterans Health Administration (VHA) in the USA is another institution that benefits from its data warehouse. The aims of its data warehouse are to improve the quality, efficiency and safety of its medical care; to measure the effectiveness of the care it offers; and to facilitate medical research. The VHA has saved millions of dollars annually through better decision-making (Winter, 2007). The New South Wales Department of Health (NSW Health) in Australia is another data warehousing success story. NSW Health is responsible for many services, including a State-wide ambulance service, mental health services, drug and alcohol services and a network of community health centres (Sybase, 2010). The improvement of its data warehouse with Sybase has delivered benefits in several ways. Some of these benefits are:

• Reducing data loads by 76 percent

• Achieving a data compression rate of over 70 percent

• Simplifying administration and reducing overhead costs

• Delivering queries 85 percent faster (Sybase, 2010, p. 1)

However, many findings show that certain factors are important for the success of data warehousing. Winter (2007) introduces eleven critical factors that should be addressed for successful data warehousing in healthcare services. These factors include:

• The enterprise approach

• Support for complex data structure

• Support for complex queries

• Large data volumes

• Concurrent and timely use

• Flexibility

• Support and education

• High availability

• Privacy and security

• Data quality and standards

• High performance

Facilitating the enterprise approach to data warehousing provides the greatest benefits to health services. Health data flows into the data warehouse from many different areas, including both internal and external sources. Therefore, the integration of all these data for relevant decision-making is essential (Winter, 2007). At the same time, end users of the data warehouse need different views of the data. For example, a doctor needs a complete picture of a patient’s history of tests, physical examinations and symptoms to make a clinical decision. An insurer, by contrast, requires a complete picture of a hospital, including the services it provides and its price structure. Likewise, every user (physicians, payers, regulators and researchers) needs the same data filtered into a different view.
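The idea of different users seeing the same data through different views can be illustrated with a minimal sketch. It assumes a single hypothetical table, `encounter`, and two hypothetical views, `clinician_view` and `payer_view`; the table, column names and sample rows are invented for illustration and are not taken from Winter (2007) or from any system discussed here. Python's standard `sqlite3` module is used only because it needs no setup.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE encounter (
    patient_id INTEGER,
    hospital   TEXT,
    test_name  TEXT,
    result     TEXT,
    cost       REAL
);
INSERT INTO encounter VALUES
    (1, 'Hospital A', 'ECG', 'normal', 120.0),
    (1, 'Hospital A', 'Echocardiogram', 'abnormal', 450.0),
    (2, 'Hospital B', 'ECG', 'normal', 110.0);

-- Clinician view: patient-level test history, no cost data.
CREATE VIEW clinician_view AS
    SELECT patient_id, test_name, result FROM encounter;

-- Payer view: hospital-level totals, no patient-level detail.
CREATE VIEW payer_view AS
    SELECT hospital, COUNT(*) AS n_tests, SUM(cost) AS total_cost
    FROM encounter
    GROUP BY hospital;
""")

# A doctor looks up the test history of one patient.
clinician_rows = conn.execute(
    "SELECT test_name, result FROM clinician_view "
    "WHERE patient_id = ? ORDER BY test_name",
    (1,),
).fetchall()

# A payer looks at activity and cost per hospital.
payer_rows = conn.execute(
    "SELECT hospital, n_tests, total_cost FROM payer_view ORDER BY hospital"
).fetchall()

print(clinician_rows)
print(payer_rows)
```

Both queries read the same underlying rows. Only the view definition decides which columns and which level of aggregation each type of user sees.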

Healthcare systems deal with large volumes of data, and these volumes are growing rapidly day by day (Winter, 2007). The increasing volume of data is therefore a challenge for data warehousing in healthcare services, and these high volumes must be managed efficiently and effectively. When implementing a data warehouse, quality of data also plays a major role. According to Leitheiser (2001, p. 1), “healthcare organisations data is central to both effective healthcare and to financial survival”. Therefore, data quality must be high to provide reliable and dependable information for decision support. According to Winter (2007), flexibility is another critical factor when implementing a data warehouse. In other words, the data warehouse should be able to adapt to changes arising from variation in regulations, technology advances and fluctuations in consumer expectations. These changes may be simple or complex. For example, new data types continue to emerge with the increasing use of images, text and audio, and they must be accommodated.

The privacy and security of health related information also plays a major role when implementing a data warehouse (Winter, 2007). As a data warehouse consists

