1. Background and Introduction; Setting the Scene
1.8. DCM data-management framework; a data warehouse approach
To maximise the potential of the data generated through DCM by making it available for secondary uses, a sustainable and consistent data management solution is required. In a previous study, I proposed a data management framework using a data warehouse approach for managing the secondary use of DCM data (Khalid 2010). A detailed and critical review of this study is provided in Chapter 3. A data warehouse is a type of information system that provides an electronic repository which stores and links data taken from various sources and enables its retrieval for secondary use (Stolba 2007). How a data warehouse works is also described in detail in Chapter 3. My 2010 study proposed the first, and currently the only, technical solution for managing DCM data for secondary purposes. It adopts a two-step conceptual approach (Figure 1). The rationale for proposing a data warehouse and a two-step approach is discussed in Chapter 3.
The first step involves storing DCM data taken from national and international mappers within a web-based data repository. This data repository is called the DCM international database; its purpose is to enable national and international mappers to input their DCM data into an online database system, which will store their own collected mapping data over time and also generate basic reports based on completion of analysis (Khalid 2009).
Figure 1: A two-step process for managing DCM data for secondary use
The second step involves processing DCM data into a data warehouse for long-term storage and for reuse for secondary purposes, for example: complex analysis and reporting on the DCM data; secondary research; benchmarking the quality of care; and data-mining to identify trends and patterns of good dementia care (Khalid 2010). While my previous study initiated the important and novel work of proposing a solution for managing DCM data for secondary uses, it went only so far as assessing and proposing a technical architecture for the data management needs of DCM data. The DCM data warehouse still needs to be designed, developed and implemented through what is known as a development life cycle (Thakur and Gosain 2011).
Within the development life cycle, designing a data warehouse is a fundamental and important step towards its successful use and acceptance by users (Browning and Mundy 2001; List et al. 2002; Schaefer et al. 2011).
The design process of a data warehouse involves producing a data model, which is a structural representation of data. A data model is designed based on specified requirements. The requirements refer to information obtained from various sources such as existing systems, documents or potential users of the system, which illustrate ‘what the system can do’ and ‘how it can be done’. Understanding requirements within the design process is referred to as requirement analysis, which is a process of obtaining, synthesising and analysing the requirements into an explanation that can support the design and development of a workable and acceptable system (Abai et al. 2013). A detailed view of what constitutes requirements and how these are gathered for designing and developing a data warehouse is presented in Chapter 4.
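To make the idea of a data model as a structural representation of data more concrete, the following is a minimal sketch of a star-schema style model expressed in Python with SQLite. It is illustrative only and is not the data model proposed in Khalid (2010): every table and column name (dim_mapper, dim_care_setting, fact_observation, and the bcc and me columns) is a hypothetical placeholder chosen for this example.

```python
import sqlite3

# Hypothetical data model: one fact table of mapping observations
# linked to two dimension tables. All names are assumptions made
# for illustration, not taken from any existing DCM system.
SCHEMA = """
CREATE TABLE dim_mapper (
    mapper_id INTEGER PRIMARY KEY,
    country   TEXT NOT NULL
);
CREATE TABLE dim_care_setting (
    setting_id   INTEGER PRIMARY KEY,
    setting_type TEXT NOT NULL
);
CREATE TABLE fact_observation (
    observation_id INTEGER PRIMARY KEY,
    mapper_id      INTEGER NOT NULL REFERENCES dim_mapper(mapper_id),
    setting_id     INTEGER NOT NULL REFERENCES dim_care_setting(setting_id),
    observed_on    TEXT NOT NULL,    -- ISO date string
    bcc            TEXT NOT NULL,    -- placeholder for a BCC code
    me             INTEGER NOT NULL  -- placeholder for an ME value
);
"""


def build_model(conn):
    """Create the hypothetical tables in the given database."""
    conn.executescript(SCHEMA)


def table_names(conn):
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_model(conn)
print(table_names(conn))
```

Which tables and columns such a model should actually contain is exactly what the requirements are meant to determine, so the structure above should be read as a placeholder rather than a recommendation.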
To demonstrate data management across the two-step framework, my previous study (Khalid 2010) attempted to design a data model for a DCM data warehouse. It was a technical effort in which the existing Excel programme¹ and the design of the DCM international database (Khalid et al. 2010) were analysed to obtain the requirements for designing a future DCM data warehouse. This technically focussed approach to gathering requirements for designing a data warehouse is called a data-driven approach, and is one of the two main approaches to designing a data warehouse. These two approaches are critiqued in Chapter 4. Based on these requirements, a basic data warehouse design (e.g. a data model) was proposed, which was validated using simulated DCM data. The main aim was to assess the DCM data management framework and validate the data flow from the DCM international database to the DCM data warehouse and its use for data-mining (Khalid 2010). This study therefore made the first successful attempt at showing the technical workability of a DCM data warehousing approach. However, the study was limited in that it did not involve potential users in the design process, for example through gathering their requirements for designing the DCM data warehouse.
¹ The Excel programme is provided by the BDG to support mappers’ basic analysis of some of the DCM data, such as BCC and ME.
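The data flow across the two-step framework can be pictured with a small sketch. The Python code below is an assumption-laden illustration, not a reconstruction of the validation carried out in Khalid (2010): the table names (mapping, fact_mapping), the function names and the simulated records are all invented for this example, and two in-memory SQLite databases stand in for the DCM international database and the DCM data warehouse.

```python
import sqlite3

# Simulated records only: (mapper, date, BCC, ME). The values are
# invented for illustration and do not come from real mapping data.
SIMULATED = [
    ("mapper_a", "2010-01-05", "A", 1),
    ("mapper_a", "2010-01-05", "B", 3),
    ("mapper_b", "2010-02-10", "A", -1),
]


def step_one_store(repo, records):
    """Step one: store mappers' data in the repository database."""
    repo.execute(
        "CREATE TABLE mapping (mapper TEXT, mapped_on TEXT, bcc TEXT, me INTEGER)"
    )
    repo.executemany("INSERT INTO mapping VALUES (?, ?, ?, ?)", records)
    repo.commit()


def step_two_load(repo, warehouse):
    """Step two: copy repository rows into the warehouse database."""
    warehouse.execute(
        "CREATE TABLE fact_mapping (mapper TEXT, mapped_on TEXT, bcc TEXT, me INTEGER)"
    )
    rows = repo.execute(
        "SELECT mapper, mapped_on, bcc, me FROM mapping"
    ).fetchall()
    warehouse.executemany("INSERT INTO fact_mapping VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def mean_me_by_bcc(warehouse):
    """One possible secondary use: the mean ME value for each BCC."""
    cursor = warehouse.execute(
        "SELECT bcc, AVG(me) FROM fact_mapping GROUP BY bcc ORDER BY bcc"
    )
    return dict(cursor.fetchall())


repo = sqlite3.connect(":memory:")        # stands in for the repository
warehouse = sqlite3.connect(":memory:")   # stands in for the warehouse
step_one_store(repo, SIMULATED)
loaded = step_two_load(repo, warehouse)
summary = mean_me_by_bcc(warehouse)
print(loaded, summary)
```

A real load step would typically also clean and restructure the data before it reaches the warehouse; the straight copy here is a deliberate simplification to keep the two steps visible.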
As will be explored in further detail in Chapter 5, data warehouses are information systems that are represented as a combination of people, technology and organisations (Iivari and Hirschheim 1996). These three aspects strongly influence the data warehouse design and the requirement analysis process. Users play an important role in identifying the requirements that inform the type and structure of the data (data model) that goes into the warehouse, the processing of the data, which provides valuable information, and the retrieval of that information for specific purposes (Lindgaard et al. 2006). Therefore, a growing body of literature (Kappleman 1994; Raab 1998; Teixeira et al. 2012) emphasises involving users in identifying the system requirements, which can then be taken forward into design and development. This approach is called a user-driven approach to designing a data warehouse. A detailed rationale for using a user-driven approach for a DCM data warehouse is provided in Chapter 4.
Further, as will be discussed in detail in Chapter 4, the literature suggests using data-driven and user-driven approaches together to provide a broader and more detailed picture of requirements for a data warehouse (Golfarelli 2010; List et al. 2002). While my previous study (Khalid 2010) used a data-driven approach for designing data models for a DCM data warehouse, it did not use a user-driven approach, which is needed for the data warehouse to be not only technically operative but user-accepted as well (Schaefer et al. 2011).