Methodologies for Data Quality Measurement and Improvement

7.3 Comparative Analysis of General-purpose Methodologies

7.3.3 The TQdM Methodology

The TQdM methodology (see [68]) was initially designed for data warehouse projects, but its broad scope and level of detail characterize it as a general-purpose DQ methodology. In a data warehouse project, one of the most critical phases is the off-line consolidation of operational data sources into a single, integrated database, which is then used for all the aggregations to be performed. In the consolidation phase, errors and heterogeneities present in the sources have to be discovered and resolved; otherwise the data warehouse is exposed to corruption and failure.

The orientation of TQdM toward data warehouses gives the methodology a predominantly data-driven character. The general strategy of TQdM is summarized in Figure 7.11. The areas in which TQdM is original and more comprehensive than other methodologies are cost-benefit analysis and the managerial perspective. We have discussed the cost-benefit analysis classification model of TQdM in Chapter 4. TQdM provides extensive guidelines for evaluating the costs of loss of quality, the costs of the data improvement process, and the benefits and savings resulting from data quality improvement.
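The three cost items named above can be related in a small worked example. The following is only a minimal sketch: the class `ImprovementCase`, its field names, and all figures are hypothetical and are not taken from TQdM or from the classification model of Chapter 4. It assumes that savings are the reduction in non-quality costs and that the net benefit is savings plus other benefits minus the cost of the improvement process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImprovementCase:
    """Hypothetical yearly figures for one data quality improvement initiative."""

    non_quality_cost_before: float  # cost of poor data quality today
    non_quality_cost_after: float   # expected cost of poor quality after improvement
    improvement_cost: float         # cost of running the improvement process
    other_benefits: float = 0.0     # benefits beyond avoided non-quality costs

    def savings(self) -> float:
        # Savings are assumed to be the reduction in non-quality costs.
        return self.non_quality_cost_before - self.non_quality_cost_after

    def net_benefit(self) -> float:
        # Net benefit: savings plus other benefits, minus the improvement cost.
        return self.savings() + self.other_benefits - self.improvement_cost


# Illustrative figures only.
case = ImprovementCase(
    non_quality_cost_before=120_000.0,
    non_quality_cost_after=45_000.0,
    improvement_cost=50_000.0,
    other_benefits=10_000.0,
)
print(case.savings())      # 75000.0
print(case.net_benefit())  # 35000.0
```

Under these assumptions, an initiative is worth pursuing only when the net benefit is positive; a real evaluation would break each figure down into finer cost categories.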

We note here that another methodology, specifically focused on costs and savings, is described in [123], while [16] describes an integer linear programming formulation of a quality improvement process that optimizes costs. We focus now on the managerial issues of TQdM.

Fig. 7.10. An example of a quality improvement model in IP-UML (diagram not reproduced)

Management of Improvement Solutions

The main aspect discussed in TQdM concerns the managerial perspective, i.e., the strategy that has to be followed in an organization in order to make effective technical choices. These choices concern the DQ activities to be performed, the databases and flows to be considered, and the techniques to be adopted. So, in the final stage of TQdM, the focus moves from technical to managerial aspects. The extent of the steps, shown in Figure 7.11, provides evidence of the attention devoted to this issue. Some steps are also present in preceding phases, which we do not comment on. Specific tasks of the managerial perspective concern:

1. Assessment of organization readiness in pursuing DQ processes.

2. Survey of customer satisfaction, in order to discover problems at the source, i.e., directly from service users.

3. Initial focus on a pilot project, in order to experiment with and tune the approach and avoid the risk of failure in the initial phase, which is typical of large-scale projects performed in one single phase. This principle is inspired by the well-known motto “think big, start small, scale fast.”

1. Assessment
   Data analysis
      Identify information groups and stakeholders
      Assess consumer satisfaction
   DQ requirements analysis
   Measurement
      Identify data validation sources
      Extract random samples of data
      Measure and interpret data quality
   Non quality evaluation
      Identify business performance measures
      Calculate non quality costs
   Benefit evaluation
      Calculate information value
2. Improvement
   Design solution improvement
   On data
      Analyse data defect types
      Standardize data
      Correct and complete data
      Match, transform and consolidate data
   On processes
   Check effectiveness of improvement
3. Management of improvement solutions – organizational perspective
   Assess the organization’s readiness
   Create a vision for information quality improvement
   Conduct a customer satisfaction survey of the information stakeholders
   Select a small and payoff area to conduct a pilot project
   Define the business problem to be solved
   Define the information value chain
   Perform a baseline assessment
   Analyze customer complaints
   Quantify costs due to quality problems
   Define information stewardship
   Analyze the systematic barriers to DQ and recommend changes
   Establish a regular mechanism of communication and education with senior managers

Fig. 7.11. TQdM description

4. Definition of information stewardship, i.e., the organizational units and their managers who, with respect to the laws (in public administrations) and rules (in private organizations) that govern business processes, have specific authority on data production and exchange.

5. Following the results of the readiness assessment, analysis of the main barriers in the organization to the DQ management perspective, in terms of resistance to change processes, control establishment, information sharing, and quality certification. In principle, every manager thinks that her or his data is of very high quality, and is reluctant to accept controls, respect standards and methods, and share information with other managers. This step addresses the well-known tendency of managers to regard data as a form of power.

6. Establishment of a specific relationship with senior managers, in order to get their consensus and active participation in the process.
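The measurement steps listed in Figure 7.11 (extract random samples of data, then measure and interpret data quality) can be illustrated for a pilot area as follows. This is a minimal sketch under stated assumptions, not a procedure prescribed by TQdM: the function `sample_completeness`, the choice of completeness as the measured dimension, and the example records are all hypothetical.

```python
import random
from typing import Mapping, Optional, Sequence


def sample_completeness(
    records: Sequence[Mapping[str, Optional[str]]],
    field: str,
    sample_size: int,
    seed: int = 0,
) -> float:
    """Estimate the fraction of records with a non-empty `field` from a random sample."""
    if not records:
        raise ValueError("cannot sample from an empty data set")
    rng = random.Random(seed)  # fixed seed keeps the audit reproducible
    size = min(sample_size, len(records))
    sample = rng.sample(list(records), size)
    # A value counts as present when the field exists and is neither None nor "".
    complete = sum(1 for record in sample if record.get(field) not in (None, ""))
    return complete / size


# Illustrative data: 8 of 10 records carry a street name.
records = [{"street": f"Via Roma {i}"} for i in range(8)]
records += [{"street": ""}, {"street": None}]

full = sample_completeness(records, "street", sample_size=10)
estimate = sample_completeness(records, "street", sample_size=4)
print(full)      # 0.8, because the sample covers every record
print(estimate)  # an estimate based on 4 randomly chosen records
```

Sampling the whole data set returns the exact ratio, while a smaller sample trades accuracy for cost, which is the reason for sampling in a baseline assessment.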

Before concluding this section on TQdM, we mention a second set of major managerial principles inspired by [50].

• Principle 1. Since data are never what they are supposed to be, check and recheck schema constraints and business rules every time fresh data arrive. Immediately identify and send discrepancies to responsible parties.

• Principle 2. Maintain a good and close relationship with the data owners and data creators, to keep up with changes and to ensure a quick response to problems.

• Principle 3. Involve senior management willing to intervene in the case of uncooperative partners.

• Principle 4. Data entry, as well as other data processes, should be fully automated, so that data are entered only once. Furthermore, data should only be entered and processed as per schema and business specifications.

• Principle 5. Perform continuous and end-to-end audits to immediately identify discrepancies; the audits should be a routine part of data processing.

• Principle 6. Maintain an updated and accurate view of the schema and business rules; use proper software and tools to enable this.

• Principle 7. Appoint a data steward who owns the entire process and is accountable for the quality of data.

• Principle 8. Publish the data where it can be seen and used by as many users as possible, so that discrepancies are more likely to be reported.
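Principles 1 and 5 lend themselves to a small illustration: each time a batch of fresh data arrives, every record is checked against the rules, and the discrepancies are grouped by the party responsible for them. The following is only a sketch under stated assumptions; the `Rule` class, the `audit` function, the two rules, and the party names are hypothetical and do not come from [50].

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[Record], bool]  # returns True when the record passes
    responsible: str                 # party to notify on failure (hypothetical)


def audit(batch: list[Record], rules: list[Rule]) -> dict[str, list[tuple[int, str]]]:
    """Return, per responsible party, the (record index, rule name) pairs that failed."""
    discrepancies: dict[str, list[tuple[int, str]]] = {}
    for index, record in enumerate(batch):
        for rule in rules:
            try:
                passed = rule.check(record)
            except (KeyError, TypeError, ValueError):
                passed = False  # a missing or malformed field counts as a failure
            if not passed:
                discrepancies.setdefault(rule.responsible, []).append((index, rule.name))
    return discrepancies


# Two illustrative rules: one schema constraint and one business rule.
RULES = [
    Rule(
        "zip_is_5_digits",
        lambda r: isinstance(r["zip"], str) and len(r["zip"]) == 5 and r["zip"].isdigit(),
        "address_office",
    ),
    Rule("age_in_range", lambda r: 0 <= int(r["age"]) <= 130, "registry_office"),
]

batch = [
    {"zip": "00185", "age": 42},  # passes both rules
    {"zip": "1234", "age": 37},   # zip too short
    {"zip": "20121", "age": -3},  # age out of range
    {"age": 50},                  # zip missing
]

report = audit(batch, RULES)
for party, failures in report.items():
    print(party, failures)
```

Running the same audit on every incoming batch, rather than only at load time, is one way to make the checks the routine part of data processing that Principle 5 asks for.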