Data Ownership and Enterprise Data Management: Implementing a Data Management Strategy (Part 3)


Leader in Data Quality and Data Integration

www.dataflux.com 877–846–FLUX

International +44 (0) 1753 272 020

A DataFlux White Paper

Prepared by:

Mike Ferguson


Introduction

So far in this three-part series on data ownership, I have discussed what data ownership is, why it is important, what the key requirements of enterprise data management (EDM) are and how companies can address the data management problem by standardizing on a suite of technologies, which I referred to as an EDM suite.

In this, the third and final paper in this short series, I want to look at what needs to be done from a strategy perspective to be able to establish personnel and procedures for enterprise data management, and what needs to be done in order to leverage the technologies available in an EDM suite to get maximum return on investment.

Enterprise Data Management Strategies

In the first paper of this series, we outlined three key requirements for enterprise data management. These requirements are:

• Establish a common suite of technologies for end-to-end data management

• Dedicate IT personnel to enterprise data management

• Establish policies for data governance

Having looked at the first of these already in the second paper, we now turn our attention to organizational structure and data governance – concepts that are fundamental to any data management strategy.

Organizational Structures for Enterprise Data Management

One of the key appointments any company can make to help get its data under control is that of a Chief Data Architect. This position is often overlooked in IT and sometimes not well understood by the business. Where it does exist, this person must have a business mandate to cause change so that data can be brought under control. Fundamentally, the job of a Data Architect is to understand how data is used in the business on an enterprise-wide basis and to formally define the data used. This individual is also responsible for setting policies and procedures for the use of that data, for maintaining data quality, and for ensuring a common, consistent understanding of what data means. Ideally, a Data Architect should have extensive experience in the vertical industry in which he or she works, so as to be able to discuss data clearly in the context of its business use. Data Architects must also have expertise in data management skills such as:

• Implementing data standards and establishing policies for developers and business users, including defining standard enterprise-wide data vocabularies

• In-depth understanding of the relational model and navigating XML schemas

• Data modeling and modeling techniques such as normalization and star schema multi-dimensional modeling, as well as some fluency in the use of data modeling tools

• Logical and physical database design

• Data profiling and defining rules for data content cleanup


• Understanding of the requirements that regulations and legislation impose on data for the purposes of compliance

Ideally, data architects should have an enterprise-wide remit in the sense that they need to operate across all lines of business when managing data. This is especially important in setting strategy and patterns (best practices) around specific data management processes such as:

• Master data management

• Data profiling and data monitoring

• Data migration and consolidation

• Data replication and change data capture

• Data synchronisation

• Data federation

• Data warehousing and data aggregation

• Data security

• Taxonomy design

Many companies are starting to create centralized IT expertise in business integration by creating Integration Competency Centers so that IT professionals responsible for different types of integration are able to coordinate their work. The data architect is at the center of data management, data quality and data integration and should be a key member of any integration competency center initiative. Figure 1 shows five levels of business integration. Data and metadata integration (and management) underpin and are a key piece of any business integration initiative.


[Figure 1 diagram: "Organization Structure – EDM As Part of An Integration Competency Center To Coordinate Integration". Integration is happening at five levels: people integration, user interface integration, business process integration, application integration, and data and metadata integration. Coordinating these levels to achieve a strategic objective (e.g., "Reduce operational costs") requires an Integration Competency Centre. Technologies shown: collaboration tools; enterprise portal software; business process management software; EAI and SOA integration platforms; ETL, EII, DQ, master data management and content management.]

Figure 1 - Five levels of business integration

Figure 2 shows how such an enterprise data management team operates. The first thing to notice is the consolidation of IT professionals responsible for data management and data integration in operational, BI and unstructured content management systems into a single team. This enterprise data management team includes the users of the EDM suite of technologies, and this team has a responsibility to set standards and support the other IT professionals working in specific lines of business throughout the enterprise. People in this team are EDM technology experts.

[Figure 2 diagram: "EDM in An Integration Competency Centre - A Federated Organisational Structure Is Worth Considering". A dedicated EDM team in the corporate Integration Competency Centre, formed by merging the operational, BI and content management IT data integration teams, is backed by an executive sponsor and a Data Governance Steering Committee. Responsibilities listed for the enterprise data architects: data naming and definition standards, the enterprise data model, data security, data integration development and management policies, master data management, taxonomy design, and integration with other technologies. Activities listed alongside the business units: content management, community taxonomy maintenance, data modelling using common data definitions, and data integration templates.]

Figure 2 - How an Enterprise Data Management team should operate


As an example, if IT professionals in different lines of business wish to create data models then these models would be constructed from standard definitions and entities made available to them by the EDM team. The same applies if there is any development of data. It is also important that such a team is backed by an executive sponsor who participates with a data governance steering committee. At the risk of being criticized for suggesting yet another steering committee, I would at least argue that in today’s climate of much tighter regulations, CFOs are the likely sponsors as they take compliance very seriously. There also appears to be no shortage of executives lining up to participate in such a steering committee. Having a Compliance Manager on such a committee is also important.

Enterprise Metadata Management and Integration

Another key element of an enterprise data management strategy is data governance. This is about defining data standards including common data names and data definitions (common metadata), common policies, patterns and processes for enterprise data quality and data integration development, the construction of an enterprise data model, and defining policies and processes for master data management (MDM).

Data standardization requires that a shared business vocabulary (SBV) is established. Setting up a shared business vocabulary involves identifying and defining data used in the enterprise, creating a common set of enterprise-wide standard data definitions for that data and then mapping (cross referencing) disparate data definitions to these common definitions. More specifically, the SBV involves incrementally defining a set of enterprise-wide common data names, common data definitions, common data integrity rules, common reference data (e.g., code set values), common mappings and common transformations for all master data, transactional data, dimensional data and metrics. The SBV forms the base of an enterprise-wide data standard. It is fundamental to the success of commonly understood data, master data management and data integration.

These common data definitions can be used in:

• Data models to get consistency across multiple models

• Data integration tools (ETL and EII)

• Application integration technologies (message brokers and ESBs)

• Business views of reporting tools

• XML mark-up tags
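To make the idea of a shared business vocabulary more concrete, the following is a minimal sketch in Python of common data names, common definitions and the cross-references from disparate names. The class names, the system names ("billing", "crm") and the local data names are all hypothetical and chosen only for illustration; a real SBV would also carry integrity rules, reference data and transformations.

```python
from dataclasses import dataclass, field


@dataclass
class CommonDefinition:
    """One entry in a shared business vocabulary (SBV)."""
    name: str          # enterprise-wide common data name
    definition: str    # agreed business meaning
    # Cross-references: (system, local data name) pairs mapped to this entry
    aliases: set = field(default_factory=set)


class SharedBusinessVocabulary:
    """Holds common definitions and the mappings from disparate names."""

    def __init__(self):
        self._by_name = {}
        self._by_alias = {}

    def define(self, name, definition):
        self._by_name[name] = CommonDefinition(name, definition)

    def map_alias(self, system, local_name, common_name):
        entry = self._by_name[common_name]  # KeyError if not defined first
        entry.aliases.add((system, local_name))
        self._by_alias[(system, local_name)] = entry

    def resolve(self, system, local_name):
        """Return the common definition behind an application-specific name."""
        return self._by_alias[(system, local_name)]


# Hypothetical systems and data names, for illustration only
sbv = SharedBusinessVocabulary()
sbv.define("customer_id", "Unique identifier assigned to a customer")
sbv.map_alias("billing", "CUST_NO", "customer_id")
sbv.map_alias("crm", "client_ref", "customer_id")

print(sbv.resolve("crm", "client_ref").name)
```

Both application-specific names resolve to the same common entry, which is the property that data models, integration tools and reporting views can then rely on.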

Rendering data (e.g., in XML form) using standard XML tags based on SBV data names means that data can be presented using data names that are commonly understood by users. In addition, if data is made available for consumption by applications in the same way, then the data is managed in a consistent unambiguous fashion as it travels throughout the enterprise and as it is prepared for presentation. To explain why this is important, consider Figure 3. This shows a common problem that often arises when trying to integrate Performance Management products with multiple lines of business intelligence systems to calculate key enterprise level performance metrics.


[Figure 3 diagram: "Why Do We Need An SBV? - How Do You Drill Down From CPM Products When Metrics Definitions Are Inconsistent?". A CPM product defines a Total Revenue KPI on top of three BI systems (custom built data marts and a packaged analytic application), each with its own application, BI tool and DBMS metadata and its own metric definition: Revenue, Total Sales and Turnover. Common definitions are critical to BI integration.]

Figure 3 - SBVs better integrate Performance Management products with multiple lines of Business Intelligence systems

If each underlying BI system has been built independently, then what happens when you have three different metrics in three different BI systems called Revenue, Total Sales and Turnover, and you want to create a Key Performance Indicator called Total Revenue? Do you think business users understand the difference between these metrics? Worse still, do you think an IT developer knows the difference? The problem here is obvious. It’s ambiguous and prone to misunderstanding. This misunderstanding can lead to erroneous reporting and opens up the door for potentially incorrect interpretation of data and incorrect decision making.
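One way to picture the safeguard a common definition provides is a KPI calculation that refuses to use any metric that has not been explicitly mapped to a common definition. The sketch below is an illustration only: the system names, the mapping table, and the assumption that mapped metrics can simply be summed are all hypothetical and are not taken from any particular CPM product.

```python
# Hypothetical mappings agreed by data stewards:
# (BI system, local metric name) -> common metric name
METRIC_MAPPINGS = {
    ("sales_mart", "Revenue"): "total_revenue",
    ("finance_mart", "Turnover"): "total_revenue",
    # ("retail_mart", "Total Sales") has not been reviewed yet, so it is absent
}


def total_revenue(readings):
    """Sum metric values, but only those mapped to the common definition.

    Assumption for illustration: every metric mapped to "total_revenue"
    measures the same thing and can be added together.
    """
    total = 0.0
    for (system, metric), value in readings.items():
        common = METRIC_MAPPINGS.get((system, metric))
        if common is None:
            # Ambiguous metric: stop rather than guess at its meaning
            raise LookupError(f"{system}.{metric} has no common definition")
        if common == "total_revenue":
            total += value
    return total


print(total_revenue({
    ("sales_mart", "Revenue"): 100.0,
    ("finance_mart", "Turnover"): 50.0,
}))
```

Failing loudly on an unmapped metric is a design choice: it turns a silent reporting error into a visible gap in the vocabulary.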

A best practice would be to prevent this and to establish a shared business vocabulary across all BI systems and business views. This is done in some companies but not necessarily in all. Nevertheless, data standards are about preventing different development teams from inadvertently introducing ambiguity. Even if common definitions are practiced across BI systems, it is very likely that the same could not be said when you head into the world of operational systems. Most operational systems have their own application-specific data names and data definitions (application-specific data vocabularies) for data that they maintain. Therefore, when you consider Figure 4, it is not difficult to see the problem caused when integrating applications directly into enterprise portals, for example, to integrate and simplify user interfaces. One important question to ask when looking at Figure 4 (below): What is the problem with this architecture?


[Figure 4 diagram: "Why Do We Need An SBV? - Plugging Applications Into A Portal Means Each Application Displays Its Own Data Names". Three applications are plugged into portlets on a portal via web services, SQL or custom interfaces.]

Figure 4 - SBVs provide consistency across multiple applications.

The answer, of course, is obvious. All applications present their data using their own application-specific data names (vocabulary). If the same data is used in different applications (e.g., customer data, product data, order transactions, etc.) and each of these applications uses different data names for the same data, then the user has to know this to correctly understand data presented to them on a portal page by the different applications. Worse is when different data in different applications have the same data names and this data is presented to the user on the same portal page. In this case the user once again must be aware of the differences if they are to accurately understand what they are examining.

In order to resolve this problem, any application-specific data rendered using application-specific XML tags for presentation on a portal page needs to be intercepted and translated into commonly understood data names before the user sees it. This can be achieved by introducing a message broker or enterprise service bus (ESB) technology between the application and the portal. When this happens, data marked up using application-specific XML tags can be translated at run time into common XML tags when the data appears on the screen using SBV data definitions. Simply put, the application-specific data definitions still exist but have been hidden by the introduction of message-broker or ESB software.
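The run-time translation described above can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree` module. This is only an illustration of the tag translation step, not how any particular message broker or ESB product implements it; the tag names and the mapping table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one application's XML tags to SBV common tags
BILLING_TO_COMMON = {
    "Record": "Customer",
    "CUST_NO": "CustomerId",
    "CUST_NM": "CustomerName",
}


def to_common_markup(xml_text, tag_map):
    """Rename application-specific tags to common tags; data is unchanged."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Tags without a mapping are passed through as they are
        element.tag = tag_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


app_message = "<Record><CUST_NO>42</CUST_NO><CUST_NM>Acme Ltd</CUST_NM></Record>"
common_message = to_common_markup(app_message, BILLING_TO_COMMON)
print(common_message)
```

The application keeps emitting its own vocabulary; only the intermediary needs to know the mapping, which is why the mapping metadata is worth capturing once and sharing.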

Similarly, if message broker or ESB technology is used for application integration in a service-oriented architecture (SOA), then data in any messages that travel between application services as part of a business process needs to be translated from source application-specific mark-up to common mark-up, and from common mark-up to destination application-specific mark-up. What does all this mean? It means that business integration software needs to make use of SBV common data definitions and of mappings from disparate systems to common definitions. Is this familiar? It should be, and to show you why, look at Figure 5 (below).

Figure 5 shows a remarkable coincidence when comparing application integration software (message brokers or ESBs) and on-demand data integration technology (sometimes referred to as enterprise information integration, or EII). EII products, part of an EDM technology suite, present virtual integrated views of disparate data in multiple underlying systems and allow these virtual views to be accessed as if the data were integrated in a database. These virtual views can be defined using common standard data definitions (i.e., using the SBV definitions). Data integrated in real-time by EII products will then render the data marked up using common data definition tags.
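As a rough illustration of a virtual integrated view, the sketch below exposes rows from two sources under common data names on demand, without persisting an integrated copy. The source rows, the mappings and the function name are hypothetical; a real EII product would query live systems and push work down to them rather than iterate over in-memory lists.

```python
# Two hypothetical source systems, each using its own data names
BILLING_ROWS = [{"CUST_NO": 1, "CUST_NM": "Acme Ltd"}]
CRM_ROWS = [{"client_ref": 2, "client_name": "Globex"}]

# Each source is paired with its mapping to the common (SBV) names
SOURCES = [
    (BILLING_ROWS, {"CUST_NO": "customer_id", "CUST_NM": "customer_name"}),
    (CRM_ROWS, {"client_ref": "customer_id", "client_name": "customer_name"}),
]


def virtual_customer_view():
    """Yield customers in the common vocabulary, one row at a time.

    Rows are translated when requested; no integrated copy is persisted.
    """
    for rows, mapping in SOURCES:
        for row in rows:
            yield {mapping[name]: value for name, value in row.items()}


for customer in virtual_customer_view():
    print(customer)
```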

[Figure 5 diagram: "Common Metadata Is Needed In Data and Application Integration Technologies To Achieve Consistency". On the application integration side, three applications with data vocabularies 1, 2 and 3 connect through mappings and a web service adapter to an EAI application integration platform that uses the common vocabulary. On the data integration side, three sources with data vocabularies 1, 2 and 3 connect through mappings and a web service adapter to an EII data integration platform, which works by giving applications an on-demand, virtual, integrated, common-vocabulary view of disparate data. Composite application services and WSRP deliver both to the portal, where all data is presented in the common vocabulary.]

Figure 5 - Understanding why business integration software needs to make use of SBV common data definitions and mappings from disparate systems to common definitions

These are the key points to remember:

• To integrate data when building a data warehouse using ETL technology, you need common data definitions for the target system, and you need to know how disparate data in source systems maps to common data definitions

• To integrate data on-demand (e.g., for reporting or presenting on a portal screen) using EII technology, you need common data definitions for the integrated virtual view of the data, and you need to know how disparate data in source systems maps to common data definitions in the virtual views

• To manage electronic data messages as they enter the enterprise and move data between applications via message brokers and ESB software, you need to know the common data definitions for data and how disparate data in source application systems maps to these common data definitions so that message translation can take place

• To achieve consistency across multiple performance management and reporting tools, you need common data definitions for the BI and performance management tool business views, and you need to know how disparate data in source systems maps to the common data definitions in the business views


• To define master data and solve the master data management problem, you need to define the master data entities using common data definitions, and then define the mappings from disparate data to master data and vice versa, so that master data integration and synchronization can be managed

In fact, everywhere you look you see precisely the same requirement again and again. The secret to Enterprise Data Management is therefore in the metadata. If you first capture the shared business vocabulary and all the mappings from disparate definitions to common SBV definitions, then that metadata can be provided to and shared across:

• EDM suite technologies, such as ETL tools or EII tools,

• Data modeling tools for building data models using consistent common data definitions

• MDM applications and technologies

• BI and Performance Management business views

• Message brokers and Enterprise Service Bus technologies used in application and business process integration

• Portal technology to present the data for business use.

Combine this with enterprise data quality and the data quality firewall discussed in the second paper in the series, and you can see the whole strategy for enterprise data management coming into clear focus and taking real shape. Figure 6 shows the power of common metadata when you have an SBV and know the mappings from disparate data definitions to common ones. If you have the tooling to do this work once, all the consistency needed to manage data across the enterprise stems from the same base metadata, as long as this metadata can be shared across technologies.
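The idea of doing the mapping work once and generating what each technology needs from the same base metadata can be sketched as follows. Here a single hypothetical mapping drives two generated artifacts: a SQL view that exposes a source table under common names, and a tag rename table of the kind an integration layer might use. The table, view and column names are assumptions for illustration, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

# Hypothetical shared metadata: source data name -> SBV common data name
CUSTOMER_MAPPING = {"CUST_NO": "customer_id", "CUST_NM": "customer_name"}


def generate_view_sql(view, table, mapping):
    """Generate a view that exposes source columns under common names."""
    columns = ", ".join(
        f'"{source}" AS "{common}"' for source, common in mapping.items()
    )
    return f'CREATE VIEW "{view}" AS SELECT {columns} FROM "{table}"'


def generate_tag_map(mapping):
    """Derive XML tag renames (in CamelCase) from the same metadata."""
    return {
        source: "".join(part.title() for part in common.split("_"))
        for source, common in mapping.items()
    }


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE billing_customer ("CUST_NO" INTEGER, "CUST_NM" TEXT)')
conn.execute("INSERT INTO billing_customer VALUES (1, 'Acme Ltd')")
conn.execute(generate_view_sql("customer", "billing_customer", CUSTOMER_MAPPING))

rows = conn.execute("SELECT customer_id, customer_name FROM customer").fetchall()
tag_map = generate_tag_map(CUSTOMER_MAPPING)
print(rows)
print(tag_map)
```

Because both artifacts are derived from one mapping, a change to the common vocabulary needs to be made in only one place.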


[Figure 6 diagram: "Data Governance - Use Created Common Metadata To Generate Mappings For Multiple Integration Technologies To Achieve Consistency". Common metadata held in the enterprise DQ and data integration layer is used to generate the common vocabulary and XSLT mappings for the EAI application integration platform, and the common vocabulary virtual model and mappings for the EII data integration platform. The diagram also shows operational applications with data vocabularies 1, 2 and 3, master data (product, customer, asset) with create, read, update and delete access, and historical data in a data warehouse and data marts, with all data presented in the common vocabulary in the portal.]

Figure 6 - Data governance and the power of common metadata

Enterprise Data Quality

In addition to establishing a shared business vocabulary, an EDM strategy involves the establishment of a data quality firewall to validate data entering the enterprise via keyboard, electronic message or file. The way to handle this is to implement data profiling technology and data quality as a service, so that it can be invoked on demand, in batch, on a timer or in response to events. Enterprise data quality technology is a key piece of the EDM suite of technologies outlined in the second paper of this series. Common rules defined for a shared enterprise data quality service are vitally important to consistent data validation, data repair, data matching and handling of missing data.
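A data quality firewall built on common rules can be pictured as one shared validation function that every entry point calls, whether the data arrives from a screen, a message or a batch file. The sketch below is a simplified illustration: the field names, the rules themselves and the function names are hypothetical, and data repair and matching are left out.

```python
import re

# Hypothetical common rules, defined once and shared by every entry point
RULES = {
    "customer_id": lambda value: isinstance(value, int) and value > 0,
    "email": lambda value: isinstance(value, str)
    and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value) is not None,
}


def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []
    for field_name, rule in RULES.items():
        if field_name not in record:
            problems.append(f"{field_name}: missing")
        elif not rule(record[field_name]):
            problems.append(f"{field_name}: invalid value {record[field_name]!r}")
    return problems


def firewall(records):
    """Split incoming records into accepted ones and rejected ones with reasons."""
    accepted, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected


accepted, rejected = firewall([
    {"customer_id": 7, "email": "info@acme.example"},
    {"customer_id": -1},
])
print(accepted)
print(rejected)
```

The same `firewall` function could be called per record for on-demand validation or over a whole file for batch validation, which is the sense in which the rules are shared.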

Master Data Management

Master data management involves using the technologies in the EDM suite to solve the master data management problem. Key business entities such as Product, Asset, Employee, Customer, Supplier, etc. need to be defined using the common data definitions of the shared business vocabulary. Data integration technologies leveraging the SBV and mappings from disparate data to the SBV definitions can then integrate master data from disparate line-of-business applications and persist it in a master data hub. In addition, the same metadata also allows MDM solutions to synchronize subsets of master data used in operational applications when master data is updated centrally, and supplies dimensional data to data warehouses and data marts. The data quality firewall protects the quality of master data and manages all changes to it via the keyboard, electronic message and batch files.
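The flow described above, integrating master data from disparate applications into a hub and then synchronizing interested applications, can be sketched as a toy hub. Everything named here is hypothetical. The sketch also assumes that records from different systems already share the same customer identifier and that the last update wins, whereas a real MDM solution needs data matching and survivorship rules.

```python
class MasterDataHub:
    """Toy master data hub keyed on the common data name customer_id."""

    def __init__(self, mappings):
        self.mappings = mappings   # system -> {local data name: common data name}
        self.records = {}          # customer_id -> master record in common names
        self.subscribers = []      # callbacks for applications holding subsets

    def ingest(self, system, row):
        """Translate a source row to common names and merge it into the hub."""
        mapping = self.mappings[system]
        common = {mapping[name]: value for name, value in row.items()}
        master = self.records.setdefault(common["customer_id"], {})
        master.update(common)      # simplification: last update wins
        for notify in self.subscribers:
            notify(dict(master))   # synchronize interested applications
        return master


hub = MasterDataHub({
    "billing": {"CUST_NO": "customer_id", "CUST_NM": "customer_name"},
    "crm": {"client_ref": "customer_id", "mail": "email"},
})
synced = []                        # stands in for a subscribing application
hub.subscribers.append(synced.append)

hub.ingest("billing", {"CUST_NO": 7, "CUST_NM": "Acme Ltd"})
hub.ingest("crm", {"client_ref": 7, "mail": "info@acme.example"})
print(hub.records[7])
```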


Conclusion

An enterprise data management strategy involves getting the organizational structure right, selecting technologies that form an integrated EDM suite for handling all metadata management and data management needs, putting controls in place, and setting up data governance processes. These things together allow the enterprise to take control of data ownership, achieve compliance and raise the bar on data quality, business practice and business confidence.

The keys to this EDM strategy are: a shared business vocabulary; metadata integration and metadata sharing across enterprise integration technologies; a data quality firewall; and master data management. We have the pieces to solve the problem. Establishing a strategy for enterprise data management will help you take back control of the data in your enterprise.
