
Enterprise Data Management

An 'IDEAL' Solution

White Paper


Srikar Chilakamarri

Senior Consultant

Srikar Chilakamarri is a senior consultant with TCS and specializes in Data and Business Intelligence (BI) solutions. He has a Bachelor's degree in Mechanical Engineering. He has architected and managed many BI solutions across telecom and financial services. He has authored TCS' analytic data model and has been a speaker at many events. Currently, he heads the Enterprise Data Management program at TCS.

Shawnik Singh Thakur

Analyst, Business Development

Shawnik Singh Thakur is an analyst with TCS and specializes in business development and offering data solutions. He holds a Master's degree in Business Administration. Currently, he is part of the solutions team, offering data solutions to customers under the Enterprise Data Management program in TCS.


With enterprises looking to make more informed business strategies, data management becomes indispensable for decision makers today. Collation, consolidation, transfer and storage of data are more dynamic than ever before, and terms such as data management, governance and metadata are now a quintessential aspect of business strategy. The biggest challenges of data management are well captured in the buzzword 'SMAC': Social, Mobility, Analytics and Cloud. Data has moved out of a controlled and structured environment and expanded in scope, affecting the dimensions of volume, structure, ownership and intelligence, and hence bringing data governance to the forefront. It is crucial to leverage data management skills and ensure that the approach is scalable enough to adapt to future needs.

This paper assesses the current business landscape and identifies the challenges faced in the data management domain. It recommends a two-pronged approach to the problem, which involves the use of process (through data governance) and technology (through metadata).

In this paper, we propose an 'IDEAL' approach:

Identify the need and stakeholders

Define the scope, landscape and governance model

Establish the metadata blueprint and governance body

Analyse the metrics, leveraging reports and analytics to monitor deviation

Liaise with the stakeholders

Organizations that follow this approach will optimize their investments by factoring in the 'reuse' of existing tools in their ecosystem. Based on the principles of 'loosely coupled, tightly integrated' architecture, this solution simplifies the implementation methodology for data architects.


Contents

1. Introduction

2. Current State

3. Challenges in Data Management

4. Implementing metadata driven data governance

4.1. Data Governance: Process Lever

4.2. Metadata: Technology Lever

4.3. Solution Addressing Challenges

5. The 'IDEAL' approach


EAI Enterprise Application Integration

ETL Extract, Transform, Load

GB Gigabytes

IDC International Data Corporation

IT Information Technology

SOA Service-Oriented Architecture

TCO Total Cost of Ownership

TCS Tata Consultancy Services

TAT Turnaround time


1. Introduction

A major challenge for decision makers today is to process and utilize the exponentially increasing volume of data to deliver strong business results. This ever-changing data, combined with cloud and virtualization technology, is radically reshaping today's data management landscape. It has ushered in a new era in which data is moving from siloed systems to the cloud and from tables to social media. Data management is possibly one of the biggest challenges, and consequently one of the most exciting opportunities, faced by enterprises today.

A quick analysis of the market outlines the current challenges and key issues faced by organizations in the data management domain. The mandate for a solution is to ensure that it leverages the existing investments and implements a methodology-based process (through data governance) and technology-based approach (through metadata). This will enable the organizations to specify the policies, people, and processes needed to manage data for the purpose of delivering trustworthy, timely, and relevant information.

2. The Current Data Universe

According to a white paper by International Data Corporation (IDC), there is about 500GB of data per person in the world and this is a growing figure (Source: IDC White Paper - The Expanding Digital Universe - March 2007).

Although less than 30 percent of the data is created by corporations, these entities are responsible for security, privacy, and reliability of 85 percent of the data. Corporations across the globe realize that data is a valuable corporate asset and it needs to be managed effectively. With ever-increasing data volumes, real-time business intelligence, and cost controls, the bigger challenge is to prevent data defects rather than fix them.

The universe of data spans across structured, semi-structured and unstructured data. Consequently, data management solutions are considering utilizing big data technologies to bring order to chaos and enable more streamlined data management processes. The data management space leverages all possible options to ensure that the knowledge facilitation is not compromised. Key trends governing data management today are:

Big Data: A technology-assisted technique to process large data volumes effectively

Cloud: A cost lever, where the rendering platforms of hardware, software, and solutions are offered in a shareable model. This enables enterprises to leverage 'pay per use' concepts for effective cost benefit

Mobility: Rendering data anywhere, anytime, and with effective messaging on handheld devices. This targets the 'information availability' dimension of the data management space

In-memory databases: With a focus on getting the right information at the right time, the technology has evolved to a state where reads from and writes to disk-based databases are avoided so that the computing power of memory can be harnessed

Data virtualization: A technique to optimize storage and eliminate redundancy of data across enterprises

Data management techniques have been guided by advancements in technology. Enterprises have realized the power of analytics to enforce the data management discipline, leading to faster adoption of technology levers enabling effective business plans.


Following are the key focal areas of enterprises that influence data management principles:

Lowered total cost of ownership: The unexpected flood of data, the higher cost of existing data management tools, and market pressures have caused enterprises to look for solutions that have a higher return on investment and a direct impact on margins

Shorter time to market: In this intensely competitive business landscape, a day's delay in TAT can have serious repercussions

Improved decision making: With speed and cost taking priority, strategic and tactical insights have emerged as key factors in faster and more effective decision making. Data integrity and assurance are the basic hygiene factors that enable these processes

On-demand availability: 'Anytime, anywhere' is a buzzword and the objective is to deliver relevant and actionable content to executives on the move

Figure 1 demonstrates the relationship between business mandates and technology solutions in the data management discipline.

Figure 1. Data Management: Business Mandates to Technology Solutions

[Figure: business mandates (reduced TCO, time to market, improved decision making, on-demand availability) mapped through the data management discipline to technology solutions (cloud platforms, in-memory databases, mobility solutions, data virtualization, Big Data, data quality assurance)]


The following table outlines the variety of solutions rendered by the data management discipline, with the evolution of business mandates and data management levers.

| Business Mandates | Data Management Levers |
|---|---|
| Lowered TCO | Cloud platforms; data virtualization |
| Improve/shorten time to market | Big Data; in-memory databases |
| Improved decision making | Big Data (unstructured data); master data and data quality solutions |
| On-demand availability | Mobility solutions; cloud platforms |

Table 1. Solutions to Business Mandates

Figure 2. Vertical Evolution of the IT Landscape

3. Challenges in Data Management

With business mandates increasing pressure on IT to manage data effectively, IT faces a constant challenge to adapt in less and less time. IT architectures have evolved from standalone systems to Service-Oriented Architectures (SOA), and throughout this evolution enterprises have had to keep updating their technology landscape. This rapid acceleration in how information is delivered needs the right infrastructural support, rather than the vertical scaling relied on in the past.

In the evolution of architectures, information rendering has witnessed a vertical evolution. The vertical evolution started from data being consolidated from silos to data warehouses, while the rendering techniques evolved from in-house to the Internet to hosted platforms.

[Figure 2 stages, earliest to latest: standalone desktop with data silos; client-server architecture with data centralization; web-based applications; web-based with data integration; service-oriented architectures; cloud-based solutions. Enterprise challenges: rapid evolution of architectures, data governance, roadmap to evolve. Data challenges: data redundancy, data know-how, data quality and assurance.]


Rendering to mobile devices has likewise evolved from web-based delivery to modern mobile platforms. Significant investments have been made so far in this journey. At this maturity level, it is imperative that the next generation of architecture builds on the existing infrastructure to define an optimized architecture.

The challenges and issues that the enterprises are subjected to, in order to leverage existing infrastructure, can be categorized into 'Enterprise-level' and 'Data-level'. Both these categories of challenges are correlated at the granular level. They are as follows:

Enterprise Challenges

• Architecture adaptation: It is difficult to quickly adapt existing architectures to newer ones. With lower TCO mandated by business, building a new system from scratch is no longer an option

• Data Governance: Definitions, ownerships, rules, and policies exist within an organization. Unfortunately, they reside either with individuals or in a few places that are not easily accessible

• Building a roadmap to migrate to next-generation technologies: Identifying the systems that need to be moved and analyzing the impact of moving them

Data Challenges

• Data redundancy: The data stores, built over a period of time, have created multiple copies or captures from different entry points

• Data know-how: The knowledge of what data resides where, and for what purpose, is confined to very few people, which results in the data being underutilized

• Data quality and assurance: Enterprises have long struggled with this aspect and it continues to be a challenge that needs to be addressed

4. Implementing metadata driven data governance

It is evident that enterprises cannot afford to neglect the new offerings in data management. It is imperative that they adopt a well-guided approach towards effective data management, by leveraging the twin levers of data governance and technology.

4.1 Data Governance: Process Lever

Data Governance is a process to effectively manage the data assets within an organization. The methodology suggested in figure 3 refers to a set of activities to be performed for successful implementation of a governance process within an enterprise. The steps defined are as follows:

Define objectives, scope, roles and responsibilities, operational procedures, and performance metrics of the governance body/council

Initiate communication with all key stakeholders, comprising both business and IT; identify the governance body; and define the roles and responsibilities of the body and of the individuals participating in the council

Establish communication of the governance body, its roles and responsibilities, and process familiarization; enable tools and techniques for monitoring and control; formulate and finalize policies, standards, and checklists for data, templates, and change management

Monitor metrics and take corrective/disciplinary actions as applicable for successful implementation of the governance program

Figure 3. Data Governance Methodology

[Figure: a Define, Initiate, Establish, Monitor cycle around data governance, covering policies, organization, roles and responsibilities, standards, data architecture, data quality, data security, metadata management, data exploitation, master data management, data creation, and data lifecycle]

Data governance plays a significant role in this solution, as it is imperative that enterprises receive buy-in from all stakeholders to address existing challenges and prepare for next-generation architectures. One of the key functions of the governance council is to ensure that the transition to the new data management offerings is made at the right time, with the right value proposition, and harnessed appropriately to maximize benefits to the enterprise.

4.2 Metadata: Technology Lever

With growing data and a variety of data formats, the key challenge is to lay out a blueprint and understand the spread and behavior of data within an enterprise. Data governance is increasingly challenging in such scenarios. Effective data management involves leveraging enterprise metadata to generate insights and define points of control for the governance body. The key aspects of the solution (also depicted in Figure 4) are as follows:

• Identify, gather, and maintain the enterprise metadata in one single model called the 'Enterprise Metamodel'

• Integrate the gathered metadata through logical integration points to create integrated enterprise metadata

• Define and create reports to measure and monitor the key metrics for data governance teams

• Create a well-defined governance model for sharing metrics across the enterprise

• Provide supporting infrastructure for communication, collaboration, and administration to cater to the operational needs of data governance

[Figure 4: Identify, Gather, Integrate, Measure, and Monitor activities around the Enterprise Metamodel, supporting governance, intelligence, metrics, and collaboration]
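The 'measure and monitor' aspect listed above can be illustrated with a minimal Python sketch. It assumes each governance metric has an agreed target and a tolerated shortfall; the `Metric` class, the metric names, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    target: float     # value agreed by the governance body (must be > 0)
    actual: float     # latest refreshed value
    tolerance: float  # allowed shortfall, as a fraction of the target


def deviations(metrics):
    """Return (name, shortfall fraction) for metrics below target by more than tolerance."""
    flagged = []
    for m in metrics:
        shortfall = (m.target - m.actual) / m.target
        if shortfall > m.tolerance:
            flagged.append((m.name, round(shortfall, 3)))
    return flagged


# Hypothetical metrics, for illustration only.
METRICS = [
    Metric("completeness_pct", target=98.0, actual=97.5, tolerance=0.02),
    Metric("glossary_coverage_pct", target=80.0, actual=60.0, tolerance=0.05),
    Metric("steward_assigned_pct", target=100.0, actual=99.0, tolerance=0.02),
]

flagged = deviations(METRICS)
print(flagged)
```

Only the metric whose shortfall exceeds its tolerance is reported, which is the kind of exception list a governance report or dashboard might surface for follow-up.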


The functional architecture of the solution is depicted in Figure 5. The key components of this solution are as follows:

Enterprise Metadata: This comprises segregated technical, business, process, and content metadata, spread across the enterprise. This metadata is identified, classified, and formatted to be fed into the system

System Layer: This layer is supported by the administration and management module, which supports configurations, user and process management, and housekeeping of the layer. Adapters, processes and tools gather, integrate, and provide mechanisms to access intelligence through the information layer. This layer also supports interfaces for users to enter, modify, correct or delete the metadata appropriately

Information Layer: The intelligence gathered by the system processes using the integrated metamodel is rendered to the data governance body/council. This layer supports analysis (slice and dice, drill downs, roll ups, lineage and so on) across the enterprise blueprint available in the system. It also offers a set of predefined reports and dashboards to measure and monitor operational metrics for data management. The reporting module can also support 'on demand' reporting by providing a capability to create reports as per demand

Data Governance: Data Governance is a body/council exercising control over enterprise data management. This solution provides role-based access to all identified roles and their respective users. Hence, it renders the right intelligence to the required bodies

[Figure 5 labels: Data Governance; Information Layer (Data Analytics, Reports & Dashboards); System Layer (Administration & Management, Enterprise Metamodel, User Defined Metadata); Enterprise Metadata (Technical, Business, Process, and Content Metadata) drawn from Databases, Business Definitions, Data Management Processes, and Enterprise Content]
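As one illustration of the lineage analysis mentioned for the information layer, the following Python sketch walks a lineage graph to list everything downstream of a changed asset. The asset names and the `LINEAGE` dictionary are hypothetical; the paper does not prescribe how lineage is stored or traversed.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed in its value.
LINEAGE = {
    "crm.customers": ["staging.customers"],
    "billing.accounts": ["staging.customers"],
    "staging.customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.churn_dashboard", "report.revenue_summary"],
}


def downstream_impact(lineage, changed_asset):
    """Return every asset reachable downstream of changed_asset, breadth first."""
    seen = set()
    order = []
    queue = deque([changed_asset])
    while queue:
        asset = queue.popleft()
        for dependent in lineage.get(asset, []):
            if dependent not in seen:  # also guards against cycles
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


impacted = downstream_impact(LINEAGE, "crm.customers")
print(impacted)
```

A breadth-first walk is used here so that the nearest dependents are listed first; a real metadata repository would more likely answer the same question with a query over its lineage tables.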

The technical architecture of the solution has the following components (also depicted in Figure 6):

Data Layer: This is at the base of the solution landscape. It includes data stored in data stores, documents, repositories and in any other form.

Enterprise Metadata: This layer qualifies the data layer by encapsulating definitions of the data. There are four types of metadata considered for this solution:

• Technical metadata: This is also referred to as structural metadata. It qualifies the architecture components.

• Business metadata: This refers to the business definitions of the data elements, including components of the business glossary, metric definitions, and common terminology.

• Process metadata: This refers to metadata that is generated by processes within the organization, comprising the data processes (ETL, database logs, data flows, information refresh and so on) and business processes (EAI, alerts, thresholds among others).

• Content metadata: This refers to the metadata of the content that exists in the form of documents, files, code, videos and music

Adapters: The adapters, either technology specific or configurable, are built into the solution. These adapters are used to define an extract routine from the metadata stores and enable periodic refresh from the sources. Adapters for specific data processing also exist, which can be used by the system layer to build any point solutions

System Layer: This is the core of the solution, which consists of an enterprise metamodel that encapsulates the metadata from the categories mentioned above. This metadata is integrated, refreshed, and kept at quality by the 'process configuration and control' module. The 'communication and collaboration' module, with components of messaging, workflows, emails, alerts and so on, provides a strong backbone to support the governance body with information rendering services. The user interface includes components of reporting and analytics, administration and management, and communication. These are interface components rendered in a user-understandable format to leverage the system functions

Figure 6. Solution: Technical Architecture

[Figure: Data Governance spanning all layers; System Layer, including the information and user interface layer (Data Analytics, Reports & Dashboards, Administration & Management, User Interaction, Enterprise Metamodel, Process Configuration & Control, Communication & Collaboration); Adapters for metadata and data; Enterprise Metadata (Technical, Business, Process, and Content Metadata); Data Layer (Databases, Business Definitions, Data Management Processes, Enterprise Content)]
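The relationship between the four metadata types, the adapters, and the enterprise metamodel can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the solution's actual design: the class names, the idea of an adapter as a plain callable, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class MetadataKind(Enum):
    TECHNICAL = "technical"  # structural: tables, columns, data types
    BUSINESS = "business"    # glossary terms, metric definitions
    PROCESS = "process"      # ETL runs, logs, data flows
    CONTENT = "content"      # documents, files, code, media


@dataclass(frozen=True)
class MetadataRecord:
    kind: MetadataKind
    source: str  # system the record was extracted from
    name: str    # identifier within that system
    description: str = ""


# Assumption: an adapter is a zero-argument callable returning the current
# records of one metadata store.
Adapter = Callable[[], List[MetadataRecord]]


@dataclass
class EnterpriseMetamodel:
    adapters: Dict[str, Adapter] = field(default_factory=dict)
    records: Dict[Tuple[str, str], MetadataRecord] = field(default_factory=dict)

    def register(self, source: str, adapter: Adapter) -> None:
        self.adapters[source] = adapter

    def refresh(self) -> int:
        """Re-extract from every registered adapter; return the record count."""
        for adapter in self.adapters.values():
            for record in adapter():
                # Keyed by (source, name), so a repeated refresh overwrites
                # existing entries instead of duplicating them.
                self.records[(record.source, record.name)] = record
        return len(self.records)

    def by_kind(self, kind: MetadataKind) -> List[MetadataRecord]:
        return [r for r in self.records.values() if r.kind is kind]


# Hypothetical adapters, for illustration only.
def warehouse_adapter() -> List[MetadataRecord]:
    return [
        MetadataRecord(MetadataKind.TECHNICAL, "warehouse", "dim_customer.customer_id", "INTEGER key"),
        MetadataRecord(MetadataKind.PROCESS, "warehouse", "etl.load_dim_customer", "nightly load"),
    ]


def glossary_adapter() -> List[MetadataRecord]:
    return [MetadataRecord(MetadataKind.BUSINESS, "glossary", "Customer", "A party that holds an account")]


model = EnterpriseMetamodel()
model.register("warehouse", warehouse_adapter)
model.register("glossary", glossary_adapter)
total = model.refresh()
print(total, [r.name for r in model.by_kind(MetadataKind.TECHNICAL)])
```

Keying records by source and name keeps the periodic refresh idempotent, which is one simple way to model the 'periodic refresh from the sources' that the adapters are described as enabling.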

4.3 Rising to Data Challenges

It is a challenging business mandate to ensure that data is made available and accessible to users. The following table maps the solution components that alleviate the challenges posed by the rapid vertical evolution of the IT landscape.

| Challenge Area | Challenge Description | Solution Component |
|---|---|---|
| Enterprise Challenge | Architecture Adaptation | Reporting and analytics module of the user interface layer provides capability of impact analysis for any new architecture proposition |
| Enterprise Challenge | Data Governance | Administration and management module provides a complete guidance on setting up data governance body along with roles and responsibilities; with flexibility to scale vertically and horizontally, the model can be tuned for any enterprise need |
| Enterprise Challenge | Roadmap for Evolution | Reporting and analytics module provides dependency chart and impact analysis to decide the right roadmap for the organization |
| Data Challenge | Data Redundancy | Structural metadata in enterprise metamodel integrated with business metadata provides an insight to redundant data across an enterprise |
| Data Challenge | Data Know-how | Integrated metadata from data models and databases, data flow diagrams, and data process metadata provides a footprint of data across an enterprise |
| Data Challenge | Data Quality and Assurance | Integration of process and structural metadata in the enterprise metamodel layer with scorecards and dashboards in reporting and analytics module helps maintain data quality |

Table 2. Challenges to Solution Mapping
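The data redundancy row of the mapping can be made concrete with a small Python sketch: once structural metadata (columns) is linked to business metadata (glossary terms), terms that are physically stored in more than one system stand out. The `COLUMN_TO_TERM` mapping and its contents are hypothetical examples, not data from the paper.

```python
from collections import defaultdict

# Hypothetical link between technical metadata (system, table, column)
# and the business glossary term each column implements.
COLUMN_TO_TERM = {
    ("crm", "customers", "email"): "Customer Email",
    ("billing", "accounts", "contact_email"): "Customer Email",
    ("warehouse", "dim_customer", "email_addr"): "Customer Email",
    ("billing", "invoices", "amount"): "Invoice Amount",
}


def redundant_terms(column_to_term):
    """Map each business term to its columns when it is stored in more than one system."""
    locations = defaultdict(list)
    for column, term in column_to_term.items():
        locations[term].append(column)
    return {
        term: sorted(columns)
        for term, columns in locations.items()
        if len({system for system, _, _ in columns}) > 1
    }


report = redundant_terms(COLUMN_TO_TERM)
for term, columns in report.items():
    print(term, "->", [".".join(c) for c in columns])
```

Whether a flagged term is genuine redundancy or a deliberate copy (for example, a warehouse dimension) remains a judgment for the data stewards; the sketch only surfaces candidates.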

The solution, while addressing the current challenges, also provides an adaptable framework. This can be scaled, with sufficient customization, to meet future requirements.


5. The 'IDEAL' Approach

Recognizing and understanding the process and technology levers described above takes considerable analysis and effort, as well as an understanding of the complete enterprise blueprint. It is imperative that the implementation of the process and technology levers is completed in a methodical manner. We suggest adopting the 'IDEAL' approach (concisely depicted in Figure 7) for implementing the solution:

Figure 7. The IDEAL Implementation Approach

1. Identify need and stakeholders: Enterprises have multiple data management needs across the data life cycle, from data collection, management, and analysis to preservation and archiving. Deciding what needs to be managed (process) at which point of this data management cycle, and involving the right people, is critical for an organization. Sharing a vision for data management that aligns with measurable business and technical benefits will help in getting buy-in from various organizational functions. This phase should outline the people, processes and business functions where the need for and applicability of such a solution are highest. This will require a thorough analysis of current data management techniques, cross-functional knowledge, the technology landscape, and buy-in from all the stakeholders.

2. Define scope, landscape, and governance model: Defining the scope for data management projects and data governance can seem like a daunting task. It has the potential to take on an impossibly large scope and a pervasive, enterprise-wide reach. Organizations are advised to start small with a pilot, prove the value, and capitalize on those achievements to expand the scope. The dimensions that define the scope and landscape are the process, business function, data, and systems to be involved in the first pilot. Techniques of prioritization, ranking, and weights can be used to carve out the right portion of the organization to pilot.

In this phase, it is important to define and socialize the governance model that will apply to the organization. The people classified in the 'Identify' phase are tagged and associated with roles and responsibilities. A sample governance structure, as in Figure 8, can be used as the baseline and refined further. This phase also defines the metrics to be captured and monitored by the governance bodies for effective monitoring of the deployment of solutions within the organization. The metrics for monitoring the effectiveness of the governance model, architectural disciplines, progress on projects, and improvements in dimensions of data management (quality, security and so on) are finalized.

Figure 8. Sample Governance Model

[Figure: roles, top to bottom: Data Governance Council Head/Sponsor; Executive Data Steward (Business), Data Management Executive (IT), Chief Process Officer; Business Data Steward, Enterprise Data Architect, Chief Data Analyst, Chief Administrator, Enterprise Process Analyst, Technical Specialist, Regulatory Bodies; Process Specialist, Data Stewards (Business Function), Architects, Data Analysts/Knowledge Workers, Administrators, Business Analysts, Designers & Developers, Knowledge Officers, Auditors. Committees: Executive Committee with quarterly meetings, Management Committee with monthly meetings, Operational Committee with weekly meetings.]

3. Establish metadata blueprint and governance body: This phase is also referred to as an 'Implementation' phase wherein 'process and technology' levers of the solution are implemented in the organization. It is imperative to establish an enterprise-wide footprint to give a complete picture of the organization landscape. Metrics defined in the 'Define' stage are introduced at various hotspots identified for continuous monitoring and improvement. While the technology is prepared in this phase, the governance body should also be introduced to the nuances of the solution by means of training programs and socializing of program objectives and mandates.

Figure 9. Hotspots: Data Governance

4. Analyse metrics, leveraging reports and analytics to monitor deviation: The data governance body should monitor the defined metrics on a regular basis to ensure that the defined priorities are progressing as planned. The technology lever of the solution should be configured with all metrics and continuously refreshed to report the current health of the defined scope. The technology lever should also have the capability to drill down to the lowest possible element to pinpoint issues, so that the data governance body can decide on actionable items.


Figure 10. Data Governance Metrics

Figure 11. Analyse Tools: Metrics, Dashboards

5. Liaise with stakeholders: The data governance body is provided with extensive communication methods to liaise effectively within the organization and enable tracking of the status of communication as well. The established accountability infrastructure with people or roles and communication enablers ensures the right direction of the program.

We recommend adopting the 'IDEAL' approach which provides a flexible framework for organizations to leverage and customize data based on their needs.
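The 'Define' phase mentions prioritization, ranking, and weights as techniques for choosing a pilot. A minimal Python sketch of a weighted scoring approach follows; the criteria, weights, candidate names, and scores are all hypothetical and would need to be agreed by the stakeholders identified in the first phase.

```python
# Hypothetical criteria weights (summing to 1.0) and 1-5 scores per candidate.
WEIGHTS = {"business_value": 0.4, "data_readiness": 0.3, "stakeholder_buy_in": 0.3}

CANDIDATES = {
    "Customer onboarding": {"business_value": 5, "data_readiness": 3, "stakeholder_buy_in": 4},
    "Finance reporting": {"business_value": 4, "data_readiness": 4, "stakeholder_buy_in": 2},
    "Supplier management": {"business_value": 2, "data_readiness": 5, "stakeholder_buy_in": 3},
}


def rank_candidates(candidates, weights):
    """Return (name, weighted score) pairs, highest score first."""
    scored = [
        (name, round(sum(weights[c] * scores[c] for c in weights), 2))
        for name, scores in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_candidates(CANDIDATES, WEIGHTS)
for name, score in ranking:
    print(f"{score:.2f}  {name}")
```

A simple weighted sum is only one possible technique; its main appeal for a pilot decision is that the weights make the trade-offs explicit and easy to revisit with stakeholders.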


6. Conclusion

'Metadata driven data governance' is a powerful tool for governance teams, who can derive intelligence and monitor metrics to ensure the right analysis of data assets. Organizations, over a period of time, have made investments in technology to support their businesses. It is imperative that these investments be considered in the evolution of architectures to build a cost-effective, scalable, and high-performance system. 'Metadata driven data governance' provides deeper insight and analysis capability by integrating metadata. However, such programs need a meticulous implementation methodology, for which we recommend the 'IDEAL' approach.


All content / information present here is the exclusive property of Tata Consultancy Services Limited (TCS). The content / information contained here is correct at the time of publishing. No material from here may be copied, modified, reproduced, republished, uploaded, transmitted, posted or distributed in any form without prior written permission from TCS. Unauthorized use of the content / information appearing here may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties. Copyright © 2014 Tata Consultancy Services Limited


Subscribe to TCS White Papers

TCS.com RSS: http://www.tcs.com/rss_feeds/Pages/feed.aspx?f=w

Feedburner: http://feeds2.feedburner.com/tcswhitepapers

Contact

For more information about TCS’ consulting services, contact [email protected]

About Tata Consultancy Services (TCS)

Tata Consultancy Services is an IT services, consulting and business solutions organization that delivers real results to global business, ensuring a level of certainty no other firm can match. TCS offers a consulting-led, integrated portfolio of IT and IT-enabled infrastructure, engineering and assurance services. This is delivered through its unique Global Network Delivery Model™, recognized as the benchmark of excellence in software development. A part of the Tata Group, India's largest industrial conglomerate, TCS has a global footprint and is listed on the National Stock Exchange and Bombay Stock Exchange in India.

For more information, visit us at www.tcs.com
