The Data Reservoir as an enabler of differentiating Analytics initiatives


(1)

The Data Reservoir as an enabler of differentiating Analytics initiatives

3rd March 2015

Mandy Chessell CBE FREng CEng FBCS Distinguished Engineer, Master Inventor Chief Architect, Information Solutions

(2)

Agenda

Changing landscapes

Analytics Lifecycles

Data reservoir overview

Questions

“Looks like you’ve got all the data – what’s the holdup?”

(3)

CHANGING INFORMATION

(4)

Knowing your customers enables you to serve them better …

Behavioral data

- Orders

- Transactions

- Payment history

- Usage history

Descriptive data

- Attributes

- Characteristics

- Relationships

- Self-declared info

- (Geo)demographics

Attitudinal data

- Opinions

- Preferences

- Needs and Desires

Interaction data

- Email / chat transcripts

- Call center notes

- Web Click-streams

- In person dialogues

Who?

What?

Why?

How?

High-value, dynamic - source of competitive differentiation

(5)

Knowing your customers enables you to serve them better …

Behavioral data

- Orders

Descriptive data

- Attributes

- Characteristics

Attitudinal data

- Opinions

- Preferences

- Needs and Desires

Interaction data

- Email / chat transcripts

- Call center notes

- Web Click-streams

- In person dialogues

Why?

How?

High-value, dynamic - source of competitive differentiation

Master Data

Information

Analysis of

Channel Interaction

Methods

Analysis of

Feedback and

Interaction Content

?

(6)

The broadening scope of analytics

Applications

Data

Warehouse

Pattern

Discovery for

Analytics

Reporting

Data Marts

Operational

Data Store

(7)

SOA

The broadening scope of analytics

Master Data

Management

Hub

Applications

Data

Warehouse

Pattern

Discovery for

Analytics

Reporting

Data Marts

Operational

Data Store

(8)

SOA

The broadening scope of analytics

Master Data

Management

Hub

Applications

Data

Warehouse

Pattern

Discovery for

Analytics

Reporting

Data Marts

Hadoop

Operational

Data Store

Hadoop provides cheap storage and processing to increase

the amount of data – and the type of data that can be

processed in a cost-effective manner.

Customer

Conversations,

Web,

Social Media,

Log files, …

Sensors and

(9)

SOA

The broadening scope of analytics

Master Data

Management

Hub

Applications

Data

Warehouse

Pattern

Discovery for

Analytics

Hadoop

Operational

Data Store

SAND BOXES

Analyze

Values

Search

For Data

Reporting

(10)

Data blues & skills issues

A disproportionate share of the time spent in analytics projects goes on data preparation: acquiring, preparing, formatting and normalizing the data.

(11)

Business scenarios we see

Subject matter experts want access to their organization’s data to explore the content,

select, control, annotate and access information using their terminology with an

underpinning of protection and governance.

Data Scientists seeking data for new

analytics models.

Marketeer seeking data for new

campaigns.

Fraud investigator seeking data to

understand the details of suspicious

activity.

Day-to-day activity.

Requiring

ad hoc access

to a wide variety of data

sources.

Supporting analysis and

decision making.

Using the subject matter experts’ terminology.

(12)

The interesting dilemma …

A man goes into a jewellers and buys an expensive watch …

Is it fraud – in which case the bank must stop it

Is it money-laundering – in which case the bank must report it

Does he have an expensive trophy partner – in which case perhaps he would be

interested in a loan?

Has he just won the lottery – should the bank improve the services offered?

Threat

Obligation

Opportunity

Opportunity

The same event is of interest to different departments.

There is major overlap in the data required to answer the question.

It may not be possible to determine the answer with just the information in the channel

- Previous or subsequent activity is required

(13)

Application Groupings

Characterised by:

Availability

Data requirements

Performance

Skills

Rate of change / Stability

Systems of

Engagement

Systems of

Record

Systems of

Insight

(14)

A growing demand …

Business Teams want

Open access to more information

More powerful analysis and visualization tools

IT Teams are

Concerned about cost.

(15)

THE DATA RESERVOIR

(16)

Access in place

Up-to-date information

Cost-effective

Slower access path

Remote Access

Reformatting

Make a local copy

Specially formatted for use case

Local access

Local control

Local cost

Potentially stale values

How do we access information?

How much information? How rapidly is it changing? How frequently is it

accessed? How much transformation is required to consume the

information? When is the information available? Who owns the information?

How easily can it be changed?

(17)

How does the data reservoir support analytics development?

Advertise

Data Reservoir

Catalog

(18)

How does the data reservoir support analytics development?

Advertise

Discover

Data Reservoir

Catalog

Provision

1

2

3

4

(19)

How does the data reservoir support analytics development?

Advertise

Discover

Explore

Data Reservoir

Catalog

1

2

3

5

4

Sandbox

(20)

How does the data reservoir support analytics development?

Advertise

Discover

Explore

Deploy

Data Reservoir

Catalog

Provision

1

2

3

5

6

4

Sandbox
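The advertise, discover and provision steps on this slide can be sketched as a small in-memory catalog. This is an illustrative sketch only: the `Catalog` and `CatalogEntry` classes, their method names and the sample entries are assumptions made for this example and do not describe any product interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical description of one advertised information source."""
    name: str
    description: str
    owner: str
    tags: frozenset


class Catalog:
    """Toy catalog: sources are advertised, then discovered and provisioned."""

    def __init__(self):
        self._entries = {}

    def advertise(self, entry):
        # An information curator registers a source in the catalog.
        self._entries[entry.name] = entry

    def discover(self, keyword):
        # Find sources whose description contains the keyword or that carry it as a tag.
        keyword = keyword.lower()
        return sorted(
            entry.name
            for entry in self._entries.values()
            if keyword in entry.description.lower() or keyword in entry.tags
        )

    def provision_sandbox(self, names):
        # Collect the selected sources for exploration; unknown names are an error.
        missing = [name for name in names if name not in self._entries]
        if missing:
            raise KeyError(f"not in catalog: {missing}")
        return {name: self._entries[name] for name in names}
```

In this sketch a sandbox is just a dictionary of selected entries; a real reservoir would copy or expose the underlying data under governance rules.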

(21)

Active decision making in real-time

1. An activity occurs that calls for a decision.

2. The context from the activity is passed to the decision process.

3. The decision process augments the context with stored information and runs the decision model.

4. One or more actions are recommended to the activity.
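The four steps above can be sketched in a few lines of Python. Everything named here is hypothetical: the `Context` fields, the `STORED_INFORMATION` lookup and the toy threshold rule stand in for whatever repositories and deployed decision model a real reservoir would use.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Context passed from the activity to the decision process (step 2)."""
    customer_id: str
    event: str
    amount: float
    facts: dict = field(default_factory=dict)


# Hypothetical stored information, standing in for the reservoir's repositories.
STORED_INFORMATION = {
    "c-001": {"average_spend": 120.0, "recent_events": ["card_replaced"]},
}


def augment(context: Context) -> Context:
    """Step 3, first half: add stored facts and recent events to the context."""
    context.facts = dict(STORED_INFORMATION.get(context.customer_id, {}))
    return context


def decision_model(context: Context) -> list[str]:
    """Step 3, second half: a toy rule; a deployed model would run here."""
    average = context.facts.get("average_spend")
    if average is None:
        return ["request_more_information"]
    if context.amount > 10 * average:
        return ["hold_transaction", "notify_fraud_team"]
    return ["approve"]


def decide(context: Context) -> list[str]:
    """Steps 2 to 4: receive the context, augment it, recommend actions."""
    return decision_model(augment(context))
```

The same context could be handed to several decision models (fraud, compliance, marketing), which is one way to read the jeweller example earlier in the deck.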

Context

Action

Decision

Feedback

Information

2

3

5

Facts,

Recent Events,

Options

Decision Input,

Actions and

Outcomes

3

5

(22)

How does the data reservoir support data distribution?

Data Reservoir

Catalog

Provision

1

Access

3

Distribute

2

(23)

Big Data Lakes or Swamps?

As we collect data

• Can we preserve clarity?

• Do we know what we are collecting?

• Can we find the data we need?

Are we creating a data swamp?

How do we build trust in big data?

(24)

"The need for increased agility and accessibility for data analysis is the primary

driver for data lakes," said Andrew White, vice president and distinguished

analyst at Gartner. "Nevertheless, while it is certainly true that data lakes can

provide value to various parts of the organization, the proposition of enterprise

wide data management has yet to be realized."

(25)

The Data Reservoir

Information Management and Governance Fabric

Data Reservoir Services

(26)

Data reservoir connects to many types of systems

Line of Business Applications Information Service Calls Search Requests Report Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Data Reservoir Operations Curation Interaction Management Data Access Data Deposit Data Deposit Decision Model Management Enterprise IT

New Sources

Third Party Feeds Third Party APIs Internal Sources Deploy Real-time Decision Models

Consumers of

Insight

Analytics Tools

Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications

System of

Record

Applications

Enterprise Service Bus

Systems of

Engagement

Other Systems

Of Insight

Other Data

Enterprise Service Bus Events to Evaluate Information Service Calls Data Out Data In Notifications

(27)

Data reservoir supports real-time and batched ingestion of data

Line of Business Applications Information Service Calls Search Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Curation Interaction Data Deposit Data Deposit Decision Model Management Deploy Real-time Decision Models

Analytics Tools

MANUAL REQUEST

INFORMATION SERVICE CALL

CHANGE DATA CAPTURE

SCHEDULED EXTRACT

Enterprise IT

New Sources

Third Party Feeds

Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications

System of

Record

Applications

Enterprise Service Bus

Systems of

Engagement

Enterprise Service Bus Events to Evaluate Information Service Calls Data Out Notifications

(28)

Data refineries provide data movement, preparation, governance

Line of Business Applications Information Service Calls Search Requests Report Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Data Reservoir Operations Curation Interaction Management Data Access Data Deposit Data Deposit Decision Model Management Enterprise IT

Other Systems

Of Insight

New Sources

Third Party Feeds Third Party APIs Internal Sources

Other Data

Deploy Real-time Decision Models Understand Information Sources

Data Reservoir

Repositories

Consumers of

Insight

Analytics Tools

Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications

System of

Record

Applications

Enterprise Service Bus

Systems of

Engagement

Events to Evaluate Information Service Calls Data Out Data In Notifications Enterprise IT Interaction Service Interfaces Data Ingestion Publishing Feeds Continuous Analytics STREAMING ANALYTICS EVENT CORRELATION

(29)

Big data needs a variety of repositories for cost, access and

performance reasons

Line of Business Applications Information Service Calls Search Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Curation Interaction Data Deposit Data Deposit Decision Model Management Enterprise IT

System of

Record

Applications

Enterprise Service Bus

New Sources

Third Party Feeds

Systems of

Engagement

Deploy Real-time Decision Models

Analytics Tools

View-based Interaction Published SAND OBJECT CACHE Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications Events to Evaluate Information Service Calls Data Out Notifications Data Reservoir Repositories Descriptive Data INFORMATION VIEWS CATALOG Shared Operational Data ASSET HUB ACTIVITY HUB CODE HUB CONTENT HUB Deposited Data Historical Data

AUDIT DATA

OPERATIONAL

HISTORY

SEARCH

INDEX

All types of data

All types of data

System-level Data

(Pre-Archive)

Master and Reference

(30)

Like a well-run library, the data reservoir has a catalog

Line of Business Applications Information Service Calls Search Requests Report Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Data Reservoir Operations Curation Interaction Management Data Access Data Deposit Data Deposit Decision Model Management Enterprise IT Deploy Real-time Decision Models Understand Information Sources Understand Information Sources Understand

Compliance Compliance Report Advertise

Information Source

Governance, Risk and Compliance Team Information Curator Catalog Interfaces

Consumers of

Insight

Analytics Tools

View-based Interaction Published SAND BOXES REPORTING DATA MARTS OBJECT CACHE

Other Systems

Of Insight

New Sources

Third Party Feeds Third Party APIs Internal Sources

Other Data

Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications

System of

Record

Applications

Enterprise Service Bus

Systems of

Engagement

Events to Evaluate Information Service Calls Data Out Data In Notifications Data Reservoir Repositories Harvested Data INFORMATION WAREHOUSE Descriptive Data INFORMATION VIEWS CATALOG Shared Operational Data ASSET HUB ACTIVITY HUB CODE HUB CONTENT HUB Deposited Data Historical Data DEEP DATA AUDIT DATA OPERATIONAL HISTORY SEARCH INDEX OFFLINE ARCHIVE

(31)

Differing user perspectives

Information Governance

Catalogue

Search for, locate and download

data and related artifacts.

Provision Sand

Boxes.

Add additional insight into

data sources through

automated analysis.

Sand

Box

Define governance policies,

rules and classifications.

Monitor compliance.

View lineage (business and technical)

and perform impact analysis.

(32)

Data Reservoir

Information governance provides the mechanism for building trust

Line of Business Applications Information Service Calls Search Requests Report Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Data Reservoir Operations Curation Interaction Management Data Access Data Deposit Data Deposit Decision Model Management Enterprise IT

Other Systems

Of Insight

New Sources

Third Party Feeds Third Party APIs Internal Sources

Other Data

Deploy Real-time Decision Models Understand Information Sources Information Integration & Governance INFORMATION BROKER OPERATIONAL GOVERNANCE HUB CODE HUB WORKFLOW

STAGING AREAS MONITOR GUARDS

Consumers of

Insight

Analytics Tools

Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications

System of

Record

Applications

Enterprise Service Bus

Systems of

Engagement

Events to Evaluate Information Service Calls Data Out Data In Notifications

(33)

Information governance delivers …

Information Governance

Compliance

Policy

Administration

Policy

Enforcement

Policy

Monitoring

Policy

Implementation

Standards

Protection

Lifecycle

Quality

Information Values

Quality

Information

Dependencies

Information

Requirements

Information Supply

Chain Integrity

Information

Identification

Information

Retention

Information

Usage

Information

Privacy

Information

Architecture

Information

Disposal

(34)

Policy

Three interlocking lifecycles of information governance

Policy

Policy

Policy

Operations

Development

Metadata

(35)

Classification Schemes

Classification is at the heart of information governance. It characterizes the type, value and cost of information, or the mechanisms that manage it. The design of the classification schemes is key to controlling the cost and effectiveness of the information governance program.

Business Classifications

Business classifications characterize information from a business perspective. This captures its

value, how it is used, and the impact to the business if it is misused.

Resource Classifications

Resource classifications characterize the capability of the IT infrastructure that supports the

management of information. A resource's capability is partly due to its innate functions and partly

controlled by the way it has been configured.

Activity Classifications

Activity classifications help to characterize procedures, actions and automated processes.

Semantic Classification

(36)

Policy support inside the Information Governance Catalogue

Principle

Policy

Implications

Classification

Classification

Governance

Rule

Classified by

Deployed to,

Executed by,

Monitored by

Actioned by

Metadata

Description

Governance Rule

Implementations

Governance Rule

Implementations

Modelled Metadata

Asset

Principle

Policy

Implications

Principle

Policy

Implications

Governs

Information

Asset

Describes

Implemented by

(37)

Data Reservoir

Line of Business Applications Information Service Calls Search Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Curation Interaction Data Deposit Data Deposit Decision Model Management Enterprise IT Events to Evaluate Information Service Calls Notifications System of Record Applications Enterprise Service Bus New Sources Systems of Engagement Deploy Real-time Decision Models Understand Information Sources Understand Information Sources Understand

Compliance Compliance Report Advertise

Information Source

Governance, Risk and Compliance Team Information Curator Simple, ad hoc Discovery and Analysis Analytical Insight Applications

Governance Rules

Defined for each classification for each situation

Personal information

masked here

Personal information

masked here

Analytics Tools

Sensitive information

masked here
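The idea that governance rules are "defined for each classification for each situation" can be illustrated with a small lookup-driven masking function. This is a sketch under stated assumptions: the field classifications, the situation names and the hash-based masking are invented for the example and are not taken from the slide or from any product.

```python
import hashlib

# Hypothetical classification of each field in a record.
FIELD_CLASSIFICATION = {
    "name": "personal",
    "email": "personal",
    "salary": "sensitive",
    "country": "public",
}

# Hypothetical rules: for each situation, the classifications that must be masked.
MASKED_IN = {
    "raw_data_interaction": {"personal"},
    "view_based_interaction": {"personal", "sensitive"},
    "curation": set(),
}


def mask_value(value):
    """Replace a value with a short, stable token derived from its hash."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return "masked:" + digest[:8]


def apply_rules(record, situation):
    """Return a copy of the record with fields masked as the situation requires.

    Fields without a classification are treated as sensitive (an assumption).
    """
    masked = MASKED_IN[situation]
    return {
        key: mask_value(value)
        if FIELD_CLASSIFICATION.get(key, "sensitive") in masked
        else value
        for key, value in record.items()
    }
```

Keeping the rules in data rather than in code means a new classification or situation changes a table entry, not the masking logic.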

(38)

Integrated Metadata

Data Lineage (Traceability)

Where does this data come from?

Why is this data incorrect?

Why is this data incomplete?

Can I trust this value?

Impact Analysis

Where is this element used?

What happens if I change this?

Optimization

Where is the redundancy?

How can I make this run more efficiently?

Understanding

What does this mean?

How is this used?

Control

Why is this parameter set to this value?

Who made this change?
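Lineage ("where does this data come from?") and impact analysis ("where is this element used?") are the same question asked in opposite directions over a graph of data flows. The sketch below shows that idea with a hypothetical `LineageGraph` class and invented element names; it is not how any particular metadata tool stores its model.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy metadata graph of data flows between named elements."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_flow(self, source, target):
        # Record that data moves from source to target.
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start, edges):
        # Breadth-first walk collecting every element reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in edges.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return seen

    def lineage(self, element):
        """Everything this element is derived from (upstream)."""
        return self._reach(element, self._upstream)

    def impact(self, element):
        """Everything that would be affected by changing this element (downstream)."""
        return self._reach(element, self._downstream)
```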

(39)

The Information Governance Ecosystem

Information Governance is built on metadata management,

Policy and

Standards

Information

Refineries

Exception

Management

Reporting and

Audit

Information

Curation

(40)

Information is delivered in appropriate forms for consumers

Line of Business Applications Information Service Calls Search Requests Report Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Data Reservoir Operations Curation Interaction Management Data Access Data Deposit Data Deposit Decision Model Management Enterprise IT

Other Systems

Of Insight

System of

Record

Applications

Enterprise Service Bus

New Sources

Third Party Feeds Third Party APIs

Systems of

Engagement

Internal Sources Deploy Real-time Decision Models Understand Information Sources Understand Information Sources Understand

Compliance Compliance Report Advertise

Information Source

Governance, Risk and Compliance Team Information Curator Catalog Interfaces Raw Data Interaction SAND BOXES

Data Reservoir

Repositories

View-based Interaction Access and Feedback

Other Data

Consumers of

Insight

Analytics Tools

Published SAND BOXES REPORTING DATA MARTS OBJECT CACHE Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications Events to Evaluate Information Service Calls Data Out Data In Notifications

(41)

Information Virtualization hides the complexity of the information

landscape

Search and

View Values

Add

Insight

Create

APIs

Browse

Sources

Provision

Define

Views

Analyze

Values

Information Virtualization

(42)

Building a data reservoir

The data reservoir needs governance and change management to ensure that

information is protected and managed efficiently.

The first step in creating the reservoir is to establish the information integration and governance components, the staging areas for integration, the catalog and the common data standards.

The build out of the reservoir then proceeds iteratively based on the following

processes:

Governance of a data reservoir subject area.

Managing an information source.

Managing an information view.

Enabling analytics.

Maintaining the data reservoir infrastructure.

Information Integration & Governance INFORMATION BROKER OPERATIONAL GOVERNANCE HUB CODE HUB WORKFLOW

(43)

Data reservoir logical architecture

Line of Business Applications Information Service Calls Search Requests Deploy Decision Models Information Service Calls Data Access Deploy Real-time Decision Models Curation Interaction Data Deposit Data Deposit Decision Model Management Enterprise IT Events to Evaluate Information Service Calls Data Out Notifications

New Sources

Third Party Feeds

Deploy Real-time Decision Models Understand Information Sources Understand Information Sources Understand

Compliance Compliance Report Advertise

Information Source

Governance, Risk and Compliance Team Information Curator Catalog Interfaces Raw Data Interaction SAND BOXES Enterprise IT Interaction Service Interfaces Publishing Feeds Continuous Analytics STREAMING ANALYTICS Simple, ad hoc Discovery and Analysis Reporting Analytical Insight Applications

Analytics Tools

View-based Interaction Access and Feedback Published SAND OBJECT CACHE

System of

Record

Applications

Enterprise Service Bus

Systems of

Engagement

EVENT CORRELATION Data Reservoir Repositories Descriptive Data INFORMATION VIEWS CATALOG Shared Operational Data ASSET HUB ACTIVITY HUB CODE HUB CONTENT HUB Deposited Data Historical Data

AUDIT DATA

OPERATIONAL

HISTORY

SEARCH

(44)

The data reservoir

As organizations experiment with

analytics they discover:

Creating new analytics requires access to

historical data from many systems.

This data includes valuable and sensitive

data that is core to the organization’s operation.

Hadoop is a flexible platform for storing many types of data but is not necessarily fast

enough for the production deployment of some analytics. Data needs to be

reformatted and copied onto specialist analytics platforms such as Netezza.

A data reservoir provides:

Single extraction of data from operational systems and distribution to multiple

analytics platforms.

Cataloguing and governance of the data in the analytics platforms

Simple interfaces for the line of business to access the information they need.

Data Reservoir = Efficient Management, Governance, Protection and Access.
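The "single extraction, multiple distribution" point can be sketched as one read from an operational source followed by per-platform formatting. The function names, the platform names and the choice of CSV and JSON Lines as target formats are assumptions for illustration only.

```python
import csv
import io
import json


def extract_once(read_source):
    """Read the operational source a single time and hold the rows in memory."""
    return list(read_source())


def to_csv(rows):
    """Format rows as CSV text, for example for a bulk load into a warehouse."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=sorted(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json_lines(rows):
    """Format rows as JSON Lines, for example for landing in a Hadoop file system."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def distribute(rows, platforms):
    """Apply each platform's formatter to the same extracted rows."""
    return {name: formatter(rows) for name, formatter in platforms.items()}
```

The operational system is queried once, however many analytics platforms consume the result; adding a platform means adding a formatter rather than another extract.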

Data Reservoir

Information Management and Governance Fabric

Data Reservoir Services

(45)


(46)

PRODUCT MAPPINGS

(47)

Systems interfacing with the Data Reservoir

System/Subsystem Name and Description

Mobile and other channels: These are operational applications that support the interaction with people such as customers, suppliers and employees. The data reservoir may supply key data values and analytical insight to a high-speed cache for these applications to improve the performance of simple lookups. The data reservoir is able to refresh this cache after an outage.

System of Record Applications: These are operational applications that are driving an organization’s daily business. They supply information to the data reservoir that describes this daily operation and its associated master data. They receive analytical insights and other derived information such as micro-segmentation and alerts.

New Sources: New sources describe information outside of the business data managed by the system of record applications. This may be log files from customer interactions, or information from third parties such as social media services and data providers.

Other Data Lakes: This data reservoir may be exchanging information with other data lakes, swamps or reservoirs either owned by this organization, or part of a cloud deployment or owned by an external party.

Decision Model Management: Decision model management describes the systems used by data scientists and business analysts as they configure analytics models and rules to execute inside the data reservoir. This is where the advanced analytics and data mining are managed from. The team needs access to samples of the data, formatted for analysis tools, with sufficient performance capacity to handle intense, lumpy workloads from the mining and testing processing.

Information Curator: An individual or group of people in the organization that have information sources to share.

(48)

Data Reservoir Services Components Summary

Component: Data Ingestion
Description: Data ingestion is where data from the information sources is loaded into the data reservoir. This data is treated as reference data (read only) by the processes in the data reservoir. The data ingestion component is responsible for validating the incoming data, transforming relevant structured data to the data reservoir format and routing it to the appropriate data reservoir repositories.
Product: InfoSphere Information Server
Pattern: Information Broker

Component: Publishing Feeds
Description: Publishing feeds is responsible for distributing data from the data reservoir repositories to systems outside of the reservoir. This includes other data reservoirs and the operational systems of record.
Product: InfoSphere Information Server
Pattern: Information Broker

Component: Real-time Interfaces
Description: Real-time interfaces (a) provide services to access data in the data reservoir repositories and (b) provide real-time interfaces for querying data outside of the reservoir. These interfaces may be services or SQL style interfaces.
Product: InfoSphere MDM, Information Server (Information Services Director)
Pattern: Information Service

Component: Real-time Analytics
Description: Real-time analytics provides complex event processing and real-time analytics based on the activity within the organization, and externally.
Product: InfoSphere Streams
Pattern: Streaming Analytics Node

Component: Raw Data Interaction
Description: Raw data interaction provides access to most of the data (security permitting) in the data reservoir for advanced analytics. It is responsible for masking sensitive personal information where appropriate.
Product: InfoSphere Information Server; GaianDB/InfoSphere Federation Server; InfoSphere Big Insights
Pattern: Information Provisioning

Component: Catalog Interfaces
Description: The catalog interfaces provide information about the data in the data reservoir. This includes details of the information collections (both repositories and views), the meaning and types of information stored and the profile of the information values within each information collection.
Product: InfoSphere Information Server (InfoSphere Governance Catalog)
Pattern: Information Identification

Component: View-based Interaction
Description: Provides access to data in the data reservoir (subject to security permissions) for line of business teams that wish to perform ad hoc queries, search, simple analytics and data exploration. The structure of this information has been simplified and it is labeled using business relevant terminology.
Product: InfoSphere Information Server; GaianDB/InfoSphere Federation Server; InfoSphere Data Explorer
Pattern: Information Service; Information Provisioning; Search Node

Component: Reporting Data Marts
Description: The reporting data marts provide departmental/subject oriented data marts targeted at line of business reports.
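The data ingestion component described above has three responsibilities: validate, transform and route. A minimal sketch of that pipeline follows; the record shape, the `ROUTES` table and the repository names used as dictionary keys are assumptions chosen for the example, not a description of how InfoSphere Information Server implements ingestion.

```python
# Hypothetical routing table: record kind -> repositories that should receive it.
ROUTES = {
    "transaction": ["operational_history", "deep_data"],
    "document": ["content_hub"],
}


def validate(record):
    """Accept only records that carry a string id and a kind."""
    return isinstance(record.get("id"), str) and "kind" in record


def transform(record):
    """Normalize to the (assumed) reservoir format by trimming string values."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def ingest(records, repositories):
    """Validate, transform and route each record; return the rejected ones."""
    rejected = []
    for raw in records:
        if not validate(raw):
            rejected.append(raw)
            continue
        record = transform(raw)
        # Kinds without an explicit route land in deep data (an assumption).
        for name in ROUTES.get(record["kind"], ["deep_data"]):
            repositories.setdefault(name, []).append(record)
    return rejected
```

Rejected records are returned rather than dropped so that a stewardship or exception management process could pick them up.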

(49)

Data Reservoir Repositories –

Harvested Data, Descriptive Data and Deposited Data

Type: Harvested Data

Name: Operational History
Description: A repository providing a historical record of the data from the systems of record.
Product: Database
Pattern: Operational Status Node

Name: Information Warehouse
Description: A repository optimized for high speed analytics. This data is structured and contains a correlated and consolidated collection of information.
Product: PureData for Analytics; Industry Models
Pattern: Information Warehouse

Name: Deep Data
Description: A repository holding a copy of most of the data in the data reservoir. It provides a place where raw data can be landed for analysis. The data may be annotated, linked and consolidated in deep data. Data may be mapped to data structures after it is stored, so effort is spent as needed rather than at the time of storing. This repository is designed for flexibility, supporting both high volumes and a variety of data.
Product: InfoSphere Big Insights; Industry Models
Pattern: Map-Reduce Node

Name: Audit Data
Description: A repository used to keep a record of the activity in the data reservoir. It is used for auditing the use of data and who is accessing it, when and for what purpose.
Product: InfoSphere Big Insights
Pattern: Information Event Node

Type: Descriptive Data

Name: Catalog
Description: A repository and applications for managing the catalog of information stored in the data reservoir.
Product: InfoSphere Information Server; Industry Models
Pattern: Information Identification

Name: Information Views
Description: Definitions of simplified subsets of information stored in the data reservoir repositories. These views are created with the information consumer in mind.
Product: Relational Database; InfoSphere MDM; InfoSphere
Pattern: Virtual Information Collection

(50)

Data Reservoir Repositories –

Shared operational data

Type: Shared Operational Data

Name: Asset Hub
Description: A repository for slowly changing operational master data (information assets) such as customer profiles, product definitions and contracts. This repository provides authoritative operational master data for the real-time interfaces, real-time analytics and for data validation in data ingestion. It is a reference repository of the operational MDM systems but may also be extended with new attributes that are maintained by the reservoir. When this hub is taking data from more than one operational system, there may also be additional quality and deduplication processes running that will improve the data. These changes are published from the asset hub for distribution both inside and outside the reservoir.
Product: InfoSphere MDM Advanced Edition
Pattern: Information Asset and Information Asset Hub

Name: Activity Hub
Description: A repository for storing recent activity related to a master entity. This repository is needed to support the real-time interfaces and real-time analytics. It may be loaded through the data ingestion process and through the real-time interfaces. However, many of its values will have been derived from analytics running inside the data reservoir.
Product: InfoSphere MDM Custom Domain Hub; Industry Models
Pattern: Information Activity and Information Activity Hub

Name: Code Hub
Description: A repository of common code tables and mappings used for joining information sources to create information views.
Product: InfoSphere Reference Data Management Hub (RDM)
Pattern: Information Code and Information Code Hub

Name: Content Hub
Description: A repository of documents, media files and other content that has been managed under a content management repository and is classified with relevant metadata to understand its content and status.

(51)

Information integration and governance components

Name: Information Broker
Description: The runtime server environment for running the integration processes (such as the information deployment process) that move data in and out of the data reservoir and amongst the components within the reservoir.
Product: InfoSphere Information Server
Pattern: Information Broker

Name: Code Hub
Description: A repository managing code tables and code table used in the internal management of the data reservoir.
Product: InfoSphere Reference Data Management Hub (RDM)
Pattern: Information Code and Information Code Hub

Name: Staging Areas
Description: A server supporting the staging areas used to move information around the data reservoir.
Product: Database or InfoSphere Big Insights or WebSphere MQ
Pattern: Staging Area

Name: Operational Governance Hub
Description: A repository and applications for managing the information flow and information governance within the data reservoir. This information node supports the metadata services.
Product: InfoSphere Information Server
Pattern: Governance Node

Name: Monitor
Description: A mechanism to monitor the overall function and responsiveness of the data reservoir to assure consistent working.
Product: InfoSphere Information Server
Pattern: Information Probe and Information Monitoring

Name: Workflow
Description: A server running stewardship processes that coordinate the work of individuals responsible for fixing any problems with the data in the data reservoir.
Product: WebSphere Business Process Manager
Pattern: Agile Information Process and

This component provides the control of the information movement and consumption within the data reservoir (more details follow …)

(52)

REFERENCE MATERIAL

(53)

Information Architecture for a New Era of Computing

A high level description of the

Big Data and Analytics

(54)

Taking the Journey to IBM Cognitive Systems

Describes how an organization

should prepare for cognitive

computing

Includes an example roadmap of

solutions to develop key skills and

capabilities.

(55)

Next Best Action Redguide

The NBA Redguide is a customer guide to the solution.

It is suitable for the C-suite executives.

It explains the value of the solution.

It describes the solution’s architecture using the same

diagrams as we have just covered.

It also has examples of case studies from different

industries.

(56)

Ethics for Big Data and Analytics

Context – for what purpose was the data originally surrendered? For what purpose is the data now being used? How far removed from the original context is its new use?

Consent & Choice – What are the choices given to an affected party? Do they know they are making a choice? Do they really understand what they are agreeing to? Do they really have an opportunity to decline? What alternatives are offered?

Reasonable – is the depth and breadth of the data used and the relationships derived reasonable for the application it is used for?

Substantiated – Are the sources of data used appropriate, authoritative, complete and timely for the application?

Owned – Who owns the resulting insight? What are their responsibilities towards it in terms of its protection and the obligation to act?

Fair – How equitable are the results of the application to all parties? Is everyone properly compensated?

Considered – What are the consequences of the data collection and analysis?

Access – What access to data is given to the data subject?

Accountable – How are mistakes and unintended consequences detected and repaired? Can the interested parties check the results

http://www.ibmbigdatahub.com/whitepaper/ethics-big-data-and-analytics

(57)

Staying Ahead in the Cyber Security Game

http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=WH&infotype=SA&appname=SWGE_TI_SE_USEN&htmlfid=TIL14103USEN&attachment=TIL14103USEN.PDF#loaded

(58)

Industry Models and Big Data

Whitepaper on the use of our

industry models with big

(59)

Roles within the Reservoir

Governor: appoint an individual to coordinate the definition of policies related to information governance and their implementation.

Information Steward: appoint an individual to coordinate the manual activity necessary to monitor and verify that an information collection is meeting agreed quality levels. Create user interfaces and access rights to involve this individual in information quality processes such as the exception management process.

Quality Analyst: appoint an individual to monitor and analyze the state of the information flowing through the information supply chain.

Integration Developer: maintaining the data movement functionality in, around and out of the data reservoir.

Infrastructure Operator: appoint an individual responsible for starting, maintaining, and monitoring the systems that support the information supply chain.

Data Scientist: appoint an individual to analyze the information that the organization is collecting in order to understand patterns of success.

Business Analyst: appoint an individual to analyze the way people are working, understand where the processes can be improved, and define new procedures, rules, and requirements for the IT systems.

Information Owner: appoint an individual to be the owner of the information collection who is responsible and accountable for ensuring it is capable of supporting the organization’s activities.
