• No results found

Are You Big Data Ready?

N/A
N/A
Protected

Academic year: 2021

Share "Are You Big Data Ready?"

Copied!
53
0
0

Loading.... (view fulltext now)

Full text

(1)

ACS 2015 Annual Canberra Conference

Are You Big Data Ready?

Vladimir Videnovic

(2)
(3)

“If you can't explain it simply,

you don't understand it well enough”

Albert Einstein

Introduction

(4)

Big Data

Term commonly used to describe phenomenon of

large and/or complex datasets

caused by the exponential growth and availability of data

(5)

Copyright © 2015, Oracle and/or its affiliates. All rights reserved. |

Big Data – Big Vs

(6)
(7)
(8)

Big Vs

Business Perspective

It’s just Data – nothing has changed...

Value

– Enable smarter decisions, faster actions and optimised results

Valuation

– Evaluation of outcomes and continuous optimisation

– Measure the impact on key performance indicators

– Assess the ability to forecast future outcomes

Visibility

– Discover important insights – demand for good quality analytics & insight

– Study patterns, behaviours, trends and relationships

(9)

Copyright © 2015, Oracle and/or its affiliates. All rights reserved. |

Big Vs

Information Management Perspective

It’s Information – and many things have changed!

Variety

Data Integration

New data sources (Social Media, Data streams, Internet of Things)

Lack of standards for managing various data types

Velocity

Data streaming, machine/sensor generated data

Real-time insights

Vulnerability

Information Security

Fraud discovery

(10)

Copyright © 2015, Oracle and/or its affiliates. All rights reserved. |

Big Vs

Data Science Perspective

Value

Understanding what is possible

Prioritising/sizing up the opportunities

Analytics that do not support decision making doesn’t add value

Variety

Internal document/text/audio sources may be inaccessible

Need to prototype/test value add in adding new unstructured data from social

media and other external sources

Velocity

Move to real-time services

(11)

Big Data

(12)
(13)
(14)

Big Vs

Survey

(15)

BIG Data

Better

(16)

BIG Information

Enterprise capability required to

acquire relevant

Data

from multiple sources,

organise it in a repository,

convert it into

Information

by analysing it,

utilise it for deriving

Knowledge

(17)

Promise of Analytics

An Information-Powered Organisation

is equipped with timely insights and knowledge

to continuously optimise the way they

derive answers, decisions and actions

conduct their business

streamline their operations

deliver services and/or products

(18)
(19)

Information Management Today

(20)

Information Management Today

Common Challenges

Access to Information

Information Quality

Efficient Use of Information

(21)

Barriers to Mature Use of Information

Survey

(22)

Information Management

(23)
(24)

Need

Analytic Value at Fast Pace

Tool Complexity

Early Hadoop tools only for experts

Existing BI tools not designed for Hadoop

Emerging solutions lack broad capabilities

80% effort typically

spent on evaluating

and preparing data

Data Uncertainty

Not familiar and overwhelming

Potential value not obvious

Requires significant manipulation

Overly dependent on

scarce and highly

(25)

Modern Information Environment

Roles

Data Engineer (Data Wrangler, Data Janitor)

Data Scientist (Data Analyst, Data Miner)

(26)

Data Engineer

• ability to quickly understand data

• transform and enrich data

• programming

• data visualisation

• statistics

• effective present of results

Skills

Prepare data for easier access, analytical processing and efficient interpretation

(27)

Data Engineer

• data pattern discovery

• Ability to tell the story

• curiosity

• attention to detail

• research skills

• Question everything

• strive to make a difference

Skills

Create insights from available data

(28)

Modern Information Environment

People

Everyone on the Journey

Change

Empowered people

Easy Access to ALL Information

Engineered Processes

(29)
(30)

Modern Information Environment

Business Processes

(31)

Modern Information Environment

Business Processes in the Digital Era

(32)
(33)

Data Reality

Existing sources + Emerging Sources

Data Warehouse

Existing Sources

Emerging Sources

(34)

Information Management – Logical View

(35)

Big Data Eco-System

Data Fast Data

Events Actions

1 2 3

Streams

Information Warehouse

Reservoir Factory Warehouse

(36)

Cloudera Hadoop Distribution

HDFS – Storage - distributed file system that provides high-throughput access to data MapReduce – Process - framework for performing distributed data processing

Hive - Data Warehouse system for Hadoop

HCatalog - metadata abstraction layer for referencing data HBase - Hadoop DataBase - distributed columnar database

Hue – user interface for managing and using Hadoop components

Spark - parallel data processing framework for developing Big Data applications

YARN – cluster management – reliable distributed processing of very large data sets Oozie – workflow - coordination system for managing Hadoop jobs

ZooKeeper - centralised service for maintaining configuration information

Sqoop – efficient transfer of bulk data between Hadoop and structured datastores

(37)

Addressing Barriers

Information

Management

Capabilities

Focus Organisation-wide Information Management Goal Information Capability Design Patterns enabling

Information-Powered

(38)
(39)

Silos Standardise Optimise Information on Demand

• One off tools/solutions

• Bottom-up

• Local business unit

driven

• Many versions of truth

• Independent data

marts

• Enterprise standard

tools/solutions

• Data warehouse and

dependent data marts

• IT & LOB partnership

• Secure, consistent

access to all data

• Agile, flexible

architecture

• Analyse “just in time”

structured & unstructured data together

• Advanced analytics &

real-time

recommendations

• All stage 3 capabilities

delivered as a platform

(service/cloud)

• Access of tools and

data among broad subscriber group

Unified Information Architecture Maturity Phases

(40)

Data Processing - Exploration

Definition

• Understand Data and Data Value • Explore attributes/ metadata • Understand data quality

(needs/reality) • Prioritise • Identify custodians IM Capabilities • Data Staging • Data Discovery Enterprise Capabilities • Planning • Experience • Continuous Improvement • Data and Information • Enablers

(41)

Data Processing - Collection

Definition

• Acquisition/Ingestion/Extraction of data • Quality audit

• Change Data Capture

• High-Volume Data Acquisition • Stream/Event Data Capture

IM Capabilities • Extract Data • Ingest Content

• Change Data Capture

(42)
(43)

Data Processing - Storage

Definition • Structured Data • Content/Documents/Records • Non-structured Data • Storage Strategy • Data availability IM Capabilities

• Enterprise Data Warehouse (Curated Data) • Data Reservoir (Exploratory Data)

Enterprise Capabilities • Resource Management • Business Continuity • Knowledge

(44)

Data Processing – Data Use/Information Creation

Definition

• Generate Information Assets • Derive intelligence

• Actionable Insights • Support Decisions • Enable Innovation

• Support Business Processes

IM Capabilities • Transactional Reporting • Ad-Hoc reporting • Structured Analysis • Sentiment Analysis • Advanced Analysis • Enterprise Search Enterprise Capabilities • Decision Making • Performance Mngmt. • Change Management • Strategy and Plans

Addressing Barriers

(45)

Data Processing – Data/Information Sharing and Collaboration

Definition

• Publish Information Assets • Action on Intelligence

• Communicate Decisions • Operationalise Innovation

IM Capabilities

• Enterprise Information Portal

Enterprise Capabilities • Outcomes • Decision Making • Communication • Collaboration • Change Management • Shared Vision

(46)

Data Processing - Optimisation

Definition

• Optimise Information Assets • Optimise Business Processes • Optimise Decisions

IM Capabilities

• Unknown Data Analysis

• Evidence-based Decision-Making Enterprise Capabilities • Performance Management • Business Continuity • Outcomes • Shared Vision

(47)

Data Processing – Information Archiving/Disposal

Definition

• Define and apply disposal authorities • Preserve information of interest

IM Capabilities

• Enterprise Content Management/Archiving

(48)

Core Information Governance Capabilities

IM Capabilities

• Change Management

• Data Quality Continuous Optimisation • Data Lineage

• Master Data

• Information Security

Enterprise Capabilities

• Strategy and Planning • Procedures

• Policies

• Data and Information • Security

(49)
(50)

Modern Information Environment

• People

• Business Processes

• Technology

(51)

Information Management Capability Modernisation

Benefits

Information-powered

users

Right

Information

Operationalised

Innovations

- Data available for analysis and ad-value processes

- Information transparent, useable and actionable to larger audience - Improved utilisation of existing information assets

- Information presented

consistently and accurately to all authorised users via appropriate channels

- Standardised information

platform ensures reuse of business patterns and information

consistency

- Support Innovation to

continuously improve business processes and/or delivery of services

- On-time insights available to larger population of information consumers, enabling better and consistent decision-making

ACCESS

QUALITY

EVOLUTION

BUSINESS

Informed

Decisions

- Consolidated information platform can reduce costs of maintenance and increase staff efficiency - Minimise support issues from inadequate delivery of information - Real-time analytic environment to support forward-looking analysis and evidence-based decision-making

(52)
(53)

References

Related documents

The application of a combined S-BPM/DSM approach to process integration can be illustrated using a cross- organisational research funding process involving four

Draw- ing on this research, we created a set of recommendations to im- prove the organization and created a set of marketing materials: a logo and promotional video for the

Similarly, comfort and travel time are valued higher by commuters from zones close to CBD (i.e., within 5 km to the CBD) than those from city peripherals. It was, how- ever,

In conclusion, for the studied Taiwanese population of diabetic patients undergoing hemodialysis, increased mortality rates are associated with higher average FPG levels at 1 and

The main wall of the living room has been designated as a "Model Wall" of Delta Gamma girls -- ELLE smiles at us from a Hawaiian Tropic ad and a Miss June USC

For the purposes of the GAFIS project, gateway opportunities are those that bring large quantities of poor customers to the threshold of the formal financial system—to the point

(44) Datum van openbaarmaking (44) Date of publication by printing or similar process of an examined document on which no grant has taken place on or before said date.. (45) Datum

allocation across application needs, (ii) index management to facilitate indexing of data on flash, (iii) storage reclamation to handle deletions and reclamation of storage space,