
Big Data & Data Gate : Auditing Big Data : What do we need to know?


©Copyright 2003-Present Atre Group, Inc. www.atre.com Atre Group, Inc.

www.atre.com

Auditing Big Data : What do we need to know ? Milan

Big Data & Data Gate: Auditing Big Data: What do we need to know?

* What is Big Data?
* Why is it important?
* 10 Golden Rules for it to be useful
* Security & Privacy
* How to audit a “Big Data Implementation” in an organization?

Milan, Italy: Oct 02, 2013

Innovations & Norms: Friends or Enemies?

Shaku Atre, Atre Group, Inc.

366 West 11th Street, Suite 7D, New York, NY 10014

521 38th Avenue, Santa Cruz, CA 95062


Is the overuse of the phrase “Big Data” tiring you?

Too much of anything could be tiring

Big Data seems to be the phrase that is tiring us now.

• When can we label data as Big Data?

• Let us attempt to set up a framework of rules for when a data system can be called a Big Data System.

Here is the framework of 10 rules!

My blog: http://www.atre.com/big-data/

Shift focus from “Big” to “Data”! And then to “Big Data Business Analytics”! And then to the “Value” gained!


Shaku Atre’s Ten Golden Rules of Big Data Systems

1. The Big Data System for Business Analytics Rule
2. The Big Data System Access Rule
3. The Big Data System Scalability Rule
4. The Big Data System Flexibility Rule
5. The Big Data System Security and Privacy Rule
6. The Big Data System Visualization & Data Mining Rule
7. The Big Data System Dispersion and Reassembly Rule
8. The Big Data System Dormant Data Access Rule
9. The Big Data System Skill Set Rule
10. The Big Data System “Big Human Judgment” Rule

We will try to handle one rule at a time. Here we go!


Ten Golden Rules of Big Data Systems

The Big Data System for Business Analytics Rule

The V⁵ Function

To qualify as a Big Data System, a system must be a function of V⁵

Big Data = f(Variety of data,
             Various Interactions due to correlations between data,
             Velocity of data arrival and departure,
             Volume of data,
             Providing big Value to the business via Business Analytics)

http://www.atre.com/big-data/

Big Data Business Analytics Must:

• Provide business performance comparisons
• Use Big Data as actionable information to improve performance of the business

New Big Data Trends are:

• Meet V⁵ requirements
• Look at the data for opportunities
• Use the data to reap the most benefits from it!
• Determine your “Golden Path” business analytics
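As a rough illustration of the V⁵ function, the sketch below records whether a system shows each of the five V's and reports the ones that are missing. The `V5Profile` class and its field names are hypothetical, invented for this example; the slides define no such data structure, and a real assessment would rely on graded evidence rather than yes/no flags.

```python
from dataclasses import dataclass, fields


@dataclass
class V5Profile:
    """Hypothetical yes/no record of the five V's for one system."""
    variety: bool
    various_interactions: bool
    velocity: bool
    volume: bool
    value: bool


def missing_vs(profile):
    """Return the names of the V's that the system does not exhibit."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


def qualifies_as_big_data(profile):
    """Under Rule 1, a system qualifies only if all five V's are present."""
    return not missing_vs(profile)
```

A system that handles variety, interactions, velocity, and volume but delivers no business value would be reported with `value` missing, which matches the slide's emphasis on value gained.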


Top Ten Rules of Big Data Systems

What is Needed to Perform Business Analytics with Big Data?

First principle of analysis: based on comparison

• Better or worse? Than what?

Second principle of analysis: based on evidence

• Facts… because data doesn’t lie
• What we sold
• How we sold
• Why we sold or didn’t sell
• What we were able to sell compared to the goal set

Third principle of analysis: provide actionable information with multiple variables

• Why, how, what, where, when, how much
• What action should we be taking, or should we keep the status quo?


Ten Golden Rules of Big Data Systems

Big Data System Access Rule

http://www.atre.com/big-data/big-data-system-access-rule/

For a system to qualify as a Big Data System, it must address the needs of all types of users, their varied demands, and the various technologies they are using.

Let us see whether the demography of our varied user groups is changing as we speak:

• % of technology-savvy users/customers in the business environment
• % of business-savvy users/customers in the business environment
• % of young customers (< 25 years of age) is on the rise at a very fast rate

One of the big reasons is easy access to social media!


Top Ten Rules of Big Data Systems

The Big Data System Scalability Rule

RULE 3

• Growth of data
• Growth of number of users
• Growth of various devices and channels used for sending and receiving data
• Growth of various functionalities
• Growth of interactions between these variables
• Growth of expectation levels of performance

http://www.atre.com/big-data/big-data-system-scalability-rule/

For a system to qualify as a Big Data System, it must be scalable with Big Data in motion and humongous data at rest.

[Figure: Scalability of information (left variable) ∝ the growth factors listed above (right variable)]


Top Ten Rules of Big Data Systems

The Big Data System Flexibility Rule

http://www.atre.com/big-data/big-data-system-flexibility-rule/

For a system to qualify as a Big Data System, it must have flexibility in its underlying architecture.

RULE 4

Flexibility of delivery of information ∝ the underlying architecture of the business analytics software and hardware

Is the underlying architecture of the software flexible enough to receive and accept data:

• from a variety of devices
• from various types of users
• in a variety of formats
• at various timeframes
• at various speeds, and
• in various volumes?

Can the software store the data, read it, divide it if necessary, work on it, and present it in various formats on various devices, to satisfy the variety of business analytics needs for improving business performance?

If the underlying architecture is not meant for the new features and you try to fit it – as a “Square Peg in Round Hole” the new doesn’t fit and the old gets


Top Ten Rules of Big Data Systems

The Big Data System Security and Privacy Rule

Decide how to implement security and privacy between the interacting sources of data with the complexity of correlations involved.

RULE 5

http://www.atre.com/big-data/rule-5-the-big-data-system-security-and-privacy-rule/

For a system to qualify as a Big Data System, it must implement security and privacy between the interacting sources of data with the complexity of correlations involved.

Implementation of Privacy ∝ Implementation of Security

As a car is driven through a toll booth, various interacting databases can be accessed, and the driver’s private information will be at risk!


Top Ten Rules of Big Data Systems

The Big Data System Visualization & Data Mining Rule

Very large sets of data in complex relationships are difficult to grasp. Can the software you are considering prepare performance dashboards based on the Key Performance Indicators (KPIs) determined by the organization?

RULE 6

http://www.atre.com/big-data/

For a system to qualify as a Big Data System, it must be able to represent data in a visualized form as a performance dashboard and also find some “nuggets” of insight that we didn’t know before.

Business Analytics ∝ Effective Performance Dashboards integrating words, numbers, images, and possibly audio/video, by integrating data sources to provide possible inferences for action


Top Ten Rules of Big Data Systems

The Big Data System Dispersion and Reassembly Rule

http://www.atre.com/big-data/rule-7-the-big-data-system-dispersion-and-reassembly-rule/

For a system to qualify as a Big Data System, it must be able to disperse data in “chunks” to a number of processors, reassemble them, and not lose anything on the way.

RULE 7

Divide and conquer by using low-cost commodity hardware

[Figure: Big Data is dispersed to a number of processors and then reassembled]
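The disperse-and-reassemble idea can be sketched in a few lines. This is a minimal illustration, not any particular product's mechanism: the function names are hypothetical, the per-chunk work is a simple sum, and threads stand in for the separate low-cost processors of a real cluster. The record count carried with each partial result is how the sketch checks that nothing was lost on the way.

```python
from concurrent.futures import ThreadPoolExecutor


def disperse(records, chunk_size):
    """Split the records into consecutive chunks of at most chunk_size items."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def process_chunk(chunk):
    """Hypothetical per-chunk work: return a partial sum and the record count."""
    return sum(chunk), len(chunk)


def reassemble(partials, expected_count):
    """Combine the partial results and verify that no record was lost."""
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(n for _, n in partials)
    if count != expected_count:
        raise ValueError(f"lost records: expected {expected_count}, got {count}")
    return total


def dispersed_sum(records, chunk_size=1000, workers=4):
    """Disperse the records, process the chunks concurrently, then reassemble."""
    chunks = disperse(records, chunk_size)
    # Threads stand in for the separate processors or nodes of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return reassemble(partials, expected_count=len(records))
```

Calling `dispersed_sum(list(range(10_000)))` returns the same total as a plain `sum`, and the count check raises an error if a chunk were to go missing.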


Top Ten Rules of Big Data Systems

The Big Data System Dormant Data Access Rule

The software should be able to reach into document archives to draw upon the wealth of hidden information for improving performance of the business.

RULE 8

http://www.atre.com/big-data/rule-8-the-big-data-system-dormant-data-access-rule/

For a system to qualify as a Big Data System, it must be able to exploit both conventional and “quirky” reservoirs of dormant data.

Big Data = f(Structured and unstructured dormant data generated by old, conventional systems as well as new, quirky systems)


Top Ten Rules of Big Data Systems

The Big Data System Skillset Rule

The greatest skill required is quickly determining what to throw away, what to keep, and when to walk away. That is the secret to survival!

RULE 9

http://www.atre.com/big-data/

For a system to qualify and succeed as a Big Data System, a workforce with skillsets in Business Knowledge and Data Science working in tandem is necessary.

Success with Big Data System = f(Team with Skillsets in Business Knowledge + Expertise in Data Science as a cumulative expertise from the disciplines of Computer Science, Mathematics, Statistical Analysis, Data Visualization and even Social Sciences!)


Big Data Business Analytics Skillset

[Figure labels: Big Data; Business; Technology; Business Analytics; Presentation & Visualization; Vertical Industry; More Specialties; Analytics with Deductive Logic; Visualization, Audio, Video Presentations]

• Is there one Data Scientist who has expertise in all four of these sectors, along with Mathematics, Statistics, Economics, Business Administration…? Absolutely not!

• People with different expertise have to be teamed up!


Top Ten Rules of Big Data Systems

The Big Data System “Big Human Judgment” Rule

http://www.atre.com/big-data/

For a system to qualify as a Big Data System, don’t ignore Intuition + Common Sense… which is not very common!

RULE 10

Big Data is new, but not that new, so it presents a challenge: convincing the Old Guard that they need to trust Big Data results the way they trust their own intuition, aka Big Human Judgment.

Keep in mind that “Garbage in, garbage out” is valid for any machine! But a human brain is capable of determining what is garbage and what is not in an almost unlimited way!

Determine how Business Analytics can add the most improvement value by using the Big Data and the insights hidden in it.


Security Requirements

A typical distribution of security requirements for data:

* 85% of data needs little or no security

* 13% of data needs some level of security

* 2% of data needs high level of security
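To make the distribution concrete, the following sketch applies the three percentages from the slide to a total data volume. The 500 TB figure used in the note below is a made-up number for illustration only; the function and constant names are likewise hypothetical.

```python
# Percentage shares taken from the slide above.
SECURITY_SHARES = {
    "little or no security": 85,
    "some level of security": 13,
    "high level of security": 2,
}


def split_by_security_level(total_tb):
    """Return the volume (in TB) that falls into each security level."""
    return {level: total_tb * pct / 100 for level, pct in SECURITY_SHARES.items()}
```

For a hypothetical 500 TB store, this gives 425 TB needing little or no security, 65 TB needing some security, and 10 TB needing a high level of security, which suggests where the most expensive controls should be concentrated.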


Security & Privacy : What is at stake with Big Data?

Security and Privacy walk hand in hand. Implementation of Privacy is directly proportional to implementation of Security. That means “No Security Implementation” implies “No Privacy Implementation”.

Let us consider an example:

• A car is being driven through a toll booth. The toll booth's E-ZPass toll collection system is a Big Data System. The “electronic eye” reads the E-ZPass located on the front windshield of each car that passes through the E-ZPass lane (a driver without an E-ZPass driving through the E-ZPass lane creates a big honking competition!).

• A banking or credit card database, stored at one of the banks, is then accessed to deduct the toll.


Privacy Issues of Time & Location Data

Time and location data is about more than just customers. It is a way for businesses to know how and where their employees are doing their work.

• For example: A delivery service is going to want to know where each delivery person is at any given time.

Time and location data raises a lot of serious questions, not only about privacy but also moral and ethical ones, making it one of the most privacy-sensitive types of Big Data.

• For example: Should microchips be implanted in children in case they get lost? Or in an elderly person with dementia who has a tendency to wander away from home?


Public Cloud versus Private Cloud: Security & Privacy are the major concerns

Figure 4.5 Public Clouds versus Private Clouds

©Taming the Big Data Tidal Wave, Bill Franks, John Wiley & Sons, Inc

[Figure labels: Public Cloud, Firewall, Users]


Global Privacy Principles

1. Notice (Transparency): Why information is being collected
2. Choice: Offer the opportunity to choose what information is used and/or disclosed
3. Consent: Information is only disclosed to the parties consistent with notice and choice
4. Security: Protect collected information from loss, disclosure, destruction, etc.
5. Data Integrity: Ensure the information is true, complete, and up to date
6. Access: Individuals should have access to their personal data
7. Accountability: Firms must be accountable for these principles


Security & Privacy : What is at stake with Big Data?

What are the different ways of implementing security?

• Centralized vs. decentralized solutions
• Physical security
• Logical security: Authentication, Authorization, Access Control


Security & Privacy : What is at stake with Big Data?

Step 1: Determine Connectivity Paths

[Figure: connectivity paths labeled A through H, connecting laptops, mobile devices, PCs, and the mainframe to the communication server, database server, and LAN file server]

This figure maps the physical network to the logical data paths in the network.


Security & Privacy : What is at stake with Big Data?

Step 2: Linking Connectivity Paths to Security Packages

Use this chart to show the data paths on one axis and the security systems on the other.

[Chart: data paths A, B, C, D, E, F, G, H, I on one axis; security systems on the other axis: Mainframe Security Package, LAN Security Package, PC Security Package, ????, Password Security, Encryption, Function-Specific Security Package 1, Function-Specific Security Package 2. Each cell is marked either “Security exists” or “No security”.]
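One way to keep such a chart in a form an auditor can query is sketched below. The path letters follow the slide, but the coverage data is entirely hypothetical and the package list is abbreviated; the point is only to show how paths with no security in any column can be found mechanically.

```python
PATHS = list("ABCDEFGHI")

PACKAGES = [
    "Mainframe Security Package",
    "LAN Security Package",
    "PC Security Package",
    "Password Security",
    "Encryption",
]

# Hypothetical coverage data: the security packages protecting each path.
COVERAGE = {
    "A": {"Mainframe Security Package", "Password Security"},
    "B": {"LAN Security Package"},
    "C": {"PC Security Package", "Encryption"},
    "D": set(),
    "E": {"Password Security"},
    "F": {"LAN Security Package", "Encryption"},
    "G": set(),
    "H": {"Mainframe Security Package"},
    "I": {"Password Security", "Encryption"},
}


def unprotected_paths(paths, coverage):
    """Return the paths that no security package covers."""
    return [path for path in paths if not coverage.get(path)]


def chart_rows(paths, packages, coverage):
    """Build one text row per path: 'X' means security exists, '-' means none."""
    rows = []
    for path in paths:
        marks = ["X" if pkg in coverage.get(path, set()) else "-" for pkg in packages]
        rows.append(f"{path}: " + " ".join(marks))
    return rows
```

With the sample data, `unprotected_paths(PATHS, COVERAGE)` flags paths D and G as having no security at all, which is the kind of gap the chart is meant to expose.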


Big Data & Auditing : Security & Privacy Centered

Instead of trying to build out your own big data infrastructure, use big data capabilities in the cloud. Reasons to consider this: (1) most organizations don’t have the required big data security skills anyway, and (2) offloading this work to somebody else frees up resources to deal with the information coming from the big data analytics.

Very Important Questions:

• Which big data is considered important enough to be secured?
• Does your organization have Big Data security skills?

Have you thought about

Many other domains are also involved, such as legal, privacy,


How do I audit Organizations with Big Data?

Shaku Atre’s Ten Golden Rules of Big Data Systems

1. The Big Data System for Business Analytics Rule
2. The Big Data System Access Rule
3. The Big Data System Scalability Rule
4. The Big Data System Flexibility Rule
5. The Big Data System Security and Privacy Rule
6. The Big Data System Visualization & Data Mining Rule
7. The Big Data System Dispersion and Reassembly Rule
8. The Big Data System Dormant Data Access Rule
9. The Big Data System Skill Set Rule
10. The Big Data System “Big Human Judgment” Rule


How do I audit Organizations with Big Data?

Shaku Atre’s Ten Golden Rules of Big Data Systems

Audit objectives represent the high-level goals and anticipated accomplishments of the review, and address the controls and risks associated with the client's activity.

In the context of Rule #1: The Big Data System for Business Analytics Rule

Questions for the client:

• What are the different types of Big Data you are planning to use?
• Describe: types of Big Data, volume, and “noise” in the data. Are you deleting serially correlated data? If yes, how? If not, why not?

V⁵:

• Variety: Have you classified it? How many different types of data are you managing? Can you specify what they are?
• Various Interactions: Do you know them?
• Velocity: At what speed is the data “hitting” you?
• Volume: What is the rate at which it is increasing?
• Value: What type and how much value are you getting out of it? Can


How do I audit Organizations with Big Data?

Shaku Atre’s Ten Golden Rules of Big Data Systems

In the context of Rule #2: The Big Data System Access Rule

Questions for the client:

• Do you have a cross-reference list of which data is being accessed by which users?
• Which customer data is accessed by which users in your organization?
• Do you have any categories of user groups, and corresponding needs for data?
• Do you have any standards for the use of mobile devices? Have you implemented Bring Your Own Device (BYOD)? Any standards for that?
• How user-friendly are your systems? (see the slide on user-friendliness)
• How strong a presence do you have on the web? Do you conduct business (selling) from your website?
• Do you use “cookies”?
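If the client has no cross-reference list but does keep access logs, a list of the kind asked for in the first question can be derived from them. The sketch below assumes a log already reduced to (user, dataset) pairs; the sample entries, user names, and dataset names are all hypothetical.

```python
from collections import defaultdict


def build_cross_reference(access_log):
    """Map each dataset to the sorted list of users who accessed it."""
    users_by_dataset = defaultdict(set)
    for user, dataset in access_log:
        users_by_dataset[dataset].add(user)
    return {dataset: sorted(users) for dataset, users in users_by_dataset.items()}


# Hypothetical access log entries: (user, dataset).
SAMPLE_LOG = [
    ("alice", "customer_profiles"),
    ("bob", "customer_profiles"),
    ("alice", "sales_orders"),
    ("bob", "customer_profiles"),
]
```

Repeated accesses by the same user collapse into one entry, so the result answers "who touches which data" rather than "how often".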


What is User-Friendliness and User-Intuitiveness?

Quantifiable criteria or numbers:

1. Average or median time to learn procedures
   Measured in clock time

2. Speed of task accomplishment
   Measured in clock time

3. Acceptable rate of user errors
   Set the rate, and change it for repeat users as time passes; track errors in both first-time and repeat use

4. What percentage of users ask for the user’s manual?
   If it is more than 10%, the system is NOT user-intuitive!

5. User retention of commands and queries over a period of time
   How long before users forget or start making the same errors

6. Subjective satisfaction
   Percentage of users who find the system usable and come back for more

7. Help system: does it really resolve problems?
   Percentage of times that users give up
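Several of these criteria reduce to simple arithmetic once the raw counts are collected. The sketch below covers criteria 1, 4, and 7 only; the function name, parameter names, and sample figures are hypothetical, while the 10% threshold for the user's manual comes from the slide.

```python
from statistics import median


def usability_summary(learning_minutes, users_total, users_requesting_manual,
                      help_sessions, help_sessions_abandoned):
    """Summarize criteria 1, 4, and 7 from raw counts (all inputs are assumptions)."""
    if users_total <= 0 or help_sessions <= 0:
        raise ValueError("users_total and help_sessions must be positive")
    manual_pct = 100 * users_requesting_manual / users_total
    return {
        "median_minutes_to_learn": median(learning_minutes),
        "manual_request_pct": manual_pct,
        # More than 10% asking for the manual means NOT user-intuitive.
        "user_intuitive": manual_pct <= 10,
        "help_give_up_pct": 100 * help_sessions_abandoned / help_sessions,
    }
```

For example, with learning times of 12, 30, and 18 minutes, 30 of 200 users asking for the manual, and 5 of 50 help sessions abandoned, the summary reports a median of 18 minutes, a 15% manual-request rate (so not user-intuitive), and a 10% give-up rate.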


How do I audit Organizations with Big Data?

Shaku Atre’s Ten Golden Rules of Big Data Systems

I have prepared a list of questions, like those on the previous 3 slides, that you can ask your clients when auditing against the 10 Rules.

Please send an email to [email protected] requesting those slides, and I will email them to you.

Visit http://www.atre.com/forms/survey1.html and fill out a short form to get access to many of the 500+ columns I have written.

Visit the blog


How do I audit Organizations with Big Data?

Shaku Atre’s Ten Golden Rules of Big Data Systems

• Credit: The Executive Guide to Information Security: Threats, Challenges, and Solutions, by Tim Mather
• Credit: Cloud Security and Privacy: An Enterprise Perspective on Risks and Compliance (Theory in Practice), by Tim Mather
• Credit: http://www.infosecisland.com/blogview/19643-Data-at-Rest-Dormant-But-Dangerous.html by Simon Heron
• Credit: A white paper by Actuate: Requirements of an Enterprise Reporting Application Platform
• Credit: Securing Big Data, Cloud Security Alliance Congress 2011, Tim Mather, KPMG
• Credit: http://wiki.apache.org/incubator/AccumuloProposal,


About Shaku Atre

• President of Atre Group, Inc.
• Author of 6 books and close to a thousand articles as a columnist for Information Management, Computerworld, DM Review, BIReview, and other publications
• Co-author of “Business Intelligence Roadmap: The Complete Project Lifecycle for Decision Support Applications” (Addison-Wesley)
• Former partner with Price Waterhouse Coopers
• Former faculty member at IBM’s prestigious Systems Research Institute
• Keynote speaker and lecturer on business intelligence, data warehousing, and databases throughout the world
• Reach Shaku at [email protected]


How do I audit Organizations with Big Data?

Shaku Atre’s Ten Golden Rules of Big Data Systems

Big Data Questions:

The following questions were requested as a backup, in case there is no opportunity for live questions during the presentation:

1) Success stories of Big Data seem to depend heavily on long business experience on the Internet. What would you advise new starters?

2) How much time and effort should a new starter allow before obtaining results from Big Data collection?

3) Should security protection of Big Data be extended to prevent internal users from violating ethical rules while collecting information from outside?

Best Regards

