(1)

Demystifying “Big Data” Analytics

Practical approaches to business intelligence and forensic analytics


(2)

Discussion topics

► Big Data & Big Data Analytics

► Current fraud risks - industry research

► Components of an effective anti-fraud analytics program

► Advanced email analytics to detect rogue employee behavior

► Forensic analytics technology framework – beyond rules-based tests

(3)
(4)

The big question is “What is Big Data?”

Big Data characteristics

► Big data represents data sets that can no longer be easily managed or analyzed with traditional or common management tools, methods, and infrastructure.

High Volume

(5)

Big Data has arrived

In this decade, the digital universe will grow 44x from 0.9 zettabytes to 35.2 zettabytes.

Video, Mobile, Sensors, Social Media, Electronic Payments, Video Surveillance, Video Rendering, Medical Imaging, Facebook, PayPal, Smart grids, Geophysical exploration, Gene Sequencing

(6)

Enterprises are looking to leverage Big Data to…

Better understand their customers' needs and further their business growth.

► eBay needs insights into what the customer wants in order to improve customer experience and increase traffic on the site.

► Yahoo adopted Big Data to connect what users are looking for with what advertisers are trying to sell to them.

Optimize business decisions through data-driven insights.

► The auto industry has been able to use GPS systems to gather information on customer driving habits to then improve their products.

Make money by selling insights from Big Data.

(7)

What is Big Data analytics?

Transform data to information, information to insight and insight to intelligence

Analysis is based on a large population of transactions instead of a sample

Process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information

Act of transforming data with the aim of extracting useful information and facilitating the achievement of factual conclusions

“Extracting the nuggets of gold hidden under mountains of data”

(8)

The Analytics Value Chain: Manage DATA

[Figure: value chain of Manage DATA, Perform ANALYTICS and Drive DECISIONS, with the labels “relevant data”, “rules/algorithms” and “insights”]

► The primary goal in managing data should be to cut through the “Big Data” hype to leverage the relevant data needed to drive better business decisions

             Big Data                            “Not So” Big Data
Volume       Terabytes/Petabytes                 Megabytes/Gigabytes
Variety      Unstructured (text, voice, video)   Structured / Relational

(9)

The Analytics Value Chain: Perform Analytics

[Figure: value chain of Manage DATA, Perform ANALYTICS and Drive DECISIONS, with the labels “relevant data”, “rules/algorithms” and “insights”]

► Clients must be adept across a continuum of analytical techniques

[Figure: techniques plotted against Mathematical Complexity, with the labels “Business Intelligence” and “Advanced Analytics”]

Predictive Analytics

Leverage past data to understand the underlying relationship between data inputs and outputs to understand WHY something happened or to predict WHAT will happen in the future across various scenarios

Prescriptive Analytics

To determine WHICH decision and/or action will produce the most effective result against a specific set of objectives and constraints

Descriptive Analytics

Mine past data to report, visualize, and understand WHAT has already happened – after the fact or in real-time

(10)

The Analytics Value Chain: Drive Decisions

[Figure: value chain of Manage DATA, Perform ANALYTICS and Drive DECISIONS, with the labels “relevant data”, “rules/algorithms” and “insights”]

► Last year, IBM surveyed 4,500 clients who said that they use analytics to drive decisions in three primary areas:

Customer - to grow revenue and provide personalized services / products

Operations - to reduce costs and improve service reliability

Finance

(11)
(12)
(13)

How is fraud detected?

50.3% by tip or accident

Source: ACFE 2010 Report to the Nations On Occupational Fraud

(14)
(15)

2012 Ernst & Young Global Fraud Survey

39% of respondents say that bribery & corruption practices occur frequently in their countries

15% of CFOs surveyed said they would be willing to make cash payments to win business

20% of CFOs surveyed said that they are willing to make personal gifts to win business

(16)

Components of an effective anti-fraud & corruption compliance program

Elements of a successful corporate anti-fraud, bribery and corruption program

Setting the Proper Tone

Management Ownership and Involvement

Proactive / Reactive

► Code of Ethics
► Fraud and Corruption Prevention Policies
► Communication and Training
► Risk Assessment
► Controls Monitoring and Analytics
► Incident Response Plan

Anti-fraud, bribery and corruption key activities

► Corporate compliance assessment
► Corporate compliance design
► Gap analysis
► Future state design session
► Who owns fraud?
► Assign roles and responsibilities
► Fraud and risk committee formulation
► Customized training
► Corporate governance
► Design sessions
► Fraud risk assessment
► Targeted anti-fraud analytics
► Anti-bribery and corruption analytics
► M&A Due Diligence
► 3rd Party Due Diligence
► Vendor Risk profiling
► Vendor Vetting
► Investigations
► Fraud response planning
► Forensic data analytics
► Discovery and document review
► Discovery response

(17)

Start with the Fraud Tree

New tools and methodologies are required for monitoring corruption schemes

Fraud tree

Corruption: Traditional focus of legal and compliance. Increased use of audit resources.
► Conflicts of interest
► Bribery and corruption/FCPA
► Illegal gratuities
► Bid-rigging/procurement

Fraudulent statements: Traditional focus of external audit.
► Revenue recognition
► Non financial
► GAAP
► Reserves

Asset misappropriation: Traditional focus of internal audit.
► Cash larceny
► Theft of other assets – inventory/AR/fixed assets
► Fake

(18)

Forensic analytics maturity model

Beyond traditional “rules-based queries” – consider all four quadrants

[Figure: four quadrants, Structured Data and Unstructured Data against Detection Rate (Low to High)]

Structured Data, lower detection rate: “Traditional” Rules-Based Queries & Analytics (Matching, Grouping, Ordering, Joining, Filtering)

Structured Data, higher detection rate: Statistical-Based Analysis (Anomaly Detection, Clustering, Risk Ranking)

Unstructured Data, lower detection rate: Keyword Search

Unstructured Data, higher detection rate: Data visualization, Drill-down into data, Text Mining

(19)

Fraud risk is in all datasets

► When considering enterprise risk, all sources of data should be addressed

► Gartner study shows that 80% of enterprise data is unstructured in nature

► Most internal audit procedures focus on the 20% structured data

Few organizations have the methodologies or technologies to efficiently address unstructured data

[Figure: Structured Data (20%): CRM, Databases, Accounting Systems. Unstructured Data (80%): Text, Graphics, Email, Presentations & Spreadsheets]

(20)
(21)

Every second

02 New blogs created
05 New broadband subscribers
05 Babies are born
09 PCs sold
50 Mobile phones sold
1.4 Million spam e-mails
>01 New domains registered
34,000 Google searches
48 Minutes of video are uploaded to YouTube
60 People visit an online dating site
200,000 Text messages are sent
06 People log on for the first time

(22)

Message Frequency

Analyze communication over time to identify gaps in data set

Understand message counts across the population (View email and/or instant messaging frequency and consistency relative to the population.)

When communications occur

Communication spikes around key business events
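As a rough illustration of the message-frequency view, the sketch below counts messages per custodian per week and lists the weeks with no traffic, which may point to gaps in the data set. The `messages` records and all function names are hypothetical, not taken from any tool named in the deck.

```python
from collections import Counter
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())

def weekly_counts(messages):
    """Count messages per (custodian, week start)."""
    return Counter((custodian, week_start(sent)) for custodian, sent in messages)

def missing_weeks(counts, custodian, first: date, last: date):
    """List week starts between first and last with no messages for custodian."""
    gaps = []
    wk = week_start(first)
    while wk <= week_start(last):
        if counts[(custodian, wk)] == 0:
            gaps.append(wk)
        wk += timedelta(weeks=1)
    return gaps

# Hypothetical sample data: (custodian, date sent)
messages = [
    ("A", date(2012, 1, 2)),
    ("A", date(2012, 1, 4)),
    ("A", date(2012, 1, 24)),
    ("B", date(2012, 1, 10)),
]
counts = weekly_counts(messages)
gaps = missing_weeks(counts, "A", date(2012, 1, 2), date(2012, 1, 24))
```

The same weekly counts could be compared across custodians to spot spikes around key business events.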

(23)

Keyword Search Summary

Analyze keyword hits by term, custodian, and date

Analyze effectiveness of keywords. Understand the effect of keyword hits by custodian and timeframe to prioritize review and analyze keyword hits.
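A minimal sketch of a keyword hit summary by term, custodian, and date, assuming simple case-insensitive substring matching. The sample documents, the keyword list, and the function names are hypothetical.

```python
from collections import Counter

def keyword_hits(documents, keywords):
    """Count hits per (term, custodian, date) with case-insensitive substring matching."""
    hits = Counter()
    for custodian, day, text in documents:
        lowered = text.lower()
        for term in keywords:
            n = lowered.count(term.lower())
            if n:
                hits[(term, custodian, day)] += n
    return hits

def hits_by_term(hits):
    """Collapse the detailed counts to one total per term."""
    totals = Counter()
    for (term, _custodian, _day), n in hits.items():
        totals[term] += n
    return totals

# Hypothetical sample data: (custodian, date, message text)
documents = [
    ("A", "2012-03-01", "Please book it as a special payment this quarter."),
    ("A", "2012-03-02", "The special payment went out; call it a consulting fee."),
    ("B", "2012-03-02", "Lunch on Friday?"),
]
keywords = ["special payment", "consulting fee", "donation"]

hits = keyword_hits(documents, keywords)
totals = hits_by_term(hits)
# Terms with no hits at all are candidates for removal from the keyword list.
unused = [term for term in keywords if totals[term] == 0]
```

Totals per term give a first read on keyword effectiveness, while the detailed counts support prioritizing review by custodian and timeframe.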

(24)

Link analysis

Who is talking to who?

The first 48 hours: Live server log files pulled in quickly for early case assessments

Understanding a complex organization’s true organization chart: Identification of relationships, versus activities, amongst actors

Triage of custodians and communications for traditional review and additional
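One simple way to approximate "who is talking to whom" is to count messages per pair of people and then count each person's distinct counterparts. This is only a sketch with hypothetical sample data and function names; it is not the link-analysis tooling the slide refers to.

```python
from collections import Counter, defaultdict

def build_links(messages):
    """Count messages for each undirected (person, person) pair."""
    links = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:
                links[tuple(sorted((sender, recipient)))] += 1
    return links

def degree(links):
    """Number of distinct counterparts per person."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {person: len(others) for person, others in neighbours.items()}

# Hypothetical sample data: (sender, list of recipients)
messages = [
    ("ann", ["bob", "cy"]),
    ("bob", ["ann"]),
    ("cy", ["ann", "dee"]),
]
links = build_links(messages)
deg = degree(links)
```

People with many counterparts, or pairs with unusually heavy traffic, are natural starting points when triaging custodians.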

(25)

Fraud Triangle analytics

Applying the theory to electronic communications

► Over 3,000 fraudulent terms/phrases

(26)
(27)

Emotional Tone Analysis

(28)

Combine analytics to surface issues

Unstructured communications data… is organized and “risk scored” for analysis and remediation.

[Figure: Email & TXT Messages; Voice Mail & Instant Messages; Transactions; Analysis Platform]

(29)

Custodian Risk Ranking

Scored by custodian and time period based on multiple criteria

1. Behavioral Keywords: Percentage of EY-ACFE opportunity-focused behavioral term hits for that week in ESI sent or received by the custodian in focus. Scaling: 3

2. Behavioral Keywords: Percentage of EY-ACFE rationalization-focused behavioral term hits for that week in ESI sent or received by the custodian in focus. Scaling: 3

3. Behavioral Keywords: Percentage of EY-ACFE incentive-pressure-focused behavioral term hits for that week in ESI sent or received by the custodian in focus. Scaling: 4

4. User Activity: Percentage of instances within that week, where custodian sends or receives ESI involving those outside of peer group, as identified through hierarchies. Scaling: 2

5. User Activity: Percentage of instances within that week, where custodian sends or receives ESI involving those outside of superiors, as identified through hierarchies. Scaling: 2

6. Alias Clustering: Percentage of instances within that week, where custodian sends or receives ESI involving at least one (1) of their identified communicative aliases. Scaling: 3

7. Emotive Tone: Percentage of instances within that week, where the custodian sends or receives ESI Scaling: 5

Custodian    C1  C2  C3  C4  C5  C6  C7  Score
Scaling      3   3   4   2   2   3   5
A, Week 1    1   3   3   4   6   2   3   45
A, Week 2    2   2   4   5   3   4   2   37
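The slide does not state how the seven criteria are combined into a score. The sketch below assumes a plain weighted sum using the scaling factors listed above, and is not expected to reproduce the Score column in the table. The input values, custodian labels, and function names are hypothetical.

```python
# Scaling factors for criteria 1..7, as listed on the slide.
SCALING = [3, 3, 4, 2, 2, 3, 5]

def weighted_score(criteria, scaling=SCALING):
    """Assumed combination rule: sum of each criterion value times its scaling factor."""
    if len(criteria) != len(scaling):
        raise ValueError("expected one value per criterion")
    return sum(c * s for c, s in zip(criteria, scaling))

def rank(rows):
    """Sort (label, criteria) rows from highest to lowest assumed score."""
    return sorted(rows, key=lambda row: weighted_score(row[1]), reverse=True)

# Hypothetical weekly percentages for two custodians
rows = [
    ("Custodian Y, Week 1", [0, 0, 0, 5, 5, 0, 1]),
    ("Custodian X, Week 1", [10, 0, 5, 20, 10, 0, 2]),
]
ranked = rank(rows)
score_x = weighted_score([10, 0, 5, 20, 10, 0, 2])
score_y = weighted_score([0, 0, 0, 5, 5, 0, 1])
```

Whatever the actual formula, the idea is the same: each custodian and time period gets one number, so the riskiest combinations can be reviewed first.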

(30)

Rogue employee analytics

(31)
(32)

Integrated sampling approach

Identifying a sample set of 50 payments for review, from 1.4 million in scope.

Raw ERP system tables

Data model design

EY Payment Sample Selection Tool

► X riskiest payments, for detailed manual review
► 0.5X randomly selected payments, for contextual review
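The split between X riskiest payments and 0.5X randomly selected payments can be sketched as follows. This assumes every payment already carries a numeric risk score and that the random payments are drawn from those not already selected. The field names and the function are hypothetical, not the EY tool itself.

```python
import random

def select_sample(payments, x, seed=0):
    """Return (riskiest, random_extra): the x highest-scored payments plus
    x // 2 payments (0.5X, rounded down) drawn at random from the remainder."""
    ordered = sorted(payments, key=lambda p: p["risk_score"], reverse=True)
    riskiest, remainder = ordered[:x], ordered[x:]
    k = min(x // 2, len(remainder))
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    return riskiest, rng.sample(remainder, k)

# Hypothetical payments with made-up risk scores
payments = [{"id": i, "risk_score": (i * 37) % 101} for i in range(1000)]
riskiest, extra = select_sample(payments, x=20)
```

Fixing the seed is a design choice that keeps the random portion of the sample reproducible for later review.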

Structured Attribute Analysis

► Payments related to sensitive field changes, by riskiest users
► Payments related to sensitive field changes, by riskiest fields
► Duplicative invoices
► Invoice completeness
► Round payment amounts

Clustering

► Payments to high amount, low frequency vendors
► Payments to low amount, high frequency vendors
► Benford’s Law deviants
► Payments to statistically anomalous vendors
► Fuzzy matching between employees and vendors

Text Mining and Analysis

► Identification and extraction of entities:
► Geographies
► Proper nouns
► Addresses
► Telephone numbers
► Top concept extraction
► Identification of hits against EY-ACFE bribery and corruption
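As an illustration of the Benford’s Law test named above, the sketch below compares the observed share of each leading digit with the expected share log10(1 + 1/d). The sample amounts and function names are hypothetical, and the deck does not say which digits or thresholds its own test uses.

```python
import math
from collections import Counter

def first_digit(amount):
    """Leading non-zero digit of a non-zero amount."""
    a = abs(amount)
    if a == 0:
        raise ValueError("zero has no leading digit")
    while a >= 10:
        a /= 10
    while a < 1:
        a *= 10
    return int(a)

def benford_expected(d):
    """Expected share of leading digit d under Benford's Law."""
    return math.log10(1 + 1 / d)

def benford_deviation(amounts):
    """Observed minus expected share for each leading digit 1..9."""
    digits = [first_digit(a) for a in amounts if a]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts[d] / n - benford_expected(d) for d in range(1, 10)}

# Hypothetical payment amounts
amounts = [100, 120, 150, 180, 200, 250, 300, 400, 500, 900]
deviation = benford_deviation(amounts)
```

Digits with large positive deviations are over-represented relative to Benford’s Law. A real test would need far more than ten payments before such deviations mean anything.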

(33)

Transaction Risk Scoring

Analyzing multiple tests for each transaction

Filter by selected analytics

Review breaches on targeted analytics
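A minimal sketch of scoring each transaction against multiple tests, with the option to filter by selected analytics. The three tests, the field names, and the sample transactions are hypothetical stand-ins for whatever rules a real program would define.

```python
# Hypothetical tests; each returns True when the transaction breaches it.
TESTS = {
    "round_amount": lambda t: t["amount"] % 1000 == 0,
    "blank_description": lambda t: not t["description"].strip(),
    "weekend_posting": lambda t: t["weekday"] >= 5,  # 5 = Saturday, 6 = Sunday
}

def breaches(transaction, selected=None):
    """Return the names of breached tests, optionally limited to selected tests."""
    names = selected if selected is not None else TESTS
    return [name for name in names if TESTS[name](transaction)]

# Hypothetical transactions
transactions = [
    {"id": 1, "amount": 5000, "description": "", "weekday": 6},
    {"id": 2, "amount": 1234.56, "description": "Freight", "weekday": 2},
    {"id": 3, "amount": 2000, "description": "Consulting fee", "weekday": 1},
]

# Score = number of tests breached; review the highest scores first.
scores = {t["id"]: len(breaches(t)) for t in transactions}
ranked_ids = sorted(scores, key=scores.get, reverse=True)
round_only = breaches(transactions[0], selected=["round_amount"])
```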

(34)

Beyond “rules-based” tests

Integrate statistical, visual and text mining techniques to identify patterns of high risk or rogue employee activities.

(35)

Focus on the payment text descriptions

Nobody calls it “bribe expense”

What if you saw these terms used as justification for payments to third parties?

<blank description>

Pay on behalf of

Government fee

Donation

Special payment

One time payment

Special commission

Team building expense

Friend fee

Consulting fee

Goodwill payment

Volume contract incentive

Incentive payment

Commission to the customer
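The terms above can be turned into a simple screen over payment descriptions. The term list comes from the slide, but the matching rule (case-insensitive substring match after collapsing whitespace) and the function name are assumptions.

```python
# Terms taken from the slide; the matching rule itself is an assumption.
RISK_TERMS = [
    "pay on behalf of", "government fee", "donation", "special payment",
    "one time payment", "special commission", "team building expense",
    "friend fee", "consulting fee", "goodwill payment",
    "volume contract incentive", "incentive payment",
    "commission to the customer",
]

def flag_description(description):
    """Return the matched terms, or ['<blank description>'] when the text is empty."""
    text = " ".join((description or "").lower().split())
    if not text:
        return ["<blank description>"]
    return [term for term in RISK_TERMS if term in text]

blank = flag_description("   ")
goodwill = flag_description("Goodwill  payment to agent")
clean = flag_description("Office supplies")
```

Substring matching will also produce false positives, so the flags are better treated as review candidates than as findings.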

(36)
(37)

Interactive dashboard: Expense review interface

(who, what, where, why, how and how much?)

(38)
(39)

Search-around functionality

Rapidly build out networks of interest and tie in multiple data sources

Easily find entities, documents, events, etc. which are directly related to your selection

(40)

A more human way to look at data

Data points are represented as objects, with logical relationships

Graphical representation of relationships between seemingly discrete entities

Epicenters of activity become immediately

(41)
(42)

Ernst & Young resources

► The guide to investigating business fraud

► Corruption or compliance – weighing the costs, the Ernst & Young 10th global fraud survey

► “Best practices for a global FCPA program,” Compliance Week

► “Keep it clean: the role of policies and training in compliance with anti-corruption laws,” Supply Chain Quarterly

(43)

Ernst & Young resources

► “Staying ahead of corruption liabilities,” ACG Mergers & Acquisitions

► Detecting financial statement fraud – Ernst & Young white paper

► “Demonstrating the effectiveness of your compliance program,” Compliance Week

► “Acquisitions in emerging markets: know the risks and how to address them,” Financier Worldwide

► “Accounting for Words,” Internal Auditor

► “Breaking the Status Quo in E-Mail Review”,

(44)

Components of an effective corruption compliance program

(45)

Contacts

James Walton
Senior Manager
Advisory Services – Enterprise Intelligence
Dallas, TX
(214) 969-0777

Dave Rogers
Senior Manager
Fraud Investigation & Dispute Services
Dallas, TX
(214) 969-8037

(46)

Ernst & Young

Assurance | Tax | Transactions | Advisory

About Ernst & Young

Ernst & Young is a global leader in assurance, tax, transaction and advisory services. Worldwide, our 152,000 people are united by our shared values and an unwavering commitment to quality. We make a difference by helping our people, our clients and our wider communities achieve their potential.

Ernst & Young refers to the global organization of member firms of Ernst & Young Global Limited, each of which is a separate legal entity. Ernst & Young Global Limited, a UK company limited by guarantee, does not provide services to clients. For more information about our organization, please visit www.ey.com.

Ernst & Young LLP is a client-serving member firm of Ernst & Young Global Limited operating in the US. © 2012 Ernst & Young LLP.

All Rights Reserved. 1206-1369289

This publication contains information in summary form and is therefore intended for general guidance only. It is not intended to be a substitute for detailed research or the exercise of professional judgment. Neither Ernst & Young LLP nor any other member of the global Ernst & Young organization can accept any responsibility for loss occasioned to any person acting or refraining from action as a result of any material in this publication. On any specific matter, reference should be made to the appropriate advisor.
