Demystifying “Big Data” Analytics
Practical approaches to business intelligence and forensic analytics
Discussion topics
► Big Data & Big Data Analytics
► Current fraud risks - industry research
► Components of an effective anti-fraud analytics program
► Advanced email analytics to detect rogue employee behavior
► Forensic analytics technology framework – beyond rules-based tests
The big question is “What is Big Data?”
Big Data characteristics
► Big data represents data sets that can no longer be easily managed or analyzed with traditional or common management tools, methods, and infrastructure.
High
Volume
Big Data has arrived
In this decade, the digital universe will grow 44x, from 0.9 zettabytes to 35.2 zettabytes.
Data sources shown: Video, Mobile, Sensors, Social Media, Electronic Payments, Video Surveillance, Video Rendering, Medical Imaging, Smart grids, Geophysical exploration, Gene Sequencing, Facebook, PayPal
Enterprises are looking to leverage Big Data to…
► Better understand their customers' needs and further their business growth.
► eBay needs insights into what the customer wants in order to improve customer experience and increase traffic on the site
► Yahoo adopted Big Data to connect what users are looking for with what advertisers are trying to sell to them.
► Optimize business decisions through data-driven insights.
► The auto industry has been able to use GPS systems to gather information on customer driving habits to then improve their products.
► Make money by selling insights from Big Data.
What is Big Data analytics?
Transform data to information, information to insight and insight to intelligence
► Analysis is based on a large population of transactions instead of a sample
► Process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information
► Act of transforming data with the aim of extracting useful information and facilitating the achievement of factual conclusions
“Extracting the nuggets of gold hidden under mountains of data”
The Analytics Value Chain:
Manage DATA
Value chain diagram: Manage DATA, Perform ANALYTICS, Drive DECISIONS (labels: relevant data, rules/algorithms, insights)
► Primary goal with managing data should be to cut through the “Big Data” hype to leverage the relevant data needed to drive better business decisions
            Big Data                             “Not So” Big Data
Volume      Terabytes/Petabytes                  Megabytes/Gigabytes
Variety     Unstructured (text, voice, video)    Structured / Relational
The Analytics Value Chain:
Perform Analytics
► Clients must be adept across a continuum of analytical techniques
Predictive Analytics
Leverage past data to understand the underlying relationship between data inputs and outputs, to explain WHY something happened or to predict WHAT will happen in the future across various scenarios
Prescriptive Analytics
Determine WHICH decision and/or action will produce the most effective result against a specific set of objectives and constraints
Descriptive Analytics
Mine past data to report, visualize, and understand WHAT has already happened, after the fact or in real time
Figure labels: Mathematical Complexity; Business Intelligence; Advanced Analytics
The Analytics Value Chain:
Drive Decisions
► Last year, IBM surveyed 4,500 clients who said that they use
analytics to drive decisions in three primary areas:
► Customer - to grow revenue and provide personalized services / products
► Operations - to reduce costs and improve service reliability
► Finance
How is fraud detected?
50.3% by tip or accident
Source: ACFE 2010 Report to the Nations On Occupational Fraud
2012 Ernst & Young Global Fraud Survey
• 39% of respondents say that bribery & corruption practices occur frequently in their countries
• 15% of CFOs surveyed said they would be willing to make cash payments to win business
• 20% of CFOs surveyed said that they are willing to make personal gifts to win business
Components of an effective anti-fraud & corruption compliance program
Setting the Proper Tone
Management Ownership and Involvement
Program components (diagram spans Proactive and Reactive):
► Code of Ethics
► Fraud and Corruption Prevention Policies
► Communication and Training
► Risk Assessment
► Controls Monitoring and Analytics
► Incident Response Plan
Elements of a successful corporate anti-fraud, bribery and corruption program
Anti-fraud, bribery and corruption key activities
► Corporate compliance assessment
► Corporate compliance design
► Gap analysis
► Future state design session
► Who owns fraud?
► Assign roles and responsibilities
► Fraud and risk committee formulation
► Customized training
► Corporate governance
► Fraud risk assessment
► Targeted anti-fraud analytics
► Anti-bribery and corruption analytics
► M&A due diligence
► 3rd party due diligence
► Vendor risk profiling
► Vendor vetting
► Investigations
► Fraud response planning
► Forensic data analytics
► Discovery and document review
► Discovery response
► Design sessions
Start with the Fraud Tree
New tools and methodologies are required for monitoring corruption schemes
Fraud tree
► Corruption: traditional focus of legal and compliance; increased use of audit resources.
Conflicts of interest; Bribery and corruption / FCPA; Illegal gratuities; Bid-rigging / procurement
► Fraudulent statements: traditional focus of external audit.
Revenue recognition; GAAP; Reserves; Non-financial
► Asset misappropriation: traditional focus of internal audit.
Cash larceny; Theft of other assets – inventory / AR / fixed assets; Fake
Forensic analytics maturity model
Beyond traditional “rules-based queries” – consider all four quadrants
Quadrant chart axes: data type (Structured Data, Unstructured Data) and Detection Rate (Low to High)
► Structured data, lower detection rate: “Traditional” rules-based queries & analytics (Matching, Grouping, Ordering, Joining, Filtering)
► Structured data, higher detection rate: Statistical-based analysis (Anomaly Detection, Clustering, Risk Ranking)
► Unstructured data, lower detection rate: Keyword Search
► Unstructured data, higher detection rate: Data visualization, Drill-down into data, Text Mining
Fraud risk is in all datasets
► When considering enterprise risk, all sources of data should be addressed
► Gartner study shows that 80% of enterprise data is unstructured in nature
► Most internal audit procedures focus on the 20% structured data
► Few organizations have the methodologies or technologies to efficiently address unstructured data
Structured Data (20%): CRM, Databases, Accounting Systems
Unstructured Data (80%): Text, Graphics, Email, Presentations & Spreadsheets
Every second
► 2 new blogs created
► 5 new broadband subscribers
► 5 babies are born
► 9 PCs sold
► 50 mobile phones sold
► 1.4 million spam e-mails
► >1 new domains registered
► 34,000 Google searches
► 48 minutes of video are uploaded to YouTube
► 60 people visit an online dating site
► 200,000 text messages are sent
► 6 people log on for the first time
Message Frequency
Analyze communication over time to identify gaps in data set
Understand message counts across the population (View email and/or instant messaging frequency and consistency relative to the population.)
► When communications occur
► Communication spikes around key business events
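A minimal sketch of the message-frequency idea, assuming each message has already been reduced to a (custodian, date) pair. The function names and the sample data are hypothetical and are not taken from any EY tool. It counts messages per custodian per ISO week and lists days with no traffic, which is one simple way to spot gaps in a collected data set.

```python
from collections import Counter
from datetime import date, timedelta

def weekly_counts(messages):
    """Count messages per (custodian, ISO year, ISO week)."""
    counts = Counter()
    for custodian, sent_on in messages:
        iso = sent_on.isocalendar()
        counts[(custodian, iso[0], iso[1])] += 1
    return counts

def missing_days(messages, custodian, start, end):
    """Return the dates in [start, end] on which the custodian has no messages."""
    seen = {d for c, d in messages if c == custodian}
    span = (end - start).days + 1
    all_days = (start + timedelta(days=n) for n in range(span))
    return [d for d in all_days if d not in seen]

# Hypothetical sample data: (custodian, date sent)
messages = [
    ("alice", date(2012, 3, 5)),
    ("alice", date(2012, 3, 6)),
    ("alice", date(2012, 3, 9)),
    ("bob", date(2012, 3, 5)),
]

counts = weekly_counts(messages)
gaps = missing_days(messages, "alice", date(2012, 3, 5), date(2012, 3, 9))
```

Comparing one custodian's weekly counts with the population average would then show spikes around key business events; that comparison is left out to keep the sketch short.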
Keyword Search Summary
Analyze keyword hits by term, custodian, and date
Analyze effectiveness of keywords. Understand the effect of keyword hits by custodian and timeframe to prioritize review.
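As an illustration only, keyword hits can be tallied by term, custodian, and period with a few lines of Python. The record layout (custodian, month, text), the terms, and the function name are assumptions made for this sketch.

```python
import re
from collections import Counter

def keyword_hits(messages, terms):
    """Count case-insensitive whole-phrase hits by (term, custodian, month)."""
    patterns = {
        t: re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE) for t in terms
    }
    hits = Counter()
    for custodian, month, text in messages:
        for term, pattern in patterns.items():
            n = len(pattern.findall(text))
            if n:
                hits[(term, custodian, month)] += n
    return hits

# Hypothetical sample data
messages = [
    ("alice", "2012-03", "Please book it as a special payment."),
    ("bob", "2012-03", "Special payment approved; keep it off the books."),
    ("bob", "2012-04", "Nothing to report."),
]
terms = ["special payment", "off the books"]

hits = keyword_hits(messages, terms)
hits_per_term = Counter()
for (term, _custodian, _month), n in hits.items():
    hits_per_term[term] += n
```

A term that hits on nearly every message, or on none, is a weak discriminator, which is the kind of keyword-effectiveness question the summary is meant to answer.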
Link analysis
Who is talking to whom?
► The first 48 hours: Live server log files pulled in quickly for early case assessments
► Understanding a complex organization’s true organization chart: Identification of relationships, versus activities, amongst actors
► Triage of custodians and communications for traditional review and additional
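A rough sketch of link analysis over message headers, assuming each message is available as a sender plus a list of recipients. All names and data here are hypothetical. It counts messages per sender-recipient pair and derives each person's set of contacts, which is the raw material for a "true organization chart".

```python
from collections import Counter, defaultdict

def build_links(messages):
    """Count messages per directed (sender, recipient) pair."""
    links = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            links[(sender, recipient)] += 1
    return links

def contacts(links):
    """Map each person to the set of people they exchanged messages with."""
    peers = defaultdict(set)
    for sender, recipient in links:
        peers[sender].add(recipient)
        peers[recipient].add(sender)
    return peers

# Hypothetical sample data: (sender, [recipients])
messages = [
    ("ann", ["bob", "cho"]),
    ("bob", ["ann"]),
    ("ann", ["bob"]),
    ("dev", ["ann"]),
]

links = build_links(messages)
peers = contacts(links)
# Rank custodians by number of distinct contacts for triage.
ranked = sorted(peers, key=lambda p: len(peers[p]), reverse=True)
```

A graph library would add layout and centrality measures; plain dictionaries are used here so the example has no dependencies.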
Fraud Triangle analytics
► Applying the theory to electronic communications
► Over 3,000 fraudulent terms/phrases
Emotional Tone Analysis
Combine analytics to surface issues
Unstructured communications data… is organized and “risk scored” for analysis and remediation.
Diagram labels: Email & TXT Messages; Voice Mail & Instant Messages; Transactions; Analysis Platform
Custodian Risk Ranking
Scored by custodian and time period based on multiple criteria
1. Behavioral Keywords: Percentage of EY-ACFE opportunity-focused behavioral term hits for that week in ESI sent or received by the custodian in focus. Scaling: 3
2. Behavioral Keywords: Percentage of EY-ACFE rationalization-focused behavioral term hits for that week in ESI sent or received by the custodian in focus. Scaling: 3
3. Behavioral Keywords: Percentage of EY-ACFE incentive-pressure-focused behavioral term hits for that week in ESI sent or received by the custodian in focus. Scaling: 4
4. User Activity: Percentage of instances within that week where custodian sends or receives ESI involving those outside of peer group, as identified through hierarchies. Scaling: 2
5. User Activity: Percentage of instances within that week where custodian sends or receives ESI involving those outside of superiors, as identified through hierarchies. Scaling: 2
6. Alias Clustering: Percentage of instances within that week where custodian sends or receives ESI involving at least one (1) of their identified communicative aliases. Scaling: 3
7. Emotive Tone: Percentage of instances within that week where the custodian sends or receives ESI. Scaling: 5

Custodian    C1  C2  C3  C4  C5  C6  C7  Score
Scaling       3   3   4   2   2   3   5
A, Week 1     1   3   3   4   6   2   3     45
A, Week 2     2   2   4   5   3   4   2     37
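The slide does not state how the seven criteria are combined into the Score column. Purely as an illustration, the sketch below assumes a plain weighted sum of the criterion values and the Scaling factors listed above. That assumption does not reproduce the slide's figures: it gives 65 and 66 for Weeks 1 and 2, not 45 and 37, so the real method presumably normalizes or combines the criteria differently.

```python
# Scaling factors as listed for criteria 1-7 on the slide.
SCALING = {"C1": 3, "C2": 3, "C3": 4, "C4": 2, "C5": 2, "C6": 3, "C7": 5}

def weighted_score(values, scaling=SCALING):
    """ASSUMPTION: score = sum(value * scaling). Not confirmed by the source."""
    return sum(values[c] * scaling[c] for c in scaling)

# Criterion values for custodian A as shown in the sample table.
week1 = dict(zip(SCALING, [1, 3, 3, 4, 6, 2, 3]))
week2 = dict(zip(SCALING, [2, 2, 4, 5, 3, 4, 2]))

scores = {"A, Week 1": weighted_score(week1), "A, Week 2": weighted_score(week2)}
```

Whatever the exact formula, the design point is the same: each custodian-week gets one number built from several weighted signals, so reviewers can rank custodians and time periods rather than read everything.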
Rogue employee analytics
Integrated sampling approach
Identifying a sample set of 50 payments for review, from 1.4 million in scope.
Flow: Raw ERP system tables; Data model design; EY Payment Sample Selection Tool
Output:
► X riskiest payments, for detailed manual review
► 0.5X randomly selected payments, for contextual review
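The split between X riskiest and 0.5X randomly selected payments can be sketched as follows. This is not the EY Payment Sample Selection Tool; the `risk` field, the function name, the rounding of 0.5X down to a whole number, and the synthetic data are all assumptions for illustration.

```python
import random

def select_sample(payments, x, seed=0):
    """Pick the x highest-risk payments plus x // 2 random ones from the rest."""
    ranked = sorted(payments, key=lambda p: p["risk"], reverse=True)
    riskiest, rest = ranked[:x], ranked[x:]
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    random_pick = rng.sample(rest, min(x // 2, len(rest)))
    return riskiest, random_pick

# Hypothetical population with a precomputed risk score per payment.
payments = [{"id": i, "risk": (i * 37) % 101} for i in range(1000)]

riskiest, random_pick = select_sample(payments, x=20)
```

Drawing the random portion only from payments outside the risk-ranked set keeps the two groups disjoint, and the random group gives reviewers context on what ordinary payments look like.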
Structured Attribute Analysis
► Payments related to sensitive field changes, by riskiest users
► Payments related to sensitive field changes, by riskiest fields
► Duplicative invoices
► Invoice completeness
► Round payment amounts
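Two of the structured tests above, duplicative invoices and round payment amounts, are simple enough to sketch directly. The field names, the matching key (vendor, normalized invoice number, amount), and the 1,000 rounding threshold are assumptions chosen for the example, not rules from the source.

```python
from collections import defaultdict
from decimal import Decimal

def duplicate_invoices(payments):
    """Group payment ids that share vendor, normalized invoice number and amount."""
    groups = defaultdict(list)
    for p in payments:
        key = (p["vendor"], p["invoice"].strip().upper(), p["amount"])
        groups[key].append(p["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

def round_amounts(payments, multiple=Decimal("1000")):
    """Return ids of payments whose amount is an exact multiple of `multiple`."""
    return [p["id"] for p in payments if p["amount"] % multiple == 0]

# Hypothetical sample data
payments = [
    {"id": 1, "vendor": "V1", "invoice": "INV-001", "amount": Decimal("1250.40")},
    {"id": 2, "vendor": "V1", "invoice": "inv-001 ", "amount": Decimal("1250.40")},
    {"id": 3, "vendor": "V2", "invoice": "A17", "amount": Decimal("5000.00")},
    {"id": 4, "vendor": "V3", "invoice": "B22", "amount": Decimal("731.15")},
]

dupes = duplicate_invoices(payments)
rounds = round_amounts(payments)
```

`Decimal` is used instead of `float` so that amounts compare exactly.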
Clustering
► Payments to high amount, low frequency vendors
► Payments to low amount, high frequency vendors
► Benford’s Law deviants
► Payments to statistically anomalous vendors
► Fuzzy matching between employees and vendors
Text Mining and Analysis
► Identification and extraction of entities:
► Geographies
► Proper nouns
► Addresses
► Telephone numbers
► Top concept extraction
► Identification of hits against EY-ACFE bribery and corruption
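For the "Benford's Law deviants" test listed under Clustering, a minimal first-digit sketch is shown below. Benford's Law predicts that a leading digit d appears with probability log10(1 + 1/d). The function names and the ten sample amounts are hypothetical, and a sample this small proves nothing; a real test would run over a large payment population and apply a statistical measure such as chi-square.

```python
import math
from collections import Counter

def first_digit(amount):
    """Leading non-zero digit of an amount of at least 0.01 (via whole cents)."""
    cents = round(abs(amount) * 100)
    return int(str(cents)[0])

def benford_deviation(amounts):
    """Return {digit: observed share - Benford expected share} for digits 1-9."""
    digits = [first_digit(a) for a in amounts if round(abs(a) * 100) > 0]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts[d] / n - math.log10(1 + 1 / d) for d in range(1, 10)}

# Hypothetical sample data
amounts = [120.50, 1340.00, 19.99, 230.10, 2750.00, 310.00, 45.20, 512.00, 98.00, 1001.00]

deviation = benford_deviation(amounts)
```

Digits with a large positive deviation (for example, many amounts starting just under an approval limit) are the ones worth drilling into by vendor or by user.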
Transaction Risk Scoring
Analyzing multiple tests for each transaction
Filter by selected analytics
Review breaches on targeted analytics
Beyond “rules-based” tests
Integrate statistical, visual and text mining techniques to identify patterns of high risk or rogue employee activities.
Focus on the payment text descriptions
What if you saw these terms used as justification for payments to third parties?
Nobody calls it “bribe expense”
<blank description>
Pay on behalf of
Government fee
Donation
Special payment
One time payment
Special commission
Team building expense
Friend fee
Consulting fee
Goodwill payment
Volume contract incentive
Incentive payment
Commission to the customer
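A simple way to act on the list above is to screen the free-text description of each payment. The sketch below uses the terms from the slide and plain case-insensitive substring matching; the function name is hypothetical, and substring matching is a deliberate simplification of real text mining.

```python
# Terms taken from the slide; "<blank description>" is handled separately.
RISK_TERMS = [
    "pay on behalf of", "government fee", "donation", "special payment",
    "one time payment", "special commission", "team building expense",
    "friend fee", "consulting fee", "goodwill payment",
    "volume contract incentive", "incentive payment",
    "commission to the customer",
]

def flag_description(description, terms=RISK_TERMS):
    """Return the risk terms found in a payment description."""
    text = (description or "").strip().lower()
    if not text:
        return ["<blank description>"]
    return [t for t in terms if t in text]

blank = flag_description("")
multi = flag_description("One time payment - goodwill payment to agent")
clean = flag_description("Office supplies")
```

Each hit would normally feed a per-transaction risk score alongside the structured tests rather than serve as a finding on its own.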
Interactive dashboard: Expense review interface
(who, what, where, why, how and how much?)
Search-around functionality
Rapidly build out networks of interest and tie in multiple data sources
Easily find entities, documents, events, etc., which are directly related to your selection
A more human way to look at data
Data points are represented as objects, with logical relationships
Graphical representation of relationships between seemingly discrete entities
Epicenters of activity become immediately
Ernst & Young resources
► The guide to investigating business fraud ► Corruption or compliance — weighing the
costs, the Ernst & Young 10th global fraud survey
► “Best practices for a global FCPA program,”
Compliance Week
► “Keep it clean: the role of policies and
training in compliance with anti-corruption
laws,” Supply Chain Quarterly
► “Staying ahead of corruption liabilities,” ACG
Mergers & Acquisitions
► Detecting financial statement fraud – Ernst &
Young white paper
► “Demonstrating the effectiveness of your
compliance program,” Compliance Week
► “Acquisitions in emerging markets: know the
risks and how to address them,” Financier
Worldwide
► “Accounting for Words,” Internal Auditor
► “Breaking the Status Quo in E-Mail Review”,
Components of an effective corruption
compliance program
Contacts
James Walton
Senior Manager
Advisory Services – Enterprise Intelligence
Dallas, TX
(214) 969-0777
Dave Rogers
Senior Manager
Fraud Investigation & Dispute Services
Dallas, TX
(214) 969-8037
Ernst & Young
Assurance | Tax | Transactions | Advisory About Ernst & Young
Ernst & Young is a global leader in assurance, tax, transaction and advisory services. Worldwide, our 152,000 people are united by our shared values and an unwavering commitment to quality. We make a difference by helping our people, our clients and our wider communities achieve their potential.
Ernst & Young refers to the global organization of member firms of Ernst & Young Global Limited, each of which is a separate legal entity. Ernst & Young Global Limited, a UK company limited by guarantee, does not provide services to clients. For more information about our organization, please visit www.ey.com.
Ernst & Young LLP is a client-serving member firm of Ernst & Young Global Limited operating in the US. © 2012 Ernst & Young LLP.
All Rights Reserved. 1206-1369289
This publication contains information in summary form and is therefore intended for general guidance only. It is not intended to be a substitute for detailed research or the exercise of professional judgment. Neither Ernst & Young LLP nor any other member of the global Ernst & Young organization can accept any responsibility for loss occasioned to any person acting or refraining from action as a result of any material in this publication. On any specific matter, reference should be made to the appropriate advisor.