experience expertise // Elevate your understanding of key forensics concepts with guidance from BKD Forensics Institute.
Predictive Coding: Understanding the Wows & Weaknesses
Bryan Callahan, CPA, CFF, CFE, CVA Managing Consultant
Forensics & Valuation Services
Lanny Morrow, EnCE® Supervising Consultant
Forensics & Valuation Services
Definitions & Aliases
“Predictive coding is the electronic coding, organization, and prioritization of entire sets of electronically stored information (‘ESI’) according to their relation to discovery responsiveness, privilege, and designated issues before and during the legal discovery process. Lawyers control this process by specifying relevant criteria.” -- Forbes
“The analysis and identification of relevant ESI, assisted by artificial intelligence.”
The Reason
• Explosion of data – “Big Data”
• 95% of data never leaves the digital domain
• Considered “greatest eDiscovery challenge” by 2015
• Predictive coding has “greatest potential to improve eDiscovery” by 2015
What Is Unstructured Data?
Overview of the Process
1. Predictive coding tool selects random sample of documents (called “seed set”)
2. Senior attorney(s) review this set & assign each item as “relevant” or “not relevant”
3. Artificial Intelligence (AI) component performs “find more like this” search
4. Subsequently retrieved documents are again reviewed & assigned a category
5. Process continues until no more items are found to be relevant
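The five steps above can be sketched as a small loop. This is a toy illustration only, not any vendor's algorithm: the function names, the `seed_ids` override, the batch size, the similarity threshold and the bag-of-words cosine similarity used for "find more like this" are all assumptions made for the example.

```python
import math
import random
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (a toy stand-in for a real feature extractor)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predictive_review(documents, reviewer, seed_ids=None, seed_size=2,
                      batch_size=2, threshold=0.2):
    """documents: {doc_id: text}; reviewer: callable(text) -> bool (the attorney)."""
    # Step 1: random seed set (seed_ids is a hypothetical override for repeatability).
    if seed_ids is None:
        seed_ids = random.sample(sorted(documents), min(seed_size, len(documents)))
    # Step 2: the reviewer marks each seed document relevant or not relevant.
    labels = {doc_id: reviewer(documents[doc_id]) for doc_id in seed_ids}
    while True:
        # Step 3: "find more like this" against everything marked relevant so far.
        profile = Counter()
        for doc_id, relevant in labels.items():
            if relevant:
                profile.update(vectorize(documents[doc_id]))
        scored = sorted(
            ((cosine(profile, vectorize(text)), doc_id)
             for doc_id, text in documents.items() if doc_id not in labels),
            reverse=True,
        )
        batch = [doc_id for score, doc_id in scored[:batch_size] if score >= threshold]
        # Step 4: the retrieved documents are reviewed and categorized in turn.
        found_relevant = False
        for doc_id in batch:
            labels[doc_id] = reviewer(documents[doc_id])
            found_relevant = found_relevant or labels[doc_id]
        # Step 5: stop once a round turns up nothing relevant.
        if not found_relevant:
            return labels


documents = {
    1: "thank you for the gift car",
    2: "the gift car is in the driveway",
    3: "quarterly budget meeting agenda",
    4: "lunch menu friday",
    5: "car gift thank you driveway",
}
labels = predictive_review(documents, reviewer=lambda text: "gift" in text,
                           seed_ids=[1, 3])
```

In this toy run the reviewer never has to look at document 4, which is the kind of review reduction the process aims for.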
Under the Hood
• Seed set – Relevance is sometimes a moving target in a case; random sampling protects against bias
• AI learns from initial items marked “relevant”, then assigns probability to every single item
• Applies true “latent semantic analysis”
• AI compares attorney decision to its own & continuously “retrains” itself
• Better systems will have a “game” component
• Process is keyword independent
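A minimal sketch of the "assign a probability to every item, then retrain on each attorney decision" idea follows. It deliberately swaps in a small naive Bayes text classifier rather than the latent semantic analysis named above, because that fits in a few lines of standard-library Python. The class and method names are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    return re.findall(r"[a-z']+", text.lower())


class RelevanceModel:
    """Toy naive Bayes scorer (an assumption, not a vendor's actual model)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, relevant):
        """Learn from one attorney decision (relevant is True or False)."""
        self.doc_counts[relevant] += 1
        self.word_counts[relevant].update(tokens(text))

    def probability(self, text):
        """Estimated probability that the document is relevant."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = self.doc_counts[True] + self.doc_counts[False]
        log_score = {}
        for label in (True, False):
            # Laplace-smoothed prior and word likelihoods.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            denom = sum(self.word_counts[label].values()) + len(vocab) + 1
            for word in tokens(text):
                log_p += math.log((self.word_counts[label][word] + 1) / denom)
            log_score[label] = log_p
        # Normalize the two log scores into a probability.
        top = max(log_score.values())
        rel = math.exp(log_score[True] - top)
        not_rel = math.exp(log_score[False] - top)
        return rel / (rel + not_rel)

    def record_decision(self, text, attorney_says_relevant):
        """Compare the model's call to the attorney's, then retrain on it."""
        agreed = (self.probability(text) >= 0.5) == attorney_says_relevant
        self.train(text, attorney_says_relevant)
        return agreed


model = RelevanceModel()
model.train("thank you for the gift", True)
model.train("budget meeting agenda", False)
score = model.probability("gift for you")  # above 0.5 in this toy example
```

Tracking how often `record_decision` returns `False` over time gives a rough sense of whether the model is converging on the attorney's judgment.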
From: The Boss To: Everyone
I wanted to congratulate Lisa on her really outstanding presentation at the meeting! The update was easy to follow and was very well received by the group. Great job, Lisa! Kim also gave a nice presentation. We appreciate you both.
[Slide callouts: Lisa, Kim, nice]
Real Example
To: Briber
From: Employee taking bribe
“Thank you for the “little” token – I park it in the driveway so my neighbors can be jealous! It screams on the open road (wow...!)
PS-I promise it’s just between you and me!”
[Slide callouts: terms and concepts highlighted in the message: between you and me, promise, token, thank you, gift, park, driveway, road, vehicle, conspire, “!”, wow]
Benefits & Best Uses
• Forget the smoking gun – an 85 to 90% reduction in review is the primary benefit
• Every part of process is documented, repeatable & can be statistically validated
• Excellent in early case assessment & investigations for rapid identification of relevant materials
• Best use is in tandem with traditional keyword, concept & similarity searches already established
• Reduces inherent human error
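As one illustration of what "statistically validated" can mean in practice, the sketch below assumes a validation approach the slides do not specify: draw a random sample from the documents the tool set aside as not relevant, have an attorney review it, and put a confidence interval on the share of relevant documents that slipped through. The interval is the standard Wilson score interval; the function names and the figures in the usage lines are made up for the example.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def review_reduction(total_docs, docs_reviewed):
    """Fraction of the collection that never needed human review."""
    return 1 - docs_reviewed / total_docs


# Hypothetical figures: 5 relevant documents found in a random sample of 500
# drawn from the set-aside pile; 12,000 of 100,000 documents reviewed by hand.
low, high = wilson_interval(hits=5, n=500)
reduction = review_reduction(total_docs=100_000, docs_reviewed=12_000)
```

Because the sample is random and its size is recorded, the same calculation can be repeated and checked by the opposing party, which is the point of a documented, repeatable process.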
2011 RAND Study on Human Review
“Taken together, this body of research shows that groups of human reviewers exhibit significant inconsistency when examining the same set of documents for responsiveness under conditions similar to those in large-scale reviews … Human error in applying the criteria for inclusion appears to be the primary culprit [regarding the lack of accuracy], not a lack of clarity in the document’s meaning or ambiguity in how the scope of the production demand should be interpreted. In other words, people make mistakes, and based on the evidence, they make them regularly when it comes to judging relevancy or
Weaknesses, Myths & Concerns
• Not “man vs. machine”—it’s more about “augmented intelligence” than “artificial intelligence”
• This is not new technology, just really improved
• Vendors jockeying for market position, so lots of “proprietary flavors” of technology
• Still under consideration by courts (when both parties are not in agreement)
• AI is still far from perfect—proceed with caution
Da Silva Moore Decision
Da Silva Moore decision in Southern District of New York – appropriateness of predictive coding
1. Parties’ agreement
2. Vast amount of ESI to be reviewed
3. Superiority of computer-assisted review to available alternatives
4. Need for cost effectiveness & proportionality
5. Transparent process proposed by Defendant
Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (SDNY 2012)
Current Stance by Courts
• July 2012 – Nat’l Day Laborer Org. Network v. United States ICE Agency, 877 F. Supp. 2d 87, 109 (SDNY)
• December 2012 – Robocast, Inc. v. Apple, Inc., No. 11-235 (D. Del.)
• January 2013 – EORHB, Inc. v. HOA Holdings, Inc., No. 7409-VCL (Del. Ch. Ct.)
• March 2013 – Chevron Corp. v. Donziger, No. 11 Civ. 0691, 2013 U.S. Dist. LEXIS 36353 (SDNY) and Harris v. Subcontracting Concepts, LLC, No. 1:12-MC-82, 2013 U.S. Dist. LEXIS 33593 (SDNY)
The Future of Predictive Coding
“It’s hard to predict things, especially the future” – Yogi Berra
• Rise of “evidentiary expert”?
• Can attorneys cooperate enough?
• It’s not perfect; is that OK?
BKD Forensics Institute Upcoming Schedule
• Valuation Strategies for 2013 & Beyond: Tax & Business Succession Planning After the Fiscal Cliff
o Wednesday, May 1, 2013
o 11:30 a.m. – 12:30 p.m. CST
o Presented by Teal Dakan, CPA & Carol Lewis, CPA, ABV
• Whistle-Blowers & Fraud Hotlines: Reporting & Preventing Fraud
o Wednesday, May 8, 2013
o 11:30 a.m. – 12:30 p.m. CST
o Presented by Julia Swafford, CPA, CFE
Starbucks Gift Card & Certificate
• To help promote the event as an associate training tool, we are offering a bonus
o Associates who attend all three sessions will receive a Starbucks gift card & Certificate of Participation from BKD
Slides & Webinar Archive
• Today’s presentation slides are available at bkd.com/fi. A recorded copy of the webinar will be available at the same location at the conclusion of the webinar series
• If you have any questions, please contact Dane Ryals