E-Discovery in Mass Torts:

Predictive Coding—Friend or Foe?

Sherry A. Knutson

Sidley Austin

One S Dearborn St 32nd Fl Chicago, IL 60603 (312) 853-4710

Sherry A. Knutson is a partner in Sidley Austin LLP’s Chicago office. Ms. Knutson’s practice ranges from multi-plaintiff, multi-jurisdiction litigations to single plaintiff cases. Her mass tort experience includes complex multidistrict and related state litigation, class actions, and national coordination and defense of product liability and toxic tort cases, including Actos, Gadolinium-Based Contrast Agents, Yaz, Trasylol, Celebrex/Bextra, Baycol, and latex gloves. She also has served as lead counsel in individual lawsuits involving a variety of personal injury claims.

E-Discovery in Mass Torts: Predictive Coding—Friend or Foe? ■ Knutson ■ 407

I. What Is Predictive Coding?

II. The Relevant Case Law

III. Take-Away Points

Endnote

Production of electronically stored information (ESI) has added substantial time, cost, and complexity to discovery, particularly in cases with large volumes of ESI such as product liability mass torts. Under the traditional production model, keyword searches are run across custodians’ ESI to cull potentially responsive documents. A team of reviewers then manually reviews all culled documents to make responsiveness determinations. The party produces those documents that the reviewers deem responsive and non-privileged. This method is costly and subject to human error. As a result, parties have begun considering the use of predictive coding – a form of technology-assisted review – in an attempt to reduce the need for manual review of ESI.

I. What Is Predictive Coding?

Rather than conducting a keyword search and full manual review, predictive coding relies on “trained” software to help determine the relevance of ESI. Software is trained by first selecting a sample or control set of documents that experienced attorneys – either representing the producing party or sometimes both sides – analyze and code for relevance. The software uses this coding to “identif[y] properties of those documents that it [then] uses to code other documents.” Da Silva Moore v. Publicis Groupe SA, No. 11 Civ. 1279, 2012 WL 607412, at *2 (S.D.N.Y. Feb. 24, 2012). As the attorneys code more sample documents, the software builds an algorithm that begins to predict the reviewers’ coding. Id. The reviewers test and refine the algorithm in order to verify that the software retrieves responsive documents accurately. Eventually, the software’s algorithm and the reviewers’ coding will be sufficiently similar to permit the software to make subsequent responsiveness determinations. In general, attorneys will need to review a few thousand documents to properly train the software. Id.
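The training loop described above can be illustrated with a deliberately simple sketch. It is not how any particular predictive coding product works (those tools use their own, often proprietary, methods); it is a toy word-frequency model, and the function names, the sample documents, and the agreement measure are all assumptions made for illustration.

```python
import math
from collections import Counter


def train(coded_docs):
    """Build per-word weights from (text, is_relevant) pairs coded by attorneys."""
    relevant_words, other_words = Counter(), Counter()
    for text, is_relevant in coded_docs:
        counts = relevant_words if is_relevant else other_words
        counts.update(text.lower().split())
    vocab = set(relevant_words) | set(other_words)
    # Add-one smoothing so a word seen in only one class never yields log(0).
    relevant_total = sum(relevant_words.values()) + len(vocab)
    other_total = sum(other_words.values()) + len(vocab)
    # A positive weight means the word is relatively more common in relevant documents.
    return {
        word: math.log((relevant_words[word] + 1) / relevant_total)
        - math.log((other_words[word] + 1) / other_total)
        for word in vocab
    }


def predict(weights, text):
    """Predict relevance; words never seen in training contribute nothing."""
    return sum(weights.get(word, 0.0) for word in text.lower().split()) > 0


def agreement(weights, coded_docs):
    """Fraction of attorney-coded documents where the prediction matches the coding."""
    matches = sum(predict(weights, text) == coded for text, coded in coded_docs)
    return matches / len(coded_docs)


# Hypothetical seed set coded by attorneys (True = relevant).
seed_set = [
    ("patient reported bladder injury after taking drug", True),
    ("adverse event report bladder cancer", True),
    ("lunch menu for friday", False),
    ("parking garage closed friday", False),
]
weights = train(seed_set)

# Hypothetical check set: attorneys code these separately to test the model.
check_set = [("bladder injury report", True), ("friday lunch", False)]
print(agreement(weights, check_set))
```

In practice the reviewers would keep coding additional documents and retraining until the agreement rate on a held-out check set reached a level the parties consider acceptable.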

Once the software is trained, the algorithm is run across the entire document collection. Some predictive coding software simply provides a binary yes-no response as to the relevance of the documents. Other software provides a scaled relevance score for each document: for example, a relevance score of 0 through 100, with 0 being least likely to be relevant and 100 being most likely to be relevant. The parties can reach agreement on the cut-off relevance score that will be used to determine which documents need to be reviewed manually. For instance, the parties could agree that all documents with a relevance score of 80 to 100 will undergo manual review to make final responsiveness determinations for production.
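As a minimal sketch of the cut-off approach, the snippet below routes documents to manual review when their score meets an agreed threshold. The document identifiers, the scores, and the function name are hypothetical; the cut-off of 80 simply mirrors the example in the text.

```python
def select_for_manual_review(scores, cutoff=80):
    """Return IDs of documents scoring at or above the cutoff, highest score first."""
    selected = [(doc_id, score) for doc_id, score in scores.items() if score >= cutoff]
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in selected]


# Hypothetical relevance scores (0-100) assigned by the software.
scores = {"DOC-001": 97, "DOC-002": 42, "DOC-003": 80, "DOC-004": 79}
review_queue = select_for_manual_review(scores)
print(review_queue)
```

Lowering the cut-off sends more documents to manual review, which tends to capture more responsive documents at greater cost; raising it does the reverse.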

II. The Relevant Case Law

Only a few courts have addressed the issue of predictive coding. In Da Silva Moore v. Publicis Groupe SA, No. 11 Civ. 1279, 2012 WL 1446534 (S.D.N.Y. Apr. 26, 2012), Judge Andrew Carter affirmed Magistrate Judge Andrew Peck’s order approving use of predictive coding over plaintiffs’ objections. In his order, Magistrate Judge Peck concluded that “computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.” Da Silva Moore v. Publicis Groupe SA, No. 11 Civ. 1279, 2012 WL 607412, at *2 (S.D.N.Y. Feb. 24, 2012). According to Magistrate Judge Peck, “computer-assisted coding should be used in those cases where it will help ‘secure the just, speedy, and inexpensive’ determination of cases in our e-discovery world.” Id. (quoting Fed. R. Civ. P. 1). Judge Carter agreed, finding that Magistrate Judge Peck’s decision properly recognized the potential advantages and disadvantages of the predictive coding software to be used. Da Silva Moore, 2012 WL 1446534, at *2.

410 ■ Product Liability ■ April 2013

Magistrate Judge Peck, who had published an article on predictive coding four months earlier on the website Law Technology News, identified a number of factors that support the use of predictive coding software:

• Manual review is too expensive in most cases. Da Silva Moore, 2012 WL 607412, at *9.

• Statistics show that computerized searches are at least as accurate as manual review, if not more accurate. Id. at *9.

• Keyword searches alone are generally not sufficient. Id. at *3.

    ◦ Particularly at early stages, the requesting party may not have sufficient information about the terminology or abbreviations used by the custodian to suggest keywords that will capture the relevant documents. Id. at *10.

    ◦ Keyword searches frequently produce overinclusive results, with numerous irrelevant documents. Id.

    ◦ Keyword searches are often not very effective. For example, in a study that attempted to use keywords to retrieve at least 75 percent of relevant documents, searchers believed they had met this goal, but really only found 20 percent of the relevant documents. Id. at *9.

Magistrate Judge Peck also acknowledged the potential problems that might arise in discovery that involves predictive coding. He advised that the party seeking to use predictive coding should be prepared to show both the court and the opposing party (1) what the procedure entails, and (2) why that procedure produces defensible results, that is, that the procedure “produce[s] responsive documents with reasonably high recall and high precision.” Id. at *2. As used in Magistrate Judge Peck’s order, recall measures completeness; precision measures accuracy. Id. at *9. In other words, recall and precision together determine how effective the procedure is at finding all responsive documents, and only responsive documents.
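A short worked example may make the two measures concrete. It uses the standard information-retrieval definitions (recall is the share of all responsive documents that the procedure retrieved; precision is the share of retrieved documents that are actually responsive), which are assumed here to match the order's usage. The document identifiers and counts are invented for illustration.

```python
def precision_recall(retrieved, responsive):
    """Compute (precision, recall) from sets of document IDs."""
    retrieved, responsive = set(retrieved), set(responsive)
    true_hits = len(retrieved & responsive)
    # Precision: of what was retrieved, how much is responsive?
    precision = true_hits / len(retrieved) if retrieved else 0.0
    # Recall: of what is responsive, how much was retrieved?
    recall = true_hits / len(responsive) if responsive else 0.0
    return precision, recall


# Hypothetical collection: ten responsive documents exist (IDs 1-10).
responsive = set(range(1, 11))
# The procedure retrieved four documents, two of which are responsive.
retrieved = {1, 2, 50, 51}

precision, recall = precision_recall(retrieved, responsive)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here half of the retrieved documents are responsive, but only one in five responsive documents was found, so a procedure can look acceptably precise while still missing most of what it should produce.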

In short, the procedure must be sufficiently transparent to permit all parties to have reasonable confidence in the results. Confidence in the process can be instilled by:

• permitting the requesting party to participate, along with the producing party, in training the predictive coding software that is selected; or

• providing access to the control set of documents used to train the software, whether relevant or not (with privileged information redacted or withheld). Id. at *10-11.

The procedure also should be validated, preferably with statistical sampling (rather than random sampling), to provide a scientific basis for concluding that the predictive coding software is identifying relevant documents at a sufficiently high rate.¹
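One way such a validation might be quantified is sketched below. It is an assumption for illustration, not a method prescribed by the court: attorneys manually review a sample of documents, and the share found relevant is reported with a 95 percent margin of error using the normal approximation for a proportion. The estimate is only meaningful if the sample is representative of the set it is drawn from; the function name and the counts are hypothetical.

```python
import math


def estimate_rate(sample_size, relevant_in_sample, z=1.96):
    """Return (estimate, lower, upper) for the share of relevant documents.

    Uses the normal approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= relevant_in_sample <= sample_size:
        raise ValueError("relevant_in_sample must be between 0 and sample_size")
    p = relevant_in_sample / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical check: 1,500 documents sampled from those the software scored
# below the cut-off; attorneys find 15 of them relevant on manual review.
estimate, lower, upper = estimate_rate(1500, 15)
print(f"estimated share missed: {estimate:.1%} (about {lower:.1%} to {upper:.1%})")
```

A larger sample narrows the interval, which is one reason the sample size itself is often a point for the parties to negotiate.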

Only a few other courts have addressed predictive coding since the Da Silva Moore decision. In Global Aerospace Inc. v. Landow Aviation, L.P. dba Dulles Jet Center, Case No. CL 6140 et al. (Va. Cir. Ct., Loudoun Cty. Apr. 23, 2012), the court entered an order allowing defendants to proceed with the use of predictive coding for purposes of processing and producing ESI, over plaintiff’s objection. In National Day Laborer Organizing Network v. United States Immigration & Customs Enforcement Agency, 877 F. Supp. 2d 87, 111-12 (S.D.N.Y. 2012), the court indicated that it would allow the parties to agree to use predictive coding. In Kleen Prods. v. Packaging Corp. of America, No. [ ], 2012 WL 4498465 (N.D. Ill. Sept. 28, 2012), plaintiffs initially asked the court to order defendants to use predictive coding to re-do prior productions and make future productions, but later withdrew their request. Id. at *5. Finally, in the In re Actos (pioglitazone) Products Liability Litigation MDL, No. 6:11-md-2299, Judge Rebecca Doherty entered an order on July 27, 2012, that sets forth a protocol for the use of predictive coding for production. See Case Management Order: Protocol Relating to the Production of Electronically Stored Information (“ESI”), available at http://www.lawd.uscourts.gov/sites/default/files/UPLOADS/11-md-2299.esi.pdf (last accessed Jan. 23, 2013). Based on this order, three lawyers from each side collaboratively “trained” the software by jointly reviewing a control set of documents. Various procedures have been implemented subsequently to try to promote recall and precision in the software’s algorithm. As of the date of this article, the parties are still determining the viability of using predictive coding going forward.

* * *

The driving force for predictive coding is the cost of discovery. The goal for any review method is “to result in higher recall and higher precision than another review method, at a cost proportionate to the ‘value’ of the case.” Da Silva Moore, 2012 WL 607412, at *9. Proportionality is addressed in Federal Rule of Civil Procedure 26(b)(2)(C), which requires that the scope of discovery be appropriate to the litigation. Moreover, courts have an obligation to balance the utility against the costs of discovery. Id. at *12 (quoting U.S. ex rel. McBride v. Halliburton Co., 272 F.R.D. 235, 240 (D.D.C. 2011)). Whether predictive coding leads to production results that are more accurate and less costly than the traditional process, however, remains to be seen.

III. Take-Away Points

In order for predictive coding to be viable, the parties should consider agreeing on the following:

• number and identity of document custodians;

• source of electronic data to be reviewed and produced (e.g., current email, ESI from hard drive);

• quality control measures; and

• process transparency to demonstrate reliability of the results.

Ultimately, whether predictive coding is a “friend or foe” in product liability mass tort cases will depend upon how accurate the procedure is proven to be in these types of cases, and whether it can be shown to save time and money.

Endnote

¹ Magistrate Judge Peck ruled that Daubert does not apply to the results of computer-assisted review, because Daubert
