Dynamo Q&A with Expert Witness Jim Scarazzo
On September 17, Judge Buch of the U.S. Tax Court issued an important ruling in favor of predictive coding in Dynamo Holdings Limited Partnership, Dynamo GP, Inc., tax matters partner, et al. v. Commissioner of Internal Revenue. This is the Tax Court’s first-ever opinion on how to conduct e-discovery. FTI’s own Jim Scarazzo served as an expert witness in court and was cited several times in the ruling. Here is a short Q&A with Jim about the case and his involvement.
Q: What’s the key takeaway of this ruling and why is it garnering so much attention?
A: The Dynamo ruling is important for a few key reasons. First, this is the first time that the U.S. Tax Court has ruled in favor of using predictive coding. The judge had a clear choice between what we would consider a traditional e-discovery process – broad collection and then linear review (with potential application of keywords) – and using predictive coding to cull the data set to the most potentially relevant information. Second, it’s one of a small group of court rulings that have come out decisively in favor of predictive coding use.
Q: Can you quickly summarize the arguments made on each side?
A: In July 2013, the government filed a motion to compel production of Electronically Stored Information (ESI) from Dynamo’s archives. The hearing concerned two backup tapes. Each tape encompassed about a month of data from numerous sources, including email boxes, calendars and system files. Dynamo’s legal team argued that transferring all of the data from these two backup tapes into a reviewable format, as well as processing, hosting and reviewing it for privilege, would take a lot of time and come at an exorbitant cost. In response, the government offered to take the tapes and do the e-discovery for Dynamo, but the company argued that handing over the tapes would essentially license a “fishing expedition.” Further, Dynamo argued that it couldn’t hand over the data because it contained sensitive employee data protected by regulations such as HIPAA. Dynamo then argued that predictive coding could be used to respond more efficiently to the government’s request. The government argued that predictive coding was “unproven technology,” which led to the hearing in March 2014.
Q: What was your role in this matter?
A: I was retained by Dynamo’s counsel, Gunster, in August 2013 to analyze and evaluate proposed backup tape restoration and review scenarios, to provide opinion(s) on those proposed processes and their estimated costs, and to compare and contrast procedures to assist the Court in determining the most reasonable approach to resolving a discovery issue relating to electronically stored information. Specifically, I was asked to compare a scenario in which data restored from backup tapes is searched using specific criteria with a scenario in which ESI is restored from tape and reviewed without predefined culling criteria. I interviewed the company’s IT department to understand its data retention policies and assess the type and volume of data on the backup tapes. I then developed one cost estimate for processing and hosting under the government’s broad method, and another for processing and hosting a data set to which defined minimization criteria (de-NISTing, keywords and so on) had been applied. In March 2014, a hearing was scheduled and I was asked to provide a report of my findings and submit it as direct testimony. I was also cross-examined by opposing counsel about the predictive coding process, precision and recall, statistical sampling and related topics. In the end, the September 17 ruling from Judge Buch includes several references to my testimony and to the fact that the government’s technical witness didn’t rebut any of my findings.
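For readers unfamiliar with the precision and recall measures mentioned in the answer above, here is a minimal sketch of how they are commonly computed from a manually reviewed sample. The function name and the sample counts are hypothetical and do not come from the Dynamo matter.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from counts in a reviewed sample.

    true_pos:  documents the tool flagged as relevant that reviewers agreed were relevant
    false_pos: documents the tool flagged as relevant that reviewers found not relevant
    false_neg: relevant documents the tool failed to flag
    """
    precision = true_pos / (true_pos + false_pos)  # share of flagged documents that are relevant
    recall = true_pos / (true_pos + false_neg)     # share of relevant documents that were flagged
    return precision, recall


# Hypothetical counts, for illustration only.
precision, recall = precision_recall(true_pos=80, false_pos=20, false_neg=40)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

High precision means reviewers spend little time on irrelevant documents; high recall means few relevant documents are missed.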
Q: So, predictive coding wasn’t something that was argued for at the very beginning? The arguments eventually evolved to include the use of predictive coding?
A: Correct. When I began my analysis of the tapes, I was comparing the costs of processing and hosting everything (the government’s position) versus the costs of processing and hosting data that had been culled using predefined minimization and search criteria. The written report that I submitted to the court in March as part of my testimony didn’t even include the term “predictive coding.”
Q: You included some cost estimates that were cited by the judge in the ruling. Can you explain those?
A: The minimization strategy would enable Dynamo and Gunster to limit the data set to an estimated 200,000 to 400,000 documents, with processing and hosting costs of $80,000 to $85,000. Using the government’s method, the data set could be as high as seven million documents and cost between $450,000 and $550,000 for processing and hosting. I should note that my report did not include estimated costs for reviewing the documents (normally the most expensive step in e-discovery) or for applying predictive coding, because the company had not yet determined how review would be handled at the time I submitted the initial report. Of course, with this ruling we expect the Dynamo legal team to use predictive coding and further reduce the cost of legal review.
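The gap between the two estimates can be worked out directly from the figures quoted in the answer above. The short sketch below does that arithmetic; the variable and function names are illustrative only, and the seven million figure is treated as an upper bound, as stated.

```python
# Figures quoted above (processing and hosting only; review costs excluded).
culled_docs = (200_000, 400_000)   # estimated documents after minimization
culled_cost = (80_000, 85_000)     # dollars
full_docs_max = 7_000_000          # upper bound under the government's method
full_cost = (450_000, 550_000)     # dollars


def savings_range(full, culled):
    """Smallest and largest saving implied by two (low, high) cost ranges."""
    return full[0] - culled[1], full[1] - culled[0]


def volume_reduction(culled, full_max):
    """Fraction of documents removed, at the high and low ends of the culled set."""
    return 1 - culled[1] / full_max, 1 - culled[0] / full_max


low_saving, high_saving = savings_range(full_cost, culled_cost)
low_cut, high_cut = volume_reduction(culled_docs, full_docs_max)
print(f"Estimated saving: ${low_saving:,} to ${high_saving:,}")
print(f"Document volume reduced by {low_cut:.1%} to {high_cut:.1%}")
```

On these figures, minimization saves roughly $365,000 to $470,000 in processing and hosting, and cuts the document count by about 94% to 97% relative to the seven million upper bound.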
Q: What advice would you have for other legal teams that want to use predictive coding?
A: It’s clear from the judge’s ruling that predictive coding is gaining greater acceptance. It’s also clear from the judge’s ruling that predictive coding is just one tool and method to help you respond to requests yet still safeguard the company’s data and minimize costs. Having a smart strategy ahead of time, minimizing the data set early in the process, arguing for what’s reasonable – all of these play an important role. I’m happy to talk with anyone about how they can implement this strategy on their next matter.
To contact Jim Scarazzo, please email him at Jim.Scarazzo@fticonsulting.com.
The views expressed herein are those of the author(s) and not necessarily the views of FTI Consulting, its management, its subsidiaries, its affiliates, or its other professionals.