That is how much an average company spends to collect, process, review and produce a gigabyte of materials relevant to a legal matter, according to RAND. But what happens to that data, and all of the intellectual capital invested in the documents, after the matter is over? What if another matter calls for collection or review of some or all of the same material?
“Let’s reuse this data where it makes sense,” is the answer for a growing number of corporations. These companies are able to reuse data through use of a multi-matter repository that stores collected documents, processed data, privilege calls, document coding, redactions and more for use in more than one matter. The result is a head start on discovery in ensuing matters and enormous work and cost savings downstream.
But deploying a multi-matter repository is not as easy as it sounds. This article will outline some key considerations for deploying a multi-matter repository as well as provide an overview of the benefits a repository may provide to corporations.
A good first step is to consider what types of information and work product are worth keeping for potential reuse. The following items are ripe for this:
- Collections: reuse avoids spending time and money, and bothering custodians, again and again.
- Processed data: reuse saves reprocessing costs, with the added bonus of facilitating early case assessment across the entire document universe.
- Review coding: can be especially useful going forward if fact patterns are similar; privilege calls and redactions are especially suitable because they normally do not depend as directly on matter-specific criteria.
- Entire productions: reuse may be possible in some cases.
Setting up an effective multi-matter repository to include any or all of the items above is not simple and there are many factors to consider. We have found that there is a “continuum of complexity” related to reusing each type of data. For example, not all matters require the same custodians or date ranges. Some documents may be responsive in some matters but not others, and it is even possible that different privilege rules apply. Different information may need to be redacted in different matters. Production format specifications may vary.
On the complex end of the continuum, things may get worse. There might be dozens of related matters, but they may not overlap completely or concurrently. At a given moment, some matters may be complete, some may have review and production in progress and some may be dormant or just gearing up. The constellation of matters can seem like a Venn diagram run amok.
Fortunately, the more complex the landscape, the more utility a multi-matter repository may offer. More matters improve the odds that work product can be reused. More documents increase the opportunity for savings on collecting, processing and imaging. The more complicated the document review, the more effort is saved each time the work is reused. As the repository grows over time, its contents become more useful for early case assessment, especially for new matters, where it is increasingly likely that pertinent data is already available for immediate searching and analysis. Legal teams thus gain an opportunity to assess potential issues before discovery even begins, providing a valuable head start over an adversary. The key to unlocking this value is to plan in advance and set up a repository that is scalable, flexible, fungible and standardized.
Scalability and Flexibility
The technical requirements for a repository quickly become important as the number of expected matters, and the complexity of the relationships between them, grows. The repository must be able to handle potentially many terabytes of data, stored securely and accessible on demand to multiple parties and teams. Plan for unforeseen growth: there will often be more ensuing matters or other sources of information than originally anticipated.
There must be enough computing power and security controls for many people to access the appropriate data at once, and enough speed to perform sophisticated searches, coding and exporting across multiple matters. The repository must also be robust enough to support early case assessment, including test searches and analytics across the whole corpus or matter-specific subsets.
One factor to consider is whether to have a single database or multiple linked databases that communicate with one another. Having many matters with different but partially overlapping criteria suggests multiple separate reviews, and there may be a limit to how many concurrent reviews one database can manage – for logistical if not for technical reasons. If the legal team decides to pursue a multiple database configuration, it is important to make sure that tasks like isolating and exporting review populations can be done in parallel.
The ability to standardize processes across a number of stages can help to reduce the overall cost and complexity. Here are some major factors to consider.
Can all the data be processed the same way? Up-front processing choices, such as the time zone in which email data is rendered, are small but can have consequential downstream effects.
Duplicates can be identified by custodian, by matter, or globally across many matters. Can all related cases use the same deduplication approach? If possible, set up the repository in a flexible way to offer custodial deduplication on some matters and global on others, while still being able to reuse work product. Email thread deduplication (suppressing earlier emails in a thread when a later inclusive email is available) adds another layer of complexity. Can all matters use thread deduplication? Do different matter sub-populations each need separate thread deduplication analysis?
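The three deduplication scopes described above can be illustrated with a minimal Python sketch. It assumes each document already carries a content hash assigned at processing time; the `Doc` record, the `dedupe` function and the sample data are hypothetical names chosen for illustration, not the behavior of any particular e-discovery tool. Here "custodian" scope is taken to mean one copy per custodian, which is itself an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    content_hash: str  # assumed to be assigned during processing
    custodian: str
    matter: str


def dedupe(docs, scope):
    """Keep the first document seen in each duplicate group.

    scope is "custodian", "matter" or "global" and decides which
    documents count as duplicates of one another.
    """
    key_fns = {
        "custodian": lambda d: (d.custodian, d.content_hash),
        "matter": lambda d: (d.matter, d.content_hash),
        "global": lambda d: (d.content_hash,),
    }
    key_fn = key_fns[scope]
    seen = set()
    kept = []
    for d in docs:
        key = key_fn(d)
        if key not in seen:
            seen.add(key)
            kept.append(d)
    return kept


# Four copies of the same content, spread over custodians and matters.
docs = [
    Doc("D1", "h1", "alice", "matter_a"),
    Doc("D2", "h1", "bob", "matter_a"),
    Doc("D3", "h1", "alice", "matter_b"),
    Doc("D4", "h1", "alice", "matter_a"),
]

for scope in ("custodian", "matter", "global"):
    print(scope, [d.doc_id for d in dedupe(docs, scope)])
```

Because the scope is only a grouping key, the same stored hashes can serve custodial deduplication on one matter and global deduplication on another, which is the flexibility the repository needs in order to keep work product reusable.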
Many corporations use a variety of e-discovery service providers and technologies. Minor differences in processed output can complicate efforts to combine data from different tools. Perhaps most important, software from different providers may generate incompatible hash values, making it difficult or impossible to reuse coding calls or other work product because duplicates across sets will not systematically match up.
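To see why hash values from different tools may not line up, consider the following Python sketch. The two hashing functions are purely hypothetical stand-ins for two providers' processing engines; real tools document their own field selections and normalization rules. The point is only that the same email yields different values when the hashed inputs differ, even with the same hash algorithm.

```python
import hashlib

# A single email, as two hypothetical tools might see it after extraction.
email = {
    "from": "alice@example.com",
    "to": "bob@example.com",
    "subject": "Q3 forecast",
    "sent": "2015-03-02T14:05:00Z",
    "body": "Numbers attached.\r\n",
}


def hash_tool_a(msg):
    # Hypothetical tool A: hashes sender, subject and raw body.
    payload = "|".join([msg["from"], msg["subject"], msg["body"]])
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def hash_tool_b(msg):
    # Hypothetical tool B: also includes recipient and sent time,
    # and strips surrounding whitespace from the body.
    payload = "|".join(
        [msg["from"], msg["to"], msg["sent"], msg["subject"], msg["body"].strip()]
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Each tool is consistent with itself, but the two never agree,
# so a duplicate found by one tool cannot be matched in the other's set.
print(hash_tool_a(email) == hash_tool_a(dict(email)))
print(hash_tool_a(email) == hash_tool_b(email))
```

Standardizing on one tool, or on one documented hashing recipe applied to all incoming data, is what allows coding decisions to follow a document from one data set to the next.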
Standardize on a single imaging tool, and on the settings for that tool, if possible. Rather than being imaged multiple times, data imaged once can feed a number of separate review databases or review streams in the same database. This is especially important because of its impact on redactions. To have any hope of reusing redactions across matters, and thereby saving a tremendous amount of manual effort often performed by more senior members of the legal team, the images must match exactly.
Adversaries, of course, can complicate all of the above. Opponents will have their own ideas about how discovery should be handled. For example, some may request that data be processed a certain way, may push back on (or request) predictive coding, or may expect data to be produced in a specific format. In a multi-matter environment, preparation for meet-and-confers is crucial: the team needs to know what must be negotiated, even items as simple as black-and-white versus color images, in order to retain the standardization necessary across matters with different adversaries.
Fungible Work Product
Once there is buy-in from legal teams on standardization of processing and hosting specifications, the team can map out which aspects of review coding should be “universal” and which will be matter-specific – from among privilege, “hot”, issue-related, and responsiveness determinations.
Legal teams must partner with their e-discovery providers to determine what to reuse – and the rules of reuse. First, collaborate to determine if there is enough overlap between matters and if the rules are similar enough to warrant establishing a coding reuse component for the multi-matter repository. Assuming there is, whatever can be reused, or is “fungible,” should be reused, but protocols typically ought to allow for different coding in different cases.
Inevitably, there will be disagreements and changes. For example, consider a privileged document that has been inadvertently produced and then clawed back. Should that document be clawed back from all matters—even those from the past—or only for active matters? The legal team will likely want to have the flexibility to permit intentional discrepancies to stand.
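One way to picture "universal" coding that still permits intentional, matter-specific discrepancies is a two-layer lookup, sketched below in Python. The `CodingStore` class, its method names, and the choice of which fields are universal are all assumptions made for illustration; an actual protocol would be defined by the legal team and its provider.

```python
# Fields assumed, for this sketch only, to be reusable across matters.
UNIVERSAL_FIELDS = {"privilege", "redactions"}


class CodingStore:
    """Universal coding per document, with optional per-matter overrides."""

    def __init__(self):
        self._universal = {}  # content_hash -> {field: value}
        self._overrides = {}  # (matter, content_hash) -> {field: value}

    def code(self, content_hash, field, value, matter=None):
        if matter is None:
            # Universal coding applies to every matter unless overridden.
            if field not in UNIVERSAL_FIELDS:
                raise ValueError(f"{field!r} must be coded per matter")
            self._universal.setdefault(content_hash, {})[field] = value
        else:
            # Matter-specific coding, including deliberate discrepancies.
            self._overrides.setdefault((matter, content_hash), {})[field] = value

    def lookup(self, matter, content_hash):
        merged = dict(self._universal.get(content_hash, {}))
        merged.update(self._overrides.get((matter, content_hash), {}))
        return merged


store = CodingStore()
store.code("h1", "privilege", "privileged")                     # reused everywhere
store.code("h1", "responsive", True, matter="matter_a")         # matter-specific call
store.code("h1", "privilege", "not privileged", matter="matter_b")  # intentional discrepancy

print(store.lookup("matter_a", "h1"))
print(store.lookup("matter_b", "h1"))
print(store.lookup("matter_c", "h1"))  # a new matter inherits the universal call
```

In this arrangement a change such as a clawback could be recorded either once at the universal layer or only for selected matters, leaving earlier, deliberate differences in place.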
The cost savings provided by a multi-matter repository can be enormous, and they compound with the number of documents, the number of aspects that can be reused, and the number of related cases involved. For example, we have clients that have saved millions of dollars by eliminating the need to re-collect, re-process and re-review data across matters with overlapping custodians.
Ultimately, the big question is: Do the likely savings outweigh the effort expended at the front end to establish standardization and set up a multi-matter repository? If there are many cases and/or a lot of data, even a conservative estimate of savings is likely sufficient to predict a return on investment. Moreover, beyond the monetary benefit tally, legal teams will see significant value from the added consistency and defensibility of reusing decisions across matters, and from the enhanced early case assessment opportunities afforded by a multi-matter repository.
If you have an appropriate litigation profile, consider joining the vanguard of corporations that are moving away from managing each e-discovery event as a one-off, and toward proactively planning for an e-discovery approach that leverages valuable work product to realize significant savings and strategic advantages.