E-Discovery Update: Precision, Accuracy, and Relevance

Searching for anything is an inherently imprecise process. After all, if you know an object’s location, there’s no reason to search for it – you simply go to where it is. On the other hand, when you’re not sure where something is located, a systematic search process narrows an object’s possible locations until it is finally found.

Searching for relevant documents in litigation fact discovery is particularly imprecise. Documents may be critically important even though they don’t directly discuss the subject matter in dispute. Documents also gain and lose relevance as a legal matter progresses and legal arguments are added or dropped. In many ways, it’s an amazing testament to human intelligence that legal teams are able to identify critical documents in the voluminous discovery materials typically exchanged.

Volume, especially, is a key variable when developing methodologies for locating discovery documents. Decades of photocopying and electronically creating and distributing documents have created information repositories whose sheer volume makes it difficult, if not impossible, to find relevant documents using traditional human-based search and review strategies. Instead, in this environment of information overload, litigants have increasingly come to rely on computerized search queries, rather than free-form document review, to identify potentially relevant documents.

Substituting queries for broad human intuition has positive and negative consequences. On the plus side, unlike human reviewers who fatigue over time, computerized search queries serve up fast and consistent results across an electronic document collection. No one disputes that search engines are a much better way to find all mentions of a given name or word within digital text than human review. On the other hand, search queries, even those executed in cutting-edge search engines, seek items matching only the specific criteria that were entered. And, as a consequence, the human being conducting the search may not recognize the limitations of a search query if it turns up some documents that apparently meet relevance criteria. Sometimes, such oversight can be corrected, but other times, the damage may be permanent—and significant.

The recent case of Victor Stanley, Inc. v. Creative Pipe, Inc. (MJG-06-2662, D. Md. May 29, 2008) is only the latest in a long line of cases that demonstrate the limitations of computerized search technology when it is carelessly used. The defendants (the producing party in this case) used searches to screen a body of electronically stored discovery materials for potentially privileged documents. According to the opinion, the defendants, working together with their outside counsel, devised a set of seventy search terms designed to identify potentially privileged documents. The opinion notes that the producing party did not provide the court with the search criteria it used, but the criteria evidently were not particularly comprehensive: approximately 165 privileged electronic documents (in a production of slightly more than 8,000 searchable electronic documents) were produced to the requesting party, which quickly discovered them and reported its findings to the producing party and to the court.

As noted by the court, these inadvertently produced documents included direct communications between the client and its outside counsel, communication between the client and its litigation-specific forensic expert, draft discovery responses, and documents relating to settlements in unrelated legal actions—in short, obviously privileged documents that should have been caught by even basic filtering. Clearly, something went very wrong.

In motion practice testing whether the producing party had waived privilege as to these documents, (replacement) counsel for the producing party argued that because a systematic process had been used to identify privileged documents, inadvertently produced privileged materials should retain their privileged status. However, in light of the obvious privilege red flags raised by many of the 165 documents at issue and the earlier lack of interest shown by the producing party’s (former) counsel in having the court approve a “claw-back” agreement for privileged documents, the court found that the producing party had not adequately shown that its review process was reasonable enough to protect legal privilege. Accordingly, the requesting party was entitled to use the documents.

Litigants relying on search technology to shape a discovery document review and production obviously hope to avoid the dramatic consequences of a botched search strategy. A few simple questions can help legal teams evaluate their strategies and identify potential legal weaknesses in their approaches.

1. What Are The Consequences Of Overlooking Documents?

Legal teams must accept that every search strategy – whether manual page-flipping by attorneys or cutting-edge search engines – will exclude some relevant materials. Period. However, even if perfect document identification is not possible, attorneys should still consider the consequences of missing relevant documents to understand how close to perfection they need to come. For example, overlooking potentially relevant documents when a litigation hold is put in place could lead to inadvertent spoliation of evidence that fatally taints all subsequent discovery. Clearly, it’s important to put significant effort into minimizing this risk. Conversely, overlooking documents that have already been loaded into a review / production platform carries lesser consequences, as these documents can easily be identified at a later point and made available to the requesting party in a supplemental production.

Measuring the probability of catastrophic failures is necessarily fact-specific. Overlooking privileged documents might well fall into the first, “vitally important” category, but the consequences of such an occurrence can be mitigated by claw-back agreements between the litigants or even by substantive law in the relevant jurisdiction. Rules of thumb (e.g., privileged documents are important) are not always reliable, and they should be validated rather than blindly accepted.

Sometimes it may not be easy to estimate the consequences of overlooking relevant documents. An alternative strategy, as suggested by Judge Grimm in the Victor Stanley case, is to try to quantify the number of relevant documents that were not captured by the search strategy. Statistical sampling, for example, can be used to test the documents that fall outside the target queries to see whether additional relevant documents can be found. Depending on the results, the legal team should have higher, or lower, confidence in the methodology that it chose. The Victor Stanley opinion suggests that such additional measures, even if ultimately incorrect, add significantly to the defensibility of a litigant’s position.
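
As a rough illustration of the sampling idea, the sketch below draws a random sample from the documents the queries did not retrieve, counts how many a reviewer marks relevant, and turns that count into an estimated range for the number of relevant documents left behind. It is only a sketch under stated assumptions: the function names, the is_relevant reviewer callback, and the choice of a Wilson score interval at roughly 95% confidence are illustrative choices, not anything prescribed by the Victor Stanley opinion.

```python
import math
import random


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def estimate_missed(unretrieved_ids, is_relevant, sample_size, seed=0):
    """Sample documents the queries did NOT return and estimate how many
    relevant documents remain among them.

    unretrieved_ids: list of ids for documents outside the query results.
    is_relevant: hypothetical callback standing in for a human reviewer's
        relevance call on one document id.
    """
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(unretrieved_ids, sample_size)
    hits = sum(1 for doc_id in sample if is_relevant(doc_id))
    low, high = wilson_interval(hits, sample_size)
    total = len(unretrieved_ids)
    return {
        "sample_size": sample_size,
        "relevant_in_sample": hits,
        "rate": hits / sample_size,
        "rate_interval": (low, high),
        # Scale the rate interval up to the whole unretrieved population.
        "estimated_missed_range": (low * total, high * total),
    }
```

A team might run this against the unretrieved set with a sample of a few hundred documents and record the seed, the sample, and the reviewer's calls, since that record is part of what makes the exercise defensible. A wide or high estimated range is a signal to revisit the search terms before producing.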

2. Am I Using The Right Search Technology?

Different search strategies use different tools. At one extreme, a completely human-based subjective review can be conducted with technology no more sophisticated than paper and photocopying. At the other end of the spectrum, some litigants have experimented with document reviews conducted entirely by computers running sophisticated linguistic analysis algorithms with no human review at all. Most projects, obviously, fall between these two extremes. All that said, legal teams should consider how well their search technology is matched to the needs of their matter. For example, how long does it take to retrieve a known document using specific search criteria? Depending on the size of the document collection, some search engines may take several minutes to find a document, while others return the same results in seconds.

More subtly, does the chosen search technology locate the information that matters most? Not all search engines can index all native file formats. If critical ESI is stored in an uncommon data format, it’s worth double-checking to make sure your technology can read and index these materials. In addition, some computerized search and review solutions do not automatically index all ESI metadata, which may be an issue when trying to find all documents created, edited, or accessed by a specific individual. Finally, some search engines index only a fixed number of words in each document. As a consequence, the later portions of long documents may fall outside the index limit and be excluded from search results.
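
The effect of a per-document index limit can be seen in a toy example. The sketch below builds a very simple word index twice, once over full documents and once capped at the first max_words words, and reports which documents each search term would silently lose under the cap. The function names and the max_words parameter are hypothetical and do not describe any particular product; real engines tokenize and index in far more sophisticated ways.

```python
import re
from collections import defaultdict


def build_index(documents, max_words=None):
    """Map each lowercased word to the set of document ids containing it.

    max_words, if set, mimics an engine that indexes only the first N
    words of each document (a hypothetical limit for illustration).
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        words = re.findall(r"[a-z0-9']+", text.lower())
        if max_words is not None:
            words = words[:max_words]
        for word in words:
            index[word].add(doc_id)
    return index


def search(index, term):
    """Return the ids of documents whose indexed text contains the term."""
    return index.get(term.lower(), set())


def terms_lost_to_truncation(documents, terms, max_words):
    """For each term, list documents found by a full index but missed
    when only the first max_words words are indexed."""
    full = build_index(documents)
    capped = build_index(documents, max_words)
    return {t: search(full, t) - search(capped, t) for t in terms}
```

Running a comparison like this on a handful of known long documents, using whatever limit the vendor documents for its engine, is one inexpensive way to check whether important terms near the end of a document are actually searchable.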

3. Who Can Defend My Search Strategy?

In Victor Stanley, the Court found it significant that the producing party provided little evidence to support the defensibility of its privilege search terms. Not only did the litigant fail to disclose its search terms, even for in camera review, so that the Court could analyze them directly, but it also submitted no outside support for its search strategy, such as expert testimony. Citing a number of cases, but in particular United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008), the Court strongly criticized trial counsel’s lack of external validation:

While keyword searches have long been recognized as appropriate and helpful for ESI search and retrieval, there are well-known limitations and risks associated with them, and proper selection and implementation obviously involves technical, if not scientific knowledge. . . . (“[D]etermining whether a particular search methodology, such as keywords, will or will not be effective certainly requires knowledge beyond the ken of a lay person (and a lay lawyer) . . . .”) (citing Equity Analytics, LLC v. Lundin, 248 F.R.D. 331, 333 (D.D.C. 2008)).

Victor Stanley, ___ at ____, ____.

Though some cynics may interpret this language and line of cases as overt marketing for e-discovery consultant services, a better reading is that it reflects judicial acknowledgement that testing the adequacy of search queries usually involves nuanced analysis far beyond the simple facts and concepts of which a court can take judicial notice. Thus, litigants should be prepared to present appropriate educational, explanatory, and other supporting evidence to demonstrate their competence in choosing and executing electronic search strategies, even in situations where searches were negotiated through the meet and confer process. Expert witnesses are one obvious way that legal teams can defend the reasonableness of their actions without revealing significant amounts of the attorney analytical process, though experts are not the only possible source of validation. Learned treatises and reported e-discovery case law may also be used to show reasonableness, so long as clear parallels can be drawn. Such materials, once rare, are increasingly common, making it easier to find helpful fact patterns and legal analysis.


There’s no question that electronic documents can be most effectively managed with electronic search and review tools—and that legal teams should be using some type of computerized search technology to leverage the knowledge of team members. However, it’s also increasingly likely that opposing counsel will notice when these tools have been used incorrectly or inadequately. Lawyers should understand that the bar is being raised with respect to how technology can and should be applied to manage discovery and discovery documents, and they should also understand that they may need strongly persuasive evidence, including outside expert validation, to demonstrate the reasonableness of their e-discovery decisions when they have been challenged.
