E-Discovery Update: Producing Spreadsheets in Discovery 2008

In 2004, at least a decade ago by e-discovery standards, I wrote an article about the difficulty of producing electronic spreadsheets in litigation discovery. At that time, I concluded, “Until new technologies are developed that can easily present a read-only, easily verified versions of native format files, production of spreadsheets will remain an imperfect process.” “Producing Microsoft® Excel Spreadsheets in Discovery,” LawSolutions (Winter 2004).

It’s now 2008. Many legal teams, perhaps even a majority of them, now use online repositories to host discovery document reviews instead of law firm-hosted tools (and infrastructure) that cannot economically scale to adequately support terabytes upon terabytes of potentially relevant electronically stored information (“ESI”). Next-generation search tools, whether online or installed at a law firm, permit the rapid and cost-effective identification of duplicate and “near-duplicate” documents and can cluster documents based on user-identified degrees of similarity. And, most importantly, amendments to the Federal Rules of Civil Procedure that took effect in December 2006 now require litigants to discuss production formats for electronic documents, with an explicit goal of reducing the squabbling that has characterized most productions of digital information.

But electronic spreadsheets? They’re still a problem. In spite of great financial investment to produce these documents in a way that satisfies competing litigation needs of authenticity and full native functionality, litigants continue to disagree on a production format for these documents. And judges still spend entirely too much time resolving discovery disputes over these particular types of documents. Why hasn’t anything changed?

Layers Upon Layers Of Data

Producing electronic spreadsheets in discovery would be relatively straightforward if the requesting party were only interested in visible content. After all, for word processing documents, many litigants and their counsel are satisfied with the production of the visible text as it appeared to the user. Even when a requesting party specifically seeks the deleted text and document-associated metadata found in a word processing file, this information can often be extracted and offered in a fielded or other “flat” format that presents this additional file content in a form that can be used and authenticated as evidence in a fairly straightforward manner.

Unlike other common user-created documents, however, electronic spreadsheets contain data in at least three distinct layers of information: (1) the visible value or contents displayed in the cell; (2) the formula or link to another spreadsheet or document that populates the cell with its visible value; and (3) one or more annotations to the content. Though each of these layers may have equal relevance in fact discovery, only one of these layers is ordinarily visible at a time. Producing these data in their separate layers, however, limits or eliminates the connections that a requesting party can draw from the matrix of data associated with one or more data cells.
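The three layers can be pictured with a short Python sketch. It is only an illustration: the `Cell` class, the function names, and the sample figures are hypothetical and are not drawn from any real spreadsheet or review tool. It shows how a layer-by-layer production separates a cell's value from the formula and annotation that explain it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """Hypothetical model of one spreadsheet cell and its three layers."""
    address: str                    # cell location, e.g. "B3"
    value: object                   # layer 1: the visible value
    formula: Optional[str] = None   # layer 2: formula or link behind the value
    comment: Optional[str] = None   # layer 3: annotation attached to the cell


def flatten_by_layer(cells):
    """Split the cells into one separate listing per layer."""
    return {
        "values": {c.address: c.value for c in cells},
        "formulas": {c.address: c.formula for c in cells if c.formula},
        "comments": {c.address: c.comment for c in cells if c.comment},
    }


def combined_view(cells):
    """Keep all three layers side by side for each cell."""
    return [(c.address, c.value, c.formula, c.comment) for c in cells]


# Invented sample data: a projection that depends on an annotated assumption.
cells = [
    Cell("B1", 1000),
    Cell("B2", 0.08, comment="Growth rate assumed by finance team"),
    Cell("B3", 1080.0, formula="=B1*(1+B2)"),
]

layers = flatten_by_layer(cells)
```

In the `layers["values"]` listing, cell B3 is just the number 1080.0. Only the combined view shows that the figure was calculated from B1 and B2, and that B2 carries a note about who chose the growth rate.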

Further, spreadsheets are often considered valuable not only for the visible data they display, but also for the underlying algorithms and structure that create the visible results. Litigants are interested in how changing variables and formulas alters the overall information displayed to a spreadsheet user; it’s a very common way to challenge the assumptions that an opposing party’s fact witnesses may have used. However, production solutions that work for e-mail messages, word processing documents, and PowerPoint presentations do not preserve that functionality and simply don’t “fit” spreadsheets.

Authenticity Is In The Details

The obvious alternative to producing electronic spreadsheets in processed formats that destroy functionality is, of course, to produce them in the native format in which they have been kept in the ordinary course of business. Unfortunately, the very feature that makes native format attractive, the ability to see and manipulate all layers of data within the spreadsheet, also makes these documents difficult to manage over the course of discovery. For example, the current way to authenticate native files is to compare their hash values (a type of electronic fingerprint). If the values match, the files are deemed identical. Unfortunately, even the most modest changes to an electronic file, including merely opening it and saving it again, can change enough data within the file that its hash value no longer matches that of the original copy. Legal teams seeking to introduce such files must go through an elaborate process to demonstrate why these changes aren’t material to the accuracy and reliability of the version of the file produced to them. Not all judges will consider such arguments.
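For readers who have not seen a hash comparison, the following Python sketch illustrates the idea. It assumes the SHA-256 algorithm from Python's standard `hashlib` module; this article does not say which algorithm any given case or vendor uses, and the sample bytes merely stand in for a real spreadsheet file.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 fingerprint of the given bytes as hex text."""
    return hashlib.sha256(data).hexdigest()


# Invented stand-in for the bytes of a produced spreadsheet file.
original = b"Revenue,1000\nGrowth,0.08\n"

# A byte-for-byte identical version of the file.
identical = bytes(original)

# The same file after a single character has been changed.
altered = original.replace(b"0.08", b"0.09")

original_hash = sha256_hex(original)
identical_hash = sha256_hex(identical)
altered_hash = sha256_hex(altered)

# Matching fingerprints mean the contents are the same;
# any change to the bytes, however small, yields a different fingerprint.
hashes_match = original_hash == identical_hash
change_detected = original_hash != altered_hash
```

The comparison gives a yes-or-no answer only. It reveals that two files differ, not what changed or whether the change matters, which is why a mismatch leads to the elaborate explanation described above.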

Spreadsheets in native electronic form are also difficult to use as exhibits. Instead of pointing a witness, judge, or jury to a specific page or extract, lawyers can, at best, reference a specific sheet and range of cells within the spreadsheet. If too much clicking and scrolling is required to find relevant content, evidentiary significance may be overshadowed by the effort required to find it.

Making The Best Of Current Technology

In 2004, I brightly suggested that new technologies would help simplify the question of how to produce electronic spreadsheets to requesting parties. Four years later, we’re still waiting for this magic solution. In the meantime, what procedures make the most sense?

First and foremost, of course, litigants should work to find a solution that requesting and producing sides alike can accept. Depending on the case, spreadsheets may or may not play a significant role in the evidentiary story. If they’re not important, parties may find that a less than perfect (and therefore, less expensive) production format may work perfectly well. After all, as long as the spreadsheets are properly preserved in their native format, a supplementary production can always be made if these files change in significance.

Barring agreement to produce spreadsheets in something other than native format, native format spreadsheets can probably best be produced to a requesting party on read-only media such as CD or DVD-ROM discs. Files copied to such fixed media can have their hash values calculated one last time by the producing party to give all parties a reference point for authenticity. Requesting parties can create working copies of the spreadsheets while retaining the produced version for potential use as substantive evidence. Production of native file format spreadsheets on rewriteable media, such as flash and hard drives, does not provide the same built-in layer of data authenticity. While convenient, production of these files on such media may only defer, not prevent, evidentiary disputes about spreadsheets later in the case.
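One way a producing party might record that final reference point is a simple list of hash values for every file in the production. The Python sketch below is a minimal illustration rather than a recommended protocol. The function names and the `forecast.xls` file are hypothetical, SHA-256 is an assumed choice of algorithm, and a temporary folder stands in for the production media.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its hash value."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """List files that were added, removed, or altered since the manifest."""
    current = build_manifest(root)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))


# Demonstration with a temporary folder standing in for the production set.
with tempfile.TemporaryDirectory() as tmp:
    production = Path(tmp)
    (production / "forecast.xls").write_bytes(b"placeholder spreadsheet bytes")

    manifest = build_manifest(production)          # reference point at production
    unchanged = changed_files(production, manifest)

    # Simulate an edit made to a copy kept on rewriteable media.
    (production / "forecast.xls").write_bytes(b"edited working copy")
    altered = changed_files(production, manifest)
```

In this demonstration `unchanged` is an empty list and `altered` names the edited file. On read-only media the second step cannot happen to the produced copy, so the recorded hash values remain a stable reference for both sides.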


Courts have reached conflicting conclusions regarding the need to produce electronic file metadata. In the well-known case of Williams v. Sprint / United Management Co., 230 F.R.D. 640 (D. Kan. 2005), the court found that nothing less than native file production of spreadsheets would suffice. Conversely, in the equally well-known case of Kentucky Speedway, LLC v. National Association of Stock Car Auto Racing, Inc., 2006 WL 5097354 (E.D. Ky. Dec. 18, 2006), the court rejected the Williams analysis as “unpersuasive” and declared that for most litigation, metadata “does not provide relevant information” and need not be produced. Kentucky Speedway, 2006 WL 5097354 at *8. Currently, courts seem a bit more inclined to follow the reasoning of the Williams case with respect to the production of electronic spreadsheets, but this is not a foregone conclusion in many jurisdictions, and the specific discovery needs of a case always have the potential to outweigh past judicial guidance.

For the time being, practitioners will continue to struggle with the question of producing electronic spreadsheets in discovery “until new technologies are developed that can easily present a read-only, easily verified versions of native format files….” However, so long as a spreadsheet remains preserved in a format from which a supplemental production can be prepared, litigants should feel free to continue experimenting with – and negotiating – different ways in which these digital format documents can be usefully and cost-effectively provided in discovery.
