Introduction to Digital Discovery
By Professor Charles Nesson
The ideal of civil discovery is complete disclosure, equal access to all relevant information. Yet the adversarial nature of litigation has always ensured a different reality, with lawyers doing their level best to keep information harmful to their clients out of the hands of their opponents. The digital revolution suggests the possibility that the architects of the civil rules may get what they wished for, posing the attendant question of whether that is really what they want.
One way to see the digital world is that, once in it, all information inputted by any individual or company leaves a digital data trail that records all communications and actions. That data is then networked, accessible to an analyst of that network. This is the frightening vision posed by movies like “Enemy of the State.” As individuals we find ourselves worried about our loss of privacy, as we seem to be increasingly unable to avoid sensors that track us. Our capacity to be completely by ourselves doing what we choose, free of surveillance, seems to be evaporating. As litigants, we may face the same problem.
This concern for privacy has its analog with companies. As they transform themselves from ink-on-paper, non-networked communication environments to electronically networked environments, the possibility looms of near perfect information systems in which all actions are recorded, stored, and subjected to electronic retrieval and analysis, leaving a complete data trail capable of narrowing to nothing the difference between what was actually done and the history that can be told of it. Here, then, in theory, and close enough on the horizon to be all but real, is the vision of the framers of the civil discovery rules come true.
Companies that have embraced digital communications for their business processes have done so with remarkably little concern for the consequences in litigation. Corporate legal cultures have generally not kept pace with their companies’ technological transformations, and have not been pushed to do so by their CEOs. Nor, generally speaking, have they been pushed yet by the plaintiffs’ bar, which has itself lagged in understanding and appreciating the litigation opportunities that are dawning with the advent of the digital era. But, after the fact, companies, their general counsel and associated defense firms, and their adversaries in the plaintiffs’ bar are now awakening to the subject of digital discovery. They are identifying and exploring its distinctive qualities, and beginning to master them. The government’s devastating use of discovered emails against Bill Gates in the Microsoft antitrust case served as a startling wake-up call to the managers of corporate America, alerting them to the dangers to which digital discovery exposes their companies. Plaintiffs’ attorneys are becoming energized with the electric possibilities of the digital age. Judges charged with administering the discovery process feel their unfamiliarity with the technologies involved and are eager to develop the experiential basis that has served them in making the kind of balancing judgments that have typified discovery administration in the ink-and-paper era. They are feeling a need to better understand the sweep and consequence of the discovery orders they are asked to approve.
My immediate goal in this writing is to introduce the subject of digital discovery and to outline its relevance to judicial decision making. My objective is to identify the distinctive features of digital discovery and to help chart the way toward accommodation in the digital environment among the legitimate desires of corporations to protect themselves, the rights and needs of plaintiffs for relevant information about corporate actions that injure them, and the interests of courts in managing a litigation process that is at once highly adversarial yet based to a remarkable degree on the willingness of litigants and their lawyers to preserve and produce evidence relevant to litigated controversies. Ultimately the question lurks whether indeed a litigation world of perfect information is ideal.
The Digital Landscape from a Company Viewpoint
While the specter of companies laying down complete and accessible data trails is still in the future, many companies are well on the way. Companies are presently leaving behind them huge data trails composed of a wider variety of information and in vastly greater volume than was true when ink-on-paper was their dominant recording technology. Email, new to companies less than ten years ago, has overtaken the telephone as a transactional medium, resulting in archives of informal, candid internal and external messages of a kind that would never have been recorded in the ink-on-paper era. Word processors employed to generate formal documents, letters, memoranda and reports leave behind recorded trails of drafts and edits, so that even the language you delete or the ideas you edit out are discoverable. They also generate and record meta-data showing who created documents, who received them, when and by whom they were changed and read, often data of which the user is unaware. Spreadsheets, calendar programs, address books, bookmarks, downloads, cookies, caches and history files (see glossary) all add to the proliferation of information being generated and recorded.
Not only is a greater variety and volume of information being generated and recorded, but much less information is being thrown away. Once recorded, electronic data is remarkably persistent. The sheer bulk of ink-on-paper systems of the past imposed disciplines of deciding what should be filed, what should be discarded. The efficiency of electronic storage in both compactness and cost, on the other hand, substantially eliminates the motive to sort and discard. It is now easier and faster to save and store everything. Moreover, the very process of disposal for electronic information, even if one goes to the trouble, is often less effective than for paper. “Deleting” an electronic document means eliminating only the readability of its index entry, not actually overwriting (and thus erasing) the document, thereby leaving it for possible recovery. “Trash” in a computer is just another electronic file (possibly the first that would be requested by any opposing litigant).
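The point about deletion can be made concrete with a minimal sketch. It assumes a toy storage model, not any real file system: a list of data blocks plus an index of file names. The class `ToyDisk` and its method names are hypothetical and exist only for illustration.

```python
class ToyDisk:
    """Toy model of storage: an index of names pointing at data blocks."""

    def __init__(self):
        self.blocks = []   # raw contents, in the order written
        self.index = {}    # file name -> position in self.blocks

    def write(self, name, text):
        self.blocks.append(text)
        self.index[name] = len(self.blocks) - 1

    def delete(self, name):
        # "Deleting" drops only the index entry; the block is untouched.
        del self.index[name]

    def wipe(self, position):
        # Actual erasure requires overwriting the block itself.
        self.blocks[position] = ""

    def listing(self):
        # What an ordinary user sees.
        return sorted(self.index)

    def forensic_scan(self, keyword):
        # A recovery tool ignores the index and reads every block.
        return [block for block in self.blocks if keyword in block]


disk = ToyDisk()
disk.write("memo.txt", "candid memo about the product defect")
disk.write("plan.txt", "quarterly marketing plan")
disk.delete("memo.txt")

visible = disk.listing()                  # the memo no longer appears here
recovered = disk.forensic_scan("defect")  # but its contents are still found
```

In this model the memo disappears from the listing yet remains recoverable until `wipe` overwrites its block, which is the gap between deleting and erasing described above.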
The practice of making periodic back-ups of computer memory adds tremendously to the persistence of electronic data. Fear of computer and system crashes leads users at all levels -- company, division, department and individual -- to back up their data periodically. Typically, back-ups are made in forms that would permit recovery in the event of disaster, but otherwise not in a form suited to easy search and retrieval. Such back-ups are made and put away, seldom needed, seldom used, each a snapshot of all the data on a system or part of a system at a particular time. The cumulative store of back-ups comprises an archaeological trove in discovery.
The Threat From a Company Viewpoint
A completely electronically networked company with all of its past and current data accessible online approaches the ideal of complete transparency in electronic discovery. All that would be needed would be to gather all of the company’s recorded data, mount it on equipment capable of reading it, and network this equipment together. At that point the company’s entire data history would become accessible to electronic search by the Boolean techniques with which we devotees of Westlaw and Lexis have become familiar. This specter of near transparency to discovery is frightening to companies for several reasons. First, companies fear that they will be obliged by courts to expend the effort and pay the costs of making their data accessible, a task that is potentially overwhelming. Beyond the expense of locating and indexing their data, companies confront huge expense in making their data readable. Currently active data can be read with software and hardware currently in use. But “legacy data” recorded in times past with now outmoded software and hardware is difficult and expensive to read. The cost of deciphering legacy data appears to be one of the major expenses associated with digital discovery.
Second, companies fear the decision of who will be allowed to do the discovery searches. If, as in the traditional discovery process, the company’s lawyers do the search, the time and expense could be overwhelming. One could imagine a respondent company being required to open its system to the other side, inviting its litigation opponent to jack into its data world and search at their time and expense. For understandable reasons, no company of which I am aware has been willing to take this step, undoubtedly because giving a hostile litigant open access to a company’s entire information system would mean disclosure of current business plans and trade secrets, loss of attorney-client privilege, and invasions of privacy.
One alternative would be to allow search by a neutral trusted third party. This is a strategy Northwest Airlines used recently in an extraordinary suit it brought against the flight attendants’ union with which it was negotiating a new contract. Northwest claimed that the flight attendants organized a “sick-out” over the Y2K New Year. In an effort to find evidence that the union and its members organized the sick-out, Northwest sought and obtained a discovery order requiring searches of the office and home computers of forty-three union officials and rank-and-file union members. The discovery order required turnover of all of the computer equipment to the accounting firm of Ernst & Young for purposes of examining and copying information and communications contained on the computer hard drives. Ernst & Young, supposedly neutral, was paid by Northwest. Upon completion of the discovery, Northwest fired over a dozen employees for engaging in the sick-out. The union has filed grievances, and the matters are still in litigation. The example, though extreme, illustrates difficulties with translating the “trusted third party” ideal into reality. Neutrality can be questioned. Privacy is invaded, albeit by a “neutral.” Control is lost.
Realistically, companies will resist any surrender of control in the digital discovery process. If searches through their data are to be conducted, they will want to conduct them themselves, or have their agents conduct them. This means they must look to the courts to set limits on the scope and cost of what they can be required to do. The legal problem companies face is that there seem to be no definite bounds on digital discovery, no limits of subject, type, time, or expense. All forms of digital data are potentially discoverable, including any data compilations, according to the Federal Rules, “from which information can be obtained, translated, if necessary, by the respondent through detection devices into reasonably usable form.” The fact, for example, that the senders or receivers of email may have “deleted” them, far from insulating them from discovery, may give reason to think that these email messages are a particularly important source to mine for admissible evidence. The fact that a computer may be used for personal as well as company business may give reason for care in conducting a discovery search but does not necessarily limit the scope of what can be searched.
The Judicial Standard of Proportionality
Companies naturally raise their concerns about potentially huge costs and burdens of discovery to the judges administering the discovery rules. Judges address these protests from respondents using proportionality principles. The civil rules invite judges to limit discovery when the burden of discovery outweighs its likely benefit, in order to prevent undue burden or expense. But the rules neither specify the component measures of burden and benefit nor define what is undue.
Judges may consider the expense of the requested discovery in proportion to the amount in controversy in the lawsuit. This relationship of cost to amount in controversy is thought to be an important factor because a disproportionately high cost of responding to discovery can force the respondent to settle a suit regardless of the merits. But when the high cost of discovery can be attributed to the respondent company’s failure to organize itself efficiently, judges are likely to impose the cost of discovery on the company even if the cost is high relative to the value of the plaintiffs’ claim. The situation is of the company’s own making, the argument goes, created by the company’s decision to adopt an information system from which it has benefited overall but which has made the desired discovery difficult. Why, a judge will ask, should those litigating against the company be disadvantaged by the company’s disorganization? It is, moreover, somewhat doubtful how much judges care that their discovery orders may impose so much expense on a respondent company that it will be forced to settle. Judges may be under such pressures to clear their calendars that, far from seeming a negative, the tendency of high discovery costs to promote settlement may be regarded as a plus. This seems wrong in principle, but real in fact.
Judges, when asked to limit discovery requests, will also consider how likely the requested discovery is to produce admissible evidence. Where the discovery challenge is to search large volumes of digital data, judges have to consider Boolean searches and data sampling. Keyword searches are constructed so as to produce a limited number of high-assay hits. Sampling may be used for back-up tapes at staggered dates to determine how much overlap there is and how likely a further search of back-ups is to produce new material. The objective from the respondent’s viewpoint is to establish a pattern of diminishing return in which the judge perceives a tipping point beyond which the effort of further discovery seems unjustified by the likely benefit. If convinced, the judge can mitigate the discovery burden on the respondent either by barring discovery beyond that point or by shifting the cost of further discovery to the requesting party.
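The sampling argument can be illustrated with a minimal sketch. It assumes each back-up can be reduced to a set of document identifiers, and it treats the tipping point as a simple numeric threshold, which is an illustrative stand-in for what is in practice a judgment call. The function names, the back-up labels and the data are all hypothetical.

```python
def new_material_by_sample(backups, sample_every):
    """Examine every sample_every-th back-up, oldest first, and count the
    documents not already seen in an earlier sampled back-up."""
    seen = set()
    results = []
    for label, doc_ids in backups[::sample_every]:
        fresh = set(doc_ids) - seen
        results.append((label, len(fresh)))
        seen |= fresh
    return results


def tipping_point(results, threshold):
    """Return the label of the first sampled back-up yielding fewer than
    `threshold` new documents, or None if no sample falls below it."""
    for label, fresh_count in results:
        if fresh_count < threshold:
            return label
    return None


# Hypothetical monthly back-ups; each is a snapshot of document ids.
backups = [
    ("1999-01", set(range(1, 7))),    # documents 1-6
    ("1999-02", set(range(1, 8))),    # documents 1-7
    ("1999-03", set(range(2, 10))),   # documents 2-9
    ("1999-04", set(range(2, 11))),   # documents 2-10
    ("1999-05", set(range(4, 11))),   # documents 4-10
    ("1999-06", set(range(4, 12))),   # documents 4-11
]

results = new_material_by_sample(backups, sample_every=2)
cutoff = tipping_point(results, threshold=2)
```

With this invented data the three sampled back-ups yield 6, 3 and 1 new documents, the pattern of diminishing return a respondent would try to show. A requester could fairly answer that unsampled back-ups may hold material the samples missed.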
The processes of sampling and keyword searching can be highly efficient in searching massive volumes of data. These digital search techniques comprise the most distinctive feature and potential advantage of digital discovery. Yet, because of the attorney-client privilege, these search techniques may benefit the requester more than the respondent. If privileged documents are turned over to the other side in discovery, the privilege will be lost, not only for the document itself, but also for all other documents relating to its subject matter. This waiver doctrine requires discovery respondents to review each document carefully before delivering it to the other side. Given the massive volume of documents that may be involved, reviewing each one can pose a huge cost. This means that while the requester reaps the efficiency advantage of digital search techniques, the respondent does not. Companies would prefer approval by judges of a more relaxed standard of waiver, expanding the ambit of so-called inadvertent waiver so that inspection can be done by keyword search, backed by the safety net that if privileged documents slip through the keyword review effort and are turned over to the other side, they would not lose their privileged status and could not be used in evidence.
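A keyword-based review of the kind companies would prefer might look like the following minimal sketch. It assumes plain-text documents, whole-word matching, and a short, hypothetical list of privilege terms. The function names, the term list and the sample messages are illustrative only, and a real privilege screen would need far more than a word list.

```python
import re


def tokens(text):
    """Lower-case the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def matches(text, all_of=(), any_of=(), none_of=()):
    """Boolean query: every all_of term AND (if given) at least one
    any_of term AND none of the none_of terms."""
    words = tokens(text)
    return (all(term in words for term in all_of)
            and (not any_of or any(term in words for term in any_of))
            and not any(term in words for term in none_of))


# Hypothetical screen list; an assumption, not an accepted standard.
PRIVILEGE_TERMS = {"counsel", "attorney", "privileged"}


def triage(documents, **query):
    """Split responsive documents into those to produce and those to
    hold back for privilege review by a lawyer."""
    produce, hold = [], []
    for doc_id, text in documents.items():
        if not matches(text, **query):
            continue
        if tokens(text) & PRIVILEGE_TERMS:
            hold.append(doc_id)
        else:
            produce.append(doc_id)
    return produce, hold


documents = {
    "email-1": "The valve test failed again last week.",
    "email-2": "Per our attorney, the valve test results are privileged.",
    "email-3": "Lunch on Friday?",
    "email-4": "Marketing draft: the valve passed every test.",
}

produce, hold = triage(documents, all_of=("valve", "test"),
                       none_of=("marketing",))
```

The weakness is visible in the design: a privileged message that happens to use none of the listed terms would land in `produce`. That is why respondents want the inadvertent-waiver safety net before relying on such a screen.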
The bottom line is that the legal framework of proportionality is amorphous and leaves tremendous discretion to the discovery judge, whose decisions are, in any case, reviewed only against a standard of abuse of discretion, and seldom reversed. Instead of looking to judges to protect them from potentially huge discovery burdens, companies are concluding that they need to take control of their data trails and eliminate much of them by destroying unwanted data and ordering what remains to make search and retrieval and production efficient.
The Strategy of Destruction
In the absence of clear legal limits to the scope and cost of discovery, the only definite bound is the limit of what physically exists. This suggests that companies adopt protective, pre-emptive and systematic strategies of culling and destroying unwanted digital data. To do this, a company would need to differentiate the types of data it generates, assess each type of information to determine its value to the company over time, estimate how costly to the company the information would be in the hands of a hostile litigant, and determine the cost of destroying it. At the point in time when the net value of the information to the company goes negative by an amount sufficient to cover the costs of destroying it, the company would destroy it.
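The decision rule in this paragraph can be written out as a minimal sketch. It assumes that a company could actually put yearly dollar figures on an information category's business value and on its expected cost in the hands of a hostile litigant, which is a strong assumption. The class, the field names and every number are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Category:
    name: str
    business_value: List[float]   # value to the company, by year of age
    litigation_cost: List[float]  # expected cost if discovered, by year
    destruction_cost: float       # one-time cost of destroying the data


def destruction_year(category: Category) -> Optional[int]:
    """Return the first year in which net value is negative by at least
    the cost of destruction, or None if that never happens."""
    yearly = zip(category.business_value, category.litigation_cost)
    for year, (value, exposure) in enumerate(yearly):
        net = value - exposure
        if net < 0 and -net >= category.destruction_cost:
            return year
    return None


# Hypothetical figures for two categories of information.
email = Category("email", [100, 40, 10, 0], [20, 30, 40, 50], 15)
contracts = Category("contracts", [100, 90, 80, 70], [5, 5, 5, 5], 15)

email_year = destruction_year(email)          # net: 80, 10, -30, -50
contracts_year = destruction_year(contracts)  # net stays positive
```

On these invented numbers the email category would be scheduled for destruction at age two, while the contracts would be kept indefinitely.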
This approach, similar to the logic that drives so-called “retention” policies for paper records, may be difficult to implement for digital data. As a legal matter, there is the question whether judges will be as willing to respect digital data destruction as they have been willing to respect paper records destruction. The physical bulk of paper records meant that their periodic destruction was easily rationalized in terms of saving the costs of physical storage associated with retaining them. There was no need to emphasize the potential value of the documents to adverse litigants, no need, in other words, to advertise the motive of suppressing evidence. But with digitally recorded information the calculus changes.
Digital storage is so efficient and compact that storage costs become negligible, which means that, unless the negative potential value of stored information to adverse litigants is taken into account, there is little administrative non-litigation reason to destroy it. The question becomes: will courts respect programs the object of which is to destroy evidence that would be valuable to hostile litigants? Or will judges find such programs objectionable and look for ways to undercut them, possibly by allowing spoliation inferences to be drawn in favor of those disadvantaged by them? Spoliation inferences are based on the idea that if a company knows about a controversy and destroys evidence in its possession relating to the controversy, a litigant can argue and a jury can infer that the information the company destroyed was unfavorable to it.
To illustrate the question, imagine as a hypothetical a company that has generated a timetable for the destruction of each separable category of information the company generates, and contracts with “Disappearing, Inc.” (a real company) to execute its program. Assume that Disappearing, Inc. offers an encryption product and service that mediates every transaction through the client company in a way that results in all information the company records being encrypted with different keys for each information category, and finally assume that Disappearing, Inc. exclusively holds the encryption keys. By agreement with the company and according to the company’s schedule for destruction, Disappearing, Inc. simply destroys the appropriate keys, thus rendering all the recorded information in the category covered by the key effectively inaccessible.
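The mechanism in the hypothetical, one key per information category with destruction of the key standing in for destruction of the data, can be sketched as follows. This is not a description of how Disappearing, Inc. actually works. The class `KeyHolder` and its methods are hypothetical, and the cipher is a toy built from SHA-256 for illustration only, unsuitable for protecting real data.

```python
import hashlib
import secrets


def _keystream(seed, length):
    """Derive `length` pseudo-random bytes from `seed` (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data, seed):
    """XOR data with the keystream; applying it twice restores the data."""
    return bytes(a ^ b for a, b in zip(data, _keystream(seed, len(data))))


class KeyHolder:
    """Stands in for an outside service that alone holds the keys."""

    def __init__(self):
        self._keys = {}  # category -> secret key

    def encrypt(self, category, plaintext):
        if category not in self._keys:
            self._keys[category] = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)  # fresh per record
        body = _xor(plaintext.encode("utf-8"), self._keys[category] + nonce)
        return nonce + body

    def decrypt(self, category, ciphertext):
        key = self._keys.get(category)
        if key is None:
            raise KeyError(f"key for {category!r} has been destroyed")
        nonce, body = ciphertext[:16], ciphertext[16:]
        return _xor(body, key + nonce).decode("utf-8")

    def destroy_key(self, category):
        # The stored ciphertext is untouched; only the key is discarded.
        self._keys.pop(category, None)


holder = KeyHolder()
stored = holder.encrypt("email-1999", "candid internal message")
readable_before = holder.decrypt("email-1999", stored)

holder.destroy_key("email-1999")
try:
    readable_after = holder.decrypt("email-1999", stored)
except KeyError:
    readable_after = None  # the record still exists but cannot be read
```

The company keeps holding `stored` throughout. What changes at the scheduled date is only that nobody, the company included, can turn it back into readable text.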
The hypothetical poses a situation in which avoidance of storage cost is not available as a reason for destruction. The company’s data may continue to be stored; it simply becomes unreadable after a pre-set time. The sole and explicit reason for using the services of Disappearing, Inc. is to put the company’s information beyond the reach of discovery.
There is reason to think, notwithstanding the “in your face” quality of this destruction program, that courts would respect it. Companies are considered to own their data and have the right to destroy their data until and unless it becomes relevant to a controversy. Once a company is on notice of the relevance of information to a current or potential controversy, destruction of the information may then expose the company to sanction in the form of a spoliation inference. But if the company’s program of destruction was adopted without reference to any specific controversy or litigation and is administered consistently according to its design, then, typically, no spoliation inference would arise.
Moreover, even in this example, the company can claim that it has legitimate objectives other than suppressing evidence. Even if a company has no information that would help litigants against it, a process of culling and destroying unwanted information makes sense simply as a means of limiting the data set that must be searched each time there is a discovery request. Saving discovery costs may be seen as in itself a legitimating cost-saving rationale for destroying unwanted digital records, as effective as the saving of storage cost and space has been as a legitimating rationale for the destruction of paper records.
Finally, there is momentum that is likely to carry over from the current acceptance of paper destruction programs based on the rationale of saving storage space. Judges may view the negligibility of electronic storage costs compared to those for paper as a difference merely of degree, not of kind.
But this is not the end of the story. Companies wanting to implement destruction programs as a strategy for dealing with the potential burdens of digital discovery are typically engaged in litigation on a continuous basis. These are the companies that most feel they have a problem.
Explicit Preservation Obligations
When a company is sued, an obligation arises for it to preserve evidence relevant to the suit. Often the plaintiff in the lawsuit will formalize this obligation by sending a “preservation notice” to the company or by persuading the court to issue a preservation order. Typical notices and orders sweep broadly, calling for all necessary steps to assure that the company’s “employees, agents, accountants and attorneys refrain from discarding, destroying, erasing, purging or deleting any relevant documents including but not limited to computer memory, computer disks, data compilations, email messages sent and received and all back-up computer files or devices.”
The breadth of the preservation obligation causes serious problems for companies. For a company that is engaged in multiple lawsuits as a continuing part of its business, broad preservation obligations arising from successive individual lawsuits can effectively nullify a general data destruction program. With the company on specific notice of its preservation obligations, any failure to comply with the preservation obligation may be cause for sanction, thus creating a sanctions trap. This makes it dangerous for a company to run a continuing destruction program with the expectation that it can be adjusted as necessary to comply with the requirements of various preservation orders. Again, the problem is one of organization. Most companies that are confronting digital discovery problems simply do not have company-wide information systems that are well enough controlled to permit such fine-tuning. Counsel who receive an order such as that quoted above may send out email, voicemail and letter directives to employees, agents, accountants and other attorneys for the company passing the substance of the order along, but the process of changing the actual behavior of these people may take much more. The preservation order quoted above was issued in a wrongful death case litigated recently in Massachusetts against American Home Products involving the Fen/Phen diet drug combination. Notwithstanding messages forwarding the terms of the order to all employees, American Home failed for several months to halt its routine destruction of back-up tapes, with the consequence that at trial the plaintiff’s lawyer, backed by the trial judge, was free to prove the fact of failure and argue that the jury could draw the inference that the information American Home destroyed was unfavorable to it. The case resulted in a huge settlement within days after the plaintiff’s forceful opening argument about spoliation.
Conclusion
The bottom line is that if judges want to press companies toward the ideal of electronic transparency, they have the power and the opportunity to do so. They need only (1) hold companies responsible for keeping their data in readable form (resisting arguments that legacy data is too expensive to make accessible), and (2) interpret the bounds of the preservation obligations attendant on pending litigation sufficiently broadly so that companies cannot safely destroy their data trails.
Return to the opening question. As discussed, courts now more than ever have considerable leverage in shaping how companies structure their information systems. Is it actually in the interest of courts and litigants that companies preserve their full data trails in perpetuity and make them transparent to discovery?
We are still very early in the evolution of digital discovery. Corporate legal cultures are racing to catch up with the litigation implications of what have been runaway technological corporate changes. Judges are not yet familiar with the new technologies and the issues they present. There is practically no information in the literature about actual costs of digital discovery, broken down by different types of information and in contrast to discovery costs in the paper environment. No patterns of best practice have yet emerged either for companies seeking to design and implement data destruction programs or for courts wanting to adopt efficient approaches to case management.
In this workshop, we hope to address how lawyers and litigants are responding to issues of digital discovery, and think through together how judges ought to administer requests for digital discovery in response to claims of the burdens and costs such discovery imposes.
– Charles Nesson