*3 Plaintiffs request the production of six different electronic databases. Although both sides have filed their submissions under seal due to the sensitive nature of some of the materials involved, no security concerns prevent public disclosure of the nature of these databases. No one would be surprised to learn that the prison system maintains computerized records that enable it (1) to track the location of prisoners in the system; (2) and (3) to record and recover unusual incident reports and disciplinary records; and (4), (5) and (6) to monitor prisoners' medical problems, including scheduling of medical appointments, tracking prisoners with certain specific medical problems, and recording the pharmaceuticals required by prisoners. Plaintiffs have demanded to be provided with copies of these databases.
Plaintiffs argue that this is a simple and straightforward discovery demand, of a sort routinely enforceable under the Federal Rules. Plaintiffs' complaint argues that double-celling produces intolerable health and safety risks to them. Information about which prisoners have been double-celled (and where and when), and the extent of disciplinary problems, violent incidents, and medical problems in the prisons is clearly relevant to testing that claim. Plaintiffs point out that at the Bolton trial, the State offered expert testimony about the correlation between double-celling and various problems that was derived, in part, from information contained in these databases. Plaintiffs claim that they should have the same opportunity to analyze the data for themselves, to see whether information in the State's records supports their claims.
In fact, plaintiffs go farther than arguing relevance. They insist that the databases are “essential to the effective prosecution of Plaintiffs' claims.” (Pl.Br. at 2, emphasis added.) “Without the Databases, Plaintiffs cannot effectively demonstrate the unconstitutional impact of double celling.” (Id. at 9, emphasis added.) The argument is that without sophisticated statistical analysis of the incidence of disease and violence vis-à-vis double-celling, individual testimony will be dismissed as anecdotal, and plaintiffs will fail to establish that double-celling causes extensive and systematic harm. Plaintiffs propose to undertake “an iterative process in which a statistical expert must identify a set of independent variables related to characteristics of double celling,” and then test the data for “relationships with any number of dependent variables, ... [including] incidents of violence, misbehavior, the severity of violence and misbehavior, incidence of infectious diseases among the inmate population, ... etc.” (Id. at 9-10). The process of identifying correlations and performing regression analyses on these variables is not a simple matter of “running a test. Different combinations must be analyzed. After each analysis, the statisticians and counsel must review the results, and propose revised analyses to be run.... [T]he scope of the project is substantial: the number of potential statistical relationships is in the millions.” (Id.
Plaintiffs go on to anticipate and rebut the State's objections to production. (1) To the extent that the State is concerned about security of information, plaintiffs contend that existing confidentiality orders can adequately protect the security of the information; further, they offer additional safeguards, such as limiting access to the copied data. Moreover, they point out that much of the data contained in the electronic databases (though not all) is already available to the plaintiffs in hard copy, so that the additional security risk is slight. (2) To the extent that the State argues that plaintiffs already have the necessary information, plaintiffs argue that the electronic data are both more extensive and more easily manipulable, permitting more efficient analysis of the evidence, and that requiring the plaintiffs to create their own electronic database will impose extraordinary burdens on plaintiffs' pro bono counsel. (3) To the extent that the State claims production of the databases will be unduly burdensome, plaintiffs point out that the data is already backed up on a routine basis, and that replication of an additional copy for plaintiffs on the occasion of such a data back-up should be relatively simple.
*4 Notably, however, plaintiffs' assertions about the kinds of analyses they wish to perform, the security issues involved, and the practicalities of reproducing the material and utilizing it for the desired purposes are supported only by affidavits from counsel. Plaintiffs provide no evidentiary support from any statistical expert concerning the kinds of hypotheses that could be tested by data of the nature contained in the databases, or what statistical methodology might produce reliable results. Nor do they offer an affidavit from any information technology expert who has considered the descriptions of the databases obtained in depositions, who can testify to the ease with which such data, once copied, can be integrated into a form that will permit whatever statistical manipulations are contemplated, or to how the data could be maintained in a secure manner. Indeed, it is apparent from an examination of plaintiffs' motion that plaintiffs have not retained any such experts yet, and are engaging for the most part in (somewhat informed) speculation about what might be done with the data they seek.
The discovery permitted by the Federal Rules is broad, but not without limits. As an initial matter, “Parties may obtain discovery regarding any matter, not privileged, that is relevant to the claim or defense of any party....” Fed.R.Civ.P. 26(b)(1). However, discovery is subject to limitation by the Court to the extent that
(i) the discovery sought is unreasonably cumulative or duplicative, or is obtainable from some other source that is more convenient, less burdensome, or less expensive; (ii) the party seeking discovery has had ample opportunity by discovery in the action to obtain the information sought; or (iii) the burden or expense of the proposed discovery outweighs its likely benefit, taking into account the needs of the case, the amount in controversy, the parties' resources, the importance of the issues at stake in the litigation, and the importance of the proposed discovery in resolving the issues.
*6 Fed.R.Civ.P. 26(b)(2). In addition, where necessary “to protect a party or person from annoyance, embarrassment, oppression, or undue burden or expense,” the Court may enter any protective order that is necessary, including “that the disclosure or discovery not be had.” Fed.R.Civ.P. 26(c)(1). “Rule 26 vests the trial judge with broad discretion to tailor discovery narrowly.” Crawford-El v. Britton, 523 U.S. 574, 598, 118 S.Ct. 1584, 140 L.Ed.2d 759 (1998).
Thus, the first step in the process is deciding whether requested material is discoverable, that is, whether it is relevant and not privileged. If it is, the Court still has considerable discretion to evaluate the practical realities of discovery, balancing the importance of the information against the burdens of production to decide whether fairness does or does not require production, and if so, on what terms.
This framework applies to requests for electronic or computer-based information just as it applies to more traditional materials. 8A Charles Alan Wright & Arthur R. Miller, Federal Practice and Procedure, § 2218 at 451 (2d ed. 1994). As electronic mechanisms for storing and retrieving data have become more common, it has increasingly behooved courts and counsel to become familiar with such methods, and to develop expertise and procedures for incorporating “electronic discovery” into the familiar rituals of litigation. The rules cited above, albeit for the most part drafted in an earlier era, deal perfectly well with the problems occasioned by discovery of electronic “documents.” At the same time, that the principles applied are the same does not mean that any information that would be discoverable in paper form must automatically be discoverable, on the same terms and conditions, and without consideration of additional issues, in electronic form-or, for that matter, that material that could not be discovered in hard copy would not be discoverable in the form of a database. Particularly when it comes to balancing the costs and benefits of providing discovery, the balance may well differ depending on the form of the information. In this case, for example, the plaintiffs claim that the same information produced in hard copy will be more valuable in the form of an electronic database, because the information will be more manipulable. The State, on the other hand, claims that the burdens and risks of producing electronic data will be much greater than those of producing the same information in the form of paper documents and reports. Such claims must be carefully evaluated on a case-by-case basis. Without at this point accepting either side's claims, it is evident that the form of the data requested creates special problems that must be given appropriate consideration.
It is evident that the material requested is, at least for the most part, relevant to the plaintiffs' claims. Plaintiffs contend that the State's double-celling program, as it has actually been conducted, leads to unacceptable increases in the amount of disease and violence in the State's maximum-security prisons. To test this hypothesis (one might prefer to say, “to prove this case,” but the plaintiffs' own submissions-that it will be impossible to prove their case without subjecting evidence that they have not yet obtained to statistical manipulations they cannot yet specify by experts they have not yet identified-strongly imply that they are engaged in speculation rather than proof), it would clearly be helpful to be able to identify which prisoners have been located in two-inmate cells during which periods, and to correlate that information with medical and incident reports. The databases in question, which relate to the location of prisoners, the incidence of medical problems and pharmaceutical use, and the extent of disciplinary incidents of various kinds, are generally relevant to that inquiry. The State, indeed, does not directly challenge the claim that the material sought is relevant, within the meaning of Rule 26(b)(1).
At the same time, it is far from clear from the evidence presented that all of the information in the databases sought goes to these issues. Plaintiffs have not established that the databases are limited to class members, or that they are or could be (without enormous effort and expense) redacted to relate only to maximum-security inmates. So far as the record before the Court indicates, in fact, the opposite is true. That is, the databases appear to be general DOCS managerial tools, covering all inmates in the State correctional system. Nothing in the record suggests that the databases are easily broken down in such a way that only the portion relating to the institutions involved in this lawsuit can be separately reproduced or disclosed.
Moreover, the databases by their nature disclose more than the data that is in them. By producing the electronic material in raw form, the State would disclose not only the underlying data sought by the plaintiffs, but also the organizational framework of the databases, which would effectively disclose a great deal about the way that DOCS maintains, stores, and classifies information. Worse, as will be discussed further below, the State asserts, and backs up its claim with expert testimony, that this disclosure would not be merely the passive result of providing the database. In order to enable any statistical use of the data, the State would have to affirmatively develop and provide to plaintiffs' experts the equivalent of a manual on how the data is encoded and organized. The burden of doing so will be addressed below; for now it is sufficient to note that as a practical matter the plaintiffs' demands of necessity include the production of information that is not strictly speaking relevant to the case.
For purposes of this motion, however, it will be assumed that the databases, or at least a substantial part of the information contained in them, are relevant to the issues in this case.
While the State does not contest the relevance of the material, it does argue that the databases are privileged. (Def.Br. at 13-19) The claim is that because production of the databases would raise issues of prison security, the material is subject to a qualified privilege. But defendants' claim that “courts have recognized at least qualified privileges in prison personnel and inmate files” (Def.Br. at 14) is somewhat overstated. Courts are generally and appropriately reluctant to create new privileges, since evidentiary privileges are exceptions to the general rules of disclosure and admissibility of evidence that favor seeking the truth and promoting uniformity and simplicity in the laws. See Trammel v. United States, 445 U.S. 40, 50, 100 S.Ct. 906, 63 L.Ed.2d 186 (1980) (citing United States v. Bryan, 339 U.S. 323, 331, 70 S.Ct. 724, 94 L.Ed. 884 (1950)); Gonzales v. National Broadcasting Co., 155 F.3d 618, 623-24 (2d Cir.1998); 3 Jack B. Weinstein and Margaret Berger, Weinstein's Federal Evidence, § 501.02[c] at 501-15 (2d ed. 2002). Evidentiary privileges “are not lightly created nor expansively construed, for they are in derogation of the search for truth.” United States v. Nixon, 418 U.S. 683, 710, 94 S.Ct. 3090, 41 L.Ed.2d 1039 (1974). As such, they must be strictly construed and accepted “only to the very limited extent that ... excluding relevant evidence has a public good transcending the normally predominant principle of utilizing all rational means for ascertaining truth.” Trammel, 445 U.S. at 50 (citation omitted).
Accordingly, this Court has examined carefully the precedents relied on by the State, to see whether the privilege for which it contends has been recognized as within “the principles of the common law as they may be interpreted by the courts of the United States in the light of reason and experience.” Fed.R.Evid. 501. This examination discloses that the State considerably overstates its authorities, which do not support the conclusion that a broad privilege for prison personnel records has been established.
The State cites Kerr v. United States District Court, 426 U.S. 394, 405, 96 S.Ct. 2119, 48 L.Ed.2d 725 (1976), implying that in that case the Supreme Court recognized a privilege, “which ‘rest[s] in large part on the notion that turning over the requested documents would result in substantial injury to the State's prison-parole system.’ ” (Def.Br. at 14, quoting Kerr, 426 U.S. at 405). But Kerr does not create such a privilege. The quoted text, in context, simply describes the basis for the California prison authorities' “claims of privilege.” Id. The Court in Kerr refused to grant mandamus overturning a district court's disclosure order; while the Court noted with favor that the lower courts had not foreclosed an in camera review of the materials to determine whether privileged material was contained in the files, it certainly did not create, recognize or define any specific privilege in “prison personnel and inmate files,” and instead appears to have left open whether there was any basis for invoking the “official or state secret privilege,” which was apparently the only privilege asserted in the case. Id. at 399, citing United States v. Reynolds, 345 U.S. 1, 73 S.Ct. 528, 97 L.Ed. 727 (1953).
The other case relied on by defendants, Association for Reduction of Violence v. Hall, 734 F.2d 63, 66 (1st Cir.1984), is no more availing, and is also misleadingly cited by the State. The First Circuit in Hall did not recognize any general privilege in prison files, nor did it even pass on any claim of privilege at all. Hall holds that the district court erred in relying on documents it had reviewed in camera, but withheld from the plaintiff on grounds of privilege, in deciding a summary judgment motion. With respect to the merits of the privilege determination, the court merely “assumed, without deciding, that the district court did not abuse its discretion in ruling that the documents it viewed in camera were privileged” under one or more of the well-established privileges covering the identities of informants, law enforcement techniques, and intragovernmental policy-making memoranda. Id. Thus, even the district court's reasoning, which the Court of Appeals expressly did not adopt as its own, did not purport to create a general privilege for “prison personnel and inmate files,” but simply found that particular documents were not discoverable because they contained specific information privileged under these other categories.
*9 Thus, the authorities relied on by the defendants do not establish the sweeping general privilege for which they contend. Rather, they support the less dramatic and unsurprising proposition that prison records are sensitive materials, which can be expected to contain a number of different types of information that potentially fall within several more traditional, carefully-defined privileges. Deciding whether the databases requested are privileged, or, more plausibly, contain some quantity of privileged information, would require more careful legal analysis than the parties have yet provided, including most likely some in camera inspection of the records, and a consideration of whether any privileges have been waived by the extensive disclosures already made by the defendants, which they themselves assert already contain substantial amounts of the data in the contested electronic databases.
Fortunately, it is not necessary in this case to decide whether the databases are, or contain material that is, within any of these privileges, or whether this Court should recognize the more general privilege claimed by the State as justified by “reason and experience.” Defendants themselves claim no more than a qualified privilege, which may be overcome by a showing of need. But, as discussed below, even assuming for the sake of the argument that no privilege exists, and that the burden falls not on the plaintiffs to show special need, but on the defendants to establish that presumptively discoverable material nevertheless should be denied under the provisions of Fed.R.Civ.P. 26(b)(2) and 26(c), the balance of benefit and burden clearly supports denying the plaintiffs' motion to compel disclosure of the databases. Though this inquiry in many ways parallels the privilege analysis, it is preferable to address the question under the case-specific balancing test of the discovery rule, without making unnecessary broad legal proclamations about privilege.
*10 All three of the reasons set forth in Rule 26(b)(2) as rationales for limiting disclosure of otherwise-discoverable information apply in this case and require denying discovery of the databases.
The most important reason for this conclusion, foreshadowed by defendants' claim of privilege, is that set forth in Rule 26(b)(2)(iii): “the burden or expense of the proposed discovery outweighs its likely benefit, taking into account the needs of the case, the amount in controversy, the parties' resources, the importance of the issues at stake in the litigation, and the importance of the proposed discovery in resolving the issues.” Indeed, as will be seen, this standard largely subsumes the other considerations reflected in Rules 26(b)(2)(i) and (ii).
The State has demonstrated that the burden of the proposed discovery is extremely serious. In ordinary commercial litigation, the “burden” to be considered is primarily the economic cost of locating, reviewing, redacting, and reproducing requested files. Contrary to plaintiffs' claims, defendants have demonstrated that these burdens are not insignificant for the requested databases. Presumably, the actual data files could be reproduced quickly and without excessive labor on the part of DOCS's information technology specialists. But the expert affidavits supplied by defendants, and uncontradicted by any evidence offered by plaintiffs, persuasively establish that such a production of a few CD-ROMs would be of no utility to plaintiffs. Because the databases were designed for the operational purposes of prison administrators, the data desired by plaintiffs, while perhaps present in the databases, are not readily available for the statistical manipulations proposed by plaintiffs. (Martin Aff. 3A, 11.) Thus, the databases in question are not simply collections of lists or numbers that can be easily extracted and correlated with other numbers; rather, each of the requested databases has “been constructed to support the interactions of hundreds of concurrent users rather than to support the analytical activities of a few.” (Id. 4.) Consequently, the databases are integrally connected to a data system that comprises “25 separate but interdependent subsystems that each are comprised of scores of programs, tens of databases and scores of screen and report formats. There are over 3,000 programs containing a total of 1,500,000 lines of program instructions.” (Id.
To take just one illustrative example, plaintiffs have requested the “database [ ] ... supporting ... the Unusual Incident System.” (Pl. Br. at 1.) The Assistant Director of the Bureau of Management Information Systems in DOCS describes that system as follows:
The Unusual Incident System was developed over fifteen years ago to replace and augment telephone and written notification to Central Office of any prison incident of an unusual nature. Data entry screens are used in the prison to record the description of the incident, the events causing the incident, the action taken, and a medical report. [This report is finalized and supplemented over time as additional information is received.] At each stage of the process, the system provides a rudimentary word processing capability to prepare hard copy documents to certify and document the electronic record. This rudimentary word processing capability is vastly different than modern day word-processing functionality.... The content of an Unusual Incident “document” consists of many data fields and separate records for each line of narrative appearing on the report. When it is necessary to produce a hard copy document, a major program of the system reassembles all of the data, headings, captions and lines of narrative into a document such as those [already] provided to Plaintiff[s'] counsel.... The Unusual Incident system includes 36 programs, 21 screen formats and 48 reports available to the prisons or Central Office. The information in the Unusual Incident system is recorded in 8 database tables with a total of over 2,000,000 rows or records containing nearly 400,000,000 characters.
(Martin Aff. 6.) This description is typical of those provided for the various other databases at issue on this motion.
*11 Thus, providing plaintiffs with any meaningful access to aspects of this system is not a matter of duplicating discs and handing over copies. The data that would presumably be useful to plaintiffs in analyzing patterns of disease and violence or correlating such patterns with double-celled inmates are not simply numbers maintained in a simple set of files that can be downloaded into some (unspecified) statistical analytic program and then crunched in some (unspecified) way to produce meaningful results. DOCS personnel would need to prepare extensive documentation of the structure of the programs and databases to enable any experts retained by plaintiffs to understand the layout of the data, the meaning of codes, and the sources from which those codes can be derived. (Martin Aff. 12-17.)
Again to take just one concrete example, when double-celling was instituted, it took DOCS officials nearly six months, and several person-months of effort, to develop from the Locator database requested by the plaintiffs a method to produce a supposedly accurate list of inmates who have been double-celled; since the data were apparently not originally organized for purposes of tracking double-celling, which did not exist when the system began, it required the construction of additional programming tools to generate the list. DOCS experts estimate it would take eight weeks to prepare documentation of the Locator databases that would enable plaintiffs' experts to recreate this effort to extract double-celling data in any form that would be usable for subsequent analysis. (This estimate does not, of course, account for whatever time it would take the experts to master that documentation.) (Martin Aff. 12.) Ironically, the plaintiffs have already been provided with the fruit of this effort, in the form of the most recent iteration of the list of inmates who have been double-celled. Contrary to plaintiffs' assertions, the electronic data are not a more easily-manipulated electronic version of the list, but an infinitely more complex and embedded set of files from which the list can only be extracted (conceivably, given enough additional time, burden and expense, in a more manipulable form than a hard-copy list) with great difficulty.
The defendants estimate that the cost of the efforts required by plaintiffs' requests would be in excess of $100,000, not including the burden of disrupting DOCS operations by shifting expert computer personnel from operational duties. (Martin Aff. 3.) Even taking the specific dollar figure with a large grain of salt, it is apparent that the financial burden on the State of complying with plaintiffs' demand would be significant.
But unlike the typical commercial litigation, the burden on defendants here is not limited to financial cost or business disruption. Production of the requested databases would have other negative consequences for the State that are even more significant. The security risks of producing the databases are substantial, and cannot be met by the mild and insufficiently thought-out measures suggested by plaintiffs.
It should go without saying that the data in question are highly sensitive. Only a handful of DOCS's own employees, and no one else, have access to the programs used to compile and store the data sought here. (Martin Aff. 19) Access to the data themselves is more widely shared, but few DOCS officials have access to the totality of the databases sought, which are used for different purposes and are accessed piecemeal rather than in their totality by prison officials.
The requested databases contain confidential information about prison personnel; for example, unusual incident reports may contain identifying information (such as social security numbers) and medical information concerning prison staff. Pursuant to earlier court orders and confidentiality stipulations, such information has been painstakingly redacted, at considerable cost of time and effort, from hard-copy versions of these reports provided to plaintiffs. No program or other practicable automated method exists for redacting the computer records to remove the same information from any copy of the database provided to plaintiffs. (Martin Aff. 20.) Accordingly, the alternatives available to the Court are either to order production of the unredacted database, dramatically ratcheting up the security burden of disclosure, or to order the production of a redacted version of the database, which would require a digital duplication of the extraordinary effort already undertaken with respect to the paper discovery process: a manual effort to input changes and corrections into the copied database to eliminate sensitive material.
*12 The databases also contain sensitive information that has been provided to plaintiffs, but in a different form. Here the particular characteristics of electronic information come into play in assessing the security costs of providing information in digital form. The costs and difficulties of redaction, for example, have earlier led the Court to permit production of documents with otherwise-confidential inmate medical information unredacted. The cost/benefit analysis with respect to hard-copy information has led to the conclusion that reasonable restrictions on access to and reproduction of documents are sufficient to prevent wholesale exposure of this information to unauthorized persons, particularly as the authorized users were primarily members of the bar of this Court and their employees. While even the best security measures cannot prevent occasional regrettable leakage of particular documents, the difficulties attendant on the physical reproduction of hundreds of thousands of pages of material provide a de facto safeguard against massive theft or misuse of data.
But computer security is an entirely different matter. Plaintiffs' suggestions, including maintaining strict physical custody of discs or other storage media containing the requested data, ignore the fact that the manipulation of the data projected by plaintiffs will require loading them into foreign computer systems, presumably belonging to as-yet-unselected experts who are neither subject to the licensing rules and ethical constraints imposed on members of the bar nor chosen for expertise in computer security. The ease with which entire databases can be reproduced or transmitted radically alters the security stakes and requires a rebalancing of the factors that permitted unredacted information to be disclosed in paper form. (Kirkpatrick Aff. 6; Clark Aff.) Plaintiffs have provided no reports or affidavits from anyone with expertise in computer security or database management that would provide any evidentiary basis for a conclusion that the data in question could be safeguarded by the means suggested by plaintiffs or by any other means.
Finally, production of the databases would present one other significant security concern not present in routine document discovery. Disclosure of the databases, and of the codes and documentation required to utilize them, would provide access not merely to the data themselves, but also to the techniques used by the prison authorities to record and store data. This is itself a highly confidential matter. In a business context, such disclosure would in some cases raise issues of the protection of trade secrets. See, e.g., Ruckelshaus v. Monsanto Co., 467 U.S. 986, 1001-02, 104 S.Ct. 2862, 81 L.Ed.2d 815 (1984) (noting broad definition of trade secrets); North Atlantic Instruments, Inc. v. Haber, 188 F.3d 38, 44 (2d Cir.1999) (outlining defining factors of trade secrets); 3 Weinstein's Federal Evidence, §§ 508.02-508.07. In a prison context, the stakes are much higher. The security of the data in question against inmates or their associates who have an incentive to seek access to DOCS computers is obviously an important concern. Disclosure of the underlying code and programming information presents a significant danger not presented by disclosure of the data content in hard-copy format. As defendants eloquently make the point, “[Y]ou cannot hack into a computer printout, but a program can be a roadmap for hackers.” (D.Br. at 16.)
None of these concerns, significant as they are, necessarily precludes discovery. Security concerns can perhaps be met, with enough time, ingenuity, and expense; redaction can be accomplished, with the expenditure of sufficient resources. Parties to important public litigation or even private disputes sometimes have to bear significant costs in the discovery process, in order to assure fair and informed decision-making by courts. The “burden or expense of the proposed discovery,” even when substantial, does not forbid disclosure. Rather, the rule asks whether that burden and expense “outweighs its likely benefit.” Fed.R.Civ.P. 26(b)(2)(iii). Accordingly, it is necessary to turn to the projected benefits of discovery to plaintiffs and to the truth-seeking process before determining whether disclosure can be had.
The suggested benefits, however, prove elusive. Plaintiffs make the plausible claim that having data in electronic, manipulable form will facilitate expert analysis of the data, and avoid the need for plaintiffs to incur significant expense, which, as indigent inmates represented by pro bono counsel who have already expended significant resources on the case, they can ill afford. But if this claim is facially appealing in the abstract, it is much more difficult to document and quantify any specific benefit sought by plaintiffs. As discussed above, the databases in question do not exist in a form that is susceptible to statistical analysis, and considerable time, effort and expense will be necessary to reduce them to a form that might be so usable, or to provide the background instructions and documentation that will permit them to be reduced to such form. Yet plaintiffs provide no meaningful account of what exactly they plan to do with the data. Other than very general statistical terms, such as “regression analysis,” that offer little specific content, plaintiffs do not describe what they hope to find or how they hope to find it. Presumably, thinking through the actual uses of the databases is to be left to the as-yet-unselected experts who will one day be retained.
This puts the virtual cart before the digital horse. The Court cannot find that the major risks and burdens detailed above will be worth bearing on the basis of entirely speculative benefits. Of course, until an analysis is done, a court could never know whether that analysis will prove useful to plaintiffs, or produce proof of constitutional violations. But that is not the test; negative results could be as useful as positive ones to resolving this longstanding (long-running would be a misnomer) litigation. Though plaintiffs do not need to show that obtaining this data will in fact advance their cause, they do have to provide some basis to believe that specific tests can be run that, with a reasonable degree of scientific certainty, can be expected to yield results that would be relevant to the issues before the Court. See, e.g., Waldron v. Cities Services Co., 361 F.2d 671, 673 (2d Cir.1966). The case might be different if the plaintiffs proffered an affidavit from a reputable statistician explaining exactly what hypotheses would be tested, on what theory and using which techniques, and precisely what data would be necessary to perform those tests. At a minimum, such an affidavit would enable defendants' computer experts to address whether and how the kind of data sought by the plaintiffs' statistics experts could be made available from DOCS's computers. On the present record, however, plaintiffs can only be characterized as demanding a huge volume of sensitive data, in a form that may or may not be usable for any productive purpose, on the vague hope that it will prove useful when subjected to future massage by unspecified experts. Plaintiffs, thus, have made little showing of “the importance of the proposed discovery in resolving the issues” before the Court. Fed.R.Civ.P. 26(b)(2)(iii).
*14 Finally, the benefits to be expected from the proposed discovery must be assessed in light of whether “(i) the discovery sought is unreasonably cumulative or duplicative, or is obtainable from some other source that is more convenient, less burdensome, or less expensive; [and] (ii) the party seeking discovery has had ample opportunity by discovery in the action to obtain the records sought.” Fed.R.Civ.P. 26(b)(2)(i), (ii). Though the Rule identifies these as separate grounds on which discovery may be limited, they may productively be considered as part of the burden/benefit analysis. Even if the discovery sought is not so cumulative that it can be dismissed out of hand on that ground alone, the extent to which the discovery is duplicative will bear importantly on the benefits to be expected from the discovery; even if the party cannot be said to have simply forfeited any claim to discovery by prior inaction, the extent to which a party has unreasonably forgone efforts to seek the discovery sought will affect both the burden on the opposing party and the evaluation of whether the party seeking discovery itself believes that significant benefits will accrue. These factors both weigh against compelling discovery in this case.
First, much of the actual data in the databases (to the extent relevant) has already been provided to plaintiffs in documentary form. Plaintiffs cite no concrete facts available in the database, relevant to the litigation, and unprivileged, that are not available in the 700,000 pages of material already provided to them. As repeatedly acknowledged in this opinion, the Court accepts the fact that the form of the databases sought could in theory provide non-cumulative value added to plaintiffs. At the same time, as also discussed above, the defendants are correct that the security burdens on the State will be massively increased by production in this form, such that it can readily be concluded that the document discovery already provided is far “less burdensome [and] less expensive” than the electronic production now sought by plaintiffs.
*15 Second, the expense of additional production has to be weighed in the context of the extraordinarily burdensome and expensive discovery process already undertaken in this case. Defendants have produced nearly three quarters of a million pages of records, and expended the months of personnel hours necessary to locate, prepare, redact, copy and ship those documents. This is no mere clerical task; the redaction process involves skilled work by attorneys and by paralegals and interns acting under their supervision. The Attorney General of New York has hired three full-time clerks to work specifically on this case. (Moody Aff. 2.) While the costs of employees on the public payroll are harder to estimate than the bills paid by litigants represented by lawyers who bill by the hour, the (largely hidden) burden on the public of the discovery process already conducted in this case is disturbing. Plaintiffs' counsel, too, has incurred significant out-of-pocket costs on the case, as well as investing an enormous amount of (putatively very costly) attorney time.
This is relevant in assessing the cost of the requested discovery; that cost should not be assessed in isolation, but in the context of the total cost of discovery to date. It is also relevant to assessing whether the plaintiffs have had “ample opportunity” to obtain the information in question. As noted above, plaintiffs argue that discovery demands served early in this litigation could be construed as covering the databases that are the subject of this motion. But that claim ignores the realities of big-case discovery, which, perhaps unfortunately, often proceeds by the exchange of overbroad discovery demands and blanket objections, followed by concrete negotiations over what records are really necessary and what production the opposing party is really prepared to make without compulsion. Discovery in many cases is already burdened by posturing correspondence and excessive resort to court conferences that vastly increase the friction and cost of the process. Parties should be encouraged to work out differences amicably; for a court to ignore or fail to give effect to the informal accommodations and understandings that develop over time would undermine the spirit of cooperation and informality that alone makes effective discovery and settlement possible in overburdened courts.
Viewed against that realistic background, the record in this case makes absolutely clear that this discovery request is belated. The parties have spent years, significant sums, and exhaustive efforts on discovering and analyzing a mountain of paper, a project that, plaintiffs now claim at the eleventh hour before expiration of the n-th discovery deadline, was largely useless, or at least superseded, because the production of electronic data instead would have accomplished the same thing and more. Plaintiffs may not have identified the specific databases by name until a deposition in December 2001 laid out the details of the DOCS computer system, but they have been aware throughout discovery that DOCS, like any modern enterprise, keeps its records in computerized form. It is undisputed that reports devised from these very databases were introduced into evidence at the Bolton trial in 1997, and many of the reports and records that defendants have disclosed in hard-copy form have self-evidently been produced as printouts from computer files.
*16 Nothing prevented plaintiffs from seeking, either informally through the extensive correspondence and telephonic communication between counsel, or formally by interrogatory, further information about the computer databases available from DOCS. Nothing prevented plaintiffs from discussing with their adversaries in a timely fashion, before the expenditure of hundreds of thousands of dollars on conventional document discovery, whether the mountain of paper was really necessary, or whether a cooperative effort to explore electronic discovery possibilities could have led to a mutually beneficial agreement. Nothing prevented plaintiffs from retaining experts to advise them as to what kinds of electronic data would likely be found in the DOCS computer system, what kinds of data would be necessary to evaluate plaintiffs' claims, and what kinds of security precautions could be presented to the State to satisfy its legitimate interests and induce agreement to cooperate in providing data. Nothing in the record suggests that any such efforts were made, or, indeed, that the plaintiffs took any step to seek discovery of any electronic data until after their adversaries had expended exorbitant sums of public money on conventional discovery, and until the ultimate deadline for completing fact discovery, after seven years of litigation, was at last at hand. Accordingly, the Court is constrained to find that the plaintiffs have had, and let pass, ample opportunity to obtain this information earlier in the discovery process, and that that conclusion strongly supports denying plaintiffs' motion.
The factual findings set forth in this opinion are largely uncontested. The State has compiled an impressive record documenting the technical issues involved in assessing the burdens and risks, and the obstacles to securing the hoped-for benefits, of ordering the discovery sought. Plaintiffs, as noted above, have blithely declined to respond to that presentation, relying on the facially plausible generalities put forth in their initial motion papers. The Court has nevertheless carefully and skeptically reviewed the papers submitted by defendants, only to find them ultimately persuasive and accurate.
Assuming for the sake of argument that the databases sought by plaintiffs are relevant and not privileged, the Court finds that discovery should be denied because defendants have made a compelling showing that the burden of the proposed discovery far outweighs its likely benefit for resolving the issues before the Court, particularly in light of the failure of the plaintiffs to seek such discovery despite ample opportunity to do so in a more timely manner, and the vast amount of material, largely duplicating the contents of the databases now sought, which has already been provided by the defendants. The Court will not impose additional burden, expense and risk of harm on the parties, and especially on the defendants, at this belated hour, after the expenditure of so much effort and expense, and on the undocumented hope of obtaining such speculative benefits.
*17 The motion to compel production is denied.