Illman, Robert M., United States Magistrate Judge
Plaintiffs,
v.
OPENAI, INC., et al.,
Defendants
ORDER RE: PLAINTIFFS’ MOTION TO COMPEL PRODUCTION FROM NON-PARTY REUTERS NEWS AND MEDIA, INC. Re: Dkt. No. 314
Now pending before the court is Plaintiffs’ Motion (dkt. 314) to Compel Production of Documents from Non-Party Reuters News and Media, Inc. (hereafter, “Reuters”); Reuters has responded (dkt. 348); and, Plaintiffs have filed a reply (dkt. 367). Having reviewed the Parties’ submissions, pursuant to Federal Rule of Civil Procedure 78(b) and Civil Local Rule 7-1(b), the court finds the matter suitable for disposition without oral argument. For the reasons stated below, Plaintiffs’ motion is denied.
Plaintiffs begin by contending that in order to prosecute this case and to respond to OpenAI’s fair use defense, they require information from Reuters that they cannot get from OpenAI. See Pls.’ Mot. (dkt. 314) at 2. Plaintiffs believe that “Reuters possesses information relevant to the existence of a market for copyrighted content as training data for LLMs [because] [o]n October 24, 2024, it was reported that Reuters had entered into a multi-year licensing agreement with Meta Platforms Inc. for its news content to be used in Meta’s AI ChatBot [which, according to a blog referenced by Plaintiffs, is] one of only a handful of AI data licensing agreements in the LLM training data market.” Id. In essence, Plaintiffs seek to obtain the agreement between Reuters and Meta Platforms, Inc. (hereafter, “Meta”) because of the notion that those documents are relevant to OpenAI’s anticipated fair use defense, “namely the effect of [OpenAI’s] theft of copyrighted material on the market for copyrighted content as training data for LLMs, and to assist in valuing the copyrighted works for use as training data, including developing a damages methodology.” Id. More specifically, Plaintiffs’ subpoena seeks three categories of documents from Reuters: (1) executed AI training data licensing agreements; (2) AI training data licensing agreement negotiations and valuations; and, (3) specific content for AI training. Id. at 3. Plaintiffs argue that Reuters has resisted compliance with their subpoena demands through meritless boilerplate objections, many of which Plaintiffs characterize as “frivolous.” Id. at 4.
As to relevance, Plaintiffs submit that prying into the confidential business affairs of Reuters is justified here for two principal reasons: (1) their reported relevance to Plaintiffs’ claims of direct copyright infringement by OpenAI; and, (2) their importance and necessity for countering OpenAI’s fair use defense. Id. Regarding the first reason, Plaintiffs state that “[t]he documents sought from Reuters via the subpoena will help establish the existence of a market for copyrighted work as training data for LLMs because they will provide evidence of benchmark licensing agreements and market demand[1] . . . [and that the] information is also relevant to calculating actual damages and recovery of lost profits.” Id. at 4-5. In short, Plaintiffs essentially ask the undersigned to order Reuters to divulge the details of its confidential business arrangements with Meta simply so that Plaintiffs can show that they “would have [otherwise] been able to license their works at a competitive price.” Id. at 5. Plaintiffs contend that Reuters’ “internal valuation memos can provide evidence of potential lost revenue due to infringement and would likely contain projections and analyses of potential licensing income” which might “support their claims for lost profits.” Id. Plaintiffs add that the negotiations surrounding the Reuters/Meta licensing deal might “provide evidence of the license’s value, helping to quantify [Plaintiffs’] losses [in this case].” Id.
As to Plaintiffs’ efforts to counter OpenAI’s anticipated fair-use defense, Plaintiffs contend that this discovery would relate to “the fourth factor in establishing such defense (the effect of the use upon the potential market for or value of the copyrighted work)[,] [which] requires consideration of whether the unlicensed use of the work undermines the market or licensing opportunities for it.” Id. at 5. In other words, Plaintiffs argue that “[i]f a licensing market exists and is negatively impacted by OpenAI’s theft of [Plaintiffs’] data, [which] weighs against a finding of fair use,” that the opening up of the files documenting the business relationship between Reuters and Meta “will aid Plaintiffs in this analysis.” Id. at 5-6. It is, of course, unclear why Plaintiffs believe that the particulars of their non-news content case will be materially illuminated by a licensing deal for news content.
Plaintiffs report that Reuters has resisted compliance with their subpoena request on grounds of overbreadth, vagueness, undue burden, and confidentiality. Id. at 6-8. By way of response, Reuters argues that – as a third-party – it is entitled to protection from undue burden and production of confidential information where a substantial need is not demonstrated. See Reuters’ Opp. (dkt. 348) at 5-7. Reuters also reports that, by way of an offered compromise that would have narrowed the scope of Plaintiffs’ demands, it “offered to produce all executed agreements related to AI training data (which would include the Meta Agreement) and, to the extent they exist, documents and communications with third parties related to all non-executed, proposed, or in pipeline licensing agreements related to AI training data where there was any reasonable chance of aligning on the financial or licensing terms,” but that Plaintiffs’ counsel rejected its offer. Id. at 7. Reuters then adds that “Plaintiffs’ counsel [even] declined Reuters’ request that the date for [its] opposition be adjourned to permit the parties to discuss the resolution of the Motion.” Id. at 8.
In ruling on Plaintiffs’ request to compel this information, the undersigned will note several facts. First, nowhere in Plaintiffs’ motion is there any explanation as to why the essence of the information it seeks cannot be garnered either from OpenAI, or through expert witness testimony, or by some other less intrusive means. That is, Plaintiffs have not explained why prying into Reuters’ commercial relationship with Meta is the only way to “establish the existence of a market for copyrighted work as training data for LLMs . . . [such as to] provide evidence of benchmark licensing agreements and market demand . . . [so that Plaintiffs can] calculat[e] actual damages and recovery of lost profits.” See Pls. Mot. at 4-5. In their reply brief, Plaintiffs mention (for the first time, and in a rather conclusory fashion) that they “seek relevant evidence not otherwise obtainable from OpenAI about the market for textual data[2] as training data for Large Language Models.” See Pls. Reply (dkt. 367) at 3. Plaintiffs then note that they have sought licensing deals for AI training data from OpenAI, but they do so without any mention as to the outcome of having made that request (i.e., without explaining what happened – did OpenAI hand over a trove of documents? Did OpenAI claim it had nothing? Did OpenAI simply ignore the request?). See id. at 3 n.2.
“The determination of substantial need is particularly important in the context of enforcing a subpoena when discovery of trade secret or confidential commercial information is sought from non-parties.” See Gonzales v. Google, Inc., 234 F.R.D. 674, 685 (N.D. Cal. 2006) (citing Mattel Inc. v. Walking Mt. Prods., 353 F.3d 792, 814 (9th Cir. 2003)). The undersigned will first note that, in this context, the “substantial need” showing must be concrete, not speculative as is the case here. See, e.g., Waymo LLC v. Uber Techs., Inc., 2017 U.S. Dist. LEXIS 132721, *10 (N.D. Cal. Aug. 18, 2017) (“. . . that Uber ‘maybe’ will need discovery from other companies is not even close to a ruling that Uber has shown a substantial need for compelling the non-parties’ trade secrets.”). Second, the “substantial need” showing must be just that, substantial. See, e.g., Cameron v. Apple Inc. (In re Apple Iphone Antitrust Litig.), 2021 U.S. Dist. LEXIS 25194, *12 (N.D. Cal. Jan. 26, 2021) (“In ruling on discovery issues, and in particular to assess ‘substantial need,’ the Court must sometimes dip its toe into the merits of the case, which inform the consideration of how relevant the requested documents [really] are.”) (emphasis added).
Here, the court finds that Plaintiffs have sought what is undeniably confidential commercial information (if not trade secrets) from one non-party, Reuters, about its relationship with another non-party, Meta. The court further finds that while Plaintiffs have made a clear case that OpenAI does not possess the particular documents at issue here – that is, those underlying the business affairs between Reuters and Meta – what Plaintiffs have not done is to even mention (let alone convincingly establish) why the essence of this information (i.e., its upshot or import, which is merely information about the existence of a market for licensing copyrighted “textual data” to train an AI model) cannot be either obtained from OpenAI, or by expert witness testimony, or by any other means that would not involve prying into the confidential commercial information of two non-parties. Moreover, the court finds that the marginal relevance of the information Plaintiffs seek is substantially outweighed by the burden its production would pose to the non-parties whose rights would be implicated by that production. At bottom, the court finds that Plaintiffs have not demonstrated a substantial need for the discovery they seek, and that the requested discovery is not proportional to the needs of the case. Accordingly, Plaintiffs’ Motion (dkt. 314) is DENIED.
IT IS SO ORDERED.
Footnotes