In re Google RTB Consumer Privacy Litig.
Case No. 21-cv-02155 (N.D. Cal. 2022)
December 16, 2022

Demarchi, Virginia K., United States Magistrate Judge

Failure to Produce
Proportionality
Summary
The Court ordered Google to produce documents, including ESI, sufficient to show the “verticals” data fields it shared with RTB participants during the class period, as well as any information that would permit RTB participants to infer information about the account holder based on verticals linked with the account holder. This ESI is necessary to show the data fields Google shared with RTB participants for the purpose of helping participants target their ads.
IN RE GOOGLE RTB CONSUMER PRIVACY LITIGATION
Case No. 21-cv-02155-YGR (VKD)
United States District Court, N.D. California
Filed December 16, 2022
Demarchi, Virginia K., United States Magistrate Judge

ORDER RE NOVEMBER 21, 2022 DISCOVERY DISPUTE RE NAMED PLAINTIFF DATA
REDACTED VERSION

Re: Dkt. Nos. 365, 367

Plaintiffs and defendant Google LLC (“Google”) ask the Court to resolve a dispute concerning whether Google has complied with its obligations to produce named plaintiff data. Dkt. No. 365. In connection with this dispute, the parties provided excerpts from discovery produced in the Calhoun litigation,[1] which plaintiffs contend support their position. See Dkt. No. 367. The Court held a hearing on this dispute on December 6, 2022.[2] Dkt. No. 376.

For the reasons explained below, the Court orders Google to produce information concerning “verticals” and requires the parties to confer further regarding time sampling.

I. BACKGROUND

This dispute concerns Google’s compliance with the Court’s August 26, 2022 order resolving the parties’ earlier dispute regarding plaintiffs’ discovery of named plaintiff data. Dkt. No. 314.[3] In that order, the Court ordered Google to produce documents sufficient to show several categories of data relating to the named plaintiffs. As to some disputed categories of information, the Court directed the parties to confer further and, if appropriate, to comply with the procedures the Court established for considering materials from the Calhoun litigation.

Plaintiffs say that Google’s production of named plaintiff data is incomplete. They argue that Google has chosen to produce data principally from the [REDACTED], but neglected other sources that contain unique data about information Google shares with RTB participants. Dkt. No. 365 at 2. Specifically, plaintiffs argue that Google should be required to search for and produce named plaintiff data from six other data logs, referred to collectively as the [REDACTED] Id. at 1-3. In support of their position, plaintiffs separately describe and attach examples of these other data logs that were produced in the Calhoun litigation. See Dkt. No. 367.

Google says that it has complied with the Court’s August 26 order to produce documents sufficient to show the named plaintiffs’ sign-up information, consent-related data, settings, “My Activity” records (including web-browsing history), and other records that are “responsive to the ‘Linking account information,’ ‘Ads shown,’ and ‘Cookie-matching’ sections” of the order. Dkt. No. 365 at 4-5. At the hearing, Google represented that it would produce additional data by the week of December 12, 2022, following the expiration of a contractually-required notice period. Id. at 5; Dkt. No. 376. Google says that it has produced named plaintiff data “from six different time periods and five log sources,” including the [REDACTED] and some of the [REDACTED] Dkt. No. 365 at 5. Google vigorously disputes plaintiffs’ characterization of the information produced in the Calhoun litigation and disagrees that it includes unique information relevant to this case. See Dkt. No. 367.

II. DISCUSSION

This dispute requires the Court to address Google’s compliance with the August 26 order. This task is complicated by the fact that plaintiffs’ portion of the joint submission does not refer to what the Court has ordered and where Google’s production falls short. Instead, plaintiffs focus on the alleged superiority of the [REDACTED] as a source of responsive information. At the hearing, plaintiffs elaborated on their positions. First, plaintiffs say that Google has not produced certain data fields that reflect information about the named plaintiffs that is shared with RTB participants in violation of the Court’s August 26 order. Second, plaintiffs say that Google has improperly limited its production to data for only six weeks from the relevant time period. Third, plaintiffs argue that Google has produced data in a manner that destroys its context and structure. Fourth, plaintiffs argue that Google should be ordered to produce information from [REDACTED] that relates to information shared with RTB participants.

The Court considers each of these arguments.

A. Missing categories of information

Plaintiffs’ portion of the joint submission does not identify any categories of information missing from Google’s production. However, in the accompanying submission addressing materials from the Calhoun litigation, plaintiffs identify, for each of [REDACTED], the specific fields available in those logs that they say Google has withheld from production. See Dkt. No. 367 at 1-3, Exs. 1-5. For most of these fields, Google responds that it has already produced (or will produce) the information, or that the field corresponds to information Google maintains internally and does not share with RTB participants. See id. at 3-5. The exception appears to be data fields that have to do with what the parties refer to as “detected verticals” or “targeting verticals,” which the Court addresses below.

Plaintiffs have no response to Google’s argument that [REDACTED] contain no information responsive to the August 26 order that Google has not already produced (or committed to produce) from other sources, other than to say that they do not believe Google’s representations. Plaintiffs acknowledge that the discovery they have obtained in Calhoun does not show that Google shares the data fields they highlight in [REDACTED] with RTB participants. Nevertheless, they suspect Google is withholding responsive information from production. For purposes of discovery in this case, plaintiffs essentially argue that it is not enough for the Court to order Google to produce documents sufficient to show all of the data Google shares with RTB participants (as the Court has already done), but that the Court should instead order Google to produce all of the data fields from the [REDACTED] for each of the named plaintiffs and permit plaintiffs to explore their suspicions in discovery. Plaintiffs do not dispute Google’s representation that [REDACTED] contain [REDACTED], the vast number of which are not relevant to this litigation. See Dkt. No. 365 at 6. The Court has already considered and rejected an approach to discovery that requires Google to produce all information it collects about the named plaintiffs, regardless of whether that information is disclosed to a third party in an RTB auction. See Dkt. No. 269 at 5; Dkt. No. 314 at 4. Nothing in the parties’ Calhoun submission causes the Court to reconsider that decision. Plaintiffs simply have not shown that wholesale production of all data fields from [REDACTED] is relevant or proportional to the needs of this case.

However, Google has failed to comply with the Court’s August 26 order in one important respect. It is undisputed that during the relevant class period, Google shared at least one data field with RTB participants for the purpose of helping participants target their ads. In the operative complaint, plaintiffs allege that Google collects personal and sensitive information about account holders, groups them into interest-based categories called “verticals,” and then shares the verticals (or information about the verticals) with RTB participants. Dkt. No. 80 ¶¶ 16, 144-150. Google acknowledges that during a portion of the class period it shared a data field called [REDACTED] with RTB participants but says that the field was deprecated and not shared beginning in approximately February 2020.[4] Google has not produced documents sufficient to show the detected verticals shared with respect to each of the named plaintiffs, nor has it investigated whether the information can be produced from any existing data sources. Moreover, after much discussion at the hearing, the Court now understands that plaintiffs believe Google presently uses verticals derived from account holders’ personal and sensitive information in connection with RTB auctions, even if Google does not expressly disclose a “vertical” data field to an RTB participant. Specifically, plaintiffs contend that Google uses the verticals it maintains to curate bid requests presented to its ad customers to enable those customers to target account holders associated with particular verticals, or that Google otherwise conducts the RTB auction process in a manner that facilitates the targeting of account holders based at least in part on verticals linked to account holders. Plaintiffs say that Google’s production must include documents sufficient to show both the explicit sharing of verticals associated with the named plaintiffs and any implicit sharing that facilitates ad targeting. The Court agrees.

Google must produce documents sufficient to show for each named plaintiff the “verticals” data fields it shared with RTB participants during the class period. In addition, Google must produce documents sufficient to show for each named plaintiff the information, if any, that it provided to RTB participants that would permit those participants to infer information about the account holder based on verticals linked with the account holder.

B. Time sampling

Google says that it has produced six weeks’ worth of named plaintiff data from different months within the relevant class period for the relevant data sources that required notice to ad customers. According to Google, because it is contractually-required to provide notice to each of the ad customers whose data is encompassed by the production, if sampling is not permitted, it would be required to provide notice to thousands of customers. Plaintiffs dispute that production of data for the full time period would be unduly burdensome for Google, describing Google’s selection of time periods as “arbitrary.”

The Court accepts Google’s representations that for data sources that require notice to ad customers, production of named plaintiff data for the entire class period would impose an undue burden, and some kind of sampling is appropriate. For data sources that do not require notice to ad customers, the Court is not persuaded that production of named plaintiff data for the entire class period would impose an undue burden. For data sources that require notice to ad customers, if Google wishes to rely on a sample of responsive data to satisfy its obligations, it must take steps to ensure that the sample is representative such that plaintiffs can rely on it in making the arguments they wish to make for class certification. At the hearing, the parties indicated that they had not considered or discussed whether Google’s production was a representative sample. If a dispute remains about Google’s time sampling, the parties must confer on this point. At a minimum, Google must explain to plaintiffs how it selected the six weeks it chose for production, and if plaintiffs disagree about whether those six weeks are representative, they must explain to Google why they disagree.

If a dispute remains after the parties confer, they may submit that dispute to the Court for decision.

C. Context and structure of production data

While plaintiffs acknowledge that the named plaintiff data in question is stored in databases, they object that Google’s production of only certain fields from the databases eliminates context and structure from the data that makes it difficult to use. Plaintiffs explained at the hearing that this is why they want Google to produce all data fields contained in the [REDACTED] logs for each named plaintiff. Dkt. No. 376. Google responds that neither the Court’s August 26 order nor any other rule of discovery requires them to produce thousands of irrelevant data fields simply because those fields are stored in the same database as relevant data fields. Id.

During the hearing, Google explained that for each named plaintiff it queried the relevant databases using data fields corresponding to the categories of information in the August 26 order, and then output the results of the queries to a file. Google represents that the resulting files include the names of the data fields and their corresponding values. This is how production of responsive data from a database commonly occurs—run queries on relevant data fields, output results. See, e.g., Sarieddine v. D&A Distribution, LLC, No. 17-CV-2390-DSF-SKX, 2018 WL 5094937, at *3 (C.D. Cal. Apr. 6, 2018); Zysman v. Zanett Inc., No. C-13-02813 YGR (DMR), 2014 WL 1320805, at *2 (N.D. Cal. Mar. 31, 2014). Plaintiffs’ claim that the data Google has produced lacks context and structure and is therefore unusable is not persuasive. Plaintiffs do not contradict Google’s explanation of how it produced the data, and they do not show or explain how Google’s production using that methodology prevents plaintiffs from understanding and using the data. As explained above, the Court is not convinced that Google must produce the entire contents of a particular database or data log in order to comply with the August 26 order.

D. Related information

On several occasions during the hearing, plaintiffs appeared to advocate for production of a broader scope of information “related to” information Google shares with RTB participants, or “in furtherance of” the RTB auction process. Such information is beyond the scope of the Court’s August 26 order, which requires Google to produce documents sufficient to show specific categories of information.

III. CONCLUSION

The Court orders as follows:

1. Google must produce “verticals” information described in section II.A above by January 31, 2023.

2. The parties must confer promptly about the time sampling issue in section II.B, but no later than January 13, 2023, and may submit any remaining dispute for resolution.

The Court denies all other relief requested.

IT IS SO ORDERED.

Dated: December 16, 2022

Footnotes

[1] Calhoun et al. v. Google LLC, No. 20-5146.
[2] Because the hearing involved detailed discussion of Google’s confidential information, the Court conducted the hearing under seal.
[3] Google’s portion of the joint submission expressly refers to the Court’s prior order. Plaintiffs’ portion does not refer to the Court’s prior order or to any specific discovery request. At the hearing, plaintiffs confirmed that the relevant discovery request is plaintiffs’ RFP No. 42, which was one of the requests addressed in the Court’s August 26 order.
[4] At the hearing, Google explained that the [REDACTED] field refers to the web site the account holder happened to be visiting at the time of the auction (e.g., a website about motorcycles) and does not refer to categorization of the account holder herself. See also Dkt. No. 80 ¶¶ 148-149 (discussing different kinds of verticals). According to Google, [REDACTED]. Dkt. No. 376.