Meta’s Legal Maneuvering in Copyright Controversy
Meta is embroiled in yet another controversy over training its language models on pirated books. In response to a class-action lawsuit from authors, the company has adopted an unexpected legal strategy: it asserts that the distribution of pirated books via BitTorrent qualifies as fair use.
The background of the case reveals that Meta utilized so-called shadow libraries, particularly the aggregator Anna’s Archive, to amass large volumes of text for training its Llama language model. The nature of the BitTorrent protocol means that users downloading content simultaneously share it with others, implicating the company not only in the downloading of pirated books but also in their distribution.
A California court has already issued a partial ruling in favor of Meta, determining that the use of pirated books for training large language models (LLMs) falls under the doctrine of fair use. However, the question of copyright infringement through downloading and distribution via BitTorrent remains unresolved.
In its filing, Meta argues that it used BitTorrent because it was a more efficient and reliable method for obtaining datasets and that, in the case of Anna’s Archive, these datasets were available in bulk only through torrent downloads. To the extent that the plaintiffs can provide evidence that their works, or portions of them, were theoretically accessible to other users on the BitTorrent network during the download process, Meta contends that this was an integral part of downloading the plaintiffs’ works for the purpose of transformative fair use.
The authors who filed the lawsuit have reacted negatively to this argument, particularly since the document was submitted on a Friday, the last permissible day for disclosure. The plaintiffs’ attorneys point out that Meta has been aware of the allegations regarding BitTorrent downloads since November 2024, yet the company never mentioned this fair use defense even when directly prompted by the court.
According to the plaintiffs’ attorneys, Meta (for obvious reasons) has not once hinted at asserting a fair use defense concerning the file-sharing claims, even after the court raised this issue with Meta in November 2024. In their view, Meta appears to be attempting to create a loophole and evade the discovery process regarding this line of defense.
In its defense, Meta references a December 2024 case management statement where this argument was allegedly mentioned, insisting that the opposing counsel brought up the topic during a hearing shortly thereafter.
According to Meta, the plaintiffs’ assertion that the company has never indicated an intention to claim fair use concerning the file-sharing allegations, including after the November 2024 hearing, is false.
Additionally, the company argues that each author involved in the lawsuit has acknowledged that they are unaware of any instance where the Llama model reproduced content from their books in its responses. From this, Meta concludes that in the absence of evidence, the lawsuit targets not the protection of specific works but rather the very process of training, which the court has already deemed lawful.
The ultimate decision now rests with the judge, who must determine the admissibility of such a defense in this ongoing legal battle.