
Mark Zuckerberg personally approved training Llama AI models on pirated materials

Meta CEO Mark Zuckerberg personally authorized the Meta team that develops the Llama artificial intelligence models to train them on a dataset containing illegally obtained books and articles. This came to light in documents filed in writer Richard Kadrey’s lawsuit against Meta.


The case is one of several in which tech giants developing AI systems are accused of training models on copyrighted material without the authors’ permission. Defendants have typically argued that their actions qualify as fair use, a doctrine that permits copyrighted material to be used without permission to create new works and products that differ substantially from the original. Many copyright holders reject this position.

A new batch of unsealed documents includes testimony from Meta representatives: it turns out that Mark Zuckerberg personally approved the company’s use of the LibGen dataset to train Llama. LibGen, which bills itself as a link aggregator, in practice provides access to copyrighted works owned by major publishers. The project has been sued repeatedly, ordered to pay tens of millions of dollars for copyright infringement, and ordered to shut down. According to the documents, Zuckerberg approved the use of LibGen to train at least one Llama model despite concerns raised by Meta employees and management. The filing cites an internal memo noting that the use of LibGen was approved after “escalation to MZ,” an abbreviation that apparently refers to the head of the company.


On January 8, the plaintiff’s side filed a statement with the court containing new allegations. In particular, it alleges that Meta tried to conceal its use of LibGen materials: Meta engineer Nikolay Bashlykov allegedly wrote a script that removed copyright information from books in the training dataset. Meta also allegedly stripped copyright notices and related metadata from scientific journal articles in the dataset. Moreover, the plaintiff claims that Meta infringed copyright by downloading the LibGen dataset over the BitTorrent protocol: in doing so, the company not only downloaded the data but also simultaneously “distributed” it, in effect distributing pirated materials. Ahmad Al-Dahle, Meta’s head of generative AI, gave permission to download the LibGen data via BitTorrent, even though Bashlykov had warned that this “may not be legally permissible.”

The case is still far from over. For now, it concerns only early Llama models, not the latest releases. If Meta convinces the court that its use of the materials was fair use, the court may side with the company: in 2023, several plaintiffs failed to prove copyright infringement, and their claims against Meta were dismissed.

