Late last year, The New York Times and other major publishers sued OpenAI, accusing it of using their published materials without permission to train its neural networks. It has now emerged that OpenAI engineers accidentally deleted data that could have served as evidence of copyright infringement by the AI developer.
According to the report, lawyers for the news outlets spent more than 150 hours examining the data OpenAI uses to train its neural networks. Their goal was to find cases where copyrighted news articles were used to train AI algorithms. It is not known exactly what information the OpenAI engineers deleted. The company admitted the mistake and tried to restore the data, but could not recover it in full. The data that was recovered cannot be used to determine whether the publishers’ articles were used to train the neural networks. OpenAI’s lawyers described the data deletion as a “glitch,” and The New York Times said it had “no reason to believe” it was done intentionally.
Last December, The New York Times accused OpenAI and its largest partner, Microsoft, of building their AI algorithms by “copying and using millions of articles” from the publication. The newspaper is demanding that OpenAI be held liable for “billions of dollars in statutory and actual damages” for allegedly copying its articles. The New York Times has already spent more than $1 million fighting OpenAI in court. Meanwhile, OpenAI has negotiated and signed agreements with other publishers, such as Axel Springer, Condé Nast and Vox Media. This suggests that many publishers prefer collaboration to litigation.