The large-scale collection of openly available text to train large language models has long irritated copyright holders, and media companies, which are particularly sensitive to the issue, go to court to defend their copyrights at the first opportunity. Now it’s the turn of Canadian publishers to sue OpenAI.
As Bloomberg reports, Torstar Corp., Postmedia Network Canada Corp., Globe and Mail Inc., Canadian Press and CBC/Radio-Canada have filed a lawsuit in the Ontario Superior Court. They are seeking unspecified damages for OpenAI’s use of text materials they published to train its language models. The plaintiffs’ statement said: “OpenAI earns and benefits from the use of this content without obtaining official permission or compensating copyright holders.” According to the plaintiffs, their publications account for the bulk of content produced by journalists in Canada.
OpenAI representatives countered that the company’s models are trained on “publicly available data according to the principles of fair use and taking into account internationally recognized principles of copyright protection.” According to OpenAI, it also works with news publishers to develop content display and curation practices that suit them, striving to credit original sources and, where necessary, remove material at the request of copyright holders.
At the end of last year, The New York Times sued OpenAI along with Microsoft Corporation, a close partner of the startup. The defendants were accused of using millions of the plaintiff’s copyrighted articles to train their language models. News Media Canada CEO Paul Deegan, for his part, accused OpenAI of “ripping off journalists while illegally and unjustifiably enriching itself in a significant way.”