The Internet’s largest sites have banned Apple from collecting their data for AI training

One of the data sources for training generative artificial intelligence systems is the publicly available web. Apple gave site owners the option to opt out of data collection for training its Apple Intelligence system, and many of the largest sites have taken it. These include Facebook and Instagram, as well as major news and media outlets such as The New York Times and The Atlantic.

For the past few years, Apple has operated a web crawler called Applebot, and the data it collects is used to train Siri and Spotlight search. More recently, the company began using Applebot data for Apple Intelligence as well. This is a controversial practice, since modern AI takes liberties with copyrighted material: in narrow fields where little source material exists, systems reproduce entire paragraphs almost unchanged.

Apple says it collects information ethically: it filters out personal data and uses only licensed materials and publicly available data gathered by the Applebot crawler. To let webmasters opt out of AI training alone, the company introduced a separate user agent name, Applebot-Extended. Blocking this name leaves standard search indexing in place.

Opting out means adding the appropriate directive to the site's robots.txt file. Because that file is publicly accessible, anyone can see which publishers have blocked access for Apple Intelligence. According to Wired magazine, Facebook, Instagram, Craigslist, Tumblr, The New York Times, the Financial Times, The Atlantic, Vox Media, the USA Today Network and Condé Nast have done so. Just over a quarter of major American news sites (294 out of 1,167) have refused Apple's AI access, according to journalist Ben Welsh.
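As a rough illustration of how anyone can check such a block, the sketch below parses a robots.txt file with Python's standard `urllib.robotparser` module and asks whether a given user agent may fetch a page. The sample file contents, the `example.com` URL and the helper name `is_allowed` are assumptions made for this example and are not taken from any real publisher. The result reflects how Python's general-purpose parser reads the rules, which may differ in detail from how Apple interprets them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only:
# the site blocks Applebot-Extended and says nothing about other crawlers.
SAMPLE_ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"  # placeholder URL
    for agent in ("Applebot", "Applebot-Extended"):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

With these sample rules, the regular Applebot search crawler remains allowed while Applebot-Extended is blocked, which matches the opt-out behavior described in the article. To inspect a live site, the same parser can load the file over the network with `set_url()` and `read()` in place of `parse()`.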

According to unconfirmed reports, Apple has struck deals with some media companies, paying them for the right to use their materials to train AI. This may explain why other sites are holding out: they may simply be waiting to be paid.

admin

