Reddit continues to fight web crawlers that use the platform’s content for free to train neural networks. According to the source, over the past few weeks Reddit’s administrators have changed the site’s robots.txt file, which tells bots which sections of the site they may crawl. As a result, community content and user comments no longer appear correctly in many search engines.
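To illustrate how a robots.txt file can admit one crawler while turning away all others, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules, the bot names `AllowedBot` and `OtherBot`, and the example.com URL are all hypothetical and chosen only for illustration; they are not Reddit's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; this is NOT Reddit's real file.
# One named crawler is allowed everywhere, every other crawler is disallowed.
ROBOTS_TXT = """\
User-agent: AllowedBot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/r/some_community/"  # placeholder URL
    for agent in ("AllowedBot", "OtherBot"):
        print(agent, can_crawl(agent, url))
```

Note that robots.txt is only a request: it works because well-behaved crawlers choose to check it before fetching pages.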

Image source: redditinc.com

The report states that only Google currently displays search results for recent Reddit posts correctly. Other search engines, such as Bing and DuckDuckGo, handle similar queries incorrectly: they either fail to find the pages users are looking for or show only some of them. Google is probably unaffected because of an earlier agreement under which the search giant will pay Reddit $60 million a year to use the site’s content to train its own AI algorithms.

Reddit, however, denied that the deal with Google had any bearing on who is permitted to use the platform’s content to train neural networks. “This is completely unrelated to our recent partnership with Google. We negotiated with several search engines. We could not reach an agreement with everyone because some are unable or unwilling to make any promises regarding their use of Reddit content, including for training artificial intelligence,” a Reddit representative said.

For a site as large as Reddit, blocking the crawlers of major search engines is a bold move, but not an unexpected one. Over the past year, the site’s administrators have become far more active in protecting user-published content as they try to open a new source of revenue and attract investors. The company raised the price of Reddit API access for third-party developers and threatened to block Google’s search engine if the company did not stop using the platform’s content for free to train its neural networks.
