Wikipedia is paying the price for the AI boom: the online encyclopedia is facing rising costs from bots that copy its articles to train AI models, consuming resources and sharply increasing traffic and load on the site. In the last three months alone, traffic generated by AI crawlers has increased by 50%.
The Wikimedia Foundation (the nonprofit that runs Wikipedia) said that “automated requests for our content have grown exponentially.” According to the foundation, the bandwidth used to download media content has increased by 50% since January 2024. This growth is coming not from human readers but from automated programs that continually download openly licensed images to feed AI models.
“Our infrastructure is designed to withstand sudden surges of human traffic during high-interest events, but the volume of traffic generated by scraper bots is unprecedented and presents growing risks and costs,” Wikipedia said.
Bots often scrape less popular Wikipedia articles, which are less likely to be cached and are therefore more expensive to serve. According to the foundation, at least 65% of this most resource-intensive traffic comes from bots, a disproportionate share given that bots account for only about 35% of total page views. Bots also target “key systems in developer infrastructure, such as our code review platform or our bug tracker,” which further strains the site’s resources.
Wikipedia has been forced to impose rate limits on individual AI bots or ban some of them altogether. To address the problem in the long term, the foundation is developing a plan called “Responsible Use of Infrastructure.” The plan involves collecting feedback from the Wikipedia community on how to identify traffic from AI bots and filter their access.
Social media platform Reddit faced a similar problem in 2023. For example, Microsoft used the platform’s data to train AI models without notifying Reddit, which forced Reddit to block Microsoft’s bots. After this incident, Reddit decided to charge third-party developers for access to its API. This led to massive developer protests and the closure of some popular Reddit forums.