Cloudflare Is Blocking AI Crawlers by Default

Last year, internet infrastructure firm Cloudflare launched tools enabling its customers to block AI scrapers. Today the company has taken its fight against permissionless scraping several steps further. It has switched to blocking AI crawlers by default for its customers and is moving forward with a Pay Per Crawl program that lets customers charge AI companies to scrape their websites.
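
Cloudflare has said publicly that Pay Per Crawl builds on the long-dormant HTTP 402 “Payment Required” status code. As a rough illustration of the idea only (this is not Cloudflare’s actual implementation; the user-agent list and the X-Crawl-Payment header below are hypothetical), a minimal sketch in Python:

```python
# Illustrative sketch of a pay-per-crawl gate: requests from known AI
# crawlers get HTTP 402 (Payment Required) unless they present payment
# credentials. The "X-Crawl-Payment" header and the UA substrings are
# hypothetical stand-ins, not Cloudflare's real protocol.
from http.server import BaseHTTPRequestHandler, HTTPServer

AI_CRAWLERS = ("GPTBot", "ClaudeBot", "CCBot")  # illustrative UA substrings

class PayPerCrawlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        is_ai_bot = any(name in ua for name in AI_CRAWLERS)
        paid = self.headers.get("X-Crawl-Payment") is not None  # hypothetical header
        if is_ai_bot and not paid:
            self.send_response(402)  # Payment Required
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Payment required to crawl this site.\n")
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Page content.\n")

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), PayPerCrawlHandler).serve_forever()
```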

Web crawlers have trawled the internet for information for decades. Without them, people would lose vitally important online tools, from Google Search to the Internet Archive’s invaluable digital preservation work. But the AI boom has produced a corresponding boomlet in AI-focused web crawlers, and these bots scrape web pages with a frequency that can mimic a DDoS attack, straining servers and knocking websites offline. Even when websites can handle the heightened activity, many do not want AI crawlers scraping their content, especially news publications that are demanding that AI companies pay to use their work. “We’ve been feverishly trying to protect ourselves,” says Danielle Coffey, the president and CEO of the trade group News Media Alliance, which represents several thousand North American outlets.

So far, over 1 million customer websites have activated Cloudflare’s older AI-bot-blocking tools, the company’s head of AI control, privacy, and media products, Will Allen, tells WIRED. Now millions more will have the option of keeping bot blocking as their default. Cloudflare also says it can identify “shadow” scrapers that AI companies do not publicize, using a proprietary combination of behavioral analysis, fingerprinting, and machine learning to distinguish AI bots from “good” bots.

A widely used web standard called the Robots Exclusion Protocol, usually implemented through a robots.txt file, lets publishers signal which bots may crawl their sites on a case-by-case basis. But compliance is voluntary rather than legally required, and there’s plenty of evidence that some AI companies try to evade efforts to block their scrapers. “Robots.txt is ignored,” Coffey says. According to a report from the content licensing platform TollBit, which offers its own marketplace for publishers to negotiate with AI companies over bot access, AI scraping is still on the rise, including scraping that ignores robots.txt: TollBit found that more than 26 million scrapes bypassed the protocol in March 2025 alone.
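
For a sense of how the protocol works in practice, here is a minimal sketch of the check a well-behaved crawler performs before fetching a page, using Python’s standard-library robots.txt parser (the “ExampleAIBot” agent name is hypothetical):

```python
# A well-behaved crawler fetches and parses a site's robots.txt, then
# checks whether its user agent is allowed to request a given URL.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()  # fetch and parse the site's robots.txt

# can_fetch() returns True only if the rules permit this agent to
# request the given URL; "ExampleAIBot" is a hypothetical agent name.
if parser.can_fetch("ExampleAIBot", "https://example.com/articles/"):
    print("robots.txt permits crawling this path")
else:
    print("robots.txt disallows crawling this path")
```

The key point is that this check is purely advisory: nothing stops a scraper from skipping it entirely, which is the behavior the TollBit figures above capture.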

In this context, Cloudflare’s shift to blocking by default could prove a significant roadblock to surreptitious scrapers and could give publishers more leverage to negotiate, whether through the Pay Per Crawl program or otherwise. “This could dramatically change the power dynamic. Up to this point, AI companies have not needed to pay to license content, because they’ve known that they can just take it without consequences,” says Atlantic CEO (and former WIRED editor in chief) Nicholas Thompson. “Now they’ll have to negotiate, and it will become a competitive advantage for the AI companies that can strike more and better deals with more and better publishers.”

AI startup ProRata, which operates the AI search engine Gist.AI, has agreed to participate in the Pay Per Crawl program, according to CEO and founder Bill Gross. “We firmly believe that all content creators and publishers should be compensated when their content is used in AI answers,” Gross says.

Of course, it remains to be seen whether the big players in the AI space will participate in a program like Pay Per Crawl, which is in beta. (Cloudflare declined to name current participants.) Companies like OpenAI have struck licensing deals with a variety of publishing partners, including WIRED parent company Condé Nast, but specific details of those agreements, including whether they cover bot access, have not been disclosed.

Meanwhile, there’s an entire online ecosystem of tutorials aimed at web scrapers explaining how to evade Cloudflare’s bot-blocking tools, and those efforts will likely continue as the blocking default rolls out. Cloudflare emphasizes that customers who do want to let the robots scrape unimpeded will be able to turn off the blocking setting. “All blocking is fully optional and at the discretion of each individual user,” Allen says.
