Technology
Published: 3 July 2026 at 00:37

Cloudflare to Automatically Block Mixed-Use Web Crawlers Serving AI Companies

Cloudflare announced it will automatically block mixed-use web crawlers that index websites for search engines and simultaneously collect data for AI training, starting September 2026, giving website owners more control over their content.

Cloudflare, a leading web infrastructure and security company, has announced a new policy to automatically block mixed-use web crawlers, meaning those that both index websites for search engines and act as AI agents or collect AI training data. Previously, blocking such crawlers was an optional setting for customers; it will now be the default.

According to Matthew Prince, Cloudflare's CEO and co-founder, most internet traffic is now non-human, which calls for faster action to build a sustainable ecosystem. The new tools and partnerships aim to give website owners greater transparency and commercial opportunities, while benefiting AI companies that operate clearly identified, transparent bots.

Starting September 15, 2026, all new Cloudflare customers and new websites from existing subscribers will default to allowing search engine indexing but blocking AI training and agent use on pages with ads. Mixed-use crawlers that do not offer site owners a choice about whether their content can be used for AI will also be blocked on pages with ads. Existing free-account users will be switched to these defaults unless they opt out before the deadline.
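The defaults described above can be read as a small decision rule. The following is a minimal sketch of that rule, not Cloudflare's implementation: the `Crawler` type, the `default_allows` function, and the treatment of pages without ads (assumed to be unaffected, since the announcement only mentions pages with ads) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    SEARCH = "search"
    AI_TRAINING = "ai_training"
    AI_AGENT = "ai_agent"


@dataclass(frozen=True)
class Crawler:
    """Hypothetical description of a crawler; not a Cloudflare data model."""
    name: str
    purposes: frozenset
    offers_ai_opt_out: bool  # can site owners decline AI use separately?


AI_PURPOSES = {Purpose.AI_TRAINING, Purpose.AI_AGENT}


def default_allows(crawler: Crawler, page_has_ads: bool) -> bool:
    """Return True if the described defaults would let the crawler fetch the page."""
    if not page_has_ads:
        # Assumption: the article only describes blocking on pages with ads.
        return True
    if not (crawler.purposes & AI_PURPOSES):
        # Search-only crawlers stay allowed by default.
        return True
    is_mixed_use = Purpose.SEARCH in crawler.purposes
    # A mixed-use crawler is let through only if owners can decline AI use;
    # AI-only crawlers and mixed-use crawlers without that choice are blocked.
    return is_mixed_use and crawler.offers_ai_opt_out
```

Under these assumptions, a search-only crawler is allowed on a page with ads, while a crawler that combines search and AI training without offering an opt-out is blocked there.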

Cloudflare is also updating its Pay Per Crawl feature, introduced in 2025, and renaming it Pay Per Use. Instead of being paid per crawler visit, website owners will be compensated when their content appears in AI chatbot answers. The company has announced partnerships with Ceramic.AI and You.com and hopes other AI companies will join.

Although Google is not explicitly named, the policy appears to target it indirectly. Google's main crawler, Googlebot, indexes sites for search and also collects data for training Gemini and powering AI features. While Google offers a separate control, Google-Extended, that lets publishers opt out of Gemini training, it does not apply to AI features within Search, so publishers cannot remain in traditional search results while keeping their content out of AI-generated answers. Cloudflare aims to push Google and other mixed-use crawler operators to change their practices.
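Per-crawler choices of this kind are usually expressed in a site's robots.txt file. As a rough illustration, the sketch below uses Python's standard `urllib.robotparser` to check how rules addressed to the `Google-Extended` token are evaluated separately from `Googlebot`. The rules and the example.com URL are assumptions made for illustration, not a recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules (an assumption, not taken from the article):
# disallow the Google-Extended token, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # placeholder URL
googlebot_ok = parser.can_fetch("Googlebot", url)
extended_ok = parser.can_fetch("Google-Extended", url)

print(f"Googlebot allowed: {googlebot_ok}")
print(f"Google-Extended allowed: {extended_ok}")
```

With these rules, the parser reports that Googlebot may fetch the page while Google-Extended may not, which shows that robots.txt can only separate uses that are addressed under different names.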
