Cloudflare recently released a major update to its default service rules, setting a September 15th deadline for the entire industry's AI companies. All AI vendors must separate search crawlers from model training and AI agent-specific crawlers. Mixed crawlers that have not been distinguished will be automatically blocked by the system when accessing pages with advertisements.
The new rules apply broadly, covering new platform customers, existing users creating new sites, and all free user websites equally. If a website administrator wants to allow mixed crawlers, they can only manually modify the backend configuration, and this adjustment directly changes the way AI companies acquire web content for training materials.
Many site administrators are willing to open their content for traditional search engines to index, but they do not want their intellectual property to be collected and trained on without compensation. Cloudflare explicitly stated that Google's crawler has both search and AI data collection functions, making it difficult for websites to only open search while blocking AI training crawling. Google responded by introducing a dedicated robot tool for sites to block AI training access without affecting search indexing.
However, its core crawler will still simultaneously collect data for the built-in AI features of the search engine, making it difficult to completely separate search and AI data needs. The platform CEO stated that robot traffic has long exceeded human visits, and the industry ecosystem urgently needs to regulate various crawling behaviors.
Cloudflare continues to enhance content protection tools, upgrading from the 2024 anti-AI crawler tool to a new value-based billing model. Previously, the platform charged based on the number of crawls, but now it has upgraded to Pay Per Use, calculating fees based on the actual revenue generated by the content in AI. Data shows that over half of AI crawlers repeatedly crawl pages without updates, and the payment mechanism can reduce invalid traffic, increasing income for creators.
The current payment plan has already been piloted with two AI companies, allowing site administrators to directly receive revenue after their content is used by AI products. In an environment where copyright regulation is becoming stricter, the new rules force AI companies to improve the transparency of their crawling activities, giving web creators more control over their content.
