Cloudflare Accuses Perplexity AI of Using Stealth Crawlers to Bypass Website Blocks — Sparks Major AI Scraping Crackdown
Cloudflare has leveled serious accusations against Perplexity AI, the fast-rising $18 billion AI search startup, claiming the company used stealth tactics to scrape content from websites explicitly blocking its bots.
The allegations have set off a major escalation in the tech industry’s war against covert data extraction by AI companies—and captured the urgent attention of website owners, publishers, and digital rights advocates alike.
The controversy erupted after Cloudflare customers reported that Perplexity’s bots continued accessing their sites even though the customers had set robots.txt directives and specific firewall rules to block the company.
To confirm these suspicions, Cloudflare engineers ran controlled experiments, creating domains with restrictive robots.txt files that prohibited all automated access.
Yet when they queried Perplexity’s AI, it still surfaced detailed information about content hosted on the restricted domains, which Cloudflare took as evidence that Perplexity’s crawlers had been actively evading the blocks.
More surprisingly, the Cloudflare engineers conducting the tests also reported that Perplexity had been deploying stealth crawlers that masked their identity by mimicking mainstream browsers like Google Chrome on macOS.
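For readers unfamiliar with how such a block is expressed, the following is a minimal Python sketch of a blanket robots.txt rule and of the check a compliant crawler would run before fetching a page. It is an illustration only, not Cloudflare's test setup: the domain example.test, the page path, and the is_allowed helper are hypothetical, and the user-agent string is used purely as a sample value.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that prohibits all automated access, similar in spirit to the
# restrictive files described in the article (exact contents are assumed).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# For contrast: an empty Disallow line permits everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: it parses the rules from a string instead of
    downloading them, so the example runs without network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.test/secret-page"  # hypothetical test domain
    print(is_allowed(BLOCK_ALL, "PerplexityBot", url))
    print(is_allowed(ALLOW_ALL, "PerplexityBot", url))
```

The point of the sketch is that robots.txt is advisory: it only works if the crawler performs this check and honors a negative answer, which is why publishers pair it with firewall rules.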
These crawlers rotated IP addresses, often operating from multiple autonomous systems, and issued requests from IPs outside Perplexity’s publicly declared ranges.
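One way a site operator might reason about these signals is to compare each request's user agent and source IP against what a crawler operator has publicly declared. The Python sketch below illustrates that comparison under stated assumptions: the ranges are reserved documentation networks, and "ExampleBot", DECLARED_RANGES, and classify_request are hypothetical names, not Perplexity's real ranges or Cloudflare's detection rules.

```python
import ipaddress

# Hypothetical declared ranges (reserved documentation networks, not real
# crawler ranges) and a hypothetical declared user-agent token.
DECLARED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]
DECLARED_AGENT_TOKEN = "ExampleBot"


def classify_request(ip: str, user_agent: str) -> str:
    """Classify a request by comparing it with the declared identity.

    Returns one of:
      "declared"          - claims the bot's name and comes from a declared range
      "unverified-claim"  - claims the bot's name but comes from another range
      "unidentified"      - does not claim the bot's name at all
    """
    addr = ipaddress.ip_address(ip)
    in_declared_range = any(addr in net for net in DECLARED_RANGES)
    claims_bot = DECLARED_AGENT_TOKEN.lower() in user_agent.lower()

    if claims_bot and in_declared_range:
        return "declared"
    if claims_bot:
        return "unverified-claim"
    return "unidentified"


if __name__ == "__main__":
    chrome_on_macos = (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    )
    print(classify_request("192.0.2.10", "Mozilla/5.0 (compatible; ExampleBot/1.0)"))
    print(classify_request("203.0.113.7", chrome_on_macos))
```

A request that presents a generic browser user agent from an undeclared address lands in the "unidentified" bucket, which shows why this kind of traffic is hard to block: these two signals alone cannot tie it to a particular operator, so attribution needs further evidence that this sketch does not attempt to model.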
Cloudflare estimated that these “undeclared” crawlers added 3–6 million requests per day on top of the company’s declared traffic, affecting tens of thousands of domains.
Cloudflare’s Swift Retaliation and Industry Ramifications
Cloudflare responded by delisting Perplexity from its verified bots program and rolling out new defensive measures.
All new domains registered with Cloudflare are now by default protected from unauthorized AI crawlers—a move the firm dubbed “Content Independence Day.”
The company has also released signature-based rules targeting Perplexity’s stealth tactics, free to all Cloudflare customers, and is piloting advanced tools like the “AI Labyrinth.”
This feature traps rogue bots in decoy content, and an upcoming “pay-per-crawl” system may let publishers directly monetize and control how their content is accessed by AI tools.
Leading publishers and platforms, including the Associated Press, Time, BuzzFeed, Reddit, and Universal Music Group, have joined the growing movement to block unwanted AI crawlers as content creators voice mounting frustration over unauthorized data harvesting.
Following the accusations, Perplexity AI has dismissed the findings as a Cloudflare “sales pitch” and maintains that it has not accessed any content in defiance of site bans.
Industry at a Turning Point
The standoff isn’t just about one startup. Cloudflare CEO Matthew Prince has become a vocal critic of the “unsustainable” practices of AI web scraping, pointing to devastating ratios: Google refers one website visitor for every 18 pieces of content it crawls, while AI companies’ ratios are far worse—sometimes as poor as one referral visit for every 1,500 to 60,000 pages scraped.
The situation has reignited debate over the future of web content, publishing revenue, and the proper mechanisms for policing AI access.
Cloudflare contrasts Perplexity’s behavior with that of OpenAI, which it says properly honors site preferences and ceases crawling when instructed.
The company’s message is clear: the age of stealth scraping must give way to transparency, consent, and fair compensation.
This flashpoint is forcing the tech world to reckon with hard questions about data rights, AI development, and the economic foundation of online publishing.
As AI becomes ever more integrated into search, content discovery, and digital experiences, the fallout from the Cloudflare–Perplexity dispute will very likely shape how websites defend their content—and how AI startups build their products—throughout the next phase of the digital economy.