Friday, May 8, 2026


Cloudflare Delists And Blocks Perplexity From Crawling Websites

Introduction to Cloudflare and Perplexity

Cloudflare, an internet security and performance company, recently announced that it has delisted Perplexity’s crawler as a verified bot. The decision followed multiple user complaints and an investigation that found Perplexity using aggressive rogue bot tactics to force its crawlers onto websites. Those tactics violate Cloudflare’s requirements for verified bots, which include obeying the robots.txt protocol and not crawling from undeclared IP addresses.

What is Cloudflare’s Verified Bots Program?

Cloudflare runs a program called Verified Bots that whitelists approved bots, allowing them to crawl the websites that Cloudflare protects. To keep that privileged status, verified bots must follow specific policies, such as obeying the robots.txt protocol. Robots.txt is a standard that websites use to communicate with web crawlers and other web robots. It lets a site specify which parts of the site crawlers should not access, and which crawlers those rules apply to.
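To make the robots.txt mechanism concrete, here is a minimal sketch of how a well-behaved crawler could check a site's rules before fetching a page. It uses Python's standard `urllib.robotparser` module. The crawler names (`ExampleBot`, `OtherBot`), the rules, and the example.com URLs are illustrative assumptions and do not come from Cloudflare or Perplexity.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler entirely and keeps
# every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before every fetch and skips disallowed URLs.
print(parser.can_fetch("ExampleBot", "https://example.com/articles/1"))  # False
print(parser.can_fetch("OtherBot", "https://example.com/articles/1"))    # True
print(parser.can_fetch("OtherBot", "https://example.com/private/x"))     # False
```

Note that robots.txt is purely advisory: nothing in the protocol stops a crawler from skipping this check, which is why compliance is a condition of programs like Verified Bots.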

Perplexity’s Violations

Cloudflare’s investigation found that Perplexity violated these requirements in several ways. Its crawlers rotated IP addresses, changed ASNs, and impersonated browsers like Chrome. These tactics allowed Perplexity to circumvent the robots.txt protocol and crawl websites that had explicitly blocked its crawlers. The behavior was seen as a serious violation of Cloudflare’s policies and a threat to the integrity of the internet.


Stealth Crawling Behavior: Rotating IP Addresses

Perplexity’s crawlers were found to rotate through IP addresses and change ASNs, which let them evade blocks and crawl websites that had explicitly blocked them. An ASN, or Autonomous System Number, is a unique identifier for an autonomous system: a network, or group of IP address ranges, operated under a single routing policy. Because a verified bot is expected to crawl only from its declared IP addresses, requests arriving from other addresses and other networks do not match rules written against the declared ones, so the crawler’s traffic blends in with ordinary traffic and avoids detection.
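The idea of declared versus undeclared IP addresses can be sketched with Python's standard `ipaddress` module. The ranges below are reserved documentation ranges used as stand-ins for an operator's published list; they are assumptions for illustration, not Perplexity's or Cloudflare's actual data, and this is not how Cloudflare's detection is implemented.

```python
import ipaddress

# Hypothetical list of IP ranges a bot operator has publicly declared.
# 192.0.2.0/24 is a reserved documentation range, used here as a stand-in.
DECLARED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def is_declared_ip(ip: str) -> bool:
    """Return True if the address falls inside any declared range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DECLARED_RANGES)


# A request from the declared range matches a rule written for the bot.
print(is_declared_ip("192.0.2.10"))    # True

# The same crawler coming from a different network (another documentation
# range here) no longer matches, so an IP-based block does not apply.
print(is_declared_ip("198.51.100.7"))  # False
```

A rule keyed only to declared addresses therefore stops working as soon as the crawler moves to addresses it never declared.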

Stealth Crawling Behavior: Spoofed User Agent

Perplexity’s crawlers were also found to be spoofing their user agent, posing as a human user browsing with Chrome on macOS. The user agent is a string of text, sent with each request, that identifies the browser or crawler making the request. Many filters block known crawlers by matching this string, so a crawler that presents a browser’s user agent instead of its own passes those filters and can crawl websites that had explicitly blocked it.
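The sketch below shows why a filter based only on the user agent string is easy to evade. The crawler tokens and both user agent strings are hypothetical examples: the first imitates the common format of a self-identifying bot, and the second is a generic Chrome-on-macOS style string rather than a string observed in this incident.

```python
# Hypothetical tokens a site might use to recognize self-identifying crawlers.
KNOWN_CRAWLER_TOKENS = ("examplebot", "othercrawler")


def is_declared_crawler(user_agent: str) -> bool:
    """Naive filter: flag a request if its user agent names a known crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_CRAWLER_TOKENS)


# A crawler that identifies itself honestly (illustrative string).
DECLARED_UA = "Mozilla/5.0 (compatible; ExampleBot/1.0; +https://example.com/bot)"

# The same crawler presenting a browser-style string instead (illustrative).
SPOOFED_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)

print(is_declared_crawler(DECLARED_UA))  # True
print(is_declared_crawler(SPOOFED_UA))   # False
```

Since the string is supplied by the client, it can say anything, which is why user agent matching alone cannot enforce a block.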

Cloudflare’s Response

In response to these violations, Cloudflare delisted Perplexity as a verified bot and implemented new blocking rules aimed at its stealth crawling. The new rules are intended to stop Perplexity’s crawlers from evading blocks and reaching websites that have explicitly blocked them.

Takeaways

There are several key takeaways from this incident:

  • Perplexity violated Cloudflare’s Verified Bots policy, which grants crawling access to trusted bots that follow common-sense rules like honoring the robots.txt protocol.
  • Perplexity used stealth crawling tactics, including rotating IP addresses, changing ASNs, and spoofing its user agent, to crawl content after being blocked from accessing it.
  • Cloudflare responded by delisting Perplexity as a verified bot and implementing new blocking rules to prevent its stealth crawling.
  • The incident shows that crawlers are expected to respect website directives, and that enforcement depends on companies like Cloudflare acting against bots that do not.

Conclusion

Cloudflare’s decision to delist Perplexity as a verified bot and block its stealth crawling is a strong response to aggressive bot behavior. By taking this action, Cloudflare is helping to protect the integrity of the internet and ensure that websites are able to control who crawls their content.
