Blocking AI Crawler Robots

AI Crawler Robots

There are many AI crawler robots, and new ones come online frequently. You may not want to block all of them: some, such as Googlebot and Bingbot, also crawl for regular search indexing, so blocking them affects how your site appears in search results. Below is a list of currently known AI crawler robots, each with the lines to add to your robots.txt file to block it. You can add these lines using the self-serve robots.txt tool in Continuum.
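As an illustration, the two-line blocking pattern can be generated for any selection of crawlers. The following is a minimal Python sketch; the function name `build_block_rules` and the sample selection are hypothetical and are not part of Continuum or its robots.txt tool.

```python
def build_block_rules(user_agents):
    """Return robots.txt text that disallows all paths for each named crawler."""
    groups = []
    for agent in user_agents:
        # One group per crawler: a User-agent line followed by its rule.
        groups.append(f"User-agent: {agent}\nDisallow: /")
    # Separate groups with a blank line so the file stays readable.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Hypothetical selection; choose the crawlers that match your own policy.
    print(build_block_rules(["GPTBot", "CCBot", "PerplexityBot"]))
```

The printed text can then be pasted into the robots.txt file. Keep each User-agent line directly above its Disallow line, with no blank line between them.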

Each entry below gives the operator and the crawler name, followed by the two lines to add to robots.txt (add both lines) and any notes.
Amazon Amazonbot

User-agent: Amazonbot

Disallow: /

 
Anthropic Claude-SearchBot

User-agent: Claude-SearchBot

Disallow: /

 
Anthropic Claude-User

User-agent: Claude-User

Disallow: /

 
Anthropic anthropic-ai

User-agent: anthropic-ai

Disallow: /

 
Apple Applebot

User-agent: Applebot

Disallow: /

 
ByteDance Bytespider

User-agent: Bytespider

Disallow: /

 
Common Crawl CCBot

User-agent: CCBot

Disallow: /

 
DuckDuckGo DuckAssistBot

User-agent: DuckAssistBot

Disallow: /

 
Google Google-CloudVertexBot

User-agent: Google-CloudVertexBot

Disallow: /

 
Google Googlebot

User-agent: Googlebot

Disallow: /

 
Google Google-Extended

User-agent: Google-Extended

Disallow: /

A robots.txt control token rather than a separate crawler. It governs whether content crawled by Google may be used for Gemini (formerly Bard) and the Vertex AI generative APIs.
Google GoogleOther

User-agent: GoogleOther

Disallow: /

Used by Google to crawl for internal research and development. Please read the documentation for this crawler to determine if blocking is appropriate in your situation.
Huawei PetalBot

User-agent: PetalBot

Disallow: /

 
Internet Archive archive.org_bot

User-agent: archive.org_bot

Disallow: /

 
Meta FacebookBot

User-agent: FacebookBot

Disallow: /

 
Meta Meta-ExternalAgent

User-agent: Meta-ExternalAgent

Disallow: /

 
Meta Meta-ExternalFetcher

User-agent: Meta-ExternalFetcher

Disallow: /

 
Microsoft Bingbot

User-agent: bingbot

Disallow: /

 
Mistral MistralAI-User

User-agent: MistralAI-User

Disallow: /

 
OpenAI ChatGPT-User

User-agent: ChatGPT-User

Disallow: /

 
OpenAI GPTBot

User-agent: GPTBot

Disallow: /

 
OpenAI OAI-SearchBot

User-agent: OAI-SearchBot

Disallow: /

 
Perplexity Perplexity-User

User-agent: Perplexity-User

Disallow: /

 
Perplexity PerplexityBot

User-agent: PerplexityBot

Disallow: /

 
ProRata.ai ProRataInc

User-agent: ProRataInc

Disallow: /

 
Timpi Timpibot

User-agent: Timpibot

Disallow: /

 
Webz.io Omgilibot

User-agent: Omgilibot

Disallow: /

 

Please be aware that the landscape of AI crawlers and bots is constantly evolving. The provided list may not be exhaustive. Ensure that you conduct independent research and verify the inclusion/exclusion of any specific AI crawlers or bots relevant to your needs.
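One way to verify which crawlers a robots.txt file actually blocks is to parse it with Python's standard `urllib.robotparser` module. The following is a minimal sketch; the helper name `blocked_agents` and the sample file content are hypothetical. The standard-library parser only approximates how a given crawler interprets the file, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(robots_txt, user_agents, url="/"):
    """Return the crawlers from user_agents that may not fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in user_agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    # Hypothetical robots.txt content blocking two crawlers.
    sample = (
        "User-agent: GPTBot\n"
        "Disallow: /\n"
        "\n"
        "User-agent: CCBot\n"
        "Disallow: /\n"
    )
    print(blocked_agents(sample, ["GPTBot", "CCBot", "PerplexityBot"]))
```

Crawlers that are missing from the returned list are not blocked by the file as written. Note that robots.txt is advisory, so this confirms the rules you published and not whether a crawler obeys them.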