Configure Robots.txt for AI Visibility
Complete checklist for configuring your robots.txt file to allow AI crawlers while maintaining control.
1. AI Crawlers to Allow
Allow GPTBot (OpenAI/ChatGPT)
Add a User-agent: GPTBot group with Allow: / (a combined example follows this list)
Allow ChatGPT-User
Used when ChatGPT fetches pages on a user's behalf while browsing
Allow Claude-Web (Anthropic)
Anthropic's web crawler token; ClaudeBot is the current primary user agent
Allow anthropic-ai
Anthropic's AI training token
Allow PerplexityBot
Perplexity's search crawler
Allow Google-Extended
A control token rather than a separate crawler: it governs whether content Googlebot fetches may be used for Gemini/AI training, without affecting Search
Allow cohere-ai
Cohere's AI crawler; the token is usually written lowercase
Allow YouBot
You.com's AI search crawler
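Taken together, the items above fit in a single rule group. A minimal sketch (verify each vendor's current user-agent token before deploying; Google-Extended is honored as a control token even though it never fetches pages itself):

  User-agent: GPTBot
  User-agent: ChatGPT-User
  User-agent: Claude-Web
  User-agent: anthropic-ai
  User-agent: PerplexityBot
  User-agent: Google-Extended
  User-agent: cohere-ai
  User-agent: YouBot
  Allow: /

Stacking several User-agent lines over one rule set is valid under the Robots Exclusion Protocol (RFC 9309) and keeps the file short.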
2. Configuration Best Practices
Place AI rules before wildcards
Compliant crawlers follow the most specific matching group regardless of order, but listing named groups before User-agent: * keeps intent clear and helps lenient parsers
Don't use overly aggressive crawl-delay
A large Crawl-delay can keep bots from covering the whole site; many crawlers, including Googlebot, ignore the directive entirely
Allow access to CSS/JS files
Crawlers that render pages need these assets to see the page as users do
Include sitemap directive
Add a Sitemap: line with the absolute sitemap URL so crawlers can discover pages; see the layout sketch after this list
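A sketch of a file laid out along these lines (example.com and the sitemap path are placeholders):

  # Named AI groups first, wildcard last
  User-agent: GPTBot
  Allow: /

  User-agent: *
  Allow: /

  Sitemap: https://www.example.com/sitemap.xml

The Sitemap: line is independent of any group, so it can sit anywhere in the file; putting it last is just a common convention.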
3. Pages to Consider Blocking
Block admin areas
Disallow: /admin/
Block login/auth pages
Login and authentication flows offer nothing worth indexing
Block duplicate content paths
Keeps crawlers from ingesting the same content twice (e.g. print versions or tracking-parameter URLs)
Block internal search results
These pages are low-value and can spawn near-infinite crawl paths (see the sketch after this list)
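One subtlety: under RFC 9309, a crawler that matches a named group (like the GPTBot group above) ignores the User-agent: * group entirely, so blocks meant to apply to AI crawlers must be repeated inside their own groups. A sketch with placeholder paths (substitute your site's real ones):

  User-agent: GPTBot
  User-agent: PerplexityBot
  Allow: /
  Disallow: /admin/
  Disallow: /login/
  Disallow: /search/

  User-agent: *
  Disallow: /admin/
  Disallow: /login/
  Disallow: /search/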
4. Testing & Monitoring
Test with robots.txt validator
Use the robots.txt report in Google Search Console (the standalone robots.txt Tester was retired) or an open-source parser
Check server logs for AI bots
Verify that allowed crawlers actually show up in access logs; a small script like the one after this list works well
Monitor for new AI crawlers
New user-agent tokens appear regularly; watch vendor documentation and bot directories
Review quarterly
Update as AI landscape changes
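A minimal log-check sketch in Python, assuming a standard combined-format access log; the access.log path and the bot list are illustrative, and Google-Extended is omitted because, as a control token, it never appears in logs:

  from collections import Counter

  # User-agent substrings to watch for; extend as new crawlers appear.
  AI_BOTS = [
      "GPTBot", "ChatGPT-User", "Claude-Web", "ClaudeBot",
      "anthropic-ai", "PerplexityBot", "cohere-ai", "YouBot",
  ]

  hits = Counter()
  with open("access.log", encoding="utf-8", errors="replace") as log:
      for line in log:
          for bot in AI_BOTS:
              if bot.lower() in line.lower():  # UA strings vary in case
                  hits[bot] += 1
                  break

  for bot in AI_BOTS:
      print(f"{bot}: {hits[bot]} requests")

Run it against a recent log slice after updating robots.txt; a bot you allowed but never see may be blocked upstream (CDN or firewall rules) rather than by robots.txt.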
Checklist Summary
20 total items across 4 sections