CiteScan

AI Search Glossary

robots.txt

robots.txt is a plain-text file at the root of a domain (yoursite.com/robots.txt) that tells web crawlers which pages and directories they are allowed or not allowed to access.

In detail

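The main AI search crawlers (GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot) are documented by their operators as honoring robots.txt rules. Google-Extended is slightly different: it is a robots.txt control token rather than a separate crawler, and it governs whether content fetched by Google's existing crawlers may be used for its AI models. A misconfigured robots.txt, such as a blanket "Disallow: /" under "User-agent: *", can silently block AI crawlers from fetching your content, because nothing on the site itself reports the block. Best practice for AI readiness: add explicit allow rules for each AI crawler, and only disallow paths that genuinely should not be crawled (e.g. /api/, /admin/).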
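
One way to check this kind of configuration is to parse a robots.txt with Python's standard urllib.robotparser module and ask whether each AI crawler token may fetch a given path. The sketch below is illustrative only: the sample robots.txt, the example.com host, and the check_ai_access helper are assumptions made for this example, not part of any crawler's tooling. Python's parser also applies the first matching rule in a group, whereas crawlers following RFC 9309 use the longest match, so the sample lists the Disallow lines before "Allow: /" to get the same result under both interpretations.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens named in this glossary entry.
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]

# Hypothetical robots.txt: explicit group for AI crawlers, private paths blocked.
# Disallow lines precede "Allow: /" because urllib.robotparser uses first-match.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: PerplexityBot
User-agent: ClaudeBot
User-agent: Google-Extended
Disallow: /api/
Disallow: /admin/
Allow: /

User-agent: *
Disallow: /api/
Disallow: /admin/
"""


def check_ai_access(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return, for each AI crawler token, whether robots_txt permits fetching path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    url = "https://example.com" + path  # placeholder host, assumed for illustration
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


if __name__ == "__main__":
    for path in ("/blog/post", "/api/users"):
        print(path, check_ai_access(SAMPLE_ROBOTS_TXT, path))
    # A blanket rule like this blocks every AI crawler listed above.
    print(check_ai_access("User-agent: *\nDisallow: /\n", "/blog/post"))
```

To test a live site, the same parser can load a real file with set_url() and read() instead of parse(); the result is only as accurate as the parser's matching rules, so treat it as a first check rather than a guarantee of crawler behavior.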

Related terms

GPTBot
OAI-SearchBot
llms.txt