robots.txt
A plain-text file served at the root of a host (for example, https://example.com/robots.txt) that tells search engine crawlers and other bots which paths they may or may not fetch.
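robots.txt is a crawl-control file, not an index-control file. A page blocked by robots.txt may still appear in search results if other sites link to it; the result shows only the URL, without the page content. To exclude a page from search results, use a noindex robots meta tag or an X-Robots-Tag HTTP header, and leave the page crawlable: a crawler that is blocked by robots.txt never fetches the page, so it never sees the noindex directive.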
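The two noindex mechanisms can be checked with a small script. The following is a minimal sketch, not a definitive implementation: the function name is_noindex and the helper class are hypothetical, and the sketch ignores crawler-specific forms such as a meta tag named after one bot or a header value prefixed with a user-agent name.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_noindex(headers: dict, html: str) -> bool:
    """Return True if the response asks not to be indexed (simplified check)."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    header_directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives.update(d.strip().lower() for d in value.split(","))

    meta = RobotsMetaParser()
    meta.feed(html)
    return "noindex" in header_directives or "noindex" in meta.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(is_noindex({}, page))
    print(is_noindex({"X-Robots-Tag": "noindex"}, ""))
```

Either signal is enough on its own; the header form is the usual choice for non-HTML resources such as PDFs, where a meta tag cannot be added.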
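For AI search visibility, robots.txt should explicitly allow the user-agent tokens of the major AI crawlers: GPTBot, ChatGPT-User, Google-Extended, PerplexityBot, ClaudeBot, OAI-SearchBot, and others. An explicit allow matters most when the file also contains a broad Disallow rule, since a crawler with no matching rule is allowed by default. Note that Google-Extended is a control token rather than a separate crawler: it governs whether content fetched by Google's existing crawlers may be used for Google's generative AI products, and it does not affect inclusion in Google Search.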
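One way to confirm that a robots.txt file allows these tokens is to parse it with Python's standard urllib.robotparser module. The sketch below is illustrative only: the robots.txt content, the /private/ path, and the bot name SomeOtherBot are assumptions, not recommendations for any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: AI crawler tokens are allowed everywhere,
# while all other crawlers are kept out of an illustrative /private/ path.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
User-agent: PerplexityBot
User-agent: ClaudeBot
User-agent: Google-Extended
Allow: /

User-agent: *
Disallow: /private/
"""

AI_TOKENS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "PerplexityBot",
    "ClaudeBot",
    "Google-Extended",
]


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def blocked_tokens(parser: RobotFileParser, url: str, tokens=AI_TOKENS) -> list:
    """Return the tokens that the parsed rules would not allow to fetch url."""
    return [token for token in tokens if not parser.can_fetch(token, url)]


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    url = "https://example.com/private/report.html"
    print("Blocked AI tokens:", blocked_tokens(parser, url))
    print("SomeOtherBot allowed:", parser.can_fetch("SomeOtherBot", url))
```

Python's parser applies the first matching rule within a group, which can differ from crawlers that pick the most specific (longest) match, so treat the result as a sanity check rather than a guarantee of how each crawler will behave.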