robots.txt

A plain-text file served at the root of a host (for example, https://example.com/robots.txt) that tells search engine crawlers which paths they may or may not crawl.

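robots.txt controls crawling, not indexing. A page blocked by robots.txt may still appear in search results if other sites link to it, typically as a bare URL with no content snippet. To exclude a page from search results, use a noindex meta tag or an X-Robots-Tag: noindex HTTP header, and leave the page crawlable: a crawler that robots.txt blocks never fetches the page, so it never sees the noindex directive.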
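The difference between crawl control and index control can be sketched with Python's standard-library urllib.robotparser. This is a minimal illustration, not a full crawler: the host example.com, the paths, the ROBOTS_TXT contents, and the helper name can_crawl are all assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com: it blocks crawling of /private/ only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# The two common ways to express noindex (index control, not crawl control).
NOINDEX_META = '<meta name="robots" content="noindex">'
NOINDEX_HEADER = ("X-Robots-Tag", "noindex")

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def can_crawl(path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets this user agent fetch the path."""
    return parser.can_fetch(user_agent, f"https://example.com{path}")


for path in ("/blog/post.html", "/private/report.html"):
    if can_crawl(path):
        # The page can be fetched, so a noindex directive on it will be seen.
        print(f"{path}: crawlable, noindex would be honored if present")
    else:
        # The page is never fetched, so any noindex on it goes unseen;
        # the bare URL can still be indexed from external links.
        print(f"{path}: blocked from crawling, noindex would not be seen")
```

The design point is that the two mechanisms work against each other when combined on the same URL: a page that must stay out of search results should carry noindex and remain crawlable.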

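For AI search visibility, robots.txt should explicitly allow the major AI user agents: GPTBot, ChatGPT-User, OAI-SearchBot, Google-Extended, PerplexityBot, ClaudeBot, and others. (Google-Extended is a robots.txt control token that Google's existing crawlers honor, not a separate crawler.)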
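One way to express this is a robots.txt with one group per AI user agent. The sketch below builds such a file and checks it with Python's standard-library urllib.robotparser. The helper names build_robots_txt and check_ai_access, the /admin/ rule, and the example.com URL are assumptions for illustration, and Python's parser only approximates how each vendor's crawler matches rules.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this entry; extend the list as needed.
AI_USER_AGENTS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "Google-Extended",
    "PerplexityBot",
    "ClaudeBot",
]


def build_robots_txt(agents, disallowed_for_others=("/admin/",)):
    """Build a robots.txt with an explicit Allow group per AI user agent."""
    lines = []
    for agent in agents:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    # Fallback group for every crawler without its own group (hypothetical rule).
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in disallowed_for_others]
    return "\n".join(lines) + "\n"


def check_ai_access(robots_txt, url):
    """Map each AI user agent to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_USER_AGENTS}


robots_txt = build_robots_txt(AI_USER_AGENTS)
print(robots_txt)
for agent, allowed in check_ai_access(robots_txt, "https://example.com/pricing").items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A crawler that matches its own group ignores the `*` group, so any Disallow rules that should also bind the AI crawlers need to be repeated inside their groups.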
