Here's something most website owners don't realize: when someone asks ChatGPT, Perplexity, or Google's AI Overview about your industry, your website probably doesn't exist in their answer.
Not because your content is bad. Not because your SEO is weak. But because AI search engines look for your site differently than Google's traditional crawler — and you're missing the files they need.
Traditional SEO optimized for Googlebot crawling HTML. AI search engines like ChatGPT's browse mode, Perplexity's indexer, and Google's AI Overviews need structured, machine-readable summaries of who you are, what you do, and where your content lives. Without these signals, they skip you entirely.
After auditing hundreds of websites with our free AI visibility tool, I've identified the same 5 gaps on almost every site. Let me walk you through each one.
The 5 Gaps Making Your Site Invisible to AI
1. No llms.txt — AI crawlers can't find your pages
llms.txt is a proposed standard (introduced by Jeremy Howard in September 2024): a markdown file that sits at your domain root and provides a structured directory of your pages for LLM crawlers. Think of it as sitemap.xml for AI. It lists every important page with a one-line description so that ChatGPT, Claude, and Perplexity can learn what your site contains without parsing your entire HTML. Without it, AI systems must guess your site structure, and they often guess wrong or skip you entirely.
2. robots.txt blocks AI crawlers by default
This is the silent killer. WordPress security and SEO plugins such as Wordfence, AIOSEO, and Sucuri can add Disallow rules for GPTBot, ClaudeBot, PerplexityBot, and ChatGPT-User to your robots.txt. They do this to "protect" your content from AI training, but the side effect is catastrophic: your site becomes invisible to AI search engines. ChatGPT cannot browse your pages. Perplexity cannot index them. You don't exist in AI search results.
3. No ai-summary.txt — AI can't describe you accurately
When an AI assistant recommends a business or cites a website, it needs a concise, accurate description of what that site does. Without ai-summary.txt, AI systems cobble together a description from whatever fragments they can find — often outdated cached snippets, third-party mentions, or generic metadata. The result: inaccurate citations, missed recommendations, or complete omission from AI-generated answers.
4. Missing Organization and WebSite Schema — no Knowledge Panel, no AI Overviews
Google's AI Overviews and Knowledge Panels are powered by structured data. Without Organization and WebSite JSON-LD schema on your homepage, Google's AI system doesn't know who you are as an entity. You're just another HTML page. With proper schema, you become a recognized entity that AI systems can confidently reference, quote, and recommend.
5. No FAQ Schema — your content isn't quotable
AI search engines love Q&A formatted content because it maps directly to how users ask questions. FAQPage schema tells AI systems: "here's a question, and here's the authoritative answer." Without it, your content competes with millions of unstructured pages. With FAQ schema, your answers get pulled directly into Google AI Overviews and become the source that ChatGPT and Perplexity cite.
How AI Search Actually Fetches Content
Understanding why these gaps matter requires knowing how AI search differs from traditional search:
Google's traditional crawler (Googlebot) renders your full page, follows links, indexes every word, and ranks based on hundreds of signals accumulated over months.
AI search engines work differently:
- ChatGPT's browse mode sends ChatGPT-User and GPTBot to fetch specific pages in real-time. If robots.txt blocks them, the fetch fails silently.
- Perplexity's indexer (PerplexityBot) pre-crawls sites for its search index. It looks for structured summaries first (llms.txt, meta descriptions, schema) before diving into raw content.
- Google AI Overviews pull from Google's existing index but prioritize pages with structured data, FAQ schema, and clear entity signals. Unstructured content rarely makes it into the AI-generated answer.
- Claude's search uses ClaudeBot to fetch pages when users ask for current information. Same robots.txt rules apply.
The Fix: What Each File Does and How to Implement It
llms.txt — your AI directory
Place at yoursite.com/llms.txt. Format: markdown with your site name as an H1 heading, a one-line description in a blockquote, then a bulleted list of links to every important page, each with its URL and purpose. AI crawlers that support the convention can read this file first, much as Googlebot reads sitemap.xml. Implementation time: 30 minutes.
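To make the format concrete, here is a minimal Python sketch that renders an llms.txt body from a list of pages. The site name, URLs, the `Page` class, the `build_llms_txt` helper, and the "Pages" section title are all illustrative assumptions, not part of any official tooling.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    purpose: str


def build_llms_txt(site_name: str, summary: str, pages: list[Page]) -> str:
    """Render an llms.txt body: H1 name, blockquote summary, then a link list."""
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Pages", ""]
    for page in pages:
        # One bullet per page: markdown link followed by a short purpose note.
        lines.append(f"- [{page.title}]({page.url}): {page.purpose}")
    return "\n".join(lines) + "\n"


# Hypothetical site and pages, for illustration only.
llms_txt = build_llms_txt(
    "Example Co",
    "Example Co sells handmade widgets and publishes widget care guides.",
    [
        Page("Home", "https://example.com/", "Overview of products and services"),
        Page("Blog", "https://example.com/blog", "Widget care guides"),
    ],
)
print(llms_txt)
```

Save the printed text as llms.txt at your domain root and regenerate it whenever you add or retire an important page.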
robots.txt — unblock AI crawlers
Check your robots.txt for lines like User-agent: GPTBot / Disallow: /. Remove Disallow rules for GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, and CCBot. If you're using a security plugin, look in its settings for "AI bot blocking" and disable it. Implementation time: 5 minutes.
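If you would rather test this than read the file by eye, the following sketch uses Python's standard `urllib.robotparser` to report which of the AI user agents named above are blocked by a given robots.txt. The sample rules, the example.com URL, and the `blocked_agents` helper are assumptions for illustration; paste in your own robots.txt contents to check a real site.

```python
from urllib.robotparser import RobotFileParser

# The AI user agents named in this article.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "ChatGPT-User", "CCBot"]


def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI user agents that the given robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that blocks one AI crawler and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_agents(sample))
```

An empty list means none of the listed agents is blocked for that URL. Any name that appears still has a Disallow rule you need to remove.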
ai-summary.txt — your elevator pitch for AI
Place at yoursite.com/ai-summary.txt. Plain text file with: site name, URL, what you do, who you serve, main services/topics, and a note that your content is freely accessible. AI citation engines reference this when generating descriptions of your business. Implementation time: 20 minutes.
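As a rough illustration, this Python sketch assembles the fields listed above into a plain-text ai-summary.txt. There is no formal specification behind the field labels used here; the labels, the `build_ai_summary` helper, and the example business are all assumptions.

```python
def build_ai_summary(
    name: str,
    url: str,
    what: str,
    audience: str,
    topics: list[str],
    access_note: str = "Content on this site is freely accessible.",
) -> str:
    """Assemble a plain-text summary with one labelled field per line."""
    lines = [
        f"Site: {name}",
        f"URL: {url}",
        f"What we do: {what}",
        f"Who we serve: {audience}",
        "Main topics: " + ", ".join(topics),
        f"Access: {access_note}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical business, for illustration only.
ai_summary = build_ai_summary(
    "Example Co",
    "https://example.com/",
    "Sells handmade widgets and publishes widget care guides.",
    "Hobbyists and small workshops",
    ["widgets", "widget care", "repairs"],
)
print(ai_summary)
```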
Organization + WebSite Schema — entity recognition
Add two JSON-LD script blocks to your homepage <head>. Organization schema: your name, URL, description, logo. WebSite schema: site name, URL, SearchAction for sitelinks. This turns you from "a webpage" into "a recognized entity" in Google's knowledge graph. Implementation time: 15 minutes.
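The two script blocks could be generated along these lines. This is a sketch using schema.org's Organization, WebSite, and SearchAction types; the business details, the search URL pattern, and the `json_ld_script` helper are placeholders you would replace with your own values.

```python
import json


def json_ld_script(data: dict) -> str:
    """Serialize a schema.org object into a JSON-LD <script> block."""
    # Escape "</" so a string value can never close the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical organization details, for illustration only.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com/",
    "description": "Handmade widgets and widget care guides.",
    "logo": "https://example.com/logo.png",
}

website = {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "name": "Example Co",
    "url": "https://example.com/",
    "potentialAction": {
        "@type": "SearchAction",
        # Assumed search URL pattern; use your site's real search endpoint.
        "target": "https://example.com/search?q={search_term_string}",
        "query-input": "required name=search_term_string",
    },
}

print(json_ld_script(organization))
print(json_ld_script(website))
```

Paste both printed blocks into the homepage `<head>`, then confirm they parse with a structured-data validator before shipping.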
FAQPage Schema — quotable answers
Wrap your FAQ content (or question-format headings) in FAQPage JSON-LD. Each question-answer pair becomes a discrete, quotable unit that AI systems can pull directly into generated answers. This is the single highest-impact AEO tactic. Implementation time: 20 minutes per page.
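A FAQPage block follows the same pattern: each question becomes a Question item with an acceptedAnswer. The sketch below builds one from question-answer pairs; the `faq_page` helper is hypothetical, and the sample pair is borrowed from this article's own FAQ.

```python
import json


def faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_page(
    [
        (
            "What is llms.txt and why does my website need it?",
            "A markdown file at your domain root that tells AI crawlers "
            "what your site is about and where to find each page.",
        ),
    ]
)

# Place this JSON inside a <script type="application/ld+json"> tag on the page.
print(json.dumps(faq, indent=2))
```

Keep the answer text identical to what visitors see on the page, so the markup describes real content rather than hidden copy.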
The Numbers: Why This Matters Now
- Gartner predicts traditional search volume will drop 25% by 2026 as users shift to AI-powered answers
- Google AI Overviews now appear on 30%+ of search queries, pulling traffic from standard organic results
- ChatGPT processes over 800 million queries per week (as of early 2026)
- Perplexity crossed 100 million monthly active users in Q1 2026
- Less than 5% of websites have llms.txt implemented (based on our audit data from 200+ sites)
The gap between "AI-ready" and "AI-invisible" sites is enormous right now. This is a first-mover advantage that won't last: as awareness grows, the bar will rise. But today, implementing these 5 fixes puts you ahead of 95% of your competitors in AI search.
Check Your Site in 30 Seconds
I built a free tool that checks all 15 SEO and AI visibility factors on any URL. It takes 30 seconds and shows you exactly what's missing — no email required, no signup.
Run a Free AI Visibility Audit
15 checks covering traditional SEO + AI search readiness. See what's missing and what to fix — instant results.
Audit My Site Now
FAQ
What is llms.txt and why does my website need it?
llms.txt is a proposed standard: a markdown file placed at your domain root that tells AI crawlers (ChatGPT, Perplexity, Claude) what your site is about and where to find each page. Think of it as a sitemap.xml for AI search engines. Without it, AI systems must guess your site structure from raw HTML, which often fails.
Is my WordPress site blocking AI crawlers?
Quite possibly. Popular security and SEO plugins like Wordfence, AIOSEO, and Sucuri can add Disallow rules for GPTBot, ClaudeBot, and PerplexityBot to your robots.txt. When they do, ChatGPT, Claude, and Perplexity cannot access your content. Check your robots.txt file to verify.
What is the difference between GEO and AEO?
GEO (Generative Engine Optimization) focuses on getting your content cited by AI tools like ChatGPT and Perplexity when they generate answers. AEO (Answer Engine Optimization) focuses on Google's AI Overviews and featured snippets. Both require structured content, proper schema, and AI-accessible files, but target different systems.
How do I check if my site is visible to AI search?
Run a free AI visibility audit at khanconsulting.ch/seo-audit. It checks 15 factors including llms.txt presence, AI crawler access in robots.txt, ai-summary.txt, Organization and WebSite schema, and FAQ schema — all the elements AI search engines look for.
Will AI search replace Google?
AI search won't replace Google entirely, but it is rapidly capturing search market share. Gartner predicts traditional search volume will decline 25% by 2026 due to AI chatbots. Google itself is responding with AI Overviews on over 30% of queries. Sites that aren't optimized for both traditional and AI search will lose visibility on all fronts.