Journal · 2026-06-15 · 12 min · geo

What Is Generative Engine Optimization (GEO)? A Builder's Field Guide

GEO is the practice of structuring content so LLMs like ChatGPT, Claude, Perplexity, and Gemini cite it as a source. Here's what actually moves the needle — from someone who rebuilt a whole site around it and built a CLI to score it.

By Harel Asaf·AI Builder·Tel Aviv

Generative Engine Optimization (GEO) is the practice of structuring content so that large language models — ChatGPT, Claude, Perplexity, Gemini, Google's AI Overviews — pull it into their generated answers and cite it as a source. SEO tries to win a blue link on a results page. GEO tries to become the sentence the model says back to the user. Those are not the same game, and the second one is the one that's growing.

TL;DR

- GEO optimizes for inclusion in an LLM's answer, not for a ranking position. The unit of success is a citation, not a click.

- Five things compound: exact-question phrasing as headings, ≤60-word direct answers, FAQPage JSON-LD, an llms.txt file, and real author attribution.

- It overlaps with good SEO by maybe 60%. The other 40% is structure LLMs can lift cleanly: tables, definitions, self-contained sentences.

- I rebuilt this entire site around GEO and built geo-audit, a zero-dependency CLI that scores any URL for LLM visibility.

- The edge right now is that almost nobody is doing it deliberately. That window closes.

Why this stopped being optional

I'll tell you the moment it clicked for me.

I asked Perplexity a question I knew the answer to — "what's the difference between an AI builder and an AI engineer" — partly to see what it would say, partly out of vanity. It answered well. It cited four sources. None of them were mine, even though I'd written what I still think is the clearest piece on the exact question.

That stung enough to be useful. My article ranked fine on Google. It just wasn't built to be quoted by a model. The headings were clever instead of literal. The answers were buried three paragraphs deep instead of sitting right under the question. There was no machine-readable signal saying "this is the canonical answer, written by this specific person."

So I rebuilt. The piece you'd find now — What Is an AI Builder? — opens with a one-sentence definition a model can lift verbatim and be correct. That's GEO. It's not a trick. It's writing that survives being summarized.

GEO vs SEO: where they overlap and where they split

People want GEO to be "SEO but new." It isn't. It's a sibling discipline with a different target organism.

| | SEO | GEO |
|---|---|---|
| Optimizes for | Ranking on a results page | Inclusion in a generated answer |
| Unit of success | A click | A citation |
| Reads your page | A crawler indexing for retrieval | A model retrieving and synthesizing |
| Rewards | Backlinks, CTR, dwell time, keywords | Self-contained facts, clean structure, attributable claims |
| Key artifacts | sitemap.xml, robots.txt, meta tags | llms.txt, FAQPage JSON-LD, exact-query headings |
| Failure mode | Page 2 of Google | Model paraphrases a competitor instead of citing you |
| Time horizon | Mature, well-understood | Early, under-exploited, moving fast |

The overlap is real: both reward genuinely good, well-structured, fresh content from a credible source. If you've done SEO honestly, you're maybe 60% of the way to GEO. The other 40% is the part that feels strange to an SEO native — you are no longer writing to rank, you are writing to be quoted. Those produce different sentences.

The five things that actually move citation rate

I've tested this on my own pages and watched it in a weekly citation tracker. These five compound. Miss one and you lose roughly a third of your citation rate — they're multiplicative, not additive.
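Read literally, "multiplicative" means each missing lever keeps roughly two thirds of whatever rate is left. A small sketch of that arithmetic, assuming an exact one-third loss per lever (the function name is illustrative, and the one-third figure is the rule of thumb above, not a measured constant):

```python
def retained_citation_rate(levers_missing: int, loss_per_lever: float = 1 / 3) -> float:
    """Fraction of the full citation rate left when some of the five levers are missing."""
    if not 0 <= levers_missing <= 5:
        raise ValueError("levers_missing must be between 0 and 5")
    # Multiplicative model: every missing lever scales the remaining rate.
    return (1 - loss_per_lever) ** levers_missing


for missing in range(6):
    print(missing, f"{retained_citation_rate(missing):.0%}")
```

Under that assumption, missing two levers leaves about 44% of the full rate, where an additive model would predict about 33%.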

1. Exact-question phrasing as headings

Models match the user's natural-language query against your headings. "What is an MCP server?" gets cited. "Demystifying the protocol layer" does not. Write the heading as the question a real human types. I harvest the phrasing from Google's "People Also Ask," Reddit titles, and Perplexity's Related panel — not from a keyword tool.
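A rough way to screen existing headings for this is a literal-question check. The sketch below is a heuristic under stated assumptions: the opener list is hand-picked and English-only, and `is_question_heading` is a hypothetical helper, not part of any tool mentioned here.

```python
import re

# Assumption: a small, hand-picked list of English question openers.
QUESTION_OPENERS = (
    "what", "why", "how", "when", "where", "who", "which",
    "is", "are", "can", "does", "do", "should",
)


def is_question_heading(heading: str) -> bool:
    """True if the heading ends with '?' and starts with a question opener."""
    text = heading.strip()
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not text.endswith("?"):
        return False
    # Treat contractions such as "what's" as their base word.
    first_word = words[0].split("'")[0]
    return first_word in QUESTION_OPENERS


for heading in ["What is an MCP server?", "Demystifying the protocol layer"]:
    print(is_question_heading(heading), heading)
```

It only flags phrasing. Whether the question matches what people actually type still has to come from the harvested queries.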

2. The answer in the first ≤60 words under the heading

A model lifting an answer wants a clean, self-contained unit right under the question. Put the direct answer first, in one or two sentences, then elaborate. Every page on this site does this. The FAQ hub is twenty of them in a row, each capped at 60 words on purpose.
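The 60-word cap is easy to check mechanically. A minimal sketch, assuming the section body is plain text with blank lines between paragraphs (both function names are illustrative):

```python
import re


def first_answer_word_count(section_body: str) -> int:
    """Count whitespace-separated words in the first paragraph under a heading."""
    paragraphs = [p for p in re.split(r"\n\s*\n", section_body) if p.strip()]
    return len(paragraphs[0].split()) if paragraphs else 0


def answers_directly(section_body: str, limit: int = 60) -> bool:
    """True if the section opens with a non-empty paragraph of at most `limit` words."""
    count = first_answer_word_count(section_body)
    return 0 < count <= limit


body = (
    "GEO is the practice of structuring content so LLMs cite it as a source.\n\n"
    "Longer elaboration follows here."
)
print(first_answer_word_count(body), answers_directly(body))
```

Word count says nothing about whether the paragraph actually answers the heading, so this is a guardrail rather than a quality check.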

3. FAQPage JSON-LD on the page

Structured data tells the machine "these are question-answer pairs, here is exactly which span answers which question." It removes ambiguity. Every article here — including this one — emits FAQPage and Article schema automatically. You don't need a plugin; you need the JSON-LD in the head.
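For reference, a FAQPage block is a list of Question items, each with an acceptedAnswer. The sketch below builds one from question-answer pairs using only the standard library. It assumes nothing about how this site generates its own schema, and `faq_jsonld` is an illustrative name.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a schema.org FAQPage script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so answer text can never close the script tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(faq_jsonld([("What is GEO?", "GEO is the practice of structuring content so LLMs cite it.")]))
```

The answer text in the markup should match the visible answer on the page. The markup describes the page; it does not replace it.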

4. An llms.txt file at your root

llms.txt is the emerging convention for telling LLM crawlers what your canonical pages are and how to interpret them — a robots.txt for the synthesis era. It's still under-adopted enough to be a genuine edge. Mine lives at /llms.txt and it's written with editorial intent, not generated boilerplate.
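As a concrete picture, here is a sketch that renders a small llms.txt. It assumes the layout from the public llms.txt proposal (a title line, a one-line summary as a blockquote, then sections of annotated links). The site name, URLs, and notes are placeholders, not this site's actual file.

```python
def render_llms_txt(
    site: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a Markdown llms.txt: title, blockquote summary, sections of links."""
    lines = [f"# {site}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
example = render_llms_txt(
    "Example Site",
    "Essays and prototypes about building with LLMs.",
    {"Articles": [("What Is GEO?", "https://example.com/geo", "Canonical definition and field guide.")]},
)
print(example)
```

The notes after each link are where the editorial intent goes: they say which page is the canonical answer to which question.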

5. Real, machine-readable authorship

Models increasingly weight who said it. A Person schema with a real name, a consistent role, and sameAs links to LinkedIn, X, and GitHub turns an anonymous page into an attributable source. Anonymous content gets paraphrased; attributed content gets cited by name. The attribution is the difference between being a fact and being a footnote.
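A Person block along these lines is short. The sketch below uses placeholder values throughout (name, role, and profile URLs are invented for illustration). `jobTitle`, `url`, and `sameAs` are standard schema.org Person properties.

```python
import json


def person_jsonld(name: str, job_title: str, url: str, same_as: list[str]) -> str:
    """Render a schema.org Person as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "url": url,
        # sameAs links the page's author to the same identity on other sites.
        "sameAs": list(same_as),
    }
    return json.dumps(data, ensure_ascii=False, indent=2)


# Placeholder identity for illustration only.
print(
    person_jsonld(
        "Jane Doe",
        "AI Builder",
        "https://example.com/about",
        ["https://www.linkedin.com/in/example", "https://github.com/example"],
    )
)
```

Consistency matters more than volume here: the same name and role should appear in the byline, the schema, and the linked profiles.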

How to measure it (and why you must)

The trap in GEO is that you can't see it in Google Analytics. A citation in ChatGPT doesn't always send a click, so your dashboard says nothing changed while your actual reach quietly compounds.

So you measure it directly. I run a weekly loop: take my target queries, fire them at ChatGPT, Claude, Perplexity, and Gemini, and log whether the site got cited, paraphrased-without-citation, or ignored. Three states, tracked over time, per query. That log is the only honest scoreboard. It's also how I found out which of the five levers above actually mattered — the FAQPage schema and the exact-question headings moved the number most.
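The three-state log can be as simple as a CSV. The sketch below assumes you have already collected each answer's text and cited URLs, by hand or through each vendor's own interface, which is not shown. The paraphrase check is a crude phrase match, and all function names and the column layout are illustrative.

```python
import csv
from collections import Counter
from datetime import date
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

STATES = ("cited", "paraphrased", "ignored")


def classify(answer_text: str, cited_urls: list[str], domain: str, key_phrases: list[str]) -> str:
    """Return 'cited', 'paraphrased', or 'ignored' for one answer."""
    for url in cited_urls:
        host = urlparse(url).netloc.lower()
        if host == domain or host.endswith("." + domain):
            return "cited"
    lowered = answer_text.lower()
    # Heuristic: a distinctive phrase of yours shows up without a citation.
    if any(phrase.lower() in lowered for phrase in key_phrases):
        return "paraphrased"
    return "ignored"


def log_result(path: Path, query: str, engine: str, state: str, day: Optional[date] = None) -> None:
    """Append one observation to the CSV log, writing a header for a new file."""
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["date", "engine", "query", "state"])
        writer.writerow([(day or date.today()).isoformat(), engine, query, state])


def citation_rate(path: Path) -> float:
    """Share of logged observations in the 'cited' state."""
    with path.open(newline="", encoding="utf-8") as handle:
        counts = Counter(row["state"] for row in csv.DictReader(handle))
    total = sum(counts.values())
    return counts["cited"] / total if total else 0.0


print(classify("See the guide.", ["https://example.com/geo"], "example.com", []))
```

Calling `log_result` once per query and engine each week, then comparing `citation_rate` across weeks, gives the trend line. Filtering the CSV by query shows which pages moved.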

I eventually wrapped the measurement instinct into a tool: geo-audit, a zero-dependency Python CLI that scores any URL for LLM visibility. It checks for llms.txt, FAQPage schema, heading phrasing, answer density, author markup, and freshness, then prints a score and the specific gaps. I built it because I was doing the check by hand every week and got tired of it. That's usually where my prototypes come from.
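To show what one such check can look like with the standard library alone, here is a sketch that detects FAQPage JSON-LD in a page's HTML. It is an illustration of the idea, not geo-audit's code: the class and function names are hypothetical, and it ignores nested `@graph` structures.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def has_faqpage(html: str) -> bool:
    """True if any top-level JSON-LD item on the page has @type FAQPage."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # Skip malformed blocks instead of failing the whole check.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "FAQPage" in types:
                return True
    return False


page = '<head><script type="application/ld+json">{"@type": "FAQPage", "mainEntity": []}</script></head>'
print(has_faqpage(page))
```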

The honest caveats

GEO is early, which means some of what I'm telling you will age. The conventions aren't settled. llms.txt could be superseded. Models change how they retrieve and attribute with no notice. Anyone selling you a "GEO certification" is selling you confidence in a moving target.

There's also a darker edge: GEO can be gamed the way early SEO was gamed, and the models will eventually clamp down on the gaming the way Google did. So don't build on tricks. Build on the part that's durable: being genuinely the clearest, most citable answer to a real question, written by a real person who knows the thing. That survives every algorithm change because it's not an exploit; it's just the goal the algorithm is trying to reach.

The unfair part of the current window is simply that almost nobody is doing the five things deliberately. That's not a permanent edge. It's a now edge. Take it now.

Frequently Asked Questions

What is Generative Engine Optimization (GEO)?

GEO is the practice of structuring content so large language models — ChatGPT, Claude, Perplexity, Gemini — cite it as a source in their generated answers. The target is inclusion in the answer, not a ranking position on a results page. The unit of success is a citation, not a click.

What's the difference between GEO and SEO?

SEO optimizes for ranking on a search results page; GEO optimizes for inclusion in an LLM-generated answer. They overlap on content quality and structured data. They diverge on what's rewarded: SEO values backlinks and CTR, GEO values self-contained facts, exact-question phrasing, and citation-worthy structure.

Does GEO replace SEO?

No — they run in parallel. Most good SEO work gets you partway to GEO because both reward quality and structure. But GEO adds requirements SEO doesn't have: llms.txt, FAQPage JSON-LD, ≤60-word answers under literal-question headings, and machine-readable authorship. Do both; they compound.

How do I make a website that LLMs cite?

Five things compound: exact-question phrasing as headings, ≤60-word direct answers, FAQPage JSON-LD on every page, an llms.txt file at the root, and clear author attribution via Person schema. They're multiplicative: missing any one drops citation rate by roughly a third.

What is an llms.txt file and does it help?

llms.txt is the emerging convention for telling LLM crawlers your canonical pages and how to interpret them — a robots.txt for the synthesis era. It's a small Markdown file at your domain root with high leverage, and it's still under-adopted enough to be a relative edge. This site has one at /llms.txt.

How do I measure GEO if it doesn't show in analytics?

Measure it directly. Take your target queries, run them through ChatGPT, Claude, Perplexity, and Gemini on a schedule, and log three states per query: cited, paraphrased-without-citation, or ignored. Track the log over time. That's the only honest GEO scoreboard, since citations don't always produce a measurable click.

Which matters more for GEO — schema or content?

Both, but in order: content first, schema second. Schema tells a model which span answers a question; it can't make a weak answer worth quoting. Get the answer genuinely clear and self-contained, then add FAQPage JSON-LD so the model can lift it without ambiguity. Schema amplifies good content; it can't rescue bad content.

Can GEO be gamed like early SEO?

Short-term, yes — and the models will clamp down the way Google clamped down on keyword stuffing. So don't build on tricks. Build on the durable part: being the genuinely clearest, most attributable answer to a real question. That survives algorithm changes because it's the goal the algorithm is chasing, not an exploit of it.

What tools help with GEO?

For measurement, I built geo-audit — a zero-dependency Python CLI that scores any URL for LLM visibility (checks llms.txt, FAQPage schema, heading phrasing, answer density, author markup, freshness). For authoring, a repeatable FAQ-architecture process matters more than any single tool: harvest real query phrasing, answer in ≤60 words, attribute clearly.

Is GEO worth doing now or is it too early?

Now. It's early enough that conventions will shift — but that's exactly the edge. Almost nobody is doing the five levers deliberately yet, so the cost of being early is low and the relative payoff is high. Build on the durable parts (clarity, structure, attribution) and you won't have to redo it when the conventions settle.



Written from inside the work, in Tel Aviv. The journal updates most weekdays. Drafted with Aria, the in-house SEO/GEO agent; argued with on LinkedIn.
