Before a search engine ranks a business, it has to identify one. That sounds like a formality, and for large, well-documented brands it is. For everyone else it is a live question with commercial consequences: Google decides who you are before it decides where you belong, and every ranking signal, review, mention and backlink is only worth what the engine can confidently attribute to the right entity. When the identification is shaky, the signals scatter — and the brand underperforms in ways no keyword audit will ever surface.
Entity SEO is the work of making that identification unmistakable. It is less glamorous than content and less familiar than links, but it sits underneath both, and in AI search it stops being optional altogether.
An entity is a thing, not a string
The distinction comes from Google itself. When it introduced the Knowledge Graph, Google framed the shift as “things, not strings”: from matching the letters in a query to recognising the real-world objects behind them. The launch example was “Taj Mahal”: as a string, two words; as things, a monument, a Grammy-winning musician, a casino and a restaurant, each a separate entry with its own facts. At launch in 2012 the Knowledge Graph already held more than 500 million objects and more than 3.5 billion facts and relationships between them, and it has been the substrate of Google’s understanding ever since.
So an entity is a distinct thing — an organisation, a person, a place, a product — that exists independently of any particular words used to name it. It carries facts (what it is, where it operates, who runs it, what it sells) and relationships (to its founders, its locations, its industry, its parent company). A keyword is a string. An entity is the thing the string refers to, and modern search engines rank things.
The practical consequence: your brand is not the name on your logo. It is the entry — explicit or inferred — that the engines hold for you, assembled from everything they have read. Entity SEO is the discipline of making sure that entry exists, is unambiguous, and is correct.
Recognition precedes ranking
Consider a firm called Apex Legal. There are several. When a signal arrives attached to that string — a review, a citation in an article, a directory listing, a link — the engine has to decide which Apex Legal earned it. If your entity is well defined, the attribution is clean and the signal compounds. If it is not, three failure modes open up: the signal is credited to a competitor with the same name, it is discarded as unresolvable, or it seeds a second, fragmentary version of you that dilutes the first.
This is why entity recognition precedes ranking rather than accompanying it. Relevance and authority are judgements about an entity; they cannot be made about a string the engine has not resolved. A business with strong content and genuine authority but a poorly defined entity is pouring signals into a leaking bucket.
For AI engines the dependency is sharper still. A ranking system that misattributes a signal loses some accuracy. A generative system that misattributes an identity produces a wrong answer with your name in it — the wrong services, the wrong city, a merged description of you and your namesake. And a model that cannot resolve which “you” the retrieved passages describe tends to do the cautious thing: leave you out of the answer entirely. In generated answers, ambiguity does not cost position — it costs presence.
How an entity gets assembled
Engines build their picture of an organisation from two layers, and you can work on both.
The declaration — what you state about yourself. This is the layer fully in your control. It starts with Organization schema markup, which exists for exactly this purpose: Google’s structured data documentation describes it as helping Google “better understand your organization’s administrative details and disambiguate your organization in search results”. Name, legal name, address, telephone, logo, founder, the services offered — stated in machine-readable terms that leave nothing to inference. Alongside the markup sits the about page, which most businesses treat as a branding exercise and engines treat as a primary source. It should read like a fact sheet wearing good prose: legal name, category, locations, specialisations, key people, history — as plain declarative sentences. The about page is written for the machine as much as the buyer, and it should state facts, not positioning. The broader machine-readable layer — schema, llms.txt, structured entity facts — is covered in its own article.
The corroboration — what everyone else says about you. Engines do not take a brand’s word for its own identity; they cross-check the declaration against independent sources. For an Australian SME the realistic corroboration set is concrete: the company register and ABN lookup, Google Business Profile, LinkedIn, the relevant industry association, established directories, supplier and partner pages, press coverage with a named author, conference and event listings. Wikipedia and Wikidata are generally treated as the strongest identity anchors on the web, but most small and mid-sized businesses do not qualify for a Wikipedia article and should not manufacture one — a thin, promotional page is a liability, not an anchor. The tier below is achievable by almost anyone, and it is sufficient: what matters is that multiple sources you do not control describe the same organisation in the same terms.
The two layers verify each other. A declaration nothing corroborates reads as a claim; corroboration with no clear declaration reads as noise. An entity is what emerges when they agree.
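As a rough illustration of the declaration layer, the sketch below builds Organization markup as JSON-LD from a single set of canonical facts. The property names (name, legalName, address, telephone, logo, founder, sameAs) are schema.org's; every value, the CANONICAL_FACTS dictionary and the build_organization_jsonld helper are hypothetical placeholders, not a recommended final form.

```python
import json

# Hypothetical canonical facts for an illustrative firm. Every value here is a
# placeholder; substitute the facts your organisation has actually agreed on.
CANONICAL_FACTS = {
    "name": "Apex Legal",
    "legal_name": "Apex Legal Pty Ltd",
    "url": "https://www.example.com/",
    "logo": "https://www.example.com/logo.png",
    "telephone": "+61 2 5550 0100",
    "street": "12 Example Street",
    "locality": "Sydney",
    "region": "NSW",
    "postcode": "2000",
    "country": "AU",
    "founder": "Jane Citizen",
    "same_as": [
        "https://www.linkedin.com/company/example",
        "https://directory.example.org/apex-legal",
    ],
}


def build_organization_jsonld(facts):
    """Return schema.org Organization markup as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": facts["name"],
        "legalName": facts["legal_name"],
        "url": facts["url"],
        "logo": facts["logo"],
        "telephone": facts["telephone"],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": facts["street"],
            "addressLocality": facts["locality"],
            "addressRegion": facts["region"],
            "postalCode": facts["postcode"],
            "addressCountry": facts["country"],
        },
        "founder": {"@type": "Person", "name": facts["founder"]},
        # Only list profiles you are prepared to keep accurate.
        "sameAs": facts["same_as"],
    }
    return json.dumps(doc, indent=2)


print(build_organization_jsonld(CANONICAL_FACTS))
```

The resulting string would typically be placed on the site inside a script element of type application/ld+json. Generating it from one dictionary is a design choice made for this sketch: the markup then cannot drift from whatever the business has recorded as its canonical facts.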
The sameAs web — consistency beats volume
The mechanism that stitches the layers together is small and underused. The sameAs property is defined by schema.org as the “URL of a reference Web page that unambiguously indicates the item’s identity”, for example the item’s Wikipedia page, Wikidata entry, or official website. In practice, it is your Organization schema saying: the entity on this website is the same entity as this LinkedIn page, this Google Business Profile, this directory listing, this registry record. It is a hand-drawn map of your identity across the web, handed directly to the machines.
The map only works if the territory agrees with it. Every profile you point to that carries the same name, the same description and the same facts strengthens the entity; every profile that carries a stale address, a superseded description or an old trading name introduces a contradiction at a location you personally flagged as authoritative. This is why consistency beats volume in entity building: ten sources that agree outweigh a hundred that almost agree. The instinct to be listed everywhere is the wrong instinct. The right one is to be described identically everywhere you are listed, and to prune or correct the places you cannot keep true.
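One way to make "described identically" checkable is to record what each profile currently says and compare it with the canonical facts. The sketch below assumes those records have been collected by hand; the profile names, field names and values are hypothetical, and the normalisation is deliberately crude (case, punctuation and spacing only).

```python
import re


def normalise(value):
    """Lowercase, drop punctuation and collapse whitespace so that purely
    cosmetic differences do not count as contradictions."""
    value = re.sub(r"[^\w\s]", " ", value.lower())
    return " ".join(value.split())


def find_contradictions(canonical, profiles):
    """Return (source, field, found_value) for every profile field that
    disagrees with the canonical facts. Fields a profile does not carry
    are skipped: a missing fact is a gap, not a contradiction."""
    issues = []
    for source, record in profiles.items():
        for field, expected in canonical.items():
            found = record.get(field)
            if found is None:
                continue
            if normalise(found) != normalise(expected):
                issues.append((source, field, found))
    return issues


# Hypothetical data, entered by hand from each listing.
canonical = {
    "name": "Apex Legal",
    "address": "12 Example St, Sydney NSW 2000",
}
profiles = {
    "linkedin": {"name": "Apex Legal", "address": "12 Example St, Sydney NSW 2000"},
    "directory": {"name": "Apex Legal Services", "address": "12 Example St Sydney NSW 2000"},
    "registry": {"name": "APEX LEGAL"},
}

for source, field, found in find_contradictions(canonical, profiles):
    print(f"{source}: {field} reads {found!r}")
# prints: directory: name reads 'Apex Legal Services'
```

The directory's address differs only by a comma and the registry's name only by case, so neither is flagged; the extra word in the directory's name is. Where to draw that line is a judgement call, and a stricter comparison may suit fields such as the legal name.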
Entity gaps — the silent killer
Most entity problems are inherited, not created, and they share a defining trait: nothing visibly breaks. No error appears in any console. Rankings soften, citations fail to arrive, the knowledge panel shows someone else or nothing at all — and every conventional audit comes back clean.
The recurring gaps are familiar to anyone who has looked for them. A rebrand where the old name survives in directories, press and profiles, so the engines hold two half-entities instead of one whole one. A merger that never consolidated its identities. An alias problem — legal name, trading name and colloquial abbreviation used interchangeably, each seeding its own partial record. A name collision with a larger business that absorbs every ambiguous mention. And the most common gap of all: a thin about page and absent schema, which together leave the engines to infer identity from scattered fragments — an inference they make cautiously, incompletely, or in favour of a namesake.
An entity gap is silent: nothing breaks, no error shows, and the brand simply underperforms everywhere at once. That silence is why the gaps persist for years, and why finding one is often among the highest-leverage discoveries in an AI search assessment.
The sequence for establishing an entity
The work orders itself naturally, because each step feeds the next.
Decide the canonical facts first. One name, written one way. One category noun the business will commit to. One two-sentence description. The definitive list of locations, services and named people. This is a leadership decision, not a technical one, and skipping it guarantees the later steps encode the ambiguity instead of resolving it.
Publish the declaration. Rewrite the about page as declarative fact. Implement Organization schema carrying the canonical facts, with sameAs links to every profile you intend to stand behind. Make the visible page and the markup say the same thing — engines treat divergence between them as a discrepancy, not an oversight.
Reconcile the corroboration. Inventory every place the organisation is described (registries, profiles, directories, partner pages, old coverage) and bring each into line with the canonical facts or retire it. If a rebrand or merger is in the history, this step is the bulk of the work, and it demands coordination rather than budget.
Earn the independent layer. Association memberships, press with named authors, event listings, published profiles for key people. This layer compounds slowly, which is the reason to start it immediately rather than last.
Verify from the outside. Search the brand name. Check what the knowledge panel shows, if one exists. Ask the AI engines directly who the business is, what it does and where it operates, and compare their answers to the canonical facts. The method for doing this systematically is set out in the self-audit guide — an entity that is well established returns boring, correct, consistent answers, and boring is the goal.
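The last step can be roughed out in code. The sketch below assumes you have asked each AI engine who the business is and pasted the answers in by hand; it then checks whether each canonical fact appears in each answer. The engine labels, facts and answer text are all hypothetical, and a plain substring match is a simplification that will miss paraphrases.

```python
def check_answer(answer, facts):
    """Map each fact label to True if its value appears in the answer text.
    Matching is case-insensitive and purely literal."""
    text = answer.casefold()
    return {label: value.casefold() in text for label, value in facts.items()}


# Hypothetical canonical facts and hand-collected answers.
facts = {
    "name": "Apex Legal",
    "city": "Sydney",
    "category": "commercial law firm",
}
answers = {
    "engine_a": "Apex Legal is a commercial law firm based in Sydney.",
    "engine_b": "Apex Legal is a conveyancing practice in Melbourne.",
}

for engine, answer in answers.items():
    result = check_answer(answer, facts)
    missing = [label for label, present in result.items() if not present]
    print(engine, "OK" if not missing else "missing or wrong: " + ", ".join(missing))
# prints:
# engine_a OK
# engine_b missing or wrong: city, category
```

A flagged fact is a prompt to read the answer, not a verdict. The engine may have described a namesake, merged two businesses, or simply phrased the fact differently.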
Why this decides AI citation
Entity work is one factor in a larger frame. The four-factor citation framework (content structure, entity clarity, authority signals, consistency of facts) treats the entity as the second factor, but it is better understood as the one the other three rest on. Structure determines whether your passages can be quoted; the entity determines whether the quote is attributed to you. Authority accrues to an entity, and lands nowhere if the entity is ambiguous. Consistency is largely entity consistency: the same facts about the same thing, everywhere the engines look.
Which is why entity gaps surface so often as the root cause behind a brand that publishes well and still is not cited. The content was never the problem; the engines were never sure who was speaking. Establishing the entity, closing the gaps and verifying how the engines describe the brand is foundational work inside an AI search optimisation engagement — but the first step costs nothing: ask the engines who you are, and see whether the answer is the one you would have given.