
6. Coherence Signals


Boundary with Structural Formatting: Structural Formatting asks “is each surface individually well-formed?” — valid JSON-LD, semantic HTML, scoped schema. Coherence Signals asks “do the surfaces agree with each other?” — same address in HTML and Markdown, same numbers in llms.txt and the homepage profile, no two Organization entities with conflicting fields. A page can pass Structural Formatting and still fail Coherence: each block is valid, but together they tell two stories.

Coherence Signals measure whether your content tells the same story across every surface that an AI agent can read. A modern LLMO-optimized site exposes facts through many channels:

  • HTML page body (visible to humans + AI crawlers)
  • JSON-LD structured data
  • llms.txt and llms-full.txt
  • /ai/*.md and URL.md endpoints (e.g. /company.md)
  • OG/Twitter meta tags
  • Sitemap, robots.txt, hreflang declarations

When the same fact (a number, an address, a service catalog, a publication date) appears differently in two of these surfaces, an AI system that draws from both gets confused. The model may pick whichever value it weighs more heavily, surface a stale figure, or refuse to cite at all because the conflict signals low quality.

Coherence is the LLMO discipline of guaranteeing a single source of truth for every fact, across every surface.

Citation accuracy depends on convergent evidence. When a model retrieves your content from multiple paths and the values agree, confidence rises and the citation is shipped to the user. When the values disagree, several failure modes appear:

  • Lower citation rate — the model defers to a source where internal evidence is consistent.
  • Wrong fact cited — if the AI picks the older variant from /ai/founder.md, your homepage’s updated number never reaches the user.
  • Hallucination amplification — when surfaces conflict, the model is more likely to interpolate a “compromise” answer that matches neither.
  • Authority erosion — savvy AI re-rankers (Perplexity, AI Overviews) compare cross-references; conflicting self-references read as low quality.

A 2024 self-audit of Propel-Lab found that the same author profile claimed both 4 books / 39,000+ Qiita PV (in /ai/founder.md, llms-full.txt) and 14 books / 80,000+ Qiita PV (in the homepage profile component) — an active contradiction that had been served to AI crawlers for months.

1. Designate a single source for each fact


For every numeric or factual claim, name one file as the canonical source. Every other surface imports or quotes it.

| Fact | Canonical source | Consumers |
| --- | --- | --- |
| Book count, PV stats | src/data/profile.ts | Profile component, /ai/founder.md, llms-full.txt, JSON-LD |
| Service catalog | src/data/services.ts | /products/, JSON-LD Service[], /ai/services.md, llms.txt |
| Address, founding date | src/data/company.ts | Footer, /company.md, JSON-LD Organization, llms-full.txt |
| FAQ items | src/lib/faq-schema.ts | FAQ component, JSON-LD FAQPage, /faq.md |

The pattern: each fact lives in a content collection or typed data module, and both the templates and the static endpoints pull from it.
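As a minimal sketch of that pattern, the module below renders one typed object to two surfaces. The `Company` shape, the placeholder values, and the helper names `toOrganizationJsonLd` and `toCompanyMarkdown` are assumptions made for illustration, not the framework's actual code.

```typescript
// Hypothetical canonical module; field names and values are placeholders.
interface Company {
  name: string;
  url: string;
  foundingDate: string; // ISO 8601 date
  streetAddress: string;
}

const company: Company = {
  name: 'Example Inc.',
  url: 'https://example.com/',
  foundingDate: '2020-01-01',
  streetAddress: '1-2-3 Example Street, Tokyo',
};

// JSON-LD surface: built from the canonical object, never typed by hand.
function toOrganizationJsonLd(c: Company): string {
  return JSON.stringify({
    '@context': 'https://schema.org',
    '@type': 'Organization',
    '@id': `${c.url}#org`,
    name: c.name,
    url: c.url,
    foundingDate: c.foundingDate,
    address: { '@type': 'PostalAddress', streetAddress: c.streetAddress },
  });
}

// Markdown surface (for example /company.md): same object, different rendering.
function toCompanyMarkdown(c: Company): string {
  return [
    `# ${c.name}`,
    '',
    `- Founded: ${c.foundingDate}`,
    `- Address: ${c.streetAddress}`,
    `- URL: ${c.url}`,
  ].join('\n');
}
```

Changing `streetAddress` in one place changes both outputs, which is the property the table above is meant to guarantee.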

2. Generate AI surfaces from the same source as HTML


Don’t hand-write llms.txt or /ai/*.md if their content already exists in typed data:

src/pages/products.md.ts
import type { APIRoute } from 'astro';
import { services } from '../data/services';

// Static endpoint: renders /products.md from the same data the HTML page uses.
export const GET: APIRoute = async () => {
  const markdown = services
    .map((s) => `## ${s.name}\n\n${s.summary}\n\n- Target: ${s.target}`)
    .join('\n\n---\n\n');

  return new Response(markdown, {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  });
};

The HTML view, the JSON-LD Service[], and /products.md all originate from services, so they cannot drift apart unless someone bypasses the module and hand-edits one of the surfaces.

3. Treat URL canonicalization as a coherence concern


https://www.example.com/ and https://example.com/ are two different strings to a string-matching crawler, even when they serve the same page. Pick one canonical host, then enforce it:

  • <link rel="canonical"> on every page
  • og:url, JSON-LD url, sitemap entries — same host
  • /ai/*.md, llms.txt references — same host
  • Internal links — relative or canonical-absolute, never the alternate host

A common bug is forgetting /ai/*.md files when migrating from www. to apex (or vice versa). The rest of the site is canonical, and the Markdown surfaces silently leak the wrong host to AI.
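One way to catch that leak is to scan the text of each Markdown surface for URLs on the alternate host. The sketch below is an illustration only: `findAlternateHostUrls` is a hypothetical helper, and it assumes the only alternate form is the apex/www pair of the canonical host.

```typescript
// Returns every absolute URL in `text` that points at the site's own domain
// but uses the non-canonical host (www vs. apex). Third-party URLs are ignored.
function findAlternateHostUrls(text: string, canonicalHost: string): string[] {
  const apex = canonicalHost.replace(/^www\./, '');
  const siteHosts = [apex, `www.${apex}`];
  const urls = text.match(/https?:\/\/[^\s<>"')\]]+/g) ?? [];
  return urls.filter((url) => {
    const match = /^https?:\/\/([^\/:?#]+)/i.exec(url);
    if (match === null) return false;
    const host = match[1].toLowerCase();
    return siteHosts.includes(host) && host !== canonicalHost;
  });
}

// Usage: the caller reads the generated file and fails CI on any hit.
const sample = 'Profile: https://www.example.com/company and https://example.com/ai/founder.md';
const leaks = findAlternateHostUrls(sample, 'example.com');
```

Here `leaks` contains only the `www.` URL, because the apex URL already matches the canonical host.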

4. Treat trailing-slash policy as a coherence concern


If your host normalizes /blog/post → /blog/post/ with a 301, every internal link should already include the slash. Mixed forms produce:

  • Wasted crawl budget on redirects
  • Conflicting canonical signals during the redirect window
  • Broken hreflang (some declared with slash, some without)

Pick a policy at the framework level (Astro trailingSlash: 'always' or 'never') and grep your repo to ensure no offenders remain.
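The grep can also be expressed as a small check over the internal paths collected from templates and Markdown surfaces. This is a sketch under assumptions: `findSlashOffenders` is a hypothetical helper, and it treats any path whose last segment contains a dot (such as /company.md or /llms.txt) as a file that is exempt from the policy.

```typescript
type SlashPolicy = 'always' | 'never';

// Returns the internal paths that violate the trailing-slash policy.
// The root "/" and file-like paths (/company.md, /llms.txt) are skipped.
function findSlashOffenders(paths: string[], policy: SlashPolicy): string[] {
  return paths.filter((raw) => {
    const path = raw.split(/[?#]/)[0]; // ignore query string and fragment
    if (path === '/' || path === '') return false;
    const lastSegment = path.replace(/\/$/, '').split('/').pop() ?? '';
    if (lastSegment.includes('.')) return false;
    const hasSlash = path.endsWith('/');
    return policy === 'always' ? !hasSlash : hasSlash;
  });
}

const internalLinks = ['/blog/post', '/blog/post/', '/company.md', '/', '/products?x=1'];
const offendersAlways = findSlashOffenders(internalLinks, 'always');
const offendersNever = findSlashOffenders(internalLinks, 'never');
```

With policy 'always' the offenders are /blog/post and /products?x=1; with 'never' the only offender is /blog/post/.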

Add a CI step that greps each surface for the same numeric or string claim and fails when they disagree. The simplest form asserts that a known-stale value no longer appears anywhere:

# Fails if any surface still carries the old book count.
# The non-digit guard keeps "4 books" from matching the current "14 books".
! grep -rnE '(^|[^0-9])(4 books|4冊)|Kindle著者: 4([^0-9]|$)' public/ src/data/ src/content/

Even simpler: a JSON-LD validator that parses both inline <script> and any standalone .jsonld file and asserts they agree on shared @id values.
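The core of such a validator might look like the sketch below. It assumes the caller has already extracted and parsed every JSON-LD block (inline and standalone) into plain objects; `findIdConflicts` and the `Conflict` shape are hypothetical names, and values are compared by their JSON serialization, which is sensitive to key order in nested objects.

```typescript
type JsonLdEntity = Record<string, unknown>;

interface Conflict {
  id: string;
  field: string;
  values: string[]; // the distinct serialized values seen for this field
}

// Groups entities by @id and reports every field on which two entities
// sharing an @id disagree. Bare references ({ "@id": ... }) add no fields.
function findIdConflicts(entities: JsonLdEntity[]): Conflict[] {
  const seen = new Map<string, Map<string, Set<string>>>();
  for (const entity of entities) {
    const id = entity['@id'];
    if (typeof id !== 'string') continue;
    const fields = seen.get(id) ?? new Map<string, Set<string>>();
    seen.set(id, fields);
    for (const [field, value] of Object.entries(entity)) {
      if (field === '@id') continue;
      const values = fields.get(field) ?? new Set<string>();
      values.add(JSON.stringify(value));
      fields.set(field, values);
    }
  }
  const conflicts: Conflict[] = [];
  for (const [id, fields] of seen) {
    for (const [field, values] of fields) {
      if (values.size > 1) conflicts.push({ id, field, values: [...values] });
    }
  }
  return conflicts;
}
```

A CI script would exit non-zero whenever the returned array is non-empty and print the `id` and `field` of each conflict.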

A version number is a fact in the LLMO sense — a claim about your site that an AI can quote. If package.json says 1.2.0, src/data/version.ts says 1.1.0, the changelog page says v1.2.0 in English but v1.1.0 in Japanese, and the latest git tag is v1.1.0, the site is contradicting itself across five surfaces about the same fact.

This is not theoretical. The framework you are reading shipped exactly that drift in v1.2.0; the self-audit case study records what happened.

The pattern that prevents it:

  1. Generate as many version surfaces as possible from one source. A bump script that updates package.json + a typed data module + the changelog markdown together is mandatory infrastructure for any framework that claims coherence as a value.
  2. Make the version visible at run time, not just in metadata. A footer that displays v{VERSION} reading from the typed data module turns build-time drift into immediate user-facing feedback. A maintainer running npm run build will see the discrepancy on every page.
  3. Gate the release on cross-checks. A CI step that reads the version from package.json and greps for it in CHANGELOG.md, src/data/version.ts, and the changelog page should exit non-zero if any of them disagrees.
  4. Run a read-only AI second-pass review before tagging. Cost is a few cents in API tokens; benefit is catching the irony before users do.

The release process is the framework’s content surface speaking to AI in real time. Treat it as one.
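The cross-check in step 3 can be sketched as a pure function that the release script feeds with file contents. `findVersionDrift` and `VersionSurface` are hypothetical names, and the sample contents below are invented for illustration; the caller is assumed to read the real files.

```typescript
interface VersionSurface {
  name: string;    // label used in the report, e.g. "CHANGELOG.md"
  content: string; // file contents, read by the caller
}

// Returns the names of surfaces that do not mention `expected` as a whole
// version token, so "1.2.0" is not satisfied by "11.2.0" or "1.2.01".
function findVersionDrift(expected: string, surfaces: VersionSurface[]): string[] {
  const escaped = expected.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(`(^|[^0-9.])v?${escaped}($|[^0-9])`, 'm');
  return surfaces.filter((s) => !pattern.test(s.content)).map((s) => s.name);
}

const drift = findVersionDrift('1.2.0', [
  { name: 'package.json', content: '{ "version": "1.2.0" }' },
  { name: 'src/data/version.ts', content: "export const VERSION = '1.1.0';" },
  { name: 'CHANGELOG.md', content: '## v1.2.0\n\n- Added things' },
]);
```

In this sample `drift` names only src/data/version.ts; a release gate would print the list and exit non-zero when it is non-empty.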

7. Avoid duplicate JSON-LD entities for the same @id


The most common silent failure: the layout emits Organization with one address, and a per-page snippet emits another Organization with a different address. Both make it to HTML. The crawler parses both and has no way to tell which address is authoritative, so the page reads as less trustworthy.

Fix: assign each entity an @id at the framework level (https://example.com/#org, #founder, #website) and reference by @id everywhere else. Any duplicate becomes obvious in code review.
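A minimal sketch of that fix follows. The `ids` map, the `ref` helper, and the sample Article are assumptions for illustration; the point is that the full Organization is defined once and every other entity points at it by @id instead of restating its fields.

```typescript
const SITE = 'https://example.com';

// Stable identifiers, defined once at the framework level.
const ids = {
  org: `${SITE}/#org`,
  founder: `${SITE}/#founder`,
  website: `${SITE}/#website`,
} as const;

// Full definition: emitted by the layout only.
const organization = {
  '@type': 'Organization',
  '@id': ids.org,
  name: 'Example Inc.',
  address: { '@type': 'PostalAddress', streetAddress: '1-2-3 Example Street' },
};

// Reference helper: page-level snippets point at an entity without restating it.
function ref(id: string): { '@id': string } {
  return { '@id': id };
}

// A page-level entity that links to the organization and founder by reference.
const article = {
  '@type': 'Article',
  '@id': `${SITE}/blog/post/#article`,
  headline: 'Hypothetical post',
  publisher: ref(ids.org),
  author: ref(ids.founder),
};

const graph = { '@context': 'https://schema.org', '@graph': [organization, article] };
```

Because the address appears in exactly one object literal, a second Organization with its own address has to be written out in full, which is what makes it stand out in code review.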

❌ Drift across surfaces:

/ai/founder.md
- Publishing: Kindle author of 4 books
- Technical Writing: 39,000+ PV on Qiita

// src/components/Profile.astro (rendered to homepage)
<p>Kindle 14冊・Qiita 80,000+ PV</p>

// JSON-LD on /
{ "@type": "Person", "name": "Ken Imoto" /* no current numbers */ }

Three surfaces, three different stories. An AI quoting /ai/founder.md reports stale numbers; an AI quoting the HTML reports current numbers; the JSON-LD provides no help in resolving the conflict.

✅ Single source:

// src/data/profile.ts (canonical)
export const profile = {
  highlights: [
    'Kindle author: 14 books',
    'Qiita: 80,000+ PV',
  ],
};

<!-- Profile component -->
{profile.highlights.map((h) => <li>{h}</li>)}

// src/pages/founder.md.ts
import type { APIRoute } from 'astro';
import { profile } from '../data/profile';

export const GET: APIRoute = async () =>
  new Response(
    `# Founder\n\n${profile.highlights.map((h) => `- ${h}`).join('\n')}`,
    { headers: { 'Content-Type': 'text/markdown; charset=utf-8' } },
  );

One value lives in one place. The HTML view, the AI Markdown endpoint, and the JSON-LD all evolve together.

  • Every factual claim (numbers, addresses, dates, catalogs) has exactly one canonical source file
  • AI-only surfaces (llms.txt, /ai/*.md, URL.md endpoints) are generated from the same data as the HTML, not hand-maintained in parallel
  • Canonical host is consistent across <link rel="canonical">, og:url, JSON-LD, sitemap, and Markdown surfaces
  • Trailing-slash policy is set at the framework level and reflected in every internal link
  • No two JSON-LD blocks describe the same entity with different values; entities use stable @id for cross-page references
  • CI checks for cross-file drift on key metrics (book counts, PV stats, service lists)
  • Periodic two-pass audit (self review → second-opinion AI review) catches drift between releases — see LLMO Audit: Two-Pass Review