Research Areas — 5 Focus Domains
The Open LLMO Research Initiative organizes its work into five research areas. Each area runs independently but eventually feeds into the metric set defined by the LLMOFramework Score.
Overview
Section titled “Overview”| Area | Core question |
|---|---|
| 1. AI Citation Analysis | What content do LLMs cite, and under what conditions? |
| 2. Grounding Visibility | How do we make AI grounding sources visible? |
| 3. LLM Retrieval Optimization | How should documents be optimized for the LLM retrieval layer? |
| 4. AI-native Documentation | What document formats do LLMs handle best? |
| 5. Agent-oriented Information Architecture | What information structures are easiest for AI agents to navigate? |
1. AI Citation Analysis
Section titled “1. AI Citation Analysis”Analysis of which content gets cited by LLMs (ChatGPT, Claude, Gemini, Perplexity) for a given topic. The observations cover citation frequency, structural features of cited documents, and the retrieval path that led to the citation.
Key Questions
Section titled “Key Questions”- How much do cited domains overlap across LLMs for the same topic?
- Can we identify the structural features (heading hierarchy, tables, statistical density, external link count) of cited documents?
- Can we build a post-hoc checklist for making content more likely to be cited?
Current Direction
Section titled “Current Direction”Data collection for AI citation observation is underway. Phase 1 plan: ship Citation Visibility as a metric in the OSS llmo-checker.
2. Grounding Visibility
Section titled “2. Grounding Visibility”Visualization of grounding for AI responses. Covers what an LLM relied on to produce an answer, and whether that source can be traced back to a verifiable primary reference.
Key Questions
Section titled “Key Questions”- Can a standard reverse-lookup method from AI response to source document be defined?
- Does making grounding “visible” on a site (explicit sources, data references, citation formatting) correlate with higher AI citation rates?
- Is hallucination correlated with weak grounding?
Current Direction
Section titled “Current Direction”Already partially addressed as Citation Signals (the fifth component of the LLMO Framework). Phase 1 plan: quantify it as a Grounding Stability metric.
3. LLM Retrieval Optimization
Section titled “3. LLM Retrieval Optimization”Document-side optimization for the LLM retrieval layer (RAG, embedding retrieval, web search plugins, etc.). Covers chunking strategy, semantic structure, document length, and heading design.
Key Questions
Section titled “Key Questions”- How does the relationship between chunk size and retrieval accuracy vary across topics?
- What is the retrieval efficiency gap between Markdown, HTML, and JSON-LD?
- How does internal link density contribute to context expansion in AI search?
Current Direction
Section titled “Current Direction”llmoframework.com itself serves as an implementation reference. Phase 1 plan: publish a chunking comparison experiment.
4. AI-native Documentation
Section titled “4. AI-native Documentation”Research on document formats that LLMs can read and write well. Covers llms.txt, Markdown conventions, and the optimal form of AI-targeted metadata.
Key Questions
Section titled “Key Questions”- Which LLMs and crawlers actually consult llms.txt?
- Where is the optimal balance between retrieval efficiency and expressive power for Markdown versus HTML?
- Does AI-targeted structured metadata (JSON-LD, etc.) affect citation rates?
Current Direction
Section titled “Current Direction”llms.txt implementation and effect measurement are ongoing. Phase 1 plan: publish the llms.txt-validator OSS tool.
5. Agent-oriented Information Architecture
Section titled “5. Agent-oriented Information Architecture”Research on information architecture for AI agents (Claude Code, Cursor, autonomous agents, etc.). Covers MCP (Model Context Protocol) exposure, API documentation design, and discoverability.
Key Questions
Section titled “Key Questions”- Do sites that expose MCP servers have an advantage in AI search visibility?
- Are agent-readable API docs (OpenAPI + natural language) more discoverable than plain API references?
- Can we establish methods for observing autonomous agent exploration behavior?
Current Direction
Section titled “Current Direction”Experiments on the impact of MCP exposure on search visibility are underway. Phase 1 plan: propose a preliminary Agent Visibility metric.
Mapping to Phases
Section titled “Mapping to Phases”| Area | Phase 1 planned deliverable |
|---|---|
| AI Citation Analysis | Citation Visibility metric in llmo-checker |
| Grounding Visibility | Grounding Stability metric + evaluation dataset |
| LLM Retrieval Optimization | Chunking comparison experiment report |
| AI-native Documentation | llms.txt-validator OSS |
| Agent-oriented IA | Preliminary Agent Visibility metric |
Progress on each area is published in the Changelog and GitHub Issues.