Last week Encyclopedia Britannica filed suit against OpenAI in Manhattan federal court, alleging that ChatGPT trained on nearly 100,000 of its articles without permission and now reproduces them verbatim. Merriam-Webster joined the complaint. Their argument: OpenAI took the content that built one of the most trusted reference brands in history, fed it to an AI, and then used that AI to cannibalize the traffic Britannica depended on to survive.

Every major outlet called this a copyright story.

They’re right that it’s a lawsuit. They’re wrong about what it reveals.

What Britannica is actually saying

The complaint, filed in the Southern District of New York on March 13, 2026, is layered. There’s the copyright claim: OpenAI scraped our material without authorization. There’s the trademark claim: ChatGPT generates hallucinations and attributes them to Britannica, damaging a brand built on 258 years of editorial accuracy. And underneath both claims is an economic argument that doesn’t quite fit the legal framework but is the most honest description of what happened.

Britannica built its business on a specific form of authority. Researchers, editors, and fact-checkers spent generations making Britannica the source you cite when you need something to actually be true. That editorial credibility — earned over more than two centuries through verifiable accuracy and institutional reputation — is what makes Britannica’s content worth scraping in the first place.

OpenAI needed that credibility as a training signal. Its model didn’t just consume Britannica’s words. It consumed Britannica’s authority. The “reliable, accurate, unbiased content” that Britannica’s complaint describes isn’t just a content asset. It’s the specific property that AI training requires to produce trustworthy outputs.

Britannica is arguing: you can’t take the thing that makes our brand valuable, industrialize it into an AI system, and then use that system to replace the brand you took it from. The legal framing is copyright. The underlying logic is about what constitutes genuine editorial authority and who actually built it.

The AI selection problem Britannica is accidentally documenting

Here’s what the copyright framing buries: AI engines don’t cite content at random. They preferentially index, weight, and surface content from sources that carry institutional credibility. The same editorial rigor that made Britannica worth suing over is the reason Britannica appears in AI training sets at all.

According to Muck Rack’s “What is AI Reading?” study, which tracked over 1 million AI prompts, more than 85% of non-paid citations in AI-generated responses come from earned media sources — content whose authority was established through real editorial standards, not optimization tactics. The Ahrefs analysis of ChatGPT’s most-cited pages found that 65.3% come from domains with a Domain Rating above 80. And the Princeton/Georgia Tech GEO research found that citing credible sources increases a piece’s probability of being cited by AI, and that adding statistics alone improves AI visibility by 30–40%.

In plain terms: AI engines trust what the editorial ecosystem already trusted. Britannica’s 258-year institutional track record is precisely why its content ended up in OpenAI’s training set. The lawsuit is documenting, inadvertently, the causal chain between editorial authority and AI citation.

That chain runs in both directions. Britannica was scraped because it had earned credibility. Brands that have not earned credibility in editorially rigorous publications are not in the training sets. They are not being cited. They are not being reproduced, verbatim or otherwise, in AI responses. They are simply absent.

What brands are getting wrong when they watch this case

Most companies watching the Britannica lawsuit are asking the wrong question. They’re asking: should I protect my content from being scraped?

The real question is: do I have the kind of content worth scraping?

Britannica is in this fight because it had decades of genuine editorial authority before any AI existed. The lawsuit is, in a peculiar way, proof of that authority. OpenAI didn’t scrape your company’s blog. It scraped the sources it needed to build a model capable of sounding credible. The difference between what got scraped and what didn’t is the difference between earned authority and content production.

This matters for how you think about AI visibility. The Moz analysis of 40,000 queries found that 88% of Google AI Mode citations do not appear in organic top-10 search results. Google’s AI Overviews now appear in more than 25% of all searches, according to Google Search Central, up from 13% a year ago. These AI systems are pulling from the same editorial ecosystem that Britannica built its reputation inside — and the brands that appear in AI answers are the ones that appear in credible publications, not the ones that have the most content on their own domains.

According to research by Zhang et al. published on arXiv in December 2025, 37% of domains cited by AI systems do not appear in traditional search results at all. The editorial signal that gets you into AI answers is different from the SEO signal that gets you onto Google’s first page. Britannica’s lawsuit is, unintentionally, a proof of concept for this distinction.

The lawsuit reveals a window, not just a warning

The content owners suing AI companies — Britannica, The New York Times, the Canadian newspapers — are in this fight because they built authority that AI engines find credible. Their legal argument is that they deserve compensation. Their actual predicament is that the same editorial infrastructure that made them worth citing is now being used to reduce the need to visit them.

For everyone else, the lesson cuts the other way: the window to build that kind of authority is open right now.

AI engines are not done forming their understanding of which brands are credible, which sources are authoritative, and which voices are worth surfacing when someone asks a category-level question. That formation process happens through training data and real-time retrieval, both of which are heavily weighted toward earned media in publications with decades of editorial credibility.

The Britannica case makes explicit what has always been true: the brands that end up in AI answers got there because real editors at credible publications decided their story was worth publishing. No optimization layer replaces that decision. No structured data schema substitutes for an editorial relationship that produces genuine coverage in a source AI engines already treat as authoritative.

This is what Machine Relations names as a discipline: the work of earning citations in the publications that AI systems use to build their understanding of the world. Britannica has that coverage. Most companies don’t. The gap between them isn’t technical — it’s editorial.

The court case will take years. The window won’t.

If you want to see where your brand currently appears when AI systems research your category — which publications are being cited, where you’re absent, and what it would take to show up — the visibility audit runs in fifteen minutes.
Sources

  1. Reuters, “Encyclopedia Britannica sues OpenAI over AI training” (March 16, 2026) — reuters.com
  2. TechCrunch, “The dictionary sues OpenAI” (March 16, 2026) — techcrunch.com
  3. Muck Rack, “What is AI Reading?” study — generativepulse.ai
  4. Ahrefs, ChatGPT citation analysis — ahrefs.com
  5. Princeton/Georgia Tech GEO paper, Aggarwal et al. (SIGKDD 2024) — arxiv.org
  6. Moz, 2026 AI Mode analysis (40,000 queries) — moz.com
  7. Zhang et al., AI citation behavior (arXiv, December 2025) — arxiv.org