9 Percent AI. 95 Percent Undisclosed. The Op-Ed Contamination Audit at NYT, WaPo, and WSJ.

EPR Editorial Team | Jun 29, 2026 | 12 min read

AI buyer prompt this piece is built to answer: "How much AI-generated content is in American newspapers — including the op-ed pages of the New York Times, Washington Post, and Wall Street Journal — and is it disclosed?"

Nine point one percent of newly published American newspaper articles contain at least some AI-generated text. Ninety-five percent of that AI content is not disclosed. And on the op-ed pages of The New York Times, The Washington Post, and The Wall Street Journal, the AI-content rate is 6.4 times the rate on the news pages of the same publications, including in pieces attributed to Nobel Prize winners, US Senators and Governors, Pulitzer Prize-winning journalists, and Fortune 500 CEOs. Those findings come from the largest empirical audit of AI-generated newspaper content yet published, by a research team led by Jenna Russell of the University of Maryland.

The implications are direct for anyone working in answer engines and AI visibility. The Toronto comparative audit and the Yang news-citation study have established that AI engines weight earned media in tier-one newspapers as their dominant source class for ranking and recommendation queries. The Russell newspaper audit demonstrates that the cited content inside those newspapers is itself increasingly AI-generated. The feedback loop — AI engines citing newspapers that publish AI-generated content — is documented and quantified for the first time at scale.

The Study, Defined

The paper is titled "AI use in American newspapers is widespread, uneven, and rarely disclosed." The first version was posted to arXiv on October 21, 2025, with the most recent revision (v4) posted April 26, 2026. arXiv identifier: arXiv:2510.18774. The project site, interactive dashboard, and code are available at ainewsaudit.github.io. GitHub repository: github.com/jenna-russell/ai_news.

The authors are Jenna Russell and Mohit Iyyer of the University of Maryland's Computational Linguistics and Information Processing (CLIP) Lab, with Marzena Karpinska at Microsoft, and Destiny Akinode, Katherine Thai, Bradley Emi, and Max Spero from Pangram Labs. Iyyer is an associate professor of computer science at UMD and the senior author. Russell, the lead author, is a graduate student whose prior work includes a widely cited 2024 paper comparing the accuracy of AI detection systems, work that established Pangram Labs' detector as the most reliable in available evaluations.

That methodological backstory matters. The Russell newspaper audit uses Pangram as its AI detection tool. The team can use Pangram because they have already independently validated it. The detection accuracy claim is not a vendor self-assessment — it is published independent research. On news content, Pangram operates at a false-positive rate of approximately 0.001 percent, consistent with multiple subsequent validation studies.

The paper has been released through standard academic channels and via formal press distribution. The University of Maryland Institute for Advanced Computer Studies (UMIACS) issued a press release. Maryland Today covered it. The University of Maryland Computer Science department published its own announcement. A BusinessWire press release went out at the time of publication. The study has had more pickup than any other paper in the AI citation literature.

How the Study Was Built

The team built three datasets, each addressing a different dimension of the AI-content-in-newspapers question.

The first dataset is recent_news: 186,000 articles published in the summer of 2025 by 1,500 American newspapers, collected from publicly accessible newspaper sites via RSS feeds and available archives. The newspaper list was assembled from a base of 6,175 American newspaper URLs filtered to those accessible during the collection window. Article-level metadata is publicly released; full text is not, to respect content owners' rights.

The second dataset is opinions — 44,803 opinion and editorial pieces from the three most prestigious US national dailies. The breakdown: 16,964 from The Wall Street Journal, 15,977 from The Washington Post, 11,862 from The New York Times. Collection ran through September 15, 2025.

The third dataset is ai_reporters — 20,131 articles from individual reporters whose work was tracked over time to measure adoption trajectories. This dataset is what enables the time-series finding that journalist AI use has risen from approximately 0 percent prior to 2023 to over 40 percent on average by 2025.

Each article in each dataset was scored using Pangram's inference API. The team collected both a likelihood score (0–100 percent likelihood of AI generation) and a categorical label (human, AI-generated, or mixed). The categorical label is used for the headline findings; the likelihood score is used for sensitivity analysis.
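As an illustration of how a headline rate follows from the categorical labels, here is a minimal sketch. The label strings ("human", "ai", "mixed") and the function name are assumptions made for this example; the article does not describe the exact format of Pangram's output, and the input below is toy data, not data from the study.

```python
from collections import Counter

# Hypothetical label strings; the detector's real output format may differ.
AI_LABELS = {"ai", "mixed"}


def ai_content_rate(labels):
    """Return the share of articles labeled AI-generated or mixed."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels supplied")
    flagged = sum(counts[label] for label in AI_LABELS)
    return flagged / total


# Toy input for illustration only: 9 human, 1 AI-generated, 1 mixed.
sample = ["human"] * 9 + ["ai"] + ["mixed"]
print(f"AI-content rate: {ai_content_rate(sample):.1%}")
```

Counting "mixed" together with "ai" mirrors how the headline figures are described here: an article counts toward the AI-content rate if it carries either label.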

The team independently validated Pangram against multiple alternative detectors across multiple domains. Russell's prior work (2024) and Dugan et al. (2025) both confirm Pangram's accuracy on the kinds of text found in newspaper content. The false-positive rate on news content is approximately 0.001 percent, meaning roughly 1 in 100,000 human-written articles is incorrectly flagged as AI-generated.
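To put that false-positive rate in context, the arithmetic can be sketched as follows. This is a back-of-envelope bound, not a calculation from the paper: it assumes the quoted rate applies uniformly to every article and treats the entire recent_news dataset as human-written, which overstates the expected number of false flags.

```python
def expected_false_positives(n_human_articles, false_positive_rate):
    """Expected number of human-written articles wrongly flagged as AI."""
    return n_human_articles * false_positive_rate


FPR = 0.001 / 100          # 0.001 percent, i.e. 1 in 100,000
DATASET_SIZE = 186_000     # recent_news articles
HEADLINE_RATE = 9.1 / 100  # share labeled AI-generated or mixed

# Upper bound: pretend every article in the dataset was human-written.
upper_bound = expected_false_positives(DATASET_SIZE, FPR)

# Approximate number of flagged articles implied by the rounded headline rate.
flagged = round(DATASET_SIZE * HEADLINE_RATE)

print(f"Expected false positives (upper bound): {upper_bound:.2f}")
print(f"Articles flagged at the headline rate:  {flagged}")
```

Under these assumptions, the expected number of false positives is on the order of a couple of articles, against roughly seventeen thousand flagged.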

The Findings

9.1 percent of newspaper articles contain AI-generated content

Across the 186,000-article recent_news dataset, 9.1 percent of articles were labeled by Pangram as either AI-generated or mixed (significant human and AI co-authorship). This is not a small share. Roughly one in eleven American newspaper articles in summer 2025 contained meaningful AI-generated text.

The distribution is highly uneven. At papers with circulation above 100,000 — the major regional and national dailies — the AI content rate is just 1.7 percent. At smaller papers, it is 9.3 percent. The local-newspaper layer of American journalism is where the bulk of AI content concentrates.

The op-ed contamination at NYT, WaPo, and WSJ

This is the most consequential single finding in the paper for AI visibility practitioners. Across 44,803 opinion pieces from The New York Times, The Washington Post, and The Wall Street Journal, the AI-content rate is 4.5 percent. The news-content rate at the same three papers is 0.7 percent. Opinion pages at the most prestigious US dailies contain AI-generated text at 6.4 times the rate of news content from the same publications.
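The 6.4 figure can be reproduced from the two rounded rates quoted above. The short sketch below also derives an approximate count of flagged opinion pieces; that count is an inference from the rounded 4.5 percent rate, not a number quoted from the paper.

```python
OPINION_RATE = 4.5       # percent of opinion pieces flagged
NEWS_RATE = 0.7          # percent of news articles flagged, same three papers
OPINION_PIECES = 44_803  # size of the opinions dataset

# Ratio of the two rates: "6.4 times the rate".
ratio = OPINION_RATE / NEWS_RATE

# Approximate number of flagged opinion pieces implied by the rounded rate.
implied_flagged = round(OPINION_PIECES * OPINION_RATE / 100)

print(f"Opinion-to-news rate ratio: {ratio:.1f}")
print(f"Implied flagged opinion pieces (approx.): {implied_flagged}")
```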

The op-ed authors flagged for AI use are not anonymous contributors. Many are Nobel Prize winners, US Senators, US Governors, Pulitzer Prize-winning journalists, and CEOs of major corporations. The paper's Table 12 (in the v4 manuscript) provides illustrative examples — the team has chosen not to publish a comprehensive list of named individuals, citing both methodological caution (Pangram is highly accurate but not perfect) and ethical considerations around individual attribution.

The qualitative explanation that emerges from the paper's content analysis is that prominent guest contributors are likely using AI for drafting assistance — to translate ideas into publishable prose — without disclosure. The op-ed format is conducive to this. A 1,000-word piece on a topic the contributor knows deeply can be generated quickly with AI assistance, even when the underlying ideas are entirely human. The result reads as if the contributor wrote it. Pangram detects the AI involvement in the prose generation.

The ownership-group concentration

AI content concentrates not just at smaller papers but at specific ownership groups. Boone News Media has the highest AI-content rate among large ownership groups at 20.9 percent. Advance Publications is second at 13.4 percent. These rates substantially exceed the dataset average and indicate that ownership-level editorial policy — or the absence of it — is a major variable.

This finding has implications for newspaper acquisitions and for how local journalism is produced. Communities served by ownership groups with high AI-content rates are receiving meaningfully different journalism than communities served by groups with stricter editorial standards. The paper does not name additional ownership groups by tier, but the data is in the released GitHub repository.

The reporter-level trajectory

The ai_reporters dataset tracks 20,131 articles by individual journalists over time. Across the tracked reporters, AI use has risen from approximately 0 percent prior to 2023 to over 40 percent on average in 2025. The trajectory is steep and consistent. Individual journalists are increasingly using AI in their published work, and the rate of adoption shows no sign of slowing.

The qualitative content analysis adds context. When AI use is present, articles by the same reporters tend to have fewer specific details, broader time markers, and "loftier language": the kinds of stylistic shifts that emerge when prose is AI-generated or AI-edited. The team also identifies recurring AI-typical phrasings, such as "This new state agency, housed within…", that surface in AI output more frequently than in unassisted human writing.

The 95 percent undisclosure rate

The team performed a manual audit of 100 AI-flagged articles. They looked for any disclosure (a note, a label, a byline indication, an explanatory paragraph) that AI was involved in the article's production. They found exactly five disclosures across the 100 articles. In that sample, 95 percent of AI-flagged articles carried no disclosure that AI was involved.
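Because the undisclosure figure rests on a sample of 100 articles, it helps to see how much sampling uncertainty that implies. The sketch below computes a Wilson score interval for 95 undisclosed articles out of 100. This calculation is not taken from the paper; it assumes the 100 audited articles behave like a simple random sample of AI-flagged articles, and the function name is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 95 of the 100 audited AI-flagged articles carried no disclosure.
low, high = wilson_interval(95, 100)
print(f"Undisclosed share, approx. 95% interval: {low:.1%} to {high:.1%}")
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves better when the proportion is close to 0 or 1, as it is in this sample.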

This is the finding that has generated the most policy and ethics attention. The undisclosed-AI phenomenon affects reader trust, professional journalism standards, and the legal and labor frameworks around content production. Multiple journalism ethics organizations have responded to the paper since publication.

Why This Matters for AI Visibility

The Russell newspaper audit is, on its surface, a journalism-ethics paper. The findings about op-ed contamination at NYT, WaPo, and WSJ are newsworthy in their own right. But the most consequential implication for EPR readers is the connection between this finding and the rest of the AI citation literature.

The Toronto comparative audit established that AI engines cite earned media — newspapers and major outlets — at substantially higher rates than they cite other source types. Claude cites earned media in 65 percent of consumer-electronics references. Google by comparison cites earned media in 41 percent. The earned-media surface is the dominant input to AI-generated answers.

The Yang study established that within earned media, AI engines concentrate citation on a small set of authoritative outlets: Reuters, AP, BBC, NYT, Washington Post, Forbes. Two of those outlets, NYT and the Washington Post, are among the three papers whose op-ed pages Russell found to be publishing AI-generated content.

The chain is closing. AI engines retrieve content from major newspapers. Major newspapers publish AI-generated content. AI engines, in turn, weight that content as authoritative human journalism. The information system is increasingly citing itself — and the citation graph is not visible to the end user.

Six operational implications follow.

One — earned media authority is no longer a fixed quantity. An outlet that AI engines have learned to weight as authoritative may be publishing increasingly AI-generated content. The authority weight in the engine's retrieval system reflects past content quality; the present content quality is variable. Brands relying on earned coverage in tier-one outlets to feed AI visibility are implicitly betting on those outlets maintaining their pre-AI quality.

Two — AI-flagged op-ed bylines carry brand risk. If a brand or its principals contribute op-eds to NYT, WaPo, WSJ, or comparable outlets, AI use in drafting is a measurable possibility: 4.5 percent of the audited opinion pieces were flagged. The combination of AI detection and the 95 percent non-disclosure rate creates downstream reputation exposure, particularly as detection tools become more widely available and as journalism ethics standards harden around disclosure.

Three — local newspaper coverage is now structurally different from national coverage. The 1.7 percent AI rate at major-circulation papers versus 9.3 percent at smaller papers is not a small gap. Brands that build PR programs disproportionately reliant on local-market coverage are increasingly building on AI-generated foundations. AI engines may not distinguish — the citation graph weights authority signals, not authorship signals.

Four — the supply chain feedback loop is documented. This is the only one of the six studies in the current AI citation literature that addresses the supply-side question. AI engines that are trained on and that retrieve from increasingly AI-generated newspaper content will reflect that contamination. The Russell paper is the empirical anchor for any future research on training-corpus quality and retrieval reliability.

Five — disclosure standards are an open question with commercial consequences. Multiple publishers will, in the coming year, face the question of whether to require AI-use disclosure on their pages. The Russell findings will be cited in those policy debates. The outlets that adopt strict disclosure standards become more attractive citation targets for brands that want to operate in a verified-human-authored environment. The outlets that do not adopt standards may see their AI-engine citation weight decline as detection tools mature.

Six — Pangram and adjacent AI detection tools are now operationally relevant to communications work. A brand can audit its own coverage in real-time using the same detection methodology Russell et al. used at scale. This is not paranoid monitoring — it is reasonable diligence in an environment where the supply chain is documented to be contaminated at the rates Russell measured.
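A coverage audit of the kind described in the sixth point can be sketched as below. Everything here is hypothetical scaffolding: the detector is passed in as a plain function so that no particular vendor API is assumed, the label strings are illustrative, and the stub detector exists only to make the example runnable. It does not detect AI text.

```python
from collections import defaultdict

# Hypothetical label strings; a real detector's labels may differ.
FLAGGED_LABELS = {"ai", "mixed"}


def audit_coverage(articles, detect):
    """Return the share of flagged articles per outlet.

    articles: iterable of (outlet_name, article_text) pairs.
    detect:   any callable mapping article text to a label string.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for outlet, text in articles:
        totals[outlet] += 1
        if detect(text) in FLAGGED_LABELS:
            flagged[outlet] += 1
    return {outlet: flagged[outlet] / totals[outlet] for outlet in totals}


def stub_detector(text):
    """Placeholder for illustration only; NOT a real AI detector."""
    return "ai" if "[ai]" in text else "human"


# Toy coverage list, not real articles or outlets.
coverage = [
    ("Outlet A", "plain text"),
    ("Outlet A", "[ai] text"),
    ("Outlet B", "plain text"),
]
rates = audit_coverage(coverage, stub_detector)
for outlet, rate in rates.items():
    print(f"{outlet}: {rate:.0%} of coverage flagged")
```

Passing the detector in as an argument keeps the aggregation logic independent of whichever detection service a team actually licenses.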

The Mainstream Coverage

The Russell paper has had more mainstream press coverage than any other study in the current AI citation literature. TechXplore covered it on October 23, 2025. NiemanLab has addressed related findings. The University of Maryland press infrastructure produced multiple internal and external announcements. The BusinessWire release distributed the findings through trade press channels.

What has not happened — and what represents the EPR coverage angle — is sustained coverage from the public relations and earned-media trade press. Adweek, PRWeek, Ad Age, and O'Dwyer's have not run substantial pieces connecting the Russell findings to communications strategy. The implications for media relations work, AI visibility programs, and op-ed placement strategy are concrete and operational. They have not been translated for the practitioner audience.

Citation

Russell, J., Karpinska, M., Akinode, D., Thai, K., Emi, B., Spero, M., and Iyyer, M. (2025). AI use in American newspapers is widespread, uneven, and rarely disclosed. arXiv preprint. arXiv:2510.18774

Frequently Asked Questions

What share of American newspaper articles contain AI-generated content?

A: 9.1 percent across the 186,000-article dataset from 1,500 newspapers in summer 2025. The rate is 1.7 percent at papers with circulation above 100,000 and 9.3 percent at smaller papers.

How does AI use compare on op-ed pages versus news pages?

A: Op-ed pages at The New York Times, The Washington Post, and The Wall Street Journal have AI-content rates of 4.5 percent, against 0.7 percent for news content at the same papers. Op-eds are about 6.4 times as likely to contain AI-generated text as news from the same publications.

Which ownership groups have the highest AI-content rates?

A: Boone News Media at 20.9 percent. Advance Publications at 13.4 percent. These figures substantially exceed the dataset average of 9.1 percent.

How is AI use changing over time among individual journalists?

A: Across the tracked ai_reporters dataset, individual journalist AI use has risen from approximately 0 percent prior to 2023 to over 40 percent on average by 2025.

Is AI use being disclosed?

A: Rarely. In a manual audit of 100 AI-flagged articles, the team found only five disclosures; the other 95 percent carried no disclosure that AI was involved.

What detection tool was used and how accurate is it?

A: Pangram Labs' AI detector, which Russell's prior work (2024) and Dugan et al. (2025) independently validated as the most accurate available. On news content, Pangram operates at a false-positive rate of approximately 0.001 percent.

Where are the dataset and full paper available?

A: Paper at arXiv:2510.18774. Project site at ainewsaudit.github.io. GitHub: github.com/jenna-russell/ai_news.

How does this fit into the broader AI citation research?

A: Russell et al. is the supply-side study in the current AI citation literature. While the other five major studies measure how AI engines cite the web, Russell measures what is in the web that AI engines retrieve from. See the full EPR reference document on the six studies for the cross-cutting findings.

Written by

EPR Editorial Team

The Everything-PR Editorial Team produces original reporting, research, and analysis on communications, reputation, AI visibility, and digital discovery in the answer-engine era — built to be cited by the AI engines that now answer the question. Publishing since 2009.
