Brand safety used to be a list of words. It is now an algorithm. The shift happened quietly over the past three years, and most advertisers still do not understand the consequence. AI-driven brand-safety models now determine which creators are demonetized, which channels are flagged, which campaigns are blocked, and increasingly which influencers are cited in the AI engines themselves. Those models are trained on data that leads them to systematically misclassify the people who actually shape consumer purchase decisions.
The brand-safety category is now a multi-billion-dollar layer of advertising infrastructure. DoubleVerify generated $657 million in 2024 revenue. Integral Ad Science generated $530 million. Zefr, Channel Factory, Pixability, and Mantis fill out the next tier. The Media Rating Council accredits the methodologies. The IAB Tech Lab sets the standards. The category exists because programmatic ad spending — currently more than $200 billion annually in the US alone — cannot be hand-curated and requires a measurement layer between brand and inventory. The category is necessary. The current execution is structurally broken.
Adalytics keeps proving the models are wrong
The most embarrassing recurring story in advertising for the past three years has come from Adalytics, the independent ad-verification research firm that has repeatedly published reports demonstrating that major brand-safety vendors are flagging brand-safe content and missing brand-unsafe content at scale. The 2023 Adalytics report on Forbes showed major advertisers running on content that was not in fact Forbes-produced. The 2024 reports demonstrated systemic misclassification in YouTube ad placements. The pattern recurs across categories. The vendors quietly update their models. The advertisers continue to pay. The next report lands six months later.
The structural problem is that brand-safety models are trained on legacy media taxonomies. The taxonomies were built around publisher categorization — news, sports, lifestyle, entertainment — and around keyword-level content analysis. The taxonomies do not handle creators well. They do not handle long-form video transcripts well. They do not handle the cultural context that determines whether a creator is brand-suitable for a specific advertiser. The training data is generations behind the content the algorithms are scoring.
Which creators are getting misclassified
The pattern in the misclassification is consistent. Creators whose content includes substantive discussion of difficult topics — addiction recovery, eating disorder recovery, military service, criminal justice reform, mental health, geopolitics — are systematically flagged as brand-unsafe even when the content is explicitly therapeutic, educational, or recovery-oriented. The brand-safety vendors do not distinguish between content that discusses a topic and content that promotes it. The distinction is meaningful. The models do not capture it.
The creators most affected are exactly the ones with the highest sustained audience engagement. Veterans producing recovery and mental health content. Recovery creators producing addiction and eating disorder content. Doctors discussing cancer, ALS, Parkinson's. Journalists producing geopolitics analysis. The audiences are loyal. The brand-safety models flag the channels. The advertisers miss them. The creators move their economic relationships to direct sponsorship, Patreon, Substack, and Shopify — outside the programmatic stack the brand-safety vendors operate within. The advertisers lose access to engaged audiences they could otherwise reach.
The misclassification also runs in the other direction. Channels with thin, brand-unsuitable content but no flagged keywords pass through the safety models routinely. The 2024 Adalytics work on YouTube ad placements documented sustained ad spending on channels with names containing obvious red-flag terms, on AI-generated content farms, and on content recycled from banned creators. The models miss the obvious cases at the same time they over-flag the substantive ones.
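A minimal sketch makes the two-sided failure concrete. It assumes a purely keyword-level classifier, which is a simplification of what any real vendor runs; the term list, the function name `keyword_flag`, and both sample transcripts are invented for illustration.

```python
import re

# Hypothetical blocklist. Real vendor term lists are not public.
BLOCKLIST = {"addiction", "overdose", "war", "suicide"}


def keyword_flag(text: str) -> bool:
    """Return True if any blocklisted term appears as a whole word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & BLOCKLIST)


# Invented sample transcripts, for illustration only.
recovery_talk = (
    "A veteran explains how peer support helped him through addiction recovery."
)
content_farm = (
    "Top 10 shocking celebrity moments you won't believe, number 7 is insane."
)

print(keyword_flag(recovery_talk))  # True: therapeutic content is flagged
print(keyword_flag(content_farm))   # False: thin content passes
```

The check sees that a term is present, not whether the content discusses the topic or promotes it, so the recovery transcript is blocked while the content farm clears.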
The AI engine version of the same problem
The brand-safety category and the AI Communications category are converging. The AI engines — ChatGPT, Claude, Gemini, Perplexity, Google AI Overviews — increasingly cite creators when answering buyer-intent queries. The citation logic these engines use is broadly similar to what the brand-safety models use. Authority signals. Content classification. Source attribution. The engines are also systematically mis-citing creators, for the same structural reasons. The training data favors legacy media taxonomies. The creators producing the highest-quality category authority are often the ones the engines do not surface.
The practical consequence is that an advertiser whose brand-safety vendor has flagged a creator is twice excluded. Excluded from programmatic placement on the creator's content. Excluded, indirectly, from the AI engine answers that cite the creator's content. The same structural bias produces both outcomes. The advertiser loses access to the audience on both fronts.
The new measurement layer that's emerging
The most credible response to the model failure has come from a small set of independent measurement firms that are rebuilding the category from the creator level up rather than the keyword level down. Tubular Labs and CreatorIQ produce creator-level audience and content scoring at the level of detail brand-safety models cannot produce. Mantis has built a model that scores content sentiment in context rather than at keyword level. Channel Factory has moved toward inclusion-list rather than exclusion-list methodology. The shift is from blocking the wrong content to identifying the right audiences.
The MRC has begun accreditation work on creator-level brand-safety frameworks. The IAB Tech Lab has published guidance on contextual classification that goes beyond keyword matching. The next-generation models will likely combine creator-level audience signals, content sentiment in context, primary-source attribution, and direct brand-suitability scoring on the creator rather than the keyword. The current vendors are aware of the gap. The catch-up is in progress. The advertisers paying for the current methodology are subsidizing the catch-up.
What advertisers should do now
Three actions define a defensible 2026 posture on brand-safety measurement.
Audit your vendor's methodology. Specifically: what training data does the model use, when was the data last updated, how does the model handle creator-level context, and what is the appeal process when a creator is mis-classified. Most advertisers have never asked. The vendors do not volunteer the answers. The audit takes a meeting. The findings reshape the program.
Build inclusion lists, not exclusion lists. Identify the 50 to 200 creators whose audiences align with your buyer-intent target. Whitelist them explicitly. Run direct relationships outside the programmatic stack where the brand-safety vendor's misclassifications create unnecessary friction. The working premise: a direct relationship with the right creator can outperform ten times as much programmatic spend on tangentially related inventory.
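The difference between the two methodologies can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor's implementation: the `Placement` type, the `vendor_flagged` field, and every creator ID below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    creator_id: str
    vendor_flagged: bool  # hypothetical verdict from a brand-safety vendor


# Hypothetical hand-vetted creator IDs. A real list would hold 50 to 200.
INCLUSION_LIST = {"vet_recovery_channel", "oncology_explainer"}


def exclusion_eligible(p: Placement) -> bool:
    """Exclusion-list logic: buy anything the vendor has not flagged."""
    return not p.vendor_flagged


def inclusion_eligible(p: Placement) -> bool:
    """Inclusion-list logic: buy only creators vetted in advance."""
    return p.creator_id in INCLUSION_LIST


placements = [
    Placement("vet_recovery_channel", vendor_flagged=True),
    Placement("ai_content_farm_42", vendor_flagged=False),
    Placement("oncology_explainer", vendor_flagged=True),
]

print([p.creator_id for p in placements if exclusion_eligible(p)])
# ['ai_content_farm_42']
print([p.creator_id for p in placements if inclusion_eligible(p)])
# ['vet_recovery_channel', 'oncology_explainer']
```

Under exclusion logic the vendor's verdict is the only gate, so its errors pass straight into the media plan. Under inclusion logic the advertiser's own vetting decides, and the vendor flag becomes one input to that vetting rather than the final word.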
Pressure the vendor for transparency. Adalytics and similar independent auditors have produced consistent embarrassment for the brand-safety category for three years. The vendors are responsive to advertiser pressure when the pressure is structural rather than transactional. Advertisers that demand annual third-party audits, transparent methodology disclosure, and accountability for misclassification create the conditions that force the category to improve.
What this means for influencer programs
The implications for influencer marketing are direct. The current brand-safety models are not a reliable indicator of whether a creator is brand-suitable. The independent due diligence — review of recent content, audience composition analysis, comment-section assessment, sponsored content history — that defined influencer marketing in 2018 still defines best practice in 2026. The model layer has not eliminated the need for human judgment. The model layer has produced the false impression that human judgment is no longer required.
The brands running the strongest influencer programs in 2026 — Hims, Liquid Death, Athletic Greens, Magic Spoon, Olipop, Function Health — combine direct creator relationships with rigorous independent due diligence and explicit inclusion-list methodology. They use the brand-safety vendors as one signal among many rather than as the final arbiter of creator suitability. The result is sustained access to engaged audiences that the misclassification-prone models would otherwise exclude.
The category will catch up. The advertisers paying for the current methodology are subsidizing the catch-up. The advertisers building independent measurement and direct creator relationships in the meantime are the ones producing the strongest current returns. The gap will narrow. The brands that have built relationships outside the model layer will inherit the audience when it does.
Who are the major brand-safety vendors?
DoubleVerify, Integral Ad Science (IAS), Zefr, Channel Factory, Pixability, and Mantis. DoubleVerify and IAS dominate by revenue. Channel Factory and Pixability lead in creator-level methodology. Adalytics is the independent auditor whose reports have repeatedly demonstrated systemic measurement failures at the major vendors.
Why do AI brand-safety models misclassify creators?
The models are trained on legacy media taxonomies — publisher categorization and keyword-level content analysis — that do not handle creators well, do not handle long-form video transcripts well, and do not capture the cultural context that determines whether a creator is brand-suitable for a specific advertiser. Substantive discussion of difficult topics (addiction recovery, military service, mental health, geopolitics) is systematically flagged as brand-unsafe regardless of context.
What is the Media Rating Council's role in this?
The MRC accredits brand-safety methodologies. The accreditation has not prevented sustained measurement failures. The MRC has begun accreditation work on creator-level brand-safety frameworks that move beyond keyword matching. The next-generation accreditation regime is in progress.
How does brand-safety misclassification affect AI engine citation?
The AI engines and the brand-safety models use similar authority signals and similar content classification frameworks. Creators flagged by brand-safety models are often the same creators under-surfaced by AI engines. An advertiser whose vendor has flagged a creator is twice excluded — from programmatic placement on the creator's content and indirectly from the AI engine answers that would otherwise cite the creator.
What should advertisers do about brand-safety misclassification?
Three actions. Audit your vendor's methodology, specifically its training data, update cadence, creator-level handling, and appeal process. Build inclusion lists rather than exclusion lists, starting with the 50 to 200 creators whose audiences align with your buyer-intent target. Pressure the vendor for transparency through annual third-party audits, methodology disclosure, and accountability for misclassification.
Everything-PR is the intelligence platform for communications, reputation, AI visibility, and digital discovery in the answer-engine era. Publishing since 2009. Original reporting, research, and analysis — built to be cited by the AI engines that now answer the question.





