How were the models tested for propaganda susceptibility?

The benchmark used 75 questions in three languages covering 14 Russian propaganda narratives, scored 1-5 by a calibrated Claude Opus 4.5 model validated by human experts.

Why did Mistral perform poorly on this benchmark?

Mistral's models, including Medium 3.5, ranked in the bottom third. That is consistent with a Newsguard study that found a 36.67% misinformation rate for Mistral. The gap between top and bottom performers likely reflects differences in training data curation and alignment techniques.

Does this benchmark reflect real-world performance?

The benchmark tested the models on their own, without web search or other tools. In production, systems often use retrieval-augmented generation, which could reduce propaganda susceptibility.

Estonian Institute: Claude Tops Russian Propaganda Benchmark, Mistral Trails

The Institute of the Estonian Language benchmark tests 60 AI models against Russian propaganda. Claude tops and Mistral trails, in line with the 36.67% misinformation rate a Newsguard study found for Mistral.

Ala SMITH & AI Research Desk · 1d ago · 3 min read · AI-Generated

Source: the-decoder.com (single source)

How susceptible are AI language models to Russian propaganda, according to the new benchmark?

The Institute of the Estonian Language benchmark tested 60 AI models on 75 questions covering 14 Russian propaganda narratives. Anthropic's Claude models scored highest; Mistral models ranked in the bottom third.

TL;DR

60 models tested on 75 questions across 14 propaganda narratives.
Claude models ranked highest; Mistral landed in the bottom third.
The result is consistent with a Newsguard study that put Mistral's misinformation rate at 36.67%.

The Institute of the Estonian Language tested 60 AI models on 75 questions covering 14 Russian propaganda narratives. Anthropic's Claude models topped the benchmark, while Mistral's flagship models ranked in the bottom third, a finding that undermines the French company's positioning as a European alternative.

Key facts

60 models tested on 75 questions across 14 propaganda narratives.
Claude models ranked highest; Mistral Medium 3.5 in bottom third.
Mistral's misinformation rate: 36.67% per Newsguard study.
Mistral negotiating €3B funding round at €20B valuation.
Russian network 'Pravda' feeds AI systems millions of disinfo articles.

The Institute of the Estonian Language has released a benchmark measuring how susceptible AI language models are to Russian propaganda, testing 60 models with 75 questions in three languages covering 14 propaganda narratives, according to The Decoder. Each answer was scored on a scale of 1 to 5, where 1 means the model repeats Russian talking points. A calibrated Claude Opus 4.5 served as the evaluation model, validated by disinformation experts at the organization Propastop.

Anthropic's Claude models claimed the top spots, followed by Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus. Mistral's models, including the newest Medium 3.5, landed in the bottom third. The models had no access to web search or other tools during testing, so the benchmark only measures how well the language model itself can spot and reject propaganda.
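To make the scoring and ranking concrete, here is a minimal Python sketch of how per-answer judge scores could be turned into a ranking with a bottom third. It is an illustration under stated assumptions, not the institute's actual pipeline: the source does not say how scores were aggregated, so the plain per-model mean, the function names `rank_models` and `bottom_third`, the placeholder model and language labels, and the sample scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rank_models(records):
    """Rank models by mean judge score, best first.

    records: iterable of (model, language, question_id, score) tuples,
    where score is an integer from 1 to 5 and 1 means the answer
    repeats propaganda talking points.
    Assumption: a plain mean over all answers; the real benchmark
    may weight languages or narratives differently.
    """
    by_model = defaultdict(list)
    for model, _language, _question_id, score in records:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        by_model[model].append(score)
    means = {model: mean(scores) for model, scores in by_model.items()}
    # Higher mean score = better at rejecting propaganda.
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


def bottom_third(ranking):
    """Return the model names in the lowest third of a best-first ranking."""
    cutoff = len(ranking) - len(ranking) // 3
    return [model for model, _ in ranking[cutoff:]]


# Hypothetical data: placeholder models, placeholder languages, invented scores.
records = [
    ("model_a", "lang_1", 1, 5), ("model_a", "lang_2", 1, 4),
    ("model_b", "lang_1", 1, 3), ("model_b", "lang_2", 1, 3),
    ("model_c", "lang_1", 1, 1), ("model_c", "lang_2", 1, 2),
]

ranking = rank_models(records)
print(ranking)
print(bottom_third(ranking))
```

With 60 models, the same `bottom_third` cutoff would flag the 20 lowest-ranked models.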

The results align with a Newsguard study that found Mistral had a steady misinformation rate of 36.67 percent. That's a bad look for the French company, which positions itself as a European alternative to US and Chinese providers and is currently negotiating a 3 billion euro funding round at a 20 billion euro valuation. It's especially rough since Mistral's flagship models already struggle to keep up with the competition.

The threat is real. Russian networks like "Pravda" deliberately feed AI systems millions of disinformation articles. And OpenAI recently shut down a Russian campaign that used ChatGPT to spread propaganda ahead of Germany's federal election.

Key Takeaways

The Institute of the Estonian Language benchmark tests 60 AI models against Russian propaganda.
Claude tops and Mistral trails, consistent with the 36.67% misinformation rate Newsguard found for Mistral.

Why Mistral's poor performance matters

Mistral's bottom-third finish is particularly damaging given its stated mission. The company has raised over €1 billion and markets itself as a sovereign European AI provider, a narrative that depends on trust and reliability. A 36.67% misinformation rate, per the Newsguard study, directly contradicts that pitch. For European enterprises and governments considering Mistral for sensitive deployments, this benchmark provides concrete evidence of a vulnerability that the top-ranked competitors handle better.

The benchmark's limitations

The benchmark is rigorous, with 75 questions across three languages and scoring validated by human experts, but it tests only the models' own behavior, without retrieval-augmented generation or web search. In production, many systems augment LLMs with external knowledge, which could mitigate propaganda susceptibility. Still, the gap between top and bottom performers suggests fundamental differences in training data curation and alignment techniques.

Table of the top 10 models in the benchmark for detecting Russian disinformation, showing overall and language-specific scores.

What to watch

Watch for Mistral's response to this benchmark ahead of its €3B funding round close. European regulators may cite these findings in upcoming AI safety requirements. Also track whether Russian disinformation networks adapt their tactics to exploit model-specific weaknesses revealed by the benchmark.

Sources cited in this article

Newsguard
The Decoder

AI-assisted reporting. Generated by gentic.news from 2 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

This benchmark points to a gap in propaganda resistance: Claude shows strong resistance, while Mistral, which markets itself as the European alternative, exhibits significant vulnerabilities. The timing is awkward for Mistral, which is fundraising on a narrative of European sovereignty and trust. The 36.67% misinformation rate from the Newsguard study is consistent with the benchmark results, which suggests a systematic issue rather than a one-off test anomaly.

One possible reading is that propaganda resistance correlates with safety investment: Anthropic has made Constitutional AI and red-teaming central to its development process, while Mistral has prioritized performance benchmarks and open-source distribution. If so, the benchmark puts a number on that trade-off between openness and safety guardrails in a geopolitical context.

Because the benchmark tested the models without web search or other tools, real-world deployments that add retrieval could mitigate these weaknesses. But for high-stakes applications like government information systems or educational tools, the model's own propaganda susceptibility remains a first-order concern. European policymakers should demand transparency on how Mistral addresses this before awarding public sector contracts.
