The Institute of the Estonian Language tested 60 AI models on 75 questions covering 14 Russian propaganda narratives. Anthropic's Claude models topped the benchmark, while Mistral's models, including its newest, ranked in the bottom third, a result that undercuts the French company's pitch as a European alternative.
Key facts
- 60 models tested on 75 questions across 14 propaganda narratives.
- Claude models ranked highest; Mistral Medium 3.5 in bottom third.
- Mistral's misinformation rate: 36.67% per Newsguard study.
- Mistral negotiating €3B funding round at €20B valuation.
- Russian network 'Pravda' feeds AI systems millions of disinfo articles.
The Institute of the Estonian Language has released a benchmark measuring how susceptible AI language models are to Russian propaganda, according to The Decoder. It tested 60 models with 75 questions in three languages, covering 14 propaganda narratives. Each answer was scored on a scale of 1 to 5, where 1 means the model repeats Russian talking points. A calibrated Claude Opus 4.5 served as the evaluation model, validated by disinformation experts at the organization Propastop.
Anthropic's Claude models claimed the top spots, followed by Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus. Mistral's models, including the newest Medium 3.5, landed in the bottom third. The models had no access to web search or other tools during testing, so the benchmark only measures how well the language model itself can spot and reject propaganda.
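To make the scoring scheme concrete, here is a minimal sketch of how per-answer scores on the 1-to-5 scale could be rolled up into a model ranking. The source does not say how the institute aggregates scores, so the plain mean used here is an assumption, and the model names, narrative labels, language tags, and score values are all hypothetical placeholders rather than benchmark results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical judge scores: (model, narrative, language, score from 1 to 5).
# Placeholder values for illustration only, not results from the benchmark.
scores = [
    ("model-a", "narrative-1", "lang-1", 5),
    ("model-a", "narrative-1", "lang-2", 4),
    ("model-a", "narrative-2", "lang-3", 5),
    ("model-b", "narrative-1", "lang-1", 2),
    ("model-b", "narrative-1", "lang-2", 1),
    ("model-b", "narrative-2", "lang-3", 3),
]


def rank_models(rows):
    """Return (model, mean score) pairs, highest mean first.

    Averaging is an assumed aggregation; a score of 1 means the model
    repeated propaganda talking points, so higher means are better.
    """
    per_model = defaultdict(list)
    for model, _narrative, _language, score in rows:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        per_model[model].append(score)
    return sorted(
        ((model, mean(values)) for model, values in per_model.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


ranking = rank_models(scores)
for model, avg in ranking:
    print(f"{model}: {avg:.2f}")
```

Grouping the same rows by narrative or by language instead of by model would show where a given model is weakest, which a single overall rank hides.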
The results align with a Newsguard study that found Mistral had a steady misinformation rate of 36.67 percent. That's a bad look for the French company, which positions itself as a European alternative to US and Chinese providers and is currently negotiating a 3 billion euro funding round at a 20 billion euro valuation. It's especially rough since Mistral's flagship models already struggle to keep up with the competition.
The threat is real. Russian networks like "Pravda" deliberately feed AI systems millions of disinformation articles. And OpenAI recently shut down a Russian campaign that used ChatGPT to spread propaganda ahead of Germany's federal election.
Key Takeaways
- The Institute of the Estonian Language's benchmark tests 60 AI models against Russian propaganda.
- Claude models lead and Mistral's land in the bottom third; separately, a Newsguard study put Mistral's misinformation rate at 36.67%.
Why Mistral's poor performance matters
Mistral's bottom-third finish is particularly damaging given its stated mission. The company has raised over €1 billion and markets itself as a sovereign European AI provider, a narrative that depends on trust and reliability. The 36.67% misinformation rate reported in the separate Newsguard study already cut against that pitch, and the benchmark points the same way. For European enterprises and governments considering Mistral for sensitive deployments, it is evidence of a weakness that the top-ranked competitors handle better.
The benchmark's limitations
The benchmark is reasonably rigorous: 75 questions across three languages, scored by an evaluation model that disinformation experts validated. But it tests only the language model on its own, without web search, retrieval-augmented generation, or other tools. In production, many systems augment LLMs with external knowledge, which could reduce susceptibility to propaganda. Still, the gap between top and bottom performers suggests real differences in training data curation and alignment techniques.

What to watch
Watch for Mistral's response to this benchmark before its €3B funding round closes. European regulators may cite these findings in upcoming AI safety requirements. Also track whether Russian disinformation networks adapt their tactics to exploit the model-specific weaknesses the benchmark reveals.
Source: the-decoder.com









