gentic.news — AI News Intelligence Platform
AI Research · Score: 70

Prithvi-EO Fails Cross-Country Crop Yield Generalization, Paper Shows

Prithvi-EO and ViT-Base embeddings produce universally negative R² in cross-country maize yield prediction, failing to beat traditional spectral features due to yield distribution shift.

7h ago · 3 min read · 5 views · AI-Generated

Source: arxiv.org via arxiv_ml (single source)
Do foundation model embeddings improve cross-country crop yield generalization?

Prithvi-EO-1.0-100M and ViT-Base embeddings yield universally negative R² values under leave-one-country-out evaluation on 6,404 maize field observations from five African countries, failing to outperform traditional Sentinel-2 spectral features.

TL;DR

Prithvi-EO and ViT-Base yield negative R² cross-country. · Leave-one-country-out reveals generalization gap in yield prediction. · Yield distribution shift, not representation, is key limitation.

A new arXiv paper from April 2026 finds Prithvi-EO and ViT-Base embeddings yield universally negative R² under cross-country maize yield prediction. The study evaluates 6,404 field observations across five African nations using a leave-one-country-out scheme.

Key facts

  • 6,404 maize field observations from five African countries.
  • Prithvi-EO / Ridge achieves least-negative LOCO R² of −0.027.
  • All nine feature–regressor combinations yield negative cross-country R².
  • Within-country random CV yields moderate R²; cross-country collapses.
  • Paper argues yield distribution shift, not representation, is the limit.

Geospatial foundation models are marketed as universal feature extractors for Earth observation tasks, but a rigorous generalization test out of sub-Saharan Africa shows they fail to transfer across national boundaries. The paper, Do Foundation Model Embeddings Improve Cross-Country Crop Yield Generalisation? A Leave-One-Country-Out Evaluation in Sub-Saharan Africa [arXiv], tests Prithvi-EO-1.0-100M (a NASA-developed Vision Transformer pretrained on satellite imagery) and ViT-Base against traditional Sentinel-2 spectral indices.

The core finding: every feature-regressor combination achieves negative R² under leave-one-country-out (LOCO) cross-validation. Within-country random splits yield moderate R², but the moment the model must predict on an unseen country, performance collapses. The best result comes from Prithvi-EO with Ridge regression, scoring −0.027 R². That means the models are worse than simply predicting the mean yield of the target country.
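The LOCO protocol itself is easy to reproduce. Below is a minimal sketch (not the paper's code) using scikit-learn's `LeaveOneGroupOut` on synthetic data, where yields carry a per-country offset that the features do not encode; cross-country R² goes negative for exactly the reason the paper describes.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-ins: rows are field observations, groups are countries.
X = rng.normal(size=(500, 16))                  # stand-in for frozen embeddings
countries = rng.integers(0, 5, size=500)        # five "countries"
y = rng.normal(loc=2.0 * countries, scale=1.0)  # per-country yield offset

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=countries):
    model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    scores.append(r2_score(y[test_idx], model.predict(X[test_idx])))
    print(f"held-out country {countries[test_idx][0]}: R^2 = {scores[-1]:.3f}")
```

Because the yield offset is tied to the country and absent from the features, the average R² across folds lands well below zero, while a random split of the same data would look respectable.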

Why Foundation Models Don't Help

The paper's central argument: the bottleneck is not representation quality but a shift in yield distribution between countries. Even frozen Prithvi-EO embeddings, which encode rich spatial-spectral features, cannot compensate for the fact that maize yields in Kenya follow a different distribution than those in Tanzania. The authors argue that most published benchmarks overstate generalization by reporting only within-country performance.
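The target-shift argument can be illustrated with a toy construction (mine, not the paper's): give a model one feature that explains within-country yield variation well, then shift the test country's baseline yield by a constant. The representation is fine, yet cross-country R² still collapses.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
n = 300
x_a = rng.normal(size=(n, 1))   # training country A
x_b = rng.normal(size=(n, 1))   # held-out country B

# Identical feature-yield relationship, but country B's baseline yield is
# shifted by a constant (different soils, varieties, practices).
y_a = 3.0 * x_a[:, 0] + rng.normal(scale=0.5, size=n)
y_b = 3.0 * x_b[:, 0] + 4.0 + rng.normal(scale=0.5, size=n)

model = Ridge(alpha=1.0).fit(x_a, y_a)
r2_within = r2_score(y_a, model.predict(x_a))  # high: the feature works
r2_cross = r2_score(y_b, model.predict(x_b))   # negative: target shift dominates
print(f"within-country R^2 = {r2_within:.2f}, cross-country R^2 = {r2_cross:.2f}")
```

No improvement to the feature can fix this: the information that separates country A's yields from country B's simply is not in the inputs.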

This echoes a broader pattern in applied ML: foundation models excel when the test distribution closely matches the training distribution, but their value diminishes under severe covariate shift. The paper releases a reproducible negative benchmark — a rare and valuable contribution for a field that tends to publish only positive results.

Implications for Food Security AI

Accurate cross-country yield forecasting is critical for food security planning in sub-Saharan Africa, where smallholder maize farming dominates. The negative result suggests that purely satellite-based models, even with foundation model embeddings, cannot replace ground-truth yield surveys or country-specific calibration. Future work must either collect more representative training data or develop methods to handle distribution shift explicitly.
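What "country-specific calibration" could look like in its simplest form (a hypothetical sketch, not a method from the paper): use a handful of ground-truth surveys from the target country to estimate and remove the mean prediction offset.

```python
import numpy as np

def offset_calibrate(preds, survey_idx, survey_yields):
    """Shift all predictions by the mean residual observed on a small
    set of ground-truth surveyed fields (an intercept-only correction)."""
    offset = np.mean(survey_yields - preds[survey_idx])
    return preds + offset

# Toy demo: the model under-predicts the target country by a constant 4.0.
rng = np.random.default_rng(2)
true_yields = rng.normal(loc=10.0, scale=2.0, size=200)
preds = true_yields - 4.0 + rng.normal(scale=0.3, size=200)

surveyed = rng.choice(200, size=10, replace=False)  # ten surveyed fields
corrected = offset_calibrate(preds, surveyed, true_yields[surveyed])
```

An intercept-only fix like this only handles a constant shift; if the shape of the yield distribution differs too, as Figure 7 suggests for Nigeria and Rwanda, a richer adaptation method would be needed.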

The study joins a growing body of work showing that foundation models for Earth observation are not silver bullets. A prior paper from April 2026 [arXiv] evaluating nine pretrained audio models for music recommendation similarly found that pretraining does not guarantee cross-domain transfer.

What to Watch

Watch for follow-up work that attempts to close the generalization gap — either through domain adaptation techniques, multi-task learning across countries, or integration of non-satellite data sources like soil surveys and market prices. The authors' released benchmark provides a standardized evaluation protocol for future methods to beat.

Also monitor whether NASA or IBM adjust Prithvi-EO training to include more diverse geographic yield data.

Figure 7: Yield distributions (kg/ha) per country. Nigeria and Rwanda exhibit markedly different central tendencies and spreads.



AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

This paper delivers a clean negative result that cuts against the hype around geospatial foundation models. The leave-one-country-out design is the right evaluation — most benchmarks use random splits that inflate generalization claims. The key insight is that distribution shift in the target variable (yield) dominates any gains from better representations. This mirrors findings in other domains: foundation models transfer well when the task is consistent but struggle when the output distribution changes. The authors' decision to release a reproducible benchmark is commendable and should become standard practice. The field needs more such papers to ground expectations.