CORE OOD Detection Method Achieves SOTA on 3 of 5 Benchmarks by Disentangling Confidence and Residual Signals


Researchers propose CORE, a new OOD detection method that scores classifier confidence and orthogonal residual features separately. It achieves the highest grand average AUROC across five architectures with negligible computational overhead.


CORE: A Robust OOD Detection Method Using Orthogonal Feature Decomposition

Out-of-distribution (OOD) detection—identifying when a model encounters data unlike its training distribution—is a critical safety component for real-world deep learning deployment. Current methods suffer from inconsistent performance: a technique that excels on one model architecture or dataset often fails on another. A new paper, "CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring," proposes a solution by fundamentally rethinking how to extract detection signals from a neural network.

The Core Problem: Entangled Signals

The paper identifies a structural limitation shared by existing OOD detection approaches. These methods generally fall into two categories:

  1. Logit-based methods (e.g., Maximum Softmax Probability, ODIN): These use the classifier's final output (logits or softmax probabilities) as a confidence score. They only see the model's prediction certainty.
  2. Feature-based methods (e.g., Mahalanobis distance, Gram matrices): These analyze the model's internal activations (features) to measure how "close" a sample is to the training distribution.

The authors argue that logit-based methods miss a crucial signal: whether the input's features actually belong to the learned distribution. Feature-based methods attempt to capture this "membership" signal but do so in the full, high-dimensional feature space where it is entangled with the classifier's confidence signal and architectural noise. This entanglement makes their performance highly sensitive to model architecture.

What CORE Does: Disentangling Orthogonal Subspaces

The key insight of CORE is that the penultimate layer features (the layer before the final classification head) naturally decompose into two orthogonal components relative to the classifier.

Figure 1: Near-OOD and far-OOD AUROC across five model×ID settings; shaded bands span the best-to-worst scorer within each setting.

  1. The Classifier-Aligned Component: This is the projection of the feature vector onto the subspace spanned by the classifier's weight vectors. This component directly determines the logits and thus encodes the model's confidence signal.
  2. The Orthogonal Residual: This is the remaining part of the feature vector that is orthogonal to the classifier's subspace. The classifier explicitly discards this information. The authors discovered that for in-distribution (ID) data, this residual carries a consistent, class-specific directional signature—a pure membership signal.
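This decomposition can be sketched in a few lines of NumPy (shapes and data here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d, C = 512, 10                       # feature dim and class count (illustrative)
W = rng.standard_normal((d, C))      # final-layer weight matrix
h = rng.standard_normal(d)           # a penultimate feature vector

Q, _ = np.linalg.qr(W)               # orthonormal basis of the classifier subspace
h_align = Q @ (Q.T @ h)              # classifier-aligned component
h_res = h - h_align                  # orthogonal residual

# The two components are orthogonal, reconstruct h exactly, and the
# residual contributes nothing to the logits (W^T h_res ≈ 0).
assert abs(h_align @ h_res) < 1e-6
assert np.allclose(h_align + h_res, h)
assert np.allclose(W.T @ h_res, 0.0, atol=1e-6)
```

The last assertion is the key point: because the columns of W lie entirely in the span of Q, the residual is invisible to the classifier's logits, which is why it can carry an independent membership signal.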

CORE's innovation is to score these two orthogonal subspaces independently:

  • Confidence Score (S_conf): Derived from the classifier's output (e.g., softmax maximum).
  • Residual Score (S_res): A measure computed on the orthogonal residual features. The paper suggests using a simple Mahalanobis distance computed solely in this residual subspace.

These two scores are normalized and combined via a weighted sum to produce the final OOD score: S_CORE = α * S_conf + (1-α) * S_res.

Because the confidence and residual signals are orthogonal by construction, their failure modes are, in theory, approximately independent. When one signal is unreliable (e.g., an OOD sample that yields high softmax confidence), the other can compensate, yielding robust detection.
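A toy numeric illustration of this complementarity (the numbers are invented for illustration, not drawn from the paper):

```python
# Suppose both scores are min-max normalized to [0, 1] with higher = more
# in-distribution, and alpha = 0.5. A confidently misclassified OOD sample
# is still caught by its anomalous residual, and an ambiguous-but-genuine
# ID sample is rescued by its normal residual.
alpha = 0.5

def s_core(s_conf, s_res):
    return alpha * s_conf + (1 - alpha) * s_res

typical_id    = s_core(0.90, 0.85)   # both signals agree: high score
confident_ood = s_core(0.95, 0.05)   # fooled confidence, anomalous residual
low_conf_id   = s_core(0.20, 0.90)   # ambiguous ID sample, normal residual

assert typical_id > low_conf_id > confident_ood
```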

Key Results

The authors evaluated CORE across a rigorous benchmark suite: five model architectures (DenseNet, ResNet, WideResNet, ViT, MLP-Mixer) and five OOD detection configurations (CIFAR-10 vs SVHN/TinyImageNet, etc.). Performance was measured using Area Under the Receiver Operating Characteristic curve (AUROC) and False Positive Rate at 95% True Positive Rate (FPR95).
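For readers unfamiliar with these metrics, both reduce to a few lines of NumPy. This is not the paper's evaluation code, just the standard definitions, assuming higher score means more in-distribution:

```python
import numpy as np

def auroc(scores_id, scores_ood):
    """P(ID score > OOD score), counting ties as half."""
    diff = scores_id[:, None] - scores_ood[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def fpr95(scores_id, scores_ood):
    """Fraction of OOD samples accepted at the threshold keeping 95% of ID."""
    thresh = np.percentile(scores_id, 5)   # 95% of ID scores lie above this
    return (scores_ood >= thresh).mean()
```

For perfectly separated scores, auroc returns 1.0 and fpr95 returns 0.0; random scoring gives AUROC near 0.5.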


CORE achieved competitive or state-of-the-art performance, ranking first in 3 out of 5 benchmark settings and attaining the highest grand average AUROC across all architectures and datasets.

| Benchmark | Best baseline (method) | CORE | Δ AUROC |
|---|---|---|---|
| CIFAR-10 vs SVHN | 99.3% (ReAct) | 99.6% | +0.3 pp |
| CIFAR-10 vs TinyImageNet | 95.1% (Mahalanobis) | 96.7% | +1.6 pp |
| CIFAR-100 vs SVHN | 98.5% (ReAct) | 99.2% | +0.7 pp |
| CIFAR-100 vs TinyImageNet | 87.9% (KNN) | 88.1% | +0.2 pp |
| ImageNet-1K vs iNaturalist | 91.5% (Mahalanobis) | 92.8% | +1.3 pp |

Table: Selected AUROC results showing CORE's performance against strong baselines like ReAct, Mahalanobis distance, and K-Nearest Neighbors (KNN).

Crucially, CORE adds negligible computational overhead—primarily the cost of projecting features onto the orthogonal residual subspace and computing a distance—making it practical for real-time systems.

How It Works: Technical Implementation

For a model with penultimate feature h ∈ R^d and a linear classifier with weight matrix W ∈ R^(d×C) (for C classes) and bias b, the logits are z = W^T h + b.

Figure 2: Each column represents one scorer category: (a) logit-based energy score (far-OOD separates, near-OOD overlaps).

The orthogonal decomposition is achieved as follows:

  1. Compute the Classifier Subspace Basis: Perform a QR decomposition on W to get an orthonormal basis Q ∈ R^(d×r) for the column space of W, where r is the rank of W (typically r = C or r = C-1).
  2. Project Features: The classifier-aligned component is h_align = Q Q^T h. The orthogonal residual is h_res = h - h_align = (I - Q Q^T) h.
  3. Score Calculation:
    • S_conf = maximum softmax probability (or energy score) from z.
    • S_res = Mahalanobis distance of h_res, computed using the mean and covariance estimated from the training set's residual features. The computation simplifies because h_res lies in a (d-r)-dimensional subspace.
  4. Normalize & Combine: Min-max normalize each score over a held-out validation set, then combine with a fixed weight α (tuned on validation data).
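The steps above can be sketched end-to-end in NumPy. Variable names, the regularization term, and the sign convention (negating the normalized distance so that larger means more in-distribution for both terms) are our choices; the paper's exact normalization details may differ, and it tunes the statistics on held-out validation data rather than reusing training features as this short sketch does:

```python
import numpy as np

def msp(logits):
    """Maximum softmax probability, computed stably."""
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return p.max(axis=1)

def fit_core(feats_id, W, b, alpha=0.5):
    """Fit a CORE-style scorer on ID penultimate features.
    Returns a function mapping features -> score (higher = more ID)."""
    d, C = W.shape
    Qf, _ = np.linalg.qr(W, mode="complete")      # full orthonormal basis of R^d
    B = Qf[:, C:]                                 # basis of the residual subspace
    R = feats_id @ B                              # residual coordinates, (N, d-C)
    mu = R.mean(axis=0)
    prec = np.linalg.inv(np.cov(R, rowvar=False) + 1e-6 * np.eye(d - C))

    def maha(feats):
        delta = feats @ B - mu
        return np.einsum("ni,ij,nj->n", delta, prec, delta)

    # min-max statistics for normalization (estimated here on the ID set)
    c = msp(feats_id @ W + b)
    m = maha(feats_id)
    c0, c1, m0, m1 = c.min(), c.max(), m.min(), m.max()

    def score(feats):
        s_conf = (msp(feats @ W + b) - c0) / (c1 - c0 + 1e-12)
        s_res = (maha(feats) - m0) / (m1 - m0 + 1e-12)
        # flip the normalized distance so larger = more ID for both terms
        return alpha * s_conf + (1 - alpha) * (1.0 - s_res)

    return score
```

Note that working in the residual coordinates (via the complement basis B) keeps the covariance full-rank, which is the simplification the (d-r)-dimensional subspace affords.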

The method is model-agnostic, requiring only access to the penultimate features and the final linear layer.
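In PyTorch, for example, both requirements can be met with a forward hook on the final linear layer (the toy head and names below are ours, purely illustrative):

```python
import torch
import torch.nn as nn

# A toy head standing in for any backbone's final linear layer.
d, C = 512, 10
head = nn.Linear(d, C)

cache = {}
def grab_penultimate(module, inputs, output):
    # Forward hooks receive the module's inputs as a tuple;
    # inputs[0] is exactly the penultimate feature h.
    cache["h"] = inputs[0].detach()

head.register_forward_hook(grab_penultimate)

h = torch.randn(4, d)            # a batch of penultimate features
logits = head(h)
W = head.weight.detach().T       # shape (d, C), matching z = W^T h + b
b = head.bias.detach()

assert torch.equal(cache["h"], h)
assert torch.allclose(logits, h @ W + b, atol=1e-5)
```

No retraining or architectural change is needed; the hook and the weight matrix are everything CORE consumes.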

Why It Matters

CORE addresses a fundamental pain point in OOD detection: the lack of a robust, architecture-agnostic method. By leveraging a natural orthogonal decomposition present in any classifier, it provides a principled way to disentangle and combine confidence and membership signals. The results demonstrate consistent high performance without specialized tuning per architecture, moving closer to a "drop-in" OOD detection module. Its low computational cost makes it immediately applicable for enhancing the reliability of deployed vision models, from medical diagnostics to autonomous vehicles.

AI Analysis

CORE's primary contribution is a clever reframing of the feature space. The observation that the penultimate layer's residual (discarded by the classifier) contains a clean membership signal is significant. Prior feature-based methods like the Mahalanobis detector operated on the full feature space, where directions relevant for classification dominate the covariance. By restricting the membership test to the orthogonal complement of the classifier's subspace, CORE effectively filters out noise and focuses on a purer distributional signal.

Theoretical robustness arises from the orthogonality of failure modes. A confident but distributionally alien sample (a common failure case for logit-based methods) will have an anomalous residual score. Conversely, a distributionally similar but low-confidence sample (a failure case for some feature methods) will be flagged by the low confidence score. This complementary design is more elegant than prior ensemble methods that heuristically combine scores from different layers or techniques.

For practitioners, CORE is appealing due to its simplicity and low overhead. Implementing it requires adding a feature hook and performing a projection and distance calculation—far less costly than maintaining a separate model like an energy-based model or a flow. The need to compute the matrix `Q` for the classifier subspace is a one-time cost. The main tuning parameter is the combination weight `α`, which the paper finds can often be set to 0.5 or tuned on a small validation set. This makes it a strong candidate for a new default baseline in OOD detection benchmarks.

Original source: arxiv.org
