CORE: A Robust OOD Detection Method Using Orthogonal Feature Decomposition
Out-of-distribution (OOD) detection—identifying when a model encounters data unlike its training distribution—is a critical safety component for real-world deep learning deployment. Current methods suffer from inconsistent performance: a technique that excels on one model architecture or dataset often fails on another. A new paper, "CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring," proposes a solution by fundamentally rethinking how to extract detection signals from a neural network.
The Core Problem: Entangled Signals
The paper identifies a structural limitation shared by existing OOD detection approaches. These methods generally fall into two categories:
- Logit-based methods (e.g., Maximum Softmax Probability, ODIN): These use the classifier's final output (logits or softmax probabilities) as a confidence score. They only see the model's prediction certainty.
- Feature-based methods (e.g., Mahalanobis distance, Gram matrices): These analyze the model's internal activations (features) to measure how "close" a sample is to the training distribution.
The authors argue that logit-based methods miss a crucial signal: whether the input's features actually belong to the learned distribution. Feature-based methods attempt to capture this "membership" signal but do so in the full, high-dimensional feature space where it is entangled with the classifier's confidence signal and architectural noise. This entanglement makes their performance highly sensitive to model architecture.
What CORE Does: Disentangling Orthogonal Subspaces
The key insight of CORE is that the penultimate layer features (the layer before the final classification head) naturally decompose into two orthogonal components relative to the classifier.

- The Classifier-Aligned Component: This is the projection of the feature vector onto the subspace spanned by the classifier's weight vectors. This component directly determines the logits and thus encodes the model's confidence signal.
- The Orthogonal Residual: This is the remaining part of the feature vector that is orthogonal to the classifier's subspace. The classifier explicitly discards this information. The authors discovered that for in-distribution (ID) data, this residual carries a consistent, class-specific directional signature—a pure membership signal.
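This decomposition is easy to verify numerically. A minimal NumPy sketch, using random placeholder weights and features rather than a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
d, C = 512, 10                 # hypothetical feature dimension and class count
W = rng.normal(size=(d, C))    # stand-in for the classifier's weight matrix
h = rng.normal(size=d)         # a penultimate-layer feature vector

# Orthonormal basis Q for the column space of W, via QR decomposition
Q, _ = np.linalg.qr(W)

h_align = Q @ (Q.T @ h)        # classifier-aligned component
h_res = h - h_align            # orthogonal residual

assert np.allclose(h_align + h_res, h)   # the two parts reconstruct h exactly
assert abs(h_align @ h_res) < 1e-8       # and they are orthogonal
assert np.allclose(W.T @ h_res, 0)       # the logits never see the residual
```

The last assertion is the crux: since W^T h_res = 0, the logits depend only on h_align, so everything in h_res is information the classifier discards.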
CORE's innovation is to score these two orthogonal subspaces independently:
- Confidence Score (S_conf): Derived from the classifier's output (e.g., the maximum softmax probability).
- Residual Score (S_res): A measure computed on the orthogonal residual features. The paper suggests using a simple Mahalanobis distance computed solely in this residual subspace.
These two scores are normalized and combined via a weighted sum to produce the final OOD score: S_CORE = α * S_conf + (1-α) * S_res.
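The normalize-and-combine step can be sketched as follows. This assumes both raw scores have already been oriented so that higher means more in-distribution (e.g., the Mahalanobis distance would be negated first); the helper names, epsilon guard, and example values are illustrative, not from the paper:

```python
def minmax_normalize(s, s_min, s_max):
    # (s_min, s_max) are statistics collected on a held-out validation set
    return (s - s_min) / (s_max - s_min + 1e-12)

def core_score(s_conf, s_res, alpha, conf_range, res_range):
    # Weighted sum of the two normalized, orthogonal scores:
    # S_CORE = alpha * S_conf + (1 - alpha) * S_res
    s_c = minmax_normalize(s_conf, *conf_range)
    s_r = minmax_normalize(s_res, *res_range)
    return alpha * s_c + (1 - alpha) * s_r

# Toy usage with hypothetical validation ranges and alpha
score = core_score(s_conf=0.97, s_res=-3.2, alpha=0.6,
                   conf_range=(0.1, 1.0), res_range=(-10.0, 0.0))
```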
Because the confidence and residual signals are orthogonal by construction, their failure modes are, in theory, approximately independent. When one signal is unreliable (e.g., an OOD sample that yields high softmax confidence), the other can compensate, leading to robust detection.
Key Results
The authors evaluated CORE across a rigorous benchmark suite: five model architectures (DenseNet, ResNet, WideResNet, ViT, MLP-Mixer) and five OOD detection configurations (CIFAR-10 vs SVHN/TinyImageNet, etc.). Performance was measured using Area Under the Receiver Operating Characteristic curve (AUROC) and False Positive Rate at 95% True Positive Rate (FPR95).

CORE achieved competitive or state-of-the-art performance, ranking first in 3 out of 5 benchmark settings and attaining the highest grand average AUROC across all architectures and datasets.
| Benchmark | Best Baseline (Method) | CORE | Δ |
|---|---|---|---|
| CIFAR-10 vs SVHN | 99.3% (ReAct) | 99.6% | +0.3 pp |
| CIFAR-10 vs TinyImageNet | 95.1% (Mahalanobis) | 96.7% | +1.6 pp |
| CIFAR-100 vs SVHN | 98.5% (ReAct) | 99.2% | +0.7 pp |
| CIFAR-100 vs TinyImageNet | 87.9% (KNN) | 88.1% | +0.2 pp |
| ImageNet-1K vs iNaturalist | 91.5% (Mahalanobis) | 92.8% | +1.3 pp |

Table: Selected AUROC results showing CORE's performance against strong baselines like ReAct, Mahalanobis distance, and K-Nearest Neighbors (KNN).
Crucially, CORE adds negligible computational overhead—primarily the cost of projecting features onto the orthogonal residual subspace and computing a distance—making it practical for real-time systems.
How It Works: Technical Implementation
For a model with penultimate feature h ∈ R^d and a linear classifier with weight matrix W ∈ R^(d×C) (for C classes) and bias b, the logits are z = W^T h + b.

The orthogonal decomposition is achieved as follows:
- Compute the Classifier Subspace Basis: Perform a QR decomposition on W to get an orthonormal basis Q ∈ R^(d×r) for the column space of W, where r is the rank of W (typically r = C or r = C-1).
- Project Features: The classifier-aligned component is h_align = Q Q^T h. The orthogonal residual is h_res = h - h_align = (I - Q Q^T) h.
- Score Calculation: S_conf is the maximum softmax probability (or energy score) from z. S_res is the Mahalanobis distance of h_res, computed using the mean and covariance estimated from the training set's residual features. This is simplified because h_res lies in a (d-r)-dimensional subspace.
- Normalize & Combine: Min-max normalize each score over a held-out validation set, then combine with a fixed weight α (tuned on validation data).
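The residual-scoring step above can be sketched end-to-end in NumPy. This is a class-agnostic simplification with random placeholder data; the paper's exact estimator (e.g., per-class statistics) may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
d, C, n = 64, 10, 500                  # hypothetical feature dim, classes, train samples
W = rng.normal(size=(d, C))            # stand-in for the final linear layer's weights
H_train = rng.normal(size=(n, d))      # stand-in for training penultimate features

# Full QR: the first r columns of Q_full span col(W); the remaining
# d - r columns form an orthonormal basis of the orthogonal complement.
Q_full, _ = np.linalg.qr(W, mode="complete")
r = np.linalg.matrix_rank(W)
Q_perp = Q_full[:, r:]                 # shape (d, d - r)

# Training residuals expressed directly in (d - r)-dimensional coordinates
R_train = H_train @ Q_perp
mu = R_train.mean(axis=0)
cov = np.cov(R_train, rowvar=False) + 1e-6 * np.eye(d - r)  # regularized covariance
cov_inv = np.linalg.inv(cov)

def residual_score(h):
    # Mahalanobis distance of the orthogonal residual of feature vector h,
    # computed in the reduced (d - r)-dimensional subspace
    z = h @ Q_perp - mu
    return float(np.sqrt(z @ cov_inv @ z))
```

Note that adding any component from the classifier's subspace to h leaves residual_score(h) unchanged, which is precisely the sense in which S_res is blind to the confidence signal.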
The method is model-agnostic, requiring only access to the penultimate features and the final linear layer.
Why It Matters
CORE addresses a fundamental pain point in OOD detection: the lack of a robust, architecture-agnostic method. By leveraging a natural orthogonal decomposition present in any classifier, it provides a principled way to disentangle and combine confidence and membership signals. The results demonstrate consistent high performance without specialized tuning per architecture, moving closer to a "drop-in" OOD detection module. Its low computational cost makes it immediately applicable for enhancing the reliability of deployed vision models, from medical diagnostics to autonomous vehicles.