Deep-HiCEMs & MLCS: New Methods for Learning Multi-Level Concept Hierarchies from Sparse Labels
What Happened
A new research paper, "Digging Deeper: Learning Multi-Level Concept Hierarchies," introduces two significant technical advancements in the field of interpretable AI: Multi-Level Concept Splitting (MLCS) and Deep-HiCEMs.
The core problem addressed is a major limitation in concept-based models. While these models are designed to be interpretable by explaining predictions using human-understandable concepts (like "stripes," "formal," "sporty"), they traditionally require exhaustive, fine-grained annotations for every concept. Furthermore, they typically treat all concepts as existing on a single, flat level without modeling their natural hierarchical relationships (e.g., "evening wear" is a type of "formal attire," which contains sub-concepts like "gown" or "tuxedo").
Previous work made strides with Hierarchical Concept Embedding Models (HiCEMs), which explicitly model concept relationships, and Concept Splitting, which discovers sub-concepts using only coarse, top-level labels. However, both approaches were restricted to shallow, often two-level hierarchies.
This new research overcomes that depth limitation.
Technical Details
Multi-Level Concept Splitting (MLCS)
MLCS is a method for discovering a multi-level hierarchy of concepts using only top-level supervision. Instead of needing annotators to label every nuanced sub-concept in a dataset, MLCS can take a dataset labeled with broad categories (e.g., "footwear," "outerwear," "bags") and automatically discover a tree-like structure of finer-grained concepts within them.
For example, given a dataset of products labeled simply as "footwear," MLCS could iteratively discover that this category splits into "heels," "sneakers," and "loafers." It might then discover that "sneakers" further splits into "running sneakers," "lifestyle sneakers," and "court sneakers," all without any of those sub-labels being provided during training. The authors state experiments show MLCS discovers "human-interpretable concepts absent during training."
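The recursive splitting idea above can be sketched in a few lines. Note this is a toy illustration, not the paper's actual MLCS algorithm: the splitting primitive (a tiny k-means over item embeddings) and the stopping rules (`max_depth`, `min_size`) are assumptions for the sketch, since the paper's splitting criterion is not reproduced here.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means (Lloyd's algorithm) used as the splitting primitive."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def split_concept(X, depth, max_depth=2, min_size=10, k=2):
    """Recursively split one concept's examples into sub-concept clusters.

    Returns a nested dict describing the discovered tree. When and how to
    split is the crux of the real method; this sketch simply splits
    greedily until max_depth or min_size is reached.
    """
    if depth >= max_depth or len(X) < 2 * min_size:
        return {"size": len(X)}
    labels = kmeans(X, k, seed=depth)
    return {
        "size": len(X),
        "children": [
            split_concept(X[labels == j], depth + 1, max_depth, min_size, k)
            for j in range(k)
        ],
    }

# Toy "footwear" embeddings: three latent sub-groups, no sub-labels given.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(c, 0.1, size=(30, 8)) for c in (0.0, 1.0, 2.0)])
tree = split_concept(X, depth=0)
```

Only the coarse label ("footwear") selects which rows go in; the tree of sub-concepts emerges from the data, mirroring how MLCS needs only top-level supervision.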
Deep-HiCEMs
Deep-HiCEMs is a new neural network architecture designed to represent the multi-level hierarchies discovered by MLCS. Its key innovation is enabling test-time concept interventions at multiple levels of abstraction.
In a standard model, you get a prediction and perhaps an explanation. With a Deep-HiCEM, a human operator (like a merchandiser or analyst) can interact with the model's reasoning process. They could, for instance, intervene at a high level by telling the model, "For this analysis, focus on the 'formalwear' branch of the concept tree." Or, they could drill down and adjust the model's confidence in a specific low-level concept like "silk fabric." The paper reports that these interventions can not only improve interpretability but can also improve task performance (e.g., classification accuracy) by correcting or guiding the model's reasoning pathway.
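A multi-level intervention can be pictured as overwriting one node's activation in a concept tree and letting the downstream prediction recompute. The sketch below is purely illustrative: the concept names, the tree encoding, and the weighted-sum task head are invented for this example and are not the Deep-HiCEM architecture.

```python
# Hypothetical concept tree with per-node predicted probabilities.
concept_tree = {
    "formalwear": {"p": 0.40, "children": {
        "evening_gown": {"p": 0.30, "children": {}},
        "tuxedo": {"p": 0.10, "children": {}},
    }},
    "sportswear": {"p": 0.60, "children": {}},
}

def intervene(tree, path, value):
    """Overwrite the model's activation for one concept node at test time."""
    node = tree[path[0]]
    for name in path[1:]:
        node = node["children"][name]
    node["p"] = value

def task_score(tree, weights):
    """Toy downstream head: weighted sum over all concept activations."""
    total = 0.0
    def walk(subtree, prefix):
        nonlocal total
        for name, node in subtree.items():
            key = prefix + (name,)
            total += weights.get(key, 0.0) * node["p"]
            walk(node["children"], key)
    walk(tree, ())
    return total

weights = {("formalwear",): 1.0, ("formalwear", "evening_gown"): 0.5}
before = task_score(concept_tree, weights)
intervene(concept_tree, ["formalwear"], 0.95)  # operator: "this IS formalwear"
after = task_score(concept_tree, weights)
```

Because the hierarchy is explicit, the same `intervene` call works at any depth: `["formalwear"]` adjusts a broad branch, while `["formalwear", "evening_gown"]` would correct a single low-level concept.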
Retail & Luxury Implications
The potential applications of this research in retail and luxury are profound, primarily because it tackles two critical industry pain points: the cost of data annotation and the need for trustworthy, granular AI insights.
1. Automating Product Taxonomy & Attribute Discovery:
Building and maintaining a detailed, hierarchical product taxonomy (Category > Sub-Category > Class > Attribute) is a massive, manual undertaking for retailers. MLCS offers a path to automate the discovery and structuring of this taxonomy from existing, coarsely labeled product catalogs. A brand could feed its product database labeled only by department (e.g., "Ready-to-Wear," "Leather Goods") into an MLCS system and have it propose a detailed, hierarchical attribute tree—discovering style families, material clusters, and design motifs that even human merchandisers might not have explicitly codified.
2. Interpretable Visual Search & Recommendation:
When a vision model powering "search similar" or "complete the look" recommends a product, the question is always "why?" Deep-HiCEMs could provide the answer through a navigable concept hierarchy. The explanation wouldn't be a confusing list of neural activations but a clear path: "This handbag was recommended because it shares high-level concept 'Evening Bag' (85% match) and mid-level concept 'Crystal Embellishment' (92% match) with the item you're viewing." This builds user trust and allows for more refined, concept-based filtering.
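An explanation of that shape could be assembled by comparing the two items' concept activations and reporting the ones both activate strongly. Everything here is illustrative: the concept names, scores, and the min-based "match" definition are invented for the sketch, not outputs of the actual model.

```python
# Hypothetical per-concept activations for the viewed item and a candidate.
query = {"evening_bag": 0.91, "crystal_embellishment": 0.95, "chain_strap": 0.20}
candidate = {"evening_bag": 0.88, "crystal_embellishment": 0.90, "top_handle": 0.70}

def explain_match(a, b, threshold=0.5):
    """List concepts both items activate strongly, best matches first."""
    shared = []
    for concept in a.keys() & b.keys():  # concepts scored for both items
        match = min(a[concept], b[concept])  # both must be confident
        if match >= threshold:
            shared.append((concept, match))
    return sorted(shared, key=lambda t: -t[1])

reasons = explain_match(query, candidate)
# → [('crystal_embellishment', 0.90), ('evening_bag', 0.88)]
```

In a hierarchical model each concept would carry its path (e.g. bag > evening_bag), letting the shopper expand or collapse levels of the explanation.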
3. Analyst-in-the-Loop Forecasting & Trend Analysis:
A Deep-HiCEM trained on seasonal sales data, social media imagery, and runway show photos could learn a hierarchy of visual trends. An analyst could then use test-time interventions to run scenarios: "What if the 'Y2K' trend (high-level concept) weakens, but the 'Low-Rise' sub-trend within it remains strong? How does that affect the forecasted performance of our denim line?" This moves analytics from black-box predictions to interactive, concept-driven simulation.
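That what-if query, weaken a parent trend while pinning one of its sub-trends, can be mocked up as below. The trend names, strengths, and the multiplicative "forecast" rule are all hypothetical placeholders; a real Deep-HiCEM would learn these quantities from data.

```python
# Hypothetical trend hierarchy: a parent trend gating its sub-trends.
trends = {
    "y2k": {"strength": 0.8, "children": {"low_rise": 0.9, "baby_tee": 0.7}},
}

def denim_forecast(trends, pinned=(), base=100.0):
    """Toy forecast: sub-trends are gated by their parent unless pinned."""
    y2k = trends["y2k"]
    lift = 0.0
    for name, strength in y2k["children"].items():
        gate = 1.0 if name in pinned else y2k["strength"]
        lift += gate * strength
    return base * (1 + lift)

baseline = denim_forecast(trends)                       # Y2K strong overall
trends["y2k"]["strength"] = 0.3                          # scenario: Y2K weakens...
scenario = denim_forecast(trends, pinned={"low_rise"})   # ...but low-rise holds
```

The point of the multi-level structure is exactly this: the analyst can hold one sub-concept fixed while varying its parent, something a flat concept model cannot express.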
4. Quality Control & Craftsmanship Auditing:
For luxury houses, preserving craftsmanship standards is paramount. A Deep-HiCEM trained on images of flawless and defective items could learn a hierarchy of quality concepts—from broad categories like "stitching integrity" down to specific flaws like "uneven saddle stitch." Inspectors could use the model to highlight areas of concern, with the AI explaining its suspicion by pointing to specific, interpretable concepts in its hierarchy.
The fundamental value proposition is the shift from expensive, flat-label AI to efficient, hierarchical-concept AI. It reduces the dependency on armies of annotators to label every minute detail while providing a much richer, more actionable, and human-controllable structure for AI reasoning.

