Structured representation
30 articles about structured representation in AI news
SSL: Structured Skill Language Boosts Skill Discovery MRR to 0.707
Researchers propose SSL, a three-layer typed JSON representation for AI agent skills, replacing unstructured SKILL.md prose. Using an LLM normalizer, SSL improves Skill Discovery MRR from 0.573 to 0.707 and Risk Assessment macro F1 from 0.744 to 0.787 on a newly released 6,184-skill corpus.
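The headline metric here, Mean Reciprocal Rank (MRR), averages the reciprocal rank of the first relevant result per query. A minimal sketch of how such a score is computed (the example ranks are illustrative, not from the paper's corpus):

```python
# Mean Reciprocal Rank: each query contributes 1/rank of its first
# relevant hit; queries with no relevant result retrieved contribute 0.

def mean_reciprocal_rank(first_relevant_ranks):
    """first_relevant_ranks: 1-based rank per query, or None if missed."""
    scores = [1.0 / r if r is not None else 0.0 for r in first_relevant_ranks]
    return sum(scores) / len(scores)

# Three queries: relevant skill found at ranks 1 and 2, one miss.
print(mean_reciprocal_rank([1, 2, None]))  # 0.5
```

A jump from 0.573 to 0.707 roughly means the first relevant skill moves, on average, noticeably closer to the top of the ranking.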
How Structured JSON Inputs Eliminated Hallucinations in a Fine-Tuned 7B Code Model
A developer fine-tuned a 7B code model on consumer hardware to generate Laravel PHP files. Hallucinations persisted until prompts were replaced with structured JSON specs, which eliminated ambiguous gap-filling errors and reduced debugging time dramatically.
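A hedged sketch of the prompt swap described above: a structured JSON spec in place of free-text instructions. The field names and spec schema here are illustrative assumptions, not the developer's actual format:

```python
import json

# Hypothetical structured spec for a Laravel file. Because every field is
# explicit, the model has no ambiguous gaps to fill with invented code.
spec = {
    "file": "app/Http/Controllers/InvoiceController.php",
    "class": "InvoiceController",
    "methods": [
        {"name": "index", "returns": "view:invoices.index"},
        {"name": "store", "validates": ["amount:numeric", "client_id:exists"]},
    ],
}

prompt = "Generate the Laravel file described by this spec:\n" + json.dumps(spec, indent=2)
print(prompt.splitlines()[0])
```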
Hybrid Self-evolving Structured Memory: A Breakthrough for GUI Agent Performance
Researchers propose HyMEM, a graph-based memory system for GUI agents that combines symbolic nodes with continuous embeddings. It enables multi-hop retrieval and self-evolution, boosting open-source VLMs to surpass closed-source models like GPT-4o on computer-use tasks.
Interluxe Group Launches Optima AI Index to Shape Luxury Discovery
The Interluxe Group has introduced the Optima AI Index, a new data standard aimed at enhancing the accuracy and visibility of luxury brand information within generative AI platforms. This initiative seeks to address the challenge of inconsistent brand discovery in AI-driven search, providing a structured foundation for brand representation.
LLM Schema-Adaptive Method Enables Zero-Shot EHR Transfer
Researchers propose Schema-Adaptive Tabular Representation Learning, an LLM-driven method that transforms structured variables into semantic statements. It enables zero-shot alignment across unseen EHR schemas and outperforms clinical baselines, including neurologists, on dementia diagnosis tasks.
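The core move, rendering structured variables as natural-language statements so an LLM can align unseen schemas, can be sketched as below. The column names, templates, and values are illustrative assumptions, not the paper's schema:

```python
# Turn a tabular row into semantic statements: columns the template map
# knows are verbalized; unmapped columns are simply skipped.

def row_to_statements(row, schema):
    """Render each known (column, value) pair as a semantic statement."""
    return [schema[col].format(value=val) for col, val in row.items() if col in schema]

schema = {
    "age": "The patient is {value} years old.",
    "mmse": "The patient's MMSE cognitive score is {value}.",
}
row = {"age": 74, "mmse": 21, "site_specific_code": "Z9"}
print(row_to_statements(row, schema))
```

Because the output is plain language rather than column positions, a model trained on one EHR schema can, in principle, consume rows from another without retraining.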
AFMRL: Using MLLMs to Generate Attributes for Better Product Retrieval
AFMRL uses MLLMs to generate product attributes, then trains stronger multimodal representations from those attributes for e-commerce retrieval, achieving state-of-the-art results on large-scale datasets.
Columbia Prof: LLMs Can't Generate New Science, Only Map Known Data
Columbia CS Professor Vishal Misra argues LLMs cannot generate new scientific ideas because they learn structured maps of known data and fail outside those boundaries. True discovery requires creating new conceptual maps, a capability current architectures lack.
GPT-5.5 Generates Complex SVG in Single Prompt, User Reports
A developer shared that OpenAI's GPT-5.5 produced a sophisticated SVG image from a single prompt. This suggests improvements in the model's ability to generate precise, structured visual code.
Andrej Karpathy's LLM-Wiki Framework Solves AI Amnesia with Persistent Knowledge
Andrej Karpathy published a two-page framework called LLM-Wiki that transforms how AI systems handle accumulated knowledge. Instead of retrieving from raw documents each time, the AI compiles sources into its own structured wiki that persists across sessions.
LLM 'Declared Losses' Reveal Epistemic Nuance Missed by Neutrosophic Scalars
A study extending neutrosophic logic evaluation of LLMs finds scalar T/I/F outputs are insufficient, collapsing paradox, ignorance, and contingency into identical scores. Adding structured 'declared loss' descriptions recovers these distinctions with Jaccard similarity <0.10.
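The reported Jaccard similarity below 0.10 measures word-set overlap between the 'declared loss' descriptions; near-zero overlap means the model's descriptions of paradox, ignorance, and contingency genuinely differ. A minimal sketch with illustrative descriptions (not the study's data):

```python
# Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|.
# A value < 0.10 means the two descriptions share almost no terms.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

paradox = "claim is both supported and refuted by the same source".split()
ignorance = "no evidence exists either way for this claim".split()
print(round(jaccard(paradox, ignorance), 3))  # 0.059
```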
Image Prompt Packaging Cuts Multimodal Inference Costs Up to 91%
A new method called Image Prompt Packaging (IPPg) embeds structured text directly into images, reducing token-based inference costs by 35.8–91% across GPT-4.1, GPT-4o, and Claude 3.5 Sonnet. Performance outcomes are highly model-dependent, with GPT-4.1 showing simultaneous accuracy and cost gains on some tasks.
Luma Labs Launches Uni-1: An Autoregressive Transformer for Image Generation with a Pre-Generation Reasoning Phase
Luma Labs has released Uni-1, a foundational image model that uses an autoregressive transformer to reason about user intent before generating pixels. It aims to address the 'intent gap' common in diffusion models by adding a structured reasoning step.
NEO: A Unified Language Model for Large-Scale Search, Recommendation, and Reasoning
Researchers propose NEO, a framework that adapts a pre-trained LLM into a single, tool-free model for catalog-grounded tasks like recommendation and search. It represents items as structured IDs (SIDs) interleaved with text, enabling controlled, valid outputs. This offers a path to consolidate discovery systems.
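One payoff of structured IDs is that outputs can be validated against the catalog. A toy sketch of that validity check; the `<sid_...>` token format is an assumption for illustration, not NEO's actual tokens:

```python
import re

# A valid generation may only mention SIDs that exist in the catalog,
# which is what makes SID-interleaved text "controlled" output.
catalog = {"<sid_481_07>", "<sid_122_93>"}

def valid_output(text, catalog):
    """True iff every SID token in the text exists in the catalog."""
    return all(sid in catalog for sid in re.findall(r"<sid_[0-9_]+>", text))

print(valid_output("Try <sid_481_07> with <sid_122_93>.", catalog))  # True
print(valid_output("Try <sid_999_00>.", catalog))                    # False
```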
Google Launches Gemini Embedding 2: A New Multimodal Foundation for AI Applications
Google has released Gemini Embedding 2, a second-generation multimodal embedding model designed to process text, images, and audio simultaneously. This technical advancement creates more unified AI representations, potentially improving search, recommendation, and personalization systems.
New Research: ADC-SID Framework Improves Semantic ID Generation by Denoising Collaborative Signals
A new arXiv paper proposes ADC-SID, a framework that adaptively denoises collaborative information to create more robust Semantic IDs for recommender systems. It specifically addresses the corruption of long-tail item representations, a critical problem for large retail catalogs.
Guardian AI: How Markov Chains, RL, and LLMs Are Revolutionizing Missing-Child Search Operations
Researchers have developed Guardian, an AI system that combines interpretable Markov models, reinforcement learning, and LLM validation to create dynamic search plans for missing children during the critical first 72 hours. The system transforms unstructured case data into actionable geospatial predictions with built-in quality assurance.
Accenture's Memex(RL) Revolutionizes AI Agent Memory for Complex Tasks
Accenture researchers have developed Memex(RL), a breakthrough system that gives AI agents structured, searchable memory for long-horizon tasks. This solves the critical problem of agents losing track of past experiences during complex operations like deep research and multi-step planning.
Intent Engineering: The Framework for Reliable AI Agents in Luxury Retail
Intent Engineering provides a structured layer between business goals and AI execution, enabling reliable luxury service agents, personalized styling, and automated clienteling that maintains brand standards.
Utonia AI Breakthrough: A Single Transformer Model Unifies All 3D Point Cloud Data
Researchers have developed Utonia, a single self-supervised transformer that learns unified 3D representations across diverse point cloud data types including LiDAR, CAD models, indoor scans, and video-lifted data. This breakthrough enables unprecedented cross-domain transfer and emergent behaviors in 3D AI.
Beyond Single Prompts: How 'Codified Context' Solves AI's Memory Problem in Large-Scale Development
A new research paper reveals why single-file AI agent instructions fail for complex projects and introduces a three-tier memory architecture that successfully managed a 108,000-line distributed system. The approach replaces simple prompts with structured, evolving documentation that becomes load-bearing infrastructure for AI development.
Google DeepMind's Unified Latents Framework: Solving Generative AI's Core Trade-Off
Google DeepMind introduces Unified Latents (UL), a novel framework that jointly trains diffusion priors and decoders to optimize latent space representation. This approach addresses the fundamental trade-off between reconstruction quality and learnability in generative AI models.
BrepCoder: The AI That Speaks CAD's Native Language
Researchers have developed BrepCoder, a multimodal AI that understands CAD designs in their native B-rep format. By treating 3D models as structured code, it performs multiple engineering tasks without task-specific retraining, potentially revolutionizing design automation.
Logitext Bridges the Gap Between Language Models and Logical Reasoning
Researchers introduce Logitext, a neurosymbolic framework that treats LLM reasoning as an SMT theory, enabling joint textual-logical analysis of partially structured documents. The system improves accuracy on content moderation and legal reasoning tasks.
Bridging Human Language and Machine Logic: New AI Framework Achieves Near-Perfect Translation Accuracy
Researchers have developed NL2LOGIC, an AI framework that translates natural language into formal logic with 99% syntactic accuracy. By using abstract syntax trees as an intermediate representation, the system dramatically improves semantic correctness and downstream reasoning performance.
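The AST-as-intermediate-representation idea can be sketched as follows: parse intent into a small typed tree first, then render formal logic from the tree, so malformed syntax becomes structurally impossible. The node types are illustrative assumptions, not NL2LOGIC's actual IR:

```python
from dataclasses import dataclass

# Two hypothetical AST node types: a universal quantifier and an implication.
@dataclass
class Implies:
    lhs: str
    rhs: str

@dataclass
class ForAll:
    var: str
    body: Implies

def to_logic(node):
    """Emit a formal-logic string from the tree; output syntax is guaranteed."""
    if isinstance(node, ForAll):
        return f"forall {node.var}. ({to_logic(node.body)})"
    if isinstance(node, Implies):
        return f"{node.lhs} -> {node.rhs}"
    raise TypeError(node)

# "Every cat is an animal" as an AST, then rendered:
tree = ForAll("x", Implies("Cat(x)", "Animal(x)"))
print(to_logic(tree))  # forall x. (Cat(x) -> Animal(x))
```

Since every tree node renders to well-formed syntax by construction, near-perfect syntactic accuracy is the expected outcome; semantic correctness still depends on building the right tree.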
Microsoft's Playwright MCP Server Replaces Vision for Web Agents
Microsoft built an MCP server for Playwright that lets AI agents interact with web pages using the accessibility tree, eliminating the need for screenshots and vision models. This approach reduces hallucinations and broken selectors, working with tools like Cursor, VS Code, and Claude Desktop.
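The advantage of an accessibility tree over screenshots is that elements are addressed by role and accessible name rather than pixels. A pure-Python sketch of that lookup; the tree shape is illustrative, not Playwright's actual MCP payload:

```python
# A toy accessibility tree: nested nodes with roles and accessible names.
tree = {"role": "page", "children": [
    {"role": "navigation", "children": []},
    {"role": "button", "name": "Submit order", "children": []},
]}

def find(node, role, name):
    """Depth-first search for a node by (role, accessible name)."""
    if node.get("role") == role and node.get("name") == name:
        return node
    for child in node.get("children", []):
        hit = find(child, role, name)
        if hit:
            return hit
    return None

print(find(tree, "button", "Submit order") is not None)  # True
```

Because the agent targets `("button", "Submit order")` instead of a pixel region or brittle CSS selector, layout changes don't break the interaction.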
Meta: Code Agents Improve by Reusing Short Summaries, Not Raw Logs
Meta's new paper reveals that coding agents with summary-based history reuse outperform those using raw logs, improving efficiency and success on complex tasks.
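A toy sketch of the contrast: keep a short summary per step instead of the full tool log, shrinking what gets re-fed into context. The truncation below is a trivial stand-in for the learned summarizer the paper would use:

```python
# Summary-based history reuse vs. raw-log reuse: the agent's reusable
# history grows by a one-line summary per step, not by the whole log.

def summarize(log, limit=60):
    """Trivial stand-in summarizer: first line, capped at `limit` chars."""
    return log.splitlines()[0][:limit]

raw_history, summary_history = [], []
for log in ["ran tests: 412 passed, 3 failed\n" + "traceback frame\n" * 200,
            "applied patch to parser.py\n" + "diff hunk\n" * 50]:
    raw_history.append(log)
    summary_history.append(summarize(log))

print(sum(map(len, summary_history)), "<", sum(map(len, raw_history)))
```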
New MoE Framework Tames User Interest Shifts in Long-Sequence Recommendations
Researchers propose MoS, a model-agnostic MoE approach that handles long user sequences by detecting session hopping, where user interests shift across sessions. A theme-aware routing mechanism filters irrelevant sessions, while multi-scale fusion captures global and local patterns. It achieves state-of-the-art results on benchmarks with fewer FLOPs than alternatives.
VoteGCL: A Novel LLM-Augmented Framework to Combat Data Sparsity
A new paper introduces VoteGCL, a framework that uses few-shot LLM prompting and majority voting to create high-confidence synthetic data for graph-based recommendation systems. It integrates this data via graph contrastive learning to improve accuracy and mitigate bias, outperforming existing baselines.
ECLASS-Augmented Semantic Product Search
Researchers systematically evaluated LLM-assisted dense retrieval for semantic product search on industrial electronic components. Augmenting embeddings with ECLASS hierarchical metadata created a crucial semantic bridge, achieving 94.3% Hit_Rate@5 versus 31.4% for BM25.
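Hit_Rate@5, the metric behind the 94.3% vs 31.4% comparison, is simply the fraction of queries whose relevant item appears in the top five results. A minimal sketch with illustrative rankings (not the study's data):

```python
# Hit Rate@k: share of queries where the relevant item shows up
# anywhere in the top-k retrieved results.

def hit_rate_at_k(results, relevant, k=5):
    hits = sum(1 for q, ranked in results.items() if relevant[q] in ranked[:k])
    return hits / len(results)

results = {"q1": ["a", "b", "c", "d", "e", "f"],
           "q2": ["x", "y", "z", "w", "v"]}
relevant = {"q1": "c", "q2": "u"}  # q1: hit at rank 3; q2: missed
print(hit_rate_at_k(results, relevant))  # 0.5
```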
Xiaomi's OneVL Uses Latent CoT to Beat Explicit CoT in Autonomous Driving
Xiaomi's Embodied Intelligence Team released OneVL, a vision-language model using latent Chain-of-Thought reasoning. It achieves state-of-the-art results on four autonomous driving benchmarks without the latency penalty of explicit reasoning steps.