Skip to content
gentic.news — AI News Intelligence Platform
Connecting to the Living Graph…

workflow optimization

30 articles about workflow optimization in AI news

IBM Research Survey Proposes Framework for Optimizing LLM Agent Workflows

IBM researchers published a comprehensive survey categorizing approaches to LLM agent workflow optimization along three dimensions: when structure is determined, which components get optimized, and what signals guide optimization.

99% relevant

Doby Cuts Claude Code Navigation Tokens by 95% with Spec-First Workflow

A spec-first fix workflow that slashes navigation tokens 95% and enforces plan docs as source of truth before code changes.

100% relevant

How Claude Routines Could Automate Your Dev Workflow (And What's Still Missing)

Claude Routines are cloud-based AI automations that run on triggers/schedules. While not directly in Claude Code yet, they hint at future workflow automation possibilities developers should prepare for.

100% relevant

NVIDIA's cuQuantum-DGX OS Aims to Manage Hybrid Quantum-Classical Workflows

NVIDIA announced its AI software stack is evolving into an operating system for quantum computing, aiming to manage the complex workflow between quantum processors and classical GPUs. This targets a major integration bottleneck as quantum hardware scales.

85% relevant

Codex vs. Claude Code: How to Benchmark Your Own Workflow

When comparing coding assistants, create objective benchmarks for your specific workflow instead of relying on general claims.

90% relevant

Anthropic's Agentic Workflows Launch: A Deep Dive on Cost & Capabilities

Anthropic launched Agentic Workflows, a managed service for running persistent AI agents. While marketed from $0.08/hr, real-world costs are higher due to compute, memory, and network fees.

82% relevant

Claude Skills: How Anthropic's Context-Aware Workflow System Solves the bloated CLAUDE.md Problem

Claude Skills are modular, self-contained workflow packages that load only when triggered by user intent, solving the context bloat caused by monolithic CLAUDE.md files. They support automatic invocation, slash commands, and can bundle supporting documents.

95% relevant

Fine-Tuning Llama 3 with Direct Preference Optimization (DPO): A Code-First Walkthrough

A technical guide details the end-to-end process of fine-tuning Meta's Llama 3 using Direct Preference Optimization (DPO), from raw preference data to a deployment-ready model. This provides a practical blueprint for customizing LLM behavior.

76% relevant

HyEvo Framework Automates Hybrid LLM-Code Workflows, Cuts Inference Cost 19x vs. SOTA

Researchers propose HyEvo, an automated framework that generates agentic workflows combining LLM nodes for reasoning with deterministic code nodes for execution. It reduces inference cost by up to 19x and latency by 16x while outperforming existing methods on reasoning benchmarks.

95% relevant

Minimax M2.7 Achieves 56.2% on SWE-Pro, Features Self-Evolving Training with 100+ Autonomous Optimization Loops

Minimax has released M2.7, a model that reportedly used autonomous optimization loops during RL training to achieve a 30% internal improvement. It scores 56.2% on SWE-Pro, near Claude 3.5 Opus, and ties Gemini 3.1 on MLE Bench Lite.

97% relevant

Helium: A New Framework for Efficient LLM Serving in Agentic Workflows

Researchers introduce Helium, a workflow-aware LLM serving framework that treats agentic workflows as query plans. It uses proactive caching and cache-aware scheduling to reduce redundancy, achieving up to 1.56x speedup over current systems.

74% relevant

Agentic Control Center for Data Product Optimization: A Framework for Continuous AI-Driven Data Refinement

Researchers propose a system using specialized AI agents to automate the improvement of data products through a continuous optimization loop. It surfaces questions, monitors quality metrics, and incorporates human oversight to transform raw data into actionable assets.

75% relevant

Sim Emerges as Open-Source Challenger to AI Workflow Automation Giants

Sim introduces a drag-and-drop interface for building AI agent workflows, positioning itself as a 100% open-source alternative to established platforms like n8n. Released under Apache 2.0 license, this tool promises greater accessibility and customization for developers creating automated AI systems.

85% relevant

Headroom AI: The Open-Source Context Optimization Layer That Could Revolutionize Agent Efficiency

Headroom AI introduces a zero-code context optimization layer that compresses LLM inputs by 60-90% while preserving critical information. This open-source proxy solution could dramatically reduce costs and improve performance for AI agents.

95% relevant

OpenCode vs Claude Code: What the 2026 Comparison Means for Your CLI Workflow

A new competitor validates Claude Code's terminal-first philosophy, but Claude's mature MCP ecosystem and proven local execution capabilities remain key differentiators for developers.

100% relevant

Dify AI Workflow Platform Hits 136K GitHub Stars as Low-Code AI App Builder Gains Momentum

Dify, an open-source platform for building production-ready AI applications, has reached 136K stars on GitHub. The platform combines RAG pipelines, agent orchestration, and LLMOps into a unified visual interface, eliminating the need to stitch together multiple tools.

87% relevant

AgenticGEO: Self-Evolving AI Framework for Generative Search Engine Optimization Outperforms 14 Baselines

Researchers propose AgenticGEO, an AI framework that evolves content strategies to maximize inclusion in generative search engine outputs. It uses MAP-Elites and a Co-Evolving Critic to reduce costly API calls, achieving state-of-the-art performance across 3 datasets.

91% relevant

Claude Code's 81.6K GitHub Stars: What This Community Momentum Means for Your Daily Workflow

Claude Code's massive GitHub adoption signals a mature ecosystem—here's how to leverage the new MCP servers and subagent features shipping now.

95% relevant

Goal-Driven Data Optimization: Training Multimodal AI with 95% Less Data

Researchers introduce GDO, a framework that optimizes multimodal instruction tuning by selecting high-utility training samples. It achieves faster convergence and higher accuracy using 5-7% of the data typically required. This addresses compute inefficiency in training vision-language models.

71% relevant

How Claude-Code-Workflow Orchestrates Multiple CLI Agents for Complex Tasks

Install this CLI tool to coordinate multiple Claude Code agents for complex projects using semantic commands and session management.

98% relevant

LangChain Open-Sources Deep Agents: MIT-Licensed Framework Replicating Claude Code's Core Workflow

LangChain released Deep Agents, an open-source framework that recreates the core architecture of coding agents like Claude Code. The MIT-licensed system is model-agnostic and provides modular components for building inspectable coding assistants.

87% relevant

AI Database Optimization: A Cautionary Tale for Luxury Retail's Critical Systems

AI agents can autonomously rewrite database queries to improve performance, but unsupervised deployment in production systems carries significant risks. For luxury retailers, this technology requires careful governance to avoid customer-facing disruptions.

60% relevant

MiniMax M2.7 AI Agent Rewrites Its Own Harness, Achieving 9 Gold Medals on MLE Bench Lite Without Retraining

MiniMax's M2.7 agent autonomously rewrites its own operational harness—skills, memory, and workflow rules—through a self-optimization loop. After 100+ internal rounds, it earned 9 gold medals on OpenAI's MLE Bench Lite without weight updates.

95% relevant

Blue Yonder Expands Agentic AI and Mobile Apps for Retail Supply Chain Execution

Blue Yonder announced new agentic AI capabilities and mobile companion apps for retail planning and execution. The updates target merchandise financial planning, assortment optimization, and mobile allocation workflows to improve decision speed and accuracy.

95% relevant

Embedding distance predicts VLM typographic attack success (r=-0.93)

A new study shows that embedding distance between image text and harmful prompt strongly predicts attack success rate (r=-0.71 to -0.93). The researchers introduce CWA-SSA optimization to recover readability and bypass safety alignment without model access.

82% relevant

Claude Code's Secret Efficiency Hack

Claude Code leverages speculative decoding to reduce LLM energy use by 100x. Learn how this built-in optimization makes your coding faster and cheaper.

85% relevant

Researchers Achieve Ultra-Long-Horizon Agentic Science with Cohesive AI Agents

A research team has developed AI agents capable of executing and maintaining coherent, long-horizon scientific research workflows. This addresses a core challenge in creating autonomous systems for complex discovery.

85% relevant

Agentic AI Emerges as a Strategic Force in Private Label and Loyalty

Three industry reports highlight the growing adoption of 'agentic AI' in retail. The technology is being used to streamline private label product development and create highly personalized customer loyalty experiences, moving beyond simple chatbots to autonomous workflow orchestration.

82% relevant

DharmaOCR: New Small Language Models Set State-of-the-Art for Structured

A new arXiv preprint presents DharmaOCR, a pair of small language models (7B & 3B params) fine-tuned for structured OCR. They introduce a new benchmark and use Direct Preference Optimization to drastically reduce 'text degeneration'—a key cause of performance failures—while outputting structured JSON. The models claim superior accuracy and lower cost than proprietary APIs.

72% relevant

MLX-VLM Adds Continuous Batching, OpenAI API, and Vision Cache for Apple Silicon

The next release of MLX-VLM will introduce continuous batching, an OpenAI-compatible API, and vision feature caching for multimodal models running locally on Apple Silicon. These optimizations promise up to 228x speedups on cache hits for models like Gemma4.

95% relevant