nemo
30 articles about nemo in AI news
NVIDIA NeMo RL Speculative Decoding: 1.8× Rollout Speed at 8B
NVIDIA's NeMo RL speculative decoding achieves 1.8× rollout speedup at 8B and projects 2.5× at 235B, cutting RL training time by over half.
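Speedup figures like 1.8× depend on how many draft tokens the target model accepts per verification step. A minimal sketch of the standard expected-acceptance formula from the speculative decoding literature (an illustrative model, not NVIDIA's published NeMo RL math): with per-token acceptance rate `alpha` and draft length `gamma`, each target forward pass yields `(1 - alpha**(gamma+1)) / (1 - alpha)` tokens in expectation.

```python
# Expected tokens generated per target-model forward pass under
# speculative decoding, following the standard geometric-series analysis.
# alpha: probability each draft token is accepted; gamma: draft length.
# Illustrative only; not NeMo RL's internal accounting.

def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Sum of alpha**i for i in 0..gamma (plus-one bonus token included)."""
    if alpha == 1.0:
        return gamma + 1
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)

# With an 80% acceptance rate and 3 drafted tokens, each target forward
# pass yields ~2.95 tokens instead of 1:
print(round(expected_tokens_per_step(0.8, 3), 2))  # → 2.95
```

In this toy model a ~1.8× rollout speedup is plausible at modest acceptance rates once the draft model's own cost is subtracted, which is why acceptance rate, not raw draft speed, dominates the headline number.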
NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text
NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.
NVIDIA Nemotron 3 Super: 120B Hybrid Mamba-Transformer MoE with 1M Context
NVIDIA has released Nemotron 3 Super, a 120B parameter open hybrid Mamba-Transformer Mixture of Experts model with 12B active parameters and 1M token context length. The company claims it delivers up to 7.5x higher throughput than similar open models.
Superintelligence Podcast Launches with NVIDIA Nemotron 3 Deep Dive
The Superintelligence podcast has launched, promising in-depth interviews with AI industry leaders. Its first episode is an exclusive interview with NVIDIA's Kari Briski on the Nemotron 3 Super model.
NemoVideo AI Automates Video Editing Based on Text Prompts
A video creator states NemoVideo AI now automates complex editing tasks like cuts and transitions from simple text descriptions, reducing a 5-hour manual process to a prompt-driven workflow.
Nemotron ColEmbed V2: NVIDIA's New SOTA Embedding Models for Visual Document Retrieval
NVIDIA researchers have released Nemotron ColEmbed V2, a family of three models (3B, 4B, 8B parameters) that set new state-of-the-art performance on the ViDoRe benchmark for visual document retrieval. The models use a 'late interaction' mechanism and are built on top of pre-trained VLMs like Qwen3-VL and NVIDIA's own Eagle 2. This matters because it directly addresses the challenge of retrieving information from visually rich documents like PDFs and slides within RAG systems.
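The 'late interaction' mechanism mentioned above can be illustrated with a ColBERT-style MaxSim score: every query token embedding is matched against its best document token embedding, and the per-token maxima are summed. A minimal NumPy sketch with toy random embeddings (the dimensions and scoring details are illustrative, not Nemotron ColEmbed's actual configuration):

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT-style late interaction: for each query token, take the max
    cosine similarity over all document tokens, then sum the maxima.
    query_emb: (num_query_tokens, dim); doc_emb: (num_doc_tokens, dim)."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T                      # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

rng = np.random.default_rng(0)
query = rng.normal(size=(4, 128))      # 4 query tokens, 128-dim (toy sizes)
doc_a = rng.normal(size=(50, 128))     # unrelated document
doc_b = np.vstack([query, rng.normal(size=(46, 128))])  # contains the query tokens
print(maxsim_score(query, doc_b) > maxsim_score(query, doc_a))  # → True
```

Because scoring happens token-by-token at query time rather than via one pooled vector, late interaction preserves fine-grained matches (a table cell, a label in a slide), which is what makes it attractive for visually rich document retrieval.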
NVIDIA Releases Nemotron-Cascade 2: A 30B MoE Model with 3B Active Parameters
NVIDIA has open-sourced Nemotron-Cascade 2, a 30B parameter Mixture-of-Experts model that activates only 3B parameters per token. NVIDIA claims the model achieves 'gold medal performance' on IMO and IOI 2025 benchmarks.
NVIDIA Open-Sources NeMo Claw: A Local Security Sandbox for AI Agents
NVIDIA has open-sourced NeMo Claw, a security sandbox designed to run AI agents locally. It isolates models from cloud services, blocks unauthorized network calls, and secures model APIs via a single installation script.
NVIDIA Nemotron Ultra: Details Emerge on Upcoming Open-Source LLM Series
NVIDIA is developing the Nemotron Ultra series of open-source large language models. Early commentary describing the project as 'insane' and 'underrated' is generating hype among AI researchers.
NemoClaw Launches as 'Industry-Ready' Agent-as-a-Service Platform
Nvidia's Project NemoClaw has launched as a commercial 'Agent-as-a-Service' platform, positioning itself as an industry-ready alternative to OpenAI's offerings. The launch follows commentary predicting that SaaS will evolve into Agent-as-a-Service (AgaaS).
NVIDIA VP Kari Briski to Discuss Nemotron 3 Super Development in Upcoming Interview
NVIDIA VP Kari Briski will be interviewed on Thursday about the company's Nemotron models, specifically the recent Nemotron 3 Super. The recorded conversation will be published by NVIDIA.
NVIDIA NeMo Retriever Achieves #1 on ViDoRe v3 with New Agentic Pipeline
NVIDIA's NeMo Retriever team has developed a generalizable agentic retrieval pipeline that topped the ViDoRe v3 leaderboard and placed second on BRIGHT. The system moves beyond semantic similarity to dynamically adapt search strategies for complex, multi-domain data.
Nvidia Enters the AI Agent Arena: NemoClaw Targets Open Source Dominance
Nvidia is reportedly developing NemoClaw, an open-source AI agent platform to compete with OpenClaw. The announcement is expected at next week's GTC conference, signaling Nvidia's move to set standards in the rapidly evolving 'claw' ecosystem.
NVIDIA's Nemotron 3 Super: The Efficiency-First AI Model Redefining Performance Benchmarks
NVIDIA unveils Nemotron 3 Super, a 120B parameter model with only 12B active parameters using hybrid Mamba-Transformer MoE architecture. It achieves 1M token context, beats GPT-OSS-120B on intelligence metrics, and offers configurable reasoning modes for optimal compute efficiency.
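The '12B active of 120B total' pattern comes from sparse expert routing: each token is sent to only a few experts, so most weights sit idle on any given forward pass. A toy top-k router in NumPy (expert count, k, and routing function are illustrative; Nemotron 3 Super's actual configuration is not given in this summary, and shared non-expert layers also count toward active parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
num_experts, top_k, dim = 8, 2, 16    # toy sizes, not Nemotron's real config

def route(token: np.ndarray, router_w: np.ndarray):
    """Pick the top_k experts by router logit and softmax-normalize their
    mixing weights; only the chosen experts run for this token."""
    logits = router_w @ token                 # (num_experts,)
    chosen = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    w = np.exp(logits[chosen] - logits[chosen].max())
    return chosen, w / w.sum()

router_w = rng.normal(size=(num_experts, dim))
token = rng.normal(size=dim)
experts, weights = route(token, router_w)
print(len(experts))                   # → 2 experts active for this token
print(top_k / num_experts)            # → 0.25 active expert fraction (toy)
```

In this picture Nemotron's 12B/120B ratio corresponds to roughly 10% of parameters active per token, which is where the claimed throughput advantage over dense models of similar total size comes from.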
NVIDIA Breaks the Data Bottleneck: Nemotron-Terminal and Nemotron 3 Super Democratize Agentic AI
NVIDIA has launched Nemotron-Terminal, a systematic data engineering pipeline to scale LLM terminal agents, and Nemotron 3 Super, a massive 120B-parameter open-source model. These releases aim to solve the critical data scarcity and transparency issues plaguing autonomous AI agent development.
Nvidia's NemoClaw: The Open-Source Platform Poised to Democratize AI Agent Development
Nvidia is preparing to launch NemoClaw, an open-source platform designed specifically for building and deploying AI agents. This move aims to accelerate the development of autonomous systems that can perform complex, multi-step tasks.
NVIDIA's Nemotron-Terminal: A Systematic Pipeline for Scaling Terminal-Based AI Agents
NVIDIA researchers introduce Nemotron-Terminal, a comprehensive data engineering pipeline designed to scale terminal-based large language model agents. The system bridges the gap between raw terminal data and high-quality training datasets, addressing key challenges in agent reliability and generalization.
Nvidia's Open-Source Gambit: NeMoClaw Aims to Tame Enterprise AI Agents
Nvidia is preparing to launch NeMoClaw, an open-source platform designed for building secure, autonomous AI agents for enterprise workflows. In a break from its proprietary CUDA tradition, Nvidia is targeting software-ecosystem dominance regardless of the underlying hardware.
OpenAI Privacy Filter Gets 6x More PII Labels via Nvidia Data
OpenAI has retrained its privacy filter using Nvidia's Nemotron-PII dataset, expanding PII detection from 8 to over 50 label types, targeting healthcare and enterprise use cases with better accuracy.
MiniMax M2.7 Model Deploys on NVIDIA NIM Endpoints with OpenClaw Support
Chinese AI firm MiniMax has made its M2.7 model available through NVIDIA's GPU-accelerated NIM endpoints. This deployment includes support for the OpenClaw and NemoClaw frameworks, integrating it into a major AI development ecosystem.
SauerkrautLM-Doom-MultiVec: 1.3M-Param Model Outperforms LLMs 92,000x Its Size
Researchers built a 1.3M-parameter model that plays DOOM in real-time, scoring 178 frags in 10 episodes. It outperforms LLMs like Nemotron-120B and GPT-4o-mini, which scored only 13 combined, demonstrating the power of small, task-specific architectures.
NVIDIA and Cisco Publish Practical Guide for Fine-Tuning Enterprise Embedding Models
Cisco Blogs published a guide detailing how to fine-tune embedding models for enterprise retrieval using NVIDIA's Nemotron recipe. This provides a technical blueprint for improving domain-specific search and RAG systems, a critical component for AI-powered enterprise applications.
The LLM Evaluation Problem Nobody Talks About
An article highlights a critical, often overlooked flaw in LLM evaluation: the contamination of benchmark data in training sets. It discusses NVIDIA's open-source Nemotron 3 Super as a tool for generating clean, synthetic evaluation data.
Nvidia Commits $26 Billion to Open-Source AI, Aiming to Reshape the Ecosystem
Nvidia plans to invest $26 billion over five years in open-weight AI models, launching Nemotron 3 Super. This strategic move addresses a growing open-source gap left by major AI labs and counters rising Chinese model dominance while reinforcing Nvidia's hardware ecosystem.
From Prototype to Production: Streamlining LLM Evaluation for Luxury Clienteling & Chatbots
NVIDIA's new NeMo Evaluator Agent Skills dramatically simplifies testing and monitoring of conversational AI agents. For luxury retail, this means faster, more reliable deployment of high-quality clienteling assistants and customer service chatbots.
Invenergy, Nvidia, Emerald AI Partner on 'Flexible AI Factories'
Invenergy, Nvidia, and Emerald AI partner to develop flexible AI factories from edge to multi-gigawatt campuses, targeting rapid AI infrastructure deployment.
Mistral Medium Model Launch Teased by European AI Company
Mistral AI teased an upcoming model called Mistral Medium on X, signaling continued expansion of its model lineup. The announcement comes amid growing competition in the open-weight LLM space.
Nvidia Trains Billion-Parameter LLM Without Backpropagation
Nvidia demonstrated training a billion-parameter language model without gradients or backpropagation, eliminating FP32 weights entirely. This could dramatically reduce memory and compute costs for LLM training.
PayPal Cuts LLM Inference Cost 50% with EAGLE3 Speculative Decoding on H100
PayPal engineers applied EAGLE3 speculative decoding to their fine-tuned 8B-parameter commerce agent, achieving up to 49% higher throughput and 33% lower latency. This allowed a single H100 GPU to match the performance of two H100s running NVIDIA NIM, cutting inference hardware cost by 50%.
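EAGLE3 is a speculative decoding variant, and its throughput gain comes from the same draft-then-verify loop: a cheap draft proposes several tokens, the large model checks them in one batched pass, and the longest agreeing prefix is kept. A greedy toy sketch with stand-in models (the real EAGLE3 drafts from the target model's hidden states, which this deliberately does not model):

```python
def speculative_step(draft_next, target_next, prefix, k=4):
    """One greedy draft-then-verify step. draft_next/target_next are
    callables mapping a token sequence to the next token. Returns the
    tokens actually emitted by this step."""
    # 1. Draft model proposes k tokens autoregressively (cheap).
    drafted, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        drafted.append(t)
        ctx.append(t)
    # 2. Target verifies: accept the longest matching prefix, then emit
    #    the target's own token at the first disagreement (or one bonus
    #    token if everything matched). In practice the k checks are a
    #    single batched forward pass; here we loop for clarity.
    accepted, ctx = [], list(prefix)
    for t in drafted:
        want = target_next(ctx)
        if want != t:
            accepted.append(want)          # target's correction
            return accepted
        accepted.append(t)
        ctx.append(t)
    accepted.append(target_next(ctx))      # bonus token after full acceptance
    return accepted

# Toy integer-token models: the target counts up; the draft agrees until 3.
target = lambda seq: seq[-1] + 1
draft  = lambda seq: seq[-1] + 1 if seq[-1] < 3 else 0
print(speculative_step(draft, target, [0], k=4))  # → [1, 2, 3, 4]
```

Four tokens leave this step for the price of one target verification pass, which is the mechanism behind the reported 49% throughput gain; output quality is unchanged because every emitted token is one the target model itself endorses.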
Qwen3.6-27B: How to Run a 17GB Local Model That Beats 397B MoE on Coding Tasks
Qwen3.6-27B delivers flagship-level coding performance, reportedly beating a 397B MoE on coding tasks; the 55.6GB model can be quantized to 16.8GB, making high-quality local coding assistance accessible.
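The size figures are consistent with standard per-parameter byte math: 27B parameters at 2 bytes each (bf16) is about 54 GB, close to the stated 55.6GB, and 16.8GB works out to roughly 5 bits per parameter, i.e. a 4-bit-class quantization plus scale overhead. A back-of-envelope check (the article does not state the exact quantization scheme, so the dtypes here are assumptions):

```python
params = 27e9                          # approximate Qwen3.6-27B parameter count

def model_size_gb(bytes_per_param: float) -> float:
    """Raw weight storage in GB for a given per-parameter width."""
    return params * bytes_per_param / 1e9

print(round(model_size_gb(2.0), 1))    # → 54.0 GB at bf16, near the stated 55.6
bits = 16.8e9 * 8 / params             # implied bits/param at the 16.8GB size
print(round(bits, 2))                  # → 4.98 bits: a 4-bit quant with scales
```

The small gap between 54 GB and 55.6GB is plausibly embeddings, norms, or other tensors kept at higher precision, which quantized checkpoints commonly leave untouched.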