nemo
30 articles about nemo in AI news
Nemotron 3 Ultra matches GPT-5.5 on physics test at 10X lower cost
Nemotron 3 Ultra matched GPT-5.5 on a physics test at 10X lower cost ($0.051 vs $0.57), highlighting MoE efficiency.
NVIDIA Nemotron 3 Ultra: 550B Open-Weight Model Challenges GLM, Kimi
NVIDIA released Nemotron 3 Ultra, a 550B open-weight model claiming near-SOTA performance, competing with GLM-5.1 and Kimi K2.6. No benchmarks have been published yet.
NVIDIA NeMo RL Speculative Decoding: 1.8× Rollout Speed at 8B
NVIDIA's NeMo RL speculative decoding achieves 1.8× rollout speedup at 8B and projects 2.5× at 235B, cutting RL training time by over half.
NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text
NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.
NVIDIA Nemotron 3 Super: 120B Hybrid Mamba-Transformer MoE with 1M Context
NVIDIA has released Nemotron 3 Super, a 120B parameter open hybrid Mamba-Transformer Mixture of Experts model with 12B active parameters and 1M token context length. The company claims it delivers up to 7.5x higher throughput than similar open models.
Superintelligence Podcast Launches with NVIDIA Nemotron 3 Deep Dive
The Superintelligence podcast has launched, promising in-depth interviews with AI industry leaders. Its first episode is an exclusive interview with NVIDIA's Kari Briski on the Nemotron 3 Super model.
NemoVideo AI Automates Video Editing Based on Text Prompts
A video creator states NemoVideo AI now automates complex editing tasks like cuts and transitions from simple text descriptions, reducing a 5-hour manual process to a prompt-driven workflow.
Nemotron ColEmbed V2: NVIDIA's New SOTA Embedding Models for Visual Document Retrieval
NVIDIA researchers have released Nemotron ColEmbed V2, a family of three models (3B, 4B, 8B parameters) that set new state-of-the-art performance on the ViDoRe benchmark for visual document retrieval. The models use a 'late interaction' mechanism and are built on top of pre-trained VLMs like Qwen3-VL and NVIDIA's own Eagle 2. This matters because it directly addresses the challenge of retrieving information from visually rich documents like PDFs and slides within RAG systems.
NVIDIA Releases Nemotron-Cascade 2: A 30B MoE Model with 3B Active Parameters
NVIDIA has open-sourced Nemotron-Cascade 2, a 30B parameter Mixture-of-Experts model that activates only 3B parameters per token. It claims 'gold medal performance' on IMO and IOI 2025 benchmarks.
NVIDIA Open-Sources NeMo Claw: A Local Security Sandbox for AI Agents
NVIDIA has open-sourced NeMo Claw, a security sandbox designed to run AI agents locally. It isolates models from cloud services, blocks unauthorized network calls, and secures model APIs via a single installation script.
NVIDIA Nemotron Ultra: Details Emerge on Upcoming Open-Source LLM Series
NVIDIA is developing the Nemotron Ultra series of open-source large language models. The project, described as 'insane' and 'underrated,' is generating early hype among AI researchers.
NVIDIA NeMo Retriever Achieves #1 on ViDoRe v3 with New Agentic Pipeline
NVIDIA's NeMo Retriever team has developed a generalizable agentic retrieval pipeline that topped the ViDoRe v3 leaderboard and placed second on BRIGHT. The system moves beyond semantic similarity to dynamically adapt search strategies for complex, multi-domain data.
Nvidia Enters the AI Agent Arena: NemoClaw Targets Open Source Dominance
Nvidia is reportedly developing NemoClaw, an open-source AI agent platform to compete with OpenClaw. The announcement is expected at next week's GTC conference, signaling Nvidia's move to set standards in the rapidly evolving 'claw' ecosystem.
NVIDIA's Nemotron 3 Super: The Efficiency-First AI Model Redefining Performance Benchmarks
NVIDIA unveils Nemotron 3 Super, a 120B parameter model with only 12B active parameters using hybrid Mamba-Transformer MoE architecture. It achieves 1M token context, beats GPT-OSS-120B on intelligence metrics, and offers configurable reasoning modes for optimal compute efficiency.
NVIDIA Breaks the Data Bottleneck: Nemotron-Terminal and Nemotron 3 Super Democratize Agentic AI
NVIDIA has launched Nemotron-Terminal, a systematic data engineering pipeline to scale LLM terminal agents, and Nemotron 3 Super, a massive 120B-parameter open-source model. These releases aim to solve the critical data scarcity and transparency issues plaguing autonomous AI agent development.
Nvidia's NemoClaw: The Open-Source Platform Poised to Democratize AI Agent Development
Nvidia is preparing to launch NemoClaw, an open-source platform designed specifically for building and deploying AI agents. This move aims to accelerate the development of autonomous systems that can perform complex, multi-step tasks.
NVIDIA's Nemotron-Terminal: A Systematic Pipeline for Scaling Terminal-Based AI Agents
NVIDIA researchers introduce Nemotron-Terminal, a comprehensive data engineering pipeline designed to scale terminal-based large language model agents. The system bridges the gap between raw terminal data and high-quality training datasets, addressing key challenges in agent reliability and generalization.
Nvidia's Open-Source Gambit: NeMoClaw Aims to Tame Enterprise AI Agents
Nvidia is preparing to launch NeMoClaw, an open-source platform designed for building secure, autonomous AI agents for enterprise workflows. Breaking from its proprietary CUDA tradition, the move targets software ecosystem dominance regardless of hardware.
Trillion Labs Builds Industrial World Models on NVIDIA Omniverse
Trillion Labs announced Industrial World Models for AI Factories using NVIDIA Omniverse and Nemotron to optimize data centers and power plants.
OpenAI Privacy Filter Gets 6x More PII Labels via Nvidia Data
OpenAI has retrained its privacy filter using Nvidia's Nemotron-PII dataset, expanding PII detection from 8 to over 50 label types, targeting healthcare and enterprise use cases with better accuracy.
MiniMax M2.7 Model Deploys on NVIDIA NIM Endpoints with OpenClaw Support
Chinese AI firm MiniMax has made its M2.7 model available through NVIDIA's GPU-accelerated NIM endpoints. This deployment includes support for the OpenClaw and NemoClaw frameworks, integrating it into a major AI development ecosystem.
SauerkrautLM-Doom-MultiVec: 1.3M-Param Model Outperforms LLMs 92,000x Its Size
Researchers built a 1.3M-parameter model that plays DOOM in real time, scoring 178 frags in 10 episodes. It outperforms LLMs like Nemotron-120B and GPT-4o-mini, which scored only 13 combined, demonstrating the power of small, task-specific architectures.
NVIDIA and Cisco Publish Practical Guide for Fine-Tuning Enterprise Embedding Models
Cisco Blogs published a guide detailing how to fine-tune embedding models for enterprise retrieval using NVIDIA's Nemotron recipe. This provides a technical blueprint for improving domain-specific search and RAG systems, a critical component for AI-powered enterprise applications.
The LLM Evaluation Problem Nobody Talks About
An article highlights a critical, often overlooked flaw in LLM evaluation: the contamination of benchmark data in training sets. It discusses NVIDIA's open-source solution, Nemotron 3 Super, designed to generate clean, synthetic evaluation data.
Nvidia Commits $26 Billion to Open-Source AI, Aiming to Reshape the Ecosystem
Nvidia plans to invest $26 billion over five years in open-weight AI models, launching Nemotron 3 Super. This strategic move addresses a growing open-source gap left by major AI labs and counters rising Chinese model dominance while reinforcing Nvidia's hardware ecosystem.
From Prototype to Production: Streamlining LLM Evaluation for Luxury Clienteling & Chatbots
NVIDIA's new NeMo Evaluator Agent Skills dramatically simplifies testing and monitoring of conversational AI agents. For luxury retail, this means faster, more reliable deployment of high-quality clienteling assistants and customer service chatbots.
NVIDIA Vera Rubin: One Rack Matches TOP500, 35 EU Labs Deploy
NVIDIA's Vera Rubin NVL72 delivers TOP500-class performance in a single rack, with 35 European labs deploying the system for AI and HPC.
JUPITER Exascale Maps Brain at Cellular Scale on 4,096 Grace Hopper Nodes
JUPITER, Europe's first exascale supercomputer, trained the CytoNet brain model on 6.5 PB in 5 days and runs climate, 6G, and quantum simulations.
NVIDIA, GENCI Launch AI Factory France Compute Access for Startups
NVIDIA and GENCI launched AI Factory France at VivaTech, giving European startups free access to AI supercomputers. The program includes compute, tools, and expert support for NVIDIA Inception members.
NVIDIA Blackwell Sweeps MLPerf Training 6.0, GB300 Hits 1.6x Speedup
NVIDIA Blackwell swept MLPerf Training 6.0 across all seven benchmarks. GB300 NVL72 delivered 1.6x speedup over GB200 NVL72 using NVFP4 and 8,192 GPUs.