NVIDIA has released Nemotron 3 Ultra, a 550B-parameter open-weight model. It reportedly achieves near state-of-the-art performance, on par with GLM-5.1 and Kimi K2.6, per @mweinbach.
Key facts
- 550B parameter model released by NVIDIA
- Reportedly on par with GLM-5.1 and Kimi K2.6
- No benchmark scores published yet
- Open-weight release, license unconfirmed
- Competes with Llama 4 and Mixtral 8x22B
NVIDIA’s Nemotron 3 Ultra enters the open-weight arena at 550B parameters, positioning itself alongside the strongest publicly available models. The claim of parity with GLM-5.1 (a 530B model from Zhipu AI) and Kimi K2.6 (Moonshot AI’s latest) suggests NVIDIA aims to challenge the top tier of open-weight research models.
The unique take: This release is notable less for raw size (550B is large but not unprecedented) than for what it signals about NVIDIA’s strategy. Historically, NVIDIA has focused on infrastructure and on narrower-purpose models such as Nemotron-4 340B, which shipped with open weights but was aimed at synthetic data generation. Nemotron 3 Ultra signals a direct push into the general-purpose model-weight market, competing with the likes of Meta’s Llama 4 and Mistral’s upcoming releases. The timing aligns with the industry shift toward open-weight models as enterprises demand transparency and customizability.
What’s missing: No benchmark scores, training compute costs, or inference latency numbers have been published yet [per @mweinbach]. The claim of “near state of the art” comes without specific comparisons: no MMLU, HumanEval, or SWE-Bench deltas against GLM-5.1 or Kimi K2.6. NVIDIA’s track record with Nemotron-4 340B showed strong synthetic data generation but not top-tier general reasoning. Independent verification via standard benchmarks is pending.
Context: NVIDIA’s move comes as the open-weight model race intensifies. Meta’s Llama 4 family (whose largest variant, Behemoth, is reported at close to 2T total parameters) and Mistral’s Mixtral 8x22B have set high bars. GLM-5.1 and Kimi K2.6 are Chinese contenders with strong multilingual performance. Nemotron 3 Ultra’s open-weight license, if permissive, could attract enterprise adopters wary of proprietary models.
Vendor skepticism: The reported “near state of the art” claim remains unsubstantiated until benchmarks are published. NVIDIA’s strength in hardware and the CUDA ecosystem gives it distribution advantages, but model quality will ultimately determine adoption. Past Nemotron releases have been strong in niche tasks (synthetic data, code generation) but not in general reasoning.
What to watch

Watch for independent benchmark evaluations on MMLU, HumanEval, and SWE-Bench. NVIDIA’s licensing terms (Apache 2.0 vs. custom) will determine enterprise adoption velocity. Also track whether NVIDIA releases training details or ablation studies.