NVIDIA has released its new Nemotron 3 series, which combines the Mamba and Transformer architectures to handle long context windows efficiently while reducing resource consumption. The series is designed for agentic AI systems that autonomously perform complex tasks and sustain long-running interactions.
The series includes three models: Nano, Super, and Ultra. The Nano model is officially available now, while Super and Ultra are expected to launch in the first half of 2026. With this release, NVIDIA breaks away from the traditional pure Transformer design, adopting a hybrid architecture that combines efficient Mamba layers with Transformer attention and Mixture of Experts (MoE) technology. Compared to pure Transformer models, Nemotron 3 handles long input sequences more effectively while keeping memory usage stable.
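To make the hybrid design concrete, here is a minimal, illustrative sketch of the general pattern: a stack that is mostly linear-time sequence layers (stand-ins for Mamba's state-space blocks, which carry a fixed-size state instead of a growing attention cache), with an attention layer interleaved every few blocks. The layer ratio, dimensions, and module internals are assumptions for illustration, not NVIDIA's actual implementation.

```python
# Illustrative sketch of a hybrid Mamba/Transformer stack; all sizes are made up.
import torch
import torch.nn as nn

class SimpleSSMBlock(nn.Module):
    """Stand-in for a Mamba layer: a per-step recurrent state, O(n) in sequence
    length, with memory that stays constant as the context grows."""
    def __init__(self, d_model: int):
        super().__init__()
        self.in_proj = nn.Linear(d_model, d_model)
        self.decay = nn.Parameter(torch.full((d_model,), 0.9))
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        u = self.in_proj(x)
        state = torch.zeros(x.size(0), x.size(2), device=x.device)
        outs = []
        for t in range(x.size(1)):  # fixed-size state, unlike a growing KV cache
            state = self.decay * state + u[:, t]
            outs.append(state)
        return x + self.out_proj(torch.stack(outs, dim=1))

class AttentionBlock(nn.Module):
    """Ordinary self-attention, kept for the layers that still need global mixing."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x, need_weights=False)
        return x + out

class HybridStack(nn.Module):
    """Mostly SSM layers, with attention every `attn_every` layers (ratio assumed)."""
    def __init__(self, d_model=64, n_layers=8, attn_every=4):
        super().__init__()
        self.layers = nn.ModuleList(
            AttentionBlock(d_model) if (i + 1) % attn_every == 0 else SimpleSSMBlock(d_model)
            for i in range(n_layers)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 16, 64)
print(HybridStack()(x).shape)  # torch.Size([2, 16, 64])
```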
Nemotron 3 supports a context window of up to one million tokens, matching cutting-edge models from OpenAI and Google. It can hold large amounts of information in context, such as an entire codebase or a long conversation history, without placing excessive pressure on hardware. The Nano model has 31.6 billion parameters in total, but only about 3 billion are active at each processing step. According to the Artificial Analysis Intelligence Index benchmarks, Nemotron 3 matches the accuracy of gpt-oss-20B and Qwen3-30B while delivering higher token throughput.
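The gap between total and active parameters is what makes the sparse MoE design cheap to run per token; a quick back-of-the-envelope calculation using the figures above:

```python
# Rough arithmetic only, based on the parameter counts quoted above.
total_params = 31.6e9   # total parameters in the Nano model
active_params = 3.0e9   # parameters active per processing step

# Only about a tenth of the weights are touched for each token:
print(f"active fraction: {active_params / total_params:.1%}")          # ~9.5%

# At 2 bytes/param (bf16 weights, an assumption), per-token weight traffic
# is driven by the active set, not the full 31.6B parameters:
print(f"active weight bytes per token: {active_params * 2 / 1e9:.1f} GB")  # 6.0 GB
```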
NVIDIA also introduced two key architectural improvements for the more powerful Super and Ultra models. The first is LatentMoE, which reduces the memory-bandwidth overhead of standard MoE models by projecting tokens into compressed latent representations before expert processing. The second is Multi-Token Prediction (MTP), which predicts several tokens at once during training, improving both text generation speed and logical reasoning.
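The following sketch illustrates the idea behind the first improvement as described above: tokens are down-projected into a smaller latent space, routed to a few experts there, and projected back up, so the experts and the memory traffic they generate operate on compressed vectors. The class name, dimensions, and routing details are hypothetical, not NVIDIA's implementation.

```python
# Hypothetical sketch of a latent-space MoE layer; nothing here is NVIDIA's code.
import torch
import torch.nn as nn

class LatentMoESketch(nn.Module):
    def __init__(self, d_model=64, d_latent=16, n_experts=4, top_k=2):
        super().__init__()
        self.down = nn.Linear(d_model, d_latent)    # compress before routing
        self.up = nn.Linear(d_latent, d_model)      # expand after the experts
        self.router = nn.Linear(d_latent, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_latent, d_latent) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x):                           # x: (tokens, d_model)
        z = self.down(x)                            # experts see compressed vectors
        scores = self.router(z).softmax(dim=-1)
        weights, idx = scores.topk(self.top_k, dim=-1)
        out = torch.zeros_like(z)
        for k in range(self.top_k):                 # only top-k experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(z[mask])
        return self.up(out)

tokens = torch.randn(8, 64)
print(LatentMoESketch()(tokens).shape)  # torch.Size([8, 64])
```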
In addition, NVIDIA has released the weights, training recipes, and several datasets for the Nano model, including Nemotron-CC-v2.1, built from Common Crawl, giving developers a complete starting point. The release fits NVIDIA's broader strategy of building smaller language models that prioritize speed over raw capability.
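For developers who want to try the open weights, a minimal loading sketch with Hugging Face transformers might look like the following; the repository id is a placeholder, since the exact model card is not given here.

```python
# Minimal usage sketch, assuming the Nano weights are on Hugging Face.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/nemotron-3-nano"  # hypothetical id: check NVIDIA's model card
tok = AutoTokenizer.from_pretrained(model_id)
# Hybrid architectures often ship custom modeling code, hence trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tok("Summarize this repository:", return_tensors="pt")
print(tok.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```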
Key Points:
🌟 The Nemotron 3 series combines Mamba and Transformer architectures to enhance AI agent efficiency.
🚀 The Nano model is now available, while the Super and Ultra are expected to launch in the first half of 2026.
📊 NVIDIA released model weights and training datasets to help developers innovate.
