A striking result has just landed in AI benchmarking. NVIDIA's small model NVARC scored 27.64% on the latest ARC-AGI2 evaluation, overtaking its competitor GPT-5 Pro's 18.3% and taking the top position. Just as notable as the raw performance is the price: NVARC costs only 20 cents per task, far below GPT-5 Pro's $7, making it the clear "king of cost-effectiveness" in this race.
NVARC's success is attributed to its zero-pretraining deep learning approach: rather than starting from a model pretrained on large general-purpose corpora, it sidesteps the domain bias and data dependency that such pretraining introduces. That matters because ARC-AGI2 is deliberately hard in exactly this respect: its tasks are designed to test whether a model can quickly learn and apply new skills for which it has no direct training data.

The NVIDIA team took an unconventional approach to training NVARC. They moved the expensive reasoning work into an offline synthetic data pipeline, using GPT-OSS-120B to generate high-quality synthetic puzzles and thereby reducing the demand for real-time computing resources. They took questions from existing datasets and composed them into more complex new ones. To keep the generated data high quality, they decomposed the reasoning process into multiple independent verification stages, ultimately producing a large synthetic dataset of 3.2 million augmented samples.
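The pipeline itself has not been published, so the sketch below is only an illustration of the generate-then-verify idea described above, assuming an LLM client for the generator; every helper name (llm_generate, combine_puzzles, the verify_* stages) is a hypothetical stand-in, not NVIDIA's actual code.

```python
import json
import random

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to an offline generator model (e.g. GPT-OSS-120B)."""
    raise NotImplementedError("plug in your own model client here")

def combine_puzzles(puzzle_a: dict, puzzle_b: dict) -> str:
    """Build a prompt asking the generator to compose two seed puzzles
    into a single harder puzzle with an explicit solution."""
    return (
        "Combine the transformation rules of these two grid puzzles into one "
        "harder puzzle. Return JSON with 'train', 'test', and 'solution'.\n"
        f"Puzzle A: {json.dumps(puzzle_a)}\nPuzzle B: {json.dumps(puzzle_b)}"
    )

def verify_format(sample: dict) -> bool:
    """Stage 1: the sample must contain the expected fields."""
    return all(key in sample for key in ("train", "test", "solution"))

def verify_solution(sample: dict) -> bool:
    """Stage 2: independently re-solve the puzzle and check the answer
    matches the stored solution."""
    answer = llm_generate(f"Solve this puzzle: {json.dumps(sample['test'])}")
    return answer.strip() == json.dumps(sample["solution"])

def build_dataset(seed_puzzles: list[dict], target_size: int) -> list[dict]:
    """Keep generating candidates until enough pass every verification stage."""
    dataset = []
    while len(dataset) < target_size:
        a, b = random.sample(seed_puzzles, 2)
        raw = llm_generate(combine_puzzles(a, b))
        try:
            sample = json.loads(raw)
        except json.JSONDecodeError:
            continue  # discard malformed generations
        if verify_format(sample) and verify_solution(sample):
            dataset.append(sample)
    return dataset
```

Because all of the generation and checking happens offline, the cost of the large generator model is paid once when the dataset is built, not every time the small model answers a task.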
NVARC's reasoning module builds on an improved version of the ARChitects method and uses a conversational template to simplify how puzzles are presented to the model. Supervised fine-tuning was carried out with the NeMo RL framework on a Megatron backend. Particularly noteworthy is TTFT (test-time fine-tuning), which fine-tunes the model on each individual task so that it can quickly adapt to that task's rules.
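Neither the template nor the per-task fine-tuning code is public. A minimal sketch of the two ideas, assuming an ARC-style task dictionary and a Hugging Face-style causal language model, might look like the following; the template wording, hyperparameters, and function names are assumptions for illustration only.

```python
import torch

def grid_to_text(grid: list[list[int]]) -> str:
    """Render a colour grid as rows of digits, one row per line."""
    return "\n".join("".join(str(cell) for cell in row) for row in grid)

def task_to_chat(task: dict) -> str:
    """Conversational template: demonstration pairs become user/assistant turns,
    and the test input becomes a final user turn awaiting the model's answer."""
    parts = []
    for pair in task["train"]:
        parts.append(f"User: Input grid:\n{grid_to_text(pair['input'])}")
        parts.append(f"Assistant: Output grid:\n{grid_to_text(pair['output'])}")
    parts.append(f"User: Input grid:\n{grid_to_text(task['test'][0]['input'])}")
    parts.append("Assistant: Output grid:")
    return "\n".join(parts)

def test_time_finetune(model, tokenizer, task: dict, steps: int = 16, lr: float = 1e-5):
    """Per-task adaptation: take a few gradient steps on the task's own
    demonstration pairs before generating the test output."""
    demo_text = "\n".join(
        f"User: Input grid:\n{grid_to_text(p['input'])}\n"
        f"Assistant: Output grid:\n{grid_to_text(p['output'])}"
        for p in task["train"]
    )
    batch = tokenizer(demo_text, return_tensors="pt").to(model.device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    model.eval()
```

In practice one would load the supervised fine-tuned checkpoint as a causal language model, call test_time_finetune on each evaluation task, and then decode the answer from the prompt produced by task_to_chat; the point of the per-task step is that the model briefly trains on the handful of demonstrations that define the task's rule before it is asked to apply that rule.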
Although some may question whether such a small model is merely a "test-taking machine," NVARC's result highlights how adaptable and efficient small models can be within a specific domain. Their advantages in cost, speed, and adaptability make them especially valuable in many application scenarios. Going forward, the key to pushing the technology further will be matching the right methods to the right domains.
