Google DeepMind has released VaultGemma, a new open-source language model focused on protecting user privacy. With 1 billion parameters, it is the largest language model trained from scratch with differential privacy to date. The release marks a significant step forward for protecting user data privacy in artificial intelligence.

Traditional large language models can inadvertently memorize sensitive information seen during training, such as names, addresses, and confidential documents. To address this, VaultGemma is trained with differential privacy, which injects calibrated random noise during training so that the model's outputs cannot be statistically linked to any specific training sample. Even if VaultGemma saw a confidential file during training, its content cannot be reconstructed from the model. In Google's preliminary tests, VaultGemma showed no detectable memorization of its training data, which strengthens the case for user trust.
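The core mechanism behind this guarantee is DP-SGD: each example's gradient is clipped to a fixed norm so no single sample can dominate an update, then Gaussian noise scaled to that clipping bound is added before the update is applied. The following is a minimal illustrative sketch of one such step in numpy; the function name, shapes, and hyperparameter values are hypothetical, not VaultGemma's actual training code.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One differentially private gradient step (illustrative DP-SGD sketch).

    per_example_grads: array of shape (batch, dim), one gradient per sample.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # 1. Clip each example's gradient so any single sample's influence is bounded.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # 2. Sum the clipped gradients and add Gaussian noise calibrated to the bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)

# Toy batch: one large gradient (norm 5.0) gets clipped, one small one does not.
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
update = dp_sgd_step(grads, clip_norm=1.0)
```

The noise multiplier and clipping norm together determine the privacy budget (epsilon); in practice they are chosen with a privacy accountant rather than fixed by hand as above.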


In terms of technical architecture, VaultGemma is based on Google's Gemma 2 architecture, using a decoder-only Transformer design with 26 layers and a multi-query attention mechanism. A key design choice was limiting the sequence length to 1024 tokens, which helps manage the heavy computation that private training requires. The development team also leveraged a novel "differential privacy scaling law" that provides a framework for trading off compute, privacy budget, and model utility.
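Multi-query attention reduces memory and compute by having all query heads share a single key/value head, instead of giving each head its own. A minimal numpy sketch of the idea follows; it is a generic illustration under assumed shapes, not VaultGemma's implementation, and the causal mask a decoder-only model would apply is omitted for brevity.

```python
import numpy as np

def multi_query_attention(x, wq, wk, wv, n_heads):
    """Illustrative multi-query attention: n_heads query heads, one shared K/V head.

    x:  (seq, d_model) input activations
    wq: (d_model, d_model) query projection (split across heads)
    wk, wv: (d_model, d_head) single shared key/value projections
    """
    seq, d_model = x.shape
    d_head = d_model // n_heads
    q = (x @ wq).reshape(seq, n_heads, d_head)  # per-head queries
    k = x @ wk                                  # one shared key head (seq, d_head)
    v = x @ wv                                  # one shared value head (seq, d_head)
    # Scaled dot-product scores for every head against the shared keys.
    scores = np.einsum("shd,td->hst", q, k) / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Weighted sum of the shared values, then merge heads back together.
    out = np.einsum("hst,td->shd", weights, v)
    return out.reshape(seq, d_model)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = multi_query_attention(
    x,
    rng.normal(size=(8, 8)),   # wq
    rng.normal(size=(8, 2)),   # wk (d_head = 8 / 4 heads = 2)
    rng.normal(size=(8, 2)),   # wv
    n_heads=4,
)
```

Because only one key/value head is stored per token, the KV cache shrinks by a factor of `n_heads`, which matters when the compute budget is already strained by private training.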

VaultGemma's performance is roughly comparable to that of non-private language models from about five years ago, and its generation is somewhat conservative, but in exchange it offers much stronger privacy protection. Google researchers have released VaultGemma and its accompanying code under an open-source license on Hugging Face and Kaggle, making this privacy-preserving AI technology easy for anyone to access.

The release opens new possibilities for combining privacy protection with open-source technology, and it is expected to offer users a safer and more trustworthy experience in the future.