On February 12 local time, Google disclosed that its AI chatbot Gemini is facing large-scale "model distillation attacks," in which attackers induce the model to leak its internal mechanisms by bombarding it with enormous volumes of questions. In some attacks the number of prompts exceeded 100,000, heightening industry-wide concern about the security of large models. Such attacks reportedly probe Gemini's output patterns and logic over and over to map its core internal mechanisms, with the ultimate aim of cloning the model or improving the attackers' own AI systems.
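For readers unfamiliar with the technique, the sketch below shows the general idea behind model distillation: a "student" model is trained to imitate the answers a black-box "teacher" returns for a large set of queries. This is a minimal, generic illustration only, not a description of the actual attack on Gemini; all names here (query_teacher, TEACHER_W, the toy dimensions) are assumptions for the example.

```python
# Generic model-distillation sketch: train a student to mimic a black-box teacher's
# output distributions, using only query/response pairs. Toy dimensions throughout.
import torch
import torch.nn as nn

torch.manual_seed(0)
TEACHER_W = torch.randn(16, 4)  # hidden "teacher" parameters the querier never sees


def query_teacher(x: torch.Tensor) -> torch.Tensor:
    """Stand-in for a black-box API call: returns the teacher's output distribution.
    In a real extraction attempt this would correspond to many thousands of prompts."""
    with torch.no_grad():
        return torch.softmax(x @ TEACHER_W, dim=-1)


student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.KLDivLoss(reduction="batchmean")

for step in range(1_000):                            # each step = one batch of queries
    queries = torch.randn(64, 16)                    # querier-chosen inputs (toy)
    teacher_probs = query_teacher(queries)           # observed API responses
    student_logp = torch.log_softmax(student(queries), dim=-1)
    loss = loss_fn(student_logp, teacher_probs)      # pull student toward teacher's behavior
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key point the example makes concrete is that no access to the teacher's weights is needed; sufficiently many input/output pairs alone can transfer much of its behavior, which is why sheer query volume is central to such attacks.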

Google stated that the attacks were mainly launched by actors with commercial motives, often private AI companies or research institutions seeking a competitive advantage. The attack sources are spread across multiple regions worldwide, though Google did not disclose further details about the suspects. John Hultquist, chief analyst of Google's threat intelligence team, said the scale of the attack is a dangerous signal, indicating that model distillation attacks may already be spreading to the custom AI tools used by small and medium-sized enterprises.
He compared Google's experience to the "canary in the coal mine": a security crisis at a large platform signals an impending, industry-wide risk for AI. Google emphasized that model distillation attacks are essentially intellectual property theft. Major tech companies have invested billions of dollars in developing large language models, and the internal mechanisms of these models are core proprietary assets.
Although the industry has deployed mechanisms to identify and block such attacks, the open nature of mainstream large-model services makes the risk fundamentally difficult to eliminate. The core objective of this attack was to extract Gemini's "reasoning" algorithm, the decision-making mechanism it uses to process information.
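One common class of countermeasure alluded to above is monitoring for abnormally high query volumes from a single client. The snippet below is a generic illustration of that idea, not Google's actual mechanism; the thresholds and names (QUERY_LIMIT, WINDOW_SECONDS, QueryMonitor) are assumptions chosen for the example.

```python
# Generic sliding-window query monitor: flag clients whose request volume looks
# more like automated extraction than normal use. Illustrative thresholds only.
import time
from collections import defaultdict, deque

QUERY_LIMIT = 1_000       # max queries allowed per client within the window
WINDOW_SECONDS = 3_600    # one-hour sliding window


class QueryMonitor:
    """Tracks per-client query timestamps and rejects extraction-like bursts."""

    def __init__(self) -> None:
        self.history: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = time.time()
        timestamps = self.history[client_id]
        # Discard timestamps that have fallen out of the sliding window.
        while timestamps and now - timestamps[0] > WINDOW_SECONDS:
            timestamps.popleft()
        if len(timestamps) >= QUERY_LIMIT:
            return False      # block the request or escalate for review
        timestamps.append(now)
        return True


monitor = QueryMonitor()
if not monitor.allow("client-123"):
    print("Query rejected: volume consistent with automated extraction")
```

As the article notes, defenses of this kind can only raise the cost of extraction; a service that answers arbitrary prompts at scale cannot fully prevent its outputs from being harvested.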
Hultquist warned that as more and more companies train customized large models on sensitive internal data, the potential harm of model distillation attacks will keep growing: business insights and core knowledge accumulated over years could be gradually extracted through such queries.
