Google has officially launched the newest member of its Gemini 3 series: Gemini 3.1 Flash-Lite. As the fastest and most cost-effective lightweight model in the lineup, its release marks Google's continued push into cost-effective AI, aiming to give developers a highly responsive real-time interaction experience.


In terms of performance, Gemini 3.1 Flash-Lite shows marked progress. According to data from independent evaluation platforms, the new model's time to first token (TTFT) is 2.5 times faster than its predecessor, 2.5 Flash, and its overall output speed has risen by 45%. This low latency makes it well suited to chatbots and other real-time processing scenarios that demand immediate feedback.


Beyond raw speed, the model also offers strong value for money. Google has set highly competitive pricing: just $0.25 per million input tokens. In several core capability benchmarks, 3.1 Flash-Lite even punches above its weight, leading same-tier peers on multimodal understanding and logical reasoning metrics, with some scores surpassing those of larger predecessor models.
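To put the quoted rate in concrete terms, a quick back-of-the-envelope calculation shows what large-scale input traffic costs at $0.25 per million input tokens. This sketch uses only the input price stated in the article; output-token pricing is not given here, so it is deliberately left out.

```python
# Estimate input-side cost at the article's quoted rate of
# $0.25 per 1M input tokens (output pricing is not covered here).

INPUT_PRICE_PER_MILLION_USD = 0.25  # rate quoted in the article


def input_cost_usd(input_tokens: int) -> float:
    """Return the USD cost of processing `input_tokens` input tokens."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD


# Example: a workload of 40 million input tokens per day.
print(round(input_cost_usd(40_000_000), 2))  # → 10.0
```

At this rate, even tens of millions of daily input tokens stay in the single-digit-dollar range, which is what makes large-scale deployment economical.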


Additionally, Google has equipped the model with an innovative "thinking level" control in AI Studio and Vertex AI. Developers can adjust the model's depth of reasoning to match the task: simple translation or content moderation can run at maximum efficiency, while complex logic simulation or data-dashboard generation can trigger deeper reasoning. The model is currently available via API to preview users and on enterprise platforms, giving developers worldwide a new tool for building low-latency AI applications.
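The thinking-level workflow described above might be wired up roughly as follows. This is a hedged sketch: the model id `gemini-3.1-flash-lite`, the `thinking_level` config key, and the task categories are assumptions based on the article, not confirmed API details; only the task-to-level selection helper is exercised here.

```python
# Sketch: picking a "thinking level" per task category, as the
# article describes (efficiency for simple tasks, deeper reasoning
# for complex ones). The level names "low"/"high" are assumptions.

def pick_thinking_level(task: str) -> str:
    """Map a task category to a thinking level (assumed values)."""
    fast_tasks = {"translation", "content_review"}
    return "low" if task in fast_tasks else "high"


# A request might then look like this (not executed here; the model
# id and config key are hypothetical):
#
# from google import genai
# client = genai.Client()
# response = client.models.generate_content(
#     model="gemini-3.1-flash-lite",          # hypothetical model id
#     contents="Translate 'hello' to French.",
#     config={"thinking_level": pick_thinking_level("translation")},
# )

print(pick_thinking_level("translation"))     # → low
print(pick_thinking_level("data_dashboard"))  # → high
```

Keeping the level decision in a small helper like this makes it easy to tune which task categories get the faster, shallower setting as pricing and latency requirements evolve.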

Key points:

  • ⚡ Significantly faster responses: time to first token improved 2.5× and overall output speed by 45%, targeting real-time interaction scenarios.

  • 💰 Extreme cost control: Input price as low as $0.25 per million tokens, greatly reducing the barrier to large-scale AI deployment.

  • 🧠 Controllable depth of thinking: New "thinking level" adjustment function, supporting flexible switching between efficiency and deep reasoning.