Alibaba Cloud Large Model Prices Cut in Half! Tongyi Qianwen 3-Max Call Cost Reduced by 50%, Only 10% Fee Charged for Cache Hits

The large-model "price war" has escalated again. Today, Alibaba Cloud's large model service platform, BaiLian, announced that starting November 13, 2025, the Tongyi Qianwen 3-Max model on its China site (Beijing region) will be discounted across the board: core call costs are halved, and cache billing is adjusted at the same time, lowering long-term usage costs for enterprises and developers. The move aims to lower the "high barrier" to large model applications and accelerate AI adoption in the digital transformation of small and medium-sized enterprises.

Three price reduction measures directly address user cost pain points

Batch call costs are halved: the cost for enterprise batch processing of text, logs, or customer service conversations drops by 50%, significantly improving the economics of high-concurrency applications;
Implicit cache hits only charge 20%: for repeated or similar requests, the system automatically enables caching, and the hit part is billed at 20% of the standard unit price of input tokens;
Explicit caching becomes far more cost-effective: creating a cache costs 125% of the input token unit price, but subsequent hits are billed at only 10%, so for frequently repeated workloads the long-run saving on cached input tokens approaches 90%.
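
To make the cache arithmetic concrete, here is a minimal sketch in Python that compares the input cost of resending the same prompt prefix under the three billing modes. The multipliers (20%, 125%, 10%) come from the announcement. The function names, the normalized unit price of 1.0, and the assumption that every call after the first is a full cache hit are illustrative only and do not describe BaiLian's actual API or hit behavior.

```python
# Multipliers from the announcement, relative to the standard input-token price.
IMPLICIT_HIT = 0.20      # implicit cache hit
EXPLICIT_CREATE = 1.25   # creating an explicit cache
EXPLICIT_HIT = 0.10      # explicit cache hit


def prefix_cost(calls: int, mode: str, prefix_tokens: int = 1, unit_price: float = 1.0) -> float:
    """Input cost of sending the same prefix `calls` times.

    Assumption: every call after the first is a full cache hit. Real hit
    rates depend on the service, so treat the result as a best case.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    base = prefix_tokens * unit_price
    if mode == "none":
        return calls * base
    if mode == "implicit":
        # First call is billed in full, later calls at the implicit hit rate.
        return base + (calls - 1) * IMPLICIT_HIT * base
    if mode == "explicit":
        # First call pays the cache creation surcharge, later calls the hit rate.
        return EXPLICIT_CREATE * base + (calls - 1) * EXPLICIT_HIT * base
    raise ValueError(f"unknown mode: {mode}")


def saving(calls: int, mode: str) -> float:
    """Fraction saved relative to sending the prefix uncached on every call."""
    return 1 - prefix_cost(calls, mode) / prefix_cost(calls, "none")


if __name__ == "__main__":
    for n in (2, 4, 10, 100, 1000):
        print(n, f"implicit={saving(n, 'implicit'):.1%}", f"explicit={saving(n, 'explicit'):.1%}")
```

Under these assumptions, the explicit cache only becomes cheaper than the implicit one from the fourth call on, because the 125% creation charge has to be recovered first. As the number of calls grows, the saving on the cached prefix approaches 90%.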

From "free trial" to "sustainable accessibility"

This price adjustment is not an isolated action. Previously, Alibaba Cloud had converted some model services from "limited-time free" to "limited-time quotas," guiding users to plan their resource usage. The price reduction of Tongyi Qianwen 3-Max marks an upgrade in that strategy: by combining fine-grained billing with large-scale cost reduction, it aims for an "accessible yet sustainable" AI service model.

Small and medium enterprises are entering a golden window for AI implementation

As enterprises push to adopt AI, high API call costs remain a major obstacle. Alibaba Cloud's price cuts are especially beneficial for scenarios that call large models frequently, such as:

Intelligent customer service systems (tens of thousands of daily conversations);
Automatic generation of e-commerce product descriptions;
Financial compliance document reviews;
Personalized exercise generation in education.

A technical director of a SaaS service provider said, "After reducing call costs by 50%, our AI feature gross margin can increase by 15 percentage points, and we finally dare to deeply integrate the large model into our core products."
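
As a rough check on the quoted figure, the sketch below shows one set of numbers under which a 50% cost cut yields a 15 percentage point margin gain. The 30% share of revenue spent on model calls is an assumption chosen so the arithmetic works out. It does not come from the quote, and the function name is hypothetical.

```python
def gross_margin(revenue: float, model_cost: float, other_cost: float = 0.0) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - model_cost - other_cost) / revenue


revenue = 100.0
model_cost = 30.0  # assumption: model API calls consume 30% of feature revenue

before = gross_margin(revenue, model_cost)
after = gross_margin(revenue, model_cost * 0.5)  # call costs reduced by 50%
gain_points = (after - before) * 100

print(f"margin before: {before:.0%}, after: {after:.0%}, gain: {gain_points:.0f} points")
```

The gain in percentage points is simply half of the share of revenue previously spent on model calls, so a product with a smaller share would see a proportionally smaller gain.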

Industry impact: domestic large models enter a new stage of "value competition"

Following pricing adjustments by vendors such as Baidu and ByteDance, Alibaba Cloud's significant price cut shows that competition among domestic large models is shifting from a "parameter arms race" to the "deep waters" of "cost efficiency and ecological value." When top players actively lower prices, industry consolidation may accelerate: only vendors with self-developed chips, efficient inference engines, and the ability to deploy at scale can continue to lead in the "low price, high quality" era.

AIbase believes that the price reduction of Tongyi Qianwen 3-Max is not only a commercial strategy but also a substantial push toward "AI democratization." As large models shift from "luxury goods" to "daily necessities," the real wave of industrial intelligence is only just beginning.
