At the upcoming 9th Digital China Construction Summit, China Mobile's self-developed "Jiu Tian" 35B general-purpose large model will be officially unveiled. In a notable step for the domestic computing ecosystem, Moore Threads recently announced that its flagship full-featured GPU, the MTT S5000, has completed end-to-end adaptation and inference verification of the model.

The core of the adaptation work is deep integration. Building on its self-developed MUSA software stack and the SGLang-MUSA high-performance inference engine, Moore Threads has the entire inference pipeline of the "Jiu Tian" 35B model running end to end. Through joint optimization of the MUSA C development framework, the muDNN computing library, and the open-source MATE operator library, the MTT S5000 has been tuned for the attention mechanism and the long-sequence inference workloads characteristic of large models, so that it stays efficient and stable when processing long texts and serving high-concurrency requests.

The MTT S5000 compute card is the hardware foundation of this adaptation. The card is built on the fourth-generation MUSA "Pinghu" architecture and delivers up to 1000 TFLOPS of dense AI compute per card. It carries 80 GB of VRAM with a memory bandwidth of 1.6 TB/s and supports the full range of precisions from FP8 to FP64. In addition, its 784 GB/s inter-card interconnect bandwidth allows it to scale in complex intelligent computing scenarios.
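To put these figures in context, the short calculation below estimates how much of the 80 GB of VRAM the model weights alone would occupy, and a rough ceiling on single-stream decode speed if every generated token required reading all weights once from memory. It is a back-of-envelope sketch, not a benchmark: it assumes a dense model with exactly 35 billion parameters (inferred only from the "35B" name), ignores the KV cache, activations, and batching, and treats 1.6 TB/s as 1600 GB/s. The function names are illustrative and do not come from any Moore Threads tooling.

```python
# Back-of-envelope sizing for a 35B-parameter model on one 80 GB card.
# Card figures come from the article; everything else is an assumption.

PARAMS = 35e9          # ASSUMPTION: dense model, 35 billion parameters
VRAM_GB = 80           # from the article
MEM_BW_GB_S = 1600     # 1.6 TB/s from the article, taken as decimal GB/s

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}


def weight_footprint_gb(params: float, bytes_per_param: int) -> float:
    """Memory taken by the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def decode_ceiling_tokens_per_s(weights_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode speed when each token
    requires streaming all weights from VRAM exactly once."""
    return bandwidth_gb_s / weights_gb


def summarize() -> dict:
    """Footprint, fit, and decode ceiling for each precision."""
    result = {}
    for name, nbytes in BYTES_PER_PARAM.items():
        gb = weight_footprint_gb(PARAMS, nbytes)
        result[name] = {
            "weights_gb": gb,
            "fits_in_vram": gb < VRAM_GB,
            "decode_ceiling_tok_s": decode_ceiling_tokens_per_s(gb, MEM_BW_GB_S),
        }
    return result


if __name__ == "__main__":
    for precision, row in summarize().items():
        print(
            f"{precision}: {row['weights_gb']:.0f} GB weights, "
            f"fits={row['fits_in_vram']}, "
            f"ceiling ~{row['decode_ceiling_tok_s']:.1f} tok/s per stream"
        )
```

Under these assumptions the FP16 weights take about 70 GB, leaving little headroom for the KV cache on a single card, while FP8 halves that to about 35 GB. That gap is one reason low-precision support and inter-card bandwidth matter for long-sequence, high-concurrency serving.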

The collaboration not only verifies that domestic GPUs can reliably support the core large models of central state-owned enterprises, but also demonstrates Moore Threads' maturity in high-performance operator optimization and software ecosystem development. With the official release of the "Jiu Tian" 35B model, this pairing of "domestic large models + domestic computing power" offers a practical reference case for building independent and controllable computing capacity.