MBLab, together with Tsinghua University and the OpenBMB open-source community, recently released and open-sourced BitCPM-CANN, its latest advance in low-bit training for large models. The work was carried out natively on Huawei's Ascend platform and marks a key step toward lightweight, production-ready large models for edge AI.

Roughly Sixfold Memory Savings Ease Hardware Limits
The open-sourced BitCPM-CANN comes in four model sizes: 0.5B, 1B, 3B, and 8B. In like-for-like comparisons with full-precision models of the same size, it performs strongly. Compared with conventional BF16 precision, the model delivers a memory saving of about six times during inference, which significantly lowers the hardware requirements for running large models.
For the mobile industry, a sixfold memory saving means that 8B-parameter models, which previously demanded very high-end hardware, can now run smoothly on mainstream flagship phones. Freeing up this much memory should directly accelerate the adoption and commercial use of edge AI on mobile devices.
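To make the sixfold figure concrete, the following is a rough back-of-the-envelope sketch, not a measurement. It assumes nominal parameter counts (for example, exactly 8e9 for the 8B model), counts weight storage only (ignoring the KV cache, activations, and runtime overhead), and simply divides the BF16 footprint by the "about six times" factor quoted above. The function and constant names are illustrative.

```python
# Rough estimate of weight memory under BF16 versus the ~6x saving
# quoted in the article. Assumptions: nominal parameter counts,
# weights only, decimal gigabytes (1 GB = 1e9 bytes).

BYTES_PER_BF16_PARAM = 2       # BF16 stores each parameter in 16 bits
ARTICLE_MEMORY_FACTOR = 6.0    # "about six times", taken from the article

# Nominal parameter counts for the four released sizes.
MODEL_SIZES = {"0.5B": 0.5e9, "1B": 1e9, "3B": 3e9, "8B": 8e9}


def bf16_weight_gb(num_params: float) -> float:
    """Weight memory in GB if every parameter is stored in BF16."""
    return num_params * BYTES_PER_BF16_PARAM / 1e9


def estimated_low_bit_gb(num_params: float,
                         factor: float = ARTICLE_MEMORY_FACTOR) -> float:
    """BF16 weight memory divided by the quoted saving factor."""
    return bf16_weight_gb(num_params) / factor


def memory_table() -> dict:
    """Map each model size to (BF16 GB, estimated low-bit GB)."""
    return {
        name: (bf16_weight_gb(n), estimated_low_bit_gb(n))
        for name, n in MODEL_SIZES.items()
    }


if __name__ == "__main__":
    for name, (bf16, low) in memory_table().items():
        print(f"{name}: BF16 ~{bf16:.2f} GB, low-bit ~{low:.2f} GB")
```

Under these assumptions, an 8B model needs about 16 GB for BF16 weights and under 3 GB after a sixfold reduction, which illustrates why the saving matters for phones with limited memory.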
High Capability Retention Demonstrates Engineering Reproducibility
Even at the reduced model size, BitCPM-CANN maintains a very high level of performance, with capability retention rates between 90% and 97.2%. The three main model sizes reach retention rates of 95.7% to 97.2%, and even the smallest model, at 0.5B, exceeds 90%.
These evaluation results indicate that the low-bit training approach scales well across model sizes and is reproducible in engineering practice. Building on the relevant core technologies, MBLab has put together a complete low-bit training foundation that covers the full engineering stack, including environment adaptation, support for 32K long sequences, and fused operators. This provides solid shared infrastructure for future low-bit training work on Ascend.
