China's artificial intelligence field has seen a major technical advance. Xiyu Technology officially released its new large model, MiniMax M3, today. The model offers cutting-edge programming capabilities and supports an ultra-long context of up to 1 million tokens. It also natively supports multimodal input, including images and video, as well as computer desktop operation, making it the first open-source model in China to combine these three core capabilities.

Outstanding performance in authoritative evaluations

On SWE-Bench Pro, a widely recognized and demanding programming benchmark, MiniMax M3 scored 59.0%, surpassing GPT-5.5 and Gemini 3.1 Pro and coming very close to the top-tier Opus 4.7. M3 also posted leading results on Claw-Eval, which specifically evaluates AI agent capabilities, and on OmniDocBench, a multimodal document understanding benchmark.

New technical architecture unlocks computing power

Behind the performance gains is M3's new sparse attention architecture (MSA). In extreme scenarios with a 1-million-token context, its per-token computation is only half that of the previous-generation model, allowing the model to run more than 9 times faster in the comprehension phase and more than 15 times faster in the answer generation phase. The model's API is now officially available, and the team has promised to open-source the model weights and technical report within 10 days for developers worldwide.