The AI music field experienced another shockwave in early 2026: on March 9, Tencent released its new music foundation model.

Three Breakthroughs: Making AI Music No Longer "Plastic"
High musicality: Unlike simple melody stacking, this model can handle complex multi-track arrangements with strong spatial depth.
High lyric accuracy: Unclear pronunciation and hallucinated pitch shifts are a thing of the past. Its phoneme error rate (PER) is as low as 8.55%, significantly better than the top commercial model MiniMax2.5 (12.4%) and second only to Suno v5.
Strong controllability: Whether given text descriptions or audio prompts, it follows instructions accurately, allowing deep customization of style and emotion.
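For readers unfamiliar with the PER figures above: phoneme error rate is the edit distance between the recognized phoneme sequence and the reference, divided by the reference length. A minimal sketch (the phoneme symbols below are illustrative examples, not taken from the model's evaluation set):

```python
def edit_distance(ref, hyp):
    # Classic Levenshtein dynamic program over phoneme sequences:
    # counts the minimum substitutions + insertions + deletions.
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[m][n]

def phoneme_error_rate(ref_phonemes, hyp_phonemes):
    """PER = (substitutions + insertions + deletions) / len(reference)."""
    return edit_distance(ref_phonemes, hyp_phonemes) / len(ref_phonemes)

ref = ["HH", "AH", "L", "OW"]   # reference phonemes for "hello"
hyp = ["HH", "AH", "L", "UW"]   # one substitution
print(phoneme_error_rate(ref, hyp))  # -> 0.25
```

An 8.55% PER thus means roughly one phoneme-level error per twelve reference phonemes.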

"Dual-Core" Drive: A Dream Collaboration Between LLM and Diffusion Models
In terms of architectural design, the model pairs two engines:
Composing Brain (LeLM): Responsible for planning global structure and vocal details, solving the question of "how to sing".
High-Fidelity Renderer (Diffusion): Synthesizes extremely complex acoustic details under the guidance of the language model.
Hierarchical Representation: It is the first to model mixed and multi-track representations in parallel, balancing melodic stability with delicate sound quality.
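The division of labor above can be sketched as a two-stage pipeline. Note that the class and method names here (`ComposerLM`, `DiffusionRenderer`, `plan`, `render`) are illustrative placeholders, not the project's actual API; the bodies are stubs standing in for real model inference:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SemanticTokens:
    """Coarse plan from the language model: structure plus vocal detail."""
    mixed: List[int]              # mixed-representation stream (melody stability)
    multitrack: List[List[int]]   # per-track streams (arrangement detail)

class ComposerLM:
    """Stage 1 (illustrative): the LLM plans 'how to sing'."""
    def plan(self, lyrics: str, style_prompt: str) -> SemanticTokens:
        # A real model would autoregressively generate token streams
        # conditioned on lyrics and the style prompt; this is a dummy plan.
        return SemanticTokens(mixed=[1, 2, 3], multitrack=[[4, 5], [6, 7]])

class DiffusionRenderer:
    """Stage 2 (illustrative): diffusion synthesizes acoustic detail."""
    def render(self, tokens: SemanticTokens, steps: int = 50) -> List[float]:
        # A real renderer would iteratively denoise latent audio guided by
        # the LLM's tokens; we emit a placeholder waveform of the right shape.
        return [0.0] * (len(tokens.mixed) * 100)

def generate_song(lyrics: str, style: str) -> List[float]:
    plan = ComposerLM().plan(lyrics, style)
    return DiffusionRenderer().render(plan)

audio = generate_song("la la la", "dreamy synth-pop")  # placeholder waveform
```

The key design point is the interface between the stages: the diffusion model never sees raw text, only the token plan, so structural decisions stay with the LLM while acoustic rendering stays with the diffusion model.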
True Open Source, Low Barrier: Ordinary Computers Can Also "Write Songs"
The most exciting part for developers is Tencent's openness: the model is truly open source.
To let users experience it immediately, the project team also released the
From the performance of
