Kuaishou has upgraded its video generator Kling to version 2.6, introducing two core features, voice control and motion control, in what it presents as a breakthrough for AI video generation. The update adds native audio generation and significantly improves how accurately the model handles complex movements.

Voice Control: From Sound Effects to Personalized Voice Customization
The voice control feature of Kling 2.6 is built on synchronized video and audio generation, similar to Google's Veo 3 and OpenAI's Sora 2, and can generate sound effects, vocals, and music that match the video content. It supports vocal styles such as speech, dialogue, narration, singing, and rapping, and can also handle ambient noise and the sound effects of complex scenes.
More notably, users can now upload their own trained voice models, or upload audio files directly, for use in text-to-video creation. This significantly improves character consistency: characters in generated videos can speak with clear, recognizable voices, which makes it possible to keep a character consistent across multiple video clips.
The application scenarios Kling AI lists include product demonstrations, lifestyle vlogs, news broadcasts, sports commentary, documentaries, interviews, short dramas, and music performances, including complex forms such as polyphonic choruses.

Motion Control Upgrade: Precisely Capturing Complex Full-Body Movements
The second major update focuses on a comprehensive upgrade of the motion control system. According to Kling AI, the system can now more precisely capture full-body movements, accurately handling even fast and complex actions such as martial arts or dance.
The company particularly emphasized improvements in two traditional challenges for AI video: hand movements are now precise without blurring, and facial expressions and lip synchronization remain natural. Users can upload action reference clips lasting 3 to 30 seconds to create coherent sequences, and scene details can be adjusted through text prompts.
Many impressive examples are already circulating on social media, a sign that AI-generated video content keeps growing as creators take advantage of the new tools to produce original work.

Price Advantages and Market Strategy
In addition to Kuaishou's own platform, Kling 2.6 is available through third-party platforms such as Fal.ai, Artlist, and Media.io. API pricing is approximately $0.07 to $0.14 per second of generated video, varying with generation speed, duration, and resolution, which makes it highly competitive in the market. Kling AI itself uses a points-based billing system.
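To make the per-second pricing concrete, the short Python sketch below estimates a cost range for a clip of a given length. It only multiplies the duration by the $0.07 and $0.14 figures quoted above; the function name `estimate_cost_range` is hypothetical, and the sketch does not model how speed or resolution move a job within that range, nor does it reflect any provider's actual billing rules.

```python
# Rough cost estimate based only on the per-second range quoted in the article.
# Assumption: the final price is a simple rate * seconds product; real
# providers may round, apply minimums, or price tiers differently.
LOW_RATE_USD = 0.07   # lower bound per second of generated video
HIGH_RATE_USD = 0.14  # upper bound per second of generated video


def estimate_cost_range(seconds: float) -> tuple[float, float]:
    """Return a (low, high) cost estimate in USD for a clip of this length."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    low = round(seconds * LOW_RATE_USD, 2)
    high = round(seconds * HIGH_RATE_USD, 2)
    return low, high


if __name__ == "__main__":
    for length in (5, 10, 30):
        low, high = estimate_cost_range(length)
        print(f"{length:>2} s clip: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, a 10-second clip would land between roughly $0.70 and $1.40.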
In early December, Kuaishou also released Video O1, billed as "the world's first unified multimodal video model," which can edit existing videos through text instructions, for example by changing the main character, the weather, or the video style.
With these features, Kuaishou is competing with Western companies such as Google, OpenAI, and Runway, as well as Chinese competitors like Hailuo, Shida, and Weidu. Notably, Kuaishou operates Kwai, one of the world's largest short video platforms and comparable in scale to TikTok. That gives it access to massive amounts of audiovisual and movement data, a unique advantage for training video models with accurate voice synchronization and realistic motion.
