SenseTime officially released and open-sourced the SenseNova U1 series of natively unified understanding-and-generation models on the 28th. The series builds on the NEO-unify architecture that SenseTime unveiled this March. It achieves deep integration of multi-modal understanding, reasoning, and generation within a single model framework, marking a significant shift in the multi-modal AI paradigm from "integrated" to "natively unified."

The NEO-unify architecture underlying SenseNova U1 discards the modular design common in mainstream models: by removing the visual encoder (VE) and variational autoencoder (VAE), it builds a single unified representation space. Multi-modal processing is woven into every layer of computation, so language and visual information are modeled jointly, preserving pixel-level visual fidelity alongside semantic richness. With this design, the model shows strong logical reasoning and spatial intelligence, accurately understanding the complex layouts and intricate relationships of the physical world.
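The article does not disclose NEO-unify's internals, but the idea of dropping a separate visual encoder and VAE in favor of one shared representation space can be illustrated with a minimal sketch. Everything below is hypothetical (the sizes, the projection, the function names are illustrative assumptions, not SenseTime's implementation): text tokens and raw pixel patches are each mapped into the same hidden space and interleaved into one sequence that a single model stack could process end to end.

```python
import numpy as np

# Illustrative sketch only: how a "natively unified" model might place text
# tokens and raw image patches into ONE shared space, with no separate visual
# encoder or VAE. All names and dimensions are hypothetical assumptions.

rng = np.random.default_rng(0)

D = 64                # shared hidden size of the unified space (assumed)
VOCAB = 1000          # text vocabulary size (assumed)
PATCH = 16 * 16 * 3   # one flattened 16x16 RGB patch

# Text path: a standard embedding lookup table.
text_embed = rng.normal(0.0, 0.02, (VOCAB, D))

# Visual path: a single learned linear projection of raw pixels into the SAME
# D-dimensional space, standing in for a pretrained visual encoder / VAE.
pixel_proj = rng.normal(0.0, 0.02, (PATCH, D))

def unified_sequence(token_ids, patches):
    """Interleave text and pixel-patch embeddings into one sequence that a
    single transformer stack could then process layer by layer."""
    text_part = text_embed[token_ids]                      # (T, D)
    pixel_part = patches.reshape(-1, PATCH) @ pixel_proj   # (P, D)
    return np.concatenate([text_part, pixel_part], axis=0)

tokens = np.array([5, 42, 7])                # e.g. "describe this image"
image_patches = rng.random((4, 16, 16, 3))   # 4 raw patches, no VAE latents

seq = unified_sequence(tokens, image_patches)
print(seq.shape)  # (7, 64): 3 text positions + 4 pixel positions, one space
```

The point of the sketch is that nothing downstream needs to know which positions came from pixels and which from text: both modalities live in the same space from the first layer on, which is one plausible reading of the claim that multi-modal processing is integrated "into every layer of computation."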