Inceptive: The AGI Editor of the Pre-Google Brain Era for Directing Drugs


Recently, the teams of Li Guoqi and Xu Bo at the Institute of Automation, Chinese Academy of Sciences, jointly released SpikingBrain 1.0, described as the world's first large-scale brain-inspired spiking large model. The model handles long texts remarkably quickly: on ultra-long inputs of 4 million tokens it is reported to run more than 100 times faster than current mainstream Transformer models, while requiring only about 2% of the training data. Mainstream large language models today, such as the GPT series, are generally built on the Transformer architecture.
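The defining feature of a spiking model is that its units communicate through sparse, event-driven pulses rather than dense continuous activations. The toy sketch below is not the SpikingBrain architecture, whose details are not given here; it only illustrates the basic leaky integrate-and-fire (LIF) neuron commonly used in spiking networks, with all parameter values chosen arbitrarily for the demonstration.

```python
import numpy as np

def lif_neuron(inputs, tau=20.0, v_threshold=1.0, v_reset=0.0, dt=1.0):
    """Simulate a single leaky integrate-and-fire (LIF) neuron.

    inputs: 1-D array of input currents, one value per time step.
    Returns a binary spike train of the same length.
    """
    v = v_reset                          # membrane potential
    spikes = np.zeros_like(inputs)
    for t, i_t in enumerate(inputs):
        # Leaky integration: the potential decays toward rest while accumulating input.
        v += (-(v - v_reset) + i_t) * (dt / tau)
        if v >= v_threshold:             # fire a spike, then reset
            spikes[t] = 1.0
            v = v_reset
    return spikes

# Example: a constant input current yields a sparse, event-driven spike train.
spike_train = lif_neuron(np.full(100, 1.5))
print(int(spike_train.sum()), "spikes over 100 steps")
```

The sparsity of such spike trains is what gives spiking models their potential efficiency advantage: computation happens only when a neuron actually fires.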
In deep learning, normalization layers have long been considered an indispensable component of modern neural networks. Recently, research led by Meta FAIR research scientist Zhuang Liu, titled "Transformers without Normalization", has attracted significant attention. The work introduces a simple technique called Dynamic Tanh (DyT) and shows that Transformer architectures can remain effective without traditional normalization layers.
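The core idea of DyT, as described in the paper, is to replace each normalization layer with a learnable element-wise operation DyT(x) = γ · tanh(αx) + β, where α is a learnable scalar and γ, β are per-channel scale and shift parameters. A minimal PyTorch sketch written from that description (not taken from the authors' released code) might look like this:

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Dynamic Tanh: a drop-in replacement for LayerNorm, DyT(x) = gamma * tanh(alpha * x) + beta."""

    def __init__(self, dim: int, init_alpha: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1) * init_alpha)  # learnable scalar
        self.gamma = nn.Parameter(torch.ones(dim))              # per-channel scale
        self.beta = nn.Parameter(torch.zeros(dim))               # per-channel shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # tanh squashes extreme activations, playing the stabilizing role normalization usually does.
        return self.gamma * torch.tanh(self.alpha * x) + self.beta

# Usage: swap it in wherever a Transformer block would otherwise use nn.LayerNorm(dim).
layer = DyT(dim=768)
out = layer(torch.randn(2, 16, 768))
print(out.shape)  # torch.Size([2, 16, 768])
```

Unlike LayerNorm, this operation computes no per-token statistics, which is part of what makes it attractive at inference time.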
Google DeepMind has released AlphaProteo, an artificial intelligence system for generating novel proteins that bind to specific target molecules, aimed at accelerating drug design, disease research, and health applications. AlphaProteo successfully designed binders for the protein VEGF-A, which is associated with tumor growth and various disease complications, outperforming traditional methods. Unlike AlphaFold, which predicts the structures of existing proteins, the proteins designed by this system can intervene in biological processes, opening up possibilities for developing targeted therapies against harmful proteins.
Chinese Academy of Engineering academician Wang Jian shared his in-depth thoughts on AI, AI+, and AI infrastructure at the opening ceremony of the 2024 Bund Conference. He emphasized that AI+ is not merely a simple combination of AI and industry, but rather the integration of data, models, and computing power, with cloud computing serving as the infrastructure of the AI era. Wang Jian pointed out that although artificial intelligence has a long history, it has only truly begun to impact industry in the past seven years, with the critical turning point being Google's introduction of the Transformer in 2017. Under the logic of AI+, ChatGPT is not just an application but a broader application platform.
The Transformer architecture, a star of the artificial intelligence field, has driven a revolution in natural language processing with the self-attention mechanism at its core. However, when handling long contexts, the resource consumption of self-attention becomes a bottleneck. To address this, researchers have proposed Tree Attention, which decomposes the attention computation via tree reduction to improve efficiency. The method reduces communication overhead and memory usage, and in multi-GPU settings it is reported to be up to 8 times faster than existing methods, making attention over very long contexts considerably more practical.
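The key observation behind this kind of tree reduction is that attention for a query can be computed chunk by chunk and then merged with an associative combine step, tracking a running max, a softmax denominator, and a weighted numerator, so the per-chunk partial results can be reduced in logarithmic depth, for example across GPUs. The NumPy sketch below illustrates that combine-and-reduce pattern for a single query; it is a simplified single-process illustration under these assumptions, not the authors' distributed implementation.

```python
import numpy as np

def partial_attention(q, k_chunk, v_chunk):
    """Partial attention statistics for one chunk of keys/values: (max, denom, numer)."""
    scores = k_chunk @ q                       # (chunk_len,)
    m = scores.max()                           # running max for numerical stability
    w = np.exp(scores - m)
    return m, w.sum(), w @ v_chunk             # max, softmax denominator, weighted numerator

def combine(a, b):
    """Associative merge of two partial results; this is what enables a log-depth tree reduction."""
    m_a, l_a, n_a = a
    m_b, l_b, n_b = b
    m = max(m_a, m_b)
    return (m,
            l_a * np.exp(m_a - m) + l_b * np.exp(m_b - m),
            n_a * np.exp(m_a - m) + n_b * np.exp(m_b - m))

def tree_reduce(parts):
    """Pairwise (tree-shaped) reduction of partial results, standing in for a cross-GPU allreduce."""
    while len(parts) > 1:
        parts = [combine(parts[i], parts[i + 1]) if i + 1 < len(parts) else parts[i]
                 for i in range(0, len(parts), 2)]
    return parts[0]

rng = np.random.default_rng(0)
d, n, chunks = 16, 1024, 8
q, K, V = rng.normal(size=d), rng.normal(size=(n, d)), rng.normal(size=(n, d))

parts = [partial_attention(q, K_c, V_c)
         for K_c, V_c in zip(np.split(K, chunks), np.split(V, chunks))]
_, denom, numer = tree_reduce(parts)

# Reference: ordinary softmax attention over the full sequence.
s = K @ q
ref = (np.exp(s - s.max()) / np.exp(s - s.max()).sum()) @ V
print(np.allclose(numer / denom, ref))  # True
```

Because the combine step is associative, the reduction order does not change the result, which is why it can be scheduled as a tree across devices instead of a sequential ring.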