Inceptive: The AGI Editor of the Pre-Google Brain Era for Directing Drugs


Recently, the teams of Li Guoqi and Xu Bo at the Institute of Automation, Chinese Academy of Sciences, jointly released SpikingBrain 1.0, described as the world's first large-scale brain-inspired spiking large model. The model handles long texts remarkably quickly: on ultra-long inputs of 4 million tokens it is reported to run more than 100 times faster than current mainstream Transformer models, while requiring only about 2% of the training data. Mainstream large language models today, such as the GPT series, are generally built on the Transformer architecture.
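The defining feature of a spiking model is that its units communicate through sparse, event-driven pulses rather than dense continuous activations. The toy sketch below is not the SpikingBrain architecture, whose details are not given here; it only illustrates the basic leaky integrate-and-fire (LIF) neuron commonly used in spiking networks, with all parameter values chosen arbitrarily for the demonstration.

```python
import numpy as np

def lif_neuron(inputs, tau=20.0, v_threshold=1.0, v_reset=0.0, dt=1.0):
    """Simulate a single leaky integrate-and-fire (LIF) neuron.

    inputs: 1-D array of input currents, one value per time step.
    Returns a binary spike train of the same length.
    """
    v = v_reset                          # membrane potential
    spikes = np.zeros_like(inputs)
    for t, i_t in enumerate(inputs):
        # Leaky integration: the potential decays toward rest while accumulating input.
        v += (-(v - v_reset) + i_t) * (dt / tau)
        if v >= v_threshold:             # fire a spike, then reset
            spikes[t] = 1.0
            v = v_reset
    return spikes

# Example: a constant input current yields a sparse, event-driven spike train.
spike_train = lif_neuron(np.full(100, 1.5))
print(int(spike_train.sum()), "spikes over 100 steps")
```

The sparsity of such spike trains is what gives spiking models their potential efficiency advantage: computation happens only when a neuron actually fires.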
In deep learning, normalization layers have long been considered an indispensable component of modern neural networks. Recently, research led by Meta FAIR research scientist Zhuang Liu, titled "Transformers without Normalization", has attracted significant attention. The work introduces a simple technique called Dynamic Tanh (DyT) and shows that Transformer architectures can remain effective without traditional normalization layers.
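The core idea of DyT, as described in the paper, is to replace each normalization layer with a learnable element-wise operation DyT(x) = γ · tanh(αx) + β, where α is a learnable scalar and γ, β are per-channel scale and shift parameters. A minimal PyTorch sketch written from that description (not taken from the authors' released code) might look like this:

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Dynamic Tanh: a drop-in replacement for LayerNorm, DyT(x) = gamma * tanh(alpha * x) + beta."""

    def __init__(self, dim: int, init_alpha: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1) * init_alpha)  # learnable scalar
        self.gamma = nn.Parameter(torch.ones(dim))              # per-channel scale
        self.beta = nn.Parameter(torch.zeros(dim))               # per-channel shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # tanh squashes extreme activations, playing the stabilizing role normalization usually does.
        return self.gamma * torch.tanh(self.alpha * x) + self.beta

# Usage: swap it in wherever a Transformer block would otherwise use nn.LayerNorm(dim).
layer = DyT(dim=768)
out = layer(torch.randn(2, 16, 768))
print(out.shape)  # torch.Size([2, 16, 768])
```

Unlike LayerNorm, this operation computes no per-token statistics, which is part of what makes it attractive at inference time.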
Google DeepMind has released AlphaProteo, an artificial intelligence system for generating novel proteins that bind to specific target molecules, aimed at accelerating drug design, disease research, and health applications. AlphaProteo successfully designed binders for the protein VEGF-A, which is associated with tumor growth and various disease complications, outperforming traditional methods. Unlike AlphaFold, which predicts the structures of existing proteins, the proteins designed by this system can intervene in biological processes, opening up possibilities for developing targeted therapies against harmful proteins.
Chinese Academy of Engineering academician Wang Jian shared his in-depth thoughts on AI, AI+, and AI infrastructure at the opening ceremony of the 2024 Bund Conference. He emphasized that AI+ is not merely a simple combination of AI and industry, but rather the integration of data, models, and computing power, with cloud computing serving as the infrastructure of the AI era. Wang Jian pointed out that although artificial intelligence has a long history, it has only truly begun to impact industry in the past seven years, with the critical turning point being Google's introduction of the Transformer in 2017. Under the logic of AI+, ChatGPT is not just an application but a broader application platform.
The Transformer architecture, a star of the artificial intelligence field, has driven a revolution in natural language processing with the self-attention mechanism at its core. However, when handling long contexts, the resource consumption of self-attention becomes a bottleneck. To address this, researchers have proposed Tree Attention, which decomposes the attention computation via tree reduction to improve efficiency. The method reduces communication overhead and memory usage, and in multi-GPU settings it is reported to be up to 8 times faster than existing methods, making attention over very long contexts considerably more practical.
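The key observation behind this kind of tree reduction is that attention for a query can be computed chunk by chunk and then merged with an associative combine step, tracking a running max, a softmax denominator, and a weighted numerator, so the per-chunk partial results can be reduced in logarithmic depth, for example across GPUs. The NumPy sketch below illustrates that combine-and-reduce pattern for a single query; it is a simplified single-process illustration under these assumptions, not the authors' distributed implementation.

```python
import numpy as np

def partial_attention(q, k_chunk, v_chunk):
    """Partial attention statistics for one chunk of keys/values: (max, denom, numer)."""
    scores = k_chunk @ q                       # (chunk_len,)
    m = scores.max()                           # running max for numerical stability
    w = np.exp(scores - m)
    return m, w.sum(), w @ v_chunk             # max, softmax denominator, weighted numerator

def combine(a, b):
    """Associative merge of two partial results; this is what enables a log-depth tree reduction."""
    m_a, l_a, n_a = a
    m_b, l_b, n_b = b
    m = max(m_a, m_b)
    return (m,
            l_a * np.exp(m_a - m) + l_b * np.exp(m_b - m),
            n_a * np.exp(m_a - m) + n_b * np.exp(m_b - m))

def tree_reduce(parts):
    """Pairwise (tree-shaped) reduction of partial results, standing in for a cross-GPU allreduce."""
    while len(parts) > 1:
        parts = [combine(parts[i], parts[i + 1]) if i + 1 < len(parts) else parts[i]
                 for i in range(0, len(parts), 2)]
    return parts[0]

rng = np.random.default_rng(0)
d, n, chunks = 16, 1024, 8
q, K, V = rng.normal(size=d), rng.normal(size=(n, d)), rng.normal(size=(n, d))

parts = [partial_attention(q, K_c, V_c)
         for K_c, V_c in zip(np.split(K, chunks), np.split(V, chunks))]
_, denom, numer = tree_reduce(parts)

# Reference: ordinary softmax attention over the full sequence.
s = K @ q
ref = (np.exp(s - s.max()) / np.exp(s - s.max()).sum()) @ V
print(np.allclose(numer / denom, ref))  # True
```

Because the combine step is associative, the reduction order does not change the result, which is why it can be scheduled as a tree across devices instead of a sequential ring.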