Research from Renmin University: Caution Advised in Data Augmentation for Contrastive Learning


Tencent Technology (Shenzhen) Co., Ltd. recently published a patent, listed on the Tianyancha app, covering a training method and related equipment for large language models. Titled 'Training Method, Device, Computer Equipment, and Storage Medium for Large Language Models', the patent aims to improve the learning capacity and accuracy of large language models through a new training method. Traditional training often relies on a single text summary, which can lead to overfitting and reduce the accuracy and diversity of generated content. However, Tencent's new...
CLIP (Contrastive Language-Image Pre-training) is an important multimodal foundation model. It maps visual and text signals into a shared feature space by training with a contrastive loss on a large-scale dataset of image-text pairs. As a retriever, CLIP supports tasks such as zero-shot classification, detection, segmentation, and image-text retrieval. As a feature extractor, it performs well in nearly all...
EasyRec is a language-model-based recommendation system developed by a team from the University of Hong Kong. Its distinguishing feature is a text-behavior alignment framework that analyzes detailed, emotionally rich descriptions of user behavior to predict preferences without requiring large amounts of user data. The system combines contrastive learning with collaborative language models, which lets it predict preferences for new users and new products and makes it particularly strong in zero-shot recommendation scenarios. Because it is plug-and-play, EasyRec can be integrated into existing recommendation systems to improve their performance. The paper showcases EasyRec's performance across multiple...
French AI startup Mistral AI announced a strategic shift on May 28, expanding its AI models and infrastructure into advanced manufacturing through deep partnerships with Airbus and BMW. Focusing on 'physical AI,' it aims to empower industrial production chains through generative AI, enabling an 'intellectual upgrade' of industrial engineering...
Google's upgraded AI summary feature was ridiculed for basic spelling errors: it frequently miscounted the letters in words and even misspelled 'Google' during public tests. Users reported that the AI miscounted the letters in 'poop,' highlighting its kindergarten-level spelling shortcomings...