Large Model Agents Say Goodbye to Blind Stacking! Chinese University of Hong Kong Team Releases SLIM Framework to Dynamically Manage the Lifecycle of External Skills

Large language model (LLM) agents are moving quickly from "chatting" to "doing tasks," a stage that requires continuous decision-making. How to manage an agent's external capabilities efficiently has become a new challenge for the industry. Recently, a research team from The Chinese University of Hong Kong proposed SLIM, a dynamic skill lifecycle management framework, in a paper titled "Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning." The work departs from the prevailing practice of blindly "stacking skills" and offers a new approach to solving complex tasks in both physical and virtual environments.

In complex scenarios such as web search, office automation, and embodied robotics, agents often need to call external skills to handle error-prone, long-tail steps. Existing methods tend toward one of two extremes. Some keep accumulating skills, which increases retrieval noise and context interference. Others pursue "zero-skill reasoning," trying to force every capability into the model parameters, and so lose abilities that are local but critical. To address this, SLIM treats external skills as a dynamic capability system with a lifecycle, allowing the model to decide autonomously during reinforcement learning training which external skills to retain, retire, or expand.

SLIM's basic mechanism is a closed loop. During training, the system retrieves general or task-specific skills based on the current state and updates the agent's decision policy with the GRPO algorithm. It then runs a "leave-one-skill-out" audit: each skill is temporarily disabled in turn, and the resulting change in performance measures that skill's marginal external contribution. If performance drops significantly when a skill is disabled, the skill is retained (Retain). If its contribution stays consistently low, either the model has already absorbed the ability or the skill is causing interference, so the skill is retired (Retire). For new scenarios in which the agent keeps failing, the expand (Expand) mechanism summarizes the failure cases and adds new skills derived from them.
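The Retain/Retire part of this audit can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names `audit_skills` and `toy_evaluate`, the `threshold` and `patience` parameters, and the toy scoring rule are all assumptions made for illustration, and the Expand step (summarizing new skills from failure cases) is left out.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def audit_skills(
    skills: List[str],
    evaluate: Callable[[FrozenSet[str]], float],
    low_streak: Dict[str, int],
    threshold: float = 0.05,  # hypothetical: minimum drop that counts as "significant"
    patience: int = 2,        # hypothetical: low-contribution audits allowed before retiring
) -> Tuple[List[str], List[str]]:
    """Leave-one-skill-out audit. Returns (retained, retired) skill names."""
    baseline = evaluate(frozenset(skills))
    retained: List[str] = []
    retired: List[str] = []
    for skill in skills:
        # Temporarily disable one skill and measure how much performance drops.
        ablated = frozenset(s for s in skills if s != skill)
        contribution = baseline - evaluate(ablated)
        if contribution >= threshold:
            low_streak[skill] = 0          # skill still matters: reset its streak
            retained.append(skill)
        else:
            low_streak[skill] = low_streak.get(skill, 0) + 1
            if low_streak[skill] >= patience:
                retired.append(skill)      # consistently low contribution: retire
            else:
                retained.append(skill)     # low once, but not yet consistently
    return retained, retired


def toy_evaluate(active: FrozenSet[str]) -> float:
    """Hypothetical success rate: only 'open_container' helps in this toy setup."""
    return 0.5 + (0.3 if "open_container" in active else 0.0)


if __name__ == "__main__":
    streaks: Dict[str, int] = {}
    library = ["open_container", "greet"]
    for round_idx in range(2):
        kept, dropped = audit_skills(library, toy_evaluate, streaks)
        print(round_idx, kept, dropped)
        library = kept
```

In this toy run the unhelpful skill survives the first audit and is retired on the second, which mirrors the "consistently low" condition described above. In a real training loop, `evaluate` would be replaced by rollouts of the current policy with the given skill set enabled.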

Experimental results show that the framework outperforms the best existing comparison methods by an average of 7.1 percentage points. On ALFWorld household tasks, which are more action-oriented and complex, SLIM achieved an 87.5% success rate through concise and efficient external skill management, well above the 75.0% of the baseline method SkillRL. On SearchQA tasks, which lean more on information retrieval and reasoning, SLIM was also competitive, and the results support the idea that the model can internalize some search strategies.

Industry analysts point out that the core value of SLIM lies in elevating the external skill library from a fixed auxiliary tool to a training object that is optimized jointly with the policy. It clarifies, at a technical level, which capabilities should be written into the model and which should remain external, and it lets large model agents learn when to seek external support in complex, changing environments. In their view, this dynamic capability management paradigm lays a theoretical and engineering foundation for bringing embodied intelligence and large model agents into large-scale industrial use.
