Large language model (LLM) agents are moving quickly from the "chatting" phase to the continuous decision-making stage of "doing tasks." How to manage an agent's external capabilities efficiently, however, has become a new challenge for the whole industry. Recently, a research team from The Chinese University of Hong Kong proposed a dynamic skill lifecycle management framework called "SLIM" in a paper titled "Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning." The work departs from the earlier tendency to simply keep "stacking skills" and offers a new approach to solving complex tasks in both physical and virtual worlds.

In complex scenarios such as web search, office automation, and embodied robotics, agents often need to call external skills to handle error-prone, long-tail steps. Traditional methods, however, fall into one of two extremes. Some keep accumulating skills, which increases retrieval noise and context interference. Others pursue "zero-skill reasoning," trying to force every capability into the model's parameters and losing narrow but critical abilities in the process. To address this, the SLIM framework treats external skills as a dynamic capability system with a lifecycle, letting the model decide during reinforcement learning training whether each external skill should be retained, removed, or supplemented with new ones.

SLIM's basic mechanism is a closed loop. During training, the system retrieves general or task-specific skills based on the current state and updates the agent's decision policy with the GRPO (Group Relative Policy Optimization) algorithm. It then runs a "leave-one-skill-out" audit: it temporarily disables one skill and measures the resulting change in performance, which serves as that skill's marginal contribution. If performance drops significantly when the skill is disabled, the skill is retained (Retain). If its contribution stays consistently low, the model has either absorbed the ability or the skill is causing interference, so the skill is retired (Retire). For new scenarios in which the agent keeps failing, the system uses the expand (Expand) mechanism to summarize new skills from the failure cases and add them to the library.
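
The leave-one-skill-out audit described above can be illustrated with a small Python sketch. This is not the paper's implementation: the evaluator interface, the `threshold` and `patience` values, and the skill names are all hypothetical, chosen only to show how a marginal contribution might drive the Retain and Retire decisions. The Expand step and the GRPO policy update are omitted.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set

# Hypothetical interface: maps the set of active skills to a success rate in [0, 1].
Evaluator = Callable[[FrozenSet[str]], float]


def marginal_contributions(skills: Iterable[str], evaluate: Evaluator) -> Dict[str, float]:
    """Leave-one-skill-out: performance with all skills minus performance without each one."""
    active = frozenset(skills)
    baseline = evaluate(active)
    return {s: baseline - evaluate(active - {s}) for s in sorted(active)}


def audit_step(
    skills: Iterable[str],
    evaluate: Evaluator,
    low_streak: Dict[str, int],
    threshold: float = 0.02,  # assumed cutoff for a "significant" drop
    patience: int = 2,        # assumed number of low audits before retiring
) -> Set[str]:
    """Run one audit round and return the skills that stay active.

    low_streak is updated in place and counts consecutive low-contribution audits.
    """
    kept: Set[str] = set()
    for skill, gain in marginal_contributions(skills, evaluate).items():
        if gain >= threshold:
            low_streak[skill] = 0  # skill still matters: retain and reset
            kept.add(skill)
        else:
            low_streak[skill] = low_streak.get(skill, 0) + 1
            if low_streak[skill] < patience:
                kept.add(skill)    # low once is not yet "consistently" low
            # otherwise the skill is retired (left out of kept)
    return kept


def toy_evaluate(active: FrozenSet[str]) -> float:
    """Toy evaluator: only one of the two example skills changes the success rate."""
    score = 0.5
    if "open_container_first" in active:
        score += 0.3
    return score


streaks: Dict[str, int] = {}
library = {"open_container_first", "restate_goal"}
round1 = audit_step(library, toy_evaluate, streaks)
round2 = audit_step(round1, toy_evaluate, streaks)
print(sorted(round1), sorted(round2))
```

In this toy run, both skills survive the first audit, and the skill that never changes the success rate is retired after its second consecutive low reading. Requiring several low audits before retiring is one way to read "consistently low"; the actual criterion used by SLIM may differ.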

Experimental results show that the framework outperforms the best existing comparison methods by an average of 7.1 percentage points. On ALFWorld, a more action-oriented and complex household-task environment, SLIM reached an 87.5% success rate with a compact, efficiently managed skill set, 12.5 percentage points above the 75.0% of the baseline method SkillRL. On SearchQA tasks, which lean more toward information retrieval and reasoning, SLIM also remained strongly competitive, and the results supported the idea that the model can internalize some search strategies.

Industry analysts point out that the core value of SLIM lies in elevating the external skill library from a fixed auxiliary tool to a training object that is optimized jointly with the policy. It clarifies, at the technical level, which capabilities should be written into the model and which should remain external, and it lets LLM agents learn when to seek external support in complex, changing environments. In the analysts' view, this dynamic capability management paradigm lays theoretical and engineering groundwork for the next stage, in which embodied intelligence and LLM agents move toward large-scale industrial application.
