Chengqiang Lu

Head of AI Search Generation Algorithms at Xiaohongshu

Chengqiang Lu is Head of AI Search Generation Algorithms at Xiaohongshu. He received his master's degree from the University of Science and Technology of China. His research covers pre-training and post-training of large models, agents, RAG, and AI for science (AI4Sci). He previously worked at Tencent QLab, the Alibaba Qwen team, and Xiaohongshu Search, focusing on algorithm research and real-world applications. He has published dozens of papers at top conferences and in journals, including NeurIPS, KDD, AAAI, and ACL, with around 6,000 citations on Google Scholar.

Topic

Aligning AI Search Agents at Xiaohongshu with Adaptive Reinforcement Learning

AI search agents must dynamically balance multiple objectives, including factual accuracy, safety, information richness, and user experience, which poses new challenges for reinforcement learning (RL)-based alignment. This talk presents an end-to-end RL alignment approach for AI search agents built on adaptive curriculum learning. The framework monitors the learning curve of each reward dimension and the marginal contribution of the training data; during training it automatically adjusts optimization priorities and data weights, so that multi-objective optimization tracks the model's learning progress. Using this methodology, we built the Xiaohongshu SearchLLM Agent and designed a hierarchical, multi-dimensional reward system that combines rule-based checks, LLM-based evaluation, and gated aggregation. Optimized with GRPO together with adaptive curriculum learning, the system achieves significant improvements on both offline and online metrics.
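
To make the moving parts of the abstract concrete, here is a minimal Python sketch of three ideas it names: gated aggregation of per-dimension rewards, curriculum-style reweighting driven by learning curves, and the group-relative advantage used by GRPO. It is an illustration under stated assumptions, not the speaker's implementation. The function names, the dimension names, the gate thresholds, and the rule "give more weight to dimensions whose reward is still rising" are all hypothetical; the abstract does not say how priorities are adjusted.

```python
from statistics import mean, pstdev


def gated_reward(scores, weights, gates):
    """Aggregate per-dimension scores into one scalar reward.

    Assumption: a "gate" is a hard threshold. If any gated dimension
    (for example safety) scores below its threshold, the reward is 0.
    Otherwise the reward is the weighted mean of all dimensions.
    """
    for dim, threshold in gates.items():
        if scores[dim] < threshold:
            return 0.0
    total = sum(weights.values())
    return sum(weights[d] * scores[d] for d in weights) / total


def adapt_weights(history, window=3, floor=0.05):
    """Reweight dimensions from their recent learning curves.

    history maps each dimension to its list of mean rewards per step.
    Assumed heuristic: weight is proportional to the recent gain
    (clipped at 0) plus a small floor, so plateaued dimensions keep
    a nonzero share.
    """
    gains = {}
    for dim, curve in history.items():
        recent = curve[-window:]
        gain = recent[-1] - recent[0] if recent else 0.0
        gains[dim] = max(gain, 0.0) + floor
    total = sum(gains.values())
    return {dim: g / total for dim, g in gains.items()}


def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: standardize the rewards of the
    responses sampled for one prompt by the group mean and std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    history = {
        "accuracy": [0.5, 0.6, 0.7],   # still improving
        "safety": [0.9, 0.9, 0.9],     # plateaued
        "richness": [0.4, 0.45, 0.5],  # improving slowly
    }
    weights = adapt_weights(history)
    gates = {"safety": 0.5}

    # Four sampled responses for one prompt, scored per dimension.
    group = [
        {"accuracy": 0.9, "safety": 1.0, "richness": 0.8},
        {"accuracy": 0.7, "safety": 0.2, "richness": 0.9},  # fails the gate
        {"accuracy": 0.6, "safety": 0.9, "richness": 0.5},
        {"accuracy": 0.8, "safety": 0.8, "richness": 0.7},
    ]
    rewards = [gated_reward(s, weights, gates) for s in group]
    print(weights)
    print(rewards)
    print(grpo_advantages(rewards))
```

In this sketch the gate keeps a response that fails a hard constraint from earning reward through its other dimensions, while the reweighting step shifts optimization toward dimensions that are still improving. A production system would feed the resulting advantages into the GRPO policy-gradient update, which is omitted here.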

© boolan.com 博览 (Boolan). All rights reserved.
