Yi Wu

Assistant Professor at Tsinghua University's Institute for Interdisciplinary Information Sciences and former OpenAI researcher

Yi Wu is an Assistant Professor and Ph.D. supervisor at the Institute for Interdisciplinary Information Sciences, Tsinghua University, and leads AReaL, a reinforcement learning framework for intelligent agents. He received his Ph.D. from the University of California, Berkeley in 2019 and was formerly a full-time researcher at OpenAI. His research focuses on reinforcement learning, reasoning models, and general-purpose agents. His representative work includes the state-of-the-art multi-agent learning algorithms MAPPO and MADDPG and OpenAI's multi-agent hide-and-seek project. His awards include the NeurIPS 2016 Best Paper Award, ICRA 2024 Best Demo Award Finalist, the WAIC 2025 Yunfan Award, and MIT Technology Review Asia-Pacific 35 Under 35.

Topic

AReaL: A Flexible and Efficient Open-Source RL System for Large Reasoning Models

With the o1/R1 series of models breaking into the mainstream, reasoning models have become an important paradigm on the scaling-law path to AGI, and reinforcement learning is a key engine driving that paradigm. However, reinforcement learning algorithms are more complex and involve more modules than traditional deep learning, so building a training system suited to them is a major challenge. This talk introduces AReaL, a training system for reasoning models and reinforcement learning developed by Tsinghua University and Ant Research Institute, and explains how AReaL addresses the challenges unique to reinforcement learning.
