Yi Wu
Former OpenAI researcher and Assistant Professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University
Yi Wu is an Assistant Professor at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University. Before returning to China he was a full-time researcher at OpenAI. His research interests include deep reinforcement learning, multi-agent learning, reasoning models, and human-computer interaction. He received his Ph.D. from the University of California, Berkeley, in 2019 under the supervision of Prof. Stuart Russell, and his undergraduate degree in 2014 from the Experimental Class of Computer Science (Yao Class) at IIIS, Tsinghua University. His representative works include Value Iteration Networks, an early work on generalization in reinforcement learning; MAPPO and MADDPG, among the most highly cited multi-agent learning algorithms; and the OpenAI multi-agent hide-and-seek project. He received the Best Paper Award at NIPS 2016 and was a Best Demo Award finalist at ICRA 2024.
Topic
AReaL: a Flexible and Efficient Open-Sourced RL System for Large Reasoning Model
With the o1/R1 series of models gaining widespread attention, reasoning models have become an important paradigm for scaling laws on the road to AGI, and reinforcement learning is a key engine driving that paradigm forward. However, reinforcement learning algorithms are more complex and involve more modules than traditional deep learning, which makes it very challenging to build a training system suited to them. This talk introduces AReaL, a training system for reasoning models and reinforcement learning developed by Tsinghua University and Ant Research, and describes how AReaL addresses the challenges unique to reinforcement learning.