Yi Wu
Former OpenAI researcher and assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences
Yi Wu is an assistant professor and doctoral advisor at the Institute for Interdisciplinary Information Sciences, Tsinghua University, and the lead of AReaL, a reinforcement learning framework for intelligent agents. He received his Ph.D. from the University of California, Berkeley in 2019 and was previously a full-time researcher at OpenAI. His research focuses on reinforcement learning, reasoning models, and general-purpose agents. His representative works include the state-of-the-art multi-agent learning algorithms MAPPO and MADDPG, as well as OpenAI's multi-agent hide-and-seek project. His honors include the NeurIPS 2016 Best Paper Award, ICRA 2024 Best Demo Award Finalist, the WAIC 2025 Yunfan Award, and MIT Technology Review Asia-Pacific 35 Under 35.
Topic
AReaL: Fully Asynchronous Reinforcement Learning Framework for Intelligent Agents
Intelligent agents are the most important application form of large models in the AGI era, and reinforcement learning (RL) is the core technology for training general-purpose agent models. AReaL is an open-source RL training framework jointly developed by the Institute for Interdisciplinary Information Sciences at Tsinghua University and the Ant Technology Research Institute's Reinforcement Learning Lab. Through a fully asynchronous system design and an algorithm-centric approach, it aims to make Agent RL both the fastest and the most developer-friendly. In this talk, we will share the core challenges of Agent RL, the key technical ideas behind AReaL, fully asynchronous reinforcement learning techniques, and related best practices.

Learn more about the AReaL project: [https://github.com/inclusionAI/AReaL](https://github.com/inclusionAI/AReaL)

Outline:

1. The intersection of reinforcement learning and large models: RLHF, reasoning RL, and Agent RL
2. Challenges in Agent RL
3. AReaL: an agent-focused RL framework that uses fully asynchronous reinforcement learning to achieve the fastest RL training, delivering a 3× speedup on reasoning RL and a 3–5× speedup in agent search scenarios
4. AReaL-lite: the latest version of AReaL, featuring an algorithm-centric design for the most developer-friendly Agent RL framework
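To give a flavor of the "fully asynchronous" idea behind the talk, the sketch below shows a generic decoupled producer/consumer RL loop: rollout workers keep generating trajectories while the trainer consumes batches as soon as they are ready, so neither side waits for the other. This is a minimal illustrative sketch of the general pattern, not AReaL's actual API; all names and the bounded-queue choice are assumptions.

```python
import queue
import random
import threading

def rollout_worker(traj_queue, stop_event, worker_id):
    """Generate trajectories continuously, independent of the training loop."""
    step = 0
    while not stop_event.is_set():
        # Stand-in for LLM generation / agent-environment interaction.
        trajectory = {"worker": worker_id, "step": step, "reward": random.random()}
        try:
            # Bounded put with a timeout so the worker can notice shutdown.
            traj_queue.put(trajectory, timeout=0.1)
            step += 1
        except queue.Full:
            continue

def trainer(traj_queue, stop_event, num_updates, batch_size):
    """Consume whichever trajectories are ready; never wait for stragglers."""
    updates = 0
    while updates < num_updates:
        batch = [traj_queue.get() for _ in range(batch_size)]
        # Stand-in for a PPO-style policy update on the batch.
        _mean_reward = sum(t["reward"] for t in batch) / len(batch)
        updates += 1
    stop_event.set()  # signal the rollout workers to shut down
    return updates

stop = threading.Event()
# A bounded queue caps how far rollouts can run ahead of the policy,
# limiting trajectory staleness (an assumed design choice here).
q = queue.Queue(maxsize=64)
workers = [threading.Thread(target=rollout_worker, args=(q, stop, i)) for i in range(4)]
for w in workers:
    w.start()
done_updates = trainer(q, stop, num_updates=10, batch_size=8)
for w in workers:
    w.join()
```

In a real system the queue would hold model-generated trajectories and the trainer would apply off-policy corrections for the staleness that asynchrony introduces; the talk's 3× and 3–5× speedup claims come from removing the synchronization barrier this pattern avoids.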