Huang Haoyang
Head of the Multimodal Foundation Model Team, JD Group
Huang Haoyang is the Head of the Multimodal Foundation Model Team at the Image and Multimodal Laboratory, Exploratory Research Institute, JD Group. He has published more than 30 papers in top-tier AI journals and conferences. He previously led the development of multilingual and multimodal foundation models at Microsoft Research Asia, with applications in Microsoft Bing Search and Microsoft Translator. He spearheaded the release of Unicoder, which covers 100 languages, and of M3P, the world’s first multilingual multimodal pre-trained model, and he took first place globally in the WMT21 Large-Scale Multilingual Machine Translation competition. In 2024, he led the development of the 30B-parameter Step-Video series of video generation foundation models (Step-Video-T2V and Step-Video-TI2V), which were subsequently open-sourced.