Shaobo Zhang

Senior Algorithm Engineer, CodeGeeX, Zhipu AI

He graduated from the University of Texas with a master's degree in software engineering in 2017 and has worked on NLP research ever since, focusing on model training and bringing models into production. He joined Zhipu AI in 2023, where he is responsible for training the CodeGeeX code model and embedding model, and led the rollout of CodeGeeX features such as project-level intelligent Q&A and web-connected search. He currently leads cutting-edge algorithm development on the CodeGeeX team, aiming to advance the innovation and application of large model technology in the code domain.

Topic

CodeGeeX-based AI Coding Practice and Exploration

With the rise of large language models, researchers are increasingly exploring their application in vertical domains such as code. In this domain, large models have achieved remarkable success in tasks such as code completion and intelligent Q&A. However, they still show clear limitations in some scenarios, especially in software engineering: how to use large models to understand an entire project, and how to automatically fix vulnerabilities in a codebase. In this talk we introduce some of our practices and explorations in these scenarios, and we hope this work offers insights that help advance large code models in software engineering.

Outline

This presentation is divided into three main parts:

1. The development of AI technology in the programming field.
2. The CodeGeeX team and its main achievements at the model and product level.
3. The CodeGeeX team's practices and explorations in the field of software engineering.