Longfei Li
Senior Algorithm Expert, Ant Group
He has worked at Ant Group for ten years, and his main research directions include logic learning, causal learning, automated machine learning, and large models. He has published more than 70 papers at conferences such as NeurIPS, ICML, KDD, and SIGIR. He has led or contributed to many core platforms and projects within Ant, participated in the development of the Ant Bailing large model, and led the development of Flood, an offline inference framework for large models: https://github.com/alipay/PainlessInferenceAcceleration. His honors include the 2020 CCF Science and Technology Progress Excellence Award and the first prize of the 2023 Wu Wenjun Award for Scientific and Technological Progress, among others.
Topic
Performance-Driven Exploration of Large Model Architectures: Network Architecture and Inference Architecture
In recent years, large language models have improved greatly in capability, but in practice they face an important problem: cost. Serving them at lower cost is therefore an important direction. To address this, Ant has made attempts in both the design and development of the inference architecture and the exploration of the network architecture. On the inference side, we redesigned the KV cache and scheduling strategy around our specific business workloads and developed the Flood framework, which performs well in offline inference. On the network-architecture side, we have explored directions such as MoE and linear models and accumulated some experience. We will share our work in both directions.
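For readers unfamiliar with the problem setting, the sketch below illustrates the general kind of decision an offline inference scheduler makes: admit requests into a batch only while a KV-cache budget allows it, reordering freely since offline serving optimizes throughput rather than latency. This is a minimal, hypothetical illustration, not the Flood implementation; all names (KVCachePool, schedule_offline_batch, BLOCK_TOKENS) and the block-based budgeting scheme are assumptions for the example.

```python
# Hypothetical sketch of offline-inference admission scheduling under a KV-cache budget.
# Not the Flood framework's actual design; purely illustrative.

from dataclasses import dataclass
from typing import List

BLOCK_TOKENS = 16      # tokens per KV-cache block (assumed granularity)
TOTAL_BLOCKS = 4096    # total blocks available on the device (assumed budget)


@dataclass
class Request:
    req_id: int
    prompt_len: int
    max_new_tokens: int

    def blocks_needed(self) -> int:
        # Worst-case KV-cache footprint: prompt tokens plus all generated tokens.
        total = self.prompt_len + self.max_new_tokens
        return -(-total // BLOCK_TOKENS)  # ceiling division


class KVCachePool:
    """Tracks free KV-cache blocks; offline scheduling only needs admission control."""

    def __init__(self, total_blocks: int):
        self.free_blocks = total_blocks

    def try_reserve(self, blocks: int) -> bool:
        if blocks <= self.free_blocks:
            self.free_blocks -= blocks
            return True
        return False


def schedule_offline_batch(requests: List[Request], pool: KVCachePool) -> List[Request]:
    """Greedily pack requests into one batch, shortest prompts first.

    Offline (throughput-oriented) serving can reorder requests freely, unlike
    latency-sensitive online serving, so a simple length-sorted greedy pass is
    enough to illustrate the idea.
    """
    batch: List[Request] = []
    for req in sorted(requests, key=lambda r: r.prompt_len):
        if pool.try_reserve(req.blocks_needed()):
            batch.append(req)
    return batch


if __name__ == "__main__":
    reqs = [Request(i, prompt_len=128 * (i + 1), max_new_tokens=256) for i in range(8)]
    batch = schedule_offline_batch(reqs, KVCachePool(TOTAL_BLOCKS))
    print("admitted:", [r.req_id for r in batch])
```

The design point the example highlights is that offline inference trades latency for throughput: because no request is latency-critical, the scheduler can sort and pack requests to keep the KV cache as full as possible.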