
Bingbing Ni

Professor, Shanghai Jiao Tong University

Bingbing Ni is a professor and Ph.D. supervisor at Shanghai Jiao Tong University. From 2010 to 2015 he worked as a research scientist at the Advanced Digital Sciences Center (ADSC) in Singapore, a research center of the University of Illinois at Urbana-Champaign. He received his B.S. degree in Electrical Engineering from Shanghai Jiao Tong University in 2005 and his Ph.D. degree in Electrical and Computer Engineering from the National University of Singapore in 2011. During his Ph.D., he worked at Microsoft Research Asia and Google Inc. His research interests include computer vision, machine learning, and multimedia computing, with expertise in AI-generated content (AIGC), video understanding, and on-device intelligence; he is among the earliest scholars in China to study intelligent generation of visual content. Dr. Ni has published more than 200 papers, including more than 20 in top artificial intelligence journals such as IEEE T-PAMI and IJCV, and more than 100 in Class A conferences recommended by the China Computer Federation (CCF), such as CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, and MICCAI. His publications have received more than 14,000 Google Scholar citations, with an h-index of 60, and he was recognized as an Elsevier Highly Cited Researcher in 2022. Dr. Ni has served as an area chair of ICCV 2019, ECCV 2020, ECCV 2022, CVPR 2023, and NeurIPS 2023, top international conferences in computer vision and machine learning.

Topic

Vectorized Representation of Visual Objects and Content Generation Techniques

The introduction of new frameworks such as Stable Diffusion and Sora has greatly improved the visual quality of generative AI across images, graphics, and video. However, black-box deep network models that aim to fit probability distributions cannot fundamentally eliminate semantic and structural errors or imprecise, unrealistic details, and users find it difficult to precisely control the generated results at every level of granularity. In addition, visual data is extremely high-dimensional, and the data representations currently used in generative models cannot simultaneously balance representation efficiency, computational density, rendering quality, and manipulation flexibility. To address these challenges, we propose a new framework for vectorized representation and content generation of visual objects: visual content (images, videos, 2D/3D graphics, etc.) is semantically and hierarchically deconstructed into visual objects instantiated at different levels of granularity; the internal shape space of each object instance is given a distributed parameter representation guided by semantic widgets; and each visual attribute channel is given a combinatorial parameter representation based on neural primitives. This representation paradigm achieves excellent performance in compact and scalable representation of the shape, texture, material, motion, and other attributes of 2D/3D visual content, in high-quality reconstruction and rendering, and in fine-grained editing and interaction, offering a promising path for the future evolution of intelligent media generation frameworks.
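The abstract describes the framework only at a high level, so the sketch below is a hypothetical illustration of how its three ingredients — hierarchical object instances, semantic-widget-guided shape parameters, and per-channel combinations of neural primitives — might fit together as data structures. All names here (VisualObject, SemanticWidget, NeuralPrimitive) and the Gaussian stand-in for a learned primitive are assumptions for illustration, not the speaker's actual implementation.

```python
# Hypothetical sketch of a hierarchical vectorized representation; class names
# and the Gaussian primitive are illustrative assumptions, not the real system.
from dataclasses import dataclass, field
import numpy as np


@dataclass
class SemanticWidget:
    """A named control handle over one region of an object's shape space."""
    name: str            # e.g. "hood", "left_sleeve"
    params: np.ndarray   # distributed shape parameters for this region


@dataclass
class NeuralPrimitive:
    """One basis element of an attribute channel (texture, material, motion)."""
    weights: np.ndarray  # primitive parameters (here: center + scale)
    coefficient: float   # mixing weight in the combinatorial representation

    def evaluate(self, coords: np.ndarray) -> np.ndarray:
        # Toy stand-in for a learned function: a Gaussian radial basis.
        center, scale = self.weights[:-1], self.weights[-1]
        d = np.linalg.norm(coords - center, axis=-1)
        return self.coefficient * np.exp(-(d / scale) ** 2)


@dataclass
class VisualObject:
    """One object instance at a given granularity level of the hierarchy."""
    label: str
    widgets: list[SemanticWidget] = field(default_factory=list)
    channels: dict[str, list[NeuralPrimitive]] = field(default_factory=dict)

    def render_channel(self, name: str, coords: np.ndarray) -> np.ndarray:
        # Combinatorial representation: the channel is a sum of primitives.
        return sum(p.evaluate(coords) for p in self.channels[name])

    def edit_widget(self, widget_name: str, delta: np.ndarray) -> None:
        # Fine-grained editing: perturb only one semantic region's parameters.
        for w in self.widgets:
            if w.name == widget_name:
                w.params = w.params + delta


# Usage: a "car" instance with one shape widget and a two-primitive texture channel.
car = VisualObject(
    label="car",
    widgets=[SemanticWidget("hood", params=np.zeros(8))],
    channels={"texture": [
        NeuralPrimitive(weights=np.array([0.2, 0.3, 0.5]), coefficient=1.0),
        NeuralPrimitive(weights=np.array([0.7, 0.6, 0.4]), coefficient=0.5),
    ]},
)
query = np.random.rand(16, 2)               # 16 query points in 2D
texture = car.render_channel("texture", query)
car.edit_widget("hood", delta=np.full(8, 0.1))
print(texture.shape)                        # (16,)
```

The point of the structure is that edits stay local: changing one widget's parameters or one primitive's coefficient alters a single region or attribute, which is what makes fine-grained, all-granularity control plausible compared with editing a monolithic latent code.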
