Nan Wang

Co-founder and CTO of Jina AI

Dr Nan Wang, co-founder and CTO of Jina AI, holds a PhD in Computational Neuroscience from the University of Bochum, Germany. He went on to work as a senior algorithm engineer at Zalando, a leading European e-commerce company, and at Tencent, where he was responsible for search and recommendation and gained extensive experience in designing, implementing, and deploying models in these areas. Since co-founding Jina AI in 2020, Dr Nan Wang has led the team in developing and open-sourcing the neural search framework Jina. As a TAC member of the Linux Foundation AI & Data, he facilitated DocArray's graduation from the Linux Foundation AI & Data. He has also led the development and open-sourcing of several text and multimodal embedding models, which have accumulated over 10 million downloads worldwide. Dr Nan Wang is passionate about the practical application of AI in search and actively promotes the open-source development of AI technology. In recognition of his contributions to AI technology, he was named one of the 33 Chinese Open Source Pioneers of 2023.

Topic

Practices, Challenges and Developments in Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is one of the key approaches to addressing hallucination in large language models. This talk looks at the opportunities and challenges of the RAG paradigm from the perspective of an open-source startup. Dr Nan Wang will give an in-depth introduction to Jina AI's work under the RAG paradigm, including its latest work on text embedding models, reranking models, multimodal embedding models, RAG evaluation, and chunking. He will also propose directions for the next phase of RAG development in light of the challenges in current RAG practice.

Outline:

- The RAG paradigm and opportunities for startups
- Jina AI's work in RAG
  - The text embedding model jina-embeddings
  - The reranking model jina-reranker
  - The multi-vector model jina-colbert
  - The multimodal embedding model jina-clip
  - The RAG evaluation benchmark AIR-Bench
  - The chunking solution late chunking
  - The web parsing tool jina-reader
- Main challenges and future directions of RAG
  - Semantic overflow and contextual truncation