Abstract: Querying high-dimensional vectors has been an active research topic for several decades. It is an essential operation in a wide range of applications (e.g., classification and regression, data integration, image/video retrieval, and recommender systems). Recently, representation learning, auto-encoding methods, and pre-trained models have gained popularity. These methods work with dense high-dimensional vectors, which brings new challenges and opportunities to high-dimensional query processing. Meanwhile, new techniques have emerged to tackle this long-standing problem both theoretically and empirically, and new software has been developed for query processing at scale. This talk provides a brief review of solutions to querying high-dimensional vectors. It first discusses the challenges in real-world applications, and then reviews prevalent query processing techniques such as locality-sensitive hashing, product quantization, and proximity graphs. Finally, it introduces libraries and systems that have been developed recently for efficient query processing of high-dimensional vectors.
Speaker bio: Chuan Xiao is an Associate Professor at Osaka University and a Guest Associate Professor at Nagoya University. He received his B.E. degree from Northeastern University (China) in 2005 and his Ph.D. degree from the University of New South Wales in 2010. He is a recipient of the Kambayashi Young Researcher Award of the Database Society of Japan. His research interests include high-dimensional data management, data cleaning, data integration, spatio-temporal databases, and information retrieval.