Keynote Speaker
Talk title: Collaborative Interdisciplinary ML-Centric Data Analytics at Scale
Abstract: In recent years, society has enjoyed the huge benefits of sharing and collaboration, as evidenced by popular services such as Google Docs, Dropbox, and GitHub. The trend is becoming more prevalent due to the increasing popularity of cloud-based computing and the new remote-work norm caused by the pandemic. In this talk we will discuss how to support this type of sharing and collaboration in ML-centric data analytics, especially for people from different disciplines with varied skills. Based on our own experience of working with domain scientists on social media analysis, we discuss the need to provide such support at typical steps in the life cycle of data analytics. We will present a feature “wish list” for data collection, cleaning, instance labeling, model training, and ETL. We then focus on one topic: run-time parallel execution of workflows with support for debugging and pausing/resuming. We will present promising initial results from the Texera project and identify future research directions.
Speaker bio: Chen Li is a professor in the Department of Computer Science at UC Irvine. He received his Ph.D. in Computer Science from Stanford University, and his M.S. and B.S. in Computer Science from Tsinghua University, China. His research interests are in the field of data management, including data-intensive computing, query processing and optimization, visualization, and text analytics. His current focus is building open-source systems for big data management and analytics. He was a recipient of an NSF CAREER Award and several test-of-time publication awards, a part-time visiting research scientist at Google, PC co-chair of VLDB 2015, and an ACM Distinguished Member. Since January 2020, he has been the Treasurer and a board member of the VLDB Endowment. He was a co-founder and CTO of a startup to commercialize his research results.