What Table Representation Learning Brings to Data Systems by Madelon Hulsebos (Dijkstra Award 2024)
- Published Dec 8, 2024
- Speaker: Madelon Hulsebos
Abstract:
We all observe the impressive capabilities of representation learning and generative models for text, videos, and images on a daily basis. Structured data, such as tables in relational databases, has long been overlooked, however, despite its prevalence in the organizational data landscape and its critical use in high-value applications and decision-making processes. Learned representations, or embeddings, that capture the semantics of structured data can play a key role in making data systems more efficient, robust, and accurate at scale. Models that generalize to real-world databases are critical to making this work. In this context, I will discuss how rather compact and specialized column embeddings can be more effective than using GPT-something for table understanding, and reflect on the importance of capturing the core properties of relational databases in the embedding space. I will close by illustrating the value of embeddings for table retrieval to make LLM-powered query interfaces to structured data truly useful.
Biography:
Dr. ir. Madelon Hulsebos is a tenure-track researcher at CWI in Amsterdam. Prior to that, she was a postdoctoral fellow at UC Berkeley. She received her PhD from the University of Amsterdam, for which she did research at MIT and Sigma Computing. Her general research interest lies at the intersection of machine learning and data management, currently focusing on Table Representation Learning to democratize insights from structured data. Madelon founded the Table Representation Learning workshop at NeurIPS and leads various other efforts in this space. She was awarded a BIDS-Accenture fellowship for her postdoctoral research on retrieval systems for structured data at UC Berkeley, as well as a 5-year AiNed fellowship grant.
Website: www.cwi.nl/en/...
About the Dijkstra Fellowship
The Dijkstra Fellowship is named after former CWI researcher Edsger W. Dijkstra, who was one of the most influential scientists in the history of CWI. Dijkstra developed the shortest path algorithm, among other contributions. The first Dijkstra Fellowships were awarded to David Chaum and Guido van Rossum in 2019.
Dijkstra Fellowship 2024 for Marcin Żukowski
Marcin Żukowski started his career at CWI. He did his MSc and PhD research on database management system architectures in our Database Architectures (DA) group. As a PhD student under the supervision of Peter Boncz, he developed the innovative concept of vectorized execution to improve the performance of database queries. This research received the DaMoN 2007 Best Paper Award and also the CIDR 2024 Test of Time Award, established by the Conference on Innovative Data Systems Research (CIDR).