Or Sharir - Efficiency in the Age of Large Scale Models

  • Published 15 Sep 2024
  • Presented on Thursday, February 8th, 2024, 10:30 AM, room C221
    Speaker
    Or Sharir (HUJI)
    Title
    Efficiency in the Age of Large Scale Models
    Abstract:
    Since the early days of the “deep learning revolution”, circa 2012, there has been a clear push to scale up models in the hope of reaching better performance. The machine learning community went from training models with on the order of millions of parameters to about a trillion parameters in little more than a decade. As models grew in scale, in both compute and data size, breakthrough results were obtained, unlocking unprecedented, nearly human-like abilities across different domains and tasks (image recognition, translation, image generation, speech to/from text, coding skills, and more).
    However, large-scale deep learning models are both a blessing and a curse. The astronomical costs (both monetary and computational) mean that state-of-the-art models remain out of reach for all but the largest companies in the world. Moreover, local inference is either impractical or comes with diminished performance, leading to remote inference that raises privacy concerns.
    In this talk, I will discuss both the theoretical and practical aspects of designing more efficient models and algorithms, maintaining the same performance metrics while reducing costs. On the theory side, we will examine how different architectural choices affect the scaling of models in terms of expressiveness (what functions they can represent). On the practical side, we will see how we can achieve improved efficiency by leveraging domain-specific knowledge (NLP, quantum physics). Finally, I will present a novel approach for achieving efficient inference in large-scale models. Rather than compressing models (distillation, pruning, low precision), I propose a new incremental computation approach allowing for up to a 100X reduction in the computational cost of running inference on large-scale language models.
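    The abstract does not spell out the mechanism, so the following is only a minimal, hypothetical sketch of the general idea of incremental computation for inference, not the speaker's method: when an input is re-processed after a small edit, any per-prefix computation that was already done can be reused, and only positions after the first point of divergence are recomputed. The class and function names below are illustrative assumptions.

    # Toy illustration (not the speaker's method): incremental reuse of
    # per-prefix computation when an input is re-processed after a small edit.
    # Assumes the computation is causal, so states for a shared prefix are
    # identical and can be reused verbatim.

    def expensive_prefix_state(prefix: tuple) -> float:
        """Stand-in for a costly per-position computation (e.g., one decoder step)."""
        return float(sum(hash(tok) % 1000 for tok in prefix))

    class IncrementalRunner:
        def __init__(self):
            self.cached_tokens: list = []   # tokens seen in the previous call
            self.cached_states: list = []   # per-prefix states computed for them

        def run(self, tokens: list) -> list:
            # Find the longest prefix shared with the previous input.
            shared = 0
            while (shared < len(tokens) and shared < len(self.cached_tokens)
                   and tokens[shared] == self.cached_tokens[shared]):
                shared += 1
            states = self.cached_states[:shared]
            # Only positions after the first divergence are recomputed.
            for i in range(shared, len(tokens)):
                states.append(expensive_prefix_state(tuple(tokens[: i + 1])))
            self.cached_tokens, self.cached_states = list(tokens), states
            return states

    runner = IncrementalRunner()
    runner.run("the cat sat on the mat".split())   # full computation
    runner.run("the cat sat on the rug".split())   # recomputes only the edited position

    In this toy setting, the second call touches a single position instead of the whole sequence; the claimed 100X savings in the talk presumably come from a far more sophisticated scheme applied to real language models.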
    Bio:
    Or graduated from the Hebrew University of Jerusalem with a B.Sc. in physics, mathematics, and computer science, before completing his Ph.D. in machine learning under the supervision of Prof. Amnon Shashua. His work ranges from fundamental theoretical questions in machine learning to practical applications in various domains, including computer vision and natural language understanding. In parallel with his graduate studies, Or also worked at AI21 Labs on, among other things, training their 178B-parameter language model, the first model competitive with GPT-3. In 2021 he joined Caltech, working jointly with Prof. Anima Anandkumar and Prof. Garnet Chan on methods at the intersection of machine learning and quantum many-body problems, as well as on efficient inference for large language models.

    Link to the Panopto recordings
    huji.cloud.pan...
    Link to past lectures
    / @hujimachinelearningcl...
    Online Calendar
    Learning Club @ HUJI
    www.google.com...
    Calendar ID: pqvap0i55u7ctv9uv3mblf152o@group.calendar.google.com
    Mailing List
    subscription-manager: mailman.cs.huji...
