An AI stack: from cloud orchestration to LLM evaluation

  • Published Oct 3, 2024
  • Ion Stoica, Professor of Electrical Engineering and Computer Science at the University of California, Berkeley, delivers a CSE Distinguished Lecture about his research on March 13, 2024.
    Abstract: Since the release of ChatGPT just over a year ago, large language models (LLMs) have taken the world by storm: they have enabled new applications, exacerbated the GPU shortage, and raised new questions about the veracity of their answers. In this talk, I will present several projects I have been working on over the past three years, which are now part of an open-source stack for training, fine-tuning, serving, and evaluating LLMs. I will focus on three of them: (i) SkyPilot, a broker architecture that makes it easy to run AI workloads on a variety of clouds to improve availability, cost, and performance; (ii) vLLM, a high-throughput inference engine for LLMs; and (iii) Chatbot Arena, a system to accurately benchmark LLMs.
