"Towards Speech Foundation Models" - Jungo Kasai, Research at TTIC

  • Published Jun 10, 2024
  • Originally presented on: Friday, May 24th, 2024 at 12:30am CT, TTIC, 6045 S. Kenwood Avenue, 5th Floor, Room 530
    Title: "Towards Speech Foundation Models"
    Speaker: Jungo Kasai, TTIC
    Abstract: The success of large language models (LLMs) has been foundational for advancements in natural language processing and artificial intelligence, leading to widespread usage and deployment across various business applications. Yet, the potential of multimodal models, such as speech models, remains less explored. We argue that speech models are at a stage comparable to text models in the era of GPT-1 in 2017, and we predict that speech foundation models will soon become dominant in the field. This presentation emphasizes the critical challenges and the recent technical advancements that are instrumental in the development of speech foundation models.
    Tags: #largelanguagemodels #llms #computerscience #robotics #ai #artificialintelligence
