Ondřej Dušek: Skipping Chit-chat with ChatGPT: Large Language Models and Structured Outputs

  • Published 18 Oct 2024
  • The current state of the art in text generation is large language models (LLMs), pretrained on vast amounts of text and finetuned to produce solutions given instructions. LLMs represent significant progress: a user can request outputs for various tasks by stating a query in natural language, and the models can follow examples provided by the user (in-context learning/prompting) without further training (finetuning) on task-specific data. However, they retain some of the problems of the previous generation of language models, in particular their opacity and lack of controllability. This talk presents experiments on using LLMs with prompting only for multiple tasks: data-to-text generation, task-oriented dialogue, and dialogue evaluation. All of these tasks operate with structure (structured data input, structured outputs, structured dialogue), which is not what these LLMs were specifically pretrained for. I show that LLMs are usable for these tasks, but I also point out their limitations and potential areas of improvement.
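    As a rough illustration of the in-context learning setup the abstract describes, the sketch below assembles a few-shot prompt for data-to-text generation: worked examples of structured input paired with verbalizations, followed by a new input for the model to complete. The triple format, prompt wording, and example data are illustrative assumptions, not taken from the talk itself.

    ```python
    # Hypothetical sketch of few-shot prompting for data-to-text generation:
    # the LLM receives worked examples plus a new structured input, with no
    # finetuning. Data and prompt format are assumptions for illustration.

    def linearize(triples):
        """Flatten (subject, predicate, object) triples into one prompt line."""
        return "; ".join(f"{s} | {p} | {o}" for s, p, o in triples)

    def build_prompt(examples, new_input):
        """Assemble a few-shot prompt: instruction, examples, then new input."""
        parts = ["Verbalize the structured data as one fluent sentence.\n"]
        for triples, text in examples:
            parts.append(f"Data: {linearize(triples)}\nText: {text}\n")
        parts.append(f"Data: {linearize(new_input)}\nText:")
        return "\n".join(parts)

    examples = [
        ([("Alan Bean", "occupation", "astronaut"),
          ("Alan Bean", "birthPlace", "Wheeler, Texas")],
         "Alan Bean was an astronaut born in Wheeler, Texas."),
    ]
    new_input = [("Vienna", "country", "Austria")]

    prompt = build_prompt(examples, new_input)
    print(prompt)
    ```

    The assembled string would then be sent to an LLM; the model is expected to continue after the final "Text:" with a verbalization of the new input, guided only by the in-context examples.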
    👉 More information about the lecture series "Machines that understand?": dm.cs.univie.a...
    👉 Research Group Data Mining and Machine Learning at the University of Vienna: dm.cs.univie.a...
    👉 Playlist Machines that understand? • Was bedeutet Generativ...