How language model post-training is done today

  • Published 24 Jan 2025

COMMENTS • 7

  • @bharathrajm9318 10 days ago

    Thank you for sharing such valuable content Nathan!

  • @broyojo 17 days ago +2

    such a valuable video, thank you for this great content

  • @Pingu_astrocat21 16 days ago

    Thank you for sharing! So much to learn.

  • @owcsc 13 days ago

    Tysm❤🎉

  • @420_gunna 21 days ago

    If there's something I'm curious to learn more about from you re: post-training, it would be your thoughts on tool use. I basically don't know much more than the Gorilla/Toolformer papers. I think the topic sort of dovetails with reasoning/verification when you're talking about tools like calculators and code executors, but there are also search/retrieval-type tools. I remember that some months ago you weren't as into RAG as I would have expected; I wonder if that's changed at all.
    I too have always loved CAI and am excited to play (maybe with little SmolLM models) in these "not-traditionally-verifiable domains" using RL -- character-related things!