Evaluating LLM-based Applications

  • Published 2 Oct 2024
  • Evaluating LLM-based applications can feel like more of an art than a science. In this workshop, we'll give a hands-on introduction to evaluating language models. You'll come away with knowledge and tools you can use to evaluate your own applications, and answers to questions like:
    Where do I get evaluation data from, anyway?
    Is it possible to evaluate generative models in an automated way?
    What metrics can I use?
    What's the role of human evaluation?
    Talk by: Josh Tobin
    Here’s more to explore:
    LLM Compact Guide: dbricks.co/43W... Big Book of MLOps: dbricks.co/3r0...

COMMENTS • 14

  • @ndamulelosbg8887 7 months ago +1

    This is excellent coverage of the challenging task of LLM evaluation.

  • @AnandShah-ds 10 months ago +4

    Evaluations aside, I really enjoyed the presentation. I was hooked. Great storytelling skills, Josh. Thanks for sharing your experience. We count on volunteers like you to spread knowledge.

  • @alexisdamnit9012 18 days ago

    So happy to see someone summarizing the difficulty of evaluation of LLM applications so well

  • @ndamulelosbg8887 7 months ago +1

    "Your opinion on LLMs does not matter" - I found this to be a great quote

  • @vaishnavipatil3319 1 year ago +1

    Thank you for clarifying these concepts. I would like to see more videos from you on evaluation frameworks and methods.

  • @bharath_v 10 months ago

    Good One!

  • @manishsharma2211 1 year ago

    Good work

  • @threevia.travel 8 months ago

    Very generic; I expected something more tangible! It sounds like common sense, which might or might not work.

  • @asfandiyar5829 1 year ago +1

    Just what I was after. Thanks

  • @PJ-hi1gz 3 months ago

    Great talk, thanks for sharing

  • @SpartanPanda 1 year ago

    Great storyline