Why Evals Matter | LangSmith Evaluations - Part 1

  • Published 7 Apr 2024
  • With the rapid pace of AI, developers often face a paradox of choice: how to choose the right prompt, and how to trade off LLM quality against cost. Evaluations can accelerate development by providing a structured process for making these decisions. But we've heard that it is challenging to get started, so we are launching a series of short videos that explain how to perform evaluations using LangSmith.
    This video lays out four main considerations for evaluation: (1) dataset, (2) evaluator, (3) task, and (4) how to apply evaluation to improve your product (e.g., unit tests, A/B tests, etc.).
    Getting started documentation:
    docs.smith.langchain.com/eval...
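
    The four considerations above can be sketched in plain Python. This is a minimal illustration, not LangSmith's API: it deliberately avoids the LangSmith SDK, and every name in it (`DATASET`, `toy_task`, `exact_match`, `run_eval`, `passes_threshold`) and the 0.6 threshold are hypothetical, chosen only to show how the pieces fit together.

```python
from typing import Callable

# (1) Dataset: hypothetical input/reference pairs (assumed for illustration).
DATASET = [
    {"input": "2 + 2", "reference": "4"},
    {"input": "3 * 5", "reference": "15"},
    {"input": "10 - 7", "reference": "3"},
]


# (3) Task: a stand-in for the LLM call or chain under test.
def toy_task(question: str) -> str:
    # Deliberately limited: handles only "+" and "*", so one example fails.
    left, op, right = question.split()
    if op == "+":
        return str(int(left) + int(right))
    if op == "*":
        return str(int(left) * int(right))
    return "unknown"


# (2) Evaluator: a ground-truth comparison that returns 1.0 on an exact match.
def exact_match(prediction: str, reference: str) -> float:
    return 1.0 if prediction.strip() == reference.strip() else 0.0


def run_eval(
    task: Callable[[str], str],
    evaluator: Callable[[str, str], float],
    dataset: list[dict[str, str]],
) -> float:
    # Run the task on every example and average the evaluator scores.
    scores = [evaluator(task(ex["input"]), ex["reference"]) for ex in dataset]
    return sum(scores) / len(scores)


# (4) Application: a unit-test-style gate on the aggregate score.
def passes_threshold(score: float, minimum: float = 0.6) -> bool:
    return score >= minimum


score = run_eval(toy_task, exact_match, DATASET)
print(f"mean score: {score:.2f}, passes: {passes_threshold(score)}")
```

    Swapping `toy_task` for a different prompt or model while holding the dataset and evaluator fixed gives a simple A/B comparison of the two mean scores.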

COMMENTS • 4

  • @chaitanyagoel9837 (8 days ago)

    🎯 Key points for quick navigation:
    00:00 *🎥 Introduction to Evaluations*
    - Introduction to the importance of evaluations for new models.
    - Overview of public evaluations and the components involved.
    00:54 *🧪 Evaluation Methods*
    - Explanation of human evaluations and their structure.
    - Comparative evaluation methods like Chatbot Arena.
    - Different metrics used to interpret results, such as Elo scores.
    02:44 *🔍 Personalized Testing*
    - Discussion on the trend of personalized testing and evaluations.
    - Methods to build and curate datasets for evaluations.
    - Examples of user interactions and synthetic data generation.
    04:05 *🤖 Evaluation Judges*
    - Various types of judges for evaluations including humans and LLMs.
    - Modes of evaluation, both reference-free and ground-truth based.
    - Application of evaluations in different contexts, such as unit tests and A/B testing.
    05:28 *🔧 Implementing Evaluations with LangSmith*
    - Introduction to LangSmith platform for running evaluations.
    - Overview of LangSmith features: dataset creation, evaluator definition, trace inspections.
    - Future videos will explore detailed steps to build evaluations using LangSmith.
    Made with HARPA AI

  • @andrianantenainaprincyraso7162 (1 month ago)

    cool !

  • @aaronbiliyok4553 (1 month ago)

    Hey Lance, good job... can you please share your slides?

  • @kareammohamad (2 months ago)

    Fine