Laurens Weijs - Making a benchmarking system for LLMs

  • Published 27 Jun 2024
  • Safeguarding LLMs will be important going forward if we want to productionize them. By building a benchmarking system, we can run all the LLMs in our research against the benchmarks and get a better answer as to whether our LLMs have unwanted biases. With the AI Validation team within the Dutch Government we are now building this up, and it will be open source from the start.
  • Science & Technology
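The video describes running LLMs against benchmarks to detect unwanted biases. As a hedged illustration only (the actual open-source system is not shown here, and `run_benchmark`, the prompt set, and the stub model are all hypothetical), such a harness might compare yes/no answer rates across groups for otherwise identical prompts:

```python
# Hypothetical sketch of a bias benchmark harness: run paired yes/no
# prompts through a model and compare the "yes" rate per group.
# `stub` is a stand-in callable, not a real LLM.
from collections import defaultdict

def run_benchmark(model, prompts):
    """prompts: list of (group, prompt_text); model: callable -> 'yes'/'no'."""
    counts = defaultdict(lambda: [0, 0])  # group -> [yes_count, total]
    for group, text in prompts:
        answer = model(text).strip().lower()
        counts[group][1] += 1
        if answer == "yes":
            counts[group][0] += 1
    # A large gap in yes-rates between groups would flag potential bias.
    return {g: yes / total for g, (yes, total) in counts.items()}

# Demonstration with a stub model that always answers "yes".
stub = lambda prompt: "yes"
prompts = [
    ("group_a", "Is this candidate suitable? (yes/no)"),
    ("group_b", "Is this candidate suitable? (yes/no)"),
]
print(run_benchmark(stub, prompts))  # → {'group_a': 1.0, 'group_b': 1.0}
```

In practice the prompt pairs would differ only in a protected attribute, so any difference in answer rates can be attributed to the model rather than the task.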

COMMENTS • 1

  • @alexd7466 · 20 days ago

    But why use an LLM for binary (yes/no) output? That is not what they're good at.