Evaluating LLMs and RAG Pipelines at Scale
- Published May 15, 2024
- Speaker: Eric O. Korman, Cofounder / Chief Science Officer, Striveworks
Large Language Models (LLMs) and their applications, such as Retrieval-Augmented Generation (RAG) pipelines, present unique evaluation challenges due to the often unstructured nature of their outputs. These challenges are compounded by the variety of moving parts and parameters involved, such as the choice of underlying LLM, prompt templates, document chunking strategies, and embedding models.
With the proliferation of available LLMs (both open and closed source), ML teams need processes that answer the question: which LLM and which parameters are best for my specific task and dataset?
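Selecting among these moving parts is, at its core, a search over pipeline configurations scored against a dataset. The sketch below illustrates the idea with a plain grid search; all names, the search space, and the toy scoring function are hypothetical and stand in for a real evaluation harness, not for any API discussed in the talk.

```python
from itertools import product

# Hypothetical search space over RAG pipeline parameters.
# Model and parameter names are illustrative only.
search_space = {
    "llm": ["model-a", "model-b"],
    "chunk_size": [256, 512],
    "embedding_model": ["embed-x", "embed-y"],
}

def evaluate(config, dataset):
    """Toy deterministic score so the sketch runs end to end.

    A real evaluation would execute the RAG pipeline on the dataset
    and score the outputs (e.g., exact match, or an LLM-as-judge).
    """
    score = config["chunk_size"] / 512
    if config["llm"] == "model-b":
        score += 0.1
    return score

def grid_search(search_space, dataset):
    """Exhaustively score every configuration; return the best one."""
    keys = list(search_space)
    best_config, best_score = None, float("-inf")
    for values in product(*search_space.values()):
        config = dict(zip(keys, values))
        score = evaluate(config, dataset)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best, score = grid_search(search_space, dataset=[])
```

In practice the scoring step is the hard part, which is exactly the gap an evaluation service is meant to fill.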
In this talk, we introduce Valor, our new open-source evaluation service. We demonstrate how Valor facilitates rigorous, real-world testing of these systems in production settings and how it can be integrated into existing LLMOps tech stacks.