How to Build ML Solutions (w/ Python Code Walkthrough)
- Published Jun 6, 2024
- 👉 More on Full Stack Data Science: • Full Stack Data Science
This is the 4th video in a series on Full Stack Data Science. Here, I explain why experimentation is critical to the ML lifecycle and walk through the development of a semantic search tool for my YouTube videos.
More Resources:
💻 Example Code: github.com/ShawhinT/UA-cam-B...
🤖 RAG: • How to Improve LLMs wi...
📚Text Embeddings: • Text Embeddings, Class...
References:
[1] / software-2-0
[2] arxiv.org/abs/2012.07919
--
Book a call: calendly.com/shawhintalebi
Homepage: shawhintalebi.com/
Socials
/ shawhin
/ shawhintalebi
/ shawhint
/ shawhintalebi
The Data Entrepreneurs
🎥 UA-cam: / @thedataentrepreneurs
👉 Discord: / discord
📰 Medium: / the-data
📅 Events: lu.ma/tde
🗞️ Newsletter: the-data-entrepreneurs.ck.pag...
Support ❤️
www.buymeacoffee.com/shawhint
Introduction - 0:00
Why ML is Different - 0:39
Role of Experimentation - 3:04
Semantic Search (Design Choices) - 5:09
Example Code: Semantic Search of YT Videos - 8:17
Preview of Final Product - 10:06
Step 1: Experimentation & Evaluation - 11:17
Step 2: Build Video Index - 34:14
Step 3: Build UI - 35:49
What's Next? - 43:43
More on Full Stack Data Science 👇
👉 Series Playlist: ua-cam.com/play/PLz-ep5RbHosWmAt-AMK0MBgh3GeSvbCmL.html
💻 Example Code: github.com/ShawhinT/UA-cam-Blog/tree/main/full-stack-data-science/data-science
Brilliant, thanks
Great video, really interesting.
A question on the encoding process. Does condensing transcripts into an embedding with 384 dimensions lose much information, or does the encoding process truncate the text at a point?
How would something like this manage a lengthy transcript where you cover several different topics?
Does the embedding get too "noisy" in that case to be able to really stand above your threshold if only perhaps 5 lines out of 100 contain the information relating to the search?
That's a great question. Whether (much) information is lost depends on the specific use case. For example, if your text chunks simply say "True" or "False", then even a one-dimensional embedding preserves all the information. However, as you're describing, the longer the chunks, the more information can be lost. This is why experimentation is so critical: you can't really know 1) how much "information" an embedding preserves and 2) how that impacts your use case, without just trying it out.
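One common workaround for the "5 relevant lines out of 100" problem is to embed overlapping chunks and score per chunk, so a buried passage isn't diluted into a single transcript-level vector. Below is a minimal sketch of that idea. To keep it self-contained, the `embed` function here is a stand-in (a deterministic bag-of-words hash projection), not a real sentence encoder; it only mimics the shape of a 384-dimensional, unit-normalized embedding. The chunk sizes and the toy transcript are illustrative, not from the video.

```python
import zlib
import numpy as np

def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into overlapping word windows, so a relevant
    passage isn't split invisibly across a chunk boundary."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text, dim=384):
    """Stand-in encoder: deterministic bag-of-words hash projection.
    NOT semantically meaningful -- it only mimics the output shape
    (384-dim, unit-normalized) of a real sentence-embedding model."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def best_chunk_score(query, chunks):
    """Cosine similarity of the query against each chunk; the
    embeddings are unit vectors, so the dot product is the cosine."""
    q = embed(query)
    return max(float(q @ embed(c)) for c in chunks)

# A long "transcript": 200 filler words with one relevant sentence buried inside.
transcript = " ".join(
    [f"filler{i}" for i in range(100)]
    + "semantic search with text embeddings of video transcripts".split()
    + [f"filler{i}" for i in range(100, 200)]
)

query = "semantic search text embeddings"
whole_score = float(embed(query) @ embed(transcript))  # one vector for the whole transcript
chunked_score = best_chunk_score(query, chunk_text(transcript))
# The buried passage scores higher per chunk than when diluted into one vector.
```

In practice you would swap `embed` for a real model, e.g. `SentenceTransformer("all-MiniLM-L6-v2").encode` (one common choice of 384-dimensional encoder, though I can't confirm it's the one used here). Such models also silently truncate input beyond their maximum sequence length, which is another reason long transcripts benefit from chunking rather than being embedded whole.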