Entry Point AI
  • 13
  • 185 094
RLHF & DPO Explained (In Simple Terms!)
Learn how Reinforcement Learning from Human Feedback (RLHF) actually works and why Direct Preference Optimization (DPO) and Kahneman-Tversky Optimization (KTO) are changing the game.
This video doesn't go deep on math. Instead, I provide a high-level overview of each technique to help you make practical decisions about where to focus your time and energy.
0:52 The Idea of Reinforcement Learning
1:55 Reinforcement Learning from Human Feedback (RLHF)
4:21 RLHF in a Nutshell
5:06 RLHF Variations
6:11 Challenges with RLHF
7:02 Direct Preference Optimization (DPO)
7:47 Preferences Dataset Example
8:29 DPO in a Nutshell
9:25 DPO Advantages over RLHF
10:32 Challenges with DPO
10:50 Kahneman-Tversky Optimization (KTO)
11:39 Prospect Theory
13:35 Sigmoid vs Value Function
13:49 KTO Dataset
15:28 KTO in a Nutshell
15:54 Advantages of KTO
18:03 KTO Hyperparameters
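The dataset chapters above contrast two shapes of training data: DPO learns from pairs (a preferred and a rejected response to the same prompt), while KTO learns from single responses labeled as desirable or undesirable. As a rough sketch, assuming the field names `prompt`/`chosen`/`rejected` and `prompt`/`completion`/`label` (modeled on common TRL conventions; check the library docs for the exact schema), a preference pair can be split into two unpaired KTO examples. The helper name `pairs_to_kto` and the sample text are made up for illustration:

```python
def pairs_to_kto(pairs):
    """Split each (prompt, chosen, rejected) pair into two labeled examples."""
    examples = []
    for pair in pairs:
        # The preferred response becomes a "desirable" example.
        examples.append(
            {"prompt": pair["prompt"], "completion": pair["chosen"], "label": True}
        )
        # The rejected response becomes an "undesirable" example.
        examples.append(
            {"prompt": pair["prompt"], "completion": pair["rejected"], "label": False}
        )
    return examples


# Hypothetical DPO-style preference data (one pair).
preference_pairs = [
    {
        "prompt": "What is the capital of France?",
        "chosen": "The capital of France is Paris.",
        "rejected": "France is a country in Europe.",
    },
]

kto_examples = pairs_to_kto(preference_pairs)
for example in kto_examples:
    print(example)
```

Going the other way is not generally possible, because a single thumbs-up or thumbs-down carries no paired alternative. That is one reason unpaired KTO-style data can be easier to collect.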
These are the three papers referenced in the video:
1. Deep reinforcement learning from human preferences (arxiv.org/abs/1706.03741)
2. Direct Preference Optimization: Your Language Model is Secretly a Reward Model (arxiv.org/abs/2305.18290)
3. KTO: Model Alignment as Prospect Theoretic Optimization (arxiv.org/abs/2402.01306)
The Hugging Face TRL library offers implementations of PPO, DPO, and KTO:
huggingface.co/docs/trl/main/en/kto_trainer
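For readers who want to see the core of DPO in code, here is a minimal sketch of the per-example loss from the DPO paper, written in plain Python rather than with TRL. It assumes you already have the summed log-probabilities of the chosen and rejected responses under both the model being trained (the policy) and a frozen reference model; the function names and the example numbers are illustrative, not taken from the video:

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return x if x > 30 else math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of a whole response.
    beta controls how strongly the policy is pushed away from the reference.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)


# Policy identical to the reference: margin is 0, so the loss is log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy now favors the chosen response: the loss goes down.
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss only depends on log-probabilities, which is why DPO needs no separate reward model or sampling loop during training.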
Want to prototype with prompts and supervised fine-tuning? Try Entry Point AI:
www.entrypointai.com/
How about connecting? I'm on LinkedIn:
www.linkedin.com/in/markhennings/
Views: 3,616

Videos

Ask GPT-4 in Google Sheets
1.6K views · 7 months ago
Learn an amazing trick to use OpenAI models directly inside Google Sheets. Watch as I transform whole columns of data using an AI prompt in seconds! After this video, you'll be able to do awesome things like write creative copy, standardize your data, or extract specific details from unstructured text. Topics covered: 0:27 Demo of LLM calls directly in Google Sheets 5:33 The custom function in ...
Fine-tuning Datasets with Synthetic Inputs
4K views · 9 months ago
👉 Start building your dataset at www.entrypointai.com There are virtually unlimited ways to fine-tune LLMs to improve performance at specific tasks... but where do you get the data from? In this video, I demonstrate one way that you can fine-tune without much data to start with - and use what little data you have to reverse-engineer the inputs required! I show step-by-step how to take a small s...
How Large Language Models (LLMs) Actually Work
2.1K views · 11 months ago
In this video, I explain how language models generate text, why most of the process is actually deterministic (not random), and how you can shape the probability when selecting a next token from LLMs using parameters like temperature and top p. I cover temperature in-depth and demonstrate with a spreadsheet how different values change the probabilities. Topics: 00:10 Tokens & Why They Matter 03...
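To make the temperature and top p idea concrete, here is a small sketch in plain Python. It is not the spreadsheet from the video; the function names and the example logits are invented for illustration. Temperature divides the raw scores (logits) before the softmax, and top p keeps only the most likely tokens whose cumulative probability reaches the threshold:

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches top_p, then renormalize. Returns {token_index: probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
print(top_p_filter([0.5, 0.3, 0.2], top_p=0.7))
```

Lower temperatures concentrate probability on the top token and higher temperatures spread it out; the logits themselves are computed deterministically, and randomness only enters when a token is sampled from the resulting distribution.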
LoRA & QLoRA Fine-tuning Explained In-Depth
54K views · 1 year ago
👉 Start fine-tuning at www.entrypointai.com In this video, I dive into how LoRA works vs full-parameter fine-tuning, explain why QLoRA is a step up, and provide an in-depth look at the LoRA-specific hyperparameters: Rank, Alpha, and Dropout. 0:26 - Why We Need Parameter-efficient Fine-tuning 1:32 - Full-parameter Fine-tuning 2:19 - LoRA Explanation 6:29 - What should Rank be? 8:04 - QLoRA and R...
Prompt Engineering, RAG, and Fine-tuning: Benefits and When to Use
106K views · 1 year ago
Explore the difference between Prompt Engineering, Retrieval-augmented Generation (RAG), and Fine-tuning in this detailed overview. 01:14 Prompt Engineering RAG 02:50 How Retrieval Augmented Generation Works - Step-by-step 06:23 What is fine-tuning? 08:25 Fine-tuning misconceptions debunked 09:53 Fine-tuning strategies 13:25 Putting it all together 13:44 Final review and comparison of technique...
Fine-tuning 101 | Prompt Engineering Conference
7K views · 1 year ago
👉 Start fine-tuning at www.entrypointai.com Intro to fine-tuning LLMs (large language models) from the Prompt Engineering Conference (2023) Presented by Mark Hennings, founder of Entry Point AI. 00:13 - Part 1: Background Info -How a foundation model is born -Instruct tuning and safety tuning -Unpredictability of raw LLM behavior -Showing LLMs how to apply knowledge -Characteristics of fine-tun...
"I just fine-tuned GPT-3.5 Turbo…" - Here's how
1.1K views · 1 year ago
🎁 Join our Skool community: www.skool.com/entry-point-ai In this video, I'm diving into the power and potential of the newly released GPT-3.5's fine-tuning option. After fine-tuning some of my models, the enhancement in quality is undeniably remarkable. Join me as I: - Demonstrate the model I fine-tuned: Watch as the AI suggests additional items for an e-commerce shopping cart and the rationale...
No-code AI fine-tuning (with Entry Point!)
1.3K views · 1 year ago
👉 Sign up for Entry Point here: www.entrypointai.com Entry Point is a platform for no-code AI fine-tuning, with support for Large Language Models (LLMs) from multiple platforms: OpenAI, AI21, and more. In this video I'll demonstrate the core fine-tuning principles while creating an "eCommerce product recommendation" engine in three steps: 1. First I write ~20 examples by hand 2. Then I expand t...
28 April Update: Playground and Synthesis
83 views · 1 year ago
28 April Update: Playground and Synthesis
How to Fine-tune GPT-3 in less than 3 minutes.
4.4K views · 1 year ago
🎁 Join our Skool community: www.skool.com/entry-point-ai Learn how to fine-tune GPT-3 (and other) AI models without writing a single line of python code. In this video I'll show you how to create your own custom AI models out of GPT-3 for specialized use cases that can work better than ChatGPT. For demonstration, I'll be working on a classifier AI model for categorizing keywords from Google Ads...
Can GPT-4 Actually Lead a D&D Campaign? 🤯
390 views · 1 year ago
If you want to create / fine-tune your own AI models, check out www.entrypointai.com/
Entry Point Demo 1.0 - Keyword Classifier (AI Model)
350 views · 1 year ago
Watch over my shoulder as I fine-tune a model for classifying keywords from Google Ads.

COMMENTS

  • @abdelrahmanmagdi6767 3 days ago

    Just a perfect explanation!

  • @BijouBakson 8 days ago

    Thank you

  • @joshuatettey7771 17 days ago

    Great video

  • @omarsherif88 18 days ago

    Very useful, thank you!

  • @luckelly2378 21 days ago

    Such an incredibly clear presentation! Thank you

  • @hoyinleunghk 23 days ago

    Excellent explanation, thanks a lot!

  • @pamelamadingdong 1 month ago

    The fact that you gave concrete examples really helped me get through this! Thank you for the great video

  • @web3namesai 1 month ago

    Great insights on AI techniques! With Web3NS, you can fine-tune AI agents on YourName.Web3, combining RAG and prompts to create smarter, decentralized responses tailored to your needs. 🚀

  • @gayathrisaranath666 1 month ago

    Thanks for this clear explanation about the topic! Your way of relating back to research papers is very interesting and helpful!

  • @dickerjunge2119 2 months ago

    I think there is a mistake in your explanation. It uses higher precision while fine-tuning, but it does **not** recover "back to normal". It stays in the quantized data format.

  • @princekhunt1 2 months ago

    Nice

  • @rameshpjain 2 months ago

    Such a simple way to explain. Thank you.

  • @AbdoGhazala-y5p 3 months ago

    can you share the presentation document

  • @pratikmandlecha6672 3 months ago

    Loved the explanation. Despite knowing these terms well, I was curious to see how it would be explained and I am glad that I watched this video

  • @jb-mk5ln 3 months ago

    Incredibly high quality content, thank you.

  • @Gayatritravelandfitnessvlogs 3 months ago

    Thanks a ton!

  • @priscillaleapman2367 3 months ago

    Martin Shirley Jackson Kenneth Allen Mary

  • @chrisder1814 3 months ago

    hello can you help me

  • @kritarthlohomi3305 3 months ago

    bradley cooper in limitless tf

  • @MarshallRoy-h9e 3 months ago

    Melisa Branch

  • @mandrakexTV 3 months ago

    This is the best detailed video and nicest explanation on YouTube right now. I do think your channel will grow because you are doing an EXCELLENT job. Thank you man.

  • @TheDinosaurDemocracy 3 months ago

    This is amazing! How do you deal with request limits? I'm set to 3RPM for all models, so I'm unable to drag and drop this into a sheet as it will only do 3 at a time.

    • @EntryPointAI 3 months ago

      Good question. I don't have a good answer right now except to find yourself a higher limit :) 3rpm is really really low and you should be able to find even a free service or wrapper that offers more. Check out Groq or Together.ai

  • @bobrarity 3 months ago

    damn bro thanks, I'm new to fine-tuning, needed to migrate my rag model to the ragtag model, and your video was very clear and helped me a lot 😊

  • @tgzhu3258 3 months ago

    so good!!

  • @NathanielMaymon 3 months ago

    What's the name of the paper you referenced in the video?

    • @EntryPointAI 3 months ago

      Here's LoRA: arxiv.org/abs/2106.09685 and QLoRA: arxiv.org/abs/2305.14314

  • @adriadebatlle5755 4 months ago

    It may be a stupid question: is it possible to download the fine-tuned model at Entry Point? Thank you. Great video! I came to see whether it is possible to download it, but I understood more about the website and it's great.

    • @EntryPointAI 3 months ago

      It depends on what service you use to fine-tune. The proprietary ones like OpenAI don't let you download it; others like Replicate do.

  • @AryanKumarBaghel-cp1jv 4 months ago

    Perfect differentiations. Thankyou

  • @partymarty1856 4 months ago

    blud why your eyes like that

  • @someguyO2W 4 months ago

    Best video I've seen on the topic. Thank you!

  • @andrepemmelaar8728 4 months ago

    Very useful! Marvelous clear explanation with the right amount of detail about a subject that’s worth understanding

  • @routergods 4 months ago

    I've been binge watching LLM-related videos and most have been regurgitation of docs or (probably) GPT/AI generated fluff pieces. This video clearly explained several concepts that I was trying to wrap my head around. Great job!

    • @someguyO2W 4 months ago

      Similar comments. He actually explained it properly.

  • @UfcFan-d6s 4 months ago

    Amazing for struggling students. Love from Korea😂

  • @thndesmondsaid 4 months ago

    Thank you!

  • @archchana7756 4 months ago

    very well explained, thanks :)

  • 4 months ago

    Awesome. Thanks

  • @yesid3777 4 months ago

    Thanks!, an amazing explanation!, +1 Sub.

  • @mohammedAlbared 4 months ago

    "Great video with an amazing explanation! Thank you for sharing."

  • @aliumut33 4 months ago

    Thanks. Is it possible to make a webscraping from a given url in a column?

    • @EntryPointAI 4 months ago

      This is a task better suited for another tool. I'd check out apify.com/

  • @Larimuss 5 months ago

    QLoRA lets me train on a 4070 Ti with only 12 GB of VRAM, though I can't go over a 7B model.

  • @CatarinaReis-g3y 5 months ago

    This saved me. Thank you. Keep doing this :)

  • @brianbarnes746 5 months ago

    Great explanation, best that I've seen

  • @rohitvishwakarma2871 5 months ago

    Gojo ?

  • @user-wp8yx 5 months ago

    I'm pulling for another vid on alpha. Oobabooga suggests twice your rank. The Chinese Alpaca LoRA people use a rank of 8 with an alpha of 32, and I guess it worked. I've tried high alphas that make the model kinda crazy. Need guidance.

    • @EntryPointAI 5 months ago

      When in doubt, set alpha = rank for the effective scale factor to be 1. There are better ways to have a larger impact on training than bluntly multiplying the change in weights, like improving your dataset or dialing in the learning rate.

    • @user-wp8yx 5 months ago

      @@EntryPointAI this does make sense the way you put it. Thanks so much for your reply!
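The "alpha = rank" advice in this thread follows from how LoRA scales its update: the low-rank product B·A is multiplied by alpha / rank before it is added to the frozen weights. Here is a minimal sketch in plain Python, with hypothetical helper names and toy matrices, that shows the scale factors for the settings mentioned above:

```python
def lora_scale(alpha, rank):
    """Scale factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_delta(B, A, alpha):
    """Return (alpha / rank) * (B @ A) as nested lists.

    B has shape (d, rank) and A has shape (rank, k), so the result
    has the same shape (d, k) as the frozen weight matrix it adjusts.
    """
    rank = len(A)
    scale = alpha / rank
    return [
        [scale * sum(B[i][r] * A[r][j] for r in range(rank))
         for j in range(len(A[0]))]
        for i in range(len(B))
    ]


print(lora_scale(alpha=8, rank=8))    # alpha = rank: update applied as-is
print(lora_scale(alpha=16, rank=8))   # alpha = 2 x rank: update doubled
print(lora_scale(alpha=32, rank=8))   # rank 8, alpha 32: update quadrupled

# Toy rank-2 example: the same B and A, with two different alphas.
B = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [3.0, 4.0]]
print(lora_delta(B, A, alpha=2))
print(lora_delta(B, A, alpha=4))
```

Because alpha only multiplies the learned change, raising it acts much like raising the learning rate for the adapter, which is consistent with very high alphas making a model unstable.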

  • @MudroZvon 5 months ago

    I used RAG while creating a life coach YouTuber AI character in Telegram. I prepared a lot of AI-processed data about him, in fine-grained segments. The prompt itself is also crafted according to the latest Claude prompt discovery. The most amazing AI experience for me yet - JulienHimselfBot

  • @619vijay 5 months ago

    Eyes!

  • @NY1man1NY 5 months ago

    Watching the first 10 minutes of this video explains how AI works. This should be shown in schools.

  • @stevenwessel9641 5 months ago

    Nice logo my guy

  • @aashwinsharma8194 5 months ago

    Great explanation...

  • @CharmiTrivedi-q4n 5 months ago

    Hello, thank you for a great video. I am trying to integrate it with Apps Script, but I keep getting an error on the var Response line saying invalid user content. Is there any alternative to that?

  • @omarpalominoscortez5541 5 months ago

    Super... 👍 it worked for me... Thank you very much, greetings from Chile... 🇨🇱