ChatGPT has Never Seen a SINGLE Word (Despite Reading Most of The Internet). Meet LLM Tokenizers.

  • Published 15 Sep 2024
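
The claim in the title is easy to check yourself: models in the ChatGPT family receive sequences of integer token IDs, never raw words. A minimal sketch, my addition rather than the video's, assuming the tiktoken package is installed:

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by the GPT-3.5/GPT-4 family of models.
enc = tiktoken.get_encoding("cl100k_base")

ids = enc.encode("ChatGPT has never seen a single word.")
print(ids)                             # a list of integers, one per token
print([enc.decode([i]) for i in ids])  # the text fragment behind each ID
```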

COMMENTS • 26

  • @shabiehsaeed8633 · 1 year ago +5

    Hi Jay,
    I love the work you have done! Ever since I read the Illustrated Transformer, I have been blown away by your explanations and illustrations. You really explain advanced concepts with such clarity and simplicity. I am very grateful to you for that! I really look forward to reading and learning from your book! Thank you so much!!

    • @devashishsoni9371 · 11 months ago

      Can you please share the link to that?

    • @lazycomedy9358 · 10 months ago

      Yeah, same here. Shout out for that.

  • @WhatsAI · 1 year ago +1

    Great video as always, Jay! :)

    • @arp_ai · 1 year ago

      Thank you Louis!

  • @ranjancse26 · 1 year ago

    This is incredible. Great work! Keep it up :)

  • @jwilber92 · 1 year ago

    Great content as always, Jay!

  • @TheAero · 1 year ago

    I would love it if you could go into the following:
    RLHF
    PPO
    PEFT
    LoRA, etc.
    Adapters
    Soft prompting
    Scaling transformers

    • @arp_ai · 1 year ago

      Delicious topics indeed

  • @kidsfungaming6756 · 1 year ago

    Hi Jay,
    I love your presentation; it is so inspiring, and you make hard concepts simple and clear. Regarding the tokenizer: if every word is one token, and each token is mapped to a single vector (embedding), then how do LLMs understand the meaning of the same word in different contexts? I would appreciate your answer, and I am sorry if my question is too naive.
    Thank you
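
On @kidsfungaming6756's question above: the embedding-table row for a token is indeed the same everywhere, but the attention layers mix in the surrounding tokens, so the hidden state the model actually works with for the same word differs by context. A sketch of how to see this yourself, assuming the HuggingFace transformers library and the bert-base-uncased checkpoint (my choice for illustration):

```python
# pip install transformers torch
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def hidden_state_of(sentence: str, word: str) -> torch.Tensor:
    """Return the last-layer hidden state for `word` inside `sentence`."""
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        states = model(**inputs).last_hidden_state[0]
    position = inputs["input_ids"][0].tolist().index(tok.convert_tokens_to_ids(word))
    return states[position]

a = hidden_state_of("I deposited cash at the bank.", "bank")
b = hidden_state_of("We picnicked on the river bank.", "bank")

# Same token ID, same embedding row, yet different contextual vectors:
print(torch.cosine_similarity(a, b, dim=0))  # noticeably below 1.0
```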

  • @boonkiathan · 1 year ago +6

    Neither have our neurones

    • @arp_ai · 1 year ago +1

      Aha! But which neurons though!

  • @KumR · 7 months ago

    Hi Jay - Great video. Wondering if this is similar to computers doing everything in 0s and 1s, although at the OS level the abstraction is different. At least conceptually. Coming to the book, I am not able to find it anywhere... Is there a link?

  • @prabaj84 · 1 year ago

    Hi Jay, thanks again for explaining a complex topic in a simple way. If I may ask, what tool do you use to generate the graphics for your blogs? Thanks in advance!

  • @tanmoy.mazumder · 1 year ago +1

    Could you perhaps do an even deeper dive into how exactly these models produce the output vectors, and how those then get turned back into tokens?

    • @arp_ai · 1 year ago

      Not much has changed since my videos on GPT3, honestly. Check those out.
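
On the "vectors to tokens" step in particular, the recipe is indeed unchanged since the GPT-3 era: the model's final output vector is scored against every vocabulary entry, softmax turns the scores into probabilities, and the chosen ID is decoded back to text by the tokenizer. A minimal greedy-decoding sketch, assuming the HuggingFace transformers library and the small gpt2 checkpoint:

```python
# pip install transformers torch
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Tokenizers turn text into", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits         # shape: (1, seq_len, vocab_size)

next_scores = logits[0, -1]                 # one score per vocabulary token
probs = torch.softmax(next_scores, dim=-1)  # scores -> probabilities
next_id = int(torch.argmax(probs))          # greedy: take the most likely ID
print(next_id, repr(tok.decode([next_id]))) # the ID and its text fragment
```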

  • @goelnikhils · 1 year ago

    Great Work

  • @ashisranjanlahiri · 1 year ago +1

    Hi Sir, your videos always amaze me. Need more videos for sure. Can you please share the notebook link?

    • @arp_ai · 1 year ago

      Thank you! Haven't published the notebook yet, but that's a good idea.

  • @mohamadbebah8416 · 1 year ago

    Great!!
    Thank you very much

  • @khaledsrrr · 1 year ago

    ❤ very nice

  • @123arskas · 1 year ago

    Good one

  • @gama3181 · 1 year ago

    And how do people know which tokenizer is the best way to split the vocab? Does this follow a math rule or a statistical pattern, or does it depend on the computing budget?
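
A short answer to @gama3181, mine rather than the video's: the common algorithms (BPE, WordPiece, Unigram) are statistical. They choose the vocabulary by counting which character sequences occur most often in a training corpus, and the vocabulary size itself is a budget knob you pick up front. A small sketch using the HuggingFace tokenizers library:

```python
# pip install tokenizers
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# BPE starts from single characters and repeatedly merges the most
# frequent adjacent pair until the vocabulary budget is reached.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

corpus = ["low lower lowest", "new newer newest"] * 100
trainer = trainers.BpeTrainer(vocab_size=60, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer)

# Frequent fragments such as "low" and "est" end up as single tokens.
print(tokenizer.encode("lowest newest").tokens)
```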

  • @amittripathi6664 · 1 year ago

    Hi Jay, thanks for the video. Could you also please share the code?

  • @Patapom3 · 1 year ago

    Great! But how does the tokenizer work now? 😅

    • @arp_ai · 1 year ago

      Wonderful! If you feel comfortable tackling this now, then this video has done its job. We'll address it more in the book (and possibly a subsequent video). But if you wanna get into training tokenizers now, this is a great guide: huggingface.co/learn/nlp-course/chapter6/5?fw=pt
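
For anyone following that guide, the core move fits in a few lines. A hedged sketch, assuming the HuggingFace transformers library; your_corpus is a placeholder for your own data:

```python
# pip install transformers
from transformers import AutoTokenizer

old_tok = AutoTokenizer.from_pretrained("gpt2")

# your_corpus is hypothetical: any iterable of strings will do.
your_corpus = ["replace this with lines from your own dataset"]

# Keeps gpt2's algorithm (byte-level BPE) but learns new merges
# from your text, yielding a vocabulary tuned to your domain.
new_tok = old_tok.train_new_from_iterator(your_corpus, vocab_size=1000)
print(new_tok.tokenize("tokenization"))
```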