The AI Buzz, Episode #1: ChatGPT, Transformers and Attention

  • Published Jan 9, 2023
  • The AI Buzz is a conversation about the latest trends in AI, plus Q&A, between me and Luca Antiga, the Chief Technology Officer at Lightning AI. We talk about what's new and why it has the potential to change everything. And, because it's StatQuest, we'll go the extra mile to make sure everything is clearly explained!!!
    The AI Buzz with Luca and Josh is also a podcast! Check it out on your favorite platform, including Spotify: open.spotify.com/show/06580Wp...
    If you'd like to support StatQuest, please consider...
    Patreon: / statquest
    ...or...
    YouTube Membership: / @statquest
    ...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
    statquest.org/statquest-store/
    ...or just donating to StatQuest!
    www.paypal.me/statquest
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    #StatQuest

COMMENTS • 61

  • @statquest
    @statquest  1 year ago

    The AI Buzz with Luca and Josh is also a podcast! Check it out on your favorite platform, including Spotify: open.spotify.com/show/06580WpFqTt27tIbzBS8VQ?si=d5c72b581bb84fb0
    To learn more about Lightning: lightning.ai/
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @Axcellaful
    @Axcellaful 1 year ago +6

    Would be awesome to see a StatQuest series on implementing a transformer!

  • @Pongant
    @Pongant 1 year ago +1

    Awesome stuff Josh. Love your channel since my early studies, and it will remain a steady companion to me in the future.

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader 1 year ago +22

    These models aren't going to put people out of business. They will make the people who are good at their jobs much better than people who are average. The higher your skill level, the more precisely you can specify the prompt and fine-tune the final result. I will do a more thorough breakdown of it soon, but an analogy that you might appreciate is the publishing of the sklearn library. That library lets anyone write ML models easily. In that way, you could argue that it theoretically lets my grandma replace me. However, the reality is reversed. Because the implementation headache of writing the models is handled for me, I can spend time working on the actual problem at hand. More domain knowledge and technical skill will actually allow me to do much more than someone blindly using ChatGPT w/o the underlying foundations.

    • @Maceta444
      @Maceta444 1 year ago +1

      Good point

    • @statquest
      @statquest  1 year ago +8

      bam!

    • @dennisthemenace567
      @dennisthemenace567 1 year ago +1

      Maybe it will put mediocre workers out of work. Don't plan to be one of those.

    • @petersq5532
      @petersq5532 5 months ago

      You're idealistic. What you say works on an individual level if you have the inner motivation for creativity and improvement, but it doesn't work on a corporate level, where different metrics count (speed, salary, routine, sick leave, pension). The CEO doesn't give a damn how creative you are or what your job satisfaction is. They see the numbers, so you go....

    • @petersq5532
      @petersq5532 5 months ago

      @dennisthemenace567 You are rather condescending toward mediocre workers, but you forget that 65% of the population is mediocre or below (IQ [trait] normal distribution), and even those who are above do work based on mediocre qualities in 90% of cases (wild guess). Positions that need creativity or special skills are very, very rare.
      We are supposed to be responsible for the whole of society, not only for those privileged with IQ and opportunities. We are developing a technology that will influence and change societies more than the discovery of fire did, and these are not pompous exaggerations. At this stage, all of us who deal with AI are responsible beyond our pay grade. That is an unavoidable feature of this technology, like it or not.

  • @davidbrown409
    @davidbrown409 1 year ago +2

    Great first episode! I'm excited for this project. So much of ai/ml/ds content is vapid and geared towards people that want just enough to get by, or are just starting. I really look forward to episode 2

  • @atlasflare7824
    @atlasflare7824 1 year ago +2

    Triple BAM! Thank you!

  • @ares106
    @ares106 1 year ago +1

    Really enjoyed this show. Hope there will be more visual explanations in the future.

  • @AlexandrosPoulis
    @AlexandrosPoulis 1 year ago +1

    Great chat guys!

  • @madarauchiha2584
    @madarauchiha2584 1 year ago +1

    Eagerly waiting for next episode

  • @vipanpatial2243
    @vipanpatial2243 1 year ago +1

    Very good discussion

  • @pog_champ
    @pog_champ 1 year ago +3

    Awesome discussion. Do you plan on making a StatQuest on transformers? If so, looking forward to it when it's ready : )

    • @statquest
      @statquest  1 year ago +5

      I'm working on the transformer video.

  • @prashlovessamosa
    @prashlovessamosa 1 year ago +1

    Your Channel is heaven to me.

  • @engiboye9893
    @engiboye9893 1 year ago +1

    This is amazing, thank you! But is it possible to upload this as a podcast to standard podcasting platforms? That would be much more convenient I believe.

    • @statquest
      @statquest  1 year ago +1

      Once we have a handful of episodes recorded, we plan on publishing them as a podcast (on the standard platforms).

  • @alputkuiyidilli
    @alputkuiyidilli 1 year ago

    Hi Josh! I have been watching your videos as if they are my actual curriculum! Would you also consider a question library or exercise book? Thanks for great content sir!

    • @statquest
      @statquest  1 year ago

      Thanks! What's a question library?

  • @joseduarte5663
    @joseduarte5663 1 year ago +1

    Great vid as always, Josh. I transitioned from Data Science to Software Development just a month ago, and I would love to know if you think that DS has a lower, higher, or the same chance of being automated completely by AI in the future.

    • @statquest
      @statquest  1 year ago +1

      Data Science will change, because the tools we use will change, but it will not go away. If you have a solid understanding of the main ideas of statistics, machine learning and data visualization, you should be good to go - just be flexible and willing to adapt to new tools.

  • @Larzsolice
    @Larzsolice 1 year ago

    Sometimes it is easier to think of neural networks as neurons that light up (rather than being a line or plane drawn on a graph representing the intensity of each neuron).
    So when a neuron lights up, it represents some sort of abstract cluster of the data. The sum of the neurons (output) is like the weighted sum of abstract components (the sum of the intensities of the neurons).
    This paradigm comes from visual processing using neural networks, which has the fascinating side effect that visualising the neural network in the case of digit recognition allows you to intuitively see what is going on.

    • @statquest
      @statquest  1 year ago

      I'll keep that in mind!

    • @Larzsolice
      @Larzsolice 1 year ago

      @statquest A better term for "abstract cluster" is an impression. The weights of the neurons store an impression of the training data. Transformer models create an impression in the encoder and the impression is decoded in the decoder.
      In that regard, generating new text is simply a derivative of translation, where an input is translated into an output of the same language, using attention to predict which impressions are relevant given the impression matrix from the encoder and the output that the decoder has produced up until that point.
      Calling it "impressions" or "an impression matrix" might not be technically correct, but it is useful for intuition.
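
The attention idea discussed in this thread can be illustrated with a minimal sketch of scaled dot-product attention in plain Python. This is not code from the episode: the function names `softmax` and `attention` are assumptions chosen for illustration, and real Transformers add learned projections, multiple heads, and masking on top of this core step.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    For each query, score every key by a dot product, scale by
    sqrt(key dimension), convert the scores to weights with softmax,
    and return the weighted average of the value vectors.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

Calling `attention(decoder_states, encoder_states, encoder_states)` (hypothetical variable names) would mix the encoder's vectors according to how well each one matches the decoder's current state, which is roughly the "which impressions are relevant" step described above.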

  • @charlesmurtaugh3771
    @charlesmurtaugh3771 1 year ago

    Any suggestions for further reading on “Attention”?

    • @statquest
      @statquest  1 year ago +2

      I'm working on a video on it.

  • @IvanGarcia-cx5jm
    @IvanGarcia-cx5jm 1 year ago +1

    I found it ridiculous that in the BBC Humans series an engineer was surprised that the humanoid was made with only 15k files or lines of code. It smelled like a rule/heuristic-based system to me. Here Luca mentions that a small version of GPT has around 300 lines of code. That makes sense given how DL/RL work.

  • @HUEHUEUHEPony
    @HUEHUEUHEPony 1 year ago +2

    inb4 chatgpt integrated with windows, cortana will get superpowers

  • @MohamadSerhan-bm8bc
    @MohamadSerhan-bm8bc 1 year ago

    we need a video about Transformers and TFT, you stopped after LSTM...

    • @statquest
      @statquest  1 year ago +1

      I didn't stop. I'm working on it.

    • @MohamadSerhan-bm8bc
      @MohamadSerhan-bm8bc 1 year ago +2

      @statquest Thank you for your hard work and dedication to helping countless students around the world.

  • @MrRobo930
    @MrRobo930 1 year ago

    Sir, please give some explanation with code on how we can start thinking this way. I am working on a chatbot that answers my questions, but it is becoming very tough for me to prepare the data accordingly.

    • @statquest
      @statquest  1 year ago +1

      I'm working on that with my series of PyTorch + Lightning videos.

  • @rajarshidey424
    @rajarshidey424 6 months ago

    Is that a table behind you?

  • @vipanpatial2243
    @vipanpatial2243 1 year ago +1

    I have a question: can I add a data set to ChatGPT and do EDA on it? If yes, how?

    • @dhirajmeenavilli5508
      @dhirajmeenavilli5508 1 year ago +2

      You can probably try it. I know it can generate code so you can ask it to write some eda code for you even if you can't pass it the data directly. But you probably can't because it's a text generation model, so I don't know if you'd be able to pass it data and have it understand it.

    • @statquest
      @statquest  1 year ago

      I believe I've heard of people doing this, but I don't know the details yet.

    • @hoangng84
      @hoangng84 1 year ago +1

      What I would do, and actually did, is ask ChatGPT to create some data and then do things with that data.

  • @ulamss5
    @ulamss5 1 year ago +1

    I think Luca is somewhat overselling ChatGPT's ability to understand, reason, or do calculations.

    • @statquest
      @statquest  1 year ago

      I think that is a reasonable comment. However, Luca and I would both agree that ChatGPT can be very surprising in terms of what it can do.

    • @Lucaantiga-vv8nk
      @Lucaantiga-vv8nk 1 year ago +1

      @StatQuest with Josh Starmer 100% fair.
      What we were trying to convey though (it might be obvious to people working in AI of course) is that a model trained on predicting the next word achieves the ability to handle abstract concepts as an emergent trait. I've lived through times in which what GPT and other models are doing today would have been considered sorcery! But we (humanity) achieved all this through a super simple computing model (similar things can be said about diffusion models).
      We kind of started from the pretty complicated and chopped things out, and what we are left with is pretty simple (if you abstract away the engineering aspects). And this very fact, the fact that GPT is such a simple model that acquires the ability to abstract (to some degree), is a sign that we are scratching the surface of something fundamental.
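
To make "a model trained on predicting the next word" concrete, here is a deliberately tiny sketch. It swaps in a bigram counter for the neural network, so it is not how GPT works internally; it only shows the shape of the training objective (look at what came before, guess what comes next). The function names `train_bigram` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

A GPT-style model replaces the lookup table with a Transformer that conditions on the whole preceding context rather than one word, but the objective it is trained on is the same kind of next-word guess.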

  • @petersq5532
    @petersq5532 5 months ago

    I just wonder: if you ask the algorithm to write something for you, how does it become your creation? How does someone dare to say, "It is my song"?
    People replace their own creativity with an algorithm. Yeah, if you don't have much....
    But the long-term problem is that creativity is a trait that needs practice and training. If you don't use it, you lose it. If you replace your mental work with AI, then you lose in the long run.

    • @statquest
      @statquest  5 months ago +1

      It's true - being creative takes practice - and tools like ChatGPT will change how we define creativity.

  • @dihancheng952
    @dihancheng952 1 year ago

    This is a summary of the video, generated by OpenAI Whisper API, for those who don't have time to watch the entire video.
    Luca and Josh discuss the potential of GPT-3 to help people with creative tasks like writing songs or brainstorming ideas for a book. Luca shares how he used GPT-3 to write lyrics for a song, and Josh talks about how GPT-3 could be used to generate names for a show.
    The text discusses how the chatbot ChatGPT works. ChatGPT is a GPT (generative pre-trained transformer) that has been conditioned to chat. GPT is a model that is not easy to understand, but it can be written in 300 lines of Python. The model is trained on a massive amount of data, and then it is fine-tuned to interact with people. The model is ranked by another model, which is trained to rank the output of the big model. This ranking is done by humans, who say which output is better than another. The ranking is then used to improve the parameters of the larger model.
    The text discusses how large language models can be used to generate text, and how they are capable of learning algorithms to do so. It also notes that these models are becoming more common and that they are getting better as people learn how to use them more effectively.
    The text discusses the development of quantum computing, and how it has made it possible to understand code and create algorithms that can be executed at inference time. The author also talks about the importance of diffusion in this process, and how it has led to the development of sound generation.
    The text discusses the use of attention mechanisms in neural networks, and how they can be used to improve image generation. It also notes that these attention mechanisms can be expressed in a few lines, which makes them easier to understand and work with.
    The text describes the Transformer architecture, which is a way of learning non-linear relationships between input and output sequences. The Transformer is efficient at learning these relationships, and this makes it a powerful tool for tasks like image recognition and text classification.
    The text discusses how Arabic numerals are more efficient than Roman numerals, and how this efficiency can be used to improve neural networks and AI. It also argues that products which remove the need for conceptual boilerplate will improve people's lives.
    The text discusses the idea that it is often only when we see the full context of a situation that we are able to understand it fully. This is compared to the process of translating a sentence, where it is only when we see the full sentence that we can get an accurate translation. The author suggests that the same principle can be applied to notes, and that it is only when we are done adding to them that the structure becomes apparent.
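
The summary above mentions a second model that learns from human judgments about which output is better. One common way to train such a ranking model is a pairwise preference loss; the sketch below assumes that formulation (the source does not say which loss is used), and the function name `preference_loss` is hypothetical.

```python
import math

def preference_loss(reward_preferred, reward_rejected):
    """Pairwise ranking loss: -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the ranking model scores the human-preferred
    output higher than the rejected one, and large when it gets the
    order wrong.
    """
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); branch on the sign
    # of diff so that exp() never receives a large positive argument.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))
```

Minimizing this loss over many human comparisons pushes the ranking model's scores to agree with people, and those scores can then serve as the signal for adjusting the larger model.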