Word Embedding in PyTorch + Lightning

  • Published 6 Jun 2024
  • Word embedding is the first step in lots of neural networks, including Transformers (like ChatGPT) and other state-of-the-art models. Here we learn how to code a standalone word embedding network from scratch and with nn.Linear. We then learn how to load and use pre-trained word embedding values with nn.Embedding.
    NOTE: This StatQuest assumes that you are already familiar with Word Embedding. If not, check out the 'Quest: • Word Embedding and Wor...
    If you'd like to support StatQuest, please consider...
    Patreon: / statquest
    ...or...
    YouTube Membership: / @statquest
    ...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
    statquest.org/statquest-store/
    ...or just donating to StatQuest!
    paypal: www.paypal.me/statquest
    venmo: @JoshStarmer
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    0:00 Awesome song and introduction
    1:53 Importing modules
    2:48 Encoding the training data
    6:55 Word Embedding from scratch
    16:54 Graphing the embedding values
    21:17 Printing out predicted words
    20:37 Word Embedding with nn.Linear
    28:12 Loading and using pre-trained Embedding values with nn.Embedding
    #StatQuest #neuralnetworks #transformers
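
As a rough illustration of the two approaches named in the description, here is a minimal sketch, not the code from the video. The toy vocabulary, the embedding size of 2, and all variable names are assumptions made up for this example. It shows that a bias-free nn.Linear applied to one-hot inputs behaves like an embedding lookup, and that its weights can be handed to nn.Embedding.from_pretrained as if they were pre-trained values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy vocabulary, chosen only for illustration.
vocab = ["Troll2", "is", "great", "Gymkata"]
vocab_size, embedding_dim = len(vocab), 2

# One-hot inputs: row i encodes token i.
one_hot = torch.eye(vocab_size)

# A bias-free nn.Linear acts as an embedding layer. Its weight has shape
# (embedding_dim, vocab_size), so column i holds the values for token i.
input_to_hidden = nn.Linear(vocab_size, embedding_dim, bias=False)
hidden_to_output = nn.Linear(embedding_dim, vocab_size, bias=False)

# Scores for every token in the vocabulary, one row per input token.
logits = hidden_to_output(input_to_hidden(one_hot))

# Treat the (here untrained) weights as "pre-trained" values.
# nn.Embedding expects shape (vocab_size, embedding_dim), hence the transpose.
pretrained = input_to_hidden.weight.detach().T.clone()
embedding = nn.Embedding.from_pretrained(pretrained)  # frozen by default

# nn.Embedding takes integer token ids instead of one-hot vectors.
token_ids = torch.tensor([0, 3])  # "Troll2", "Gymkata"
vectors = embedding(token_ids)
```

The practical difference is the input format: nn.Linear needs one-hot vectors, while nn.Embedding looks rows up directly from integer ids.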

COMMENTS • 71

  • @statquest
    @statquest 7 months ago +9

    Here's the code: lightning.ai/lightning-ai/studios/statquest-word-embedding-with-pytorch-lightning?view=public&section=all
    To learn more about Lightning: lightning.ai/
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

    • @loserc1854
      @loserc1854 2 months ago +1

      waiting for ur next book

  • @drintro
    @drintro 4 months ago +3

    The best part of this example is the way the first example is written with single dimension arrays for all of the parameter values. That makes the matrix operations explicit and more clear even to an experienced developer. I recommend writing and debugging the code from watching the video. There are small things that you might get wrong that will teach you something.

    • @statquest
      @statquest 4 months ago

      Thank you! I'm glad you liked the example.
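
For readers who want to see what "one value per parameter" can look like, here is a small sketch under stated assumptions. It is not the code from the video: the two-token vocabulary, the starting weights, and the names a_w1, a_w2, b_w1, b_w2 are all hypothetical. The point is only that the explicit scalar arithmetic gives the same result as a single matrix product.

```python
import torch
import torch.nn as nn


class TinyEmbeddingFromScratch(nn.Module):
    """Toy 2-token, 2-value embedding with one nn.Parameter per weight."""

    def __init__(self):
        super().__init__()
        # Hypothetical names: a_w1 is the weight from input token A
        # to hidden node 1, b_w2 is from token B to hidden node 2, etc.
        self.a_w1 = nn.Parameter(torch.tensor(0.1))
        self.a_w2 = nn.Parameter(torch.tensor(-0.2))
        self.b_w1 = nn.Parameter(torch.tensor(0.3))
        self.b_w2 = nn.Parameter(torch.tensor(0.4))

    def forward(self, one_hot):
        a, b = one_hot[0], one_hot[1]
        # Each hidden node is a weighted sum of the one-hot inputs,
        # written out term by term.
        hidden1 = a * self.a_w1 + b * self.b_w1
        hidden2 = a * self.a_w2 + b * self.b_w2
        return torch.stack([hidden1, hidden2])


model = TinyEmbeddingFromScratch()
x = torch.tensor([0.0, 1.0])  # one-hot encoding of token B
explicit = model(x)

# The same computation written as one matrix-vector product.
W = torch.tensor([[0.1, 0.3],
                  [-0.2, 0.4]])
as_matrix = W @ x
```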

  • @scottp131
    @scottp131 7 months ago +8

    I bought The StatQuest Illustrated Guide to Machine Learning, and it's absolutely amazing, same with every single one of your videos! I highly recommend that book to anybody who is interested in learning about this stuff! Thank you thank you thank you so much for taking the time to put together the content you make, I can't believe how well you paint this stuff into perspective! You're a freaking awesome person Josh, I'm still absolutely elated that I stumbled across this channel! I never would have expected to be so consumed with learning this and everything about this!

    • @statquest
      @statquest 7 months ago

      Thank you very much! I'm really glad you enjoy my videos and my book! Thank you! :)

  • @InglesConConfianza
    @InglesConConfianza 6 months ago +1

    Just finished the Deep Learning playlist. Thank you so much for this great work.

    • @statquest
      @statquest 6 months ago

      BAM! Thank you very much! You deserve an award!

  • @tom19860526
    @tom19860526 4 months ago +1

    I like your videos. Both the slides and explanations are very detailed and clear. I cherish them every time I watch them. It is a good learning video. thank you very much.

    • @statquest
      @statquest 4 months ago

      Thank you very much! :)

  • @exxzxxe
    @exxzxxe 4 months ago +1

    Josh- you are a Master in making the difficult comprehensible!

  • @itaydagan7459
    @itaydagan7459 6 months ago +1

    Josh you are one of a kind!! Thank you for all the content!

  • @vishnumuralidharan9858
    @vishnumuralidharan9858 3 months ago +1

    Hi Josh, I just want to say that you are an absolute godsend to the ML and DS community. I have been following your content since 2020 and you have never let me down. I hit a block with PyTorch code implementation and your videos simply untangled a lot of my mental knots. Keep up the great work!

    • @statquest
      @statquest 3 months ago

      Awesome! Thank you!

  • @sweetlearning6629
    @sweetlearning6629 7 months ago +1

    Thing is; i always look forward to new videos from you, cause i know it would be awesome as always.
    I would want to see more videos on computer vision tho, just hope you'd consider this. Thanks

    • @statquest
      @statquest 7 months ago

      I'll keep that in mind.

  • @itsawonderfullife4802
    @itsawonderfullife4802 7 months ago +2

    I too, love your work and specially appreciate your playful style and all the movie references. ;)) Thanks for your videos, all of them.

    • @statquest
      @statquest 7 months ago

      Thank you very much!

  • @eliyahubasa9401
    @eliyahubasa9401 7 months ago +1

    Great video, a great way to study. Thank you.

  • @nourinsiddiqueananna
    @nourinsiddiqueananna 7 months ago +2

    Yayy !! I was wanting a video like this !!!

  • @RaynerGS
    @RaynerGS 7 months ago +1

    You rock. I love your work. Salute from Brazil.

    • @statquest
      @statquest 7 months ago

      Muito obrigado! :)

  • @diegoandradex12
    @diegoandradex12 7 months ago +1

    Great video as always

  • @d25102
    @d25102 7 months ago +1

    Thank you!

  • @marwolaeth111
    @marwolaeth111 6 months ago +1

    StatQuest is so inspiring!
    Please consider making a 'Quest about Geometric deep learning. What do you think?

    • @statquest
      @statquest 6 months ago

      I'll keep that in mind.

  • @elodiebeitman8251
    @elodiebeitman8251 6 months ago

    Hi! I also bought The StatQuest Illustrated Guide to Machine Learning - It's awesome! Triple BAM! I think there might be a small error with the formula on page 135 (Naive Bayes: FAQ Part 3) though, FYI! Thanks again!

    • @statquest
      @statquest 6 months ago +1

      This is a known error in the book. All of them are listed here: statquest.org/sigml-errata/

  • @TJ-hs1qm
    @TJ-hs1qm 7 months ago +1

    StatQuest Time!!

  • @kartikchaturvedi7868
    @kartikchaturvedi7868 6 months ago +1

    Superrrb Awesome Fantastic video

  • @abhilashb722
    @abhilashb722 6 months ago +1

    This is really great 🎉🎉🎉.
    Can you please continue this pytorch + lightning series

    • @statquest
      @statquest 6 months ago +1

      Yep! Working on it right now.

    • @kisholoymukherjee
      @kisholoymukherjee 6 months ago +1

      And hope you will also add them to the Deep Learning Playlist. Keep rocking (and BAM-ing!) @@statquest

    • @statquest
      @statquest 6 months ago

      @@kisholoymukherjee Will do!

  • @yashsonune4391
    @yashsonune4391 6 months ago

    Thank you again for the quality content. 🔥. Btw, are there any plans for teaching about fine-tuning language models? I heard this paper (Universal Language Model Fine-tuning for Text Classification) is highly relevant and a backbone for many state-of-the-art solutions.

    • @statquest
      @statquest 6 months ago

      I'm working on one.

  • @gsestream
    @gsestream 6 months ago

    as a related thing, finding closest things on a map/grid, without making a cell division system, sort all dimensions then check if they collide, like spheres or aabb bounding boxes, n log n time complexity, as opposed to n^2 time complexity, similar to grid cell partitioning of data, for clustering, or k-nearest clustering, but just finds closest at range, any number, very fast. should be applicable directly in high dimensionality complex data analysis, dimensionality reduction, covariance matrices, yep its designed to be fast

  • @user-xl9lj2qo4r
    @user-xl9lj2qo4r 5 months ago

    Hi Josh Starmer, after learning ML, is this series enough for a beginner to learn deep learning? And is there anything else to know beyond this series for deep learning?

    • @statquest
      @statquest 5 months ago

      It's a great way to start.

  • @vigneshvicky6720
    @vigneshvicky6720 7 months ago +4

    Plz start teaching yolov8 which is used for object detection, segmentation, classification ... Every problem will be solved plz plz

    • @statquest
      @statquest 7 months ago +2

      I'll definitely keep that in mind! Keep reminding me on future "in PyTorch + Lightning" videos.

    • @vigneshvicky6720
      @vigneshvicky6720 7 months ago +2

      @@statquest sure sir, but try to do it from scratch, because everyone is using the built-in one, so many of them don't know what is going on inside. I would like to learn it from scratch so that I can fine-tune the architecture myself. Do it as soon as possible🙏

  • @lebronjames8507
    @lebronjames8507 7 months ago

    Do u guys have an education I can get somewhere? I saw u guys had a probability basics playlist but do u have some type of course to take me through every single subtopic of stats?!

  • @d_b_
    @d_b_ 7 months ago

    👍So, these are word embeddings. Do you know/think that sentence embeddings and text embeddings differ much from this process?

    • @statquest
      @statquest 7 months ago

      I'm pretty sure they are the same, but I'm not certain.

  • @RaynerGS
    @RaynerGS 7 months ago

    The original work [1] uses a multidimensional vector for each word instead of a single real value. For instance, Troll2 = [,025, 0,735, 0,256, 0,145], in this case, four dimensions. In the paper, the authors use a matrix instead of a neuron. Question: using neurons, how would you increase the dimensionality of the word representation?
    1: "Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013."

    • @statquest
      @statquest 7 months ago

      The number of values associated with each word is determined by the number of activation functions in the hidden layer. If you want 4 numbers, then add 4 activation functions.
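
A minimal sketch of that idea, assuming identity activation functions in the hidden layer and a made-up vocabulary of 4 tokens: setting the hidden layer's width to 4 gives each word 4 values. The names vocab_size, embedding_dim, and hidden are hypothetical.

```python
import torch
import torch.nn as nn

vocab_size = 4     # hypothetical toy vocabulary size
embedding_dim = 4  # number of hidden nodes = values per word

# One weight per (word, hidden node) pair. With identity activations,
# the hidden-layer outputs are the word's embedding values.
hidden = nn.Linear(vocab_size, embedding_dim, bias=False)

# One-hot encoding of the first word in the vocabulary.
one_hot_word = torch.zeros(vocab_size)
one_hot_word[0] = 1.0

# The result has embedding_dim entries: the first column of the weights.
word_vector = hidden(one_hot_word)
```

Changing embedding_dim is the only edit needed to get more or fewer values per word. Stacked together, those per-node weights form the matrix the paper talks about.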

  • @Johan-zs9xh
    @Johan-zs9xh 22 days ago

    Excuse me, what is your code editor?

  • @debarunkumer2019
    @debarunkumer2019 5 months ago

    Can you please create a playlist to demonstrate the coding part of Encoders, Decoders and Transformers? This is an earnest request from your FAN. Thanks.

    • @statquest
      @statquest 5 months ago

      I'm working on it right now.

  • @iurgnail
    @iurgnail 7 months ago

    do a video for ARIMA and VAR and their cousins please!

    • @statquest
      @statquest 7 months ago

      I'll keep that in mind.

  • @mikinyaa
    @mikinyaa 7 months ago +2

    🥳

  • @arseniykan
    @arseniykan 7 months ago +1

    BAMbastic

  • @wibulord926
    @wibulord926 7 months ago

    What is the difference between PyTorch and PyTorch Lightning, sir?

    • @statquest
      @statquest 7 months ago +1

      PyTorch Lightning is something that works with PyTorch that makes it easier to code, makes it easier to scale in the cloud, and makes your code run faster in general.
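
To make the "easier to code" part concrete, here is a sketch of the kind of manual training loop that plain PyTorch needs. With Lightning's default automatic optimization, the Trainer runs the zero_grad / backward / step sequence for you once the model is wrapped in a LightningModule. The data, layer sizes, learning rate, and epoch count below are assumptions made up for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy data: 4 one-hot tokens and made-up next-token ids.
inputs = torch.eye(4)
labels = torch.tensor([1, 2, 3, 1])

# A small embedding-style network: 4 inputs -> 2 hidden values -> 4 outputs.
model = nn.Sequential(
    nn.Linear(4, 2, bias=False),
    nn.Linear(2, 4, bias=False),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# The manual loop: every step below is written out by hand.
losses = []
for epoch in range(100):
    optimizer.zero_grad()                  # clear gradients from the last step
    loss = loss_fn(model(inputs), labels)  # forward pass and loss
    loss.backward()                        # backpropagation
    optimizer.step()                       # update the weights
    losses.append(loss.item())
```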