ChatGPT - Word embeddings and semantics: Transformers & NLP 2

  • Published 4 Jan 2023
  • How do transformers like ChatGPT learn and represent words?
    Transformers are a type of neural network architecture used in natural language processing tasks such as language translation, language modeling, and text classification. They are effective at converting words into numerical values, which is necessary for AI to understand language. There are three key concepts to consider when encoding words numerically: semantics (meaning), position (relative and absolute), and relationships and attention (grammar). Transformers excel at capturing relationships and attention: the way words relate to, and pay attention to, each other in a sentence. They do this using an attention mechanism, which allows the model to selectively focus on certain parts of the input while processing it. In the next video, we will look at the attention mechanism in more detail and see how it works.
    We can encode word semantics by using a neural network to predict a target word from a series of surrounding words in a corpus of text. The network is trained using backpropagation, adjusting the weights and biases of the input and hidden layers until the updates become negligible, at which point the network is said to be "trained". The weights connecting the input neurons to the hidden layer then contain an encoding of each word, with similar words having similar encodings. This allows for more efficient processing and a better understanding of the meaning and context of words in the language model. (A minimal code sketch of this training scheme appears after the links below.)
    Video links:
    On www.lucidate.co.uk:
    - One-hot vector Encoding - www.lucidate.co.uk/forum/data...
    - Neural Networks Primer - www.lucidate.co.uk/blog/categ...
    On UA-cam:
    - One-hot vector Encoding - • EDA 2 - Categorical Data
    - Neural Networks Primer - • Neural Network Primer
    #chatgpt #gpt3 #ai #artificialintelligence #neuralnetworks
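
To make the description concrete, here is a minimal sketch of the training scheme it outlines: a small network that predicts a target word from its surrounding context words, after which the learned input weights serve as the word embeddings. The toy corpus, embedding size, and hyperparameters are invented for illustration and are not taken from the video.

```python
# Minimal CBOW-style embedding sketch. Toy corpus and hyperparameters are
# invented for illustration; they are not from the video.
import torch
import torch.nn as nn

corpus = "the son and the cousin are part of the extended family".split()
vocab = sorted(set(corpus))
word_to_ix = {w: i for i, w in enumerate(vocab)}

# Build (context, target) pairs: predict each word from the two words on
# either side of it, as the description suggests.
window = 2
pairs = []
for i in range(window, len(corpus) - window):
    context = [corpus[i + j] for j in range(-window, window + 1) if j != 0]
    pairs.append(([word_to_ix[w] for w in context], word_to_ix[corpus[i]]))

class CBOW(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # weights kept as embeddings
        self.out = nn.Linear(dim, vocab_size)       # discarded after training
    def forward(self, context_ixs):
        # Average the context-word vectors, then score every vocabulary word.
        return self.out(self.embed(context_ixs).mean(dim=0))

model = CBOW(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Train with backpropagation until the updates become negligible
# (a fixed number of epochs stands in for that stopping rule here).
for epoch in range(200):
    for context, target in pairs:
        logits = model(torch.tensor(context))
        loss = loss_fn(logits.unsqueeze(0), torch.tensor([target]))
        opt.zero_grad()
        loss.backward()
        opt.step()

# The embedding matrix now encodes the words; words that appear in similar
# contexts end up with similar vectors.
embeddings = model.embed.weight.detach()
```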

COMMENTS • 13

  • @seyedmatintavakoliafshari8272
    @seyedmatintavakoliafshari8272 3 months ago +2

    Very impressed by this series. Thanks Richard!

  • @scenariowilderness7560
    @scenariowilderness7560 1 year ago +3

    This video series really shines a light on the "magical" ability of ChatGPT to assume different personas and roles. I was never able to reconcile the simple idea of "next word prediction" with the observed behaviours until I watched these. Thank you!

    • @lucidateAI
      @lucidateAI  1 year ago

      My dear sir/madam, I am absolutely delighted to hear that you have found these videos illuminating and informative. ChatGPT is indeed a remarkable creation that showcases the progress that has been made in natural language processing and machine learning.
      It is a testament to the power of human ingenuity and collaboration (think of all of the technologies that it is built on top of, and how many women and men have devoted their professional lives to these discoveries), and I for one am thrilled to see the possibilities it presents for further exploration and discovery. Thank you for your kind words, and may you continue to find joy and fascination in the world of AI and machine learning. A huge thank-you for supporting the channel - Lucidate.

  • @Trackman2007
    @Trackman2007 1 year ago +5

    Absolutely fantastic explanation of a complex topic! Progressive build-up in an easy-to-understand fashion, great animation, and nice background music that adds a special mood ) Thanks a lot!

  • @johnsmith-wt8gq
    @johnsmith-wt8gq 1 year ago

    Really useful!

  • @linguipster1744
    @linguipster1744 3 months ago

    Hi there! Thank you so much for these videos. I have a question re: Son + Extended - Nuclear (21:01). The video says the expected word should be "cousin", but why? Wouldn't "nephew" be more fitting? (As in: one step "below" in the family tree instead of on the same level, but less nuclear than son, still male, etc.) "Nephew" at least did show up in the top-10 list. :) Or the other way around: if we want "cousin", wouldn't "brother" be the base word?
    Again, thanks so much for taking the time to make these.

    • @lucidateAI
      @lucidateAI  3 months ago

      Thanks. I'm glad you are enjoying the channel. I think you make a great point, and on reflection I perhaps should have used the base word "sister" or "brother" to lead to the target word "cousin" after adding "extended" and subtracting "nuclear". It is a while since I made the video, but if my memory serves me well (and sometimes it fails me spectacularly!) I took the examples from an intelligence test I found on the Internet, and this was the answer provided by the puzzle. But I think your logic is more valid. The main point, though, is that with the amount of context the embedding model has, it will get close answers, but not always correct or precise ones. LLMs can't simply use this type of vector arithmetic in their predictions; they rely heavily on other constructs - principally the Attention mechanism - to improve their predictions. (A short sketch of this vector arithmetic follows after this thread.) Attention is covered here -> ua-cam.com/video/sznZ78HquPc/v-deo.html, while this video -> ua-cam.com/video/BCabX69KbCA/v-deo.html showcases how providing more context massively improves predictive power.

    • @lucidateAI
      @lucidateAI  3 months ago

      elearning.shisu.edu.cn/pluginfile.php/36509/mod_resource/content/1/ANALOGIES.pdf. My memory didn’t fail me. (At least this time!)
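
For readers following this thread, here is a minimal sketch of the vector arithmetic under discussion. The four-dimensional vectors are invented purely to show the mechanics; a real embedding model would supply learned vectors with hundreds of dimensions, so the rankings below say nothing about what an actual model would return.

```python
# Toy word-analogy arithmetic: base + "extended" - "nuclear" -> nearest words.
# The vectors are invented for illustration, not taken from a trained model.
import numpy as np

vectors = {
    "son":      np.array([0.9, 0.1, 0.8, 0.1]),
    "brother":  np.array([0.9, 0.2, 0.7, 0.2]),
    "nephew":   np.array([0.8, 0.1, 0.3, 0.6]),
    "cousin":   np.array([0.8, 0.2, 0.2, 0.7]),
    "nuclear":  np.array([0.1, 0.1, 0.9, 0.1]),
    "extended": np.array([0.1, 0.1, 0.1, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def analogy(base, plus, minus):
    """Rank the remaining words by similarity to base + plus - minus."""
    query = vectors[base] + vectors[plus] - vectors[minus]
    candidates = {w: cosine(query, v) for w, v in vectors.items()
                  if w not in (base, plus, minus)}
    return sorted(candidates.items(), key=lambda kv: -kv[1])

print(analogy("son", "extended", "nuclear"))      # the video's example
print(analogy("brother", "extended", "nuclear"))  # the commenter's suggestion
```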

  • @arunsrinivasa8643
    @arunsrinivasa8643 1 year ago

    I think that for 300-dimensional vectors you can show them as a sort of bar chart, i.e. vectors as step functions.

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Thanks Arun. You are correct. There is always a way! A two-year stock chart with a point for every business day has around 500 elements and can easily be visualised as a bar chart, even on a screen as small as a phone's - which is how many people watch UA-cam videos. Clearly a stock chart has some advantages for this type of visualisation compared to a word embedding vector: the points on a stock chart are largely correlated (today's value will be broadly in the same range as yesterday's), whereas the points in an embedding vector look like random noise, which can make them harder to interpret as a bar chart. But you are 100% correct, it can be done. (A minimal sketch of this visualisation follows below.) Lucidate has a video series on Data Visualisation if this is a topic that interests you. Here is the link -> ua-cam.com/play/PLaJCKi8Nk1hw5_MtYSBCJEDF1omczEO4x.html. Many thanks for your comment and support of the channel.
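
To illustrate the bar-chart idea from this exchange, here is a minimal matplotlib sketch. A random vector stands in for a real 300-dimensional embedding, which, as the reply notes, looks much like noise to the eye anyway.

```python
# Render a 300-dimensional vector as a bar chart, per Arun's suggestion.
# A random vector stands in for a real word embedding here.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
embedding = rng.normal(size=300)  # stand-in for a 300-d embedding vector

fig, ax = plt.subplots(figsize=(10, 3))
ax.bar(range(len(embedding)), embedding, width=1.0)
ax.set_xlabel("dimension")
ax.set_ylabel("component value")
ax.set_title("A 300-dimensional embedding vector as a bar chart")
plt.tight_layout()
plt.show()
```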