How ChatGPT works

  • Published 8 Feb 2023
  • Demystifying Transformers: Understanding Encoder-Decoder Architecture, Attention Mechanisms, and Training Techniques | Lucidate's NLP Series Part 5
    How does ChatGPT work? How is it trained? How does it achieve such impressive results?
    Dive into the world of Transformer Neural Networks with Lucidate's in-depth tutorial! In this video, we break down the powerful architecture behind some of the most popular AI models in natural language processing, such as ChatGPT, BERT, and GPT-3.
    🔥 What you'll learn in this video:
    - The Encoder-Decoder architecture: The backbone of Transformer Neural Networks
    - Training and Inference: Unraveling the brute-force approach to perfecting AI models
    - Attention Mechanism: Decoding the secret sauce that powers Transformers
    - Positional Embeddings: How Transformers capture sequence information
    - Practical examples and use-cases for Transformers in NLP tasks
    Whether you're an AI enthusiast, a student, or a seasoned professional, this comprehensive guide will enhance your understanding of the inner workings of Transformer Neural Networks and their significance in NLP and AI. Don't miss out on this opportunity to expand your knowledge and gain insights into the AI models that are revolutionizing the world of natural language processing.
    👉 Subscribe to our channel for more AI and Machine Learning content: / @lucidateai
    Get an in-depth understanding of the latest breakthrough in NLP technology - ChatGPT! In this video, we'll dive into the inner workings of this cutting-edge AI language model and explore the concepts of word embeddings and attention. You'll learn how ChatGPT uses these techniques to generate natural language responses during inference, as well as how it "learns" to update its weights and parameters during training. Whether you're an AI enthusiast or a beginner data scientist, this video is a must-watch for anyone interested in understanding the power and potential of ChatGPT. So, sit back, relax, and let's discover how ChatGPT works!
    🔗 Useful Links:
    GPT playlist: • Transformers & NLP
    Semantics: • ChatGPT - Semantics: T...
    Positional embeddings: • ChatGPT Position and P...
    Attention: • Attention is all you n...
    Neural Networks: • Neural Network Primer
    Backpropagation: • How neural networks le...
    =========================================================================
    Link to introductory series on Neural networks:
    Lucidate website: www.lucidate.co.uk/blog/categ...
    UA-cam: ua-cam.com/users/playlist?list...
    Link to intro video on 'Backpropagation':
    Lucidate website: www.lucidate.co.uk/post/intro...
    UA-cam: • How neural networks le...
    'Attention is all you need' paper - arxiv.org/pdf/1706.03762.pdf
    =========================================================================
    Transformers are a type of artificial intelligence (AI) used for natural language processing (NLP) tasks, such as translation and summarisation. They were introduced in 2017 by Google researchers, who sought to address the limitations of recurrent neural networks (RNNs), which had traditionally been used for NLP tasks. RNNs had difficulty parallelizing, and tended to suffer from the vanishing/exploding gradient problem, making it difficult to train them with long input sequences.
    Transformers address these limitations by using self-attention, a mechanism which allows the model to selectively choose which parts of the input to pay attention to. This makes the model much easier to parallelize and eliminates the vanishing/exploding gradient problem.
    Self-attention works by weighting the importance of different parts of the input, allowing the AI to focus on the most relevant information and better handle input sequences of varying lengths. This is accomplished through three matrices: Query (Q), Key (K) and Value (V). The Query matrix can be interpreted as the word for which attention is being calculated, while the Key matrix can be interpreted as the word to which attention is paid. The dot product of the Query and Key matrices, scaled and passed through a softmax, gives the attention scores, which are then used to weight the Value matrix.
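For readers who want to see the mechanism concretely, here is a minimal numpy sketch of scaled dot-product attention as described above. The matrix names and toy dimensions are illustrative only, not taken from the video.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # similarity of every query with every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax -> attention weights
    return weights @ V                                 # weighted sum of the values

# Toy example: 3 tokens, embedding dimension 4
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))                            # token embeddings (plus positional encodings)
W_q, W_k, W_v = (rng.normal(size=(4, 4)) for _ in range(3))
context = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(context.shape)                                   # (3, 4): one context-aware vector per token
```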
    =========================================================================
    #ai #artificialintelligence #deeplearning #chatgpt #gpt3 #neuralnetworks #attention #attentionisallyouneed
  • Science & Technology

COMMENTS • 78

  • @andrescolon
    @andrescolon 1 year ago +2

    Thanks for sharing your knowledge.

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks @Andres. Appreciate the feedback and the SuperThanks! - Glad you enjoyed it. I hope that you find the rest of the playlist and the channel as insightful! - Lucidate.

    • @lucidateAI
      @lucidateAI  1 year ago

      Andrés, we now have membership set up on the Lucidate channel. Check this out to learn more -> ua-cam.com/video/CTVGNclzMBQ/v-deo.html. If it is not for you, then I completely understand, but it is a great way to show your support for the material already produced as well as help contribute to the production of future material. Very best, Lucidate

  • @IntronTVchannel
    @IntronTVchannel 11 months ago +4

    After having watched tens of videos on how Transformers work, I eventually came across the present "Lucidate" videos... they are BY FAR THE BEST explainer videos on the topic. Thank you sir!

    • @lucidateAI
      @lucidateAI  11 months ago +1

      You are welcome! I’m glad you found them insightful! (Tell your friends…). Lucidate

  • @mikeg1368
    @mikeg1368 1 year ago +18

    Per GPT: One analogy that might help to understand the roles of Query, Key, and Value in the Transformer model is to think of them as a librarian, a catalog, and a book, respectively.
    In this analogy, the Query is like a librarian who receives a question from a reader (i.e., the input token) and needs to find the most relevant books in the library that might help to answer the question. The librarian does this by consulting a catalog, which contains information about the books in the library. The catalog is like the Key in the Transformer model, as it provides a way for the librarian to quickly find the books that are most relevant to the reader's question.
    Once the librarian has identified the relevant books using the catalog (i.e., computed the attention weights using the Key and Query), the reader can then look at the books themselves to find the information they need. The books are like the Value in the Transformer model, as they contain the actual information that the reader is interested in.
    In this way, the Transformer model is like a library where the Query is a reader, the Key is a catalog, and the Value is a book. The librarian uses the catalog to quickly identify the most relevant books for the reader's question, and the reader then looks at the books themselves to find the information they need. Similarly, the Transformer model uses the Key and Query to quickly identify the most relevant parts of the input sequence, and then uses the Value to extract the relevant information for downstream tasks.
    Of course, GPT could probably go on to write many of the books. That's why I'm optimistic about how GPT can be used to improve itself.
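Continuing the library analogy in code, here is a hedged toy sketch: a single query is scored against every catalog entry (key), the softmax turns the scores into "how relevant is each book", and the answer is a weighted blend of the books' contents (values). The word list and vectors below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
books  = ["cookbook", "atlas", "novel"]
keys   = rng.normal(size=(3, 4))                  # catalog entries, one per book
values = rng.normal(size=(3, 4))                  # the books' actual contents
query  = keys[1] + 0.1 * rng.normal(size=4)       # a question that happens to be "close to" the atlas

scores  = keys @ query / np.sqrt(4)               # librarian compares the question with the catalog
weights = np.exp(scores) / np.exp(scores).sum()   # softmax: how relevant is each book?
for book, w in zip(books, weights):
    print(f"{book}: {w:.2f}")                     # the atlas should receive most of the attention
answer = weights @ values                         # blend of contents, dominated by the atlas
```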

    • @lucidateAI
      @lucidateAI  1 year ago +2

      Great analogy. Thanks for sharing and for your contribution to the channel. I too share your optimism.

  • @chrisnatale5901
    @chrisnatale5901 1 year ago +11

    These are the most chill Transformer videos I've found so far. Thanks for the excellent explanation!

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Wow, thanks! Really glad you appreciate them.

  • @ravistarzl
    @ravistarzl 1 year ago +4

    The method of “walking through” the algorithm and explaining what the algorithm is thinking about and working with at each step is excellent. I would love to see more on this topic - especially different stacks of encoders and decoders and what effect that has on learning and processing

    • @lucidateAI
      @lucidateAI  1 year ago

      Ravi. Thanks for the positive comments and great suggestions for future content. Something to think about... Lucidate.

  • @SimonDickerman
    @SimonDickerman 1 year ago +4

    Thank you for another great upload, definitely going to go back and rewatch all 5 videos to solidify my learning.
    10/10 explanations, keep up the great work.

  • @felipeblin8616
    @felipeblin8616 1 year ago +2

    What I missed, or maybe didn't understand, is an explanation of how it uses a pretrained model to assign the Query and Key values. Great video

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Thanks for your comment! I'm glad you enjoyed the video. If it is helpful there is another video in this series that goes through the attention mechanism. You can find that video here -> ua-cam.com/video/sznZ78HquPc/v-deo.html
      But here is my answer to your question: the idea behind using a pretrained model like ChatGPT or GPT-3 in the context of assigning values in Query and Key values is to take advantage of the model's ability to understand and generate human-like language. In essence, the pretrained model has already been "trained" on vast amounts of text data, which means it has already learned a lot about how language works, including how words and phrases are related to one another.
      So, when we use ChatGPT or GPT-3 to assign values in Query and Key values, we're essentially asking the model to use its knowledge of language to identify relevant terms or concepts in the text that we're working with. This can be useful in a variety of applications, such as natural language processing, chatbots, and search engines.
      Of course, the exact process of using a pretrained model to assign values in Query and Key values can be quite complex and depend on the specific context in which it's being used. But hopefully, this explanation helps to provide some insight into the basic idea behind using transformers in this way.
      If I've misunderstood your question, please let me know. Once again I appreciate your support and contribution to the channel. - Lucidate
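To illustrate the reply above with a rough sketch: in a trained transformer the Query and Key values are not assigned by hand; each token's embedding is multiplied by projection matrices (often written W_Q and W_K) whose entries were learned during pretraining. The shapes and random placeholders below are assumptions for illustration, not weights from any real checkpoint.

```python
import numpy as np

rng = np.random.default_rng(2)
# Pretend these were loaded from a pretrained checkpoint (random placeholders here).
W_Q = rng.normal(size=(4, 4))
W_K = rng.normal(size=(4, 4))

embeddings = rng.normal(size=(5, 4))        # 5 tokens from the text we are working with
Q = embeddings @ W_Q                        # queries: "what is each token looking for?"
K = embeddings @ W_K                        # keys:    "what does each token offer?"
attention_scores = Q @ K.T / np.sqrt(4)
print(attention_scores.shape)               # (5, 5): relevance of every token to every other token
```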

  • @marckuhlmann2905
    @marckuhlmann2905 1 year ago +1

    Bro, this is fuckin insanely good.
    Even the music gives me chills.
    Whoever you are, I take my hat off to you, Sir.
    Exceptional work!!

    • @lucidateAI
      @lucidateAI  1 year ago +1

      What a delightful comment to receive! I'm absolutely thrilled to hear that you found the video to be such a remarkable experience, and that even the music gave you chills - that's the sort of reaction that any creator hopes to elicit from their audience.
      It's wonderful to know that the hard work and effort that went into creating the video was appreciated, and it's comments like yours that make me feel grateful to be doing what I'm doing. I'm truly touched by your kind words and encouragement.
      I'm thrilled to hear that you enjoyed the video and found it helpful. I'm also glad to hear that the music added to the experience - I put a lot of effort into finding just the right tracks to create the right atmosphere.
      Thank you so much for taking the time to watch the video and leave a comment. Your feedback and support mean the world to me, and I hope you'll continue to follow along as I explore the fascinating and ever-evolving world of AI and natural language processing. Cheers!

  • @DVMovies1999
    @DVMovies1999 1 year ago +2

    Just found this video by chance, glad that I did! Love the way you explained this topic. Hope your videos can reach a wider audience 😁

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks. I hope you enjoy some of the other videos on the channel too. Appreciate the comment.

  • @jeff__w
    @jeff__w 1 year ago +1

    The text is exquisitely clear as are the animations.
    I think it would be better to drop the "bounce" from the animations, e.g., at 2:51, 3:54, 3:57 and 6:14; the last one actually looks like some sort of glitch but isn't. They're distracting and add nothing to the viewer's experience; to me, they almost look dated. (The viewer's attention is already drawn to that portion of the screen by virtue of the animation itself.)
    If you want to draw a viewer's attention _back_ to some element that is already on the screen, something like a *slight gleam* across that element might work well; it's subtle enough that it doesn't wear the viewer out (the viewer might be hardly aware of it, which is good) but salient enough that it does what it's supposed to do.

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks, Jeff, for your positive comment and your constructive criticism. Both greatly appreciated - especially the criticism. The reality is that with these explainer-type videos there is a lot of experimentation to see what style works best. As with any experiment, you have to be prepared for the experiment to fail and to learn from it. This is one such case. If you trawl the comments of this video and others earlier in this series, you will see that others have pointed out that they don't like the bouncing animations. If you view the later videos in this playlist -> ua-cam.com/play/PLaJCKi8Nk1hwaMUYxJMiM3jTB2o58A6WY.html you will see that I've listened to the comments and have eliminated the bounces from later vids.
      Some you get right, but if you are going to experiment you have to be prepared to get some wrong and take the criticism when it comes your way.
      What we can learn from makes us stronger!
      Sincerely appreciate your support of the channel, taking the time to watch the video and the effort of writing a comment. With thanks - Lucidate.

  • @benjaminjordan2330
    @benjaminjordan2330 1 year ago +1

    Those text reflections look like an astigmatism

    • @lucidateAI
      @lucidateAI  1 year ago

      I hadn’t thought of that before. They still look more like text reflections to me. But I guess we are all different. Thanks for sharing and thanks for supporting and contributing to the channel in the comments.

  • @ajalipio1
    @ajalipio1 1 year ago +1

    wow.. great explanation! it can't get better than this. you have my love and my subscription. kudos!

    • @lucidateAI
      @lucidateAI  1 year ago

      Welcome aboard! Really appreciate your comments and support.

  • @wintobisakul1848
    @wintobisakul1848 1 year ago +1

    Amazing explanation, fantastic presentation

  • @Sydra.
    @Sydra. 1 year ago +2

    Very useful!

  • @JazevoAudiosurf
    @JazevoAudiosurf 1 year ago +1

    Animations and pictures are much easier to understand than text. I hope the multi-modal models will be able to output high-quality pictures. The gains would be through the roof; who needs university anymore in the future?

    • @lucidateAI
      @lucidateAI  1 year ago

      Thank you for your comment! I completely agree that animations and pictures can be more effective at conveying complex concepts than text alone. In fact, this is one of the reasons why I use visual aids like diagrams and animations in my videos.
      Regarding multi-modal models, I believe that they have tremendous potential for generating high-quality visual content that can help people understand complex concepts more easily. As these models continue to evolve and improve, we may see new and exciting applications in fields such as education and training. However, I also believe that traditional forms of education and learning will still be important, as they offer a more structured and comprehensive approach to acquiring knowledge and skills.
      Thank you again for your comment and for sharing your thoughts on this topic; as well as for your support of and contribution to the Lucidate channel.

  • @ctestestestest
    @ctestestestest 1 year ago +1

    great explanation! but please, turn up the volume :)

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks for the constructive feedback. Greatly appreciated. Also saw your similar comment on ua-cam.com/video/sznZ78HquPc/v-deo.html.

  • @PrabhjotSinghDhillo
    @PrabhjotSinghDhillo 1 year ago +1

    Highly indebted for this great content. Is the 6th video not out yet? It shows as being unavailable!

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks Prabhjot for your support of the channel and your kind words in this comment. The sixth video has just dropped a couple of minutes ago and you can access it here -> ua-cam.com/video/ZvrJaqaK65Y/v-deo.html. Would love to hear your comments on this video and ideas for future content. - Lucidate.

    • @PrabhjotSinghDhillo
      @PrabhjotSinghDhillo 1 year ago

      @@lucidateAI Sure, thank you

  • @muthukamalan.m6316
    @muthukamalan.m6316 1 year ago +3

    Bro, you earned my subscription and love ❤ in a single video. Love to see more. Any plans to create content on energy-based models in future?

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks for the sub! What type of energy-based models are you looking for?

    • @muthukamalan.m6316
      @muthukamalan.m6316 1 year ago

      @@lucidateAI contrastive and architectural

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Not topics that I would profess to be an expert in, but (as it is your suggestion) I will take a look. Thanks for the comments and recommendation.

  • @muthukamalan.m6316
    @muthukamalan.m6316 1 year ago +1

    Could you please make some videos on how transformers are used in vision? GPT-4 is built to be multi-modal; excited to get a peek at that.

    • @lucidateAI
      @lucidateAI  1 year ago

      I will get to that, but maybe not for a few weeks. Great suggestion!!

  • @jazonsamillano
    @jazonsamillano 1 year ago

    Prompt: I just watched a video about NLP Transformers. I really liked it. Please leave a short review to praise the video creator Richard Walker for the tremendous effort he put in to create this valuable video.
    ChatGPT: As an AI language model, I don't have the ability to watch videos, but I'm glad to hear that you enjoyed the video about NLP Transformers and found it valuable. It's always great to see content creators like Richard Walker putting in tremendous effort to educate and inspire people about complex topics like natural language processing. Creating informative and engaging content is not an easy task, and it's wonderful to see people like Richard Walker making a positive impact in the field of NLP. I'm sure he would appreciate your kind words and praise.

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks to you both!! Nice touch!! Very meta!
      In all likelihood it won't be long before your assistant is able to watch videos and provide comments of their own. I'll be honest and say that developing transformers with that capability is currently beyond Lucidate's skill (and budget!). But within OpenAI, Google, HuggingFace etc. I would be shocked if the ability to get a transformer to watch a video and provide real-time streaming commentary wasn't already legacy technology.
      Once again I am humbled by your support of the channel and your kind words. - Lucidate.

  • @davronsherbaev9133
    @davronsherbaev9133 10 months ago

    I think during training the model rarely sees similar text (like you showed, with the same text + 1 token); otherwise it would overfit and output the same sequences.
    The same sequences are fed to the encoder/decoder only during inference, since we generate only one token per whole cycle. Training this way would result in overfitting on the same text over and over and decrease the accuracy.

  • @AltafRehmani
    @AltafRehmani 1 year ago +1

    Excellent video. Would like to understand in the context of the GPT series - it's called a decoder-only architecture. So what role does an encoder play when it comes specifically to the GPT family of transformers?

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks! Great question. The encoder in a transformer converts the input sequence into a hidden state, and the decoder converts that hidden state into an output sequence. Clearly you only have encoders in an encoder-decoder architecture. A decoder-only architecture is different. A decoder-only architecture is a type of neural network that is used for tasks such as sequence-to-sequence modeling, language generation, and machine translation. Unlike an encoder-decoder architecture, which consists of both an encoder and a decoder, a decoder-only architecture only has a decoder.
      The decoder-only architecture is based on the idea of auto-regressive modeling, which involves predicting one element of a sequence at a time, conditioned on the previously generated elements. In the context of language generation or machine translation, this means that the model generates one word or token at a time, conditioned on the words or tokens that have been generated previously.
      In a decoder-only architecture, the input to the model is a context vector, which is typically generated by an encoder or obtained from an external source. The context vector contains information about the input sequence that is relevant for generating the output sequence. The decoder then takes the context vector as input and generates the output sequence one element at a time.
      The decoder in a decoder-only architecture typically consists of multiple layers of a recurrent neural network, such as a Long Short-Term Memory (LSTM) or a Gated Recurrent Unit (GRU), or a Transformer-based architecture. The decoder is trained to maximize the likelihood of generating the correct output sequence, given the input context vector.
      Decoder-only architectures have several advantages over encoder-decoder architectures. For example, they can be more efficient and require fewer parameters, since they do not have to process the entire input sequence. They can also be used for tasks such as language modeling and text generation, where there is no explicit input sequence.
      However, decoder-only architectures can also be more challenging to train than encoder-decoder architectures, since they rely on auto-regressive modeling and require generating one element at a time. Additionally, they may not be suitable for tasks where the input sequence is important for generating the output sequence, such as in machine translation.
      Overall, decoder-only architectures are a useful type of neural network architecture for a range of sequence modeling tasks, and can be a useful alternative to encoder-decoder architectures in certain contexts.
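A hedged sketch of the auto-regressive loop described in this reply: the model generates one token at a time, each prediction conditioned on everything generated so far. The predict_next_token function is a made-up stand-in for a full decoder-only model; nothing here is actual GPT code.

```python
import numpy as np

def predict_next_token(token_ids, vocab_size, rng):
    """Stand-in for a decoder-only transformer: returns a distribution over the vocabulary.
    A real model would condition this distribution on all of token_ids."""
    logits = rng.normal(size=vocab_size)
    return np.exp(logits) / np.exp(logits).sum()

rng = np.random.default_rng(3)
vocab_size = 10
generated = [4, 7]                              # token ids of the prompt / context
for _ in range(5):                              # generate 5 tokens, one per pass
    probs = predict_next_token(generated, vocab_size, rng)
    next_token = int(np.argmax(probs))          # greedy decoding; sampling is also common
    generated.append(next_token)
print(generated)
```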

  • @DeruwynArchmage
    @DeruwynArchmage 1 year ago +1

    Does the null section in the matrix mean that the *final* output after training is complete is myopic (does not look ahead to help plan the next word given potential future predicted words)?

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks Daniel. A little bit to unpack on this question - and apologies if I've misunderstood you. I think you are talking about the attention mask? The concept of a "null section" in the attention matrix is not quite accurate, but let me first explain the idea behind masked attention and its implications on future predictions.
      When we talk about masked attention, we mean that certain positions in the input sequence are masked or blocked out, so that the attention mechanism cannot attend to them. This is often used in language models, where we want to generate text one word at a time, and we don't want the model to "cheat" and look ahead at the words it's supposed to be predicting. It is also used to pad input sequences that are shorter than the maximum input length.
      Now, to address your question, it's not that the final output after training is myopic or unable to plan ahead. Rather, the model is trained to predict the next word based on the previous words in the sequence, but only up to a certain point. In other words, the model is myopic to the extent that it only has access to the tokens that came before the current token, but it can still plan ahead to some extent based on what it has seen so far.
      Of course, the quality of the model's predictions will depend on a variety of factors, such as the size of the training dataset, the architecture of the model, the choice of hyperparameters, and so on. But in general, masked attention is a powerful tool for training language models that can generate coherent and plausible text one word at a time, without cheating by looking ahead.
      I hope this answered your question, but please rephrase it if I have misunderstood. Thanks for your question and contribution to the channel, greatly appreciated - Lucidate.
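A small illustration of the masking idea described above, with arbitrary sizes: scores for "future" positions are set to minus infinity before the softmax, so each token can only attend to itself and the tokens before it.

```python
import numpy as np

seq_len = 4
rng = np.random.default_rng(4)
scores = rng.normal(size=(seq_len, seq_len))                   # raw attention scores (Q K^T / sqrt(d_k))

mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)   # True above the diagonal = future tokens
scores = np.where(mask, -np.inf, scores)                       # the model cannot "cheat" by looking ahead

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)                 # rows sum to 1; masked positions get weight 0
print(np.round(weights, 2))
```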

    • @DeruwynArchmage
      @DeruwynArchmage 9 months ago

      @@lucidateAI: hi, thanks for answering my question. I’m sorry for the extremely delayed reply. I don’t always see that someone replied; just now saw the ❤️ reaction.
      I can’t remember precisely what I was asking, but I think I was asking something different out of ignorance and your answer was better than my question. It makes sense.
      I think I was thinking about a sparse matrix perhaps; maybe not. 5 months and all.
      But, I get how you’re just using the mask for training purposes now.
      Now, the thing I’m curious about is how far they can plan ahead. For example; when generating token N, can the matrix consider what tokens N+1 - N+5 might be and use that to help it pick a better N so that it doesn’t, kind of, paint itself into a corner in the text it outputs?
      Based on the kinds of limitations and mistakes I’ve seen LLMs make, it seems like their ability to plan ahead is pretty limited. I think that’s why they can’t do things like count words or letters in what they’re outputting. It also seems like the reason why chain-of-thought prompting works so well. If you ask it to give the answer up front then it has to do all of the “thinking” in a single pass through the layers, which really, aren’t very deep at all. It can only “store” what it’s thinking about by outputting an intermediate token that takes it along the path to the final answer. That effectively makes the network *much* deeper, by iterating through them.
      At the same time; their ability to write code as well as they do tells me that they can do a lot of planning because code is pretty hard to write in just a single pass, front to back (at least for me). Maybe it’s just that much smarter than I am at that task. Or maybe, that’s why it sometimes makes mistakes.
      I still feel like a hybrid approach would be a very powerful thing. Clearly, linear matrices (feed forward, left to right) work quite well. But I feel like they are kind of… doing it the hard way. I certainly don’t think in that pattern. I mull things over in my head for a bit; turning them and analyzing them until they make sense and then I output the results. It seems like if part of the system were linear, and took advantage of the speed gains from video card architecture, and part of it were more graph/cyclic style, allowing it to kind of “try things out in its head” before actually producing output, that might make it a bit smarter. Kind of like a cross between tree-of-thought and re-evaluating the output before it produces it. I wonder if outputting it’s results in larger chunks instead of single tokens could produce better results.
      Then again, I don’t think that would improve our ability to interpret how it works nearly as well as some of the current “tacked on” methods that we use, such as tree-of-thought. If we do it in that way; it somewhat limits the black-box nature of the system so that we can, sort of, “peek in” at what it’s thinking.
      Anyway, thanks again for your reply. It was enlightening.

  • @tobiassander3995
    @tobiassander3995 1 year ago

    Nice explanation. One thing you did not make clear is what you would feed into the output embedding as opposed to the input embedding.

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Tobias, Thank you for your comment and for your interest in the topic! I'm glad you found the explanation helpful, and I apologize for any confusion around the inputs to the encoder/decoder.
      It is perhaps worth me splitting up my answer to address the differences during training and during inference.
      =============
      1. Training:
      Regarding the difference between the input and output embeddings, during training, the encoder embedding is used to transform the input sequence, such as a sentence, into a dense vector space. The dense representation is then processed by the encoder component of the transformer architecture to obtain a high-level representation of the input sequence.
      The decoder embedding is used to represent the corresponding _target_ output sequence, such as a response in a conversation model, in the same dense vector space. The decoder component of the transformer architecture is then used to generate the final output sequence based on the high-level representation of the input sequence and the output embedding.
      ============
      2: Inference:
      During inference, the encoder embedding is used in the same way as during training, but the target output sequence is replaced by the model's predictions. The decoder component generates a predicted output sequence based on the high-level representation of the input sequence and the output embedding.
      ============
      It's worth noting, as per the video, that the technique of "teacher forcing" is often used during training, where the decoder is trained to generate the next output word based on the true target output sequence rather than the model's own predictions. This helps the model to learn the correct mapping from input to output, but can cause problems during inference if the decoder is then required to generate the output sequence based on its own predictions.
      I hope this provides a more detailed explanation of the difference between the input and output embeddings, and how they are used during training and inference. Thank you again for your comment and for your contribution to the channel with your question. Greatly appreciated - Lucidate.
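A tiny sketch of what the two embeddings see during training with teacher forcing. The sentence pair, token strings and <start>/<end> markers are invented for illustration; a real system would work with integer token ids.

```python
# Toy translation pair, already tokenised (everything here is made up).
source_tokens = ["I", "like", "cats"]                       # fed through the *input* (encoder) embedding
target_tokens = ["<start>", "J'", "aime", "les", "chats", "<end>"]

# Teacher forcing: the decoder is fed the true target shifted right by one position,
# and is trained to predict the next true token at every step.
decoder_input  = target_tokens[:-1]   # fed through the *output* (decoder) embedding
decoder_target = target_tokens[1:]    # what the loss compares the predictions against

# At inference time there is no true target: the decoder starts from ["<start>"] and
# its own previous predictions are fed back in, one token per step.
print(list(zip(decoder_input, decoder_target)))
```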

  • @mbrochh82
    @mbrochh82 1 year ago +1

    I'm quite confused. In the second video, where you describe how word embeddings are created with a neural network and weights and biases and backpropagation and all that... is all of that actually not happening at all here?
    It seems the embeddings at the start are just randomized, then we add positional encodings, then we add attention, then it goes into the decoder and does more magic which is not explained here, and then we compare the result with expectation... but how does it backpropagate? Where are the parameters that get updated?

    • @lucidateAI
      @lucidateAI  1 year ago

      I hope that this video provides some further insights ua-cam.com/video/QOS45g-3Sa4/v-deo.html. During training the network knows what the output vector should be. It is the token that is masked. It can use this to backprop the error through all of the tunable parameters in the network to minimise the loss. After multiple backward passes over myriad epochs this causes the weights to converge to a point where the network no longer produces nonsensical output, and starts to provide coherent responses. (If I've misunderstood your comment, my apologies)
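A minimal numpy sketch of the backprop step described in this reply, with a single toy output layer standing in for all of the network's tunable parameters. The sizes, learning rate and data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
vocab_size, d_model = 8, 4
W = rng.normal(size=(d_model, vocab_size)) * 0.1    # the "tunable parameters" (toy output layer)
hidden = rng.normal(size=d_model)                   # representation of the context so far
target = 3                                          # id of the true (masked) next token

for step in range(100):                             # many backward passes, as in many epochs
    logits = hidden @ W
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    loss = -np.log(probs[target])                   # cross-entropy against the known next token
    grad_logits = probs.copy()
    grad_logits[target] -= 1.0                      # dLoss/dlogits for softmax + cross-entropy
    W -= 0.5 * np.outer(hidden, grad_logits)        # gradient step on the weights
print(f"final loss {loss:.4f}, predicted token {int(np.argmax(hidden @ W))}")
```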

    • @lucidateAI
      @lucidateAI  1 year ago

      Did the answer help, or did I misunderstand your question? Are you less confused?

  • @TminusDoom
    @TminusDoom 1 year ago

    Thanks for this series.
    At 10:00 the volume of the voice becomes very quiet.

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Sorry about that. I hope that you were able to hear the conclusion of the video OK. Was it still inaudible with the volume turned up a notch or two? (I'm not saying you should have to turn up the volume, clearly the onus is on me to make the video audible, I just wanted to know if you could still hear the voice track on the video).

    • @TminusDoom
      @TminusDoom 1 year ago

      @@lucidateAI Yes, I was able to hear after turning it up, just wanted to be sure you were aware for the future.

    • @lucidateAI
      @lucidateAI  1 year ago +2

      Thanks for confirming. Sorry about the audio errors in post-production. I'll try and tighten things up. Your contribution and feedback is _greatly_ appreciated.

  • @omsingharjit
    @omsingharjit 1 year ago +1

    Scalars, tensors!!
    I am not a maths person, so I am asking:
    As I am aware, tensors are used in Einstein's theory of general relativity to describe the curvature of spacetime, so what is the use of such mathematics in AI, which deals with curvatures? And you explain it in a more intuitive way than GPT-3!

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Thanks for the question. I'm not an expert in Einstein's theory of relativity, so I may not be best qualified to answer this question - but here is my answer for what it is worth.
      While both neural networks and Einstein's theory of relativity use tensors, the ways in which they use them are quite different.
      In neural networks, tensors are used as data structures to represent the input data, intermediate computations, and output predictions. Tensors in neural networks are typically multi-dimensional arrays of numerical values that are manipulated by the network's layers and learned through the training process.
      In contrast, tensors in Einstein's theory of relativity are used to represent the geometry of space and time, and the relationships between different observers in the universe. These tensors, known as the metric tensor and the curvature tensor, are used to describe the curvature of spacetime caused by the presence of matter and energy.
      In both cases, tensors are multi-dimensional arrays that can hold numerical values, but the similarities end there. The use of tensors in neural networks is a purely computational and algorithmic problem, while the use of tensors in relativity is a fundamental aspect of our understanding of the universe.
      Furthermore, the tensors used in neural networks are typically of low rank (i.e., they have relatively few dimensions), whereas the tensors used in relativity are of high rank (i.e., they have many dimensions). This is because the curvature of spacetime in relativity is a highly complex phenomenon that requires a large number of dimensions to accurately describe, while the computations performed in neural networks are typically simpler and require fewer dimensions.
      In summary, while both neural networks and relativity use tensors, they do so in very different ways, with different levels of complexity and different objectives.
      Many thanks for your insightful question - Lucidate.
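As a small, hedged illustration of the first point in the reply, the tensors flowing through a typical NLP network are just multi-dimensional arrays with a handful of axes; the shapes below are arbitrary.

```python
import numpy as np

batch, seq_len, d_model = 2, 6, 4
token_ids  = np.zeros((batch, seq_len), dtype=int)       # rank-2 tensor of token ids
embeddings = np.zeros((batch, seq_len, d_model))         # rank-3 tensor after the embedding layer
attention  = np.zeros((batch, seq_len, seq_len))         # rank-3 tensor of attention weights
print(token_ids.ndim, embeddings.ndim, attention.ndim)   # 2 3 3 -- just multi-dimensional arrays of numbers
```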

    • @omsingharjit
      @omsingharjit 1 year ago

      @@lucidateAI Thank you for such a breakdown and simplified form of explanation.

    • @lucidateAI
      @lucidateAI  1 year ago +1

      You are welcome. Thank you for supporting and contributing to the channel. I hope that others are able to benefit from your question.

    • @omsingharjit
      @omsingharjit 1 year ago

      @@lucidateAI Yes, I became one of your subs. Please try to make a video giving a basic idea of ML/AI from scratch for laypeople, so people like me will take an interest in learning hot topics like this, even if they aren't from this field 😊🙂

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Here is an intro to machine learning playlist - ua-cam.com/play/PLaJCKi8Nk1hwklH8zGMpAbATwfZ4b2pgD.html. I hope you find it useful.

  • @carneesperta
    @carneesperta 1 year ago +1

    Wonderful content! What is your background, if I may ask? I'd be honored to have you on our small "how to be intelligent"-like podcast. (I'm Andre, I do neuroscience)

    • @lucidateAI
      @lucidateAI  1 year ago +2

      You are very kind. Thank you for taking the time to watch the video and to provide positive feedback. My degree is in Aerospace Systems Engineering and I have a PhD in AI. I've spent most of my career in Silicon Valley working for tech firms selling software to Capital Markets firms, or in London working for banks and exchanges. The best way to get in touch is via the Lucidate website (the address is at the end of the video). Thanks again for your support, greatly appreciated.

  • @JuanPeguero
    @JuanPeguero 1 year ago +1

    Are you a ChatGPT?

    • @lucidateAI
      @lucidateAI  1 year ago +1

      Hi Juan. I am not. But I am a big fan and a user - along with many others - of ChatGPT.

  • @marckuhlmann2905
    @marckuhlmann2905 1 year ago +1

    Something is off here. Your channel hasn't blown up yet, despite your content of the past five months being some of the best I've seen out there, even with tough competitors like 3Blue1Brown.
    What happened? Did they delete your channel before? Is this a reupload?

    • @lucidateAI
      @lucidateAI  1 year ago

      Thanks for your comment, and your kind words about my content! I'm really glad to hear that you've enjoyed the videos and found them valuable.
      To answer your question, my channel is relatively new, and I'm still working to build up my audience and reach more people. While it can be a bit challenging to compete with well-established channels like 3Blue1Brown (I think Grant Sanderson does some of the best work on YT; it's an honour to even be mentioned in the same comment as 3b1b), I'm proud of the work I've done so far and excited to keep creating new content and reaching more people.
      Rest assured, there's no funny business going on - this is a relatively new channel. The transformer content which has attracted most attention (pun ever so slightly intended) only went live in January, and I'm excited to see where it goes in the future. Thanks again for your support and engagement, and I hope you'll continue to follow along as I explore the exciting and complex world of AI and natural language processing. Cheers Marc! - Lucidate