What does it mean for computers to understand language? | LM1

  • Published 28 Nov 2024

COMMENTS •

  • @vcubingx
    @vcubingx  8 months ago +23

    If you enjoyed the video, please consider subscribing :)
    Part 2: ua-cam.com/video/rTz6hadM1Lg/v-deo.html
    I'm excited to be starting this new series! NLP is the topic I feel like I have the most to say about, but I'll avoid throwing my personal opinions into these videos :p Stay tuned for the next chapter, which I'll be posting next Monday!! (And the third chapter the week after next.) Also, let me know what other kinds of topics you'd be interested in seeing!!!

    • @sethdon1100
      @sethdon1100 8 months ago +1

      Strange timing you go there…
      3B1B published basically the same video before you.

    • @yjp20
      @yjp20 8 months ago

      The 🐐

    • @1XxDoubleshotxX1
      @1XxDoubleshotxX1 8 months ago

      so hot

    • @buddhadevbhattacharjee1363
      @buddhadevbhattacharjee1363 8 months ago +1

      Hi, one of the topics I am struggling to understand is why V is required in QKV, and why the multi-head attention outputs are concatenated rather than combined with some other operation. If you could make a video on concatenation of vectors and how they retain information better, that would be great.

    • @vcubingx
      @vcubingx  8 months ago

      @@buddhadevbhattacharjee1363 hmmm, interesting question for sure! I believe the reason concatenation is done is just because it's lossless (retains all information). This is just a standard DL practice - for example, we concatenate the positional embeddings too.
      The next to next chapter will be on attention. Let me know if that addresses your questions, and if not, I'll look into what a follow-up video could contain.
      Thanks for your input!
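      The "lossless" point can be illustrated with a minimal sketch, assuming each head outputs one small vector per token. The helper names `concat_heads` and `split_heads` are hypothetical and not from the video or any library; plain lists stand in for tensors.

```python
def concat_heads(head_outputs):
    """Join per-head outputs along the feature axis, token by token.

    head_outputs: list of heads, each a list of per-token vectors (lists).
    """
    n_tokens = len(head_outputs[0])
    return [
        [x for head in head_outputs for x in head[t]]
        for t in range(n_tokens)
    ]


def split_heads(concatenated, n_heads):
    """Undo concat_heads by slicing each token vector into equal chunks."""
    d_head = len(concatenated[0]) // n_heads
    return [
        [row[h * d_head:(h + 1) * d_head] for row in concatenated]
        for h in range(n_heads)
    ]


# Toy example: 2 heads, 2 tokens, 2 features per head (made-up numbers).
heads = [
    [[1, 2], [3, 4]],   # head 0
    [[5, 6], [7, 8]],   # head 1
]
joined = concat_heads(heads)
recovered = split_heads(joined, n_heads=2)
```

      Because the concatenated vectors can be split back into the original head outputs, nothing is discarded. An element-wise sum of the heads, by contrast, could not be undone this way.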

  • @johndewey7243
    @johndewey7243 7 months ago +5

    Just came from 3B1B and subbed. This is excellent. Thanks!

  • @PiercingSight
    @PiercingSight 8 months ago +16

    What timing for this video~
    Looking forward to more!

  • @davidespinosa1910
    @davidespinosa1910 2 months ago +1

    Just for general interest, here are a few examples of syntax vs semantics. Consider "The tree ate a banana". It's syntactically valid, but it doesn't mean anything. Or, "Is a dog conscious?". It's also syntactically valid, but it doesn't mean anything until we decide what "conscious" means. Or, "Does the past still exist?". It doesn't mean anything until we decide what "exist" means.

  • @miss-magic-maya
    @miss-magic-maya 8 months ago +3

    This is wonderful, excited for the next video!
    Also nice choice of music :)

    • @vcubingx
      @vcubingx  7 months ago

      Thanks! Love nintendo music :)

  • @dattatreyadas
    @dattatreyadas 7 months ago +4

    10:14 Brought to you by... 3Blue1Brown!!

  • @ianthehunter3532
    @ianthehunter3532 8 months ago +81

    Odd timing 🤔

  • @chineduezeofor2481
    @chineduezeofor2481 6 months ago +1

    Found this by watching 3Blue1Brown
    Awesome channel!

  • @cyanurecyanures130
    @cyanurecyanures130 5 months ago +1

    Thanks a lot for the video! Now I understand that trigram models just take into account the last three words.

  • @artahir123
    @artahir123 8 months ago +3

    brother never stop making these videos
    these are very interesting

    • @vcubingx
      @vcubingx  8 months ago +1

      Glad you like them!

  • @averagemilffan
    @averagemilffan 8 months ago +2

    Great video!! I'm hoping you discuss some of the history in the next episodes too though

    • @1XxDoubleshotxX1
      @1XxDoubleshotxX1 8 months ago

      agree!

    • @vcubingx
      @vcubingx  7 months ago +1

      That's the plan! I'm trying to touch on key papers until 2016

  • @jamesking2439
    @jamesking2439 8 months ago +8

    We're eating good today guys.

  • @MrWater2
    @MrWater2 7 months ago

    Good one! I'll be waiting for the next one

    • @vcubingx
      @vcubingx  7 months ago

      Thanks! Currently working on it - should be up on Monday

  • @imtanuki4106
    @imtanuki4106 1 month ago

    Nicely done!

  • @amirjutt0
    @amirjutt0 8 months ago

    You'll go up boi. Just put in the effort. Make the quality content. People are looking for quality content related to ML.

  • @Randomstiontastic
    @Randomstiontastic 8 months ago +34

    You uploaded this a minute after 3b1b’s video, how?

    • @Orillians
      @Orillians 8 months ago +6

      IKR. FIRST I WAS WONDERING HOW AND THEN THIS TOO WHAT WHAT

    • @cwaddle
      @cwaddle 8 months ago +1

      This dude must be 3b1bs younger bro, or a buddy

    • @laycookie-f6i
      @laycookie-f6i 8 months ago

      I was like when did 3b1b release the video about transformers? Turns out same time as this video

    • @vcubingx
      @vcubingx  8 months ago +8

      :)

    • @laycookie-f6i
      @laycookie-f6i 8 months ago

      @@vcubingx What a troll.

  • @PastisPastek
    @PastisPastek 8 months ago +1

    Perfect timing

  • @mihairobert-catalin951
    @mihairobert-catalin951 5 days ago

    What's the modulus |V|, and what does it represent in this context?

  • @gwonchanjasonyoon8087
    @gwonchanjasonyoon8087 4 months ago

    From 3b1b!

  • @stellastaraj
    @stellastaraj 6 months ago

    Hi, love the video - just one thing, the C in the probability equation throws me off. I keep reading it in my mind as "complement" - as in the complement of a set. I'm probably missing the right context for it. I can grasp from what you're saying that it probably signifies occurrences of the event, but I'm uncertain why it's "C". Is it C for condition?

    • @vcubingx
      @vcubingx  6 months ago

      C stands for count. Sorry! It can be a bit confusing - should’ve explained it. Some of the notation NLP folk use is certainly questionable
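      To make the notation concrete, here is a minimal sketch, assuming the equation in question is the usual count-based bigram estimate P(word | prev) = C(prev word) / C(prev). The function name `bigram_probability` and the toy corpus are illustrative and not taken from the video.

```python
from collections import Counter


def bigram_probability(tokens, prev, word):
    """Estimate P(word | prev) as C(prev word) / C(prev)."""
    # C(prev): count only tokens that have a successor, so that the
    # probabilities of all words following `prev` sum to 1.
    context_counts = Counter(tokens[:-1])
    # C(prev word): count each adjacent pair of tokens.
    pair_counts = Counter(zip(tokens, tokens[1:]))
    if context_counts[prev] == 0:
        return 0.0  # unseen context; a real model would apply smoothing
    return pair_counts[(prev, word)] / context_counts[prev]


# Toy corpus (assumed for illustration only).
corpus = "the cat sat on the mat the cat ran".split()
p_cat_given_the = bigram_probability(corpus, "the", "cat")
```

      In this toy corpus "the" appears three times as a context and is followed by "cat" twice, so the estimate is 2/3.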

  • @tomoki-v6o
    @tomoki-v6o 8 months ago +1

    Still waiting for part 3 on neural networks

    • @vcubingx
      @vcubingx  8 months ago +6

      Dang, it's been 4 years already...how time flies by.
      I'll try and make this my next-to-next-to-next video (After Chapter 3 of this series). Sorry for the delay, and I'm happy you're still around to wait for it :)

  • @calix-tang
    @calix-tang 8 months ago

    mfv what a great job you have done

  • @AnmolSharma-ij1ut
    @AnmolSharma-ij1ut 8 months ago

    Dame bro it was too good i don't know about g gram

  • @alexeypankov8180
    @alexeypankov8180 8 months ago

    great vid frfr

  • @rohitkavuluru8998
    @rohitkavuluru8998 8 months ago

    Goated

  • @tolgaerdonmez3574
    @tolgaerdonmez3574 6 days ago

    its awesome, please remove the background music, very distracting :(

  • @skifast_takechances
    @skifast_takechances 8 months ago

    bro is basically alan turing at this point

    • @vcubingx
      @vcubingx  8 months ago

      ski fast take chances

  • @ucngominh3354
    @ucngominh3354 7 months ago +1

    hi

  • @YoussefMohamed-er6zy
    @YoussefMohamed-er6zy 7 months ago +1

    you know what chatgpt is unfortunately, the manifestation of the Chinese room paradox, and it is SO humorous that we are taking that much time to realize

    • @Iknowwereyousleep289
      @Iknowwereyousleep289 7 months ago

      You’re stupid:
      The Chinese room argument doesn't work for complex tasks beyond fixed rule-based symbolic manipulation. AI like ChatGPT goes beyond counting word co-occurrences, making decisions based on intricate feature interactions. We need to clearly define "understanding" first.
      Understanding involves making functional predictions by compressing data into representations (in vector space, synaptic interactions, etc.). GPT-4 doesn't store explicit symbols but extracts features from data, comprehending context rather than concrete content. Fixed translations lack the representational ability needed to demonstrate understanding.

  • @dannysunginpark7561
    @dannysunginpark7561 8 months ago

    breh

    • @vcubingx
      @vcubingx  8 months ago

      dang retired cuber comes out of the dead only to smash mohanraj's 3x3x3 PR average

  • @1XxDoubleshotxX1
    @1XxDoubleshotxX1 8 months ago +1

    you should make a video on how to get girls

  • @blankboy-ww7jt
    @blankboy-ww7jt 8 months ago

    Third

  • @Dhruvbala
    @Dhruvbala 8 months ago

    First

  • @fintech1378
    @fintech1378 7 months ago

    This is so Asian

  • @OBGynKenobi
    @OBGynKenobi 8 months ago

    Computers "understand" languages insofar as they can compute statistics. But they don't really understand like humans do. For example, can they understand the levels of meaning of poetry, or sarcasm, or cynicism?

    • @panulli4
      @panulli4 8 months ago +8

      What makes you think that human brains don’t just compute statistics?

    • @aaronspeedy7780
      @aaronspeedy7780 8 months ago

      @@panulli4 I think the difference is that LLMs compute statistics on words themselves, while humans "perform statistics" on lots of different inputs, and then transform whatever result it gets into language

    • @vcubingx
      @vcubingx  8 months ago +7

      To be honest, it's really unclear what it even means to "understand" language. I'm fairly certain that we should be able to get to a sarcasm-detection level of humans within the next 10 years. See relevant work: scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=sarcasm+detection&btnG=&oq=sarcasm+detection
      I feel like 5 years ago, the idea of being able to generate code was unfathomable. Yet, here we are, and Github Copilot knows C++ syntax almost perfectly. Who's to say that everything in our brain is not a type of matrix multiplication? We don't know :)

    • @JosephParker7
      @JosephParker7 8 months ago

      @@panulli4 And that intuition may be a consequence of analogical thinking and overlooking the subtleties involved. Not that it's "wrong", but arguments such as "the brain is definitely like a stack of LSTMs" or "the brain is just a Markov chain" have always existed, and they've only focused on certain overlaps to construct a simplistic explanation.
      Sure, certain submodules of the brain may operate stochastically, but it's also evident that there are a lot of other architectural complexities involved that allow for agentic behavior, continuous learning, inferring priors from observations, meta-awareness and deliberate allocation of attention and cognitive resources, and adapting to highly chaotic and out-of-distribution environments and contexts, to name a few. Qualia itself hasn't been fully explained or understood and it's unclear if it can be; however, there are good reasons to think it's a crucial mechanism that allows agentic models to operate consistently and develop a coherent world model. It's highly likely it wouldn't simply "emerge" from scaling up statistical models. And equivalently, it's easy to conceptualize why a statistical model can achieve a high level of mastery in specific domains which are already deterministic or statistical in nature, or can at least be brute-force computed and generalized for, but a lot of things aren't. You can, for example, give the impression that you understand quantum mechanics by simply paraphrasing scientific articles, especially if you can do so at scale and very efficiently.

    • @OBGynKenobi
      @OBGynKenobi 8 months ago

      @vcubingx yes, I'm not saying it can't happen. I'm only saying that at this point it's not there and it may take a while with more tech. And when I say a while, I mean that in the most open sense.