Can LLMs reason? | Yann LeCun and Lex Fridman

  • Published 18 Oct 2024
  • Lex Fridman Podcast full episode: • Yann Lecun: Meta AI, O...
    Please support this podcast by checking out our sponsors:
    HiddenLayer: hiddenlayer.co...
    LMNT: drinkLMNT.com/lex to get free sample pack
    Shopify: shopify.com/lex to get $1 per month trial
    AG1: drinkag1.com/lex to get 1 month supply of fish oil
    GUEST BIO:
    Yann LeCun is the Chief AI Scientist at Meta, professor at NYU, Turing Award winner, and one of the most influential researchers in the history of AI.
    PODCAST INFO:
    Podcast website: lexfridman.com...
    Apple Podcasts: apple.co/2lwqZIr
    Spotify: spoti.fi/2nEwCF8
    RSS: lexfridman.com...
    Full episodes playlist: • Lex Fridman Podcast
    Clips playlist: • Lex Fridman Podcast Clips
    SOCIAL:
    Twitter: / lexfridman
    LinkedIn: / lexfridman
    Facebook: / lexfridman
    Instagram: / lexfridman
    Medium: / lexfridman
    Reddit: / lexfridman
    Support on Patreon: / lexfridman

COMMENTS • 51

  • @LexClips
    @LexClips  7 months ago +2

    Full podcast episode: ua-cam.com/video/5t1vTLU7s40/v-deo.html
    Lex Fridman podcast channel: ua-cam.com/users/lexfridman
    Guest bio: Yann LeCun is the Chief AI Scientist at Meta, professor at NYU, Turing Award winner, and one of the most influential researchers in the history of AI.

  • @NimTheHuman
    @NimTheHuman 2 months ago +3

    The first point, that each token prediction requires the same amount of computation (regardless of the complexity of the question/prompt being asked), isn't valid. If I understand correctly, the argument goes like this:
    1. "LLMs use roughly the same amount of effort to predict the next token for simple questions like 'are cats mammals' and complex questions like 'what is life's purpose'."
    2. "There is no correlation between the amount of reasoning required to answer a question/prompt and the compute used by the LLM."
    3. "Therefore, the LLM cannot reason."
    First, this is an implementation detail, which shouldn't really matter when measuring a system's ability to reason.
    Second, we could also interpret this as: the LLM is over-utilizing (spending too much) compute on simple questions, rather than "not enough compute on complex questions". And over-utilizing on simple questions is the more accurate interpretation, since all (or most) of the LLM's layers/neurons/weights are used for any question of the same token length. The LLM is simply wasting compute on simple questions like 'are cats mammals'.

    • @justinlloyd3
      @justinlloyd3 2 months ago

      No, you are wrong. There is a limitation in LLMs that causes their reasoning to be very brittle. Unless the LLM explicitly writes out each step in its reasoning, that information is gone by the next iteration. Reasoning requires many steps to converge on an answer. LLMs will start writing an answer before even thinking about it. This is akin to system 1 thinking: shooting from the hip and hoping it all works out in the end. The reason they can do any reasoning at all currently is that there is some structure of reasoning found in the order of the words themselves. Once the system enters a distribution it has not been trained on, its previous experience isn't useful. A true reasoning system should be able to solve difficult problems it has never seen before. LLMs are far from this goal.

    • @NimTheHuman
      @NimTheHuman 2 months ago +1

      @justinlloyd3 thanks, you explained that well. :)
      (I'm not convinced that LLMs can reason either yet. I just think the argument "LLMs don't reason because question complexity isn't correlated with compute use" isn't valid.)
      You mentioned "LLMs will start writing an answer before even thinking about it." Could you elaborate on that? When an LLM predicts the next token, isn't that prediction influenced by a whole bunch of neurons and weights? And isn't it possible that some of those neurons and weights have deductive reasoning encoded in them?

    • @AigleAquilin-fv4kj
      @AigleAquilin-fv4kj 19 days ago

      No, it is actually a very convincing argument, demonstrating that LLMs are not processing, and therefore not "thinking" through, problems; they are just very elaborate mechanisms that spew answers automatically, based on the data used for training.
      Abstract parrots, but still parrots.
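
The constant-compute point this thread argues over can be made concrete with a back-of-the-envelope sketch. It assumes the common rule of thumb of roughly 2 FLOPs per parameter per generated token; the parameter count and layer/width numbers below are illustrative, not any specific model's figures.

```python
def flops_per_token(n_params: float, n_layers: int, d_model: int, context_len: int) -> float:
    """Approximate forward-pass FLOPs spent to generate one new token."""
    matmul_flops = 2.0 * n_params                        # weight multiplies dominate (~2 FLOPs per parameter)
    attn_flops = 2.0 * n_layers * d_model * context_len  # attending over the cached context
    return matmul_flops + attn_flops

# A 7B-parameter model spends the same compute per token on "are cats mammals?"
# as on "what is life's purpose?" at equal context length.
easy = flops_per_token(7e9, n_layers=32, d_model=4096, context_len=16)
hard = flops_per_token(7e9, n_layers=32, d_model=4096, context_len=16)
print(easy == hard)  # True: per-token cost depends on model size and context, not on difficulty
```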

  • @ChristianIce
    @ChristianIce 2 months ago +1

    NO.
    Next question.

  • @2020-e2b
    @2020-e2b 2 hours ago

    The "right" answer is still programmed in, in that the LLM has to have a bias towards the right answer. What leads to innovation is insight, which can only happen when thought stops or is absent. Therefore, intelligence cannot be constructed, for intelligence is attention or awareness without the functioning of memory. This whole thing, whilst better than what we have in terms of software, will and can only go so far.

  • @mariofrancocarbone7593
    @mariofrancocarbone7593 19 days ago

    Keep an eye on OpenAI o1

  • @norbu_la
    @norbu_la 7 months ago

    Maybe the weights calculated in the inference process could be updated the way LoRAs are trained.
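
For context on the idea the comment gestures at: LoRA freezes the base weight matrix and learns only a low-rank correction. A minimal NumPy sketch of that mechanism (shapes and rank are arbitrary choices for illustration, not something from the video):

```python
import numpy as np

# LoRA in one line: y = (W + B @ A) @ x, where W stays frozen and only the
# low-rank factors A (r x d_in) and B (d_out x r) are trained or adapted.
d_in, d_out, r = 512, 512, 8              # rank r is much smaller than d_in and d_out
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable low-rank factor
B = np.zeros((d_out, r))                  # zero-initialized so the correction starts at 0

x = rng.normal(size=(d_in,))
y = (W + B @ A) @ x                       # adapted forward pass; only A and B would receive updates
print(y.shape)                            # (512,)
```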

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago +1

    Database rebuilding…
    But with like actions inputs maths conversion. Thanks for clipping this

    • @JamilaJibril-e8h
      @JamilaJibril-e8h 7 months ago

      😕 If they don't allow people to develop, no one will reach it

  • @dr.mikeybee
    @dr.mikeybee 7 months ago +2

    BERT encodings are abstract representations.

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago

    Can you use both the contrastive and the abstract way to train a visual model or a conversational one? The specific task in question is moot as long as the way the "robot" was trained is sound. I'm curious to understand whether the energy-conversion executions are on the AI entirely, to generate all of those itself, or whether it uses a backend-type reference point with a binary- or integer-based approach.

  • @nobodymove232
    @nobodymove232 7 months ago +2

    Linear search, binary search, even Google search require varying computational resources.
    It's actually weird that compute per generated token is constant. There is probably a reason for this, like LLMs being "programmed" to keep compute constant, or costs would go 'bazang', through the roof. But it's not a good argument for LLMs "faking reasoning"; it might actually be the opposite.

    • @manny3031
      @manny3031 5 months ago +2

      It's because they are linear. The human brain is non-linear; neurons connect in all directions in all areas.
      Thus different inputs take varying time for a human to compute, but an AI just spits things out at a constant speed.

    • @FlyingHenroxx
      @FlyingHenroxx 1 month ago +1

      Each model has a fixed number of layers, weights and perceptrons. Every question is fed into the same model, and the computation uses the whole model. I don't know why you would compare that to searching algorithms, where the cost depends on the amount of data being searched, while LLMs have an abstraction of the whole training data represented by their weights and biases.
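
To make the contrast with the search analogy above concrete: a search's step count depends on the input, while a fixed network does the same work for every input. A toy sketch (the "network" here is just a stand-in, not an actual LLM):

```python
import numpy as np

def binary_search_steps(sorted_data, target):
    """Input-dependent work: the number of halvings varies with where the target sits."""
    lo, hi, steps = 0, len(sorted_data) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_data[mid] == target:
            return steps
        if sorted_data[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

def fixed_forward_pass(x, weights):
    """Input-independent work: every input passes through every layer exactly once."""
    for W in weights:                        # same number of matmuls no matter what x encodes
        x = np.tanh(W @ x)
    return x

print(binary_search_steps(list(range(1000)), 499))   # 1 step: found at the first midpoint
print(binary_search_steps(list(range(1000)), 3))     # ~10 steps: needs many halvings
layers = [np.eye(8) for _ in range(4)]
print(fixed_forward_pass(np.ones(8), layers).shape)  # (8,) after exactly 4 matmuls, for any input
```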

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago +1

    Yesss for sure 1:28

  • @marmoset3
    @marmoset3 7 months ago +29

    Really, how many humans actually reason? Vast amounts of human behaviour can be classified as irrational.

    • @leanalcantara4976
      @leanalcantara4976 7 months ago +14

      Having the ability to reason is not synonymous with rationality. A lot of people reason; and you’re correct, they sometimes are irrational.

    • @Hacktheplanet_
      @Hacktheplanet_ 7 months ago

      Good point ! 😅

    • @low_rider5335
      @low_rider5335 7 months ago +2

      You just reasoned by posting this question

    • @Kannatron
      @Kannatron 7 months ago +1

      They do reason behind the scenes of conscious thought, even if it may be really bad reasoning/rationality. And if they don’t reason something, it’s usually because they thought about a similar thing in their past and are applying the same actions without thought.

    • @marmoset3
      @marmoset3 7 months ago

      @Kannatron I'm just trying to explore the idea that if AI is to replicate human consciousness then perhaps it should exhibit some degree of irrationality.

  • @gemini_537
    @gemini_537 3 months ago

    First, the amount of computation for each input question is not fixed; modern neural network architectures can automatically determine how many neurons are activated: a few neurons for simple questions, a massive number for complex questions. Secondly, for a complex question, LLMs can take a divide-and-conquer approach and solve it over multiple iterations.

    • @NachodeGregorio
      @NachodeGregorio 3 months ago

      That is simply not true. As for your first point, all parameters in the network are queried; some neurons will fire, some won't, but all are queried. Even in the case of Mixture-of-Experts, where only a fraction run, the amount of computation the network will consume is entirely predictable, because the number of experts that each sigmoid gate in every FFN will choose is a hyperparameter of the network; it's not up to the gate to decide how many, it only decides which ones are queried.
      Thus, Yann is completely right. To date, LLMs have a deterministic compute cost on a per-token basis; you can predict how many FLOPs they will consume (predicting how much energy they require is less deterministic, as many factors like GPU cooling and datacenter PUE come into play).
      The only method I've seen where this is not the case is Google's Mixture-of-Depths, but it isn't widely used, so Yann is completely right.

    • @gemini_537
      @gemini_537 3 months ago

      @NachodeGregorio I can give you an analogy at a higher level: use a router agent (e.g., Gemma 2 9B) to decide which expert agents (e.g., Llama 3 70B, Gemini 1.5 Flash, ChatGPT 4o, etc.) to send the query to. The expert agents differ greatly in size and capabilities, so obviously the amount of computation is not fixed. The same principle can be applied implicitly within the neural network. Also, there is nothing stopping LLMs from taking multiple steps to solve a big problem; just like a human given a big problem, they can take more steps and more time.
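
A minimal sketch of the top-k routing point made in this thread (softmax gating and the dimensions here are illustrative simplifications, not a specific model's design): the gate decides which experts run, but how many is a fixed hyperparameter, so per-token compute is known in advance.

```python
import numpy as np

def moe_layer(x, experts, gate_W, k=2):
    """Mixture-of-Experts routing: always exactly k expert matmuls per token,
    so the compute cost is fixed regardless of what the token 'means'."""
    scores = gate_W @ x                                   # one gating logit per expert
    top_k = np.argsort(scores)[-k:]                       # the gate picks WHICH experts run...
    w = np.exp(scores[top_k]) / np.exp(scores[top_k]).sum()
    return sum(wi * (experts[i] @ x) for wi, i in zip(w, top_k))  # ...but always k of them

rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate_W = rng.normal(size=(n_experts, d))
y = moe_layer(rng.normal(size=(d,)), experts, gate_W, k=2)
print(y.shape)  # (16,): producing this cost the same number of FLOPs for any input token
```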

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago

    Larger sample sizes have to be integrated 12:15

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago +1

    So the elaboration in a language model's abstraction… may be directly correlated with human thinking? 7:10

  • @dusanbosnjakovic6588
    @dusanbosnjakovic6588 7 months ago +3

    LLMs can absolutely reason. Reasoning means evaluating or concluding something using existing facts to deduce new information, like figuring out why a plant is dying given information about how it's treated and what type of plant it is. LLMs can do that.

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago

    That’s the scariest subject I don’t like engaging in 16:43

    • @BeefZupreme
      @BeefZupreme 3 months ago

      Im slow asf what it mean??

  • @mariofrancocarbone7593
    @mariofrancocarbone7593 7 months ago

    What about AlphaGeometry?

    • @vectorhacker-r2
      @vectorhacker-r2 19 days ago

      AlphaGeometry is not an LLM, but a neuro-symbolic AI system that combines machine learning models (including LLMs) and symbolic AI to solve problems using logic.
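
This is not AlphaGeometry itself, just a toy sketch of the neuro-symbolic pattern the comment describes: a learned model proposes candidate steps and an exact symbolic checker verifies them, so the overall system only ever accepts sound conclusions. All names and the toy "problem" below are hypothetical.

```python
def neural_propose(state):
    """Stand-in for a learned proposer: here it just enumerates candidate next steps."""
    return [state + step for step in (1, 2, 3)]

def symbolic_verify(candidate, goal):
    """Stand-in for an exact symbolic engine: accepts only provably valid progress."""
    return candidate <= goal

def solve(start, goal, max_iters=100):
    state = start
    for _ in range(max_iters):
        if state == goal:
            return state
        valid = [c for c in neural_propose(state) if symbolic_verify(c, goal)]
        if not valid:
            break
        state = max(valid)            # greedily take the best verified proposal
    return state

print(solve(0, 10))  # 10: the checker keeps the search sound while the proposer drives it
```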

  • @justinlloyd3
    @justinlloyd3 2 months ago

    Yann LeCun is right.

  • @artukikemty
    @artukikemty 2 months ago

    Abstract thought, one of the key elements of human intelligence, is the capacity to create models of the world. Human cognition may even come up with a model of its own intelligence, like these models. However, such models are usually far from perfect, and machines are even further from generating their own models of the world. LLMs do not reason, at least not in the way the human mind does, and the kind of reasoning they display is infinitely more limited than human reasoning.

  • @SuperCatbert
    @SuperCatbert 3 months ago +1

    no

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago +1

    Fucking yessss. Depends on the parameters set by the model's development team and so on

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago

    That’s a difficult differential system entirely in my opinion

  • @SeniorScriptKitty
    @SeniorScriptKitty 2 months ago

    Fun question to ask, because most humans can't

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago +1

    Teenager brain* reward model haha

  • @ForceOfChaos1776
    @ForceOfChaos1776 7 months ago

    Before not after

  • @LionAstrology
    @LionAstrology 7 months ago

    Filter infinity, answers and questions are fundamentally tied.. 1/12 where ÷12 what ÷12 how ÷ 7 will/power, love/wisdom, active intelligence and remaining 4 are combinations on the 3 (trinity) ..4.art harmony through conflict 5. Concrete knowledge and science 6. Devotion and idealism 7. Ceremony and magic.
    The divide by 4
    Fire
    Air
    Water
    Earth
    (Symbolically)
    Then divide that by 2
    Objective/subjective.
    Expressing infinity is impossible, just filter it instead to navigate it.😊

    • @LionAstrology
      @LionAstrology 7 months ago

      Final filter , it's modality....cardinal, fixed, mutable.😊

  • @artukikemty
    @artukikemty 7 months ago +1

    LLMs are accurately modeling the knowledge they are trained on. No, they can't reason the way humans do. In any case, it's probabilistic reasoning.

  • @goranjohansson2495
    @goranjohansson2495 7 months ago +3

    Why so much complicated babbling for a simple word: No.

    • @nullvoid12
      @nullvoid12 7 months ago +1

      That's what a researcher does!

    • @TheRealUsername
      @TheRealUsername 6 months ago

      Welcome to academic bureaucracy.