ChatGPT Fails Basic Logic but Now Has Vision, Wins at Chess and Prompts a Masterpiece

  • Published 24 Sep 2023
  • ChatGPT will now have vision, but can it do basic logic? I cover the latest news - including GPT Chess! - as well as go through almost a dozen papers and how they relate to the central question of LLM logic and rationality. Starring the Reversal Curse and featuring conversations with two of the authors at the heart of it all.
    I also get to a DALL-E 3 vs Midjourney comparison, MuZero, MathGLM, Situational Awareness and much more!
    / aiexplained
    OpenAI GPT-V (Hear and Speak): openai.com/blog/chatgpt-can-n...
    Reversal Curse: owainevans.github.io/reversal...
    Mahesh Tweet: / 1705376797293183208
    Neel Nanda Explanation: / 1705995593657762199
    Karpathy tweet: / 1705322159588208782
    Trask Explanation: / 1705361947141472528
    Play Chess vs GPT 3.5 Instruct: parrotchess.com/
    Paige Bailey on Cognitive Revolution: • Google’s PaLM-2 with P...
    Avenging Polanyi's Revenge: m-cacm.acm.org/magazines/2021...
    Faith and Fate Paper: arxiv.org/pdf/2305.18654.pdf
    Counterfactuals Paper: arxiv.org/pdf/2307.02477.pdf
    Lesswrong AGI Timelines: www.lesswrong.com/posts/SCqDi...
    Professor Rao Paper w/ Blocksworld: arxiv.org/pdf/2305.15771.pdf
    Math Based on Number Reasoning: aclanthology.org/2022.finding...
    MuZero: www.deepmind.com/blog/muzero-...
    www.nature.com/articles/s4158...
    Efficient Zero: arxiv.org/pdf/2111.00210.pdf
    Let’s Verify Step by Step OpenAI paper: cdn.openai.com/improving-math...
    My Video on That: • 'Show Your Working': C...
    Superintelligence Poll: www.vox.com/future-perfect/20...
    Anthropic Announcement: www.anthropic.com/index/anthr...
    DALL-E 3 Tweet Thread: / 1704850313889595399
    / aiexplained Non-Hype, Free Newsletter: signaltonoise.beehiiv.com/
  • Science & Technology

COMMENTS • 1K

  • @DerSwaggeryD
    @DerSwaggeryD 7 місяців тому +725

    that was a hell of a naming fail. I first thought "V" meant 5. 😂😂

    • @DJStompZone
      @DJStompZone 7 місяців тому +145

      I'm certain that was intentional, that's the kind of thing that makes you do a double take. Brilliant marketing, really

    • @DerSwaggeryD
      @DerSwaggeryD 7 місяців тому +5

      @@DJStompZone could be.

    • @Gamez4eveR
      @Gamez4eveR 7 місяців тому +17

      they pulled a metal gear solid

    • @alansmithee419
      @alansmithee419 7 місяців тому +20

      @@DJStompZone On the other hand, it means people hear that it doesn't mean five and, instead of getting hyped for the new product, just go "oh, ok then..."

    • @ArcherLawrnc
      @ArcherLawrnc 7 місяців тому +11

      Gee Pee Tee Vee

  • @hermestrismegistus9142
    @hermestrismegistus9142 7 місяців тому +198

    By far the best AI-focused channel I've watched. AI Explained actually understands AI and its strengths/limitations rather than spouting unjustified hype or pessimism.

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +25

      Thanks hermes, too kind

    • @squamish4244
      @squamish4244 7 місяців тому +9

      He explains AI in a way that a layperson can understand without oversimplifying it, which is no easy task.

    • @NoelAKABigNolo
      @NoelAKABigNolo 7 місяців тому +2

      I recommend 'two minutes papers' channel as well

  • @TheBlackClockOfTime
    @TheBlackClockOfTime 7 місяців тому +124

    I almost had a heart attack when you said "GPT-V"

    • @Ash97345
      @Ash97345 7 місяців тому +1

      next year

    • @AbsurdShark
      @AbsurdShark 7 місяців тому +3

      Me too, I was sure it was GPT-5 when I saw the V. Really misleading (ofc not blaming this channel).

    • @alexdefoc6919
      @alexdefoc6919 7 місяців тому

      @@AbsurdShark wdym? No GPT-5 yet? But I want the 4 one for free :))

  • @GroteBosaap
    @GroteBosaap 7 місяців тому +31

    Love how you cover papers, contrast them with others, do your own testing, and talk about timelines to AGI.

  • @JL2579
    @JL2579 7 місяців тому +19

    I also think that this reasoning issue is actually very human-like in a way. I've been learning Chinese for a while, and there the learning gets even weirder: there is the pronunciation of a symbol, being able to recognize it, being able to write it, knowing its meaning, and knowing the meaning of the whole word. Which means that sometimes I can understand a word but not spell it. Sometimes I can spell it but have no idea what it means. Sometimes I even know what the symbols mean, but the whole word is too abstract and I can't remember it. For some characters I can spot mistakes in their writing, but I wouldn't be able to draw them completely from my head.
    The difference from ChatGPT seems to be that for humans this doesn't apply so much to higher knowledge about facts. So maybe the model learns "Biden is president" the same way you learn "an apple is..." (enter your memories, feelings and sensations of an apple, which cannot be described in words), and not like we do, where "Biden is president" is more like digital knowledge that sits on top of analog knowledge like what Biden looks like, what you associate being a president with, etc.

    • @philipmarlowe1156
      @philipmarlowe1156 7 місяців тому

      Exactly, because the LLM is still undergoing its fundamental nascent stages and gradual processes of evolutionary transformations.

  • @TiagoTiagoT
    @TiagoTiagoT 7 місяців тому +32

    Humans sometimes also don't remember things both ways with the same level of difficulty. For a simple (though somewhat weaker) example, ask someone to list the letters of the alphabet backwards, and then ask them some other time to list it the normal order; it's probably common sense that the vast majority of people will have more difficulty doing it backwards.

    • @juandesalgado
      @juandesalgado 7 місяців тому +6

      Right. We know the multiplication algorithm, but that doesn't mean we can multiply two 100-digit numbers in our head; we'd need pencil and paper. LLMs need to be given some sort of working storage.

    • @tiefensucht
      @tiefensucht 7 місяців тому +4

      Yeah, everything is a pattern; doing things in reverse is a whole new pattern you have to learn. This is actually one of the things a general AI needs: learning things for itself based on logic, prediction and association.

    • @CyanOgilvie
      @CyanOgilvie 7 місяців тому +6

      I grew up in a part of the country where we were required to learn a second language that was essentially never spoken. I found that I learnt a map from that language to my first language (badly), but not the other way around - that is: given a word in that language I could retrieve the closest English word, but not when starting from English. This feels a lot like what is showing up in the LLMs, and I suspect that a lot of our human capabilities are really just based on very large maps, more like what LLMs are doing than the way we think we're doing it.

    • @McDonaldsCalifornia
      @McDonaldsCalifornia 7 місяців тому +2

      But if you give a human enough time they would know how to do the task and manage to do it correctly.
      AI does kinda have that time already, since it "thinks" at computer speeds, rather than human speeds.

    • @maythesciencebewithyou
      @maythesciencebewithyou 7 місяців тому +2

      @@McDonaldsCalifornia Humans would solve the reverse-alphabet task by writing the alphabet down the way they learned it and then reading it in reverse, not doing it in their head. Some people may manage it in their mind, but most would struggle to solve it that way. Same with mathematical calculations: just because you know how something is done doesn't mean you can solve it in your head. Not to mention that we do memorize simple stuff like multiplication tables as a basis, and to solve more difficult problems, we work them out on paper, following instructions we have memorized.
      You underestimate how much of your logical reasoning ability depends on the past knowledge you have obtained. It is really difficult to solve a problem you've never encountered before; it's easy, or at least much easier, to solve a problem you already know how to solve or one that is similar. To solve something you have absolutely no clue about, you need more trial and error and hope that your experimentation will yield some information you can work with.

  • @chiaracoetzee
    @chiaracoetzee 7 місяців тому +69

    As someone who's done a lot of foreign language learning and had to memorize a lot of vocabulary, it's not unusual at all to know that e.g. "horse" and "cheval" mean the same thing but only be able to perform recall in one direction and not the other direction, unless I'm given some kind of hint to narrow it down. This is often called a "tip of my tongue" feeling when you know it but can't quite recall it given the current context. In that sense, this logical limitation might make LLMs even more human-like.

    • @tornyu
      @tornyu 7 місяців тому +9

      Totally, IIRC it's called active and passive vocabulary. I wouldn't have assumed for any domain that if someone knew A→B then they could also realise that B→A without practice.

    • @johndashiell5559
      @johndashiell5559 7 місяців тому +9

      I was thinking the same thing. The LLM may have strong training in one direction, but far less (or none at all) in the other. So, just like us, it makes recall much harder in certain cases even if the answer is logical.

    • @honkhonk8009
      @honkhonk8009 7 місяців тому +5

      Yep, same here. It's also the same with math.
      You don't just look at an algebra equation and see if it's equivalent to something. You gotta do step-by-step reasoning on it.

    • @tornyu
      @tornyu 7 місяців тому +11

      I think the reason humans appear to be better at this may be that we ruminate: once we learn that A→B, if it interests us we look at the question from many different angles - synthesising our own training data.

    • @chocsise
      @chocsise 7 місяців тому +2

      That is very interesting. I didn’t know that.

  • @djayjp
    @djayjp 7 місяців тому +116

    Fascinating insights, thank you. I love that you don't just report the news, but you provide deep insights into the state of AI.

  • @TMtheScratcher
    @TMtheScratcher 7 місяців тому +7

    Good points you bring up at the end: our brain also consists of specialized "modules" - and our language processing capabilities alone do not handle our skill and knowledge around logic.

    • @mariapiazza-od8ib
      @mariapiazza-od8ib 7 місяців тому +1

      🎉🎉🎉 Sure, 'specialized modules' are the KEY to AGI; LPC alone can't handle logic, but will do with a specialized logic module 🎉🎉🎉 I'm thrilled because I'm tinkering on something like that.

  • @Pabz2030
    @Pabz2030 7 місяців тому +3

    Clearly LLMs are in fact INTP personality types, where if you ask them a question you either get:
    A) A 3 hour monologue on the answer and every random possible offshoot from that, or
    B) A simple "No idea...Find out yourself"

  • @Drone256
    @Drone256 7 місяців тому +15

    Excellent video. LLMs are making it obvious that many of the things we think of as logical thought are often pattern matching. To me this explains why some people come to an incorrect conclusion but so strongly believe it. They're pattern matching with insufficient examples like an LLM but they have no awareness this is what they are doing.

  • @mickmickymick6927
    @mickmickymick6927 7 місяців тому +44

    This might explain why I very rarely get useful results from ChatGPT or Bing, even when I know it should have the information I want.

    • @ChaoticNeutralMatt
      @ChaoticNeutralMatt 7 місяців тому +1

      It explains Bing to me at least, as I rarely search for what I already know the name of.

    • @XOPOIIIO
      @XOPOIIIO 7 місяців тому +5

      Because language models are optimized to predict the next word, they try to be as predictable as possible. You don't get a higher chance of predicting the next word by generating something unique and important that rarely happens in the dataset; you get a higher chance by generating something boring and mundane that has been said hundreds of times before. That's why they tend to be useless.

  • @dextersjab
    @dextersjab 7 місяців тому +15

    Super quick work with the video! It was crazy seeing all this stuff unfold on Twitter. I still think there's plenty more experimenting to do with reasoning. We've clearly still got a lot to learn about intelligence.

  • @minhuang8848
    @minhuang8848 7 місяців тому +8

    Full-on nailed the German pronunciation, very small sample size, but that sounded better than most learners

  • @computerex
    @computerex 7 місяців тому +5

    LLMs are autoregressive models, so in this context all of these findings make sense. They are not truly reasoning; they are mimicking the statistical distributions in the training data, which often intersect with ground truth/reality. When they don't align, we call those outputs hallucinations.

  • @a.thales7641
    @a.thales7641 7 місяців тому +23

    After the announcement of DALL-E 3 I said to myself that OpenAI really should release, this year, a function to also upload pictures to talk about them, and to get the answers voiced as well, just like we can speak into ChatGPT via Android and iPhone... voilà! What a time to be alive. I really thought we'd get these features this year, but I thought maybe December and not October! Great. Thanks for the video! I learned the news from the video and was like... how did I miss this? Will watch the video now!

    • @antonystringfellow5152
      @antonystringfellow5152 7 місяців тому +5

      I think maybe they're trying to get ahead of Google's Gemini models. The first is rumored to be due for release any time now.

  • @danielxmiller
    @danielxmiller 7 місяців тому +37

    This happened to me when I learned my states and capitals: I could tell you the state if you gave me the capital, but if you gave me the state, I, for the life of me, couldn't give you the capital. Looks like it's more like a mind than a machine!

    • @XOPOIIIO
      @XOPOIIIO 7 місяців тому +7

      That's right. Neural networks are not classic machines; the fact that they contain a lot of knowledge doesn't mean that knowledge is easily retrievable. Just like the human mind, they need clues. At the structural level it looks like one neuron contains a piece of knowledge, but there are other neurons closely connected to it by association; activating them activates the target neuron. To activate a thought you need to activate the thoughts that are close to it.

    • @treacherousjslither6920
      @treacherousjslither6920 7 місяців тому

      It seems then that the issue lies with the training methods of information acquisition. One direction is insufficient.

  • @huntercoxpersonal
    @huntercoxpersonal 7 місяців тому +14

    My question is how in the world are we just now discovering that these simple logical deductions are faulty within GPT? Shouldn’t this type of thing be at the heart of how these systems work? Makes everything even more confusing and spooky…

    • @Divergent_Integral
      @Divergent_Integral 7 місяців тому +7

      Imagine a person with a kind of brain damage that leaves their speech and language skills intact but at the same time heavily impairs their faculties of logic. I imagine such a person would be pretty much like GPT.

    • @DJStompZone
      @DJStompZone 7 місяців тому

      It's also an iterative process. That's kind of the point of doing smaller soft launches: it gives them a chance to work out the kinks and have thousands of people putting it through its paces. I imagine within the next week they'll have fixed that, and we'll have some new October version with all new bugs. It's just the way software development goes.

    • @Nikki_the_G
      @Nikki_the_G 7 місяців тому +4

      @@DJStompZone They are just going to "fix" BASIC LOGIC with a patch next month, huh? You guys are unreal.

    • @monkyyy0
      @monkyyy0 7 місяців тому

      This isn't news to the people who are pessimistic that NNs will produce AGI; for the optimists it's a matter of faith that it will all work out, while everyone else has been aware of the "4-horned unicorns" errors.
      Hill climbing finds good results in predictable situations and is incapable of solving predictably hard problems, and NNs are just hill climbers on 10^100000-dimensional hills.

    • @JohnSmith-zk3kd
      @JohnSmith-zk3kd 7 місяців тому

      @@Nikki_the_G Yeah man, it's mad easy.

  • @manielliott9188
    @manielliott9188 7 місяців тому +28

    I think a good definition of intelligence is the ability to make decisions that help to achieve a goal. Even chemical systems, like cells, display intelligence if one assigns a goal to them. Therefore rationality and reasoning is the ability to make good decisions. Good decisions bring the individual closer to their goal.

    • @ChaoticNeutralMatt
      @ChaoticNeutralMatt 7 місяців тому +1

      That's closer to how I think of "smart": the ability to make the best decision based on your situation.

    • @ea_naseer
      @ea_naseer 7 місяців тому

      We already have that: Any ML algorithm trained with reinforcement learning is supposed to maximize some goal X given some input Y.

    • @manielliott9188
      @manielliott9188 7 місяців тому

      @@ea_naseer Yes. That is intelligence. The more intelligent something is the greater its capacity to make good decisions.

    • @KalebPeters99
      @KalebPeters99 7 місяців тому

      Totally, look into the work of Michael Levin for some great experiments in this area!

    • @gamemultiplier1750
      @gamemultiplier1750 7 місяців тому

      Robert Miles defined intelligence in a similar fashion with his videos on AI safety. The orthogonality thesis, iirc.

  • @w1ndache
    @w1ndache 7 місяців тому +7

    Funny thing: as we try to teach LLMs generalised logic and reasoning, looking back at how we ourselves think through puzzles, it's also filled with "cheap" rote learning, memorisation and applying patterns...

    • @maythesciencebewithyou
      @maythesciencebewithyou 7 місяців тому

      A lot of people don't seem to realize how much their logical reasoning skills depend on what they already know. For example, all the math problems most students will encounter during their school years are problems they can solve with the methods they are taught, such as the quadratic formula. Students may feel like they are solving problems when they work on a task, but in reality all the problems we give them are problems that have already been solved and that we've taught them how to solve, not something they came up with by themselves. Most people have never and will never come up with a new method to solve a problem. We make use of our knowledge and apply it.
      Something we've never encountered will leave us puzzled. We'll need to learn about it before we can tackle it, and if it is something nobody has solved before, then we still use what we already know and, on top of that, work by trial and error to gain further information to work with.
      One experiment that was very memorable for me in that regard was one where they let university students solve some matchstick puzzles, where you have to move some matchsticks to get the desired result. Once the students got going, they all seemed clever as they easily solved one puzzle after the next. But it turned out they only managed to do so because all the puzzles so far were of a similar nature, requiring the exact same move, and they didn't even realize it; they had just learned to solve that specific kind of problem. Once the pattern changed and the solution required moving the sticks in a different way, all the students struggled and only a few eventually managed to solve it.
      And not all problems can be solved by reasoning and logic alone. Just because something is logically tight doesn't mean it's true. You can reason your way into believing all sorts of bullshit. You can't know what you don't know, and even what you know may turn out to be BS.

  • @XOPOIIIO
    @XOPOIIIO 7 місяців тому +3

    Basically, ChatGPT can't remember a thing until a clue is given. That's how the human brain works too.

  • @LeonardVolner
    @LeonardVolner 7 місяців тому +40

    Once upon a time, I worked at a plant nursery handwatering plants for many hours every day. It was a task that left lots of freedom to think freely.
    I taught myself the alphabet backwards...
    It's nice to have a job so undemanding that you can think about whatever you want, but the caveat is that your hands aren't free to take notes or research new material. It's somewhat limiting unless you can think creatively about how to use that thinking time productively given the implicit handicaps.

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +9

      Lol, reminds me of when I did a 12 digit by 12 digit multiplication over a couple hours in my head to pass time

    • @ronnetgrazer362
      @ronnetgrazer362 7 місяців тому +2

      So an AI assistant that you could talk things through with and that helps by taking notes and doing research in the background could be great for bored delivery folks.

    • @garethbaus5471
      @garethbaus5471 7 місяців тому +1

      Loading trailers was like that for me. I might be doing over 700 pph, but my mind would be off thinking about power satellites or whatever else I felt like thinking about.

    • @Hexanitrobenzene
      @Hexanitrobenzene 7 місяців тому

      @@aiexplained-official
      12 by 12?! That's 144 single-digit multiplications and a similar number of additions. I would run out of "RAM"... Were you using some special technique? Was the answer correct?
      Once my physics professor showed on a blackboard how they extracted square roots before calculators. I did not understand a thing, and had to look it up and study it closely. The method is quite interesting, like a long division with a divisor that changes (and grows) on every step.
      A few times I tried to do it in my head. It takes half an hour to get to 4-5 digits of precision, and the answer is likely to be incorrect due to some missing step. I noticed that most of the time is wasted repeating the intermediate calculations until I learn the intermediate results by rote...
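      For anyone curious, the digit-by-digit method described above is easy to mechanise; here is a minimal Python sketch of it (the function name and the number of fractional digits are arbitrary choices, not from the comment):

```python
def digit_by_digit_sqrt(n: int, frac_digits: int = 5) -> str:
    """Square root of a positive integer, one digit at a time (the pre-calculator method):
    bring down a pair of digits, then find the largest x with (20*root + x) * x <= remainder."""
    s = str(n)
    if len(s) % 2:                       # pad to whole pairs of digits
        s = "0" + s
    pairs = [int(s[i:i + 2]) for i in range(0, len(s), 2)]
    pairs += [0] * frac_digits           # extra pairs give fractional digits
    root, remainder, out = 0, 0, []
    for p in pairs:
        remainder = remainder * 100 + p  # bring down the next pair
        x = 9
        while (20 * root + x) * x > remainder:
            x -= 1
        remainder -= (20 * root + x) * x
        root = root * 10 + x
        out.append(str(x))
    int_digits = (len(str(n)) + 1) // 2  # digits before the decimal point
    return "".join(out[:int_digits]) + "." + "".join(out[int_digits:])

print(digit_by_digit_sqrt(2))    # 1.41421
print(digit_by_digit_sqrt(10))   # 3.16227 (truncated, not rounded)
```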

    • @chocsise
      @chocsise 7 місяців тому +1

      You could use a voice recorder to capture important observations and reflections. Also listen to some non-visual YouTube videos or other audio content (such as interviews or monologs). Hands-free and eyes-free!

  • @keeganpenney169
    @keeganpenney169 7 місяців тому +5

    The definition part you brought up is an even more brilliant discovery. It basically says we're all in an infinite loop in terms of logic, reasoning and rationality that only ends when said party is satisfied with the return; whether it's correct or not actually doesn't have much to do with anything.
    Bravo, seriously, that's some next-tier human-fallacy discovery right there.

  • @DaveShap
    @DaveShap 7 місяців тому +3

    Hey I'm literally starting on a research paper to define "understanding" in the context of AI.

  • @edwinlundmark
    @edwinlundmark 7 місяців тому +131

    This makes me think... what if we run out of novel ways to test AIs? What if they're just trained on every way we can come up with to test their reasoning...

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +45

      Great question

    • @YouLoveMrFriendly
      @YouLoveMrFriendly 7 місяців тому +38

      That's why the late, great Doug Lenat pushed for combining neural network language models with hard-coded, human-curated knowledge bases, such as his famous Cyc system.
      LLMs shouldn't be treated as a knowledge base; they should be used for what they're good at: interpreting requests and queries and then handing the real "thinking" over to a specialist system.

    • @jeffsteyn7174
      @jeffsteyn7174 7 місяців тому +13

      It's irrelevant what a test says if it replaces you at work.

    • @UserCommenter
      @UserCommenter 7 місяців тому

      Is that similar to asking “what if LLMs can only respond in known words/languages”? We can test pretty much everything it seems, and what we can’t might not be our concern - except maybe we’re limited to replacing human skills and not exceeding them?

    • @Houshalter
      @Houshalter 7 місяців тому +5

      You can always randomly generate hard logic problems like satisfiability problems.
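      A minimal sketch of that idea in Python (the variable and clause counts here are arbitrary choices; the brute-force check is only viable for small instances):

```python
import random

def random_3sat(num_vars: int = 5, num_clauses: int = 12, seed: int = 0):
    """Random 3-SAT instance: each clause is a tuple of literals,
    where +k means variable k and -k means NOT k."""
    rng = random.Random(seed)
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, num_vars + 1), 3))
            for _ in range(num_clauses)]

def satisfiable(clauses, num_vars):
    """Ground-truth check by exhaustive search (fine for small instances)."""
    for assignment in range(2 ** num_vars):
        values = [(assignment >> i) & 1 for i in range(num_vars)]
        if all(any((lit > 0) == bool(values[abs(lit) - 1]) for lit in clause)
               for clause in clauses):
            return True
    return False

clauses = random_3sat()
print(clauses)
print("satisfiable:", satisfiable(clauses, num_vars=5))
```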

  • @oscarmoxon102
    @oscarmoxon102 7 місяців тому +8

    This is one of your most impressive videos yet. Since I started watching AI Explained, I've flipped from an Economics Bachelor's to being a Master's student in Artificial Intelligence at KCL, where this has become my full-time focus -- and I am privileged to witness your journey into becoming an incredible AI researcher along the way. The micro and macro you paint here are truly cutting-edge perspectives. Always blown away by these vids.
    The other day I spoke with Mustafa Suleyman at CogX about recursive self-improvement and multi-agent interaction. Curious what you think about this space and what you think of recent news here.

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +6

      Oh wow that's insane, thank you Oscar. I read his book and all his interviews and he is obviously very much against agency and RSI but I do wonder if he wants Pi to pass his modern turing test, and if so, how that couldn't involve agency...

  • @mikem4405
    @mikem4405 7 місяців тому +2

    I'm impressed that LLMs can cover as much ground as they do. It reminds me of how parts of the human brain can be repurposed to compensate for underperforming regions, like in blindness, stroke, dementia, etc. ChatGPT doing math is like an English major doing math - after the English major had brain damage to the 'math' part of their brain (ok, it's an imperfect analogy).
    I agree that there are different kinds of networks that specialize in distinct tasks, and that these should be put together to maximize the strengths of each one. LLMs seem to work well as an executive region, in part because they are extremely good at evaluating text that has already been generated (like we saw in Reflexion).

  • @5133937
    @5133937 7 місяців тому +1

    I really appreciate your due diligence on these videos, and then listing the papers, twitter threads, YT vids, and other sources that comprise your DD. Hugely educational and helpful. Thanks man. Subbed, with notifications=all.

  • @alansmithee419
    @alansmithee419 7 місяців тому +5

    21:50
    That's not a failure, that's a happy accident.

  • @Pheonix1328
    @Pheonix1328 7 місяців тому +5

    That's what I've been thinking all along. It makes more sense to me to have many smaller, expert AIs working together than trying to cram everything into one giant one. Even our brain has different areas that focus on various things...

  • @simondennis6918
    @simondennis6918 7 місяців тому +2

    The inconsistencies in the way that LLMs recall information remind me of similar inconsistencies in human memory, in particular recognition failure of recallable words. Even when someone is able to recall a word given a word with which it was paired in a study list, they often fail to recognise the word as having occurred in the study episode. This observation was the foundation of Tulving's cue-dependent memory hypothesis, which is taken as given in all theories of memory now.
    Watkins, M. J., & Tulving, E. (1975). Episodic memory: When recognition fails. Journal of Experimental Psychology: General, 104(1), 5-29
    p.s. Tulving passed away earlier this month. While I would contest much of what he wrote later in his career, there is no disputing the foundational contributions he made to the memory literature.

  • @vladyslavkorenyak872
    @vladyslavkorenyak872 7 місяців тому +3

    Reasoning is an iterative process, so for any hope of these models doing multiplication you need them to iterate internally. Our brain does it automatically since it's an analog asynchronous machine, but these models need something more. We might get better results by making a master/student dual model, where the student model tries to solve the task and the master is always asking if it is done right and making suggestions. An AI with internal conversation!
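    A minimal sketch of that master/student loop, assuming a hypothetical chat(system, message) helper that wraps whatever LLM API is available (the helper name and the APPROVED convention are illustrative, not from the video):

```python
def solve_with_master(task: str, chat, max_rounds: int = 4) -> str:
    """Student proposes an answer; master critiques it; repeat until the master approves."""
    answer = chat(system="You are the student. Solve the task step by step.",
                  message=task)
    for _ in range(max_rounds):
        critique = chat(system=("You are the master. Check the student's work. "
                                "Reply APPROVED if it is correct, otherwise list the errors."),
                        message=f"Task: {task}\nStudent answer: {answer}")
        if "APPROVED" in critique:
            break
        answer = chat(system="You are the student. Revise your answer using the feedback.",
                      message=f"Task: {task}\nPrevious answer: {answer}\nFeedback: {critique}")
    return answer
```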

  • @geldverdienenmitgeld2663
    @geldverdienenmitgeld2663 7 місяців тому +29

    It would be interesting to see what happens if an LLM were trained to predict the previous word as well as the next word.
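    One cheap way to experiment with that is to augment the training data with reversed copies of each example (a rough proxy for true bidirectional training); a minimal sketch, using whitespace tokenisation purely for illustration:

```python
def add_reversed_copies(corpus: list[str]) -> list[str]:
    """Return the corpus plus a word-reversed copy of every document,
    so the model also sees each fact with its direction flipped."""
    reversed_docs = [" ".join(reversed(doc.split())) for doc in corpus]
    return corpus + reversed_docs

corpus = ["Tom Cruise's mother is Mary Lee Pfeiffer"]
print(add_reversed_copies(corpus)[-1])
# -> Pfeiffer Lee Mary is mother Cruise's Tom
```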

    • @RPHelpingHand
      @RPHelpingHand 7 місяців тому +4

      Seems like the most logical next step.

    • @generativeresearch
      @generativeresearch 7 місяців тому +6

      That won't happen as entropy only goes forward in time

    • @DicksonCiderTonight
      @DicksonCiderTonight 7 місяців тому

      It will finally be able to write good jokes!

    • @casenswartz7278
      @casenswartz7278 7 місяців тому +1

      I've made my own small custom LLMs from scratch before, algorithm and all. I bet you could just create another LLM and have it train on the same dataset but with the data reversed. I wouldn't train the same AI to go both forwards and backwards, because it might have trouble separating which is which and would likely output a bunch of garbage. However, I bet training a third AI on the textual relationships between three words (the current token, the previous token, and the predicted token) to then possibly adjust the prediction could be useful, since it might only be a very small extra layer.

    • @BTFranklin
      @BTFranklin 7 місяців тому

      That's what people seem to be expecting out of this calculator-like behavior.

  • @Tiky.8192
    @Tiky.8192 7 місяців тому +5

    Another thing that shows the patchy behavior of LLMs is asking a question in different languages. You don't get the same answers. A good example is recipe ideas for a list of ingredients. Translate it word for word and ask the LLM. You'll get completely different genres of recipes.

    • @Sashazur
      @Sashazur 7 місяців тому

      That result is actually what you would expect from real humans in the real world. Typically you wouldn’t try to make a cheeseburger in China!

    • @Tiky.8192
      @Tiky.8192 7 місяців тому +1

      @@Sashazur Makes sense in a human world but for an AI, it does create quite a big issue in which some knowledge might be hidden behind other languages. Some paths might only be accessible in one language a bit like a search engine.

    • @andersberg756
      @andersberg756 7 місяців тому

      Very interesting topic - ChatGPT can translate, but the internal representations are probably in part "one world model" of concepts and relationships, yet quite often instead connected to a particular language, or at least a language group where concepts are expressed similarly.
      There's the concept of vector embeddings - my understanding is that they're the numbers the model uses to express the context and meaning of the text. It'd be interesting if someone had investigated the vector embeddings for translated text - are they in general similar? Does it differ depending on the topic or type of concepts used, i.e. physical stuff similar across languages but tone of writing differing? Does anyone know of such research? (A rough way to poke at this yourself is sketched below.)
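      A small sketch of that kind of probe, assuming the sentence-transformers package is available (the specific multilingual model name is just one example, and the sentence pairs are made up):

```python
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
import numpy as np

# Any multilingual sentence-embedding model will do; this name is just one example.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

pairs = [
    ("List some recipes using eggs, flour and milk.",
     "Nenne einige Rezepte mit Eiern, Mehl und Milch."),
    ("The tone of this letter is very formal.",
     "Der Ton dieses Briefes ist sehr förmlich."),
]

for a, b in pairs:
    va, vb = model.encode([a, b])
    cos = float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))
    print(f"cosine similarity {cos:.3f}: {a!r} <-> {b!r}")
```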

  • @Arnaz87
    @Arnaz87 7 місяців тому +4

    Sir your commentary on the topics is remarkably insightful and valuable, and we're lucky to have it exist.

  • @billykotsos4642
    @billykotsos4642 7 місяців тому +11

    AGI is close !.... fails at basic logic.... lol

  • @Zhizk
    @Zhizk 7 місяців тому +4

    Weird week indeed! Thank you, and I hope the news just keeps coming

  • @adamsvette
    @adamsvette 7 місяців тому +7

    I would love to see a DALL-E photo dictionary where every word in the English dictionary generates four or five images, so we can just see what these models think words mean/look like.
    You could do just the word, or the word plus its entire definition as a prompt.

  • @panfilolivia
    @panfilolivia 5 місяців тому +1

    Great video, that's all I can say. It's hard to find stuff of this quality on YouTube these days. Loved the studies you talked about; after watching I just had to read them fully.

  • @tehlaser
    @tehlaser 7 місяців тому +1

    The fact that humans can do this is actually kind of amazing. By “this” I don’t mean realize that A is B implies B is A. That’s just logic, and GPT seems to be able to do that too, so long as you activate A and B in the same context.
    The amazing bit is that, when we humans learn that A is B (or even that A is similar to B) we form associations in both directions. Thinking about A suggests B, and thinking of B suggests A. Most animals don’t form that reverse association. Humans do.

  • @harrymapodile
    @harrymapodile 7 місяців тому +4

    Great vid! DeepMind also released Gato last year, which was sort of a Swiss Army knife model; it combined all the various types of architectures you mentioned. Perhaps there's a chance for AGI soon haha

  • @ryanpmcguire
    @ryanpmcguire 7 місяців тому +5

    What about using two LLMs that have a conversation about the answer?

  • @MunirJojoVerge
    @MunirJojoVerge 7 місяців тому +1

    Wild times indeed!! My tests with MS Autogen are really amazing, to say the least!
    As usual, thank you so much for your work!

  • @mahmga1
    @mahmga1 7 місяців тому +2

    Phenomenal episode, each subtopic could've been a long video - All highly thought provoking. I think you summarized best with the question "why doesn't the LLM just create a model on the fly" - That seems to be the most direct path. I'd definitely like to see an experiment with that approach taken. I think of it along the lines of model-inception.

  • @samhblackmore
    @samhblackmore 7 місяців тому +3

    Predicting the next word can take you to some amazing places. Just not backwards.... what a great quote!

  • @linkup2345
    @linkup2345 7 місяців тому +7

    Great video. Thank you for your effort in putting all this together 🙏🏾

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +2

      Thanks linkup

    • @MikAnimal
      @MikAnimal 7 місяців тому

      Putting what together ? Useless clickbait? His logic is as poor as gpt v 😂

    • @linkup2345
      @linkup2345 7 місяців тому +1

      Thank you hater. We would be nothing without you.

    • @aiexplained-official
      @aiexplained-official  7 місяців тому

      What was up with my logic Mik?

    • @MikAnimal
      @MikAnimal 7 місяців тому

      @@aiexplained-official The logic of thinking that clickbait, which could spread misinformation about the version of GPT actually available, is good for helping spread good information 🤙🏽
      I mean, Linus probably thought he was doing good too till that Gamers Nexus video said otherwise.
      In a world where academic dishonesty, media dishonesty, and governments trying to stop the spread of all information by calling it misinformation are already problems, we don't need sloppy, lazy or greedy actions making things worse.
      How bout that 👀🤙🏽

  • @sebby007
    @sebby007 7 місяців тому

    Great video once again! I love that you are getting in contact with people in the field. It feels like you are bringing me closer to the people and minds that are driving towards AGI.
    So how do you teach an LLM the meaning of the concept of intuition? When I look into my own mind there seems to be an ever-changing team of engines driving the whole. Is what LLMs are missing a pretty good definition of consciousness? How about combining LLMs into a communicating network with one being the master and seeing how that behaves?

  • @justwest
    @justwest 7 місяців тому +1

    always fascinated by your videos - thank you so much for the interesting updates!

  • @hanskraut2018
    @hanskraut2018 7 місяців тому +4

    I agree with so much and you made some great points!
    Good understanding of the definition things.
    Good insight on the memory thing.
    -Beautiful/nice pictures (although listened to a bunch)
    -great point about the borderline abundance/improvements of many big challenges/tragic situations and the public would be going “nope” 😄
    Nice i like ur tone of voice and ur calm, inquisitive, interest and slightly infectious/pleasurable passion. 🎥
    Have a nice day. :)

  • @Sirbikingviking
    @Sirbikingviking 7 місяців тому +6

    A lot of this strange behavior is probably due to the fact that the LLM is an autoregressive model. Once it starts writing a response, it is sort of statistically committed to what it's saying. Also, not a lot of people online are likely to ask about Tom Cruise's mom without mentioning him first, so its statistical training may not be able to refer to them in reverse order very easily. Also, when you replace a word with X, this is a method used to train LLMs where they sort of fill in the blank, so they're really good at doing that.

    • @benprytherchstats7702
      @benprytherchstats7702 7 місяців тому +4

      I agree. I've managed to get GPT-4 to dig itself into some pretty surreal holes this way, for instance by asking for the number of permutations possible in some set of letters, subject to some constraint, and then, when it gets it wrong, asking it to write out each permutation. If it says there are 10 permutations when really there are 6, it will stick to its first answer by making up 4 more wrong examples. And then when you ask it about those wrong examples it will deny that they're wrong. I got it to insist that the second letter in "BBC" is "C" using this method - which obviously it won't get wrong if you just ask straight up "what's the second letter in BBC?"
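      For anyone who wants to check those claims outside the model, a tiny sketch (the three-letter string is just an illustration):

```python
from itertools import permutations

letters = "BBC"
# Distinct orderings (the set removes duplicates caused by the repeated B).
distinct = sorted({"".join(p) for p in permutations(letters)})
print(len(distinct), distinct)                        # 3 ['BBC', 'BCB', 'CBB']
print("second letter of", letters, "is", letters[1])  # B
```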

    • @cartour8425
      @cartour8425 7 місяців тому

      I believe ‘a pure version’ or previous version would answer it correctly. Seems now the site terms of use are messing up the logo algo. Maybe need to specify how it scrapes data in prompt

  • @HarpaAI
    @HarpaAI 7 місяців тому +1

    🎯 Key Takeaways for quick navigation:
    00:00 🤖 Introduction and Overview
    - Investigating the puzzling nature of GPT models.
    - Exploring GPT's ability to reason and its limitations.
    - Introduction to GPT Vision and new capabilities.
    01:29 🧠 Challenges in Logical Deduction
    - GPT's struggle with basic logical deduction.
    - Failure to generalize patterns from training data.
    - Examples of GPT's inability to connect related information.
    04:43 🤯 Quirks in Knowledge Retrieval
    - GPT's inconsistency in retrieving knowledge.
    - Demonstrations of information retrieval failures.
    - The role of training data and Wikipedia in GPT's responses.
    07:05 🧩 The Input-Output Asymmetry
    - Explaining the asymmetry between input and output for GPT models.
    - How LLMs handle information going from input to output.
    - The model's ability to deduce information within a single context window.
    08:13 ♟️ GPT 3.5's Chess Abilities
    - GPT 3.5's performance in playing chess.
    - Discussion on whether it has memorized all possible chess positions.
    - An exploration of GPT's chess capabilities.
    11:26 🧮 Reasoning and Counterfactual Tasks
    - The concept of counterfactual tasks in evaluating GPT's reasoning.
    - The role of memory and logic in handling counterfactual questions.
    - Discussion on the capabilities and limitations of GPT in reasoning.
    13:20 🔢 Mathematics and Complex Reasoning
    - GPT's performance in solving mathematical problems.
    - The challenge of achieving 100% accuracy in complex reasoning tasks.
    - The distinction between memory-based and logic-based reasoning.
    15:14 🌐 Future Directions and AGI
    - The ongoing research into injecting pure logic and reasoning into models.
    - Discussion on AGI timelines and the evolving AI landscape.
    - The potential for combining language models with specialized AI systems.
    17:46 💡 The Path Forward and Investment
    - The continuous advancement of AI research and investment.
    - Insights from AI experts on reasoning and AGI.
    - The role of AI companies in pushing the boundaries of technology.
    21:46 🎨 GPT's Creative Abilities
    - GPT's ability to generate artistic and textual content.
    - A comparison of GPT's creative outputs using DALL·E 3 and Midjourney.
    - Exploring the potential for using reasoning and logic prompts with DALL·E 3.

  • @rasen84
    @rasen84 7 місяців тому +2

    OK, then the obvious next step is to discourage over-reliance on memorization. Like RAG: do retrieval-augmented pretraining and keep the retrieval set limited to only the previously trained tokens.

  • @chasebrower7816
    @chasebrower7816 7 місяців тому +3

    To me the logic failure doesn't seem surprising - these LLMs have been shown to rely on grammar structures (hence they're easy to 'fool' by using grammar or syntax that makes a wrong answer seem more linguistically likely), and so their recall might break down when probed in a logical manner. Following logic like this would require more organized recall and probably outright intelligence, which we know GPT-4 only has in negligible amounts, if any at all. This barrier is likely only to be surmounted with a more explicit memory mechanism, or enough intelligence to store information in a logical manner.

  • @Bodofooko
    @Bodofooko 7 місяців тому +5

    Am I being too generous with the DALL-E horse picture in thinking that it's actually a pretty creative approach to getting a horse to drink water from a water bottle? Water bottles are made for humans to use and are not easily used by horses, so DALL-E has the water from the bottle go into a container that the horse can drink out of. The water is still technically from a bottle. The Midjourney one looks more like it's just sniffing the bottle. Maybe if there was a clear straw or something, but otherwise I don't think it fulfills the prompt request as well.

    • @thenoblerot
      @thenoblerot 7 місяців тому +2

      I had the same thought. Even if it was absurd, it 'feels' reasoned to me. I think DALL-E 3 is likely the multimodal aspect of GPT-4, given how they're rolling it out in the ChatGPT interface. I think evidence of this is the consistency with which DALL-E 3 rendered Larry the hedgehog in the OpenAI demo video. As though GPT-4 had its own full realization of how Larry looks and then stuck with it across multiple prompts. Today's Midjourney could never.

    • @realismschism
      @realismschism 7 місяців тому +2

      Agreed. Very impressive, IMO. Not only is the horse drinking from the bottle but it's doing so by its own action, by tipping the bottle forward. It's a clever, if surreal, solution that shows creativity and spatial reasoning. I don't know how many humans would come up with this.

  • @c10ud17
    @c10ud17 7 місяців тому +2

    You're the first AI channel I've seen reach out to the researchers and get interviews integrated into the content. It's so cool to hear from the people at the forefront of AI dev! Fantastic content dude

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +1

      Thanks so much Cloud, means a lot, yes will be reaching out much more

  • @Sekhmet6697
    @Sekhmet6697 7 місяців тому +2

    So we know that LLMs don't have a purpose-built algorithmic internal set of rules for formal logic used to verify the validity of a statement, e.g. "if a=b, does b=a?". The LLM may or MAY NOT be able to generalize a set of logic rules that fit the given context, so its answer may or may not be valid.

  • @felixgraphx
    @felixgraphx 7 місяців тому +3

    The language-pattern part of your brain is not the logic part, and GPT is only a large frontal lobe for language patterns, not logic. In the future, AI modules will assemble different parts, some LLMs and some not, to perform actual logic and, later, conscience. But right now many people do not understand that and are surprised that GPT is better at language-pattern memory, giving good results, yet has no reverse logic based on those language results.

  • @bujin5455
    @bujin5455 7 місяців тому +18

    I wonder if it's not possible to have an AI recognize these inversion relationships in the training corpus, so that it can then augment the training corpus to demonstrate the bidirectional nature of things. Then work on specifically pushing the target AI to recognize inverted relationships with out of band examples. I wonder if this sort of effort would lead to a new level of emergent behavior in these LLMs.

    • @antonystringfellow5152
      @antonystringfellow5152 7 місяців тому +2

      Interesting possibility but when considering these problems/limitations, I always try to work out how we humans achieve this. Of course, I don't really have those answers but I do have ideas.
      As someone who is teaching my language in a country where I'm also learning the language of my students, I've come to realize how much the human brain depends on associations. It tries to link any new information with existing memory, often in multiple ways with multiple areas of existing memory (via many paths linking data). I suspect our brains are configured in such a way that this happens with all new information, not just language - it tries to form these links, even if it means them being very tenuous.
      Maybe this is what enables us to take things we learn in one area and apply some of them in other areas.
      Maybe what's needed for AGI isn't bigger language models but an architecture that works in a similar way.

    • @tiefensucht
      @tiefensucht 7 місяців тому

      The thing is that current AI is more like a search engine. You have to implement logic and association as separate modules which can talk to each other. ChatGPT is like top-tier autism.

    • @cacogenicist
      @cacogenicist 7 місяців тому

      I found that Claude 2 passed the son-mother/mother-son inversion examples provided. It did suffer the same confusion about Huglo.

  • @albemala
    @albemala 7 місяців тому +1

    After watching your video, I ran 2 experiments.
    1) Using this prompt with GPT-4, I got the right answer:
    "While keeping in mind the definition of equality (if A = B, then B = A), and based on your knowledge, what are the
    - municipality
    - County
    - size
    Of Huglo, Norway"
    Answer:
    "Based on my knowledge as of January 2022:
    - **Municipality**: Huglo is an island in the municipality of Stord.
    - **County**: Huglo is located in Vestland county.
    - **Size**: Huglo covers an area of about 13.6 square kilometers."
    Not perfect, because it was a very specific request, but better than "I know nothing about Huglo"
    2) I asked GPT-4 for the rules to sum 2 numbers, then I gave it a list of numbers, 2 by 2, to sum while following the rules. The results were always correct.
    My conclusion is that, like us humans, LLMs might just need some guidance when prompted for logical and reasoning tasks? I'm not an expert, but I'm fascinated by the topic

    • @aiexplained-official
      @aiexplained-official  7 місяців тому

      Any thoughts on how it couldn't do it in reverse though?

    • @albemala
      @albemala 7 місяців тому

      @@aiexplained-official as mentioned in the video, it might be because of the way LLMs work, so a limitation of the architecture. Or it might be an emergent behaviour, not fully "activated" yet. Or that we should change the way we train them. But I don't think LLMs will reach AGI alone, there are missing pieces like RT and/or something else. I'll keep experimenting though.

  • @therealOXOC
    @therealOXOC 7 місяців тому +1

    Has someone tried uploading a picture and requesting a recreation via DALL-E 3? I wonder how close it comes. Great vid as always. Really like the little interviews.

  • @elitegamer3693
    @elitegamer3693 7 місяців тому +9

    There is also a possibility that with enough scaling, we will get pure logic as an emergent ability.

    • @therainman7777
      @therainman7777 7 місяців тому

      I was thinking the same thing. It may emerge with sufficient scale, or it may turn out that next-token prediction via transformers is fundamentally incapable of reasoning in the reverse direction. Will be very interesting to find out.

    • @adamwrobel3609
      @adamwrobel3609 7 місяців тому +2

      Toddlers learn logic before they learn language

    • @therainman7777
      @therainman7777 7 місяців тому

      @@adamwrobel3609 You would have to have a very unorthodox definition of “logic” for your claim to be true. By most definitions of logic, something like language is actually a precursor to having logic.
      For example, the classic logical syllogism, “All men are mortal. Socrates is a man. Therefore, Socrates is mortal.” How exactly would you understand this concept without language? How could you even express it without language, such that a person who doesn’t have language (such as a toddler) could possibly grasp its meaning?
      Also, you are making the (very common) mistake of anthropomorphizing AI. Even if your claim (that toddlers learn logic before they learn language) were true, that in no way means that LLMs, or any other type of AI system, would learn those two things in the same order. An LLM is not a human brain, and gradient descent (the mechanism by which LLMs and other neural networks learn) is not the way that humans learn. So there’s zero reason to expect that these abilities would be acquired in the same order. Let alone to be certain of it.

    • @chillin5703
      @chillin5703 7 місяців тому +1

      I really doubt it. ChatGPT (or any other language model) is not meant to engage in logic, and literally isn't designed for it. Adding more data to ChatGPT might make it outwardly appear like it's engaging in logic in a wider variety of situations, but in reality, all it would actually be doing is having a more detailed reference for how people string together words, in a larger variety of situations. It wouldn't change the fundamental trait, which is that it is looking at how humans string words together, not how they consider those words or what thought processes underpin them, only the means of communicating it all. Here's one example: ChatGPT will often fail simple logic tests UNTIL TOLD _explicitly_ it is being given a logic test. What this demonstrates isn't ChatGPT's capacity to reason being activated once it is prompted, it's ChatGPT's ability to draw from a different set of reference data when directed. We know this because, especially once we move beyond simple and common test formulations for which we can presume it has many references for word pairing, and give it more unique ones for which it has fewer references, it fumbles even when told it is doing logic tests. ChatGPT: it's a LANGUAGE MODEL. It's meant to reflect how we use LANGUAGE. All you can do by giving it more data is allow it to reflect a wider range of language use.

    • @chillin5703
      @chillin5703 7 місяців тому

      Update: I wrote this comment as I was watching the video. Looks like the video maker points out the same thing.

  • @guy8203
    @guy8203 7 місяців тому +4

    Just about everyone I've ever talked to about superintelligence has been concerned about it taking their jobs without the understanding that all jobs being gone isn't a bad thing. I think our cultural narrative about AI is mostly drawn from dystopian sci fi because utopia isn't really a story. "Nothing goes wrong and everybody's happy forever" doesn't make for the most compelling narrative.
    Hopefully the general population will eventually come to understand that superintelligence means infinite resources and jobs will no longer be necessary.

    • @DeusExtra
      @DeusExtra 7 місяців тому

      God when people say stuff like that it really gets on my nerves. It's so closed minded. I saw a comment where this dude was mocking someone for wanting to not have to worry about basic survival needs because AI, etc could potentially be a support structure for humanity, saying something like, "What would we do then? Sit around all day?" If we consider the fraction (or totality) of work that people do to cover just having basic needs met, I don't think most people would say they can't think of a better use for that time.

  • @thearchitect5405
    @thearchitect5405 7 місяців тому +1

    Small correction: there are more possible chess moves than atoms in the OBSERVABLE universe. The overall universe is predicted to be significantly larger than the observable universe.

  • @Ecthelion3918
    @Ecthelion3918 7 місяців тому +2

    Always excited to watch a video from you

  • @ZokRs
    @ZokRs 7 місяців тому +6

    Will you also cover ai robotics like the recent Tesla Optimus news?

    • @DJStompZone
      @DJStompZone 7 місяців тому +1

      Maybe check out Two Minute Papers, it's another really good academic AI news channel and he covers a lot of stuff like that 👍

  • @MasonPayne
    @MasonPayne 7 місяців тому +4

    I wonder if you could simply prompt the LLM to generate code to help it answer logical questions. I bet it could at least predict what code is needed to process the logic, which, if run, would give you the correct answer.
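    A minimal sketch of that program-aided idea, assuming a hypothetical complete(prompt) helper that returns the model's text (running model-generated code like this is unsafe outside a sandbox):

```python
def answer_with_code(question: str, complete) -> str:
    """Ask the model for Python that computes the answer, then run that code locally."""
    prompt = (
        "Write a short Python program that computes the answer to the question below "
        "and stores it in a variable named result. Reply with only the code.\n\n"
        f"Question: {question}"
    )
    reply = complete(prompt)
    # Drop any code-fence lines the model may have wrapped around its program.
    code = "\n".join(line for line in reply.splitlines()
                     if not line.strip().startswith("`"))
    namespace: dict = {}
    exec(code, namespace)  # run inside a proper sandbox in real use
    return str(namespace.get("result"))
```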

  • @mshonle
    @mshonle 7 місяців тому +1

    A note from an academic in the States: If you say “BU,” it’s understood you mean Boston University. If you say “Boston” people won’t register that as any school- “do you mean Harvard?” (And there’s a Boston College, so we’d say BU and BC. However, saying “BC” in non Boston contexts would be ambiguous- one could just as well mean British Columbia.)
    However, as you’d say, cheers mate and keep up with the great videos!

  • @JimMendenhall
    @JimMendenhall 7 місяців тому +2

    Very good insight. Thanks!

  • @shawnvandever3917
    @shawnvandever3917 7 місяців тому +3

    I think it has to do with being feed-forward only and not being able to loop back. In long-digit problems it can only carry the information forward and cannot go back and deduce. I know our brains go back and forth constantly. While I know this doesn't explain A-to-B vs. B-to-A, I think it does have a lot to do with complex reasoning.

  • @AIWRLDOFFICIAL
    @AIWRLDOFFICIAL 7 місяців тому +9

    YES ANOTHER AI EXPLAINED VIDEO THANK YOU

  • @mantriukas
    @mantriukas 7 місяців тому +1

    Really interesting example with the chess pieces mixed up. Good way to see whether it's reasoning or 'using' basic logic

  • @HALT_WHO_GOES_THERE
    @HALT_WHO_GOES_THERE 7 місяців тому +2

    I think that the discrepancy between straight-up logical reasoning and "foggy" reasoning like chess is actually mirrored in humans. I think that humans are better chess players than they are principled logical reasoners, and that the parallel arising in LLMs is a more natural byproduct of emergent reasoning than people think. I also wonder if most humans would be able to guess certain names of celebrities' parents, but not be able to give the celebrity for whom a certain person is a parent. That sort of compartmentalized double standard of recall seems like something human-like.

  • @roqstone3752
    @roqstone3752 7 місяців тому +7

    In its current iteration ChatGPT is a search engine more than a logic engine

    • @andersberg756
      @andersberg756 7 місяців тому

      Nah, it's a patchy, super irregular thinker to me. Like, it can figure out where I'm aiming with some code, giving me feedback on where I thought wrong with respect to a given problem. It does indeed have an understanding of a lot of concepts, relations, methods etc. All that stuff was beneficial to build up in order to model our writing, so it learned it, seemingly at random. It's so fascinating, but hard to get an intuition of what it really knows.

  • @SuperAnimationer
    @SuperAnimationer 7 місяців тому +6

    I said this before and I will say this again, I love your videos :) You are very hardworking and I respect that.

  • @matthewcurry3565
    @matthewcurry3565 7 місяців тому +1

    Good work, and good updates. They've got to give it something like self-reflection to work backwards. Then it'll just freeze, depending on the model, how deep it is allowed to go, and what its data set is trying to train. I'm speaking more of an AGI rather than the smaller models doing specific jobs.

  • @stephenrodwell
    @stephenrodwell 7 місяців тому +1

    Thanks for the fantastic content! 🙏🏼

  • @TheTwober
    @TheTwober 7 місяців тому +4

    The difficulty comes from the LLM learning that A *implies* B, not that A *equals* B.
    Input A implies that the answer should be B; answer B therefore has, in the mind of the LLM, no logical connection back to A. In our minds things work differently, as we do bidirectional association, which then also comes with other issues, like constantly seeing connections where there are none.

  • @david.ricardo
    @david.ricardo 7 місяців тому +3

    When I was studying formal logic at university, I also stumbled into the circular definition of reason, rationality and logic. But here is a way to think about it that gets you out of that:
    Reason is to think rationally, rationality is to think by following logic, and logic is a set of rules for correct reasoning. Hence we can say that to reason is to arrive at conclusions by following the rules of logic.
    You may be wondering about the origin or validity of these rules of logic, or the “laws of thought” as Boole put it in his book on logic, but that's a philosophical task.

  • @brll5733
    @brll5733 7 місяців тому +1

    One of your best videos yet, I think. Very informative and clear.
    My strong intuition remains that LLMs need some form of latent memory (which they can access and process over multiple cycles before producing a result) to get them to reason: to be able to create variables and think about them.

  • @gabrote42
    @gabrote42 7 місяців тому +1

    16:02 To reason is to scrutinize extensive portions of the information available about a given environment/task/prompt and to make choices based on them that try to accomplish an objective. Good reasoning is generally characterized by compensating for biases, requiring attention, using analysis methods, and generalizing well.

  • @benjaminli9793
    @benjaminli9793 7 місяців тому +6

    Just from a surface-level perspective, is it really problematic if the model cannot automatically deduce this connection? Wouldn't assuming that A->B implies B->A lead to logical fallacies?
    For example, if it is raining, it is wet. But if it is wet, it is not necessarily raining.

    • @Not_Even_Wrong
      @Not_Even_Wrong 7 місяців тому

      No, you're talking about implication; the paper is concerned with equality (identity), and equality is symmetric whereas implication is not.
      So yes, this is a problem.
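      To make the difference concrete, here's a toy illustration (just my own sketch, using the paper's famous example; nothing here is from the paper's code): a forward-only mapping answers A→B, but B→A only exists if you build the inverse index yourself.

      ```python
      # Forward-only store, analogous to how the fact appears in training text.
      mother_of = {"Tom Cruise": "Mary Lee Pfeiffer"}

      print(mother_of.get("Tom Cruise"))         # -> "Mary Lee Pfeiffer"
      print(mother_of.get("Mary Lee Pfeiffer"))  # -> None: the reverse direction was never stored

      # The reverse lookup only exists once we build the inverse index ourselves.
      child_of = {parent: child for child, parent in mother_of.items()}
      print(child_of.get("Mary Lee Pfeiffer"))   # -> "Tom Cruise"
      ```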

    • @AM-qk5bt
      @AM-qk5bt 7 місяців тому

      I was wondering about GPT's ability to deal with modus tollens/deduction as well while watching this; I have the impression it's more difficult than expected.

    • @KyriosHeptagrammaton
      @KyriosHeptagrammaton 7 місяців тому

      The AGI is so advanced we think it's broken haha

  • @fynnjackson2298
    @fynnjackson2298 7 місяців тому +3

    What a time to be alive!! I agree, it's pretty surreal: on the one hand there are billions being invested and companies are going all in on AI, but at the same time people are like, hmm, maybe we should regulate.

  • @ehza
    @ehza 7 місяців тому +1

    Good work Phillip!

  • @codycast
    @codycast 7 місяців тому +1

    2:25 Boggle… I’m lying on my couch watching Suits while this video plays on my phone (only half paying attention to both).
    That actor is underrated!

  • @jaysonp9426
    @jaysonp9426 7 місяців тому +5

    This would be the equivalent of humans solving the problem with their first thought. Think about how many thoughts you need to actually get a correct answer at anything... When I build autonomous agents, zero-shot is always crap. It's why a society-of-minds approach is necessary.

  • @Sam_0108
    @Sam_0108 7 місяців тому +4

    Ain’t no way

    • @Sam_0108
      @Sam_0108 7 місяців тому +2

      Watching the video makes me feel that technology is advancing faster than I thought lol

  • @MobyAnalytics
    @MobyAnalytics 7 місяців тому +1

    Thank you for the amazing content! It saves so much time when following the latest developments in AI.

  • @garronfish8227
    @garronfish8227 7 місяців тому +1

    The fact that the LLM structure is so simple is a huge benefit: input some words and get the next word. Just bolting on a "maths processor" would mess up this structure. Ideally there will be a new, similarly simple structure, like an LLM, that also determines the logic associated with a given input.

  • @Totiius
    @Totiius 7 місяців тому +1

    Amazing video!! Thank you a lot!

  • @itemushmush
    @itemushmush 7 місяців тому +1

    amazing. thanks for this content

  • @ChaseFreedomMusician
    @ChaseFreedomMusician 7 місяців тому +1

    This fits with some other work on transformers in general showing that they fail to generalize rotation- and translation-invariant concepts, regardless of whether the data is textual or otherwise. It makes me wonder if LM-Infinite and some of the other context-lengthening work will automatically correct this behaviour, as it tends to make the tokens more positionally invariant.

  • @jay_sensz
    @jay_sensz 7 місяців тому +2

    You could probably instruct a fairly simple LLM to identify reversible facts/concepts in the training data, generate alternative phrasings, and add those to the training data.
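    A rough sketch of what that augmentation could look like (ask_llm() here is a hypothetical stand-in for any LLM API call, not a real library function):

    ```python
    def augment_with_reversals(training_texts, ask_llm):
        """For each text, ask a model to spot reversible facts ("A is B") and
        restate them in the other direction ("B is A"), then append those
        paraphrases to the training data."""
        augmented = list(training_texts)
        for text in training_texts:
            prompt = (
                "List every fact of the form 'A is B' in the text below, then "
                "restate each one in the reverse direction ('B is A'), one "
                "statement per line:\n" + text
            )
            reversals = ask_llm(prompt).splitlines()
            augmented.extend(line.strip() for line in reversals if line.strip())
        return augmented
    ```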

  • @nathanaeltrimm2720
    @nathanaeltrimm2720 7 місяців тому +1

    Q: “Is a square a rectangle?”
    A: “Yes.”
    Q: “Is a rectangle a square?”
    A: “I don’t know.”
    It makes sense that it doesn’t auto-assume based on past info; it just needs more direction on which A==B’s also hold as B==A and which do not.

  • @benjamineidam
    @benjamineidam 7 місяців тому +1

    What are your AGI timelines? Is there a video where you say more about them? Great vid as always!

    • @aiexplained-official
      @aiexplained-official  7 місяців тому +1

      I should make one. Within the 2020s, best guess 2028

    • @benjamineidam
      @benjamineidam 7 місяців тому

      @@aiexplained-official Why's that? And what do you mean by "AGI"? And do you think ASI will follow right after it? Thanks for your answers! :]

  • @Boyanspookclaw
    @Boyanspookclaw 7 місяців тому +1

    lovely final image from Midjourney

  • @rosscads
    @rosscads 7 місяців тому +1

    The voice capability in ChatGPT is a bigger deal than people might think. Having released a similar hands-free voice capability for Pi built on the same OpenAI technology and just days ahead of the ChatGPT-V announcement, I'm convinced that it has the potential to revolutionise how and where people interact with conversational assistants.

  • @Yottenburgen
    @Yottenburgen 7 місяців тому +1

    I wonder what would happen if you chained it a bit by making it explore the inverse of the question, reverse the question, and so on, to try to extract as many of the right adjacent tokens as possible. This, however, would be achieved outside the model, so it doesn't exactly fix the single-component issue.
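    A quick sketch of how that chaining might look from outside the model (ask_llm() is a hypothetical stand-in for a chat-completion call, and the rephrasing prompt is only illustrative):

    ```python
    def ask_both_directions(question, ask_llm):
        """Ask the original question plus a reversed phrasing of it, then have
        the model reconcile the two answers. This all happens outside the model,
        so it works around the reversal issue rather than fixing it."""
        reversed_q = ask_llm(
            "Rewrite this question so the same relationship is asked from the "
            "other direction, keeping the same known entity (e.g. 'Who is Mary "
            "Lee Pfeiffer's son?' -> 'Which famous actor has Mary Lee Pfeiffer "
            "as their mother?'):\n" + question
        )
        answers = [ask_llm(q) for q in (question, reversed_q)]
        return ask_llm(
            "Here are two answers to the same underlying question. State the "
            "single most consistent final answer:\n" + "\n".join(answers)
        )
    ```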

  • @lichifang632
    @lichifang632 7 місяців тому

    Fascinating video!!!! (what's your chess rating (purely out of curiosity)?

    • @aiexplained-official
      @aiexplained-official  7 місяців тому

      On chess.com around 1900, but I have never studied chess properly or memorised openings

  • @jekkleegrace
    @jekkleegrace 7 місяців тому +1

    GPT 4 thinks that reasoning is more flexible and adaptable to different situations, while logic is more rigid and structured.

  • @harleykf1
    @harleykf1 7 місяців тому +2

    I'm shocked by how strong ParrotChess is. It's by no means perfect, but it's strong enough to beat me every time, at least in faster time controls. I just assumed that it would be a minor improvement on ChatGPT, which struggles to even play legal moves half of the time.
    I'm still a while away from reaching 1700-1800 FIDE. Maybe I could learn a thing or two from its positional play.