Q* - Clues to the Puzzle?

  • Published 23 Nov 2023
  • Are these some clues to the Q* (Q star) mystery? Featuring barely noticed references, YouTube videos, article exclusives and more, I put together a theory about OpenAI’s apparent breakthrough. Join me for the journey and let me know what you think at the end.
    www.assemblyai.com/playground
    AI Explained Bot: chat.openai.com/g/g-804sC5lJ6...
    AI Explained Twitter: / aiexplainedyt
    Lukasz Kaiser Videos: • Deep Learning Decade a...
    • Lukasz Kaiser (OpenAI)...
    Let’s Verify Step by Step: arxiv.org/abs/2305.20050
    The Information Exclusive: www.theinformation.com/articl...
    Reuters Article: www.reuters.com/technology/sa...
    Original Test Time Compute Paper: arxiv.org/pdf/2104.03113.pdf
    OpenAI Denial: / 1727472179283919032
    DeepMind Music: deepmind.google/discover/blog...
    Altman Angelo: / sama
    Karpathy: peterjliu/status/...
    STaR: arxiv.org/abs/2203.14465
    Noam Brown Tweets: polynoamial/statu...
    Q Policy: www.analyticsvidhya.com/blog/...
    Sutskever Alignment: • Ilya Sutskever - Openi...
    / aiexplained Non-Hype, Free Newsletter: signaltonoise.beehiiv.com/
  • Science & Technology

COMMENTS • 992

  • @aiexplained-official
    @aiexplained-official  5 months ago +584

    My computer crashed 7 times while making this video and I had a hard deadline to get a flight. There is little of my normal editing in here, or captions, just my raw investigation! Do follow the links for more details.

    • @literailly
      @literailly 5 months ago +26

      We appreciate your dedication, sir!

    • @JohnVance
      @JohnVance 5 months ago +38

      Still the best AI channel on YouTube, none of the hype of the other channels. Maybe the news cycle will calm down and you can get some sleep!

    • @patronspatron7681
      @patronspatron7681 5 months ago +1

      Bon voyage

    • @thebrownfrog
      @thebrownfrog 5 months ago +4

      It's great as always!

    • @alertbri
      @alertbri 5 months ago +6

      You did a great job Philip, as always! Much appreciated attention to detail and balance. Exciting times ahead! Have a safe trip. 🙏👍

  • @SaInTDomagos
    @SaInTDomagos 5 months ago +601

    Dude woke up and thought to himself, "How thorough will I be today?" and said: “Yes!” You definitely should get some interviews with those top researchers.

    • @Dannnneh
      @Dannnneh 5 months ago +11

      Oooh, that would be interesting!

    • @aiexplained-official
      @aiexplained-official  5 months ago +139

      Stay tuned :)

    • @JustinHalford
      @JustinHalford 5 months ago

      @@aiexplained-official🔥🫡

    • @daikennett
      @daikennett 5 months ago +11

      We'll hold you to this. ;) @@aiexplained-official

    • @DaveShap
      @DaveShap 5 months ago +7

      Philip is nothing if not thorough. Dude reads like several novels worth of text per day.

  • @nathanfielding8587
    @nathanfielding8587 5 months ago +423

    I'm truly grateful for this channel. Finding accurate news about almost anything is hard as heck, and having accurate AI news is especially important. We can't afford to be misled.

    • @akathelobster1914
      @akathelobster1914 5 months ago

      He's good, I'm very interested in reading the references.

  • @gaborfuisz9516
    @gaborfuisz9516 5 months ago +649

    Who else is addicted to this channel?

    • @danielbrockman7402
      @danielbrockman7402 5 months ago +6

      me

    • @FranXiT
      @FranXiT 5 months ago +12

      He is literally me

    • @a.thales7641
      @a.thales7641 5 months ago +3

      I am

    • @shaftymaze
      @shaftymaze 5 months ago +5

      7 min later. He digs a bit further than I have time to. And yeah, Ilya was on our side (humanity's). Remember that.

    • @ytrew9717
      @ytrew9717 5 months ago +3

      who else do you follow? (Please feed me)

  • @DevinSloan
    @DevinSloan 5 months ago +169

    Ah, the Q* video I have been waiting for from the only youtuber I really trust on the subject. Thanks!

    • @aiexplained-official
      @aiexplained-official  5 months ago +18

      Let me know what you think of the theory

    • @AllisterVinris
      @AllisterVinris 5 months ago

      Same

    • @Elintasokas
      @Elintasokas 5 months ago +1

      @@aiexplained-official Rather a hypothesis, not a theory.

    • @aiexplained-official
      @aiexplained-official  5 months ago +13

      @@Elintasokas but the evidence came first, so a theory, no?

    • @sebby007
      @sebby007 5 months ago

      My thought exactly

  • @dcgamer1027
    @dcgamer1027 5 months ago +152

    I'd expect the Q to refer to Q-learning. Human beings think/function by predicting the future and acting upon those predictions, at least at a subconscious level. The way we make these predictions is by simulating our environment and observing what would happen in different variations of that simulation given the different choices we make. We then pick the future we feel is best and take the actions to manifest that future.
    I think a good example might be walking through a messy room with legos everywhere. You observe that environment (the room), identify the hazards (legos), then plan out a course through the room of where you can step to be safe (not step on a lego). You would imagine that stepping in one spot would mean you are stuck or would step on a lego, so that whole route is bad and you try another. Repeat till you find a solution, or decide there isn't one and just pick some legos up, or give up, or whatever. Of course not everyone does this; some people just walk on through without thought and either accept stepping on legos or regret that they did not stop to think. These emotional responses of accepting consequences or regretting them are more akin to reinforcement learning imo. There are times when you need to act without thought; for example, if the room was on fire you might not have the time (or compute) to plan it all out.
    The Q-learning stuff, in the context of these LLMs, seems like it would be their version of simulating the future/environment. It would generate a whole bunch of potential options (futures), then pick the best one. The difficult task there is creating a program that knows what the best option actually is, but they apparently already have that figured out.
    My bet is we will need to add in a few different systems of ‘thought’ that the AI can choose from given different contexts and circumstances; these different methods of decision-making will become tools for the AI to use and deploy, and at that point it will really look like AGI. That’s just my guess, and who knows how many tools it will even need.
    Either way it's cool to see progress, and all this stuff is so cool and exciting.
    Now to go look for some mundane job so I can eat and pay off student loans lmao, post-money world come quickly plz XD.

    • @gregoryallen0001
      @gregoryallen0001 5 months ago +7

      normally a long post like this will be trash so THANK YOU for this helpful and engaging response ❤

    • @RichardGrigonis
      @RichardGrigonis 5 months ago +4

      Many years ago AI researchers speculated how to represent "thoughts." One approach was to treat them essentially as "mental objects," the other was to resort to possible worlds theory.

    • @GS-tk1hk
      @GS-tk1hk 5 months ago +5

      What you described is just reinforcement learning, Q-learning is a specific algorithm for solving the RL objective and the "Q" refers to the Q-function, which has a specific meaning in RL. It seems likely that Q* refers to the Q-function (and star generally means "optimal"), but not necessarily the Q-learning algorithm.

    • @kokopelli314
      @kokopelli314 5 months ago

      But if you have the whole world in q learning you can just use your intelligence to make money and pay someone to sweep up the room

    • @lucasblanc1295
      @lucasblanc1295 5 months ago +1

      Anyone that has played a bit with those LLMs intuitively knows that already. I prompt it all the time with chain-of-thought and other reasoning methods like "Write a truth table to check for errors in our logic". The major issue I always arrive at is that it always ends up getting stuck somewhere along its line of reasoning and needs human intervention. This happens exactly because it was never taught how to think and structure its thoughts; that was just a side effect of language. I believe once it's able to reason through mathematical problems with the proper proofs, it will be able to generalize to any field due to its lateral knowledge transfer. So, they will just need to keep fine-tuning the model in that direction, effectively creating a feedback loop of improving its capability to reason correctly, so that it will require fewer parameters and less compute for the same quality. Adding new breakthroughs such as a bigger context window on top of that, AGI is just a matter of quantity and quality of the same technique.
      Just run that thing in a loop, because that's how thinking happens. It's a trial-and-error process. Then fine-tune it at being better at trial-and-error processes, instead of simply giving seemingly useful answers. We were simply being lazy about it by tuning it towards being useful quickly, without caring about how it's doing it in the first place.
      It is already AGI, but it's severely misaligned, just like GPT-3 was impressive before Chat fine-tuning. Now we are fine-tuning Chat as Q*. It's just a step.
      After Q*, it will probably be fine-tuned for improvement at further generalization, instead of simply the domain of math/programming.
      This will be tricky to train: humans don't generate textual content for the sake of thinking through it, perhaps only mathematical proofs get there, and it's extremely time-consuming. Because we make assumptions about the reader's pre-existing intelligence, we tell information through text without ever showing our full thought process.
      In other words, we are truly starting to fine-tune it to use text for thinking, not simply generating cute answers to fool humans. This may seem obvious, but I don't think people get this.
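The "just run that thing in a loop" idea is easy to sketch. Everything below is a made-up stand-in: `propose_answer` mimics a flaky model whose first attempts contain off-by-one slips, and `check` is the cheap, reliable verifier that gates the loop; nothing here reflects any confirmed OpenAI setup.

```python
# Hypothetical stand-in for an LLM proposing answers to "sum of 1..n";
# the first two proposals contain off-by-one slips, mimicking flaky reasoning.
def propose_answer(n, attempt):
    guess = n * (n + 1) // 2
    return guess + [-1, 1, 0][attempt % 3]

# Cheap, reliable checker (the "verifier"): recompute the sum directly.
def check(n, answer):
    return answer == sum(range(1, n + 1))

def solve_with_retries(n, max_tries=10):
    """Trial-and-error loop: propose, verify, retry until the checker accepts."""
    for attempt in range(max_tries):
        ans = propose_answer(n, attempt)
        if check(n, ans):
            return ans, attempt + 1
    return None, max_tries

print(solve_with_retries(100))  # (5050, 3): accepted on the third proposal
```

The training twist the comment suggests would then be rewarding the proposer on whether the verifier accepts, closing the loop.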

  • @pedxing
    @pedxing 5 months ago +98

    THIS was the technical dive I've wanted to find for the last few days. Thank you so much for taking the time to dig into the development of these papers and the technologies they represent.

  • @Madlintelf
    @Madlintelf 5 months ago +118

    We all spent the last week watching the soap opera drama and listening to wild ideas, and nobody put it all together in a nice package with a bow on it until you posted this video. It is a theory, but one that is well thought out, has references, and seems extremely logical. Thanks for putting so much work into this; it's not falling on deaf ears, we truly appreciate you. Thanks, Bill Borgeson

    • @lollerwaffleable
      @lollerwaffleable 5 months ago

      Who is listening? Remember I just want like a fucking job. From OpenAI specifically.

    • @lollerwaffleable
      @lollerwaffleable 5 months ago

      When do we announce that I’m the new ceo of open ai

    • @lollerwaffleable
      @lollerwaffleable 5 months ago

      Lmao

  • @Peteismi
    @Peteismi 5 months ago +85

    Q* as an optimizing search through the action space sounds quite plausible. Just like the A* algorithm, which is a generic optimal pathfinding algorithm.

    • @adfaklsdjf
      @adfaklsdjf 5 months ago +7

      ohhh that Q* / A* link is very interesting!

    • @productjoe4069
      @productjoe4069 5 months ago +10

      This was my thought too. Possibly using edits of the step-by-step reasoning as the edges, or some more abstract model. You could then weight the edges by using a verifier that only needs to see a bounded context (the original, the edited, and the prompt) to say whether or not the edit is of high quality. It’s sort of like graph-of-thought, but more efficient.

    • @ZeroUm_
      @ZeroUm_ 5 months ago +10

      A* was my first thought as well; it's such a famous, CompSci-graduate-level algorithm.
      (Sagittarius A* is also the name of the Milky Way's central supermassive black hole)

    • @mawungeteye657
      @mawungeteye657 5 months ago

      Even if it's just speculative, it's a decent idea for an actual study. Wish someone would test it.

    • @sensorlock
      @sensorlock 5 months ago +3

      I was thinking something along this line too. Is there a way to prune chains of thought, like A* prunes minimax?

  • @caiorondon
    @caiorondon 5 months ago +50

    This channel outpaces ANY other AI news channel on YouTube in quality. The way you try your best to keep the hype out and reduce the amount of speculation is really something to be proud of, and really what makes your content so different from other creators'.
    Yours, sir, is the only channel on the topic where I am happy to watch (and like) every video. ❤
    Cheers from Brazil!

  • @bobtivnan
    @bobtivnan 5 months ago +80

    Wow. Very impressive investigative journalism. No other AI channel does their homework better than you. Well done sir.

  • @nescirian
    @nescirian 5 months ago +14

    At 17:20 Lukasz Kaiser says multi-modal chain of thought would basically be a simulation of the world. Unpacking this, you can think of our own imaginations as essentially a multi-modal "next experience predictor", which we run forwards as part of planning future actions. We imagine a series of experiences, evaluate the desirability of those experiences, and then make choices to select the path to the desired outcome. This description of human planning sounds a lot like Q-learning - modeling the future experience space as a graph of nodes, where the nodes are experiences and the edges are choices, then evaluating paths through that space based on expected reward. An A* algorithm could also be used to navigate the space of experiences and choices, possibly giving rise to the name Q*, but it's been many years since I formally studied abstract pathfinding as a planning method for AI, and as far as I can tell from googling just now over my morning coffee, it seems like the A* algorithm would not be an improvement over the Markov decision process traditionally used to map the state space underlying Q-learning.
    My extrapolation gets a bit muddy at that point, but maybe there's something there. To me, a method that allows AI to choose a path to a preferred future experience would seem a valuable next step in AI development, and a possible match for both the name Q* and the thoughts of a researcher involved with it.

  • @a.s8897
    @a.s8897 5 months ago +14

    You are my first source for AI news; you go deep into the details and do not cut corners, like a true teacher.

  • @grimaffiliations3671
    @grimaffiliations3671 5 months ago +45

    This really is the best AI channel around, we're lucky to have you

  • @rcnhsuailsnyfiue2
    @rcnhsuailsnyfiue2 5 months ago +13

    18:49 I believe Q* is a reference to the “A* search algorithm” in graph theory. Machine learning is fundamentally described by graph theory, and an algorithm like A* (which traverses each layer of a graph as efficiently as possible) would make total sense.

    • @bl2575
      @bl2575 5 months ago

      That was also my thought when I heard the algorithm name. It is basically a cost-minimization algorithm to reach a target node. The difficult part in this context is figuring out what heuristic to use to evaluate whether one step of reasoning is closer to answering the question than another. Maybe that's where the Q-learning policy plays a role.

  • @gmmgmmg
    @gmmgmmg 5 months ago +22

    The New York Times or another major newspaper should hire you, seriously. The amount and quality of research and the way you explain and convey AI news and information is truly remarkable. You are currently my favourite yt channel.

  • @DavidsKanal
    @DavidsKanal 5 months ago +5

    "You need to give the model the ability to think longer than it has layers" is what really sticks with me, it's such an obvious next step for LLMs which currently run in constant time. Let's see where this leads!

  • @xXWillyxWonkaXx
    @xXWillyxWonkaXx 5 months ago +4

    By far one of the most informative and condensed videos about the essential concepts/building blocks towards creating AGI. Very succinct, great tempo. 👏🏼

  • @stcredzero
    @stcredzero 5 months ago +78

    This makes me want to produce a generative AI comic called "The Verifier". It would be about a verifier AGI fighting a David-versus-Goliath guerilla war against a malevolent superoptimizer, using its ability to poke holes in the answers of a much larger model to save humanity. EDIT: The tactic of doing lots of iterations, then rewarding on the raw probability of winning - this smells a lot like evolution by natural selection. It's a brutally simple emergent fitness function!

  • @garrettmyles6493
    @garrettmyles6493 5 months ago +8

    As someone outside the industry, this is such a great resource. Thank you very much for the hard work and keeping us in the loop! I've been waiting for this video since the Reuters article

  • @apester2
    @apester2 5 months ago +16

    I was in two minds about whether to take the Q* thing seriously until you posted about it. Now I accept that it is at least not just sensational hype. Thanks for keeping us up to date!

  • @colin2utube
    @colin2utube 5 months ago +18

    Game developers will be familiar with the A* algorithm, used to find optimal shortest paths between 2 points on a grid containing obstacles (e.g. a path between the player's location and some target, or between an AI opponent's position and the player's position). I wonder if Q* is some similar shortest-path-finding algorithm between two more abstract nodes in an AI network problem containing some kind of obstruction that has to be navigated around?

    • @johntiede2428
      @johntiede2428 5 months ago +2

      I'd add that mazes can be decomposed into trees, and A* is applied to that. Think Tree of Thoughts, not just Chain of Thought, and applying an A*-like algorithm.
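For readers who don't know the A* these comments keep invoking: it is a best-first search that expands nodes in order of path-cost-so-far plus an admissible heuristic. A minimal grid version (Manhattan distance as the heuristic, '#' as obstacles), sketched here only to ground the analogy, looks like:

```python
import heapq

def a_star(grid, start, goal):
    """Shortest path on a 4-connected grid; '#' cells are obstacles."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan distance: admissible (never overestimates) on this grid
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    # Heap entries: (g + h, g, node, path-so-far); popped in best-first order.
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] != '#':
                heapq.heappush(frontier, (g + 1 + h((r, c)), g + 1, (r, c), path + [(r, c)]))
    return None  # goal unreachable

grid = ["..#.",
        "..#.",
        "...."]
path = a_star(grid, (0, 0), (0, 3))
print(len(path) - 1)  # 7 moves: straight-line distance is 3, but A* routes around the wall
```

The thread's speculation amounts to swapping grid cells for reasoning states and the Manhattan heuristic for a learned value estimate; that part is pure conjecture.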

  • @MasonPayne
    @MasonPayne 5 months ago +7

    A* is an algorithm mainly used in pathfinding, which works very similarly to what you described as Q. Imagine the idea landscape as a set of information you need to search through to find a path to the answer. That is what I think they mean by Q*.

  • @ShadyRonin
    @ShadyRonin 5 months ago +2

    Love the longer video format! Amazing as usual

  • @ddwarful
    @ddwarful 5 months ago +8

    Q* found the fabled NSA AES backdoor.

  • @TheLegendaryHacker
    @TheLegendaryHacker 5 months ago +5

    Damn, to me this feels like the discovery of nuclear chain reactions. It's not quite there yet, but you can see the faint glimmer of something world changing to come. Especially that "general self-improvement" stuff... GPT-5 is gonna be wild.

  • @zandrrlife
    @zandrrlife 5 months ago +9

    I would say he's actually understating the dramatic impact CoT has on multi-modal output. Also, things get wacky when you combine vertical CoT iteratively reflecting on horizontal CoT outputs (actual outputted tokens). Increasing model inner monologue (computation width) across layers is def the wave.
    Again, why I think synthetic data/hybrid data curation cost will soon match model pretraining. Even if you're perturbing existing data, you can lift its salient density to better fit this framework. Also why I keep saying local models are the way, and why I've been obsessed with increasing representational capacity in smaller models.

  • @rioiart
    @rioiart 5 months ago +20

    Hands down the best YouTube channel for AI news.

  • @tai222
    @tai222 5 months ago +18

    This channel and Dave Shapiro are my go-tos for AI news!

    • @Veileihi
      @Veileihi 5 months ago +4

      lmao, I left the same comment on one of Dave's videos but in reverse

    • @MarkosMiller15
      @MarkosMiller15 5 months ago +2

      I'd add Wes too, whom I discovered recently, but yeah, those 2 really are the main trustworthy channels without the cryptobro vibes

    • @krishp1104
      @krishp1104 5 months ago +5

      I just found Dave Shapiro today but I think he's wayyy too impulsive to sound the AGI alarm

  • @adfaklsdjf
    @adfaklsdjf 5 months ago +9

    as always, _whatever happens_, thank you for your work

  • @jimg8296
    @jimg8296 5 months ago +1

    Great research, thank you very much. I appreciate how you have pulled together a vast amount of data into an understandable video. It would take me months to get close to this understanding of Q*; now it was just half an hour with your research and video editing. RESPECT!

  • @zaid6527
    @zaid6527 5 months ago +1

    Just came across your AI channel; I found it to be one of the best AI channels on YouTube. I also liked the intuition part where you talked about Let's Verify. Amazing video, keep up the good work 👍

  • @sgstair
    @sgstair 5 months ago +7

    Here's the idea that I had:
    Let's say you think of the output of a "Let's verify step by step" prompt as a tree of possible responses. Each step has a wide variety of possible subsequent steps.
    Then let's say you have a classifier network that scores how relatively good chains of responses are.
    Then you could run an A* search algorithm over the tree of possible response chains efficiently, only following the most useful ones, and explore an unimaginably huge search space without that much compute.
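That idea can be sketched concretely as a best-first search over chains of "steps", ranked by a scorer. Both pieces below are toy stand-ins invented for illustration: the "classifier" just counts prefix overlap with a hidden answer, and the "model" proposes single letters; nothing here is a confirmed OpenAI technique.

```python
import heapq

HIDDEN = "think"  # stands in for a correct chain of reasoning

# Toy "classifier network": scores how promising a partial chain looks
# (here, how many leading characters agree with the hidden answer).
def verifier_score(chain):
    joined = "".join(chain)
    return sum(1 for a, b in zip(joined, HIDDEN) if a == b)

# Toy "model": proposes candidate next steps for any chain.
def propose_steps(chain):
    return list("abcdefghijklmnopqrstuvwxyz")

def best_first_search(max_expansions=200):
    # Python's heapq is a min-heap, so negate scores to pop the best chain first.
    frontier = [(-verifier_score([]), [])]
    for _ in range(max_expansions):
        _, chain = heapq.heappop(frontier)
        if "".join(chain) == HIDDEN:
            return chain  # a chain the verifier fully endorses
        if len(chain) >= len(HIDDEN):
            continue  # don't expand past the maximum useful depth
        for nxt in propose_steps(chain):
            new_chain = chain + [nxt]
            heapq.heappush(frontier, (-verifier_score(new_chain), new_chain))
    return None

print("".join(best_first_search()))  # finds "think" in a handful of expansions
```

Although the space of complete chains has 26^5 (about 11.9 million) leaves, the scorer steers the search so only a handful of nodes are ever expanded, which is exactly the efficiency claim the comment is making.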

  • @etunimenisukunimeni1302
    @etunimenisukunimeni1302 5 months ago +46

    Amazing work. Thanks for, ahem, pushing back the veil of ignorance 😁
    So refreshing to get an informed and non-sensational take on this latest OpenAI X-Files case. It doesn't even matter if your educated guess ends up missing the mark. It's this kind of detective work that is sorely needed in any case, at least before we get some official and/or trustworthy info on this James Bond style "great achievement" called Q*

  • @guilleru2365
    @guilleru2365 5 months ago +1

    Things are going so fast that it's hard to imagine what it will look like in just a few more weeks. Amazing work!

  • @DaveShap
    @DaveShap 5 months ago +5

    This is way better than breaking AES-192.

    • @zero_given
      @zero_given 5 months ago +2

      Loved your video mate!

    • @prolamer7
      @prolamer7 5 months ago +2

      You are a big person for acknowledging that this video is better than yours!

    • @DaveShap
      @DaveShap 5 months ago

      @@prolamer7 we're all speculating here and I have a lot of respect for my fellow creators. I view it as all part of a bigger conversation.

    • @prolamer7
      @prolamer7 5 months ago

      @@DaveShap That said!!! Of the many other AI youtubers, you are consistently among the TOP too!!! I hate to sound too simplistic; sadly the yt comment system is kinda designed to allow only short thoughts and shouts.

  • @FranXiT
    @FranXiT 5 months ago +5

    I was just thinking about how much I wanted a new video from you :3 thank you.

  • @agenticmark
    @agenticmark 5 months ago +3

    This is the basis of a Monte Carlo search, or even a genetic algorithm: you are simulating many worlds and selecting the world that best fits the needed model. By the way, this is great work - the research you did, the papers you referenced, and the video in general! Love it.

  • @kombinatsiya6000
    @kombinatsiya6000 5 months ago +2

    This is the channel I return to over and over again to make sense of the latest AI research.

  • @xwkya
    @xwkya 5 months ago +2

    This channel is a blessing. I have been navigating news the past week, but this is the place that I feel gives the most accurate information and informed speculations.

    • @xwkya
      @xwkya 5 months ago

      And the theory of Q* applying Q-learning to decoding is very interesting. Thinking of GPT Zero, I have wondered whether algorithms used in AlphaZero such as MCTS (using GPT as a policy function) have been tested for decoding purposes; this also fits the idea of increasing inference cost. I hope you will continue to share your knowledge and investigations.

  • @Neomadra
    @Neomadra 5 months ago +7

    It's just incredible how you connect all these dots in such a short amount of time. Even if Q* turns out to be a mirage, at least I learned something about promising research directions :)

    • @xXWillyxWonkaXx
      @xXWillyxWonkaXx 5 months ago +2

      If I'm understanding this correctly, it's: Test Time Computation, Chain of Thought (CoT), Let's Verify Step by Step, and Self-Taught Reasoning.

  • @6GaliX
    @6GaliX 5 months ago +3

    The name Q* might just be an homage to the A* pathfinding method, and therefore a special way of creating chains of thought, while "Q" = Q-learning, a common reinforcement learning method in machine learning.

  • @MrSchweppes
    @MrSchweppes 5 months ago +2

    Oh, I've been waiting for your video since the Q Star news. Great dive. Thanks a lot for making this video! 👍

  • @Datalata
    @Datalata 5 months ago

    This is the information I’ve been looking for. Thanks for doing the heavy lifting on the research that we’ve all needed on this topic!

  • @spaceadv6060
    @spaceadv6060 5 months ago +22

    Still the highest quality AI channel on UA-cam. Thanks again!

  • @tlskillman
    @tlskillman 5 months ago +6

    Great job. A real service to us all. Thank you.

  • @DiscoTuna
    @DiscoTuna 5 months ago

    Wow - what a detailed line of thought and extensive amount of research you have gone through to produce this vid. Thanks

  • @Rawi888
    @Rawi888 5 months ago +1

    I'm lying here depressed beyond all reasoning; hearing you speak about your passions really lifts my spirits. Thank you friend.

    • @aiexplained-official
      @aiexplained-official  5 months ago +1

      Thanks Rawi, that's so kind. Now time for you to find and speak on your passions!

    • @Rawi888
      @Rawi888 5 months ago

      @@aiexplained-official GOTCHA 🫡. You just joined twitter, imma find you and make you proud.

  • @darinkishore9606
    @darinkishore9606 5 months ago +5

    you’re goated for this one man

  • @Lvxurie
    @Lvxurie 5 months ago +9

    Listening to the guy talk about AlphaGo reminds me of how human development occurs.
    An early stage of learning is the actor stage, where kids copy what the people around them do to try and figure out the correct way to act, often also copying poor behaviours.
    The next stage is called the motivated agent. To be an agent is to act with direction and purpose, to move forward into the future in pursuit of self-chosen and valued goals.
    Since AI is essentially trying to recreate human thinking, I wonder if creating AI models that follow the development of humans is the best way to get to AGI.

    • @honkhonk8009
      @honkhonk8009 5 months ago

      Lol, I'll apply that to my math courses.
      I'm having trouble with proofs. Right now all I can do is copy what other people have written and regurgitate it.
      But hopefully with enough practice I can get into the "motivated agent" phase like you suggest, I guess lmfao.

  • @skier340
    @skier340 5 months ago

    Fantastic breakdown. You're really doing your homework to get us some real, concrete possibilities for what's actually happening with the architecture of Q* when everything else just seems like wild speculation.

  • @KP-sg9fm
    @KP-sg9fm 5 months ago +29

    Would love to see you do interviews with lesser known but key figures in the industry, you would have such good questions.

  • @user-hk8jt6so3l
    @user-hk8jt6so3l 5 months ago +11

    YOU ARE THE BEST! I am so happy to have found you back at the beginning of the AI "craze", and words cannot describe how grateful I and your other viewers are to you for such high-quality content! I believe your work will play a huge role in humanity's future!
    edit: grammar

  • @gobl-analienabductedbyhuma5387
    @gobl-analienabductedbyhuma5387 5 months ago

    Such deep research! Man, you're just always way ahead of everyone else with your work. Thank you!

  • @En1Gm4A
    @En1Gm4A 5 months ago +1

    Now I can go sleep fine again - you uncovered the behind-the-scenes. It is even aligned with what I thought might be the key to more capabilities. THANK YOU !!!!

  • @middle-agedmacdonald2965
    @middle-agedmacdonald2965 5 months ago +4

    Thanks! Very down to earth, and well thought out.

  • @ryanhm1004
    @ryanhm1004 5 months ago +7

    This reminds me of the movie "Arrival", where it was very difficult to communicate with the aliens because you had to explain what an adjective is and what a noun is. It would be easier to communicate with robots through mathematics than language (as Karpathy said), because you could simply reward them by giving them functions to solve, evolving this ability to reason. In the end, as Aristotelian logic theory says, language is mathematics too.

    • @Mr_Duck_RVA
      @Mr_Duck_RVA 5 months ago +1

      I just watched that movie for the first time the other night

    • @electron6825
      @electron6825 5 months ago

      ​@@Mr_Duck_RVAWhat did you think about it?

  • @SamGirgenti
    @SamGirgenti 5 months ago

    You and Wes are the best AI presenters on youtube in my opinion. Thanks for taking the time to teach. :)

  • @sushihusi35
    @sushihusi35 5 months ago

    Damn, the research/investigation you just did on this topic is insane. Hats off, thank you for this video!

  • @KyriosHeptagrammaton
    @KyriosHeptagrammaton 5 months ago +4

    I remember back when they had AIs learning to play Mario and it was super slow to get generically good; then they encouraged it to get a high score instead of reaching the end goal, or something like that, and suddenly it was learning way faster and much better at arbitrary levels.

  • @krishp1104
    @krishp1104 5 months ago +4

    I've been impulsively checking your channel, waiting for this video

  • @JustinHalford
    @JustinHalford 5 months ago +1

    I was waiting for this one! Absolutely riveting. Our collective progress is quickly being rendered compute bound.

    • @JohnSmith762A11B
      @JohnSmith762A11B 5 months ago +1

      This is perhaps why Sam has been running around trying to get new chip fabs built. Nvidia alone is simply not enough when what you want is unlimited computing power. This in fact has always been a primary doom scenario: that an AGI/ASI becomes addicted to getting smarter and reformats the entire cosmos into one gigantic mind.

  • @uraszz
    @uraszz 5 months ago +1

    I've been seeing news about Q* for a day or two but refused to watch anything before you uploaded. I trust you with anything AI. Thank you!!

    • @aiexplained-official
      @aiexplained-official  5 months ago +1

      I might be wrong, but I gathered quite a bit of evidence for you to evaluate!

  • @nathanbanks2354
    @nathanbanks2354 5 months ago +18

    GPT-4 is already using "let's verify step by step". I've often asked it to program something or refactor something, and the first thing it does is come up with an English list of what it's about to do. This list then becomes part of the tokens it uses to generate the following tokens as it actually writes the program. It's like it changes my query into an easier query. It wasn't doing this when I signed up in April.

    • @thearchitect5405
      @thearchitect5405 5 months ago +5

      It does it on small scales, but not quite on the same scale as in the paper. Otherwise you'd be getting 30 line responses to basic questions. It also doesn't verify on a step by step basis.

    • @nathanbanks2354
      @nathanbanks2354 5 місяців тому +1

      @@thearchitect5405 I meant that they're using the techniques suggested by some papers earlier this year which proposed using "think step-by-step" as part of the query to an LLM. It was a prompt-engineering technique. This was one of several techniques which substantially improved accuracy for answering exam questions. It could definitely be improved and I didn't read this particular paper, so I'm sure you're right about the scale being larger.

    • @adfaklsdjf
      @adfaklsdjf 5 місяців тому

      @@nathanbanks2354 have you set any custom instructions, by chance? ;)

    • @homelessrobot
      @homelessrobot 5 місяців тому +1

      @@thearchitect5405 maybe there is a threshold of complexity or something, but a couple of weeks ago I did an open book calculus course with GPT-4. It was generating step-by-step answers so large that it would stop and ask me if I wanted it to continue. > 30 lines each. Much greater. Each answer took several minutes to generate in full. It also passed that course with flying colors.
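The "Let's Verify Step by Step" idea this sub-thread is discussing can be sketched in miniature: score a solution by checking every intermediate step rather than only the final answer. The step format and the eval-based checker below are invented for illustration (real process reward models are learned verifiers, not exact checkers):

```python
# Toy "process supervision": score a chain of reasoning steps by checking
# every intermediate step, not just the final answer. Each step is a pair
# (expression, claimed_value); eval() stands in for a learned verifier.
def step_ok(expr, claimed):
    return eval(expr) == claimed  # toy exact checker

def process_score(steps):
    # Fraction of steps that verify; 1.0 means every step checks out.
    return sum(step_ok(e, v) for e, v in steps) / len(steps)

good = [("2+3", 5), ("5*4", 20), ("20-1", 19)]
lucky = [("2+3", 6), ("6*4", 20), ("20-1", 19)]  # flawed reasoning, same final answer

# An outcome-only verifier (looking at the last value) rates both chains
# equally; the process score penalizes the flawed intermediate steps in `lucky`.
```

The design point, per the paper linked in the description, is exactly this: rewarding each step catches chains that stumble onto the right answer with wrong reasoning.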

  • @antoniopaulodamiance
    @antoniopaulodamiance 5 місяців тому +4

    Best channel. The amount of time dude spend reading and following all the noise to get to a high quality 15 min videos is fantastic

  • @errgo2713
    @errgo2713 5 місяців тому

    This is such a helpful round up. Thank you!!

  • @HenriKoppen
    @HenriKoppen 5 місяців тому

    Whenever I have a discussion about any topic, when someone is making a claim, I ask "please help me understand your conclusion, can you bring me there step by step?". This is so powerful, because when a claim is heavily biased it will emerge from this step-by-step process. It really made me stronger in having discussions and sharing my step-by-step reasoning. All truth comes from the details... This video is really inspiring, smart, in the right tone, well explained. Thank you for spending the time to do this right!

  • @andrew.nicholson
    @andrew.nicholson 5 місяців тому +7

    20:45 The idea of training a model on its own output makes me think about our own brains and how dreaming and sleep are critical to our ability to learn. Sleep is when we take what has happened during the day - our success and failures - and integrate them into our long term memory.

    • @adfaklsdjf
      @adfaklsdjf 5 місяців тому +1

      We also loop over our thoughts while we're thinking about a problem: we come up with an idea and then reconsider it, poke holes, test it out in various ways, compare it to other ideas. A neural net's inputs traverse the network once and become outputs; without loops it's like it's only given one shot to "think about" something before giving its answer.
      The sleeping/dreaming/integration analogy is interesting.

    • @homelessrobot
      @homelessrobot 5 місяців тому

      Or the concept of active recall. You read a little, then you answer questions about what you learned or summarize what you have learned and receive feedback for that. It's important to note that the output it's training on isn't just raw output; it's not a closed loop. There is a second model "grading" the answers, so there is external feedback involved.
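The loop described in this sub-thread, generate output, grade it externally, train on what passes, is essentially the STaR recipe linked in the video description. A toy sketch with a stand-in "model" and an outcome filter (the `noisy_adder` generator and all names are invented for illustration, and "fine-tuning" is reduced to collecting a dataset):

```python
import random

def star_round(problems, generate, n_samples=32, seed=0):
    """One STaR-style round: sample rationales from the model, keep only
    those whose final answer matches ground truth, and return them as
    new fine-tuning data (here we just collect them)."""
    rng = random.Random(seed)
    keep = []
    for question, answer in problems:
        for _ in range(n_samples):
            rationale, final = generate(question, rng)
            if final == answer:  # outcome filter: self-generated, externally graded
                keep.append((question, rationale, final))
                break
    return keep

# Stand-in "model": adds the numbers but is sometimes off by one.
def noisy_adder(question, rng):
    a, b = map(int, question.split("+"))
    guess = a + b + rng.choice([0, 0, 0, 1])
    return f"{a} plus {b} is {guess}", guess

data = star_round([("2+2", 4), ("3+5", 8)], noisy_adder)
```

In the real paper the kept rationales are used to fine-tune the model and the loop repeats, so the model is never trained on an unchecked closed loop of its own output, matching the point made above.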

  • @Stephen_Lafferty
    @Stephen_Lafferty 5 місяців тому +4

    I can barely believe that it has been just seven days since Sam Altman was fired by OpenAI. What an American Thanksgiving it was for Sam to return to OpenAI. Thank you for your insightful analysis as always!

  • @Robert_McGarry_Poems
    @Robert_McGarry_Poems 5 місяців тому

    20:00 I think you pretty much nailed it. This sounds pretty amazing. In all honesty this should be how the core models are trained. In my opinion, this type of processing would make alignment super easy. In the sense that you could have multiple "observers" all with their own _obviously programmed in bias_ as a second layer, that then would be filtered by a third layer, which is the true autonomous "observer."

  • @williamjmccartan8879
    @williamjmccartan8879 5 місяців тому

    Thank you Phillip, glad to see you've taken the dive on X. Thank you again — teaching these lessons is really important to a lot of us who don't have your skills and experience in researching all of this material and are educated through that process. Peace

  • @randomuser5237
    @randomuser5237 5 місяців тому +3

    This actually makes me even less hopeful about open-source AI. It's quite clear that most of the people who can make new breakthroughs are working in these companies and will not publish their research. It also throws out the idea that it's only about data and compute to make better models. Open source will keep lagging behind them every day unless the government steps up and provides financial incentives to the national labs so that they can get the top researchers and publish open-source models.

    • @prolamer7
      @prolamer7 5 місяців тому

      You are right, there is only really a handful of really smart people in open source working for "free", unlike in companies where you are paid millions. BUT once there is a model as smart as GPT-4 for everyone to use, it will help even small guys create novel and good models.

  • @jtjames79
    @jtjames79 5 місяців тому +7

    Q* make me a design for a cold fusion powered jetpack, please. 😎👍

  • @Rawi888
    @Rawi888 5 місяців тому

    You, Matt Wolfe, Wes and David Shapiro are my only trusted sources. Especially you. I reaaaaally love and appreciate all the work you do.

  • @tristanwegner
    @tristanwegner 5 місяців тому

    I followed closely over the last weekend, but as my time investment is limited, it is great to have you synthesize the facts together with some digging, like the barely watched videos!

  • @Y3llowMustang
    @Y3llowMustang 5 місяців тому +4

    I've been refreshing waiting for this video from you

  • @jeff__w
    @jeff__w 5 місяців тому +4

    25:45 “I think the development is likely a big step forward for narrow domains like mathematics but is in no way yet a solution for AGI the world is still a bit too complex for this to work yet.”
    That’s a really important qualification-we’re not _yet_ on the verge of our glorious/terrifying AGI future-and that, I think, undercuts the (to me, much over-hyped) theory that some AI “breakthrough” was what spooked the board into ousting Sam Altman. Some old-fashioned power play/interpersonal conflict seems a lot more likely to me (although an AI breakthrough might have exacerbated the already-existing tensions).
    And that Q* is a reference to “the optimal Q-function” 18:44 seems entirely plausible. It’s just what you’d expect from the AI researchers at OpenAI.
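For readers unfamiliar with the notation this comment refers to: in reinforcement learning, Q*(s, a) is the optimal action-value function, and tabular Q-learning converges to it via the Bellman optimality update. A minimal toy example on a five-state corridor (purely illustrative, nothing to do with OpenAI's actual system):

```python
import random

# Tabular Q-learning on a toy corridor: states 0..4, actions 0=left, 1=right.
# Reaching state 4 gives reward 1 and ends the episode. Q* is the optimal
# action-value function the table converges to.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    reward = 1.0 if s2 == GOAL else 0.0
    return s2, reward, s2 == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]
rng = random.Random(0)
for _ in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        a = rng.randrange(2) if rng.random() < EPS else max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # Bellman optimality backup: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r + (0.0 if done else GAMMA * max(Q[s2]))
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2
```

After enough episodes, Q[s][1] approaches gamma**(GOAL - s - 1), the discounted value of always stepping right, which is exactly what "the optimal Q-function" means in the small.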

  • @WilliamsDarkoh
    @WilliamsDarkoh 5 місяців тому +1

    Congrats on the 200k, my predictions were on point!

  • @jacorachan
    @jacorachan 5 місяців тому

    Great video as usual. Please keep on making them! You provide a thoughtful vision of current state of AI and I really appreciate the way that you elaborate your ideas with what you read or listen in videos.
    Again, fantastic work 👏

  • @nomadv7860
    @nomadv7860 5 місяців тому +3

    Amazing video. I appreciate your investigation into this

  • @felipoto
    @felipoto 5 місяців тому +3

    New Ai Explained video letss goooooo

  • @holographicman
    @holographicman 5 місяців тому +2

    Hands down the best AI update channel, I just remove any suggested channels popping up at this point. Oh and as a musician and synth developer, that last demo is cool. I can imagine a synthesizer or DAW in the future where humans can interact in super creative ways. Love it. ❤

  • @Gonko100
    @Gonko100 5 місяців тому +2

    By far the best channel regarding this topic. Like, it's not even close.

  • @tomaszkarwik6357
    @tomaszkarwik6357 5 місяців тому +4

    3:44, hey, another Polish note. "Łukasz" is indeed the Polish version of Lucas, but if you want to be 100% correct, the transliteration into British English would be something like "wukash"
    Edit: I forgot to "cite" my sources. I am a native Polish speaker

  • @aspuzling
    @aspuzling 5 місяців тому +3

    Thank you for not just spouting the "have OpenAI reached AGI?" hyperbole. This is really interesting research.

  • @MrBorndd
    @MrBorndd 5 місяців тому +1

    This channel provides the best, most well researched, cutting edge information about AI development available, while the competition just repeats each other and doesn't offer much more than what we already got from Reuters. Excellent journalism!

  • @QuarkTwain
    @QuarkTwain 5 місяців тому +5

    As if things weren't trending enough towards the conspiratorial, now they have their own "Q". Feel the AGI!

  • @nanow1990
    @nanow1990 5 місяців тому +4

    Let's break this down step-by-step.

  • @geepytee
    @geepytee 5 місяців тому +1

    Excellent video, and glad you're on twitter now

  • @KP-sg9fm
    @KP-sg9fm 5 місяців тому +1

    Truly love this channel

  • @vnehru1
    @vnehru1 5 місяців тому +4

    This is one of the only channels on AI - or virtually anything - that is no fluff, all good information. I never write comments, but I can't help but commend the high quality.

    • @attilaszekeres7435
      @attilaszekeres7435 5 місяців тому

      I too rarely write comments, but I was compelled to say that I skip over comments like yours. Flattery adds nothing to the conversation and only makes it harder to find valuable information. Hopefully my little feedback contributes toward seeing less ass licking and more juice.

    • @brycefegley
      @brycefegley 5 місяців тому

      I smashed the subscribe button

  • @supremebeme
    @supremebeme 5 місяців тому +9

    AGI happening sooner than we think?

    • @SaInTDomagos
      @SaInTDomagos 5 місяців тому

      That’s the power of exponential functions.

  • @Partric
    @Partric 5 місяців тому +2

    Thank you for doing these

  • @nathanieldimemmo1330
    @nathanieldimemmo1330 5 місяців тому +1

    As always, thanks for your insights!

  • @beowulf2772
    @beowulf2772 5 місяців тому +3

    A 6B model with that much capability 💀

  • @memegazer
    @memegazer 5 місяців тому +3

    Thanks!

    • @aiexplained-official
      @aiexplained-official  5 місяців тому +2

      Thanks memegazer!!

    • @memegazer
      @memegazer 5 місяців тому +1

      Really glad you dug into this to offer some new insight...it is really fascinating

  • @johnbrisbin3626
    @johnbrisbin3626 5 місяців тому +1

    In recent days, I have heard more than one of your fellow 'tubers speak of your channel with the greatest respect.
    Congratulations.

  • @danielcsillag1726
    @danielcsillag1726 5 місяців тому +1

    Been eagerly waiting for this 😂, great video!

  • @jumpstar9000
    @jumpstar9000 5 місяців тому +4

    Maybe it is a search for Quality/model refinement of the weights using something similar to the A* algorithm. Pure speculation of course.
    Very interesting stuff. Thanks for the insights and commentary Philip.
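For reference, the A* this comment speculates about is the classic best-first search that orders nodes by f = g + h, with g the cost so far and h an admissible heuristic. A minimal sketch on a toy grid (the grid and Manhattan heuristic are illustrative; any connection to Q* is the commenter's speculation):

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Minimal A*: expand nodes in order of f = g (cost so far) + h (heuristic)."""
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g  # cost of the cheapest path found
        if g > best_g.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None  # goal unreachable

# 4-connected 5x5 grid, unit step cost, Manhattan-distance heuristic to (4, 4).
def grid_neighbors(p):
    x, y = p
    return [((x + dx, y + dy), 1) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < 5 and 0 <= y + dy < 5]

def manhattan(p):
    return abs(p[0] - 4) + abs(p[1] - 4)
```

The appeal of the speculation is that A* marries a learned value estimate (h) with exact search (g), which rhymes with using a value model to guide step-by-step generation.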

    • @aiexplained-official
      @aiexplained-official  5 місяців тому +1

      Thanks Jumpstar!

    • @jumpstar9000
      @jumpstar9000 5 місяців тому

      @@aiexplained-official I was thinking. Maybe it's a goal seeking strategy that's better than simple CoT. That would make a lot of sense.

    • @jumpstar9000
      @jumpstar9000 5 місяців тому

      @@aiexplained-official You know what else I was thinking. If Ilya is running the superalignment team, and super means superintelligence, doesn't that kind of imply that AGI is already done if they are on to superintelligence? Unless they are just trying to get ahead of the game a bit, of course. But it is difficult to guess what an ASI would even be like.

  • @Chris-se3nc
    @Chris-se3nc 5 місяців тому +4

    Obviously they have developed Q from Star Trek. Q is initially presented as a cosmic force judging humanity to see if it is becoming a threat to the universe, but as the series progresses, his role morphs more into one of a teacher to Picard and the human race generally, albeit often in seemingly destructive or disruptive ways, subject to his own will.