OpenAI's Noam Brown, Ilge Akkaya and Hunter Lightman on o1 and Teaching LLMs to Reason Better

  • Published Feb 1, 2025

COMMENTS • 82

  • @andrewwalker8985
    @andrewwalker8985 3 months ago +23

    You can see the value of open source in this interview. We don't get smart people sharing their thoughts and excitement openly; we get smart people who are excited and would love to share, armed with pre-approved sentences that they are allowed to say.

  • @user-pt1kj5uw3b
    @user-pt1kj5uw3b 4 months ago +41

    I hate to thank our corporate VC overlords, but these interviews are pretty cool. I think they will be historically significant in a few years.

    • @sup3a
      @sup3a 3 months ago +1

      100%

    • @rollingrock3480
      @rollingrock3480 3 months ago

      They will be legally significant as an example of how big tech companies have defeated the spirit of the law time and time again to the overall detriment of society. (Remember Facebook giving 5 points for an angry reaction and 1 point for a like when recommending posts for your FB feed, then telling Congress they want to "Bring us all together"?)

    • @ashh3051
      @ashh3051 2 months ago

      Corporate VC overlords 🤔

  • @sup3a
    @sup3a 3 months ago +3

    Very good podcast, thank you. No extra hype, just very matter-of-fact. Just what I need in the middle of all the hype.

  • @NandoPr1m3
    @NandoPr1m3 4 months ago +5

    I like that we are getting to see the real people behind the curtain at OpenAI. My big takeaway is that they A) have other ideas being researched and B) aren't afraid to try new paradigms, which is basically what led to the o1 models.

  • @xiaoxiandong7382
    @xiaoxiandong7382 3 months ago +7

    It's funny the researchers kept looking at the paper in front of them. Does it say what they can say vs not?

  • @rickandelon9374
    @rickandelon9374 4 months ago +26

    Btw the capital of Bhutan is Thimphu. Mostly hilly country and the world's only carbon-negative country.

    • @marwin4348
      @marwin4348 4 months ago

      Sucks for them, they still did not achieve industrialisation?

    • @rickandelon9374
      @rickandelon9374 4 months ago

      @@marwin4348 The country is expensive as hell and poor in terms of self-dependence. Mostly they import their goods from India, which bullies them constantly with various political pressure.

    • @ominousplatypus380
      @ominousplatypus380 3 months ago

      "Mostly hilly" might be the understatement of the century. The entirety of the country is enveloped by the Himalayas and it's arguably the most mountainous country that exists.

    • @andrewwalker8985
      @andrewwalker8985 3 months ago +1

      @@marwin4348 that seems uncalled for

    • @-rate6326
      @-rate6326 3 months ago

      @@rickandelon9374 India doesn't actually bully Bhutan. India is the security guarantor for Bhutan against China. This year India allotted 267 million USD for Bhutan. India doesn't need to bully Bhutan. Bhutan just accepts whatever India says. The Bhutanese military is trained by India. They train in India.
      The real bully is China. China is responsible for salami slicing around Bhutan's borders.
      India and Bhutan have a ten-article, perpetual treaty signed right after independence. Under this treaty India can't interfere in Bhutan's internal matters. Bhutan's external matters are guided by India. Recently China said that if Bhutan permanently agrees to give a certain part of Bhutan to China, they will return the part China has taken from Bhutan. Bhutan was agreeing to this, but India said it's a Chinese trap. Why should Bhutan permanently give up territories that belong to Bhutan?
      What you are saying is probably a Chinese influence operation. China is the big bully in Asia.

  • @senju2024
    @senju2024 4 months ago +7

    They have "Strawberries" on the table while talking about O1. NICE!~

  • @emmanuelgoldstein3682
    @emmanuelgoldstein3682 4 months ago +51

    That's the biggest bowl of strawberries I've ever seen

    • @solomonmatthews7921
      @solomonmatthews7921 4 months ago +9

      Large beyond reason.

    • @tomenglish9340
      @tomenglish9340 4 months ago +4

      @@solomonmatthews7921 Perhaps not strawberries all the way down.

    • @DaronKabe
      @DaronKabe 3 months ago

      @@tomenglish9340 What model are you?

    • @tomenglish9340
      @tomenglish9340 3 months ago

      @@DaronKabe Model T

  • @thatthotho
    @thatthotho 4 months ago +16

    How many R's are in the bowl?

    • @Crux69
      @Crux69 4 months ago +3

      Technically, none :D

    • @tomenglish9340
      @tomenglish9340 4 months ago

      There are 3 R's in STRAWBERRIES, as in STRAWBERRY.

    • @adityakrishnaakula746
      @adityakrishnaakula746 4 months ago

      Quite cheeky 😂 that they have a bowl of strawberries there

  • @whemmakatatt5311
    @whemmakatatt5311 4 months ago +1

    Dayum , one to watch for suuure

  • @JoshuaGottlieb-oz4er
    @JoshuaGottlieb-oz4er 4 months ago

    Great content; thank you

  • @PaddyLamont
    @PaddyLamont 4 months ago +3

    That little beep sound before the intro had me guessing whether my headphones had gone haywire.

    • @user-pt1kj5uw3b
      @user-pt1kj5uw3b 4 months ago +1

      Same. Felt like a telegraph operator interpreting Morse code for a second. They need to add a visual component.

  • @ashh3051
    @ashh3051 2 months ago

    Has there been any research into letting the model make edits to its reasoning text instead of only being able to append tokens? That way it could think longer and improve the quality of its work.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 4 months ago +6

    What's on the paper? Why is everyone staring at theirs?

    • @Crux69
      @Crux69 4 months ago +8

      PR and Legal notes from their internal ASI ;)

    • @tomenglish9340
      @tomenglish9340 4 months ago +2

      @@Crux69 That's what it looks like to me -- no joke.

  • @spinvalve
    @spinvalve 3 months ago

    Is it just me, or is the male host from Sequoia a doppelganger of 3Blue1Brown? Both his voice and appearance are stupendously uncanny.

  • @uw10isplaya
    @uw10isplaya 4 months ago

    24:11 is the most interesting topic in AI for me

  • @constantinelinardakis8394
    @constantinelinardakis8394 3 months ago

    21:30 on STEM and hard reasoning; that's why o1 is so good

  • @alexiscao8749
    @alexiscao8749 4 months ago +1

    The definition of reasoning: "the ability to consider more options and evaluate the correctness of the choice". Isn't that search for the optimum?

    • @tomenglish9340
      @tomenglish9340 4 months ago +1

      In a recent talk (I've forgotten which), he made it clear that he was conflating search with reasoning. I wouldn't do that, but I don't think it's a sin.

  • @maxziebell4013
    @maxziebell4013 4 months ago

    Great discussion

  • @JumpDiffusion
    @JumpDiffusion 4 months ago +8

    9:15 so he basically avoided the question 😏

    • @vnehru1
      @vnehru1 4 months ago +1

      Yes. Also noticed.

    • @MrC0MPUT3R
      @MrC0MPUT3R 3 months ago

      He didn't really avoid it; he just said he doesn't know. They're hoping that as the model's reasoning method is tested in a diverse set of domains, its weaknesses and strengths become clear, so that at some point in the future they can actually answer that question and further refine training methods.

  • @prince-din
    @prince-din 4 months ago

    Why can't I open my SeqCap?

  • @constantinelinardakis8394
    @constantinelinardakis8394 3 months ago

    36:00 on just data and time

  • @constantinelinardakis8394
    @constantinelinardakis8394 3 months ago

    26:22 def on agi

  • @constantinelinardakis8394
    @constantinelinardakis8394 3 months ago

    12:00 training on tons of data

  • @redyican5341
    @redyican5341 4 months ago

    It will look like this: math > programming > simulations > agents > answering hard open questions
    IMO it's easier to source info from the real world than to run some simulations. Infinite IQ doesn't exist. IQ is search in solution space, and it has constraints even with the best heuristics, and we can see in humans that these heuristics are maladaptive when applied to too-narrow problems. So a single model trained on general questions won't develop the insane heuristics of 170 IQ people.
    An MoE architecture can kind of have this high IQ in different domains.
    Also, I think there needs to be an ability to act/experiment to answer some harder open problems.
    I still think we need to master online learning, but it's likely that better training on long context can achieve it. Even better if it could adjust weights after

    • @redyican5341
      @redyican5341 4 months ago

      I think one actually needs to have the model rerun after outputting the stop token and decide which questions it wants answers to after its own reasoning chain, adjusting these knowledge weights. I kind of know it works like that in pretraining with synthetic data, but it would be cool to have it live.

  • @Mayeverycreaturefindhappiness
    @Mayeverycreaturefindhappiness 4 months ago

    They never answered whether they have an ongoing experiment where they let it keep thinking.

    • @ashh3051
      @ashh3051 2 months ago

      In o1 it would fill the context window with reasoning tokens, wouldn’t it?

    • @Mayeverycreaturefindhappiness
      @Mayeverycreaturefindhappiness 2 months ago

      @ When you use o1, the thinking doesn't go to your context window.

  • @BrutalStrike2
    @BrutalStrike2 3 months ago

    18:38

  • @redyican5341
    @redyican5341 4 months ago +1

    Limit is in energy. We would need energy to outcompete humanity. If it can be cheaper per watt. I think it might work because it doesn’t have to be that general. Anyway happy that rich noobs finally will invest in more energy

  • @superfliping
    @superfliping 4 months ago

    Now that most of your top leadership is gone, it seems like they don't want to invest in it anymore. Kind of a contradiction to what we are seeing.

  • @Drackomass
    @Drackomass 4 months ago +2

    Fastest click ever

  • @constantinelinardakis8394
    @constantinelinardakis8394 3 months ago

    17:38 left off

  • @DanielleNewnham
    @DanielleNewnham 4 months ago +1

    Thimphu is the capital of Bhutan. You're welcome :)

  • @findjoseph
    @findjoseph 4 months ago

    W

  • @mpnikhil
    @mpnikhil 4 months ago +1

    The capital of Bhutan is Thimphu. System 1 human response 😂.

  • @jamdec123
    @jamdec123 4 months ago

    Interesting enough conversation. However, it may be beneficial to have people possessing models first before designing models clearly. There's a lack of life experience somewhere. Anywho, I'll let these guys get back to facilitating AI on how they can best lick their own parts. PeaceOUT

  • @attilaszasz-mb2sj
    @attilaszasz-mb2sj 3 months ago

    someone please tell these people that o1 is not good at all :D

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 4 months ago +4

    Does the seating make sense? I think it would have been better to have the girls seated in the center given their height.

    • @thenoblerot
      @thenoblerot 4 months ago

      I am so unreasonably annoyed by her awful framing! And make sure the guests' mics aren't blocking their faces!? That said... I was mostly listening, as I'm sure most are.
      Great talk regardless!

    • @tomenglish9340
      @tomenglish9340 4 months ago

      Randomized to ensure that there was no gender bias.

  • @videochampion
    @videochampion 3 months ago +1

    Dude is smart but such a dorky speaker

    • @videochampion
      @videochampion 3 months ago

      Take your time to process your output, like O1... Erhm uhm erhm lol

  • @OBGynKenobi
    @OBGynKenobi 4 months ago

    It's not thinking, it's calculating.
    No one thinks that Mathematica or Wolfram Alpha is thinking.

    • @MrC0MPUT3R
      @MrC0MPUT3R 3 months ago +3

      You're not thinking. Your brain is just undergoing some electrochemical reactions.

  • @wwkk4964
    @wwkk4964 4 months ago +2

    Noam has been saying to roomfuls of people "you can't know the capital of Bhutan", which is a very silly (and offensive) example. Most people from China and the Indian subcontinent (40% of the world population) have known it's Thimphu since elementary school.

    • @wwkk4964
      @wwkk4964 4 months ago +2

      I'm just saying because it detracts from his other well thought out points but his example is so poor it will make him lose credibility in the wrong audience who can't judge the rest of his claim.

    • @DavidToddSports
      @DavidToddSports 4 months ago +9

      That is not what he is saying. He is saying if you don't know the answer when the question is asked, there is no amount of time which is going to allow you to "think" the answer.

    • @wwkk4964
      @wwkk4964 4 months ago

      @@DavidToddSports It's not a good example because it doesn't demonstrate the salience of the point he is making. It's equivalent to saying no amount of thinking will help you recognise the capital of Australia or Canada or Brazil, but is this strictly true?

    • @wwkk4964
      @wwkk4964 4 months ago

      @@DavidToddSports Here's another way to think about why it's a defective example: "No amount of thinking is going to allow you to know if baseball comes from cricket or vice versa." Is this kind of statement a good example of an unreachable or computationally disconnected island of knowledge? I don't think so; it muddles things up because the question is undecidable.

    • @jinhongyu911
      @jinhongyu911 3 months ago +1

      @@wwkk4964 I think the example he gave about the capital of Bhutan is perfectly sound given the topic of reasoning. Rather, I think your baseball example is the wrong type of example given the question. What Noam is saying is that, unless you've heard the name of the capital of Bhutan before, there is no amount of time which is going to allow you to reason out the answer (like David said above), given that you don't have access to the internet or books of course. As for your example of baseball, if you have enough facts and historical records on hand, I'm sure you can come to a reasonable conclusion about which one came first. Just like the chicken and egg problem, if you can set the right definitions, you can surely reason out the answer. So again, the capital of Bhutan is not a 'reasoning' problem; you can't figure it out step by step, you either know it or you don't.