Let's Settle the o3-mini Vs. R1 Debate for Coding ONCE and for ALL

  • Published Feb 5, 2025

COMMENTS • 47

  • @ColeMedin
    @ColeMedin  15 hours ago +1

    The Community Voting period of the oTTomator Hackathon is open! Head on over to the Live Agent Studio now and test out the submissions and vote for your favorite agents. There are so many incredible projects to try out!
    studio.ottomator.ai
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Head on over to RepoCloud with the link below and check out all the incredible open source platforms you can deploy with one click (including bolt.diy, n8n, and Open WebUI!):
    repocloud.io/

  • @5yr0
    @5yr0 1 hour ago +2

    The fact that it was hard for you to decide, with no clear winner, says a lot about R1.

  • @LessLol
    @LessLol 6 hours ago +2

    Thanks for a very nice video again Cole! Could you maybe share the ultimate prompt to do an AI assistant UI in one shot, without the double message issue and working chat history?

  • @SouthbayJay_com
    @SouthbayJay_com 11 minutes ago

    Hey Cole! Great video and really great comparison! Thanks for sharing the info! Jay

  • @jacobwilson6296
    @jacobwilson6296 9 hours ago +1

    This was a great video. Taking all your advice into account from other vids and I have to say, great job bro!

    • @ColeMedin
      @ColeMedin  19 seconds ago

      Thanks Jacob, I appreciate it a lot!

  • @brandoneubank447
    @brandoneubank447 14 hours ago +4

    I would have liked to see some mention of how each dealt with their own issues. That is, for example, after r1 didn’t load any history, were you able to ask it to improve that area?
    This would help those of us that want to use these to make things get a better sense of which might be a better product to work with from start to finish. Did you test that?

    • @ColeMedin
      @ColeMedin  13 hours ago +2

      I appreciate the feedback - I agree that would have been good to show! Yes with R1 after a couple more prompts I was able to get it to show the conversation history.

    • @brandoneubank447
      @brandoneubank447 12 hours ago +1

      @ColeMedin I appreciate your work. You’re killing it. Whenever I see your stuff on my feed, it's an instant watch. I hope to start working through your RAG stuff soon.

  • @5yr0
    @5yr0 1 hour ago

    How would you use both of them together? Wrap them both in a script? I’m working on something like that at the moment (wrap 5, wrap 10, etc.), just figuring out how to do the logic side of the wrapper.

  • @Bob-r7v9p
    @Bob-r7v9p 14 hours ago +3

    Can you clarify: was this comparison using o3-mini high, or just o3-mini low or medium? Also, apparently the bolt.diy implementation removes the /thinking tag area in the code generation based upon section? I'm reading that o3 has structured output control while R1 does not, and I assumed that would help agents and bolt.diy-type applications?

    • @ColeMedin
      @ColeMedin  13 hours ago +1

      The API only has one version of o3-mini, and I'm not sure which one it is, actually! OpenRouter doesn't seem to show the thinking tokens, because when I go directly through the DeepSeek API I see the thinking tokens in a separate place in bolt.diy, since we implemented that.
      Yes, o3 has structured outputs, but that is not useful for bolt.diy as far as I'm aware since structured outputs is for generating JSON which we don't do behind the scenes.

  • @PreRendered
    @PreRendered 12 hours ago

    I like this comparison, and it makes me think how cool a basic agentic approach would be where you could break queries into different categories that can be routed to different models, recombining the results at the end. I guess this sort of thing suffers from the same kinds of limitations that you have in parallel computing, but it sounds promising.
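
The routing idea in this comment can be sketched roughly as below. This is only a minimal illustration under stated assumptions: `call_reasoning_model` and `call_ui_model` are hypothetical stand-ins for real model API calls, the keyword-based `categorize` is a placeholder for a real classifier, and none of it comes from the video or from bolt.diy.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


# Hypothetical stand-ins for real model calls (assumptions, not real APIs).
def call_reasoning_model(task: str) -> str:
    return f"[reasoning-model] {task}"


def call_ui_model(task: str) -> str:
    return f"[ui-model] {task}"


# Each category maps to the model assumed to handle it best.
ROUTES: Dict[str, Callable[[str], str]] = {
    "logic": call_reasoning_model,
    "ui": call_ui_model,
}

# Naive keyword list used only for illustration.
UI_KEYWORDS = ("layout", "style", "css", "design", "button")


def categorize(task: str) -> str:
    """Label a sub-task as 'ui' if it mentions a UI keyword, else 'logic'."""
    lowered = task.lower()
    return "ui" if any(word in lowered for word in UI_KEYWORDS) else "logic"


def route_and_combine(tasks: List[str]) -> str:
    """Send each sub-task to its model in parallel and join results in order."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda t: ROUTES[categorize(t)](t), tasks))
    return "\n".join(results)
```

Calling `route_and_combine(["Style the send button", "Validate game rules"])` would send the first sub-task to the UI stub and the second to the reasoning stub. The recombination step here is a plain join, which is where the parallel-computing style limitations mentioned in the comment would show up in a real system.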

  • @mellowtones1985
    @mellowtones1985 3 hours ago

    RepoCloud looks interesting.

  • @telegrphavenuetv
    @telegrphavenuetv 11 hours ago +1

    Try Gemini 2.0 Thinking vs R1

  • @d.d.z.
    @d.d.z. 13 hours ago +1

    Hi Cole. Do you know when o3-mini will be available in Bolt's GitHub API options?

    • @ColeMedin
      @ColeMedin  13 hours ago +1

      Probably pretty soon ;)

  • @ImTarekA
    @ImTarekA 15 hours ago +11

    I think the whole DeepSeek R1 hype is overblown. Sure, it’s open-source, but 99.9% of people can’t run the full version at home anyway, so what’s the point? If you use the R1 API, you’re sending all your data straight to China. And the heavily distilled versions are no match for o3-mini. In the end, R1’s ‘advantages’ feel like a losing proposition for most of us.

    • @ColeMedin
      @ColeMedin  15 hours ago +6

      Yeah, the R1 distill models are certainly no match for o3-mini, but no model you can run locally actually is. They are super impressive models for their size and there is a lot you can do with them, more of which I'll be covering on my channel!
      I do agree though that R1 is overhyped in general.

    • @yurijmikhassiak7342
      @yurijmikhassiak7342 14 hours ago +3

      There are dozens of providers, even Microsoft, that offer the full DeepSeek R1. Cerebras has a 2000 tokens/second model. DeepSeek is the baseline cheap model now, and all other models will be better than it in a few months (e.g. Google Gemini, Llama; all will be, at minimum, DeepSeek on steroids).

    • @abderrahimelayadi
      @abderrahimelayadi 14 hours ago

      Also Claude is good

    • @Phobos11
      @Phobos11 11 hours ago +4

      You can’t run o3-mini locally either, and you can run DeepSeek R1 from a lot of providers outside China, precisely because it’s open source. You’re defeating your own argument.

    • @madhudson1
      @madhudson1 10 hours ago +1

      I'm still having the most success in terms of reliability with Claude 3.5 Sonnet.

  • @puremajik
    @puremajik 14 hours ago

    o3-mini - which version?

    • @ColeMedin
      @ColeMedin  13 hours ago +1

      There is only one option through the API - I'm not actually sure if it is medium or high. Once I find out I'll update the pinned comment!

  • @og_23yg54
    @og_23yg54 9 hours ago +1

    Where is the RAG video you were making? When is it coming out?

    • @ColeMedin
      @ColeMedin  30 seconds ago

      Haha it's coming next week! Just takes a while to make it because it's gonna be good :)

  • @PathLink-fk3cp
    @PathLink-fk3cp 15 hours ago +1

    Great comparison!
    You could potentially use a three-model approach to bring them together into a finalized product and communicate between both development processes.
    That would be Dope!

    • @ColeMedin
      @ColeMedin  15 hours ago

      Thanks and yeah that's what I'm thinking!

  • @noviceartisan
    @noviceartisan 13 hours ago +4

    Making todo apps, or similar ones, is so bloody redundant it's PAINFUL to see them keep being used... Pretty please try making something NEW and interesting, we do not care about to-do lists... it's worse than HELLO WORLD lol

    • @ColeMedin
      @ColeMedin  13 hours ago +2

      Yeah I do agree in a sense, though I just wanted to start with something super simple as a first comparison, but then that's why I moved on to more unique use cases for the second two!

    • @noviceartisan
      @noviceartisan 12 hours ago +1

      @ColeMedin yeah, it's appreciated. I just get frustrated seeing the same thing done a trillion times. It's so simple it's not even a test any more, and my brain fully shuts off seeing it repeatedly. Unique or at least interesting use cases that do different things each time, though, keep me motivated to watch and to learn from the processes :)

    • @xnegusx
      @xnegusx 7 hours ago

      I agree, most AI content is shit. Only Riley Brown is building real stuff.

  • @benjaitacademy3053
    @benjaitacademy3053 1 hour ago

    R1 is the best :) and costs less

  • @PrashadDey
    @PrashadDey 9 hours ago +1

  • @Timers124
    @Timers124 15 hours ago +1

    Did you know ChatGPT can make an AI? Just make sure it is 4o, but you have to add the NLU and the training data (I took it from DeepSeek and GPT-2).

  • @tcoderex
    @tcoderex 13 hours ago

    u bolt will be powerful after 2 years

  • @willdoit222
    @willdoit222 14 hours ago +4

    I tested o3 and it sucks

    • @YoucefKrm-i6z
      @YoucefKrm-i6z 13 hours ago +2

      True

    • @theyreatinthecatsndogs
      @theyreatinthecatsndogs 9 hours ago

      😂 What was your "test"?

    • @mtor983
      @mtor983 2 hours ago

      Try o3-mini high - it is capable of building 3D games or any 2D games without any issues. Yes, it has a less attractive design than R1, but o1mini-high is the winner when it comes to following the rules of games or defining the constraints of the world.

  • @annonymbruger
    @annonymbruger 10 hours ago +1

    This is so stupid. There are so many ways to use these models for coding, and saying your strategies and toolchain are representative of the way to use them is so incredibly naive.

  • @ChristiaanRoest79
    @ChristiaanRoest79 9 hours ago +2

    DeepSeek R1 is overhyped and overrated.

    • @mtor983
      @mtor983 2 hours ago +1

      and insecure