The Community Voting period of the oTTomator Hackathon is open! Head on over to the Live Agent Studio now and test out the submissions and vote for your favorite agents. There are so many incredible projects to try out!
studio.ottomator.ai
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Head on over to RepoCloud with the link below and check out all the incredible open source platforms you can deploy with one click (including bolt.diy, n8n, and Open WebUI!):
repocloud.io/
The fact that it was hard for you to decide and there was no clear winner says a lot about R1.
Thanks for another very nice video, Cole! Could you maybe share the ultimate prompt to build an AI assistant UI in one shot, without the double-message issue and with working chat history?
Hey Cole! Great video and really great comparison! Thanks for sharing the info! Jay
This was a great video. I'm taking all your advice from other vids into account, and I have to say, great job bro!
Thanks Jacob, I appreciate it a lot!
I would have liked to see some mention of how each dealt with its own issues. That is, for example, after R1 didn't load any history, were you able to ask it to improve that area?
This would help those of us who want to use these to build things get a better sense of which might be the better product to work with from start to finish. Did you test that?
I appreciate the feedback - I agree that would have been good to show! Yes with R1 after a couple more prompts I was able to get it to show the conversation history.
@ColeMedin I appreciate your work. You're killing it. Whenever I see your stuff on my feed, it's an instant watch. I hope to start working through your RAG stuff soon.
How would you use both of them together? Wrap them both in a script? I'm working on something like that at the moment (wrap 5, wrap 10, etc.), just figuring out how to do the logic side of the wrapper.
Can you clarify, was this comparison o3-mini HIGH, or just o3-mini low or medium? Also, apparently the bolt.diy implementation removes the /thinking tag area in the code generation based on section? I'm reading that o3 has structured output control while R1 does not, and I assumed that would help agent and bolt.diy type applications?
The API only has one version of o3-mini, not sure which one it is actually! OpenRouter actually doesn't show the thinking tokens it seems, because when I go right through the DeepSeek API I see the thinking tokens in a separate place in bolt.diy since we implemented that.
Yes, o3 has structured outputs, but that is not useful for bolt.diy as far as I'm aware, since structured outputs are for generating JSON, which we don't do behind the scenes.
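For anyone wondering what structured outputs actually looks like, here is a minimal sketch using the OpenAI Python SDK; the todo schema and prompt are made-up illustrations, not anything from the video. It guarantees JSON matching a schema, which is great for agents and tools but not for a code-generation flow like bolt.diy.

```python
# Minimal sketch of OpenAI structured outputs (assumes the openai Python SDK;
# the schema and prompt below are illustrative placeholders).
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o3-mini",
    messages=[{"role": "user", "content": "Add 'buy milk' to my todo list."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "todo_item",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "done": {"type": "boolean"},
                },
                "required": ["title", "done"],
                "additionalProperties": False,
            },
        },
    },
)

print(response.choices[0].message.content)  # a JSON string matching the schema
```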
I like this comparison, and it makes me think how cool a basic agentic approach would be where you could break queries into different categories that can be routed to different models, recombining the results at the end. I guess this sort of thing suffers from the same kinds of limitations that you have in parallel computing, but it sounds promising.
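That router idea is pretty easy to prototype. Here is a rough sketch of what it could look like, assuming an OpenAI-compatible endpoint like OpenRouter; the model IDs, the two categories, and the classification prompt are all placeholder assumptions.

```python
# Rough sketch of category-based routing across models (OpenAI-compatible
# endpoint assumed; model IDs, categories, and prompts are placeholders).
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

ROUTES = {
    "frontend": "deepseek/deepseek-r1",  # e.g. UI/design-heavy requests
    "logic": "openai/o3-mini",           # e.g. game rules, constraints, algorithms
}

def classify(query: str) -> str:
    """Ask a small model which category the query belongs to."""
    result = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Reply with one word, 'frontend' or 'logic': {query}",
        }],
    )
    answer = result.choices[0].message.content.strip().lower()
    return answer if answer in ROUTES else "logic"

def route(query: str) -> str:
    """Send the query to whichever model its category maps to."""
    model = ROUTES[classify(query)]
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return reply.choices[0].message.content
```

Recombining the results would just mean collecting the per-category answers and handing them to one model to merge, which is where the coordination overhead you mention comes in.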
RepoCloud looks interesting.
Try Gemini 2.0 Thinking vs R1
Hi Cole. Do you know when o3-mini will be available in Bolt's GitHub API options?
Probably pretty soon ;)
I think the whole DeepSeek R1 hype is overblown. Sure, it’s open-source, but 99.9% of people can’t run the full version at home anyway, so what’s the point? If you use the R1 API, you’re sending all your data straight to China. And the heavily distilled versions are no match for o3-mini. In the end, R1’s ‘advantages’ feel like a losing proposition for most of us.
Yeah the R1 distill models are certainly no match for o3-mini, but no model you can run locally actually is. They are super impressive models for their size and there's a lot you can do with them - more of which I'll be covering on my channel!
I do agree though that R1 is overhyped in general
There are dozens of providers, even Microsoft, that serve the full DeepSeek R1. Cerebras has a 2,000 tokens/second version. DeepSeek is the baseline cheap model now, and all other models will be better than it in a few months (e.g. Google Gemini, Llama; all of them will be DeepSeek on steroids at minimum).
Also Claude is good
You can’t run o3-mini locally either, and you can run DeepSeek R1 from a lot of providers not in China, precisely because it’s open source. You’re defeating your own argument.
I'm still having the most success in terms of reliability with Claude 3.5 Sonnet.
o3-mini - which version?
There is only one option through the API - I'm not actually sure if it is medium or high. Once I find out I'll update the pinned comment!
Where is the RAG video you were making? When is it coming out?
Haha it's coming next week! Just takes a while to make it because it's gonna be good :)
Great comparison!
Could potentially use a 3-model approach to bring them together into a finalized product and communicate between both development processes.
That would be dope!
Thanks and yeah that's what I'm thinking!
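In case it helps anyone picture it, here is a very rough sketch of that three-model idea: get a draft from each model, then have a third merge them. The model IDs, the merge prompt, and the choice of merger are placeholder assumptions, not anything from the video.

```python
# Sketch of a "two drafts, one merge" flow (OpenAI-compatible endpoint
# assumed; model IDs and prompts are placeholders).
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

def ask(model: str, prompt: str) -> str:
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

def build(prompt: str) -> str:
    draft_a = ask("deepseek/deepseek-r1", prompt)  # first draft
    draft_b = ask("openai/o3-mini", prompt)        # second draft
    merge_prompt = (
        "Two drafts of the same app follow. Combine the best parts of each "
        f"into one final version.\n\nDraft A:\n{draft_a}\n\nDraft B:\n{draft_b}"
    )
    return ask("anthropic/claude-3.5-sonnet", merge_prompt)  # third model merges
```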
Making todo apps, or similar ones, is so bloody redundant it's PAINFUL to see them keep being used... Pretty please try making something NEW and interesting, we do not care about to-do lists... it's worse than HELLO WORLD lol
Yeah I do agree in a sense, though I just wanted to start with something super simple as a first comparison, and that's why I moved on to more unique use cases for the second two!
@ColeMedin Yeah, it's appreciated. I just get frustrated seeing the same thing done a trillion times; it's so simple it's not even a test anymore, and my brain fully shuts off seeing it repeatedly. Unique or at least interesting use cases that do different things each time, though, keep me motivated to watch and to learn from the processes :)
I agree most AI content is shit; only Riley Brown is building real stuff.
R1 is the best :) and costs less
❤
Did you know ChatGPT can make AI? Just make sure it is 4o, but you have to add the NLU and the training data (I took it from DeepSeek and GPT-2).
Bolt will be powerful after 2 years.
I tested o3 and it sucks
True
😂 What was your "test"?
Try o3-mini HIGH - it is capable of building 3D games or any 2D games without any issues. Yes, its designs are less attractive than R1's, but o3-mini-high is the winner when it comes to following the rules of games or defining the constraints of the world.
This is so stupid. There are so many ways to use these models for coding, and saying your strategies and toolchain should be representative of the way to use them is so incredibly naive.
DeepSeek R1 is overhyped and overrated.
and insecure