10:32 Gemini 2 pro just designed the best SVG butterfly so far! No other model was even close to this!
I used Gemini 2.0 and it solved a piece of code that Claude 3.5 Sonnet wasn't able to. I thought it did it by accident.
the same Gemini 2.0 is not better than Claude 3.5
Same, I never expected anything from Gemini; even Qwen 2.5 Max seems better.
But its agentic capabilities are just as bad as R1 and V3.
@@mrkoalamanda Anyway, if it involves just one LLM and APIs, then the task can be accomplished with plain code and wouldn't need agents.
Gemini 2.0 Flash Thinking with Search is my favorite! It automatically determines if a search is needed before answering.
I think in terms of performance Gemini lost to R1, but the 1M context window is enough to make an app in one chat
You are the best channel: you say what the audience wants to hear, no bullshitting, straight to the point. Keep up the good work. ❤
The context window makes up for it; 1M tokens is more than anyone else offers.
It is 2 million, sir.
@@cbgaming08 For pro it is 2 Million, for flash 1 Million
@@fabiankliebhan yezzir!
DeepSeek has been unusable for a week. I tried both Flash 2.0 and Pro; they look good for my use case.
I wish there were UI benchmarks. I use it a TON for UI development because I am so bad at design
I just tried Gemini 2.0 Thinking experimental. It was absolutely horrible for my use case! I am the record holder for the fastest assembly routines for 6502 arithmetic, and I tried to prompt it towards the optimal solution, but it just doesn't understand. It had no concept that this CPU can only operate on 8-bit values. It tried to load the high byte of a 32-bit number into a register, then the low byte, thus missing the middle two bytes. It also doesn't understand the concept of self-modifying code, even when given an example. Claude and DeepSeek are the only models which produced sensible results in my case. Claude even used fragments of my world record post online in its commentary; obviously my code was in its data set.
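For context, a minimal sketch of what a correct 32-bit add looks like on the 6502, working through all four bytes one at a time with the carry flag (the labels num1, num2 and result are placeholders, not from the original post, and little-endian 4-byte operands in memory are assumed):

    ; 32-bit add: result = num1 + num2 (little-endian, 4 bytes each)
    CLC              ; clear carry before the lowest byte
    LDA num1         ; byte 0 (low)
    ADC num2
    STA result
    LDA num1+1       ; byte 1
    ADC num2+1
    STA result+1
    LDA num1+2       ; byte 2
    ADC num2+2
    STA result+2
    LDA num1+3       ; byte 3 (high), carry rippled up from below
    ADC num2+3
    STA result+3

The point is that the accumulator only ever holds 8 bits, so every one of the four bytes has to be loaded, added with carry, and stored separately.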
Oh, those butterflies are nice; must be the image-generation training data part of these models doing its thing!
Gemini 2.0 Pro is very bad at coding and makes tons of silly mistakes.
Yeah, and it also works really badly with n8n AI agents.
What cheap AI should I use for code then?
R1 with a provider like Hyperbolic is working for me. DeepSeek V3 also works well with providers like OpenRouter. Then use Sonnet if any of them get stuck, because it's expensive. @@elbrody252
All Google things are hype on YouTube but fall flat when you test them in real life lol, practical when you own the platform 😂
Just check Question 4, and you’ll know directly if LLMs are a beast or not ...
I was looking through your videos to find that tool that you demonstrated a while ago that allows you to point to several models and compare their outputs. What's the name of the tool?
Think that's ninjachat
Yeah, the guy who leads Gemini AI said he wasn't that impressed with the performance boost, but he probably meant just 2.0 Flash, because that was a while ago.
DeepSeek-R1 is the king of LLMs for now 😂😂
Gemini has been my go-to with RooCode for a while; it's much better than R1, like by miles.
Is this model a thinking model? Pretty crazy that deepseek is still on top.
No. The Pro thinking model is not out yet, I believe.
I think the only benefit of using these models is their multimodal capabilities. At least Google had the good sense to make the models cheap, considering they are not as good as V3 or R1. Gemini 2.5 or 3 should be as good as or better than DS V3/R1, or don't even put out another model. There is no excuse for Google to produce a closed-source product that doesn't perform as well as the DS models. Just figure out how to increase the context size, add multimodal capabilities, train the model using the DS process, and Google would have a product worth paying for, IMO. Lastly, Google could probably train a DS-style MoE model a lot faster, given that Google has access to NVIDIA's latest and greatest processors.
Wow another video ❤
Is it better than Gemini Exp 1206? It seems to me that this is the same model, just renamed.
Probably yes.
No, worse.
I'm currently using Gemini 2.0 Flash Experimental with an API key in my VS Code Continue extension. What should I change to point to the stable Gemini 2.0 Flash model?
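For what it's worth, a rough sketch of how that switch usually looks in Continue's config.json, assuming the "gemini" provider and the stable model id gemini-2.0-flash (the title and key here are placeholders, so double-check against Continue's docs):

    {
      "models": [
        {
          "title": "Gemini 2.0 Flash",
          "provider": "gemini",
          "model": "gemini-2.0-flash",
          "apiKey": "YOUR_GEMINI_API_KEY"
        }
      ]
    }

Basically you only swap the model id from the experimental one (something like gemini-2.0-flash-exp) to gemini-2.0-flash and keep the same API key.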
2 million token window, man. Anthropic is just doing nothing, man.
Do you guys think that it's now the best model to use in free Copilot in VS Code?
No, Sonnet is still king
it's good but how do they achieve 1 million tokens context, is it a sliding context window?
No, I think it's the full context window.
Love your videos.
Please compare low-cost models for API devs for text generation.
gpt 4o-mini vs gemini 2.0 flash vs gemini 2.0 flash lite
Between the ones you mentioned it's definitely Gemini 2 Flash
@@MarvijoSoftware Thank you
😶🌫️😶🌫️😶🌫️😶🌫️
How do you churn out videos at this rate? I can barely keep up😅
This model can deep-think with a prompt in the system instruction. It still fixates on hallucinations, but imho not as badly as previous models from Google. Strong nudging can help, but it blames the user when it finally acknowledges its error.
Ye, true, AICodeKing is *insanely better*; brightening our days; transcending his cohort; always bringing forth the freshest produce and impeccable delivery.
Your point about DeepSeek: the API has been down for 7 days straight, and the chat interface also doesn't respond in DeepThink mode, so how can we compare it with the others if it doesn't work at all?
It's an open-source model. There are a bunch of providers who host it as well, including Together, Hyperbolic, and a bunch of others.
@AICodeKing They cost more than DeepSeek.
@@AnuragRai-xy7kh you have the option to run it locally
@neazybanga Yeah, but running it locally won't be as effective as the 600B model. I know it's open source and all, but it should be reliable as well.
NVIDIA NIM offers 1000 free credits and it has R1.
But how come its stock plummeted this morning?
Revenue is down a bit. Probably because people are starting to use Google Search less, as there are better AI-assisted search alternatives.
hey bro, from somalia
just found out i can switch between different audio languages on youtube
On both of those synth keyboards, the tuning sounded wrong.
Alright, meta's turn.
Thank you
Try using it with sonnet 3.5 system prompt
Well, it's not even that good, though their thinking model is kinda cracked.
google = too much DEI
hard to justify any other ai when deepseek is around.
Agreed. Even o3-mini didn't compete when I tested it.
Add a new test: making an Android app.
2.0 better than sonnet atm
I beg to differ. How did you test them?
🤑