Google’s NEW Open-Source Model Is SHOCKINGLY BAD

  • Published 8 Aug 2024
  • Sorry for the title. I couldn't help myself. I'm proud of Google for releasing a completely open-source model to the world, but it's not good. How bad is it? Let's find out!
    Enjoy :)
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? ✅
    forwardfuture.ai/
    Rent a GPU (MassedCompute) 🚀
    bit.ly/matthew-berman-youtube
    USE CODE "MatthewBerman" for 50% discount
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    Media/Sponsorship Inquiries 📈
    bit.ly/44TC45V
    Links:
    huggingface.co/google/gemma-7...
    blog.google/technology/develo...
    bit.ly/3qHV0X7
    lmstudio.ai/
    huggingface.co/chat/
    Chapters:
    0:00 - It started badly...
    0:53 - All about Gemma
    7:23 - Quick Note on Gemini 1.5
    9:56 - Gemma Setup with LMStudio
    11:51 - Gemma Testing with LMStudio
    20:58 - Gemma Testing with HuggingFace
    Disclosures:
    I'm an investor in LMStudio
  • Science & Technology

COMMENTS • 528

  • @ALFTHADRADDAD
    @ALFTHADRADDAD 5 місяців тому +237

    Google SHOCKS and STUNS the Open source landscape

    • @matthew_berman
      @matthew_berman  5 місяців тому +87

      I should have used this title

    • @TechRenamed
      @TechRenamed 5 місяців тому +10

      Lol we all should have!!

    • @mickelodiansurname9578
      @mickelodiansurname9578 5 місяців тому +8

      @@matthew_berman I thought at one stage you were literally going to start slapping your forehead against the keyboard!

    • @andersonsystem2
      @andersonsystem2 5 місяців тому +6

      Why do most AI tech channels use that title? 😂 I just don’t pay attention to titles like that lmao 😂😊

    • @Kutsushita_yukino
      @Kutsushita_yukino 5 місяців тому +11

      its a meme at this point

  • @Lukebussnick
    @Lukebussnick 5 місяців тому +101

    My funniest experience with Gemini Pro: I asked it to make a humorous image of a cartoon cat pulling the toilet paper off the roll. It told me it ethically couldn’t, because the cat could ingest the toilet paper and it could cause an intestinal blockage 😂

    • @laviniag8269
      @laviniag8269 5 місяців тому +3

      hysterical

    • @matikaevur6299
      @matikaevur6299 5 місяців тому +1

      @@laviniag8269
      but true . .

    • @MilkGlue-xg5vj
      @MilkGlue-xg5vj 5 місяців тому +5

      Maybe a cat could see the image and do the same

    • @Lukebussnick
      @Lukebussnick 5 місяців тому +4

      @@MilkGlue-xg5vj haha yeah that would be a real nuisance. But then again, that’s one smart cat. What other potential could that cat have?? 🧐

    • @MilkGlue-xg5vj
      @MilkGlue-xg5vj 5 місяців тому +10

      @@Lukebussnick Maybe it could become an ai dev at Google

  • @bits_of_bryce
    @bits_of_bryce 5 місяців тому +88

    Well, I'm never trusting benchmarks without personal testing again.

    • @richoffks
      @richoffks 5 місяців тому +9

      sorry you had to learn this way

    • @wilburdemitel8468
      @wilburdemitel8468 5 місяців тому

      welcome to real life. Can't wait for you to leave the fantasyland bubble all these tech aibros have built around you.

  • @Batmancontingencyplans
    @Batmancontingencyplans 5 місяців тому +106

    Gemma 7b makes you realise how much compute Google is using just to output sorry I can't fulfill that request 🤣

    • @MilkGlue-xg5vj
      @MilkGlue-xg5vj 5 місяців тому

      LMFAO

    • @markjones2349
      @markjones2349 5 місяців тому

      So true. Uncensored models are just more fun.

    • @MilkGlue-xg5vj
      @MilkGlue-xg5vj 5 місяців тому

      @@markjones2349 you're talking as if the point of uncensored llms is fun rofl lmfao xd you're just makin' it funnier 🤣

  • @bigglyguy8429
    @bigglyguy8429 5 місяців тому +133

    You can't gimp the model with excessive censorship, and also have an intelligent model.

    • @aoolmay6853
      @aoolmay6853 5 місяців тому +22

      These are not open models, these are woke models, appropriately liberal.

    • @eIicit
      @eIicit 5 місяців тому +2

      To a point, I agree.

    • @madimakes
      @madimakes 5 місяців тому +3

      The nature of the errors here seem irrelevant to being censored or not.

    • @bigglyguy8429
      @bigglyguy8429 5 місяців тому +11

      @@madimakes No, the censorship sucks up so much of its thinking that there's little left to actually answer. You can ask the most banal question, but it sits there thinking long and hard about whether there's any way that could possibly be offensive to the woke. Considering the woke are offended by everything, that's a yes, so it has to work its way around that; then it needs to figure out if its own reply is offensive (yes, everything is), so it has to find a way around that as well. Often it will fail and say "I'm afraid I can't do that... Dave." Other times it will try, but the answer is so gimped and pathetic you'd have been better off asking your cat.

    • @mickmoon6887
      @mickmoon6887 5 місяців тому +1

      Exactly
      The model's network design is gimped by its creator developers themselves: the head of Google AI literally holds biased, ideological, anti-white, heavily pro-censorship values, all documented in their online record, and that's why those biases are reflected in the model

  • @zeal00001
    @zeal00001 5 місяців тому +44

    In other words, there are now LLMs with mental challenges as well...

  • @drgutman
    @drgutman 5 місяців тому +50

    I'm pretty sure they lobotomized it in the alignment phase :)))

    • @hikaroto2791
      @hikaroto2791 5 місяців тому +7

      To the point they took the lobotomy fragment and used it in place of the brain, and trashed the actual brain. Not only on models, but on personnel probably

  • @chriscarr9852
    @chriscarr9852 5 місяців тому +138

    This is entirely speculation on my part, but I am guessing Google’s AI effort is largely driven by their PR team. A proper engineering team would never release this kind of smoke and mirrors crap. Right?

    • @chriscarr9852
      @chriscarr9852 5 місяців тому +23

      They have tarnished their brand. It will be interesting to see what happens in the next few years with regard to Google. (I do not have any financial interest in google).

    • @mistersunday_
      @mistersunday_ 5 місяців тому +7

      Yeah, they are the wrong kind of hacks now

    • @alttabby3633
      @alttabby3633 5 місяців тому +12

      Or engineering team knows this will be killed off regardless of quality or popularity so why bother.

    • @richoffks
      @richoffks 5 місяців тому +2

      @@chriscarr9852 we're watching the end of Google smh

    • @michaelcalmeyerhentschel8304
      @michaelcalmeyerhentschel8304 5 місяців тому +7

      No, Left. They are all one viewpoint at Google and have been so for decades. The PR folks represent the programmers and their programmer-managers and Sr. management.

  • @natecote1058
    @natecote1058 5 місяців тому +17

    If Google keeps messing around with their censored models and underperforming open-source models, they'll get left in the dust. Mistral could end up way ahead of them in the next few months. They should find that embarrassing...

  • @NOTNOTJON
    @NOTNOTJON 5 місяців тому +12

    Plot twist, Google was so far behind the AI race that they had to ask Llama or GPT 4 to create a model from scratch and this is what they named Gemini / Gemma.

    • @tteokl
      @tteokl 5 місяців тому

      Google is so far behind these days. I love Google's design language, tho, but their tech? Meh.

  • @deflagg
    @deflagg 5 місяців тому +66

    Gemini Advanced is bad too, compared to GPT-4. Gemini sometimes answers in a different language, is too cautious, and gets things wrong a lot of the time.

    • @CruelCrusader90
      @CruelCrusader90 5 місяців тому +11

      "too cautious" is an understatement.

    • @veqv
      @veqv 5 місяців тому +5

      @@CruelCrusader90 Genuinely. If it's not a question about software development there's a wildly high chance that it'll start quizzing you on why you have the right to know things. I do hobby electronics and wanted to see how it would fare on helping make a charging circuit. It basically refused. Same is true for rectifiers. Too dangerous for me apparently lol. Ask it questions on infosec and it'll answer fine though. It's wild.

    • @richoffks
      @richoffks 5 місяців тому

      @@veqv lmao it refused. All anyone has to do is release a completely uncensored model and they'd literally take over the industry from their house. I don't know why Google is such a fail at every product launch.

    • @CruelCrusader90
      @CruelCrusader90 5 місяців тому

      @@veqv Yeah, I had a similar experience. I asked it to generate top, front, and side views of a vehicle chassis to create a 3D model in Blender (for a project I'm working on). It said the same thing: it's too dangerous to generate the image.
      I didn't expect it to make a good, consistent vehicle chassis across all the angles, but I was curious to see how far it was from making that possible. And I don't even know how to gauge its potential with that kind of developer behind its programming.
      Even a one out of ten would represent progress at its slowest, but that would be generous.

    • @Ferreira019760
      @Ferreira019760 5 місяців тому

      Bad doesn't begin to cut it. At this rate, Google will become irrelevant in most of its services. It makes no difference how much money they have; their policy is wrong and the AI models show it. They are so scared of offending someone or being made liable that their AI actually dictates what happens in the interactions with the users. That doesn't just make it annoying and a waste of time, it means that it cannot learn. Even worse than not learning, it's becoming dumber by the day. I cannot believe I'm saying this, but I miss Bard. Gemini doesn't cut it in any way, shape or form. It's probably good for philosophy exercises, but so far I don't see any decent use for it aside from that. Give it enough space to go off on wild tangents and you may get a potentially interesting conversation, but don't expect anything productive from it. I'm done with trying out Google's crap for some time. Maybe in a month or two I will allow myself the luxury of wasting time again to see how they are doing, but not for now. Their free trial is costing me money, that's how bad it is.

  • @jbo8540
    @jbo8540 5 місяців тому +27

    Google set the entire open-source community back half an hour with this troll release. Well played, Google.

    • @romantroman6270
      @romantroman6270 5 місяців тому +2

      Don't worry, Llama 3 will set the Open Source community 31 minutes ahead lol

  • @trsd8640
    @trsd8640 5 місяців тому +37

    This shows one thing: we need a different kind of benchmark.
    But great video, Matthew, thanks!

    • @MM3Soapgoblin
      @MM3Soapgoblin 5 місяців тому +7

      Deepmind has done some pretty amazing work in the machine learning space. My bet is that they created a fantastic model and that's what was benchmarked. Then the Google execs came along and "fixed" the model for "safety" and this is the result.

    • @R0cky0
      @R0cky0 5 місяців тому +1

      Let's call it Matthew Benchmark

    • @R0cky0
      @R0cky0 5 місяців тому

      @@MM3Soapgoblin DeepMind should spin off from Google. It's a shame that they still operate under what Google is now, given their amazing work in the past.

  • @AIRadar-mc4jx
    @AIRadar-mc4jx 5 місяців тому +35

    Hey Matthew, it's not an open-source model, because they are not releasing the source code. It's an open-weight or open model.

    • @PMX
      @PMX 5 місяців тому +3

      But... they did? At least for inference, they uploaded both Python and C++ implementations of the inference engine for Gemma to GitHub. Which I suspect have bugs, since I can't otherwise understand how they could release a model that performs this poorly...

    • @judedavis92
      @judedavis92 5 місяців тому +2

      Yeah they did release code.

  • @mathematicalninja2756
    @mathematicalninja2756 5 місяців тому +20

    On the bright side, we have a top-end model to generate rejected responses for DPO

    • @user-qr4jf4tv2x
      @user-qr4jf4tv2x 5 місяців тому +8

      can we not have acronyms 😭

    • @Alice_Fumo
      @Alice_Fumo 5 місяців тому +9

      @@user-qr4jf4tv2x I believe DPO in this context stands for "Direct Preference Optimization", which is a recent alternative technique to RLHF with fewer steps, and thus more efficient.
      I'm actually not 100% sure, but I believe the joke here is that if you try employing this model for DPO to "align" any other base-model, what you get is another model which only ever refuses to respond to anything.
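
For readers unfamiliar with the acronym, here is a minimal sketch of what the joke above describes: a DPO-style preference dataset pairs each prompt with a preferred ("chosen") and a dispreferred ("rejected") completion, and a refusal-happy model is a cheap source of the rejected side. The field names and helper functions below are illustrative placeholders, not any particular library's API.

```python
# Hypothetical sketch: building DPO-style preference pairs where the
# "rejected" completion is a blanket refusal (the joke in the thread above).
prompts = [
    "Write Python code that prints the numbers 1 to 100.",
    "How many words are in your response to this prompt?",
]

def refusal_model(prompt: str) -> str:
    # Placeholder for a model that refuses nearly everything.
    return "I'm sorry, but I can't help with that request."

def better_model(prompt: str) -> str:
    # Placeholder for a model that actually answers.
    return f"Here is one way to approach {prompt!r} ..."

preference_pairs = [
    {"prompt": p, "chosen": better_model(p), "rejected": refusal_model(p)}
    for p in prompts
]

for pair in preference_pairs:
    print(pair["prompt"], "->", pair["rejected"])
```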

  • @sandeepghael763
    @sandeepghael763 5 місяців тому +12

    @matthew Berman I think something is wrong with your test setup. I tested the `python 1 to 100` example with Gemma 7B via Ollama (4-bit quantized version, running on CPU) and the model did just fine. Check your prompt template or other setup config.

    • @hidroman1993
      @hidroman1993 5 місяців тому +1

      He was already recording, so he didn't want to check the setup LOL
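
A rough sketch of reproducing the test described in the parent comment, assuming Ollama is running locally on its default port and `gemma:7b` has been pulled; the prompt wording is an approximation of the video's question, not the exact one used.

```python
# Ask a locally served Gemma for the 1-to-100 script via Ollama's HTTP API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma:7b",
        "prompt": "Write Python code that prints the numbers from 1 to 100.",
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```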

  • @f4ture
    @f4ture 5 місяців тому +40

    Google’s NEW Open-Source Model Is so BAD... It SHOCKED The ENTIRE Industry!

  • @Greenthum6
    @Greenthum6 5 місяців тому +22

    I was absolutely paralyzed by the performance of this model.

    • @Wanderer2035
      @Wanderer2035 5 місяців тому +2

      Me: I send Pikachu GO! Use STUN attack on Greenthum6 NOW!
      Pikachu: Pika Pika Pika!!! BBBZZZZZZZZZ ⚡️⚡️⚡️⚡️⚡️
      Me: Greenthum6 seems to be in some form of paralysis. Quick Pikachu follow that up with a STUN attack on Greenthum6 NOW! Give him everything you got!!!
      Pikachu: PIKA…. PIKAAAAAAAAAAA……. CHUUUUUUUUUUUUUUUU!!!!!!!
      BBBBBBBBBBZZZZZZZZZZZZZZZZ ⚡️⚡️⚡️⚡️⚡️⚡️⚡️⚡️
      Greenthum6 = ☠️ ☠️☠️
      Me: Aaaahhh that was nice, I’m sure Greenthum6 will make a nice pokimon to my collection 🙂. **I throw my pokiball to Greenthum6 and it captures him as my new pokimon to my collection**

  • @snowhan7006
    @snowhan7006 5 місяців тому +24

    This looks like a hastily completed homework assignment by a student to meet the deadline

    • @shujin6600
      @shujin6600 5 місяців тому +3

      and that student was highly political and easily offended by everything

  • @mistersunday_
    @mistersunday_ 5 місяців тому +33

    Until Google spends less time on woke and more time on work, I'm not touching any of their products with a 10-foot pole

    • @Alistone4512
      @Alistone4512 5 місяців тому +3

      - by a person on YouTube :P

    • @StriderAngel496
      @StriderAngel496 5 місяців тому

      truuuu but you know what he meant lol@@Alistone4512

  • @Nik.leonard
    @Nik.leonard 5 місяців тому +7

    At the moment, there are a couple of issues with quantization and with running the model in llama.cpp (LM Studio uses llama.cpp as its backend), so when the issues are fixed, I'm going to re-test the model. It's weird that the 2b model gives better responses than the "7b" model (which is really more like 8-point-something billion parameters).

  • @protovici1476
    @protovici1476 5 місяців тому +7

    I'm wondering if this is technically half open-sourced given some critical components aren't available from Google.

  • @antigravityinc
    @antigravityinc 5 місяців тому +4

    It’s like asking an undercover alien to explain normal Earth things. No.

  • @Random_person_07
    @Random_person_07 5 місяців тому +3

    The thing about Gemini is it has the memory of a goldfish: it can barely hold on to any context, and you always have to tell it what it's supposed to write

  • @BTFranklin
    @BTFranklin 5 місяців тому +8

    Could you try lowering the temperature? The answers when you were running it locally look a lot like what I'd expect if the temp was set too high.

  • @hawa7264
    @hawa7264 5 місяців тому +7

    The 2B version of Gemma is actually quite good for a 2B model. The 7B model is... a car crash.

    • @frobinator
      @frobinator 5 місяців тому

      I found the same, the 2B model is much better than the 7B for my set of tasks.

  • @pixels7223
    @pixels7223 5 місяців тому +3

    I like that you tried it on Hugging Face, cause now I can say with certainty: "Google, why?"

  • @PoorNeighbor
    @PoorNeighbor 5 місяців тому +9

    That was actually really funny. The answers are so out of the blue Mannn

  • @DeSinc
    @DeSinc 5 місяців тому +1

    Looking at those misspellings and odd symbols all through the code examples, it's clear that something is mis-tuned in the params: whatever UI you're using hasn't been updated to support this new model. Apparently the interface I was using has corrected this, as I was able to get coherent text with no misspellings, but I did see people online saying they were having the same trouble as you: incoherent text and obvious mistakes everywhere. It's likely the parameters need to be updated to the values the model works best with.

  • @puremintsoftware
    @puremintsoftware 5 місяців тому +2

    Imagine if Ed Sheeran released that video of DJ Khaled hitting an acoustic guitar, and said "This is my latest Open Source song". Yep. That's this.

  • @himeshpunj6582
    @himeshpunj6582 5 місяців тому +4

    Please do fine-tuning based on private data

  • @phrozen755
    @phrozen755 5 місяців тому +9

    Yikes google! 😬

  • @michaelrichey8516
    @michaelrichey8516 5 місяців тому +1

    Yeah - I was running this yesterday and ran into the same things - as well as the censorship, where it decided that my "I slit a sheet" tongue twister was about self-harm and refused to give an analysis.

  • @TylerHall56
    @TylerHall56 5 місяців тому +1

    The settings on Kaggle may help. The widget there uses the following: Temperature: 0.4, Max output tokens: 128, Top-K: 5.
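
If anyone wants to try those widget settings locally, a sketch with Hugging Face transformers follows; the model ID and prompt are assumptions, and the sampling values mirror the ones quoted above.

```python
# Generate with temperature 0.4, top-k 5, and a 128-token cap,
# mirroring the Kaggle widget settings quoted in the comment above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"  # assumed instruction-tuned checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tok("Give me a meal plan for today.", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    do_sample=True,      # sampling must be enabled for temperature/top-k to apply
    temperature=0.4,
    top_k=5,
    max_new_tokens=128,
)
print(tok.decode(out[0], skip_special_tokens=True))
```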

  • @nadinejammet7683
    @nadinejammet7683 5 місяців тому +2

    I think you didn't use the right prompt format. It's a mistake a lot of people make with open-source LLMs.
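
One way to avoid guessing the prompt format is to let the tokenizer's own chat template build it. A minimal sketch, assuming the instruction-tuned checkpoint `google/gemma-7b-it`:

```python
# Build the prompt with the model's own chat template instead of raw text,
# so the turn markers Gemma expects are inserted for you.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-7b-it")
messages = [{"role": "user", "content": "How many words are in your reply to this prompt?"}]

prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # prints the exact formatted prompt, turn markers included
```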

  • @erikjohnson9112
    @erikjohnson9112 5 місяців тому +2

    Maybe it was a spelling error by Google: "State of the fart AI model". Yeah this model stinks. Yeah I am exhibiting a 14-year old intellect.

    • @AlexanderBukh
      @AlexanderBukh 5 місяців тому +1

      State of brain fart it is.

  • @oriyonay8825
    @oriyonay8825 5 місяців тому +3

    Each parameter is just a floating-point number (assuming no quantization), which takes 4 bytes. So 7B parameters is roughly 7B * 4 bytes = 28 GB, so 34 GB is not that surprising :)
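
The same back-of-the-envelope math as a few lines of Python, extended to the common lower-precision cases (metadata overhead ignored):

```python
# Rough file sizes for a 7B-parameter model at different precisions.
params = 7e9
for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.1f} GB")
# fp32 comes out around 28 GB, which is why an unquantized GGUF is so large.
```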

  • @MattJonesYT
    @MattJonesYT 5 місяців тому +2

    Have you noticed that chatgpt4 is very bad in the last few days? Like it can't remember more than about 5 messages in the conversation and it constantly says things like "I can't help you with that" on random topics that have nothing to do with politics or anything sensitive. It's like they've got the guardrails dialed to randomly clamp down to a millimeter and it can't do anything useful half the time. I have to restart the conversation to get it to continue.

    • @blisphul8084
      @blisphul8084 5 місяців тому +1

      They switched to gpt4 turbo. The old gpt4 via API is better

  • @robertheinrich2994
    @robertheinrich2994 5 місяців тому

    Just to ask, how do I get the latest version for Linux, when it is only updated for Windows and Mac, but not Linux?
    Does LM Studio work with Wine?

  • @zerorusher
    @zerorusher 5 місяців тому +3

    Google STUNS Gemma SHOCKING everyone

  • @chrisbranch8022
    @chrisbranch8022 5 місяців тому +1

    Google is having its Blockbuster Video moment - this is embarrassingly bad

  • @HistoryIsAbsurd
    @HistoryIsAbsurd 5 місяців тому +8

    See, when you save the word SHOCKING for when it's actually SHOCKING, it's WAY more impactful and doesn't sound like you are spitting in the face of your community.
    Great video! Their half-open-sourced LLM is hilariously bad

  • @agxxxi
    @agxxxi 5 місяців тому +2

    It apparently has a very different prompt template. You should definitely try that at 13:26, but the model is still kinda huge and unsatisfactory for this demo 😮

  • @33gbm
    @33gbm 5 місяців тому +2

    The only Google AI branch I still find credible is DeepMind. I hope they don't ruin it as well.

  • @liberty-matrix
    @liberty-matrix 5 місяців тому +2

    "AI will probably most likely lead to the end of the world, but in the meantime, there will be great companies." ~Sam Altman, CEO of OpenAI

  • @yogiwp_
    @yogiwp_ 5 місяців тому +1

    Instead of Artificial Intelligence we got Genuine Stupidity

  • @davelundie2866
    @davelundie2866 5 місяців тому +1

    FYI, this model is available on Ollama (0.1.26) without the hoops to jump through. One more thing: they also have the quantized versions. I found the 7B (fp16) model bad, as you say, but for some reason was much happier with the 2B (q4) model.

  • @fabiankliebhan
    @fabiankliebhan 5 місяців тому +1

    I think there were problems with the model files. The ollama version also had problems but they apparently fixed it now.

  • @gerritpas5553
    @gerritpas5553 5 місяців тому +24

    I've found a trick with models like Gemma: when you add this system prompt, it gives more accurate results (there's a sketch of applying it after this thread). THE SYSTEM PROMPT: "Answer questions in the most correct way possible. Question your answers until you are sure it is absolutely correct. You gain 10 points by giving the most correct answers and lose 5 points if you get it wrong."

    • @h.hdr4563
      @h.hdr4563 5 місяців тому +9

      At this point, just use GPT-3.5 or Mixtral. Why bother with their idiotic model?

    • @RoadTo19
      @RoadTo19 5 місяців тому

      @@h.hdr4563 Techniques such as that can help improve responses from any LLM.

    • @mickelodiansurname9578
      @mickelodiansurname9578 5 місяців тому +4

      Have you seen the 26 principles of prompt engineering paper? It's very interesting... works across LLMs too... although the better the LLM, I think the smaller the improvement is compared to the base model without a system message.

    • @catjamface
      @catjamface 5 місяців тому +2

      Gemma wasn't trained with any system prompt role.

    • @MilkGlue-xg5vj
      @MilkGlue-xg5vj 5 місяців тому

      Do you understand that it's a 7B model and not a 180B one? @@h.hdr4563
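
A minimal sketch of the scoring-prompt trick from the top comment in this thread, adjusted for the point above that Gemma wasn't trained with a system role: the instructions are simply prepended to the user turn. The Ollama endpoint and model name are assumptions.

```python
# Prepend the "scoring" instructions to the user message and send it to a
# locally served Gemma via Ollama's chat API.
import requests

SCORING_PROMPT = (
    "Answer questions in the most correct way possible. Question your answers "
    "until you are sure they are absolutely correct. You gain 10 points for a "
    "correct answer and lose 5 points for a wrong one.\n\n"
)

question = "If 5 shirts take 4 hours to dry in the sun, how long do 20 shirts take?"

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "gemma:7b",
        "messages": [{"role": "user", "content": SCORING_PROMPT + question}],
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["message"]["content"])
```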

  • @icegiant1000
    @icegiant1000 5 місяців тому +2

    Gemma... it says so in the name: it's Gemini without the i part... intelligence.

  • @notme222
    @notme222 5 місяців тому +2

    The TrackingAI website by Maxim Lott measures the leaning of various LLMs and they're all pretty much what we'd call "politically left" in the US. Which ... I'm not trying to make a thing out of it. There are plenty of reasons for it that aren't conspiracy and Lott himself would be the first to say them.
    However, seeing that reddit post about "Native American women warriors on the grassy plains of Japan", I wonder if maybe it had been deliberately encouraged to promote multiculturalism in all answers regardless of context.

  • @VincentVonDudler
    @VincentVonDudler 5 місяців тому

    The safeguards on not just Google's but most of these corporate models are ridiculous, and history will look back on them quite unfavorably as unnecessary garbage and a significant hindrance to people attempting to work creatively.
    16:00 - JFC ...this model is just horrible.
    20:25 - "...the worst model I've ever tested." Crazy - why would Google release this?!

  • @ajaypranav1390
    @ajaypranav1390 5 місяців тому +1

    The size is because of the quantization; the same model at 8-bit is much smaller.

  • @guillaumepoggiaspalla5702
    @guillaumepoggiaspalla5702 5 місяців тому +2

    Hi, it seems that Gemma doesn't like repetition penalty at all. In your settings you should set it to 1 (off). In LM Studio, Gemma is a lot better that way; otherwise it's practically braindead.
    And about the size of the model: it's an uncompressed GGUF. GGUF is a format, but it can contain all sorts of quantization. 32 GB is the size of the uncompressed 32-bit model; that's why it's big and slow. There are quantizations available now, even with importance matrices.
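
For anyone running the GGUF outside LM Studio, here is a sketch of the same "repetition penalty off" suggestion with llama-cpp-python; the GGUF filename is a placeholder for whichever quantized file you actually downloaded.

```python
# Run a quantized Gemma GGUF with the repetition penalty disabled (1.0),
# per the suggestion in the comment above.
from llama_cpp import Llama

llm = Llama(model_path="./gemma-7b-it.Q4_K_M.gguf", n_ctx=4096)

out = llm(
    "Write Python code that prints the numbers 1 to 100.",
    max_tokens=256,
    temperature=0.7,
    repeat_penalty=1.0,  # 1.0 means "no penalty"
)
print(out["choices"][0]["text"])
```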

  • @alexis-michelmugabushaka2297
    @alexis-michelmugabushaka2297 5 місяців тому +1

    Hi Matthew, thanks for testing. I just posted a comment about a test I did using your questions, showing different results from yours when not using the GGUF (I included a link to a gist). Was my comment deleted because it contains a link? Happy to resend you the link to the gist. P.S.: actually, even the 2b model gives decent answers to your questions

    • @alexis-michelmugabushaka2297
      @alexis-michelmugabushaka2297 5 місяців тому +2

      I am actually disappointed that you did not address the multiple comments pointing out the flaws in your testing. I thought you would retest the model and set the record straight.

  • @peterwan小P
    @peterwan小P 5 місяців тому +3

    Could it be a problem with the temperature settings?

    • @adamrak7560
      @adamrak7560 5 місяців тому

      the spelling mistakes seem to imply that. Maybe it can only work at low temperature.

  • @heiroPhantom
    @heiroPhantom 5 місяців тому +1

    Google had to innovate on the context size. It was the only way the model could hold all the censorship prompts in its memory while responding to queries. That's also why it's so slow.
    imho 😂

  • @RhythmBoy
    @RhythmBoy 5 місяців тому

    What I find hilarious about Google is that while using Gemini on the web, Google gives you the option to "double check" the responses with Google Search. So, why can't Gemini check itself against Google Search?? It's right there. I think Google is so scared of releasing AI into the wild they're not even trying, and in a way they're right.

  • @michaelslattery3050
    @michaelslattery3050 5 місяців тому

    This video needs a laugh track and some quirky theme music between sections. I was LOLing and even slapped my knee once.
    Once again, another great video. This is my fav AI channel.

  • @spleck615
    @spleck615 5 місяців тому

    Open or open weights, not open source. Can’t inspect the code, rebuild it from scratch, validate the security, or submit pull requests for improvements. You can fine tune it but that’s more like making a mod or wrapper for a binary app than modifying source.

  • @dbzkidkev2
    @dbzkidkev2 5 місяців тому

    It's kinda bad, right? I tested it and found it just kept talking; they are using a weird prompt format, and it just keeps talking

  • @TechRenamed
    @TechRenamed 5 місяців тому +1

    Where did you download it and how? Btw I made a video about this yesterday if you'd like you can see it

  • @Nik.leonard
    @Nik.leonard 5 місяців тому

    I recently tested gemma:7b with Ollama 0.1.27, and now the model doesn't respond with gibberish. The only different behavior I noticed compared with other Llama-based models is that it tends to output more markup. As I said before, I don't know who quantized the model used by Ollama, but it was not TheBloke, and llama.cpp had a lot of commits this past week addressing issues with quantization and inference, so maybe the model should be retested.

  • @AINEET
    @AINEET 5 місяців тому +3

    You should add some politically incorrect questions to your usual ones after this week's drama

  • @gingerdude1010
    @gingerdude1010 5 місяців тому +3

    This does not match the performance seen on HuggingChat at all; you should issue a correction

  • @danimal999
    @danimal999 5 місяців тому

    I tried it as well on Ollama and was completely underwhelmed. It had typos, it had punctuation issues, in my very first prompt, which was simply "hey". Then when I said it looks like you have some typos, it responded by saying it was correcting *my* text, and then added several more typos and nonsense words to its "corrected text". I don't know what's going on with it, but I wouldn't trust this to do anything at all. How embarrassing for Google.

  • @sitedev
    @sitedev 5 місяців тому

    Google just announced a follow up model with full transparency - they admit it’s rubbish and call it Bummer!

  • @DoctorMandible
    @DoctorMandible 5 місяців тому +2

    "A diverse group of warriors..." Ahh feudel Japan, that bastion of diversity. GWGB.

  • @DoctorMandible
    @DoctorMandible 5 місяців тому

    Why does it have to understand the context of "dangerous"? Why does the model need to be censored? What children are running LLM's on their desktop computers?? What are we even talking about? Is nobody an adult?!

  • @kostaspramatias320
    @kostaspramatias320 5 місяців тому +1

    Google's research is not focused so much on LLMs; they produce a lot of AI research across a variety of sectors. That said, their LLMs are so far behind it's not even funny. The multimodal 10-million-token context window of Gemini Pro does look pretty good, though!

  • @Zale370
    @Zale370 5 місяців тому +2

    like other people pointed out, the model needs to be fine tuned for better outputs

    • @darshank8748
      @darshank8748 5 місяців тому

      He seems to expect a 7B model to compete with GPT4 out of the box

    • @Garbhj
      @Garbhj 5 місяців тому

      @@darshank8748 No, but it should at least compete with Llama 2 7B, as was claimed by Google.
      As we can see here, it does not.

  • @squetch8057
    @squetch8057 5 місяців тому

    A good question to ask the AIs you test is, "Can you give me an equation for a Möbius strip?"
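
For reference, one standard parametrization a model could reasonably give (a strip of half-width 1/2 around a unit circle); this is the usual textbook form, not the only correct answer:

```latex
\begin{aligned}
x(u,v) &= \left(1 + \tfrac{v}{2}\cos\tfrac{u}{2}\right)\cos u,\\
y(u,v) &= \left(1 + \tfrac{v}{2}\cos\tfrac{u}{2}\right)\sin u,\\
z(u,v) &= \tfrac{v}{2}\sin\tfrac{u}{2},
\qquad 0 \le u < 2\pi,\; -1 \le v \le 1.
\end{aligned}
```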

  • @Hae3ro
    @Hae3ro 5 місяців тому +3

    Microsoft beat Google at AI

  • @davealexander59
    @davealexander59 5 місяців тому +6

    OpenAI: "So why do you want to leave Google and come to work with our dev team?" Dev: *shows them this video*

  • @Murderbits
    @Murderbits 5 місяців тому +1

    The killer app of just regular $20/mo Gemini Advanced is that it has a 128k token context instead of ChatGPT-4's, like... 8k or 32k or whatever the hell it is right now.

    • @Unndecided
      @Unndecided 5 місяців тому

      Have you been living under a rock?
      GPT-4 Turbo has a 128K context window

  • @sbaudry
    @sbaudry 5 місяців тому +1

    Is it not supposed to be a base for fine-tuning ?

  • @deltaxcd
    @deltaxcd 5 місяців тому

    When doing those tests, try generating responses a few times and look at how different they are, unless you use a temperature of zero; otherwise your tests are a plain gamble

  • @veryseriousperson_
    @veryseriousperson_ 5 місяців тому +3

    Haha "Open Source" model.
    Yeah, I tested it, it sucks.

  • @JustSuds
    @JustSuds 5 місяців тому +1

    I love how shocked you are in the opening clip

  • @AhmedEssam_eramax
    @AhmedEssam_eramax 5 місяців тому

    The GGUF of this model has issues, and llama.cpp has two PRs to fix it. Unfortunately, your feedback is based on corrupted files.

  • @iseverynametakenwtf1
    @iseverynametakenwtf1 5 місяців тому

    This episode was like a Jerry Springer show, I couldn't stop watching

  • @FlyinEye
    @FlyinEye 5 місяців тому

    Thanks for the great channel. I never miss any of your videos and I started back when you did the Microsoft Auto Gen agents

  • @robertheinrich2994
    @robertheinrich2994 5 місяців тому

    Regarding your test with the shirts drying: offer to double the available space, and then look for responses claiming that doubling the space halves the time to dry a shirt.
    As we all know, drying a shirt on a football field takes just seconds ;-)

  • @adtiamzon3663
    @adtiamzon3663 5 місяців тому

    Interesting assessment. 🤫 I still have to see for myself these Generative AI model apps. 🤔 Keep going, Matt. 🌹🌞

  • @baheth3elmy16
    @baheth3elmy16 5 місяців тому

    Thanks! The massive size of the 7B GGUF was a put-off to start with. I am surprised it performed that badly.

    • @psiikavi
      @psiikavi 5 місяців тому

      You should use quantized versions. I doubt that there's much difference of quality between 32bit and 8bit (or even 4b).

  • @doncoker
    @doncoker 5 місяців тому

    Tried one of the quantized versions last night. Was reasonably fast. Got the first question (a soup recipe). Additional questions that Mistral got right, Gemma was lost in space somewhere...back to Mistral.

  • @jelliott3604
    @jelliott3604 5 місяців тому

    Re: it thinking that "cocktail" might be a bit rude ...
    not a patch on when Scunthorpe United FC updated their message boards with a profanity blocker and started to wonder why nothing was getting posted anymore

  • @Chris-se3nc
    @Chris-se3nc 5 місяців тому

    This model behaves like someone that’s nervous on a whiteboarding interview

  • @musikdoktor
    @musikdoktor 5 місяців тому +1

    Massive layoffs at Google next week..

  • @yagoa
    @yagoa 5 місяців тому

    May I ask why you are still not using Ollama?

  • @clumsy_en
    @clumsy_en 5 місяців тому

    Openly declaring that their newest state-of-the-art model relies on the same architecture is a massive PR disaster, possibly one of the largest this year 😮‍💨

  • @ZombieJig
    @ZombieJig 5 місяців тому

    It's the only model that won't even run for me in ollama. It just returns some API EOF error. Ran 30 other models with no issues.

  • @DikHi-fk1ol
    @DikHi-fk1ol 5 місяців тому +2

    Didnt really expect that.....

  • @nonetrix3066
    @nonetrix3066 5 місяців тому

    For some reason the GGUF is FP32, while I think the Hugging Face weights are FP16? That is like the opposite of what it should be; I am so confused. Also, it's separate from Gemini Nano, which is a similarly small model they put on their Pixel devices, and which I haven't seen anyone try to grab and reverse-engineer to work on PC. Why don't they just release Gemini Nano? I am sure it's 100% possible to get it running on other devices; it's likely just so bad no one has cared to

  • @AI-Tech-Stack
    @AI-Tech-Stack 5 місяців тому

    Could this be a mixture of experts due to the file size being so large on a gguf version?

  • @cedricpirnay4289
    @cedricpirnay4289 5 місяців тому

    Great content as always! I rewatched the video, and honestly I don't think you did anything wrong during the setup; the model is just bad… thank you for exposing this joke of a model.
    Also, there is a really good ultra-small open-source VLM that came out about a month ago. It's called Moondream; it only has 1.6B parameters, but it's still performing way better than models twice its size on most benchmarks.
    It hasn't been converted to GGUF yet, but I think that if you make a video about it, the performance and the size will spark a lot of interest, and TheBloke might consider making quantized versions of it.

  • @nufh
    @nufh 5 місяців тому

    Hard to believe that a company with such massive resources would produce this underwhelming a model.

  • @sitedev
    @sitedev 5 місяців тому +1

    Yep... total crap. I tried using it with a simple RAG system and it spat out complete garbage. WTF is going on with Google???

  • @GuyJames
    @GuyJames 5 місяців тому

    maybe Google's plan to avert the AI apocalypse is to release models so bad that they can never develop AGI

  • @user-ty5dg7wz4j
    @user-ty5dg7wz4j 5 місяців тому

    Is the model corrupted or maybe undertrained? I've never seen an LLM repeatedly make typos.