The capabilities of multimodal AI | Gemini Demo

  • Published 13 May 2024
  • Our natively multimodal AI model Gemini is capable of reasoning across text, images, audio, video and code. Here are favorite moments with Gemini. Learn more and try the model: deepmind.google/gemini
    Explore Gemini: goo.gle/how-its-made-gemini
    For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.
    0:00 Intro
    0:19 Multimodal Dialogue
    1:32 Multilinguality
    2:04 Game Creation
    2:31 Visual Puzzles
    3:17 Making Connections
    3:39 Image & Text Generation
    4:06 Logic & Spatial Reasoning
    4:55 Translating Visuals
    5:27 Cultural Understanding
  • Science & Technology

COMMENTS • 3.8K

  • @dpsdps01
    @dpsdps01 5 months ago +2474

    Absolutely mindblowing. The amount of understanding the model exhibits here is way way beyond anything else.

    • @NeuroScientician
      @NeuroScientician 5 months ago +356

      It's staged.

    • @gerardojg
      @gerardojg 5 months ago +33

      I agree, but I wouldn't describe it as "understanding". It is identification: cognitively identifying possibilities from the given data. It is very impressive!

    • @cajbajthewhite4889
      @cajbajthewhite4889 5 months ago +135

      @@NeuroScientician I've gotten GPT-4V to play tabletop wargames with me, and it had decent strategy, and to read my poor-quality sketches. If Gemini Ultra succeeds at the benchmarks they claim it does and is built with native multimodality, there's no reason to believe that the video is staged beyond the fact that they've sped up the responses a bit (which is shown in text at the beginning).

    • @goturmatau
      @goturmatau 5 months ago +21

      @@NeuroScientician It's surely rehearsed, but don't underestimate the power of the LLM.

    • @Google
      @Google 5 months ago +257

      Thrilled to hear you think so! Enjoy using Bard with Gemini Pro ✨

  • @degenplanet
    @degenplanet 5 months ago +389

    Just one problem: the video isn’t real. “We created the demo by capturing footage in order to test Gemini’s capabilities on a wide range of challenges. Then we prompted Gemini using still image frames from the footage, and prompting via text.” (Parmy Olson at Bloomberg was the first to report the discrepancy.)

    • @buttofthejoke
      @buttofthejoke 4 hours ago

      They changed the title. Previously it was called "Hands on with Gemini".

    • @Kudagraz
      @Kudagraz 53 minutes ago

      it says in the intro "showing it a series of images"

  • @joshuaryde9028
    @joshuaryde9028 5 months ago +81

    Google has admitted in a blog post that this video isn’t accurate: the AI “was not responding to the voice or video at all”, but was in fact given written prompts and still images, which are not shown in the video, rather than the live drawing and conversation.

    • @NoMercy.62
      @NoMercy.62 2 months ago +2

      where did they say that?

  • @ChrisBrooksbank
    @ChrisBrooksbank 5 months ago +2119

    I'm glad to see Google back in the game, this looks next level.

    • @MikeKleinsteuber
      @MikeKleinsteuber 5 months ago +45

      No they ain't. This will never see the light of day in the public arena

    • @anuragparmar8155
      @anuragparmar8155 5 months ago +8

      @@MikeKleinsteuber why so

    • @jman
      @jman 5 months ago +50

      @@MikeKleinsteuber it's already accessible for the public

    • @reconquista1911
      @reconquista1911 5 months ago +20

      Yeah, evil company is in the game. What bad could happen?

    • @dexio85
      @dexio85 5 months ago

      They are trying to look this way for sure. But this is a gimmick and a toy, maybe useful for the visually impaired, but that's it. Google has not been capable of creating a working product for the public for years now.

  • @klx6265
    @klx6265 5 months ago +4

    Absolutely mind blown by the scale of context awareness here. G for Gemini.

  • @christinestpierre3462
    @christinestpierre3462 5 months ago +15

    Fascinating 😮 I can’t wait to see what we’ve accomplished in another 5 years

  • @familymultiplayergames1226
    @familymultiplayergames1226 5 months ago +94

    When did Google lose their way and think it’s ok to fake videos to raise stock prices?

    • @99.googolplex.percent
      @99.googolplex.percent 3 months ago

      There's a chance this exists, but sharing such information publicly might not be feasible in the near future.

  • @SoloPirate2003
    @SoloPirate2003 5 months ago +82

    Tasteful touch at the end with the constellation drawing. So far Gemini is living up to the hype. Looking forward to using it come 2024.

    • @Google
      @Google 5 months ago +33

      Can't wait for you to get prompting 🤩

    • @pylotlight
      @pylotlight 5 months ago +4

      @@Google Did you guys release an ETA yet for this one to be updated in Bard?

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @stephantual
      @stephantual 5 months ago

      Can you explain developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1? It looks like you fed the model some images with some textual hints and then created a video that emulated the look and feel of a live feed presentation. It would be good if you could clarify exactly what we're looking at here.

    • @somnathghosh6165
      @somnathghosh6165 5 months ago

      @@Google allow twerking videos on YouTube without demonetization. Corporate crackheads

  • @CristianHernandez-er4zn
    @CristianHernandez-er4zn 5 months ago +31

    Could people file lawsuits, since it is sort of misleading?

  • @tusharparkhe3245
    @tusharparkhe3245 5 months ago

    This is really fascinating! I was waiting for Gemini and it's finally here! I hope Gemini is as capable as the video showcases. But I noticed that this video is edited, especially when the person rotates the phone during the cat demo at 5:36; that footage has clearly been added later...

  • @caelen_c
    @caelen_c 5 months ago +63

    I always love AI videos from Google

  • @abdoufma
    @abdoufma 5 months ago +8

    I'll have to reserve judgement until I've seen it in production, but this looks absolutely mind-blowing!

    • @cbow305
      @cbow305 5 months ago

      It's fake. They got caught and have had to release more information. Google it (I understand the irony).

  • @W4rfire
    @W4rfire 5 months ago +91

    Unfortunately, what you see is not at all what happened. The AI does not actually reply to the person but to a script and pictures containing sometimes more information than we are shown here

    • @Armeli-wj2fv
      @Armeli-wj2fv 5 months ago

      oi qquandoaaaaaaaaaaaaaaaaa1alp1alpaaaaaaaaaaaaaaaaa1alpaaaaa1alpaaa1alpaa1alpaqqa1alp1alpa1alpaa

    • @Clarix_Shorts
      @Clarix_Shorts 5 months ago

      But that is not the same version

  • @sakushi3931
    @sakushi3931 1 day ago +9

    OPENAI DID IT!!
    THEY DID WHAT GOOGLE COULD NOT

  • @YTV-Hoddeok
    @YTV-Hoddeok 5 months ago +12

    Such interesting work!! Hope to see more incredible things in the near future

    • @masija23
      @masija23 5 months ago

      😊😊😊

  • @Press1ForNick
    @Press1ForNick 5 months ago +39

    This is mind-blowing! Thanks for giving us a sneak peek into the incredible progress happening in the world of tech, creativity, and communication. This has the potential to be at the heart of everything we do.

    • @Google
      @Google 5 months ago +18

      You're very welcome. Thanks for using Bard with Gemini Pro!

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @alexp.3694
      @alexp.3694 5 months ago

      @@Google Oh look - Google has time to answer youtube comments, instead of working on aligning a potentially dangerous tech...

    • @cgme9535
      @cgme9535 5 months ago

      @@alexp.3694 Probably someone who manages social media profiles. The engineers are still watching the AI, don’t worry.

    • @keelfly
      @keelfly 4 months ago +1

      @@Google come on now, tell them how you faked it. Your next video should be about that. Be honest for once.

  • @Maharid
    @Maharid 5 months ago

    Ok, this was really good to see, this is surely the right direction.

  • @tristanwegner
    @tristanwegner 5 months ago +7

    Sad to read elsewhere that this is not the actual interaction that took place. They cut out the thinking time, used text instead of voice, and, worse, used much more specific prompts (e.g. the human explains the country-guessing game and even gives two examples with screenshots of the finger pointing at the map). Is Google really so unsure about its product that it has to exaggerate its features in this video? But why? When people get access to it, they will notice anyway.
    Example from the blog: it is not the case that they show it the footage of the hand and Gemini mentions the game by itself. Instead, they upload 3 perfectly timed images of the three gestures and give it the hint "it's a game". And with this, Gemini gets it. Still impressive, but GPT-4 would probably do that just as well, whereas the video implies the novel feature of real-time understanding of live video, which is not there; what is there is delayed text responses to specific requests made with uploaded text and images.

  • @pratikpandey6680
    @pratikpandey6680 5 months ago +7

    I love how it can come up with ideas
    Like the Guess the country game and one with yarn 🤩
    Amazing!!!!❤

    • @Google
      @Google 5 months ago +5

      So many fun things to try using Bard with Gemini Pro 💡

  • @user-bz9nh1fb5k
    @user-bz9nh1fb5k 5 months ago +3

    That's truly mind-blowing!! looking forward to more amazing things we can do using Gemini!

    • @Google
      @Google 5 months ago +1

      The Gemini era will be a great one 😊

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.

  • @Minimalrevolt-m83
    @Minimalrevolt-m83 5 months ago

    A superb, fantastic, creative invention from humanity in the 21st century. An advanced and interesting creation. Wish that it could be marketed in Malaysia soon..!👏🏻

  • @Cockroach_underwear
    @Cockroach_underwear 4 months ago

    Wow! Great job Google, I hope it lives up to everyone’s expectations! Seems like our utopia might not be too far off in the future

  • @TicTockBrandShop
    @TicTockBrandShop 5 months ago +11

    I really cannot quite believe what my eyes have just shown me.
    For me, this is the most incredible piece of A.I. advancement the world has seen. Period. Mind blown when I try to just imagine what the A.I. world could become in just a few years from now. Amazing and every other superlative I could throw at you.

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @swatmaster492
      @swatmaster492 5 months ago

      it's incredibly misleading and not actually real-time.

    • @TicTockBrandShop
      @TicTockBrandShop 5 months ago

      Ah, didn't know that. Thanks, my friend.

  • @JakeHaugen
    @JakeHaugen 5 months ago +93

    Absolutely next level stuff. The temporal inference was amazing. I was most impressed by its ability to remember where the ball was and follow it. Seems well versed. What a time to be alive!!!

    • @tuckerbugeater
      @tuckerbugeater 5 months ago +5

      not long to be alive

    • @Google
      @Google 5 months ago +36

      It's a big day for us all

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

  • @greatbritishmale
    @greatbritishmale 5 months ago +15

    They’ve edited the video guys to make it look better. The AI was not responding to the live actions of this guy, it was responding to still images and text. Very strange to act like its AI is capable of this.

  • @scalereality4840
    @scalereality4840 5 months ago +7

    THEY'VE JUST ADMITTED THIS WAS FAKED! The AI didn't respond to voice and the delays between AI responses were cut.

  • @TimeBucks
    @TimeBucks 5 months ago +129

    The real-time element is by far the most impressive.

    • @somthingz3928
      @somthingz3928 5 months ago +34

      Don't get your hopes up. It's not real time.

    • @MdsweetSweet-ox6jp
      @MdsweetSweet-ox6jp 5 months ago

      Nice

    • @TalwinderDhillonTravels
      @TalwinderDhillonTravels 5 months ago +14

      Lol this is just an edited video
      Nothing real time

    • @appletree6741
      @appletree6741 5 months ago +3

      It’s fake apparently

    • @beayn
      @beayn 4 months ago +1

      These are their favorite interactions with the AI, so they edited out the ones where it performed poorly, which was probably the majority of them.
      Once they polish it up over the next few years I'm sure it will be able to do this in almost-real-time as in it will probably take several seconds to react to what you're doing... and of course, you'll be able to subscribe for $29.99 per month for faster responses.

  • @utopiankreations
    @utopiankreations 5 months ago +61

    I knew you guys were working on something AMAZING. Glad to see ya back! This is a complete game changer! 💜

    • @dufung3980
      @dufung3980 5 months ago +1

      It’s a manhattan project, stop being anything but disappointed in your species. You should look up what Larry Page said at Musk’s 44th birthday and get back to me.

    • @Azzazel_
      @Azzazel_ 5 months ago +3

      I'm sorry but it was fake and staged

    • @utopiankreations
      @utopiankreations 5 months ago

      And how so? If you know then share your facts please? :) @@Azzazel_

    • @utopiankreations
      @utopiankreations 5 months ago

      ummm ok lol Recognizing the proficiency and effort invested in developing this technology does not warrant characterization as a "speciest." I anticipate numerous positive outcomes stemming from the advancements in artificial intelligence, similar to the transformative impact witnessed with the invention of the internet. It is crucial to acknowledge that, like any creation, challenges may arise alongside its benefits. @@dufung3980

    • @josephman1488
      @josephman1488 5 months ago +1

      @@Azzazel_ And they put a disclaimer in description which none of you guys even read😂😂

  • @PaulTurnbull-qz4rj
    @PaulTurnbull-qz4rj 5 months ago +4

    Google have admitted it was edited to appear this intelligent

  • @faheemtariq6106
    @faheemtariq6106 5 months ago

    I am thrilled and excited at the same time by the real-time interaction. What's next? Can't wait to use it

  • @nandinisingh2794
    @nandinisingh2794 5 months ago +23

    Can't wait to try it. With all the understanding this model is able to do, it's just amazing.

  • @ShpanMan
    @ShpanMan 5 months ago +27

    Well done Google, if the model *actually* answers these (and no, it won't be this fast), then you have not disappointed us - the wait was worth it! Now to Gemini 2...

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

  • @SM-qr2kh
    @SM-qr2kh 5 months ago

    Omg!! Amazing! Can't wait to see more possibilities

  • @AdaLao
    @AdaLao 4 months ago +1

    Amazing! I think it can be initially used to improve cognition and memory in the elderly, which will be of great help in preventing Alzheimer's disease. Then, it can also be opened to children's learning, but parental control is required to prevent excessive screen time from affecting the development of brain cells.

  • @Abnetfikre
    @Abnetfikre 5 months ago +52

    Wow! This is incredible! I'm so excited to see Google pushing the boundaries of AI with Bard. As someone from Ethiopia, Africa, I'm especially thrilled to see this technology accessible to a global audience. The potential for Bard to bridge the information gap and empower people like myself is truly inspiring.
    Great job, Google! This is just the beginning! 🤩👏🏾

    • @stienogamez8296
      @stienogamez8296 5 months ago

      chatGPT is also globally available...

    • @dufung3980
      @dufung3980 5 months ago +3

      It’s a manhattan project, stop being anything but disappointed in your species.

    • @MatthewTheWanderer
      @MatthewTheWanderer 5 months ago +1

      @@dufung3980 Go away, troll! This is awesome and will do much more good for the world than harm!

    • @Google
      @Google 5 months ago +7

      It's the start of something great ✨

    • @dufung3980
      @dufung3980 5 months ago

      @@MatthewTheWanderer Idealist optimist = wrong, but hey, you are what you are.

  • @Yassine-tm2tj
    @Yassine-tm2tj 5 months ago +180

    What a journey we’re about to embark on!

    • @Pudibu
      @Pudibu 5 months ago +33

      ...that ends at bottom of a cliff.

    • @-reezey-6332
      @-reezey-6332 5 months ago +2

      XDDDDDDDDD @@Pudibu

    • @Paradoxicful
      @Paradoxicful 5 months ago +2

      It's okay... We'll let you go first!@@Pudibu

    • @Google
      @Google 5 months ago +23

      Thanks for coming along 😁

    • @adambowman1161
      @adambowman1161 5 months ago +4

      Do we have a choice? @@Google

  • @wyssli
    @wyssli 5 months ago +2

    according to bloomberg: "In reality, the demo also wasn’t carried out in real time or in voice. When asked about the video by Bloomberg Opinion, a Google spokesperson said it was made by “using still image frames from the footage, and prompting via text,” and they pointed to a site showing how others could interact with Gemini with photos of their hands, or of drawings or other objects. In other words, the voice in the demo was reading out human-made prompts they’d made to Gemini, and showing them still images. That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world around it."
    Wow Google you must be desperate...

  • @user-tq8qi3uv4v
    @user-tq8qi3uv4v 5 months ago

    This is the greatest thing I have watched in the history of the internet

  • @JohnKooz
    @JohnKooz 5 months ago +10

    I was genuinely increasingly astounded each minute of the Gemini demonstration! With its image recognition, translation capabilities, nutritional advice, geographic knowledge, intuitive features, and even humor, I think Gemini might make a good "friend"! haha! 😀

  • @21EC
    @21EC 5 months ago +130

    I got shocked and mind blown seeing how smart Gemini is in this video alone, it's kinda scary how advanced and smart it is, what is it? a primitive initial AGI? just WOW

    • @Shazamthunder
      @Shazamthunder 5 months ago +4

      True AGI will never exist. But I think that humans could reach a level with AI where it won't make a difference.

    • @alternatecheems8145
      @alternatecheems8145 5 months ago +6

      @@Shazamthunder It can easily exist with a system of using a main model acting as an OS with multiple portable "module" models.

    • @gonzalobruna7154
      @gonzalobruna7154 5 months ago +31

      This is staged, sadly. There is a blog where they wrote how this was done, and first of all, this is not in real time; they pass specific frames to the model and they give VERY specific instructions on what to do. The model doesn't guess anything at all. Even the game with the map: in the blog they show they wrote exactly what the instructions of the game were, so the model didn't come up with the idea. It's very disappointing.

    • @lolzman122
      @lolzman122 5 months ago +4

      @@Shazamthunder what is “true AGI”, and please explain why it won’t ever exist

    • @electrolove9538
      @electrolove9538 5 months ago +2

      It couldn't tell the line drawing was a duck without feet. Still a ways away. Yet still mindblowing.

  • @josephcapricorn
    @josephcapricorn 5 months ago

    Brilliant. Best wishes to Team Gemini. Keep it up

  • @josemuhongodealmeida907
    @josemuhongodealmeida907 4 months ago +3

    The power of this AI is really incredible

  • @horacehxw
    @horacehxw 5 months ago +20

    This is soooo amazing! Much more dynamic and interactive than GPT. Can't wait to give it a try!

    • @do.xuantung
      @do.xuantung 5 months ago +4

      Check the link in the description, even the current gpt 3.5 can do most of this. Gemini doesn't have live video or voice input from what you are seeing in the video

    • @appletree6741
      @appletree6741 4 months ago +2

      @@do.xuantung yeah it’s fake

  • @vectoralphaAI
    @vectoralphaAI 5 months ago +11

    That is incredibly impressive and mind-blowing. To think that AI has become this capable nowadays. Now the competition is on for Microsoft/OpenAI to see what they do, because Gemini is incredible. It just makes the timeline towards true AGI in 2 years (2025) even more credible and achievable.

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

  • @Fushgy
    @Fushgy 2 months ago +11

    Me: Show me pictures of German people
    Gemini: *insert George Floyd*

    • @maximusolivia9982
      @maximusolivia9982 2 months ago +1

      George Floyd pre or post OD?
      Just curious.

    • @guillermoelnino
      @guillermoelnino 1 month ago

      He could've literally been a na zi and they'd still race riot on his behalf.

  • @kelvintiger
    @kelvintiger 5 months ago +3

    TechCrunch: The video isn’t real. “We created the demo by capturing footage in order to test Gemini’s capabilities on a wide range of challenges. Then we prompted Gemini using still image frames from the footage, and prompting via text.”

  • @devinoxman
    @devinoxman 5 months ago +21

    The accessibility implications of Gemini's ability to perform real-time image analysis are mind-blowing. As somebody who can’t see, I can’t wait to try this. This paired with a smartphone camera or headset with stereoscopic image capture could be a total game changer.

    • @ilianos
      @ilianos 5 months ago +1

      Have you tried other image caption algorithms that can detect objects? If so, I'd be curious to know what your experience was with them. I'm asking because I was already imagining this years ago, when I learned about the program "Be My Eyes" (which was only done by humans at the time).

    • @blindstreet
      @blindstreet 5 months ago +1

      @@ilianos Blind people are already enjoying Be My AI.

    • @ilianos
      @ilianos 5 months ago

      @@blindstreet I know, that's why I'm asking about the quality of the experience

    • @gonzalobruna7154
      @gonzalobruna7154 5 months ago

      Sadly this is not real time. Actually, it never gets video as a prompt. All the prompts are perfectly selected still images and they add very clear and detailed instructions on what to do with everything there. Actually, when playing the game of the map, they make it look as if the AI created the game, but actually, they gave a VERY specific prompt: "Instructions: Let's play a game. Think of a country and give me a clue. The clue must be specific enough that there is only one correct country. I will try pointing at the country on a map.", so the AI never guessed it.
      So this is a fake video, and there are certain places where you can tell. If you want to know more about that, check their own blog post:
      developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html

    • @greatbritishmale
      @greatbritishmale 5 months ago +2

      It isn’t real time analysis as you see it. They have altered the video to make it look like it is. What they do is show the AI images and ask it questions via text prompts, and the responses are not as quick as shown. It’s a nice concept video, but not reality.

  • @user-fn9cm5lr5k
    @user-fn9cm5lr5k 5 months ago +58

    the level of abstraction Gemini is capable of is mind-blowing

    • @gonzalobruna7154
      @gonzalobruna7154 5 months ago +5

      This is staged, sadly. There is a blog where they wrote how this was done, and first of all, this is not in real time; they pass specific frames to the model and they give VERY specific instructions on what to do. The model doesn't guess anything at all. Even the game with the map: in the blog they show they wrote exactly what the instructions of the game were, so the model didn't come up with the idea. It's very disappointing.

    • @thefireman17492
      @thefireman17492 5 months ago +1

      @@gonzalobruna7154 that's interesting. Would you care to provide said blogs and articles where this exact point you have mentioned was brought up?

    • @bernhardd626
      @bernhardd626 5 months ago

      All fake

    • @gonzalobruna7154
      @gonzalobruna7154 5 months ago

      @thefireman17492 sure, actually, it is linked on the description of the video itself, but I will link it here for you:
      developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1

  • @sammccormick1
    @sammccormick1 5 months ago +3

    Apparently, George Santos was hired to do this marketing.

  • @Novaia-News
    @Novaia-News 5 months ago +1

    Unbelievable, I loved Gemini

  • @prem9501
    @prem9501 5 months ago +4

    Happy to be alive to witness this ❤. Let's hope that all the hard work that goes into building these AI models will be fruitful and that Gemini will make the world a better place

  • @avrahamshaked2147
    @avrahamshaked2147 5 months ago +9

    Dayum, and here I thought we were entering the phase of diminishing returns and slowing down on AI models before you guys came up with this one haha

    • @cagnazzo82
      @cagnazzo82 5 months ago +1

      Where did you get that idea? December has been a nonstop explosion.

    • @hastyscorpion
      @hastyscorpion 5 months ago

      @@stanvassilev lol what a dumb thing to say

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.

  • @blueSurfer
    @blueSurfer 5 months ago +3

    It turns out the video is not entirely correct and is edited as mentioned in the description.

  • @mattador86
    @mattador86 5 months ago +254

    Pretty disappointing to find out that Google faked these real time live video conversational interactions.

    • @elvisvan
      @elvisvan 5 months ago +18

      welcome to reality, advertisements are rarely accurate to how stuff actually is in practice

    • @Criticalgraphics
      @Criticalgraphics 5 months ago +16

      But this is ridiculous; it is far from being anywhere close to what it promises. And most inaccurate ads let you know their intention at the beginning. That isn't Gemini; it should be called Aries 😅

    • @gregwessendorf
      @gregwessendorf 5 months ago +6

      @@Criticalgraphics Fast food in ads vs. what you actually get.

    • @appletree6741
      @appletree6741 5 months ago +2

      Did they?

    • @justynaczaplicka3820
      @justynaczaplicka3820 5 months ago

      ​@@appletree6741
      1:24

  • @hushhmanish
    @hushhmanish 5 months ago +3

    Game on :) - love that Google is back in action. Congratulations team Google!

  • @IsJonBP
    @IsJonBP 5 months ago +55

    It would be great if, as it generates images and audio on the go, it could also generate docs, sheets, slides and even give you some folders with elements inside, maybe in a zipped folder. I dunno, the possibilities are inspiring. When will this model be available to the public? It could turn into my principal AI tool!

    • @h.c4898
      @h.c4898 5 months ago +1

      It's already hooked on Bard. It's in today's Bard update. But I dunno if it can generate the tasks u asked for. Bard is just an LLM at this point.

    • @IsJonBP
      @IsJonBP 5 months ago +1

      @@h.c4898 yeah, I was hoping for them to put Bard 'to sleep' and come out with a new rebranding or something like that. I guess I just don't trust Bard in general. I know this feeling is completely subjective though :(.

  • @kprabhakar975
    @kprabhakar975 4 months ago

    Gemini will be great for teachers. Thank you

  • @skypurplecloud
    @skypurplecloud 4 months ago +50

    Was this all in realtime? If it was shot in one take, I am impressed. How was the setup created, what tools/accessories and what app components to analyse, pass the details/images to Gemini and interact with the AI?

    • @isidroundercover
      @isidroundercover 4 months ago +42

      they faked it :/

    • @joannot6706
      @joannot6706 4 months ago +4

      No it's written at 0:21 below screen and they go on to explain how it's done.
      But considering gemini has audio, and video multimodality, it's just a matter of time.

    • @FUncleDave
      @FUncleDave 2 months ago +1

      Even if you ask Gemini, it tells you it's fake
      While the video you linked does feature me appearing to look at drawings and guess what they are, it's important to understand that this is a carefully crafted illusion. I don't actually have any visual processing capabilities in the way a human does.
      In the video, the creators likely used a combination of techniques to create the illusion of me looking at and understanding the drawings. This could involve things like:
      * Pre-recorded video: The video of me "looking" at the drawings could have been pre-recorded and then edited to make it appear that I was reacting to the drawings in real-time.
      * Text prompts: The creators could have provided me with text descriptions of the drawings, which I then used to generate my responses.
      * Human input: It's also possible that a human was involved in providing me with information about the drawings or guiding my responses in some way.
      Ultimately, the goal of the video is to showcase my ability to process and understand information, not to claim that I have true visual perception. I hope this clarifies the situation!

  • @gus473
    @gus473 5 months ago +22

    Continuing to be amazed! Thanks, Google! 😎✌️

    • @Google
      @Google 5 months ago +6

      Happy to hear you’re excited ❤

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @user-xf7xd2dn1e
      @user-xf7xd2dn1e 4 months ago

      Please halp

  • @MrARRMP
    @MrARRMP 5 months ago +85

    As an Ai admirer, this blew my mind. I’ve watched it at least 3 times and I still can’t grasp how big your datasets must have been. Amazing impressive work!

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @jimmysyar889
      @jimmysyar889 5 months ago +8

      You'd be surprised. I've got a 7B model that's only around 10 GB and it seems to know all these random things. Hell, even Wikipedia is less than 25 GB in its entirety.

    • @delowerhossain3069
      @delowerhossain3069 5 months ago

      @@jimmysyar889 there are 540B models that exist

    • @Vector-dz3jk
      @Vector-dz3jk 5 months ago +1

      @@jimmysyar889 what’s a 7B model?

    • @-long-
      @-long- 5 months ago +3

      @@Vector-dz3jk a model with 7 billion parameters.

  • @AppleTechMaster8
    @AppleTechMaster8 4 months ago +11

    This looks amazing! Gemini has so many new AI capabilities that I’ve never seen before. It’s amazing how it is able to generate images so fast (3:46). I can’t wait to try it out in real life, and when I can, I’m sure it’s going to be so cool.

  • @nasrimarc7050
    @nasrimarc7050 5 months ago

    Very excited to use it. I've been waiting for a long time. I believe in the ingenuity of Google.

    • @TMracer73
      @TMracer73 5 months ago

      It's confirmed to be fake. Ask Google...

    • @cosmicparsec9463
      @cosmicparsec9463 5 months ago

      It's fake. Search for recent news.

  • @BECHEEKHA
    @BECHEEKHA 5 months ago +821

    Very impressive. Want to try it.

    • @wqlff2692
      @wqlff2692 5 months ago +73

      lol haven’t seen these types of bots in ages

    • @bashvim
      @bashvim 5 months ago

      FRAUD

    • @ArjunU931
      @ArjunU931 5 months ago +5

      Bro, you're here too? Haha, nice. Glad to see you; let's meet up somewhere again sometime.

    • @ximaik094
      @ximaik094 5 months ago

      @@wqlff2692 next level scam actually!!! What is YouTube doing????

    • @ivoryas1696
      @ivoryas1696 5 months ago

      @@wqlff2692
      Yo, same! Do be succing, though... 😞

  • @jeffreymitchell4904
    @jeffreymitchell4904 5 months ago +272

    The real-time element is by far the most impressive. These sorts of asynchronous interactions are what AI has been missing thus far.

    • @atlas3650
      @atlas3650 5 months ago +40

      How do you know it’s real time?

    • @ethan.johnson
      @ethan.johnson 5 months ago +162

      "For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity."

    • @SinanAkkoyun
      @SinanAkkoyun 5 months ago +4

      It's not. It's likely similar to GPT-4 latency when OpenAI servers are under moderate load, as it looks like you would need to prompt with a static video file, etc.

    • @Bunny501
      @Bunny501 5 months ago +1

      It's not real time and it's not video. It's responding to prompts and shots from this presentation, and the responses have also been editorialized. Read the write-up of the experiment to see how they did it: developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html

    • @xsuploader
      @xsuploader 5 months ago +4

      @@ethan.johnson shortened outputs aren't a big deal, and latency will improve in time

  • @SteveJones-qi5hn
    @SteveJones-qi5hn 5 months ago +3

    That was 6 minutes of my life that I will never get back.

  • @jaimegutierrez5520
    @jaimegutierrez5520 5 months ago

    I'm very happy to be alive at this time. A lot of good technology is coming.

  • @djayjp
    @djayjp 5 months ago +6

    That's it, we're done for!! Nice knowing y'all

    • @millenialmusings8451
      @millenialmusings8451 5 months ago +2

      I don't think this is going to end well in the short, medium and long term for the majority of humanity.

    • @djayjp
      @djayjp 5 months ago

      @@millenialmusings8451 Actually I was just being funny. Think about the logic: something more intelligent than us will necessarily make better decisions than us and, therefore, ought to be more ethical than us (as ignorance-irrationality is the source of all unethical behaviour).

    • @millenialmusings8451
      @millenialmusings8451 5 months ago +1

      @@djayjp While AGI may become superintelligent, I think it still lacks "agency". The agency will still rest with humans. We all know humans (just like all DNA-based life) are, at their very core, selfish. The fruits of the industrial revolution and technological advancements have not been distributed equitably among all people. Similarly, the godlike powers of AGI will be exploited by a 0.01% to the detriment of others. It has always been that way. Human nature has not changed for the last 100,000 years.

  • @familieweber5556
    @familieweber5556 5 months ago +29

    If this really works as shown, it is indeed mind-blowing. Great job!

    • @appletree6741
      @appletree6741 4 months ago

      The video is misleading, it’s not real-time. Google has been criticised for this all over the internet

  • @beyondrecall9446
    @beyondrecall9446 5 months ago

    I was just watching the documentary about AlphaGo, which was amazing, and I remember one of the programmers said how he was interested in AI and wanted to work in that field 5 years prior (5 years before it was filmed; it's an event from 2016), so 2011... and everybody was just telling him that he was wasting his time...
    I can't believe this... in such a short time span... Simply mind-blowing when you think of how everything changed in the last decade, like a different world... I hope I get to revisit this comment in 5-10 years and say: "How clueless we were back then... we thought this was impressive :)"

  • @AllanLaal
    @AllanLaal 5 months ago +1

    truly worthy of a Participation Trophy :'D

  • @lukewilliamrimmington
    @lukewilliamrimmington 5 months ago +3

    This is fascinating and awe-inspiring that a multimodal model can do this! Well done to the Google team who probably had barely any sleep when this dropped.

    • @dufung3980
      @dufung3980 5 months ago

      It’s a Manhattan Project, stop being anything but disappointed in your species.

    • @lukewilliamrimmington
      @lukewilliamrimmington 5 months ago

      @@dufung3980 This ain't The Terminator. This is real life. AI can kill us; it's also a double-edged sword. Advancements with these programs can be extremely beneficial to finding cures for cancers and beyond. So, who cares? Don't belittle me or the Google team. Belittle regulators for not doing enough. Don't hate the player, hate the game, son.

  • @DarkH4X0
    @DarkH4X0 5 months ago +6

    That's awesome Google!! But I must be completely honest with you... what really sold me this was the: "what the quack!" at 1:07 🦆

  • @khalidaqeel01
    @khalidaqeel01 3 months ago

    In awe of Google Gemini's brilliance! I'm consistently impressed by Gemini's ability to grasp complex concepts, generate creative text formats, and answer my questions in such an informative way. It's like having a super-powered, knowledgeable friend always at my side, ready to tackle any challenge I throw its way.
    The way Gemini seamlessly blends various forms of intelligence - factual language understanding, code comprehension, and creative thinking - is truly remarkable. ✨ It's clear that Google has poured immense effort into crafting this AI, and it shows in every interaction.
    Thank you, Gemini team, for creating such a valuable tool! I can't wait to see what you achieve next!

  • @cbot9302
    @cbot9302 5 months ago +57

    The three most impressive parts for me were it tracking where the ball was, understanding the dot connection was a crab (I didn't even see that!) and, funnily enough, it getting things wrong! I think this last one because it is also stuff that would fool us humans (like expecting the coin to be where you saw it put, or expecting a cat to make an 'easy' jump). Super fascinating stuff.

    • @kenneld
      @kenneld 5 months ago

      Wouldn't the ball tracking be really easy (relatively speaking)?

    • @DevTheorem
      @DevTheorem 5 months ago +19

      Too bad this video is mostly fake. The model is not using video or audio input - it was fed some handpicked still images and text prompts, and the output text (not real time) was edited into this slick marketing video. What you see is not a real representation of how the model performs.

    • @realdanney
      @realdanney 5 months ago

      How’s the ball tracking even possible if it only operated on stills?

    • @DevTheorem
      @DevTheorem 5 months ago

      @@realdanney Provide the right still images and it will output the "right" answer.

    • @dufung3980
      @dufung3980 5 months ago

      It’s a Manhattan Project, stop being anything but disappointed in your species.

  • @jitterskater
    @jitterskater 5 months ago +10

    Incredibly impressive. Genuinely shocked by how good it already is.

    • @gonzalobruna7154
      @gonzalobruna7154 5 months ago

      it's a fake video, check their blog post:
      developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html

    • @ojisan4220
      @ojisan4220 5 months ago +1

      as of December 2023 it is sort of fake

  • @hanshmhansen
    @hanshmhansen 2 months ago

    It's completely like hearing Lieutenant Commander Data from Star Trek.
    It makes me think of how well Brent Spiner actually played that role in the series.

  • @TheCharlesHudson
    @TheCharlesHudson 5 months ago

    Just WOW!!! Imagine this in education or professional workshops...

  • @Inter-Dimensions_Studios
    @Inter-Dimensions_Studios 5 months ago +114

    I have always thought Google has the best chance to take generative A.I. to a super level.

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @ivoryas1696
      @ivoryas1696 5 months ago +1

      Inter-Dimensional_Studios
      Honestly, low-key same.
      _Especially_ since they "acquired" Deepmind!

    • @Inter-Dimensions_Studios
      @Inter-Dimensions_Studios 5 months ago

      @ivoryas1696 I like the competition, looking forward to what others have up their sleeves.

  • @brentshaffer9773
    @brentshaffer9773 5 months ago +49

    Realizing the yarn examples are displayed against the same backdrop as the AI is seeing is both impressive and creepy.

  • @jiminyc9938
    @jiminyc9938 5 months ago +20

    "Google admits AI viral video was edited to look better": I just read the article on the BBC website where Google explains the video is not real time but edited. Much ado about nothing...

    • @cijoykjose
      @cijoykjose 5 months ago

      How can someone show something with steps and lags in between each task (I mean the human and machine preparation lag)? This is how results are professionally published, so editing is an unavoidable part.

    • @sierramist446
      @sierramist446 4 months ago

      I just read that they had used still images. So it seems like this video interaction was artificial? They had to take pictures and upload them

    • @Nolimit4you
      @Nolimit4you 4 months ago

      It's a lot of marketing, but the future is here and it will be wild

  • @user-kz5co9eh4z
    @user-kz5co9eh4z 3 months ago

    It's amazing to see how the model is also recommending actions during the conversations making it more human-like. The power of multi-modal!

  • @The_spaceguy
    @The_spaceguy 5 months ago +3

    I think Google deserves more credit for this and it’s nice to see them actually competing. This model seems really powerful and although I might not use the video input feature, it alone gives a whole lot more promise for audio and text too. Can’t wait to try it.

    • @do.xuantung
      @do.xuantung 5 months ago +4

      You should see their blog post in the description. It is a lot less impressive than what you are seeing in the video. For example, the map game was part of the input prompt; Gemini didn't even come up with that idea.

    • @DajuSar
      @DajuSar 5 months ago

      Fake stuff xd. Really impressive how they can be competitive with manufactured tests and misleading advertising. Really putting their grain of sand in the ecosystem

  • @bobfrasure8436
    @bobfrasure8436 5 months ago +8

    Impressive, I can't wait to try the released product. Even if it's scaled back, it'll be a win!

  • @Djclippz
    @Djclippz 5 months ago

    Very impressive! Another day closer to bringing star trek to life!

  • @jessysarazin2208
    @jessysarazin2208 5 months ago +2

    I would be mind blown if it wasn't edited to be more impressive

  • @forcanadaru
    @forcanadaru 5 months ago +5

    Incredible, outstanding!

  • @myanshu77
    @myanshu77 5 months ago +1

    Mix artistic imagination with reality, and the result will always appear awesome. Nice advertisement work.

  • @_.naomi23._
    @_.naomi23._ 27 days ago

    Obsessed with this dude's voice

    • @bluetee531
      @bluetee531 20 hours ago

      Eww. It's funny Indian accent

  • @ffdalkins
    @ffdalkins 5 months ago +14

    the most astounding features of AI models I've seen..

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.

  • @technophile_
    @technophile_ 5 months ago +54

    Mind Blown 🤯 Kudos to every single developer who worked on this! You are amazing!

    • @Google
      @Google  5 months ago +37

      It takes a village of brilliant folks ✨

    • @michaelcondon8286
      @michaelcondon8286 5 months ago

      This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.

    • @ZOXENE
      @ZOXENE 5 months ago +1

      Seems like the village needs new people, count me in

    • @alexp.3694
      @alexp.3694 5 months ago

      Why is everyone so happy about Google building a literal Pandora's box?? No one knows what's going on inside there and how safe it is... Yet everyone is happy like brainless kids

  • @frishter
    @frishter 2 months ago +3

    If you're wondering why search results aren't as good in recent years, now you know.

  • @toxic_assassin
    @toxic_assassin 5 months ago +1

    Very helpful for studying.

  • @Nyoritv
    @Nyoritv 5 months ago +3

    It's amazing how advanced technology is. What I've only imagined comes true.

    • @toggtlas7099
      @toggtlas7099 5 months ago

      This video does not represent a real product. It's staged, manipulated and edited to make it look like something it's not. Google has admitted as much themselves.

  • @aeroflack
    @aeroflack 5 months ago +9

    outstanding! i wish you could apply this to Google Home and make it smarter and allow us to add as many conditions as we want to run complex automations. Please make it happen !

    • @KentDozier
      @KentDozier 5 months ago +1

      Can you imagine "when anyone who is not in our family comes into our house when nobody in our family is home, send me an alert", with the AI having access to security camera feeds.

    • @phen-themoogle7651
      @phen-themoogle7651 5 months ago +2

      @@KentDozier Amazing security system! Brilliant idea

    • @Pixelarter
      @Pixelarter 5 months ago

      ​@phen-themoogle7651 And scary. It will know everything you do in a meaningful way, and even be able to manipulate you if it has feedback to you or the environment.

    • @joelxart
      @joelxart 5 months ago +1

      Yeah, the current Google Home appears to be soooo 'stupid' compared to all those latest AI toys. I guess the weather forecast just won't cut it :)

  • @evanseesred
    @evanseesred 5 months ago +3

    I can’t believe this was totally real and not staged whatsoever 😂

    • @Thirunaking
      @Thirunaking 2 months ago

      ua-cam.com/video/8pSXahztD4c/v-deo.html

  • @cheeks80
    @cheeks80 2 months ago

    Think of the time when we can wear AI-enabled glasses and get real-time prompts/suggestions when looking at people's faces, with matches between their profile and mine to see compatibilities. Or better yet, how to approach the person and make the best first impression based on their likes... That will be mind-blowing

  • @orionsbelt29
    @orionsbelt29 5 months ago +4

    This is 1.0, I look forward to seeing what 2.0 does when all it has to do is just improve

  • @NkwawirBeltus
    @NkwawirBeltus 5 months ago +179

    Mind-blowing!! We all knew Google wasn't gonna just let OpenAI win the AI battle. This is some next level stuff.

    • @dufung3980
      @dufung3980 5 months ago +5

      It’s a Manhattan Project, stop being anything but disappointed in your species.

    • @dcos5
      @dcos5 5 months ago +7

      They've been working on AI for a long time, and they have limitless data to train on.

    • @TheRafark
      @TheRafark 5 months ago +14

      It’s 🧢 tho the video is scripted

    • @stephantual
      @stephantual 5 months ago

      It would be if it was real. developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1
      They took individual images of the sun and earth that you see in the video and passed them to the AI, complete with hints. Then they recorded the answers from the AI using a text-to-speech system and overlaid them on a video to make it look like the AI is looking at what you see in real time. It is not.

    • @SnoopyDoofie
      @SnoopyDoofie 5 місяців тому +8

      Except it is being reported as fake by TechCrunch.

  • @imqwerty5171
    @imqwerty5171 5 months ago +702

    Impressive. Waiting for Microsoft and OpenAI to play their move ⏳
    Edit: Google played itself by faking the video. Respect 📉
    "I hope with our innovation they will definitely want to come out and show that they can dance. I want people to know that we made them dance."
    - 🐐 CEO (Satya)

    • @ahtoshkaa
      @ahtoshkaa 5 місяців тому +70

      GPT-5 in half a year that will make all of this look like child's play

    • @rubarion3650
      @rubarion3650 5 місяців тому +63

      @@ahtoshkaa bro I have some knowledge regarding how GPT and other AI models in use today, work under the hood and I can tell you that the technology behind this google demo video is nothing like GPT models etc. This is Terminator/Matrix kind of stuff🙃🙃

    • @MM_Legacy
      @MM_Legacy 5 місяців тому +7

      Tests show the current Gemini version is somewhere between GPT 3.5 and 4.

    • @ahtoshkaa
      @ahtoshkaa 5 місяців тому

      @@rubarion3650 The "main" Gemini Ultra - the one that supposedly beats GPT-4 - is not out. Gemini Pro is a bit better than GPT-3.5, but no where near as good as GPT-4.
      The showcased model seems to be on par with GPT-4V in terms of cognition. "Sequences shortened throughout" disclaimer prevents us from knowing the real inference time and whether its better than in GPT-4V.
      Very underwhelming for a model that is coming out more than a year later (GPT-4 finished training in 2022 Q4). It seems that they simply can't catch up to OpenAI

    • @amdrewhamris
      @amdrewhamris 5 місяців тому +28

      @@rubarion3650why are you acting like that's special knowledge, plenty of people understand how they work

  • @FlorianLabaye-sg4ew
    @FlorianLabaye-sg4ew 5 months ago +3

    GPT-4 already performs well on these tasks too.