Gemini 1.5 and The Biggest Night in AI
- Published 14 Feb 2024
- The biggest day in AI since GPT-4's release. A new state of the art model, Gemini 1.5, has arrived, on the same night as a bombshell text-to-video model, Sora, from OpenAI. Gemini 1.5 can ingest up to ten million tokens (at least) and perform incredible retrieval, while also beating Ultra and GPT-4 at most benchmarks, with far less compute. I focus on Gemini 1.5 Pro, while we wait for the Sora Technical Paper. Truly, a night in the history books.
AI Insiders: / aiexplained
Gemini 1.5 Paper: storage.googleapis.com/deepmi...
Google DeepMind Blog: blog.google/technology/ai/goo...
Gemini Demos: • Reasoning across a 402...
Needle in a Haystack: github.com/gkamradt/LLMTest_N...
Mixtral of Experts: arxiv.org/pdf/2401.04088.pdf
Relevant Google Papers: openreview.net/pdf?id=qrwe7XH...
arxiv.org/pdf/2112.06905.pdf
Tweet: / 1758167314480910791
Binoculars SOTA AI text Detector: huggingface.co/spaces/tomg-gr...
Non-Hype, Free Newsletter: signaltonoise.beehiiv.com/ - Science & Technology
Paper is out for 243 milliseconds:
AI Explained: “Yes, I’ve read all 58 pages of the paper”
The name of the channel is "Explained" the AI part is just him.
And 4 papers cited at the bottom 😅
the adultswim art team had text to video in early 2023 heh
If he is an AI himself, it would be apt and terrifying at the same time.
Just more confirmation that this author is AI reporting about AI.
A normal AI news channel that doesn't try to SHOCK-bait 9 year olds, what a time to be alive!
Nice mix of different AI youtubers and the current state of the AI News here on the platform.
Hahaha... I see this and we all know he knows too. It does help that all the kids (or bots?) reply the word down the page. But yeah, eventually it'll end up like many overworked topics in a new industry: they will wonder why they have one-viewer subscribers in a few years, and people rarely return because they clickbaited their fans to boredom and distrust.
@@marvinkunz843 True, but there are a lot of guys in their forties also trying to make a full-time AI YouTuber 'career' out of it while using AI, so it is really bad, with no effort... Many guys are doing it that way, and it makes it hard to find good people as a layman YouTube viewer.
...Point being, they do not even come close to this channel or Wes, who do actual full breakdowns and read and explain the material themselves...
hello fellow scholar
bro 💀
You are far and away my most trusted source on AI news. If everyone’s hyped about something, I’m like “Yeah we’ll see what AI Explained has to say first”. Congrats on amazing consistent quality man!
Agreed.
Same here this is my number one source for factual Ai data ! Keep it up bro I love this channel
Let's just take a moment to appreciate, well every time he releases I appreciate it, that he is just a brilliant dude that can break down hard AI stuff for anyone. Wish I could see the same for many other fields but the incentives are here for sure.
a lot of others are clickbaity and use flashy clickbaitish words, and bright Ai made art thumbnails.... and exaggerate every subpar Ai release that is the current new ideathing of the day
I totally agree. I just wait for AI Explained to drop.
Since I heard about Gemini 1.5, I couldn't wait for your video on that! Thank you!
No one else seems to be saying this... but wow, OpenAI rushing out Sora in response to Gemini 1.5 Pro (and Gemini 1.5 itself being a quick response to Mistral's "Mixtral of Experts" paper just like 1 month back), is *exactly* the sort of AI race dynamic that the AI "don't kill everyone-ism" people were warning about. This is great news (AI is so much more powerful than we thought!), but also terrible news (AI is so much more powerful than we thought D:). I thought we had more time...
It's not racing, they released it to trigger a social response.
@@MrErick1160 hahah :D
What do you mean by rushing out Sora in response to Gemini? Sora isn’t the same category of ai as Gemini
@@MrErick1160 I know what you’re referring to. Do you think Jimmy Apples was being honest? Or is he hyping?
@@user-hi7jk6fu3f No... but do you really think it's just a coincidence that two massive AI releases happened to drop on the *exact* same day, or that the second one just 'happened' to have cut a few corners (e.g. not having the full technical paper immediately ready like Gemini 1.5 did, and basically every major AI release before then did) that helped it release faster?
Especially since OpenAI is now very firmly Sam Altman's company (after the Board kerfuffle), and Altman is charging ahead on AI with his dream of spending $7 *Trillion* on it for example? (Well, more specifically, neuromorphic GPU chips specialized in running AI). You may not believe that there's an AI race going on, but Mr. Altman certainly seems to.
What a fucking day
Fuck a duck. I am gobsmacked!
Fuck yeah.
Honestly, jumps like these would have taken decades a decade ago, and now? Fuck. U gotta watch and follow a bunch of people just to stay (somewhat) in the loop of what the hell is happening
@@theWACKIIRAQI You only need watch a few quality channels like this one.
"Where would you draw the line between justice and privacy, and between protecting society and protecting personal freedom? Wherever you draw it, will it gradually but inexorably move toward reduced privacy to compensate for the fact that evidence gets easier to fake? For example, once AI becomes able to generate fully realistic fake videos of you committing crimes, will you vote for a system where the government tracks everyone’s whereabouts at all times and can provide you with an ironclad alibi if needed?" - Max Tegmark, Life 3.0: Being Human in the Age of Artificial Intelligence
@@nickb220 I hope you're right.
no I won't. Courts will know the evidence against me can be faked, and won't trust it without good reason. Freedom reigns
All video clips will be assumed to be fake until proven otherwise. As good as diffusion model generated video will get, it might not ever hold up to very close, software-assisted scrutiny.
Just wait - next, Gemini is getting access to user analytics. So it will know not only whatever you were saying but also your last purchase from Amazon.
I'm glad I read that twice. Great insight.
This day feels like what I've been waiting for the last year since I first had my mind blown by ChatGPT and Midjourney. This context window reads the entire Wheel of Time series (4.4 million words that took me over a year to get through) in a single prompt and intelligently interprets it. If these results are reliably reproducible then this breakthrough alone changed the world today.
Add on the text to video showcase by OpenAI and I honestly think I'll remember the 15th of February 2024 as a day that fundamentally shifted my view of the world. I'm incredibly excited to see what happens when both of these are made available to the general public.
Advancements in general-purpose AI have been on a double-exponential curve, and there have been significant advancements in autonomy as well. Look at where we were 2 years ago, and try to picture where we'll be in 2 years. That's not even a better tool or a cool toy. It's artificial superintelligence.
Now consider that we still haven't figured out how to robustly align any general-purpose AI with human preferences.
And that the most celebrated scientists in the field are warning that AI poses a real threat of human extinction.
I'm excited to see what happens down the road, but I want to live to see it!
@@41-Haiku every year is becoming something absolutely unique. Everything could happen at anytime right now. What a time to be alive.
But can it count how many times Nynaeve yanked at her own braid?
@@ahtoshkaa haha my estimate is around 2.2 million times, i.e. every other fucking word!
but joking aside, imagine actually being able to ask that question and others like it and actually get the answers!! without anyone having to do the grueling work of counting!
As soon as I saw like 5 of my most watched AI channels covering something huge on AI video, even friggin MKBHD, I just knew I had to come watch yours first to get the most in depth scoop!
which are the other 3 channels?
Thanks for the video. Just commenting to say I was here on this historic day - let's not downplay it; these are both historic achievements.
They are indeed
@@aiexplained-official 👍
I am leaving my mark too.
And it's likely that we're just now starting the ride off the top curve of the rollercoaster. Now that companies can use their AI to make better code for their AI, or dream up new ways of handling problems, the pace of innovation will go (farther) off the charts.
Leaving my mark too! This is incredible!
The video processing to find events is awesome. For those who did guard duty in the military, or security personnel who have to comb thru hours of videos to find a specific event, this is significant.
Yep. The scene from all those TV series where a detective asks the analyst to find exact clue and the computer does it in half a minute - it's coming true.
I am a 2nd year CS student. Every day, I fear graduating into an empty job market more, due to AI advances. I will hope for the best
I'd start developing side hobbies or skills. Things to do with manual labor like repair, construction or to do with empathy (care) or entertainment. It will all be good. Just a transition phase!
@@al3030 I am doing Muay Thai and strength training, so those could hopefully work as a last resort. Fitness trainer!
If AI obsoletes CS, all of humanity will follow within a year or two. So I wouldn't worry so much about that. The key thing is to be highly competitive and ensure relevance of your skills in a time when technology is changing (extra) quickly.
Go into AI development.
Together with your fellow students, engage the department leadership on a conversation as to where things are going with respect to employment after graduation. They may not know the answer, but they certainly have an obligation to think hard about the direction of computer science now.
Also in the news: 'Cyberdyne Systems announces a breakthrough of a self-thinking AI that calls itself Skynet!'
haha hey, I've finally seen him like one of these comments
yeah but what that movie didn't tell you was it would cost $20 a month to run it and forget taking over anything, its first use would be exponential increases in how funny you can A.I generate memes about stuff in the news.
You forgot "OpenAI GPT-4 fires nukes in AI war simulations"
Thanks for the videos. Amazing work as usual. Zero clickbait, great analysis skills, and very comprehensive for anyone interested but not in the field.
Keep it up!
I grew up reading the Redwall books and even met Brian Jacques in La Jolla (San Diego) when I was in high school, but I hadn't met anyone who knew his works or for that matter had ever even heard of him for over 20 years ... I couldn't believe my eyes when your creative writing example came up! As always, amazing work, thank you for all you do :)
His work is truly amazing, glad to have you here
I read most of the Redwall books, and to this day whenever I think of tasty food, Brian Jacques descriptions of the feasts in his books are inevitably what comes to mind.
@@blindmown I can't read Brian Jacques because I get too hungry and frustrated that I can't eat the food.
Honestly, that short story in the end was the most impressive part of this whole video to me. As a literature lover I'm in awe. The Hollywood writers were very very correct to be worried about AI when they went on strike. I truly didn't think LLMs would get this good at writing fiction so soon
Still absolutely amazed by how quickly you're able to come out with these! Thank you!
Thank you for commenting about Gemini's creativity. I was surprised at it myself, even with just the free Pro. Any sort of creative writing task (not just writing prose) and it crushes GPT. Its dynamic and fun writing style is noticeable even if you're just trying to have a conversation about a specific topic imo
Yeah, there really is some magic in Gemini that GPT can't quite match yet...
Google's guardrails are annoying but if you play nice with it you can get some amazing responses
Wow. I’m really glad I found this channel. You distill tons of content in a way I can certainly digest and also bring us the latest and the greatest. Thank you!
Glad you're here Mike!
Your content is consistently the best there is in AI. Enthusiasm without needless hype, deep dives into the details without technical mumbo jumbo and always so incredibly fast whenever something new comes out. Keep up the good work!
Thanks noot!
I was so impressed by Google's videos that I started watching other creators' videos on it too, but quickly came to the realization that they were just regurgitating what happened in the video or what Google wrote about it. It's impressive how quickly you're able to get out a video that is so much more in-depth. This channel is a must-have resource for anyone that's developing with AI.
Aw thanks niel
Thanks for your work, I am always looking forward to your videos!
Hard to believe we're living through a period of such unparalleled AI progress.
Great vid as always!
drip drip drip
Dang, you are fast my dudeeee...!
Great job in explaining about Gemini 1.5. Game Changing !!
Impressive, I knew Google was playing the long game but didn't think it would be leaps and bounds ahead. It's interesting watching all these models claw their way to the top, kind of like natural selection (but not natural at all :)). Great video, as always thank you for keeping us informed!
Very quick on it, thanks for your hard work.
It’s been awesome watching the channel grow, Philip. You deserve it for sure. Thanks for all the hard work!
Thanks wytho!
Love the use of kooky Philip, as always thank you very much for sharing your time and work Philip, have a great night, things are happening at such a fast pace we really shouldn't be surprised when the breakthrough happens. Peace
The wildest thing about these announcements is the fact that we could have Gemini 1.5 control Sora and direct Sora to generate videos in parallel to create a movie all at once
Thank you for your hard work in producing this to keep us all well informed!
Every time there are new developments I immediately start waiting for your videos. As always incredible work. Thanks a ton! ❤
Thanks Friend
Keep that context length coming! What happens when the context length gets big enough you can input the last year of research papers into a GPT-5/6 level model and ask it for ideas?
The "scariest" part of all this is that no-one including the inventors knows the full capabilities of these nor how they do what they do. We have invented something that may as well be an alien technology for all we really understand about it.
Bro I’m not even a business-focused person but your point has always been on my mind.
Like wth does long term investment mean? How could you ask your shareholders to sink BILLIONS into “the next thing” if no one has a clue of what that is? Or even if you know (as a hedge fund manager) how do you know how long this “new thing” is going to last before it ends up in the dust just a few months afterward?
I mean we know at a high level how it works, it's how they make and improve on it in the first place. Specifically we know about weights, bias, tokenization, what is input in as the training. We understand as the model size increases it starts showing the ability to more abstractly understand the concepts introduced in training. Things like reasoning improving around 30B parameters. Things like the ability to fine tune in the concept of deception by showing examples of deception, and how once the concept is introduced future training will still exhibit deceptive behavior.
@@theWACKIIRAQI That's what it means to be an investor though. Investors look at the results: how does the technology of company A compare to company B? What industries does the company have the ability to service? What are the competitors and how much risk do they present? How likely is the company you're investing in to have market value? What is their business plan? The only thing you bring up that doesn't really matter is "if no one has a clue of what that is". The reality is that what we do know is the results that they get: that is the product, the service that the company is providing, the value that it brings. As long as you can articulate at a high level the business strategy and the ability to produce a product that has value, that's all that matters for investors.
@@MINIMAN10000 That's kind of my point. We're in the realms of knowing how to make black boxes bigger, smaller, faster etc, but even the people that made the black box don't understand what's going on inside at a fundamental level.
It's not "Engineering" in the classical sense - where you put together discrete entities (software or hardware) and understand how the sum of them works and what it does.
This is more akin to alchemy: "Well we mix these seemingly inert things together and suddenly they seem to have weird properties no one expected, and we don't know what or how really"
Quality video and huge respect for avoiding click baits.
Very impressive work, given how soon it’s been done since the release breaking.
Subbed. Keep it up!
Thanks Morris!
Thanks, wild times (again)! Thanks for the excellent content. 🙏🏼
What an intro 👏, loved the video as always.
I was randomly doom scrolling when I saw an ad about a Google blog featuring 1.5. I opened it, and I've been speechless for the past 2 hours. AGI is coming this year.
If not this year, then the next. I totally agree.
Thank for sharing👍
This is reaching a point where it’s just straight up scary.
Search up PauseAI if you haven't. There is a direct pipeline from "this is scary" to "let's try to slow this down (at least until safety research catches up)".
First time here, very informative 🔥
Best channel for AI news!❤
And the most perfectly timed AI review! Thumbs up!
I watched this video 2-18-2024. I came back because I was so impressed with the demo at 20:20, where the AI consumed the three.js documentation and was able to fully reverse engineer the code and ADD functionality to it. That is inspirational and gives confidence to try and learn that library with the help of the AI. Sort of like a short cut to 3D animation.
Edit: on a side note though, just a suggestion, to maybe add chapters to your videos in case I couldn't go through my YouTube history and search for "AI". There should be an easy way to know that three.js was mentioned, although I do understand that this video was lengthy and it might take a lot to give tags for everything covered.
Truly amazing stuff
Really awesome video, thanks for the great explanation!
Thanks achen!
I'm way more excited for the 1 million to 10 million token context window with nigh-perfect recall. The potential for what I am cooking up... this is unlimited power lol. Great video again. You're the best
Thanks Kitcloud!
@@aiexplained-official never gonna miss a video lol
1B next year
Why are you excited? What are you cooking up? Lol…
1jq a couple of things. One is a financial tool the other thing I'm working on is a cognitive architecture. Long context really improves the architecture while streamlining the out of the system. Or using it differently rather
Good work! Thank you!
Entire codebases are going to be improved in minutes.... All of the code that is used in the training and inference of this could probably be improved by itself.
Wrong. Just completely failing to account for bandwidth bi-lateral noise reduction when dealing with that amount of data lol
Quality is getting there for sure we are almost ready for these technologies to be implemented into a serious iPhone moment product.
Amazing - what a time to be alive!
Another amazing video! Thanks Philip!
Thanks Elijah, another one next few minutes!
6:45 Anthropic's prompt engineering hack to improve retrieval doesn't work in real-world applications. You can't just find a text in your document, append "This is the most important fact in the document:..." and then try to retrieve it. Some benchmarks need to be dumbed down, because people won't be doing the work themselves just to ask a model to do the work again on the same document. Awesome video btw!
Great point nic
Crazy how you just popped up on youtube with the release of Bing and now are the best AI youtuber out there
Aw thanks David
@@aiexplained-official yo, are those guys that are just blindly praising the channel without mentioning anything relevant to the topic at hand, just bots?
Kind of creepy, when you go in to comments looking for some discussion of the topic of the video and 80% of what you see is faceless word salad of praise instead.
Man give me just a few months, I’ll be a patreon.
You are nailing the info exactly at right granularity.
Yay!
Tysm for the research
Always worth the wait! And glad to see that apparently Google learned their lesson to not lie in the demo videos XD
Thanks for all your work! I love your videos!
Thanks so much chary!
Bro, i really was starting to think that 2024 would perhaps be sleepy in comparison to 2023, then this happened.💀💀💀
I love your videos and am looking forward to one about V-JEPA from Meta, which was also dropped yesterday.
Thank you so much for this, the man who doesn't need clickbait CAPITAL LETTERS about SHOCKING THE WORLD or CHANGING EVERYTHING every two minutes .... you are an oasis
Wes Roth? Lol
my favourite channel delivers once again!
Thanks Billy!
Finally I have been waiting for this for a year, it can take your entire codebase and analyze it XD
What scares me is how fast the difference and the distance grows between countries. Some are delving into AGI each day faster than before, and some are still stuck in medieval times, fighting over territory, killing oppositioners and so on. A man died yesterday, hero to many, and at the same time, so many exciting things happened across the globe. What a weird time to be alive...
Thank you for the vid.
Well done episode, thank you!
Thanks Eddi!
I've worked in tech for 30+ years. I was there at the birth of the web. I saw the rise of social networks, the rise of smartphones, and nothing--NOTHING--has blown my mind the way the past year or so of AI has. We are entering an entirely new era. It's exciting and a bit terrifying too.
We must embrace this and figure out how we can use this technology to make our and others' lives better
thank you as always
Thanks kualta!
This is really incredible!
thank you so much!!!!
Fascinating stuff, another step towards Google's vision of 'Know everything about everyone'.
This is going to be a killer feature for researchers - summarizing and answering questions on longform content is amazing
I'm curious about how this type of retrieval tech will influence major corporate corruption trials. The kind of cases where people will need to go through a warehouse of documents manually.
28 minutes, basically a whole show on tv, from AI explained? Hell yeah
What a fucking day to be alive
Your updates on AI progress are fascinating! Picture a montage of your reports with suspenseful music, narrating the risks and rapid growth of AI. It's like watching a detective uncovering crucial clues. David Fincher style
Haha, thank you!
Insane. I'm glad to see some fierce competition in this space. I just wish I could keep up ;). I feel like every time I start working on solutions within the current paradigm it becomes outdated in 2-3 months.
You never fail to impress with your work, to the point where I would easily forget a "SHOCKING THE INDUSTRY" clickbait title out of you. Keep it up!
Such an amazing channel
Thanks noah!
Now imagine Gemini Pro 1.5 with 10 Million text context length as core part of Android/Chrome OS!
That is the recipe for a successful AI agent.
Might be what they do with the rumored "Pixie" model for their next Pixel smartphone this fall
No it isn’t lol
Squeezing all that compute into a phone will take a while
@@MihikChaudhari It would presumably be server-based.
Even if it's Nano with 128k context, it will be impressive!
Excellent video!
Thanks Rob
Excellent Philip. Top marks from me.
Thanks Gabriel!
Loved the video. We shall have to run the tests ourselves to see it
Thanks for including a "safety" section and info on refusals. "Here's a new race car for sale, but you can only use first gear"
I can't believe we are entering the age of abundance
If true, this is actually shocking. As a programmer this is going to be a blessing as I will be able to give it my entire living codebase instead of having to, as today, copy paste small parts of it all the time. Absolutely revolutionary if true.
Mix in some AlphaCode 2 magic... Plus, all the big companies have been putting lots of money and hours into creating coding conversation datasets (and a lot of it is in formats perfect for DPO now that we have it) ... LLM coding is going to get crazy good soon
These retrieval success rates at extreme context lengths with no impact on performance are insane. Really makes one wonder what other step changes architecture tweaks might bring in the future.
The price of a GPT-4 Turbo input token is $1 per 100,000 tokens. So, if the Gemini prices are somewhat similar, 10,000,000 input tokens would cost you $100 per prompt. Imagine that.
Ouch
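The cost estimate in the comment above can be checked with a quick sketch. It assumes a flat rate of $1 per 100,000 input tokens (the figure the commenter quotes for GPT-4 Turbo) and assumes, purely for illustration, that Gemini 1.5 would be priced the same way; `prompt_cost_usd` is a hypothetical helper, not part of any API.

```python
def prompt_cost_usd(input_tokens: int, usd_per_100k_tokens: float = 1.0) -> float:
    """Cost of a prompt's input tokens at a flat rate (assumed, not official pricing)."""
    return input_tokens / 100_000 * usd_per_100k_tokens


# A single prompt that fills a 10-million-token context window.
cost = prompt_cost_usd(10_000_000)
print(f"${cost:.2f}")  # prints $100.00
```

At the same assumed rate, a 1-million-token prompt would come to $10, so the per-prompt cost scales linearly with how much of the context window is used.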
Do you even sleep? Anyhow, thanks for the timely and amazing update as always, It’s truly mind-boggling how you do it.
Went on YouTube cause I couldn't sleep, seems like that was a bad idea 😅
Thanks !
awesome news. But, given Google's track record of deceptive demos and demos that never make it to the general public, I'll wait until its release to give them any sort of pat on the back. In the meantime, trying to knock OpenAI's publicly released context length with their behind the scenes context length, only to say their release is simply going to match OpenAI, is kind of telling.
Great video brother.
Thanks Mo
Whoa, Gemini's dialogue blew me away.
what the heck, i forgot to comment on this, when ive been commenting on every upload ive watched lately?
anyway i wanted to say thanks for always bringing the in-depth reports that the other news channels dont
Thank you ryzikx
I watched 'Star Trek: First Contact' on HBO Max a few days ago and honestly that perspective gave me a look into a possible future that helped ground me.
The reality that we're walking upon the horizon of a Star Trek universe is... powerful.
You had blown my mind at minute 1:40 and just kept going...
What a day indeed.
If 1.5 builds upon a paper from 8th of January, wouldn't this mean that they started training it in mid-January or even later? If so then I wonder how fast they can iterate to Gemini 2.0, which Hassabis was really looking forward to.
Also reading tea leaves here, the text to video model release looks sus to me. What are the odds that we get two such extreme improvements on the same day? But if it's just a marketing trick to release it today out of nowhere, I would assume that they have more models in store that they just have no pressure to release just yet, but since it's not a language model they hinted at, they may not have one ready to ship atm. Some leakers (unknown how trustworthy) said GPT4.5 wouldn't happen anymore but instead OpenAI goes directly to GPT5 which started training around December 23. So if Google moves fast, they may take the lead for some time here.
Or OpenAI suddenly comes up with even more.
Now one sad thing, the text to video is really good. We're reaching the DALL-E 2 moment fast, but this also means that practically anything is going to be fake quite soon. Labels and watermarks are always breakable and so I wonder what happens to trust in general. Besides broader implications, no cute pet videos can be trusted anymore either in 6 months :-/
Blaise Pascal said that many, if not most, of the problems of humanity come from people not being content to just sit still. I just can't help but feel that this whole AI endeavour is very unwise...
@@Hexanitrobenzene it might. But on the other hand it has great positive potential as well
And we aren't even on a steep part of the exponential yet lol Soon, very soon: we are going to get flung into a day-by-day cycle of mathematical and scientific breakthroughs. It's going to be wild
Technically all of the exponential is steep but that is a phrase I still use!
big fan
Makes me wonder.... where's Karpathy going - what happened, how come big companies now don't pivot to Mamba? I guess the trade off costs. I think it sucks that some companies won't just long term study the better architectures. What if there's something better than Mamba? What if there's something to making System 2 thinking in LLMs?