Reply to this comment with your Google Veo 2 prompt!
Creating a subscriber video where I try your prompts with Veo 2!
imagine going home after a whole day and just type 'generate me a new the office episode'
2 years or less
@@apache937 optimistic, it gets exponentially more difficult to consistently generate 10-20 minutes at one moment. And mixing audio/music/sound effects/dialogue and planning everything out to make it at the same level and quality would take a lot of time. Maybe if ASI comes out in 2026 it’ll be possible to get results instantly, but it’ll even be tricky for AGI to make something like that. Could take a few days to generate that and have the AI edit it until you get a good result (like imagine a team of humans working on it). There need to be new breakthroughs in how AI edits videos and strings everything together to make longer stuff consistent. We haven’t seen anything one-shot several minutes at once, and unfortunately we are probably still a few to several years off from having it make full episodes (unless ASI gets here in 2026).
Never happen
@@24_f_p_s In the early 2000s, Blockbuster's CEO said "Video will never be a big thing on the internet." So yeah, we don't know what's going to happen.
@@phen-themoogle7651 It seems weird that you think ASI is an easier problem to solve than creating long videos, given that creating longer videos would be a subset of what ASI could do... you entertain ASI coming soon, but not longer videos, unless we get ASI. Very strange reasoning.
Will Smith eating spaghetti is the ultimate benchmark
True, but maybe they don't allow celebrities there.
It's over, y'all. Google Veo 2 is the future of AI filmmaking for films & cinema
It's not *the* future, but it's a large step towards it for sure...
AI is great for commercial tasks like creating generic ads or commercials. I wouldn't say it's ideal for filmmaking, but it could work for things like B-roll or when you need stock footage. In that case, it might be useful.
@@stoggi15 No, he's right, dead internet theory is true, it's over
Sooo close to perfection, barely missing the mark, 9/10! Way better than the other AI video models that only get a 4/10 in my book. Maybe in one year short clips will be completely indistinguishable from reality!
And then in a couple years we can generate both audio and video at the same time with perfection for up to a few minutes zero shot. I’m not expecting full length movies to be generated without people editing and mixing them a lot themselves, for at least 2-5 years still. It’ll be exponentially more difficult to remain consistent over a period of 30 minutes to 1-2 hours!!
That's my estimate also. I think in 1-2 years it might even be possible to have real-time generated video indistinguishable from real. I don't know what kind of resolution it will be at, but I stand behind the idea of real-time generated video in that time frame.
@@TeddyLeppard I think short 30-minute movies are the best that will be possible by 2030.
So we are in a simulation !
The sausage-eating guy literally inhaled it like a vacuum at the end.
And that must have been a particularly tough banana that that other guy was eating.
Otherwise...this stuff is amazingly accurate.
I'm pretty sure it's a steak but yeah fully agreed. This is definitely the best on the market. Not perfect but a heck of a lot better than the competition.
Video rankings:
A = Indistinguishable from reality
B = Minor issues that will probably go unnoticed (morphing, changing, etc.)
C = Issues that will probably be noticed but are pretty minor
D = Problems that will also be noticed but are relatively major ones that kind of draw your attention to them (problems with pretty much the focal point of the video)
(Order from start to end with each clip)
- B
- A
- A
- C
- A
- B
- B
- C
- A
- A
- A
- D
- A (Aside from the weird thing where she only sips a little bit of it, but that doesn't make it more or less realistic, don't really overthink it I guess...)
Total = 13 clips
A = 7
B = 3
C = 2
D = 1
Conclusion: Congrats Google DeepMind! At least according to me, more than half of these videos are practically indistinguishable from reality. The only other factor needed to draw a less biased comparison would be to know how he "cherry picked" them, if at all; the more cherry picked they are, the more its general score will go down...
these are 0 shot
For most cases, I can find issues only if I really look at it a lot. Some are easier though. Examples:
0:12 bubbles are very fast, disappearing inside the liquid before reaching the top.
0:20 The meat is gone in an instant at the end. Fork still has a lot of steam.
0:30 when the man eats a banana, I think the markings on the banana don't match the angle of eating.
1:04 red straw appears out of nowhere, and also disappears when done drinking...
For once, I didn't see issues with hands, though. That's very rare.
Does Veo 2 allow uploading images with people? It’s unfortunate that Sora doesn’t. My family and I have enjoyed using Kling to animate family photos and be creative with them
Jerrod, this is amazing! I want to see more! Thanks
Thank you so much, more to come!
I want to see a demo of someone looking in the mirror and turning around in a circle, slipping on slippery surfaces, and fist fight
Why is the maximum quality of your videos just 720p?
This is the output resolution of Veo 2.
These are amazing Jerrod! But how do you have access to Veo 2 when the world doesn't have it?
Waiting list and on the discord, a few people were lucky to get access. I’m doing my best to repay this by generating a lot!
imagine if it can create consistent character :D
That’s definitely an issue / missing thing right now. I know it’s possible. I’ve seen YouTubers do amazing things with LoRAs etc. but it seems like a complex process involving multiple tools, each requiring purchasing credits / subscriptions, from what I can gather
@@ManoOne-Music-Production yea
and backgrounds from different angles. That would be a deal.
@@jamesroth7852 yes
Pretty accurate! Wow!
One question, will Veo 2 be paid like the other AI tools that generate videos from text prompts? Or will it have limited use when it is available or is it free?
I don’t have any details about the price :( will update when I know more.
If we aren't in a simulation we will be soon.
The better we are deceived, the more "impressed" we become. Lol
The eyes are what make you know it's AI.
To the person who told me that AI will never replace film.
Here you go!
(This is just the beginning)
Not sure if they already implemented MAMBA/min-RNN architectures
Regulations can't stop this, you only have 2 ways of regulating AI
1. Regulate matrix multiplication (equivalent of regulating addition and multiplication)
2. Regulate the distribution of databases (that includes libraries, media, free-speech, USB, contact list of whistleblowers etc.)
Good luck with that
Personally, I think they should regulate USB/database transfers through physical borders (like airports etc.)
I literally can smuggle a USB containing 25% of a private key for a crypto wallet worth billions (but sanctioned)
With 0 checks from the airport
Also, nothing stops Google from implementing their film AI in a more AI relaxed place vs a regulated place.
Nothing stops hiring 1 guy to rent this AI to create thousands of films while the true director is inside the regulated AI country (unless you're gonna spy on all potential directors)
Hasn't AI heard of soup spoons?
Apparently not 😂
That banana is harder than the steak, but.... btw it's still impressive.
Coherence is generally excellent, but everything looks soft, as if it was out of focus. That's a problem.
Might also be the quality of the video, we can only export in 720p at this point.
Say what you want, but I reaaaallllly believe it was trained on YouTube; the videos are too good for it not to be
Duhhhh
Where is Will Smith in all this?
Check my shorts, did a comparison.
This is such a leap forward that the other genAI video competitors might as well give up photorealistic rendering of humans. They should just go for mograph, cartoon/anime and different flavours of CG.
The space for this technology will be super competitive, I hope everyone can find their niche.
Well, Sora also came out with impressive video. But look where they are. 😂
Oh boy! It’s lit
Imagine a filter over a game, or better yet, a game made entirely from imagination 😂
The prompt is only 7,200 lines, but hey, AI is really efficient
My prompt for this was like 2 sentences 😂
It's over for mukbangers
Some of the food feels like it's....stretchy
Godbless u man😊❤
See! They can't finish food, there is no story!
bro what 💀
@@prodvxmp That's the fundamental of storytelling, a beginning and an ending, and it seems to be missing in Veo 2 too! :(
@@ElfTaleFilms It's not able to generate outputs that long, though.
@@stoggi15 I am not talking about the length here. I am talking about the fundamentals of story which can be applied to any length. You can simply find this out by asking the question "if I extend any of these clips from 5 seconds to 5 minutes, will there be any story?" and the answer is no
@@ElfTaleFilms Probably yes, but also how do you really know that they won't be able to eat, let's say, all of the banana in that time? With a longer context window given for this specific task it could be done, so you're technically right, but only because it's not even currently designed to generate that type of length; it's usually 8-10 second clips
I'd like to watch fantasy, like games or isekai, made with this
No more spaghetti
We're cooked 💀💀💀
INSANE!!!
AI still struggles with skin texture
It does! Though this is a massive challenge for VFX too.
I'm scared 😢
Is it yours? Did you generate it? If yes, can you try clay animation? Thanks
It's all mine, will add clay animation to the list to try!
@@jerrod_lew thanks
And… go back and watch the OG Will Smith spaghetti video (March 2023)
Yeah that's terrifying
I did the comparisons! ua-cam.com/users/shortsrqtQBEP3Q30
@ thanks for the link! Crazy to see them side-by-side
Ok this one was a bit uncanny
Do we really need this stuff? What happens when we can't tell what's real and what's fake. So many other important problems we should be working on and we're doing stuff like this.
Since nothing like this has existed before, we don't really know what all the potential applications could be. And I kind of see it as a stepping stone; there might not be a lot of use in just generating random videos from text prompts or from images, but the video models themselves, especially when combined with other models, might be able to do really powerful things in the near future. Like render completely realistic interactive worlds in virtual reality. Which could have all kinds of uses in entertainment and training and who knows what else.
@R2Bl3nd All of that is great, but the reality is it will be, and really already is, in the hands of the general public. It's already happening, and it's already stupid.
Whatever happened to solving equality and world hunger? Instead we're creating realistic videos with text prompts and people are still starving.
@Itsjamilagain Why one or the other? We can do both. People are working on both. We don't have to halt all technological progress until everyone is caught up. Then no innovation could happen. Different people are cut out to do different things. Why deny someone's dream, if their dream is to work on this kind of thing? Should we shame everyone who doesn't want to directly solve world hunger? Are you solving world hunger right now, or supporting those who are? Everyone has their own way to contribute.
@@R2Bl3nd This is a privileged take. There is ABSOLUTELY more effort and emphasis being put on this AI stuff than there is into genuine social problems. As someone who has been through these things and tried to use the services myself, I can tell you the system SUCKS. And I do my part. When I had money I gave out money. I bought homeless people food and coffee and donated clothes.
Meanwhile the other day I saw some stupid video of an oversized owl, and nobody knew it was fake. Can we do both? Absolutely. Why is one not being prioritized over the other?
And again, what exactly is the use of giving this stuff to the average person who will most certainly abuse it? Social media has become more toxic and destructive than anything. But when it came out it was the next big thing.
Humans are WAY past our peak need for technology. Once you get to the point where people are losing autonomy and cannot think for themselves, you have gone too far.
This is going too far.
@@R2Bl3nd Looks like YT deleted my response and it was way too drawn out for me to rewrite it.
I'm just going to say I strongly disagree with this point. Humans have been past our peak need for technology since like 2010. Everything else now is "quality of life", but as quality of life improves humans become less autonomous. We've become lazy, obese, entitled, even selfish.
This whole year has been about AI. Where is all the focus on humanity and social services? Instead now privileged people can create our own videos via text, and poor people are still digging through the trash.
Ok. So now do something interesting with it. I can see this anytime.
How about them eating rocks.
READY TO USE FOR PROPAGANDA SLANDER IN 2025
This is completely useless, but very impressive
How come?
@@xviii5780 We're working on shovels, but once we can get digging, there will be an abundance of gold
How is it useless??? The lack of foresight people have is absolutely astounding. It's like when electricity was first demonstrated by deflecting a compass needle with a wire and people said "what good is it?". The difference is that back then, they didn't have a long history of breakneck-speed technological progress behind them.
It will save a ton of money in video production B-roll and VFX shots, and enable independent filmmakers to make big budget Hollywood style movies, but yeah it's totally useless. lol
You hit the nail on the head here!
Work on physics