OpenAI Sora: The Age Of AI Is Here!
- Published 15 Feb 2024
- ❤️ Check out Weights & Biases and sign up for a free demo here: wandb.me/papers
OpenAI Sora: openai.com/sora
📝 My paper with the latent space material synthesis:
users.cg.tuwien.ac.at/zsolnai...
📝 My latest paper on simulations that look almost like reality is available for free here:
rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations:
www.nature.com/articles/s4156...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: / twominutepapers
Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
Károly Zsolnai-Fehér's research works: cg.tuwien.ac.at/~zsolnai/
Twitter: / twominutepapers
#openai #sora - Science & Technology
History in the making.
certainly. this is crazy!
It's time to create true HD videos with UFOs and aliens from deep space.
Someday two minute papers will use AI to make their videos instead
@@frommarkham424 no, that would be horrible ):
@@frommarkham424 They're doing this already.
guess we are starting to reach that second paper down the line
😂 It's an infinite loop
“But two papers down the line…” 🤓
@@CatfoodChronicles6737 what's with the emoji
Two more papers, and we can make a 10 minute video
@@luiginotcool it's meant to either call the original commentor a nerd or show that what their comment says is what a nerd would say
in this case, it's meant to show that the quoted thing is what a nerd would say
One year ago, we had a problem with making hands look realistic in PICTURES
It's insane isn't it.
the problem is that DALL-E thinks fingers are a pattern; with video it's harder to recreate them as a pattern, so video is better for training AI
@@ZintomV1 insane that every artist, every film maker, every 3d animator, every writer and every programmer will all be unemployed within 3 years...
@@aegisgfx
Naturally, people are gonna be freed to pursue other opportunities which leads to newer purpose. Who knows what that looks like?
That is still an unsolved problem for local image generators
This statement on OpenAI's website gave me chills:
"Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI."
That is an insane thing to say. Overpowering humans shouldn't be the goal of the developers; AI should empower us!
Lol @@morrgash
@@morrgash get rid of your superiority over the machine. It doesn't exist
AGI?
@@SayedSafwan Artificial General Intelligence
Just the fact that the spots on the dalmatian were consistent was impressive.
The consistency is mindboggling
AGI is here, behind closed doors
Not exactly, look at the inner side of the dog's legs.
But yeah, still VERY impressive.
@@JuniorDjjrMixMods OK, I must have missed that. Still convincing enough to fool my perception at a glance.
The dalmatian is teleporting through a piece of wood to the next window sill.... video physics are nonexistent and depth really sucks, but the still images look nice.
It's not just how amazing all this is.
It's also how fast it is happening.
Its the speed that's worrying. We're well on the path to having the creative arts be replaced by AI, at least commercially. The actual labor people don't want to do is hardly any closer to being replaced. There's no hint of the government thinking about programs to financially support people who lose their jobs to AI. The rampant misuse or at least careless use of such a powerful technology is terrifying.
@@Fa1seP0sitive It's not the government's job to replace people's jobs when they lose them to technological advancement. It's up to the people that lose their job to either learn a new skill, or use the AI tools to make them more productive. This concept is not new. We used to have teams of men with shovels digging foundations for buildings. Now we have diesel-powered backhoes. We didn't need the government to pay shovel diggers or find them new work. They all simply moved on from shovel digging.
@@Fa1seP0sitive That is a grim future waiting for us unfortunately.
@@Fa1seP0sitive every artist, every film maker, every 3d animator, every writer and every programmer will all be unemployed within 3 years
@@aegisgfx False. Just stop
Quote from "i robot" movie in 2004 : "can a robot turn a canvas into a beautiful masterpiece?"
Then 20 years later in two minute papers : "AI can turn a prompt into a beautiful video"
Can a robot turn a canvas into a masterpiece? Yes.
Can a robot pick manure? No?
Your shovel is over there.
@@Rusu421 What are you talking about? A robot can 100% pick manure
Canvas has no inherent information in it, a prompt is a direction.
@@Rusu421Robots can already do that, and you don't even need advanced AI for that.
@@Obsolete386 it could, but it's much more expensive
I dont normally comment on videos but this is genuinely both terrifying and unbelievably fascinating.
You summed it up well, I hope that this development is only used for good things but I fear what it may be capable of.
@@enterchannelname8981 , sure! What technology has ever been used for evil?
Textbook example of exponential growth in technology. One day the AI is churning random blurry images that vaguely resembles the prompt and the next year it is producing realistic HD seconds-long videos.
1 minute long
@@VperVendetta1992 I highly doubt there's such a hard limit. It might get stuck on a motif that you specify should be in the video if you just tell it to continue producing frames. But it's not hard to imagine you can add or remove things from the prompt, continuing from the last n frames produced.
And next year it will make full length movies and series. Correctly recalling past details into future episodes. Consistent in imagery and story. Able to take on the most demanding prompts.
I know exponential growth is already nuts but this is genuinely exponuts.
@@2DReanimation they mentioned in the update that the first sora will be making videos up to a minute long
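The "continue from the last n frames" idea discussed in this thread can be sketched as a simple autoregressive loop. The `generate` callable below is a stand-in for a hypothetical video-model call, not a real API, and the frames are plain integers for illustration:

```python
def extend_video(frames, generate, context_n=16, chunk=8, target_len=64):
    """Autoregressively extend a clip: repeatedly condition a generator
    on the last `context_n` frames and append `chunk` new frames.
    `generate` stands in for a hypothetical video-model call."""
    frames = list(frames)
    while len(frames) < target_len:
        frames += generate(frames[-context_n:], chunk)
    return frames[:target_len]

# Toy "model": each new frame is just the next integer index.
toy_generate = lambda context, k: [context[-1] + i + 1 for i in range(k)]

clip = extend_video([0, 1, 2, 3], toy_generate)
print(len(clip))   # 64
print(clip[:6])    # [0, 1, 2, 3, 4, 5]
```

A real system would also have to keep the prompt embedding (possibly edited between chunks) in the conditioning, which is what makes adding or removing things mid-video plausible.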
The Enterprise holodeck is basically two papers down the line. Just imagine wearing a quest and ask *computer: Italian bistro but with a futuristic twist* and you are there
Thats what I am waiting for...
That's the E in STEM. Engineering breakthroughs don't advance as fast as the T in STEM.
That's actually possible now... Holy shit.
Startrek holodeck is the only thing I know will never exist.
It might exist in some sort of VR solution, but not in the fashion we see in Startrek.
Exactly what I was thinking.
AI is going to bring Star Trek to reality.
Gonna be honest: the fact that we can no longer trust images and even videos going into an election year is something I find terrifying
Interesting timing of release
@@FlamespeedyAMV not really, there are elections every two years, every year is going into an election year.
Morons didn't need fabricated videos and images to believe in bullshit, maybe it won't change a thing.
Nothing can be worse than the last 5 years with Biden and Trump. It can only get better
I don't think AI video will play a major role in the US election this year. The mid terms and the 2028 election is another story completely.
Wow, that is crazy fast advancement. It seemed like pure sci-fi just a few papers back.
Rich will become richer. We all gonna loose our job. Lets revoult against AI. Its unethical and we dont want to be cyborg.
@@mernkanthri3941 Technology isn't the cause of inequality. It's greedy humans.
@@FunFreakeyy and technology is a product of greedy humans. They are one and the same and will be until AI replaces us entirely, skynet will be real because we can't control ourselves
@@mernkanthri3941 You might lose your job because you can't spell or use proper grammar... it is LOSE not LOOSE and it is "let's" (a contraction of "let us"), not "lets" (same with it's/its).
@@mernkanthri3941 speak for yourself, I want to be enhanced
I can already see myself pasting all my favorite books and seeing them come to life.
By the end of the decade, we will be able to change the camera position in a generated photorealistic 3D world in real-time.
I am blown away by Sora. I had a small existential crisis today ngl. I KNOW that neural networks can THEORETICALLY approximate any function aka they are able to THEORETICALLY simulate anything, but I really thought Sora-level model is at least 5 years away.
I don't know how hard it is to merge this technology with Gaussian splatting, but it shouldn't be an impossible task...
@@maniccporcupine Yes it has already been tested out. Right now however it's static. Imagine if you have the movie playing and you can fly around all the action w/o regenerating the scene.
@@SiimKoger Mind bending. I've never had this thought before in my life. Unreal.
This is the beginning of the end of the human spirit, understanding of reality, and basic relevance.
With LLMs I could appreciate how they worked and how added complexity could produce the extremely impressive results we have been seeing. With imagery and video like this I can't even begin to understand what's going on under the hood. What basic elements it's building relationships between? In LLMs it's tokens, with text to video it's tokens to ???? .... It's like we've been given technology by advanced aliens.
They do it...
They do go into a bit of the architecture in the Sora announcement page. It's a transformer based diffusion model, where multiple frames go in at once, broken into patches that are analogous to tokens. The transformer compresses them into the latent space, then diffusion happens there guided by the prompt embedding. Then the diffused latent space is decoded into frames again by the rest of the transformer. They claim that by doing multiple frames at once, it improves the temporal coherence; that tracks with those results to me.
The temporal coherence from handling several frames together is a big step-up. @@IceMetalPunk
@@IceMetalPunk Thank you for your clear explanation, it's incredible how something like this can be achieved from chaining those ideas.
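The "frames broken into patches that are analogous to tokens" step described in this thread can be illustrated with plain NumPy. The patch sizes below are made-up illustrative values, not Sora's actual configuration:

```python
import numpy as np

def patchify_video(frames, pt=2, ph=8, pw=8):
    """Split a video into spacetime patches, the video analogue of LLM
    tokens. `frames` has shape (T, H, W, C); T, H, W must be divisible
    by pt, ph, pw. Returns an array of shape (num_patches, pt*ph*pw*C)."""
    T, H, W, C = frames.shape
    x = frames.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)  # bring the patch-grid axes first
    return x.reshape(-1, pt * ph * pw * C)

# 8 frames of 32x32 RGB -> a 4x4x4 grid of spacetime patches.
video = np.zeros((8, 32, 32, 3), dtype=np.float32)
tokens = patchify_video(video)
print(tokens.shape)  # (64, 384)
```

Each row is one flattened spacetime patch; a transformer-based diffusion model would then denoise a sequence of such patch embeddings, which is why coherence across neighboring frames comes more naturally than with frame-by-frame generation.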
I feel OpenAI has already achieved a form of AGI secretly (thus the recent firing debacle) and is releasing snippets to keep generating more money to become unstoppable.
This is insanely fascinating and scary at the same time. The grand age of AI is right on our doorstep.
Exactly, one could see a Dystopian Age of Deception coming, that's even way worse than it is already.
Or in the words of Neo: *"How do we know what is real?"*
every artist, every film maker, every 3d animator, every writer and every programmer will all be unemployed within 3 years
Nope, that won't happen! @@aegisgfx
Yes, I agree that AI will become even more advanced, but humans will evolve along with AI, working together and/or finding a way to be more important in their Job.
And the fact is that even though many, or heck even the majority, would tend to use AI to "do stuff", there will still be some groups that would prefer humans to do "their stuff" etc., meaning since you wrote *every* you can't be right. Those who become unemployed due to AI doing a "better job" might get together to do their own thing, and that will be good too. So they are no longer unemployed.
The fact that AI could be misused to create even more fake news could be a bit more alarming.
@TakahiroShinsaku Companies will just cut out the artists if they can use ai for cheaper
Yea sure, the greedy corps will do it ASAP @@irecordwithaphone1856, but not all would be that greedy. Some would even focus on artists that use their skills and enhance them with AI to save some time creating their work. You see, the current technology boom is kind of similar to the one we had from around 1760 to about 1820-1840; many feared losing their jobs to the "machines", but they adapted to their situation and work went on.
I do not think that companies such as Larian, CD Projekt Red or similar indie devs would cut their human artists out to make more money. ESG/DEI pandering corps will surely do it ASAP. In the end, AI might even be used to cut out some nasty work, to make it a blessing rather than a curse. Humans surely are very adaptive if necessary.🙏
Next, you have to imagine an AI that pairs with this to generate realistic audio that goes along with the video. Incredible!
Okay we are in that one paper which was going to be down the line
They should’ve shown another iteration of Will Smith eating spaghetti for comparison
100% better not be blocked by content policies
Love being a passive viewer with you and the community on this journey. Amazing to believe we may just witness the first forms of AGI in our lifetimes together on your channel. Can't wait for your video on that when it happens.
Pay close attention to AI news in September this year.
frankly it's a little concerning how soon AGI may begin to enter our lives - what comes 2 papers down the line for AGI?
Yup, we will share this experience together on this channel! Can't wait for the new robots and new LLMs to clash!
It is like technology right out of a sci-fi movie like Total Recall (1990). I never imagined that such sci-fi technology would come true in my lifetime.
Exactly! I very distinctly remember watching The Running Man as a kid and thinking "that's some BS" when they quickly synthesized a fake gory death clip when they just couldn't kill the real guy. Less than 40 years later, here we are. Unreal.
I expected this at the end of the year, not mid-February! Progress is so insanely fast.
I did not anticipate the quality of it, but I guess robots are still on the menu for the end of the year, hopefully?
@@and_I_am_Life_the_fixer_of_all I guess we will see advanced robots in maybe just 6 months, not even the end of the year .... The progress is making me scared. People and even I say that AI will help people, but the other side is that only people who own it will benefit more, and other people will become jobless, poor, everyone will struggle etc. Anything can happen, heaven or hell, no one knows
IKR? Even just the fact that we're measuring the time between these giant leaps in months instead of years or decades says a lot.
@@ainovice6634 ah, same thing was said about the industrial revolution and electricity and yet, here we all are
@@and_I_am_Life_the_fixer_of_all robots depend on robotics, which is not advancing as quickly as AI
This is the scariest thing; 2 papers down the line and we won't know what's real and what's fake
That's already true
It's already true. There are some channels on YT that have the AI do a good bit of lifting and you can't tell.
Social media is full of human made fake content for attention so what’s the difference?
@@favesongslist every artist, every film maker, every 3d animator, every writer and every programmer will all be unemployed within 3 years
@@aegisgfx That's partly wrong.
The producers, the CEOs, and the filmmakers who own a company won't be replaced.
They will replace most people below them to save money and release things faster.
Craziest moment in technology 🤯
Nah.
The craziest moment will be when AI powered robotics makes romantic partners redundant and a significant part of society as we know it crumbles and falls apart.
Just think of all those suffering husbands who haven't had sex in years suddenly free 🤣
@@mnomadvfx every artist, every film maker, every 3d animator, every writer and every programmer will all be unemployed within 3 years
Ray Kurzweil said it best Technological advancements will be moving so fast we won't be able to keep up, hence the singularity when we will have to merge with Technology just to keep pace.
I'm reading Age of Intelligent Machines right now, great book!
2 papers down the line!? My guess: Any video can turn into high-fidelity VR/AR stereoscopic media on-demand (3D). that or a real-time quantized version. hyperreality OTW🔥
I was thinking for game design you just block out a rough prototype and label all the objects, like this box is a "expensive high end car" then the ai generates the visuals for the game
@@peterwilkinson1975 Imagine how much crap we can generate! It's so cool. Now anyone with absolutely no skill and a hyper ego can feel like a creator. What a time
@@TheFeanture it goes way beyond the short term, in ten years anyone can build what is now considered AAA games. It all comes down to vision. If you’re an ego maniac or not. Technical knowledge of game making will become less of an obstacle to vision. I think this is the trend where technical knowledge will take a back seat to vision. If you have vision I think that’s the skill of the future.
@@peterwilkinson1975 it is so obvious that practice and hard work are not related to vision. Vision is all about "I am so great but technology is not there yet to show it". Good, we will all soon enjoy it at full scale, but not for long.
@@TheFeanture I just feel like so many people are living in the previous paradigm, where what is valued is based on it. Vision is hard, it’s why there are so few games like Elden ring, and so many like assassins creed. Most people just copy success so they can have the same. Where they are only interested in the end result. Like the “thing” is just a means to an end. Like clout or feeding the ego. The great people see that the “thing” is the end.
Happy to see all my years of hard work and learning were for nothing 😀. Imagine having to restart your career even before it begins! What a time to be alive 🤗
Robots are next
When I saw the Sora video yesterday, I couldn't wait to see your analysis. Thank you.
I mean soon enough we're going to be able to feed this thing a book and it'll produce a 10 hour tv show
Exactly. What I've been trying to explain has been coming for years now. What worries me is our politicians have no idea about AI :(
@@favesongslist what worries me is that they do
@@WALLACE9009 good, this needs to be stopped before it becomes skynet
@@WALLACE9009 my comment has been deleted :(
@@WALLACE9009 Sad they will not post my replies :(
The beginning of the end of many people's careers in the VFX industry
It's the beginning of the end of the humankind. The progress in AI will just get faster in future and it's already this fast.
Prompting and getting random results is not a silver bullet. Mastering AI tools requires similar knowledge to other software tools.
The end for people who can't learn to work with new tools, but the beginning for people who can be creative and use new technology to go further in creativity, quality, cost and efficiency
This could enable us to move the Hollywood movie industry back home to New Jersey, where it was first invented by Thomas Edison when he built the Black Maria studio. The big reason the movie industry moved to California was because California could offer more land for studio set construction, cheap labor, and year-round sunny weather for shoots. This video AI eliminates the need for all of that, as the AI can simply generate any world needed for the film virtually, within the storage capacity of its hard drive, which can easily fit in a single room. All with no need for set builders, and it can manipulate those worlds as needed without any need for stagehands. The AI can simply generate whatever weather is wanted for the shot on command, without any need for the right climate. And since all filming is purely virtual, there will no longer be those tragic accidents, like what happened to that cinematographer on the set of Rust, or Brandon Lee in the filming of The Crow. If anything, Hollywood will now have to move to New Jersey for the computer science and technology expertise needed to maintain and utilize all the computer equipment needed for this new virtual Hollywood.
@@alphastar5626 If you think that people in the VFX industry don't know how to use new tools you are completely wrong
I really appreciate that you explain the technical details a bit more in this video
So I've watched 2MP for years and this is probably the only time I've been really impressed. As someone who majored in graphics and AI and worked in development for 15 years (albeit 89 - '03) I have zero idea how this is possible.
I didn't even realize how insane this is. Holy flip dude. I'm squeaking like a little child. This is so amazing!!! What!
It's only a matter of time until an AI looks through Google Earth/Google Maps and a bunch of pictures and creates a virtual world based on our real one.
This tech is super impressive and I'm really interested to see how it works. Something weird I've noticed is how many tech bros are celebrating artists losing their careers; I don't understand what there is to be happy about or why they have such a big gripe against artists.
It's weirder considering what AI is doing to art it will do to tech jobs, too.
@@sciencecompliance235 Yeah AI is going to impact hundreds of thousands of jobs and entire industries
Maybe because a big part of the "art" made by those famous celebrities lately just sucks (Also, it seems that many artists are involved in a bunch of dark and immoral stuff).
To me, playing with AI even at its current level, is already much more fun than watching any Disney movie released in the last couple of years.
I would prefer to see many of today's "artists" go away and start seeing poor people with great ideas and imagination, but without much resources, becoming the next artists and influencers of tomorrow thanks to AI.
@@denisgabriel4645 Can you specify which "famous celebrities" you're referring to? The majority of the artists I follow are genuine people who love their job and enjoy making art. These "famous celebrities" you talk about are generally a loud minority amongst a group of hundreds of thousands of artists.
Also, the whole rhetoric of AI "helping poor artists" is simply untrue. Most artists started off with a piece of paper and pencil, items which are not expensive at all. Anyone can start drawing if they want to, even committing 5-10 minutes a day can drastically improve your drawing skills. Filmmakers can start with their very own smartphone or a cheap budget camera. Not only that, but being under a certain budget can actually be a good thing since it requires you to think outside the box, which can make the art even better. Point is, making art was never expensive to begin with.
This idea of AI helping poor artists is actually the opposite of reality. AI will only discourage young and aspiring artists from pursuing art because they're scared of how AI will ruin their entire career. And trying to force artists to use AI to "adapt or die" is wrong because artists enjoy the process just as much as the result. AI strips away that entire process, and even when generative AI improves and has a lot more control/fidelity in the future, the entire process is simply not enjoyable anymore, limited to nothing more than a few simple doodles/brush strokes followed by text prompts to enhance/change them.
It's cool to learn about how generative AI works and how these models function, but it's delusional to believe generative AI is a net positive for artists, let alone poor artists.
@@denisgabriel4645 AI can't be used by poor people, and most art isn't made by those celebrities, who won't even be affected. Also, these models are built on real artists, who are almost all middle to lower class and aren't earning anything from 99% of their hard work. When the algorithm lacks data, the AI starts to self-destruct, since AI works have awful quality and logic because it doesn't live in our world, which means it doesn't know what it is doing. Who really benefits from all this are big companies LIKE DISNEY that save pennies and speed up the production of even worse products that will flood the market, lowering the value of everybody's hard work. If you really liked smaller studios you would at least think about that. You lack sympathy because you are an a-hole and there is no rationalizing that lol
1:35 correction, DALL-E 3 is not the king of photoreal images, because it has a safety filter.
You can make actual photoreal people with offline stable diffusion based models.
But yes, having this in video format is a revolution.
It used to be able to do photorealism, during the first week after its release.
When I first saw these earlier today it made me realize what the first people to see a film were thinking. Absolutely incredible.
Can't wait to see what people create in the future.
yeah, because they also had no audio 🙂
I don't know, maybe video footage of me committing a crime that I never did and getting me arrested.
I needed your take on Sora today. Thank you so much for putting it up
I thought we would get there in 4-5 years, but not that fast !
It's exponential, I was expecting it in 1 year but yea we're very early
exponential technological growth
Singularity coming soon
But processing power is the slowest factor (and therefore, the most important one) here. For the moment when we can type a prompt and get a high quality video in a decent amount of time *using average hardware*, not paying monthly fees or per-prompt fees to anyone, I think it will be ~15 to 20 years from now. Even 30 more years seems possible. Why those figures? Just look at how the availability of "processing power" for average Joes (not rich/not so poor people) advanced through the years. It has been paaainfully slow, to my taste.
@@AetherStreamer the most important thing about AI for me is that it should run locally on consumer hardware, for one reason: censorship.
I don't understand this pace of progress. It's mind boggling how good these videos look.
yes a small number of people are being paid huge money to take all the jobs away from the rest of us, SO AMAZING!!!
these are likely just a few of the best results cherry-picked from many, many poor attempts. but still impressive to see what's possible!
There will come a time where you will HAVE to watch let’s play videos after you complete a game because every single play through is unique
This is definitely the best channel YouTube ever recommended to me. It happened 2 years ago I guess. And since then, watching these videos and seeing what is possible, even for me, who understands all of that, it's just mindblowing. I'm living through, witnessing a new technology revolution, something I just can't describe. Google has no chance against it and I'm happy it finally has a competitor. Even the other online software has no chance against it at all.
I live in Brazil and here it would be very expensive for me to pay to use ChatGPT 4. But after seeing that, maybe it's not that expensive anymore.
Next milestone will be to take a Functional MRI and show a subject thousands of images while the AI reads the FMRI data. Once trained, then the FMRI-AI can combine with SORA to read your thoughts and turn them into movies.
They're opening pandora's box with this
We are only a few years away from full length AI movies with virtual actors that appear just as realistic as current movies.
And that is an economic, moral and intellectual catastrophe
The speed at which this is developing is CRAZY
Impressive. Mentally it's difficult to keep pace. I keep thinking "ah, maybe in a year it will be able to do X" only to see it happening a month or 2 later. The animations you see on YouTube now are very very rudimentary compared to this, but were state of the art less than two months ago. All our ideas will be outdated before we have had the chance to ponder them or even enjoy them, in a sense. I think we should look into Artificially Generated Money now, because the entertainment industry will not need human beings anymore soon. These are image-related developments, but audio will probably also move really quickly. Being a human-based creative will be something special in the near future.
So agree. I tried to explain what was coming to so many people, and like our politicians they just do not get it :(
The problem is the lack of control in these AI image and video outputs; you can't describe by words exactly how something should be. More complex tools are to come. It's also going to be a lot of different tools, which means it's not something everyone can do. Creative people will always find new ways to use things. The more you use AI tools, the more you see all the flaws and problems.
@@hombacom AI can already self-prompt, right? Or at least something like ChatGPT can suggest effective prompts to reach a certain goal (I have no experience with the accuracy of these suggestions; at this point they are probably not reliable enough yet). So even those tasks will eventually be things we can leave to AI.....
@@RoryRonde self-prompting is just nonsense. AI doesn't understand reality; we feed and tag it, it's biased, and we create no perfect software. We are already tired of the current AI and want the next version; that says something about how long AI lasts before humans need to improve it
How is this even possible? It blows my mind!
This is going to open the creative door to literally anyone! Ever had an idea for a video but just don't know how to make them? Ever started to look for specific clips online to help you illustrate and visualize your idea but come up empty handed and just give up on it? Ever wanted to make a video game but just never had the time to learn how to model textures and animations? I never thought something like this would be possible in my lifetime. What a time to be alive!
I knew it was coming for a while. I was following this channel for years and saw this progress. For me the first real bell was AlphaZero in 2017, an AI that learned to play Go at a superhuman level without human data. Since then it felt like it was a question of time to find ways to do similar things for everything else. Now 3-7 years later it's happening for text, images, audio, video, physics simulations, and math and coding too... It's very hard to say where we will be in 7 years.... It does feel like AGI is very close... Less than 7 years... Some optimists think it's this year... Probably too optimistic... But sheesh, who knows...
Holy mother of papers, indeed! Károly, you gave me goose bumps. Not the first time by any stretch, but probably by far the most intense!
It would be interesting to talk about exactly how much "compute" is required. If I wanted to make 90 minutes of video what would the cost be relative to a Hollywood budget? Will this technology be within reach of megacorporations only, average corporations, ordinary people or the developing world?
Mega coroporations have been using this for years, we are getting the breadcrumbs
Great question
Stable Video Diffusion can be run on as small as 4GB of VRAM at its worst.
It won't be long before Two Minute Papers will be replaced by the very thing he makes a living discussing lol.
@@quad849 You say that as if AI won't be better than everyone regardless of what you do, or how good you are at what you do. AI and robotics will eventually be better than the best of what humans are even capable of, for any job...
2 more papers down the line, Dr. Cylon releases the last paper of mankind UwU
Definitely a moment in Media History. Great video learned so much. Thanks.
I'm just grateful that they solved the Cronenberg body horror bugs that plagued earlier versions of AI art.
Can you imagine when you feed a complete let's play of a certain game to Sora to generate a highly detailed and realistic version. For example GTA4. Then use all the new footage to feed another algorithm from NVidia so you can use it as a shader. Making it possible to have extremely realistic textures but with a low cost in compute because it's just an overlay. With this technique we would be able to change the output from old games but also new games completely. We would only need a basic framework and can use a shader for the graphics.
Did you miss the part where it mentioned how much computational power you'd need for this?
At first. But this is what you would run offline obviously. Just like the old DLSS models you (Nvidia or the developers) would generate a big dataset and this dataset is used for the shaders. So yeah, this would need lots of compute but only to build the dataset.
@@diridibindy5704 We also had a lot of noise of people saying LLMs need insane computing power and now I can run it on the linux terminal of my 2016 laptop lol
I imagined this some time ago: the new AIs will simply enhance whatever the 3D part of the GPU renders. It doesn't even have to be very realistic; the AI will do that part
@@diridibindy5704 did you miss the part about looking 2 papers down the line?
I want to see this make existing games look different during runtime! It would work almost like an overlay that makes it look different
“Computer, make legend of Zelda have a blade runner aesthetic”
But genuinely, I want this to happen! Imagine playing old N64 games/other games with a whole new generation of art. It would be so amazing. I wonder if it is possible for sprite work as well@@neighbor9672
Good job, you are improving your speed of release -> video cycle 😂
Nice adaptation to the singularity!
I’ve run out of superlatives for how impressive this all is. It’s hard to comprehend, but this is what exponential progress looks like! Be prepared for full-length AI generated films by next year (or sooner).
There is a near-infinite amount of boxes. Each subsequent box is harder to reach and harder to crack. The boxes contain useful tools for reaching and opening boxes, and each subsequent box contains more potent tools and can be opened with the tools from previous boxes. There is no closing the boxes.
Somewhere down the line, there is a box that contains a little robot that can open boxes on its own and which can reach new boxes faster than any human.
Yeah, Pandora's box.
Prompt: explain AGI using Pandora’s Box and Russian Dolls
@@anon5704 that's very interesting that your mind went there too, I started out the comment with boxes inside boxes as well, but I decided to change it
@@RazorbackPT Oh, boxes can be closed, just ask the Aztecs
3:30
I know it's astonishing and I searched for errors.
But I love how the woman in the background is a giant walking on the river or floating in the air. 😉
other than putting everyone out of work, what is this technology good for?
@@aegisgfx Giving more creative people a tool to turn their ideas into a movie without needing CGI, expensive cameras, or software like Premiere Pro or DaVinci Resolve
@@aegisgfx Creating our own entertainment without anyone shoving their ideas on us?
@@huyked I'm fairly certain the fascination of it would die off rather quickly, as seems to happen with the text-to-image scene. Of course there are many who keep being super hyped about it and that's fine, but most are not. Most will not be able to use it for their own entertainment to begin with due to the computational requirements, and this tech will just cause suffering in the form of propaganda, scams, cost-cutting and such.
@@aegisgfx The unemployment thing is absolutely a huuuuge concern (hope we don't starve), but you're a luddite. At this rate we could probably cure aging in a few years. If current tech can already do so much for entertainment and media, which were long thought to be out of machine reach, imagine what else could be accomplished.
What in the actual... that looks so good it's freaking unbelievable!
This combination is perfect: Apple Vision Pro (similar or improved) + AI (creation of a world) + time dilation (an AI that builds you a virtual world where every second of physical time is a year inside that world) = eternal life
This is breathtakingly magnificent
I would love a video on how AI has exploded so suddenly and how such amazing progress is being made so quickly. What new technique or algorithm or model is making this possible. Especially with Sora with such an unbelievable jump in quality.
Here's a rough timeline to catch you up:
2012: Deep learning is resurrected as a field of interest, due to success at image classification. Fueled by availability of cheap computing power (GPUs) and huge amounts of training data on the internet.
2017: The transformer model architecture was invented. Foundation for nearly all recent AI advances.
2018: GPT-1 was invented for text generation. Used transformer architecture.
2019, 2020: GPT-2 and GPT-3 showed that better quality could be milked from transformer models with just more parameters, more compute power, and more data, with no upper limit in sight, rather than waiting for a true advancement in AI theory.
2021: CLIP and DALL-E are released. This was the missing link between text and images. Now text can be mapped to image features, and image features mapped to text.
2022: Rudimentary image generators begin popping up like Disco Diffusion, with a cult following of artists and tech enthusiasts constantly testing and refining. DALL-E 2 is announced. Midjourney is released on a paid subscription. Stable Diffusion is released for free, with little concern for potential misuse. An AI-generated image wins an art competition. Artists on ArtStation stage protests over the platform's AI policy.
Late 2022: ChatGPT is released, becoming the fastest-growing app of all time. Not a new technology per se, but rather a modified GPT-3 (from 2020) using reinforcement learning concepts to respond to commands better.
Late 2022: the success of ChatGPT causes "Code Red" at Google. Other major tech companies panic as well and allocate vast funding for AI technologies, for fear of being left behind or made obsolete.
2023: With so much funding and corporate backing, the era of generative AI begins in full. Every major tech company announces their plan to incorporate generative AI. Thousands of startups are launched using the ChatGPT API. Will AI tech fail to deliver to investors or lose public interest, causing yet another AI winter? Or is there enough momentum that major advances are inevitable? Only time will tell.
(plus a ton of other stuff along the way too, like AlphaFold solving protein folding, AlphaGo defeating the champion Go player, recommender systems becoming ubiquitous, hype around self driving cars, models/datasets made available for free by researchers and hobbyists, vast amounts of free data scraped/stolen from social media, deepfakes, etc. etc.)
Have a look at Moore's law and then remember that the budgets are so big that we are going multiple times faster than Moore's law.
Also Moore's law is exponential
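As a toy illustration of what "exponential" means here (assuming an idealized two-year doubling period, as in the classic statement of Moore's law):

```python
def moores_law_factor(years: float, doubling_period: float = 2.0) -> float:
    """Growth multiple after `years`, doubling every `doubling_period` years."""
    return 2.0 ** (years / doubling_period)

# A decade of 2-year doublings is already a 32x improvement...
print(moores_law_factor(10))                        # 32.0
# ...and doubling every year instead ("faster than Moore's law") gives 1024x.
print(moores_law_factor(10, doubling_period=1.0))   # 1024.0
```

The gap between the two numbers is why halving the doubling period matters far more than any one-off speedup.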
The only channel I watch regularly about AI content, you do it best !!
It’s terrifying at so many levels. The fact that it’s happening so fast, that it gets even more difficult to understand what is happening every day and that it can produce results better than an editor in a basement!
The age of observable reality is nearing its end.
Digital media will become completely untrustworthy.
Go outside
@@bbokser Go to your workshop and work on your tools, wouldn’t that be fun? Smartass
Through a screen that is
Well that's just nuts! A few years ago, I speculated that one day AI could create entire movies - script, camerawork, voices, the whole thing - but I imagined we'd be well into the 2030s before that would happen.
It looks like we're getting pretty damn close already!
I wonder if studios like Disney, Paramount, and Universal will try to somehow stop all this tech from actually being available to the general public. Kinda like the oil industry suppressing alternative energy designs or buying up the license rights for technology.
@@Jacen777 They really can't, although Disney has their own robotics and AI development that will likely always stay private. Big studios will however use this kind of technology and probably have more control over it. AI voiceovers are already quite advanced and likely soon many voice actors won't be offered work anymore. Although "non-AI" movies may also become a marketing term and boost sales, as many customers will prefer "the real deal". Others won't likely care.
@@Jacen777 It's most likely the opposite of what you think: the executives of those studios will want this tech to improve so they can replace more artists in an attempt to reduce the cost of making movies. What they'll realize though is how much better human-made films are compared to generic AI slop.
@@TheCreativeNick Seeing as people already like "slop" from the music industry, I think you are over-estimating how high the quality bar is for the general public. Actually I'm looking forward to it - we won't see so many millionaire runway coat-hangers, so-called actors, and auto-tuned pop stars any more.
@@DeceptiveRealities Trust me, AI won't be raising the bar on anything. It'll only be as good as the music artists it's trained on. You're right, there are a lot of really bad songs out there right now, but the idea of the majority of music right now being "runway coat-hangers, so-called actors and auto-tuned pop stars" is completely false. There are thousands of artists out there, big and small, that are not like that at all. The entire music industry doesn't just consist of "mainstream" music artists; you need to widen your range.
Was really waiting for your take on Sora
I am running out of words to describe how amazed I am with these advancements. There are only so many adjectives.... 😁
Time for you to invent more adjectives. :D
One of them is cataclysmic?
Get it from chat gpt😂
We will be able to create interactive movies! Where we put VR glasses on and say: create a Harry Potter Hogwarts tour where I can talk to the characters and play games with them, even talk to them about what it's like to be in their world? We will 100% have people dating characters in alternate Matrix-like universes with ease by 2025. MAYBE even earlier.
Notice the lack of humans interacting and talking to each other? Ever heard of uncanny valley?
2025 is a bit wildly optimistic
@@zubinkynto let's meet again in a year
@@gxrsky I mean, I hope it does!! I just think that's too soon for what the OP described
@@hombacom Just a matter of time. Hell... not a couple of years ago we had text-to-scribbles, and now look at what there is.
The part about gaming is still a dream to behold. Rendering is easy to do, but the whole point of a game engine is real-time rendering, which is on an entirely other level.
Having real-time AI video rendering is not something imaginable in the near future: it would require a level of precision that still isn't possible (like snipers in an FPS), rigid-body issues would happen everywhere, and considering the GPU power needed for such an AI, having it render instantly on a local computer is just fiction at this point lol
How about it just spits out code to run a game? It creates the models, textures, audio, and logic. Maybe we can have a super simplistic, basic, and fast rendering engine that outputs to a real-time AI enhancer.
@@ranguy1379 That's a much better idea! Like Photoshop, having the AI speed up the creation of assets and such. That would be revolutionary tbh
We also thought AI generating video that you could hardly discern from reality was fiction. Brother, nothing is fiction at this point.
That's easy. People are ALREADY turning Sora footage into real-time NeRFs that you can walk through in VR. It doesn't have to do it all in real time, because if the person is walking about, it just has to generate what they see where they are and build it up as they walk along. Just like games like GTA do now.
I remember people saying the exact same thing when it came to these AI videos... "there's no way we will see HD quality within a couple of years, it's too complex", yet it has arrived.
Two more papers down the line you would be able to generate a short film containing several consistent shots, all telling a consistent narrative. Can't wait!
I've been waiting for this for some time mainly for movies and games but possibilities are basically endless
This was posted at the tech company I work for. Very impressive
and right after that everyone went back to their desk to get their resumes ready?
This is the future movies in the 90s were showing us.
Bravo! When I saw the OpenAI announcement, I was wondering how long it would take for Károly to talk about it, and how excited he would get. 😂
What a time to be alive, indeed!! This is so mind-blowingly powerful that I think it will take a long time before safety testing has been completed. However, I really hope OpenAI keeps trying to improve this model in parallel with safety testing so that, once it is released to the broader public, we will get rapid improvements in quick succession. Also, this is the first application of generative AI that I think will, in the very short term, result in a non-trivial number of job losses. Therefore, I think OpenAI has a responsibility to mitigate this by making it as easy as possible for people to use, including providing in-depth prompting guides.
As technologically incredible as this is, I find it terrible how it is met with such seemingly blind enthusiasm. This has massive destructive potential, and I feel like we're going to see exactly that in full swing.
As if anything can stop this. VFX is incredibly expensive and the movie industry is a multi-billion-dollar affair. This reduces the cost dramatically, and you can bet companies will gladly pour money into this tech.
I.... am... speechless....
OK, I am genuinely happy I can see some parts that are not perfect, because it's that close to perfection
The implications this has for society are impossible to overstate, and the ones that matter the most are the most negative implications.
I’m genuinely becoming more scared than excited of the future.
I remember seeing the first of these, Will Smith eating spaghetti lol. This is far beyond that.
Yes, it's all funny till AI takes all our jobs and we all starve to death, LOL
@@aegisgfx The only thing certain is that it'll get a bit worse before we get to that next golden age of technology. Current times are not so bad, but it could be so much better, like universal income and having every person's basic living needs guaranteed. That's what we lack at the moment. Some 1 to 2 billion people struggling to have their basic daily needs met shouldn't be a thing in the 21st century.
I remember playing around with Dall-e mini just a few years back. How far we’ve come is astounding.
In a few years I'll use this to put all my thoughts into movies and convey them to the rest of the world, once AI-generated actors are on the level of good human actors.
What a time to be alive!
So it begins
I am 100% certain that humanity will use this techonology for personal enlightenment, culture, and the highest human endeavors.
My heart leaped at the possibility of finally being able to walk around in a "simulated" Japan -
A world where I can visit shops without worrying about my anxiety, or money, or a language barrier, and people actually smile at me o.o
We are so screwed. There are enough agents looking to do wrong to someone else, the amount of propaganda that will be automated with this after an open source version gets just as good will be terrifying. Videos can now be dismissed as AI forged no matter what too.
One of two things (or both) will happen: human curation and verification as it used to work, or cryptographic proof from the equipment manufacturer that it was filmed with their equipment and not altered.
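A minimal sketch of the second idea, signing footage on the capture device (all names hypothetical; real provenance standards such as C2PA use public-key signatures embedded in metadata, while this toy version uses a shared-key HMAC for simplicity):

```python
import hashlib
import hmac

# Hypothetical secret baked into the camera hardware.
DEVICE_KEY = b"secret-key-burned-into-camera"

def sign_frame(frame_bytes: bytes) -> str:
    """Camera attaches this tag to each frame at capture time."""
    return hmac.new(DEVICE_KEY, frame_bytes, hashlib.sha256).hexdigest()

def verify_frame(frame_bytes: bytes, tag: str) -> bool:
    """Anyone holding the key can check the frame was not altered."""
    expected = hmac.new(DEVICE_KEY, frame_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

frame = b"\x00\x01\x02"        # stand-in for raw image data
tag = sign_frame(frame)
print(verify_frame(frame, tag))            # True: untouched footage
print(verify_frame(frame + b"edit", tag))  # False: tampering detected
```

Any edit to the pixels invalidates the tag, which is exactly the property a "this was filmed, not generated" claim needs.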
I think two more papers down the line will basically give us complete directorial control.
- World modeling to the point where scenes have exact mathematical reference points when possible, and camera prompts can navigate them in a precisely specified manner (ex. swing camera 3 meters to the left and 2 meters upward while tracking character X, do so over 4 seconds starting at the 20 second mark).
- Precise character and object generation where each independent object has their own seed and can be inserted into any video and modified / animated independently. We will have the ability to develop variations of these items separately and choose which one to add in a scene when needed (AI will adapt object / character to fit scene during merge).
- Audio generation will be added that matches the scene and can be modified in a similar fashion to objects / characters. (ex. add a coarse voice to character X and have him yell "watch out" to character Y starting at the 25.5 second mark).
- Existing video merge will improve and solo actors will be able to record themselves acting and import their performance into an AI tool that converts them into an independent character that can be modified by the AI prompt. The performance will be retained but the visual look of the character can be changed and even the voice. This gives a single actor the ability to create an entire multi-character video with acting depth by themself without the viewer knowing.
- Videos will be capable of reaching up to 5 minutes, making them 100% viable for pretty much any film/tv work.
- Human facial lip syncing to pre-recorded or generated audio will be added, making localization easier and allowing creatives to fully direct a scene without actors (although actors may provide better video of their performance to get the really emotional up-close shots).
So many more existing or future potential improvements can be made to this tech. It truly is the beginning of an entire renaissance of entertainment creation and independent creatives.
Similar to what you're saying, I think we will have AI-generated photo-realistic interactive 3D environments from text prompts very soon.
This is insane! Imagine how good it will be when physics simulations get better!
The only thing that helped me consistently identify that the videos were AI generated were the wacky physics... when that's fixed this will be indistinguishable from a real video to most people!
What a time to be alive!
Nah, I would rather be a medieval knight. Typing a word salad to get stuff done is boring and soulless; it makes me depressed that it might become the standard way to do things. Really making stuff is fun, you know?
This is FANTASTIC... really impressive, I love it.
If it can take an existing video and transform it to whatever you ask, this means one could be able to make a very simplified animation of whatever and render it to real-life graphics. That's f**king nuts!
it's already done in images
Very excited to have this open source
They probably won't make it open-source
@@devrim-oguz They won't, but the OSS community will catch up in a year, give or take; the only issue is that it'll only be available to people with enough computational power
I don't want to lose my job. I worked so hard on my skills for years and I love what I do. That's it, that's my take.
The guys who made this at OpenAI must be proud of what they're achieving. They're the pioneers of the future