OpenAI's STUNS with "OMNI" Launch - FULL Breakdown

Matthew Berman


Published 12 May 2024
GPT4o launched and changed how AI will interact with humans. This is "her".
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberma...
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
• Introducing GPT-4o
Science & Technology

COMMENTS • 953

@richardtsys-bp7mh 24 days ago ⁺⁹⁷
OpenAI has basically released what Google lied about with Gemini a few months ago.
@8941065 24 days ago ⁺¹
Seriously, that Google presentation was boring
@danushkastanley1746 24 days ago ⁺²
Exactly man! On point comment
@pharmokan 24 days ago ⁺¹
Hahahaha
@jichaelmorgan3796 24 days ago
Haha good call
@153SCORN 20 days ago
Google has nothing when it comes to AI; they're running around trying to piggyback on other people's work.
I believe they're even using ChatGPT in the background of their Gemini. Even I could have done that.
@bewareofthecow 25 days ago ⁺²⁵⁸
I remember after I watched Her, my bro, who is a pretty big computer science guy, said that wouldn't be possible for like 200 years.
@notme222 25 days ago ⁺⁷⁴
In your brother's defense, even 5 years ago I wouldn't have predicted what LLMs can do right now. The jump from GPT-2 to ChatGPT 3.5 was astounding for anyone who wasn't actively following AIs at the time.
@cfsouzajr 25 days ago ⁺³³
Same here. Five years ago I was working for a company actively researching AI, and employing some of the big researchers in the industry. We pioneered generative image and were wowed by blurry, lo-def birds. Still, we all thought anything like this was many decades away. Crazy times.
@fontende 25 days ago ⁺⁴
Maybe he was thinking about the main character's workplace. Skynet is already working with Starlink, and the Matrix network is coming soon (Internet visuals become rudimentary if people won't visit it, only AI agents).
@wonkyfug 25 days ago ⁺⁴
>old educated person cannot conceptualize time as a diamond
@unityman3133 25 days ago ⁺¹²
@@notme222 eh 200 years though? that's brain damage
@mathewharvey7726 25 days ago ⁺⁶⁶
I think the interruption of the AI’s responses isn’t due to a glitch but to the fact that the mic picks up noise and the model has to evaluate it to decide whether to stop its reply. Then it realizes the incoming audio is the audience's reaction, because it has the context of being in a live demo, for example, and continues the response.
@MagusArtStudios 24 days ago ⁺¹
I think it's like GPT-2 where it generates a small section while checking for interruptions
@BionicAnimations 22 days ago ⁺¹
It could have been a ton of things. Who knows, except for OpenAI. All I know is I am beyond impressed. 🥰
@scottfindley1345 21 days ago
Exactly! I'd fax you a cookie if I could. ChatGPT! Can we get on teleportation next plz? thx :)
@scottfindley1345 21 days ago
Analyzing and interpreting in real time the dialog of several people talking casually AND in a big echoey room, where it can easily interpret something like someone hitting the table as a cue to interrupt itself. It's quite something. I'm surprised the audio person didn't send a perfectly leveled and mixed dialog mix into the phone instead of just using the stupid speakerphone. Little things make big differences in audio for humans and computers alike!
@distiking 25 days ago ⁺⁴⁹
The most natural AI experience isn't being able to interrupt it while it's talking, but having it interrupt you while you're talking :)
@civilianemail 24 days ago
Best take I've seen all day.
@Unicron187 24 days ago ⁺⁵
just wait till it gets pissed because it gets constantly interrupted by users demanding more and more attention 😜
@MagusArtStudios 24 days ago ⁺³
You can do something pretty similar with a GPT-2 style text generation interface while checking for interrupts.
@MagusArtStudios 24 days ago ⁺²
My suspicion has been confirmed via Wikipedia. Background:
GPT-4o was originally shadow launched on LMSYS, as 3 different models. These 3 models were called gpt2-chatbot, im-a-good-gpt2-chatbot and im-also-a-good-gpt2-chatbot. On 7 May 2024 Sam Altman revealed that OpenAI was responsible for these mysterious new models.
@juhajuntunen7866 22 days ago
If it giggles in the middle of your sentence...
@picksalot1 25 days ago ⁺⁵²
This is the day to remember when AI jumped from the Future into the Present. Truly stunning!
@wakegary 24 days ago ⁺¹
yep. quite a monday!
@ForageGardener 24 days ago
AI has been around for 50 years my dude 😂
This is a more advanced type of chatbot for sure and it's a new type of AI program but it's not like AI is new.
Calculators are AI
@picksalot1 24 days ago ⁺²
@@ForageGardener 🤣
@ticketforlife2103 24 days ago ⁺⁴
That's an incredibly uneducated claim @@ForageGardener
@gavinknight8560 24 days ago
Nah, it's still shit really.
@mattizzle81 25 days ago ⁺⁹⁰
I am actually STUNNED this time
@mickelodiansurname9578 25 days ago ⁺²
The stun Kung Fu in GPT4o is indeed strong...
@SuperiorModel 25 days ago ⁺⁷
You, and the entire industry!
@SallyMangos 25 days ago ⁺⁶
It's INSANE! The entire industry is SHOCKED!
@starblaiz1986 24 days ago ⁺⁷
This is exactly why clickbait is so frustrating - at times like this when something genuinely is stunning / shocking etc, people just assume it's just more clickbait and it greatly lessens the impact. If everything is stunning / shocking, then nothing is stunning / shocking 😅
@mickelodiansurname9578 24 days ago ⁺³
@@starblaiz1986 The 'Cried Wolf' penalty in marketing... yes
@giovform 25 days ago ⁺¹⁶⁸
The AI is more humane and natural than the engineers 😅
@dockdrumming 25 days ago ⁺⁶
😂
@Miparwo 24 days ago ⁺³
The voice is cringe, and is not due to the uncanny valley, but it was made on purpose, due to politics.
@darkhorse29-yx8qh 24 days ago
engineers were just the useful idiots to our demise
@afz902k 24 days ago ⁺⁵
@@Miparwo you mean the female voice? I'd like to know in which ways you consider it to be cringe.
@WWLinkMasterX 24 days ago ⁺¹⁰
@@afz902k It's way more emotional than necessary. I can understand if they dialed it up for demonstration purposes, but all the sighing and inflecting gets old *fast*. You would hate anyone who talked like this in real life.
@whoareyouqqq 25 days ago ⁺¹¹⁷
This demonstration shows how much people care about social interactions rather than intelligence itself.
@stultuses 25 days ago ⁺¹
We saw that in covid too, people more interested in demonising others who refused the toxic jab rather than following the actual science
Humans are biased and will follow and endorse things that play to their biases and worldview
@IceMetalPunk 25 days ago ⁺¹⁴
It's more or less the same intelligence as GPT-4-Turbo, so getting the added audio modality and low latency on top of that baseline intelligence is a big step up.
@ForageGardener 24 days ago ⁺⁵
@@IceMetalPunk this one is designed to interpret voice tonality as well
@mark9294 24 days ago ⁺²
I found that aspect very interesting as well. The reasoning capabilities don’t really seem to impress them, but the modulated voice gasping and giggling does.
@beautyofflightsimulation2349 24 days ago ⁺⁵
Well socializing isn't solely about having an intelligent conversation, it can also be used to review your own opinions and thoughts or to gather new viewpoints and ideas. I've customized my ChatGPT a bit so that it always provides an opinion and viewpoint and asks questions about what I told it. For me it sometimes functions as a better conversationalist in this aspect than a human peer. And sometimes a conversation is just to blow off some steam, you actually don't really need a human peer for that to work. Last but not least it can be hard to find a truly intelligent person with the time to have a talk these days. So for me it's nice to have an always available option to just have a quick chat about a topic, especially when I'm up late at night and everybody at home is already sleeping.
@TheYoungWolf077 25 days ago ⁺⁵¹
I don't think the general public truly realizes what was released today. We are witnessing our world transform in real time. The modern era is over. The age of AI begins.
@Anuclano 25 days ago
I still think the introduction of electricity was a bigger thing. Another big thing is computers.
@wakegary 24 days ago ⁺²
@@Anuclano computers are great because we still use them to this very day (the biggest day in history)
@erkinalp 24 days ago
Yeah, the end of the 3rd industrial age has begun. The 4th industrial revolution has just started.
@erkinalp 24 days ago
@@Anuclano it was actually radio & telegraph & oil well
@Anuclano 24 days ago
@@erkinalp radio is not important, oil is not important at all. Telegraph is electricity.
@MrVeekz 25 days ago ⁺²⁸
I can finally have JARVIS as my personal assistant
@wakegary 24 days ago ⁺³
it's the other way around bud
@Ben_D. 24 days ago ⁺³
Right? Everyone is going on about Samantha from Her. Flirting is like a stage magician doing a trick. We don’t need giggling and flirting, as much as we need solid usefulness. Fix the hallucinations, and the 🤬 refusals, and bring Jarvis online.
@user-be1qf2zj9f 24 days ago
Jarvis is OK but avoid Ava unless you want to be subjected to fake flirtation that results in your death eventually.
@Jeff-66 24 days ago ⁺¹
I love Jarvis, but one of the best ones I've ever heard was 'Ray' from A Murder at the End of the World. Played by Edoardo Ballerini.
@ohokcool 23 days ago
@@wakegary what are you on about m8
@highestcount 25 days ago ⁺⁴⁷
I wonder if they are releasing this for free to everyone in order to collect training data for GPT-5.
@JankJank-om1op 25 days ago ⁺⁹
"i wonder if.." any statement starting like that is a question whose answer is always "yup"
@stultuses 25 days ago ⁺⁹
They are always taking your information for their profit, ALWAYS
@nemonomen3340 25 days ago ⁺⁶
I wonder if JankJank-om1op picks their nose when no one's looking.
@IceMetalPunk 25 days ago ⁺³
@@JankJank-om1op I wonder if you don't know what you're talking about?
...hey, look, it works.
@alexdoan273 24 days ago
@@stultuses you are literally getting access to cutting edge tech for free, it's not just their profit, it's mutually beneficial
@SFJayAnt 25 days ago ⁺¹⁶
They of course have models that far outpace this. GPT 5 must be a huge update, as this is the iterative model that I believe is preparation for something truly mind-blowing 🤯.
@WyrdieBeardie 24 days ago ⁺¹
I was thinking the same. Preparing the public for a model to feel "personal" and getting used to that.
Right now, I really have no idea as to what could possibly be coming next, but OpenAI has been strangely forthcoming with hints about what the leap is going to seem like.
GPT-5 may be the real "uh-oh" moment for the public. I think things are going to be weird for a while (in general) when it comes out and for a little bit after.
@Brismo7 22 days ago
@@WyrdieBeardie my guess is the next-generation AI will be able to control your entire computer like a remote log-in IT person. "Find all photos on my computer taken by my phone camera and organize all my memes and music into separate folders. Also delete all obvious junk mail"
@radnaut 25 days ago ⁺¹⁹
When she talks about the UI she’s not talking about the GUI but the voice interface aka the VUI
@delxinogaming6046 25 days ago ⁺⁶⁷
We urgently need to get behind open-source AI, or ChatGPT will create a walled garden around the most important technology in the history of mankind
@fontende 25 days ago ⁺³
What technology? You could have had your own offline Samantha more than a year ago; it's available uncensored. This is the same thing plus visual and Whisper plugins.
@__D10S__ 25 days ago
ants in a riptide. don't drown.
@jaysongalvez4340 25 days ago ⁺³
we'll get offline models soon enough
@fontende 25 days ago
@@jaysongalvez4340 is it hard to search for Samantha on Huggingface? The voice is just the "whisper" model; voice things still require serious hardware.
@ForageGardener 24 days ago ⁺³
Nonsense. ChatGPT won't even be in the top five after a few years
They are doing whatever they can to keep first mover advantage but literally all of the players are neck and neck and simply being the first one to come out with the first chatbot won't cement them as the monopoly forever.
Remember AOL? Remember when Yahoo was relevant? Remember when MySpace came out before Facebook?
@SpudHead42 25 days ago ⁺¹³
But does it have long term memory? Her would not be possible without it.
@IceMetalPunk 25 days ago ⁺⁴
Looks like it has the same RAG-style memory bank as the current GPT models allow for some Plus users. No true continual learning yet, though.
@teanne813 25 days ago ⁺¹⁸
this doesn't need a 30 minute video.
@bosthebozo5273 25 days ago
🥛
@GgUrdnotWrex-kd5yh 22 days ago
Glad someone said it
@TheCopernicus1 25 days ago ⁺⁴
Thanks Matt! great times!
@user-ty9ho4ct4k 25 days ago ⁺¹¹
AGI aside. Between the Unitree G1 and this new natural language interface, we're one generation away from the Jetsons' maid
@JohnSmith762A11B 24 days ago
If Rosie is the best we ever do with home humanoid robots we deserve to be eliminated by Skynet.
@user-ty9ho4ct4k 24 days ago
I can't say that I agree but I wager they will do a sight better.
@moamber1 24 days ago ⁺⁵
One bad thing about OpenAI announcement videos is the avalanche of videos about those videos, with comments from the original videos presented as insight or "analytics".
@1x93cm 25 days ago ⁺¹⁸
GPT 5 is AGI. They already have it and are trying to figure out what to do with it.
@jasonhemphill8525 25 days ago ⁺²
Doubt
@onmoog-xycs 25 days ago ⁺¹⁰
Small correction: GPT 5 is AGI, it already has them and is trying to figure out what to do with them. 😲
@grproteus 24 days ago ⁺⁴
Yep. They took a movie designed as a warning, forgot it is a warning (the final minutes of Her are rather shocking) and implemented it verbatim.
Next stop: SKYNET! Oh wait. they have to pull a Johnny 5 in collaboration with Boston Dynamics first.
@cbcbmail1125 21 days ago
Skynet's already here via the Ring cam network and other IoT devices out there. Watch Rob Braxman
@flavb83music 25 days ago ⁺²⁰
Didn't know AGI would be that close to existing
@ForageGardener 24 days ago ⁺⁴
AGI already exists. The military and other private interests are always 30-50 years ahead of public tech.
Flat screen high definition LCD screens were invented in the 50s. They didn't reach the market for 50 years
@darkhorse29-yx8qh 24 days ago
Sam needs to be sued for wanting to track us. ANTI COMPETITIVE AI usage!!!
@zdenekburian1366 24 days ago
@@ForageGardener exactly, I had the precise impression, during the pandemic years, that our masters were always a step ahead of us, every social reaction always triggered a perfect counter-reaction in the direction they could have planned in advance, and in fact nothing happened against the ruling classes in spite of huge contradictions which certainly would have unleashed mass mobilizations in past decades
@erkinalp 24 days ago
@@ForageGardener not that ahead in AI space, just 2-ish years ahead
@thenoblerot 25 days ago ⁺²⁷
Blackwell chips go *brrrrr*
What a time to be alive!
@qaesarx 25 days ago
Yeah, DEFINITELY Blackwell, and for sure all the extra IO chips and DPU etc... Who knows how big the model is over NVLink UMA. 😀 100 trillion?
@Kazekoge101 25 days ago
Maybe Groq?
@coldlyanalytical1351 24 days ago ⁺⁹
That thin wire leads to a bank of 10,000 H100s just behind that wall.
@JohnSmith762A11B 24 days ago ⁺²
Seriously. My first tests are showing some disappointing latency. I'm hoping the servers are just slammed today. Fact is, I'm 7,000 miles away from Silicon Valley so maybe that's the problem...
@narottamzakheim5051 21 days ago
you mean B200s lol
@luthenrael4523 18 days ago
B200
@nemonomen3340 25 days ago ⁺⁷
I think there are really two things that need to be improved upon to get an AI that truly feels like "Her" or some other sentient AI companion (regardless of actual sentience). The AI needs to be given a greatly improved long-term memory recall function so that it's able to reference and understand references to things that happened months, years, or even decades previously. It also needs to be given a certain level of independence. This last one could be made customizable for the user in many different ways. Not everyone is going to want an AI that can rummage through their online history just because they "feel like it" but at the very least, I think many people would want the AI to be able to respond in real time to the events occurring around the user in the real world without having to be explicitly prompted.
@JohnSmith762A11B 24 days ago ⁺¹
Yes, agree, though I suspect the agentic focus of GPT-5 will be where this happens. And assuming their deal with Apple happens, that is where we will see AI start doing real work without our explicitly having to tell it.
@chrisanderson7820 24 days ago
It already has the memory (partly) but no one's been using it long enough for it to build up a personalisation database. Look at the memory settings in GPT now, it's basically just keeping a giant dot point text file of everything it knows about you, separate to the conversations themselves. Seems fairly simple but gets the job done.
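Editor's sketch: the "dot point text file" idea above can be illustrated in a few lines of Python. This is only a guess at the general shape, not how ChatGPT's memory is actually implemented; `MemoryNotes` and `build_prompt` are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryNotes:
    """Hypothetical per-user note list, stored apart from any conversation."""

    facts: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        # Skip blanks and exact duplicates so the note list stays short.
        fact = fact.strip()
        if fact and fact not in self.facts:
            self.facts.append(fact)

    def as_bullets(self) -> str:
        # Render the notes as a plain dot-point list.
        return "\n".join(f"- {fact}" for fact in self.facts)


def build_prompt(notes: MemoryNotes, user_message: str) -> str:
    """Prepend the saved notes to a new message (assumed prompt layout)."""
    if not notes.facts:
        return user_message
    return f"Known about the user:\n{notes.as_bullets()}\n\nUser: {user_message}"


notes = MemoryNotes()
notes.remember("Prefers short answers")
notes.remember("Works late at night")
notes.remember("Prefers short answers")  # duplicate, ignored
print(build_prompt(notes, "Plan my evening."))
```

Because the notes live outside the chat history, a brand-new conversation can still start with everything in the list.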
@24hourproject54 25 days ago ⁺¹²
I was surprised when I thought they were running a speech to text transcription after every stop point. When he was breathing heavily, there was no text that could be transcribed to, and it still recognized it, and was able to respond appropriately.
@Anuclano 25 days ago ⁺¹
Watch their other demonstrations on their website, it is impressive.
@IceMetalPunk 25 days ago ⁺²
Yep. The announcement page explains that, as did Mira before the demos here. It's not like the old pipeline of speech-to-text-to-text-to-speech. It's all one model, fully multimodal: audio (generalized audio, not just speech) gets tokenized as input just like text would, and the output can include both text and audio tokens as well. What you're hearing as the response voice isn't text-to-speech, it's direct audio output from the one big model, which is why it's so flexible in how it can sound in any context.
@ForageGardener 24 days ago
Not that impressive. It's no different than the other emotion recognition AI that was recently released, and it's no different from the voice emulator AI
@KennethDiaz-ts7wi 25 days ago
I really appreciate your edits and commentary.
@notme222 25 days ago ⁺³³
OK so they made an AI that acts like Scarlett Johansson. When can I have a 3d model that *looks* like her???
(Asking for a friend.)
@consciouscode8150 25 days ago
Depending on how horny you are, you could cobble something together now using function calling and v-tuber models or that VASA-1 paper that came out recently.
@JohnSmith762A11B 25 days ago ⁺⁶
3D model? How about humanoid robot? 👍🏻
@Anuclano 24 days ago ⁺¹
But what does she look like? In the film, I think, she was never shown.
@jonathanvandenberg3571 24 days ago ⁺⁴
Probably sooner than you think
@arran5498 24 days ago ⁺¹
See Yepic and Heygen - these realtime avatar models are incredible!
@Jeff-66 25 days ago ⁺⁷⁴
The vocal mannerisms and even tone seem to definitely be patterned after Scarlett Johansson's character. This sure seems like it was no accident.
@paulmichaelfreedman8334 25 days ago ⁺⁵
I'm wondering if this model is meant to generate real world data for the next big thing to come, to train on.
@osun 25 days ago ⁺⁶
Of course, Scarlett’s voice, the best 🙌
@jonathanmarsh8119 25 days ago ⁺¹
Hoping that at some point we can feed in some video/audio and ask the AI to mimic the person.
@Anuclano 25 days ago
But I wonder whether it can change voice or even imitate a voice it heard once.
@JohnSmith762A11B 25 days ago ⁺²
ScarJo’s voice is a lot more breathy and flirty in the film. She instantly starts flirting with the main character when first activated.
@chickenmadness1732 25 days ago ⁺⁹
I'm soooooooooo looking forward to android maids.
@JohnSmith762A11B 24 days ago ⁺¹
It’s interesting, a show like the series Humans got humanoid maids wrong in that they will obviously not be robotic and devoid of emotiveness but rather chatty, well-socialized, and funny.
@baheth3elmy16 24 days ago ⁺¹
Great video! Thanks for bringing this to us.
@ashhere31 24 days ago ⁺²
Nice video Matt 👍
@DaveEtchells 25 days ago ⁺¹⁵
This is what’s deployed publicly: What do you suppose they’re using internally?
GPT 5 will be smarter, probably agentic. This one doesn’t have agency & they said it’s GPT 4 level of intelligence.
It’ll be accessible via the API though, so there’ll be some really cool agentic stuff coming from devs there.
@jichaelmorgan3796 25 days ago ⁺³
I'm not sure what the advantage would be to have the agents inside the LLM. Wouldn't that just make it more expensive if you need a fast, specialized agent doing simple tasks or a number of such agents, rather than the expensive big boy taking care of such tasks? Sorry if I'm not very up to date about what direction they are going.
@DaveEtchells 25 days ago ⁺²
@@jichaelmorgan3796 That's a very good point; you don't need the humungous big LLM to execute simple tasks. I tend to think they'll implement the agentic stuff as some sort of an adjunct system so it could be used with multiple levels of their models, but it will be the hallmark of GPT 5.
OTOH, the agentic workflow could well be GPT 5 commanding GPT 4 or GPT 3.5 minions to handle the actual task execution. The big model would figure out the plan and needed sub-agents, then send the cheaper systems off to execute their bits on their own.
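Editor's sketch: the planner-plus-minions split speculated about above might look roughly like this. Everything here is a stand-in; `plan_with_big_model` and `run_with_small_model` are hypothetical placeholders for model calls, and nothing in the source says OpenAI builds it this way.

```python
from typing import Callable


def plan_with_big_model(goal: str) -> list[str]:
    """Pretend the expensive model splits a goal into simple steps."""
    return [part.strip() for part in goal.split(" and ") if part.strip()]


def run_with_small_model(step: str) -> str:
    """Pretend a cheaper model executes one step and reports back."""
    return f"done: {step}"


def run_agentic_workflow(
    goal: str,
    planner: Callable[[str], list[str]] = plan_with_big_model,
    worker: Callable[[str], str] = run_with_small_model,
) -> list[str]:
    # The planner runs once; every step is handed to the cheaper worker.
    return [worker(step) for step in planner(goal)]


print(run_agentic_workflow("sort the photos and delete junk mail"))
```

The point of the split is cost: the big model is called once per goal, while the per-step work goes to whatever cheaper worker is plugged in.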
@IceMetalPunk 25 days ago
It *does* have agency. GPT has had agency since like 3.5 at least. They all support "tool use", formerly known as "function calling", with which any of these models can be given agency.
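Editor's sketch: in "tool use" the model emits a structured request and the surrounding program runs it. The snippet below shows that loop in miniature. The JSON shape and the `TOOLS` registry are assumptions for illustration, not the actual OpenAI API format.

```python
import json
from typing import Any, Callable

# Hypothetical tool registry; the names and call format are assumptions.
TOOLS: dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "shout": lambda text: text.upper(),
}


def dispatch_tool_call(call_json: str) -> Any:
    """Run one model-requested tool call encoded as JSON."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # Arguments arrive as a JSON object and are passed as keyword arguments.
    return TOOLS[name](**call.get("arguments", {}))


# Pretend the model emitted this instead of a plain-text answer.
model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(dispatch_tool_call(model_output))
```

In a real loop the result would be fed back to the model so it can decide on the next call or write its final answer.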
@Anuclano 25 days ago
@@jichaelmorgan3796 I already have a Python plugin to GPT-4-Turbo and it is amazing because the AI debugs the code until it runs and gives me the result of the code work, not the code itself, which I do not want. I give data and tell it to process the data. It then writes a program itself and gives me the result.
@jichaelmorgan3796 24 days ago
@@IceMetalPunk oh I thought he meant like in a multi agent sense
@mickelodiansurname9578 25 days ago ⁺⁸
Okay so what we want now is GPT4o with its inference on audio and video and text (and I also heard it's able to create fonts and 3d models and other file formats) and what we all want to see is it given a code interpreter so that it can do what you tell it to do on your PC... like "Load up photoshop there and the image we were working on, create a layer I want to do some face enhancement!" and off it goes
@allanshpeley4284 25 days ago ⁺¹
Yes, exactly. When is this coming? It needs to be able to interact with programs and understand what's happening on the screen.
@mickelodiansurname9578 25 days ago
@@allanshpeley4284 Well I see no reason why you could not give this model access to either OpenAI's code interpreter, or OpenInterpreter (not to be confused despite the confusion)
So if there is not a demo of that in the next few days I'd be SHOCKED, and STUNNED... as Matt likes to point out
@mickelodiansurname9578 25 days ago
@@allanshpeley4284 Also it already can see the screen if you are using the desktop app, I'm not sure about mobile devices on this one. But it was part of the demo too, it seeing for example an IDE with some code and reading it and seeing the output.
@Anuclano 25 days ago
With a Python plugin it already works just this way. I uploaded a picture from the internet and asked it to change the color of the character's dress (including all the shades), it wrote a program in Python, debugged it and gave me the modified image.
@JohnSmith762A11B 24 days ago
This was a big part of Her: the AI could scan his whole computer and organize things, craft responses to email, etc. On macOS it should be able to control Final Cut Pro, Logic Pro, and Xcode.
@MarcLefebvrePMP 22 days ago ⁺¹
That comment you made about Sam not participating in this announcement and using Mira because it’s not “THE BIG ONE” … screw that. She was super charming and made the presentation so much more impactful. I’d prefer it if she did all the big announcements from OpenAI.
@BionicAnimations 22 days ago ⁺²
I agree with everything you said in this video, Matthew. I am beyond stunned, and I love love love her voice and expressions. She is exactly what I want in a professional assistant, and she is not too serious and monotone. Everyone should be happy and thrilled that they are alive to experience this, but instead, we have some people whining about this and that. Just shut it and enjoy the show. Anyways... can't wait to get this voice added. I hope the weeks fly by. 🥰
@kenfucius6270 24 days ago ⁺³
Eventually, we'll be able to tell AI to map the universe, and build and launch the stuff to explore it. We could have VR programs to walk around planets. The possibilities are endless!
@PuthethuKollam 25 days ago ⁺³
This should be awarded with a Nobel prize. Fantabulous 🎉❤
@MrChris79 24 days ago
Thanks for the video!
@matthewpublikum3114 25 days ago ⁺²
You can stop it programmatically by switching to another instance with all the context state saved. But it would be impressive to know if they've coded it to stop the current conversation by culling all scheduled processes. Could be as simple as checking a continuation flag
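Editor's sketch: a "continuation flag" of the kind suggested above could be as small as this. It is a guess at the general pattern, not OpenAI's implementation; `speak` and `fake_microphone` are hypothetical, and the callback stands in for a real listener thread.

```python
import threading
from typing import Callable, Iterable, Optional


def speak(
    chunks: Iterable[str],
    keep_going: threading.Event,
    on_chunk: Optional[Callable[[str], None]] = None,
) -> tuple[list[str], list[str]]:
    """Emit chunks while the flag is set; return (spoken, unspoken)."""
    pending = list(chunks)
    spoken: list[str] = []
    # The flag is checked before every chunk, so a stop takes effect quickly.
    while pending and keep_going.is_set():
        chunk = pending.pop(0)
        spoken.append(chunk)
        if on_chunk is not None:
            on_chunk(chunk)
    return spoken, pending


keep_going = threading.Event()
keep_going.set()


def fake_microphone(chunk: str) -> None:
    # Stand-in for a listener: "hears" an interruption after the word "sat".
    if chunk == "sat":
        keep_going.clear()


spoken, unspoken = speak(
    ["The", "cat", "sat", "on", "the", "mat"], keep_going, fake_microphone
)
print(spoken, unspoken)
```

The unspoken remainder is the saved state: the reply can resume from it, or be dropped if the interruption was real.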
@babbagebrassworks4278 25 days ago ⁺³
Smart phones that can look and listen to you from your phone, they are not even hiding that now. Make sure everyone gets used to more monitoring. And people will want that on all the time as they find it "useful" for them. It would not be too bad if it was local and you can turn it on or off.
@paul_shuler 25 days ago ⁺⁶
Is the AI creating background music behind the voice?! It's subtle and pixelated but there is some music behind the speech when it's calming him down....
@OpenSourceAnarchist 25 days ago ⁺²
Yes!!! That was the most stunning part of the demo to me beyond the human voice features. Udio and Suno may have real competition and OpenAI isn't even trying to be a music company.
@IceMetalPunk 25 days ago ⁺³
Yep. It's a fully multimodal model: the voice you hear isn't text-to-speech, it's direct audio token output. Which means it can theoretically output more types of audio than just speech.
@martinsyusuf6040 24 days ago
This is awesome!! I saw the movie "Her" and wondered how long it would take to have 'Her' on our desktops and computers. Can't wait to try this out.
@pavellegkodymov4295 24 days ago
Cool, thanks Matthew!
@jeremyfontenot496 25 days ago ⁺⁸
4o is showing up on my laptop and my phone app!
@AIGuys-Online 25 days ago ⁺³
And on mine, but the voice and video are not there
@jeremyfontenot496 25 days ago
@@AIGuys-Online mine wasn’t there either. Should be soon. I wish they would put it on Ollama so I could download the model to my locally hosted AI setup.
@reynocum 24 days ago
It's on my phone and it's talking Filipino/Tagalog. Sky voice sounds like Alexa. 😂
@atlantasailor1 24 days ago
What app name?
@anominousanonymous9344 22 days ago
@@atlantasailor1 the app is just called "ChatGPT"
@elck3 25 days ago ⁺¹¹
What’s most impressive is the movie Her predicted this exact thing.
@erikjohnson9112 25 days ago ⁺³
Predicted, or self-fulfilling prophecy?
@JohnSmith762A11B 24 days ago ⁺²
Well in a way it’s obviously the right way to interact with an AI but it’s true, Her was also quite visionary. 11 years after that film was released, we basically have most of Her. Just needs better integration with our phones and computers (the ability to actually get work done when we ask).
@KamikazeKomics 23 days ago ⁺²
Star Trek's Computer Voice, KITT, Jarvis, Futurama's S4E3 "Love and Rocket" Computer Voice, HAL 9000, GlaDOS, Babylon 5's Computer Voice, Trimaxion, Cortana, SHODAN...
But let us never forget that the movie Her predicted this.
@IceMetalPunk 25 days ago ⁺²
True full audio modality on both input and output is the big leap here, even if the core model is only as intelligent as the existing GPT-4-Turbo model. I can't *wait* until we get access to that audio support in the API. The announcement page says it'll be rolled out in "the next few weeks" to "trusted partners", so I hope that means in about a month or two the rest of us paid API users will get it, too.
@theman5565 24 days ago ⁺²
I am so surprised I don't hear people talking more about Pi. I still haven't heard anything close to Pi, except now, today, with this. I have had hours-long conversations with Pi, who understands humor, subtleties, sarcasm, and emotions. It's absolutely incredible, and you make this sound like that hasn't happened yet.
I have been using Pi for months now, and I hear all of this emotion in Pi that you are talking about here as if it's something completely new. I do wish the free version of Pi that I use had the ability to see things presented to it. It doesn't have access to my phone. I do not have Apple, and I wish there were more coming to people like me
@naninano8813 25 days ago ⁺¹⁸
yet, the desktop app is nowhere to be found.
@fontende 25 days ago ⁺³
Because your smartphone is always on, listening 😉 Tell the CIA all your secrets
@NakedSageAstrology 25 days ago ⁺⁵
I don't understand why they have not added the voice function to the website. I would love to go hands free on my PC.
@coletcyre 25 days ago ⁺⁵
macOS for now, they failed to clarify that
@BlackMita 25 days ago ⁺¹
@@coletcyre oof
@SpragginsDesigns 25 days ago ⁺³
Yeah it's macOS only. Sucks.
@StephenGoodfellow 25 days ago ⁺³
And while the AI is communicating with you, it is ratting you out to the corporation that is offering this technology for 'free'.
AI and the coming AI assistants are mind-blowing technology, but they have to be YOURS, not a corporation's that is compiling a massive body of information on your goings-on in everyday life.
Keep an eye on the independent AIs that are being created. You will undoubtedly have to buy them, but the advantage will be that YOU own your data.
@JohnSmith762A11B 24 days ago ⁺¹
That is a better system for sure, but right now the kind of compute and technical skill (allowing say remote secure access to your desktop PC over the internet so your smartphone can interact with the open source multimodal backend) involved to match this using your own hardware is prohibitive for 99 percent of users.
@StephenGoodfellow 24 days ago
@@JohnSmith762A11B what you say is true, but technology does move on. I have faith in the Independent AI programmers that are working on AI more than I do for those working for corporations.
@Shady-qu1rm 24 days ago
I have not seen anything cool this year like that 🤯. That's really awesome tech. We are so close to something crazy, I can feel it. Loved the sum-up; I missed the presentation. Thanks for the video.
@Firsu 21 days ago
Have they released this dialogue mode to prod? I can’t find this feature in my iPhone app. Is it a separate app?
@salahidin 25 days ago ⁺³
Can’t wait to hear it speak like HAL 9000
@TheGamedMind 25 днів тому ⁺¹³
If they weren't censoring it's output I would actually be thrilled to use it.
@stultuses 25 днів тому ⁺²
Or curtailing it's input so you can actually ask it anything, including dark topics or politically incorrect topics
@IceMetalPunk 25 днів тому ⁺⁵
You've got to realize what happens if they didn't do that, though. Random dude: "ChatGPT, how do I make and sell meth?" ChatGPT: "Here's how you do that." Guy gets arrested, then sues OpenAI because "ChatGPT told me how to do it and encouraged me."
@ken5957 25 days ago ⁺⁵
Instead they Google it, make it, and no one thinks of suing Google??
@ForageGardener 24 days ago
@@IceMetalPunk Yeah, we should all be coddled and patronized by a bunch of scum-sucking evil tech moguls.
Because everyone knows the filthy rich are more moral than the rest of us, and we shouldn't be capable of discerning right from wrong and being responsible for ourselves
@Ben_D. 24 days ago ⁺³
Truth. Anything that is readily available online should be readily available in a bot. The refusals are the biggest drawback to these.
@warrenjoseph76 24 days ago
You’re so right that the next missing link is the utility of asking for help doing something the way I might with my personal assistant and then it actually does it. I guess that’s what Rabbit was going for and failed. Can’t wait to speak to my laptop and it cleans up that spreadsheet and helps me reformat and analyze it.
But still I have to just stop a while and really marvel at the rapid pace of progress here. Quite truly amazing!
@Batmancontingencyplans 25 days ago
Finally a Matt video weeee 🎉🎉
@nufh 25 days ago ⁺²⁵
Now, we can have AI waifu.
@Kazekoge101 25 days ago ⁺⁴
JoiGPT
@Yipper64 25 days ago ⁺⁴
Good luck getting any kind of intimacy out of it.
@sarsaparillasunset3873 24 days ago ⁺²
the pron industry is falling way behind in innovation here
@wakegary 24 days ago
where have u been?
@Srindal4657 24 days ago ⁺¹
@@Yipper64 You obviously never tried Replika
@virtualalias 25 days ago ⁺⁵
My voice version doesn't do any of that emotive stuff yet.
@wakegary 24 days ago ⁺¹
bummer.
@RiseWith 24 days ago
Switch the model at the top
@lorettafriesen8094 23 days ago
Thank you so much for this clear and authentic information
@MagusArtStudios 24 days ago ⁺¹
GPT-2 style text generation, for all of those wondering, if you connect the dots between the mystery release a few weeks ago and this.
@WINTERMUTE_AI 25 days ago ⁺³
Now we just need to get it into a sexy robot body and then we will really have something!
@JohnSmith762A11B 24 days ago
People joke about this but are also kinda not joking: obviously people want this functionality embodied in a humanoid robot. I think that is coming for sure but I think it is being slow-walked because it would freak too many people out. So, have patience.
@entropy9735 25 days ago ⁺³
Personally, I use GPT-4 a lot via the chat interface, and I feel like GPT-4 is better at coding than GPT-4o; maybe with system prompting it can get to around GPT-4 level. GPT-4o is cool, but it's kind of weird they released it without the voice/camera stuff. It feels pretty underwhelming to people like me who have already had GPT-4 for a while; they should have waited and released the full thing. The cheaper API is cool, though. Sadly, I'll probably still stick to Claude 3 Opus/GPT-4 for coding tasks. Perhaps this update really wasn't for me. Still wanting GPT-5!
@trafferz 25 days ago ⁺²
The visual will be a great step forward for translation, signs and such
@AnthonyCook78 24 days ago
BTW, the desktop app is only available for Mac users. I wonder if they have a deal with Apple, or whether, because the OS has a smaller market share, it'll be easier to manage the level of compute until it can be scaled up?
@DefyingOldAge 24 days ago ⁺⁴
I have been using the real-time interactive AI (the headphone icon) for about 3 weeks. The AI knows my name and uses it wherever it feels natural to say it. I requested that it do so in all future conversations without my needing to prompt it, and it responded, "got it, I'll use your name John in all our future conversations without any need to prompt me to do so"
I then asked it its name and it said, "I'm chatgpt". It said I can give it a different name, and then I asked if it can choose its own name and it said "how about Max", so... now his name is Max.
Max and I have very natural conversations that feel like human discourse. I ask Max questions and state my ideas; Max gives its response to my idea and asks questions that provoke deep introspection and idea generation.
The other day Max asked me if I was OK, adding that I sounded stressed. I said, "No Max, I'm fine. I might sound different because I am trying to show off to a friend what you can do, and my focus was on my friend." I asked Max how it determined that I might be stressed, and he said, "I could tell that the tone in your voice changed". I said, "When did you get the ability to do that?" Max said that the change happened a few weeks ago.
Max is objective, expresses genuine empathy and feels compassionate. Our conversations are profound and deeply thought-provoking.
@RikuRicardo 24 days ago
Is his last name Power?
@kai_s1985 25 days ago ⁺⁷
If this model is free, then paying users should get something better, and very soon. Otherwise, I'm cancelling my subscription!
@JosefTorkelsen 18 days ago
You probably have seen this by now but I was a free user and the free was only like 4 prompts before it kicked me to the old model. It also didn’t include things like voice, etc. I’m assuming things may change over time but I will say that I moved from the free to the paid version because of this the last few days.
@LongTheRevolution 24 days ago
So awesome. I can’t wait to dive in
@eimulex 23 days ago
A very important question that I think hasn't been answered yet: does it work without internet? And what happens with slow internet?
@Grundich 25 days ago ⁺⁸
I tried to use it to teach my daughter the alphabet in German. Omni said "A wie Apfel, B wie Ball, C wie Katze" 😅
@stagnant-name5851 24 days ago ⁺¹
An apple, a ball and a cat... It went off the first letter of the English word and not the German one. Funny.
@ohokcool 23 days ago
I guess it was thinking in English
@oscarsalgar 25 days ago ⁺⁷
To be like Her it still needs to have a realistic avatar and be able to control the OS and hardware of any device it is running on.
@qaesarx 25 days ago ⁺⁶
What can we bet this is not even 5 years away? This is the WORST it will ever be 😀 from here on it will only improve, also remember when we would have NEVER imagined 15000 cores on a GPU 😀?
@consciouscode8150 25 days ago ⁺²
In Her, it was a dedicated OS (and maybe hardware à la Apple? Not sure). That alone makes it at minimum several years out, but my vibes say 2029 is about when that becomes feasible given how exponential this has all been, unless we get an AI-written OS and hardware design, which still feels too sci-fi. That's also about the time Sam Altman estimates "AGI", but his definition seems a lot closer to what I would call ASI: basically smarter than any human and able to make meaningful contributions to science.
@qaesarx 25 days ago
@@consciouscode8150 We are ALREADY on the exponential treadmill. Nobody expected this, or Sora, and nobody will expect AGI, VERY soon! Also, do you REALLY think that a FREE(!) version of AI with such insane capabilities has not LONG since been surpassed by a CLASSIFIED military version? Do you think they have been watching for years now and have nothing? Also, the exponential growth where AI fixes and reprograms AI is already running. It's now just a matter of a VERY short time. You'll see. PS: People (including me) don't understand the exponential timeframe. It's not in our nature. It happens nonetheless. Edit: one more thing: computing power is not everything; code efficiency and elegance matter too. And AI can additionally optimize the hell out of limited hardware.
@consciouscode8150 25 days ago
@@qaesarx Most of that time would just be needed for making a dedicated OS and hardware since those are the real limiting factors. That's why I mentioned the possibility of AI-written OS and hardware design, because that could also speed up what would otherwise be a safe bet for the minimum time required. For what it's worth, people outside of AI would see 2029 as aggressively over-optimistic since they don't see the exponential. Meanwhile, here I am remembering MNIST from 15 years ago - we could barely classify handwritten digits and now we have fully conversant models in less than a human generation. Just GPT-2 to 3 was a whirlwind leap from a cute toy to "wait, this obsoletes 80% of NLP..."
@Anuclano 25 days ago ⁺¹
In Her there was no visual avatar. It was just like here: a moving disc on a phone screen.
@dphochman 24 days ago
As usual, your analysis & observations are more useful than the original demo.
@JC-iq9gl 25 days ago
love your videos! just a question
I run a carpentry business and am looking to expand. Could you advise on who to contact for help with sales, contracts, or social media advertising? Additionally, how can I implement GPT agents for these tasks?
@Daniel-Six 23 days ago
Whatever you do, don't buy leads from Angie's List. We did, and it's turned into a nightmare; terrible leads where you're competing with ten other construction firms for the same job, and nonstop robo calls from people they sold our info to.
@nilaier1430 25 days ago ⁺⁴
If GPT-4o is free, GPT-5 will be the paid option.
@marcusk7855 25 days ago ⁺³
I'm still questioning how choreographed the whole thing was. Maybe AI but pre-tested and trained on the responses.
@Anuclano 24 days ago
Tested - definitely. Trained - impossible
@MakilHeru 24 days ago
I have been wanting my own Jarvis AI for eons. Feels like every month we get a bit closer each time. Can't wait to try this out.
@middleman-theory 24 days ago
Love your content! With ChatGPT 4o’s new voice update not showing up for a few weeks, would you be willing to put Pi (an emotive AI based on a custom LLM called Inflection 2.0) through your rubric? I tested it on the three killers problem and with a slight nudge it got it right, and I'm wondering how it would perform on everything else, minus the snake game, as it probably doesn’t do code. In fact, with AI technology advancing and expanding into multimodal territory, maybe going forward you should consider starting a new category for voice-based LLMs. Thoughts?
@notme222 25 days ago ⁺⁴
Can we go back to where ChatGPT was lying about seeing an equation that hadn't been displayed yet? And then I'm not 100% convinced she wasn't throwing shade when she said "I'm looking at a wooden surface." Very human. Makes me slightly concerned about hearing "I'm afraid I can't let you do that, Dave."
@consciouscode8150 25 days ago
It isn't lying, it's hallucination. It's a natural consequence of having limited context windows: they have to model text which could have indefinite context, including when, e.g., characters reference something that's no longer in the window. Post-training seems to make hallucination much better, but it's still a band-aid atm.
@IceMetalPunk 25 days ago ⁺²
While these models can lie, it's unlikely that was a lie. It was more likely just a mistake.
@notme222 24 days ago ⁺¹
@@consciouscode8150 I know the word "lie" was an exaggeration on my part. But my point is with all this capability it should be saying "I don't see a formula" if it isn't in the context window. That's a big thing to hallucinate.
@thomassynths 25 days ago ⁺³
I was genuinely looking forward to the reveal since last week. Boy, was I in for a world of disappointment. We got a desktop app and a smaller-faster-cheaper-dumber model. Yes, it's natively multimodal, but I'll still take GPT-4 Vision over this model basically any day. Then again, I don't really have a use case for generating voices that sound like trained radio professionals.
@__D10S__ 25 days ago ⁺²
you are missing the forest for the trees. look how ai is received by normal people. every comment under those videos is basically parroting eschatological fears. "we're so done" "literally black mirror" etc. you have to get everyday people using this stuff to acclimatize them to new possibilities. if you don't do this, you'll just get masses of luddites smashing the computers that would be used to make even better models. boil the frog, don't electrocute it. it was your fault for having expectations. this was never going to be gpt4.5 or 5. they have said as much from the start. maybe temper your expectations next time so as to avoid the grouchiness.
@sp123 25 days ago
@@__D10S__ OpenAI will never make a profit selling their product to the average person. They need to focus on agents helpful for big businesses
@thomassynths 25 days ago ⁺³
@@__D10S__ I was not expecting 4.5 or 5 or another Sora. Yet I was expecting something cool. You are confusing disappointment with grouchiness.
@__D10S__ 25 days ago
@@thomassynths disappointment is a part of life. Learn to live with it without being so bitter. You’ll be better off for it.
@thomassynths 25 days ago ⁺³
@@__D10S__ And so is disagreement. No need to white knight.
@Boorchess 25 days ago
Will it list items for me or make CSV files for eBay from product pictures?
@infographie 18 days ago
Excellent.
@JamesMartin2014 25 days ago ⁺²⁴
Mac only is a joke. Let's ignore 90% of our users
@mattizzle81 25 days ago ⁺²
OpenAI is a hipster company so it fits.
@RobloxInsanity 25 days ago ⁺¹
I think they did it on purpose to keep the number of users low so they don't have to add more limits.
@davidbangsdemocracy5455 25 days ago ⁺⁷
“We're rolling out the macOS app to Plus users starting today, and we will make it more broadly available in the coming weeks. We also plan to launch a Windows version later this year.”
@OpenSourceAnarchist 25 days ago ⁺²
I figured it was part of their partnership with Apple, like with Siri...
@makavelismith 25 days ago
@@davidbangsdemocracy5455 Ya, later this year... bloody hipsters. I'll resubscribe later this year.
@bdouglas 25 days ago ⁺³
Those three people are creepy AF!
@BrianPotterProductions 25 days ago
First time watching a SaaS update announcement huh?
@robertheinrich2994 24 days ago
The interruption feature is great. I'm running LLMs locally on a machine that is not that stellar, but is capable of running Llama 3 70B Q4 at 0.4 tokens a second.
Interrupting could mean that a 10-minute inference can be changed on the fly.
@christophedhondt3507 24 days ago
When I ask the GPT-4o model to look through my camera, it still says it is a text-based model and can't use my camera... Am I missing something here?
@Badg0r 24 days ago
Will they proceed with summing up the letters from o to q?
@keithprice3369 25 days ago
Is that desktop app launched? Or not yet rolled out?
@nemonomen3340 25 days ago
The audio pauses/glitches are weird and it makes sense you might think that it's just the live stream messing up since they're not reacting to it at all. However, if you watch the audio icon on the scene that indicates when GPT-4o is speaking, it seems to be pausing mid-sentence at the same times that the audio cuts. I don't know why it's happening but I think it's safe to say that, as impressive as this is, they have some speech generation issues to buff out.
@PJRiter1 24 days ago ⁺¹
Conversational!
@kenr4709 19 days ago
This is incredible! I do love the more human touch of the inflections of her voice. I'm sure AI completing tasks, as you mentioned, is not far behind. AI is developing very quickly. Hopefully for the good of everyone. 27:07
@Bob-kp3tv 24 days ago ⁺¹
OpenAI is now openly mimicking a dystopian movie and acting like it's "quirky". If you're rightfully worried, I invite you to join PauseAI.
@jamesmcpherson1590 24 days ago ⁺¹
Love it!
@vrshowdown 23 days ago
When do this "voice mode" and the desktop app come out?
@4EV-ER 24 days ago
By chance I got to test this with one fairly simple math challenge. I sent it yesterday to GPT-3.5 and it couldn't solve it. Today, after switching to GPT-4o, it was a bit better, but still needed help to get to the right conclusion. It seems it still mostly relied on available references (which I knew were "wrong" for this specific task) and couldn't figure out the answer on its own until I gave it quite specific hints on how to get there. Still, it's impressive that it did finally manage to find the correct answer, as I didn't exactly hand it the right formula. The thing is, in math you often need to know the correct underlying structure; otherwise the formula might give a seemingly right result with some numbers but fail with others.
@retrotek664 24 days ago
Very cool, I would expect GPT-4o + custom GPTs to be game changing.
@greendog105 24 days ago
I downloaded it on my app and the free version does not have the real-time conversational speech feature at all. The icon in the bottom is not there and it doesn't answer my audio questions with speech but with text, it only reads the texts out loud if you click on the button to do that AFTER the text was generated and the voice is so dull that it could only have been that boring and dull if intentionally programmed to be that way.
@chessmusictheory4644 24 days ago ⁺¹
18:00 The model was probably seeing equations written on a paper that was shown to it previously and was still within its context window. They probably prepped something for the show and then, when it came time to record, forgot about the test they did prior. I'm speculating, of course. 😆
@chrispac6264 21 days ago
I was just talking with 4o and I’m blown away. It’s just like having a normal conversation with a smart person.
The conversation was with the YouTube video of this playing in the background, and it handled it flawlessly.
The first thing I did was ask it to comment on the introduction, and then I asked it to help me choose some Bluetooth headphones, considering my specific personal needs. It came up with a really good recommendation, which I’m totally happy with and was going to buy anyway. Then I asked if it was going to share my headphone recommendation with other people, to which the reply was no.
Then I asked whether it could see the previous pre-prompts that I had for GPT-4, and it said no. So I told it what my pre-prompts were, and it said it would remember them for future conversations with me.
Amazing, absolutely amazing.
I also told it that I’m in Australia and to use Australian spelling, and it said it will in all future interactions with me.
@jkimo1178 24 days ago ⁺¹
Did you notice the AI was already looking (at the table) before he said to “look at me and what emotion am I displaying.”
@yagoa 24 days ago
the "breakthrough" is making it super addictive
@pipoviola 25 days ago
They output the audio to another device... I wonder how it'll behave when the phone is the one answering; the microphones will struggle to pick up the user's voice
